Monitoring and Error Recovery

Monitor Everything from Your Web Browser

Use Flux’s web-based Operations Console to monitor all your workflows and file transfers. The console is designed for operations staff and is usable by developers too.

Control Workflows and Jobs

Submit workflows to run as jobs on a Flux engine or cluster. Pause, restart, resume, interrupt, and recover Flux jobs. Remove (cancel) a job from an engine or cluster. Expedite jobs that are waiting for human intervention.

Near Real-Time Updates

The Flux Operations Console uses asynchronous Ajax technology to keep your browser window updated, automatically and in near real time, with the status of all your Flux activities.

Watch File Transfers in Real-Time

Watch file transfers taking place within workflows. Monitor file transfer statistics like source and destination hosts, file transfer protocol, transfer rate, and estimated time remaining.

Monitor Thousands of Workflows and File Transfers Simultaneously

The Flux Operations Console can monitor thousands of running workflows (and their file transfers) at the same time. Filter the results and drill into specific sets of work and issues.

When Something Goes Wrong

React accordingly when a workflow encounters an error or a file transfer fails. Configure automated responses to errors or call out for human attention.

Automated and Manual Error Recovery

At any point in a workflow, define automated or manual error handlers to resolve, retry, or report the situation.

Sophisticated Error Handling

Flux’s error handlers can themselves be defined as workflows, allowing you to create sophisticated error recovery mechanisms.

Reuse Error Handlers

Error handlers can be defined once and used throughout a workflow or throughout an entire collection of workflows.
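
The idea of defining a handler once and reusing it can be illustrated with a minimal Python sketch. This is not Flux's API: `make_retry_handler`, `run_with_retry`, and `flaky_transfer` are hypothetical names, and the retry-then-report policy is only one possible handler.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def make_retry_handler(
    max_attempts: int, delay_seconds: float = 0.0
) -> Callable[[Callable[[], T]], T]:
    """Build a reusable handler that retries a failing step, then reports."""

    def run_with_retry(step: Callable[[], T]) -> T:
        last_error = None
        for attempt in range(1, max_attempts + 1):
            try:
                return step()
            except Exception as error:  # hypothetical policy: retry on any error
                last_error = error
                if attempt < max_attempts:
                    time.sleep(delay_seconds)
        # All attempts failed: report the situation to the caller.
        raise RuntimeError(
            f"step failed after {max_attempts} attempts"
        ) from last_error

    return run_with_retry


# Define the handler once, then reuse it for any number of steps.
retry_three_times = make_retry_handler(max_attempts=3)

calls = {"count": 0}


def flaky_transfer() -> str:
    """Hypothetical step that fails twice before succeeding."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise IOError("connection dropped")
    return "transferred"


result = retry_three_times(flaky_transfer)
```

The same `retry_three_times` handler could wrap any other step, which is the point of defining it once.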

Agents

With a Flux Agent installed on a remote computer, you can watch for files on that computer, copy files to and from it, and run processes on it.

Take Advantage of Special Resources

Sometimes a process needs to run on a specific computer that has unique resources. Install a Flux Agent on that computer and use Flux to run tasks there.

Pools

You can group computers into a pool and then schedule processes to run on any one of the machines in that pool.
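
As a rough illustration of the pool concept, the Python sketch below picks one machine from a named group each time work is scheduled. It is not Flux's API: the `MachinePool` class and host names are hypothetical, and round-robin selection is an assumption, since the text only says a process runs on any one machine in the pool.

```python
from itertools import cycle
from typing import Iterator, List


class MachinePool:
    """Hypothetical pool: a named group of hosts that can each run a process."""

    def __init__(self, name: str, hosts: List[str]) -> None:
        if not hosts:
            raise ValueError("a pool needs at least one host")
        self.name = name
        self.hosts = list(hosts)
        # Assumed policy: hand out hosts in round-robin order.
        self._next_host: Iterator[str] = cycle(self.hosts)

    def pick_host(self) -> str:
        """Return the host that should run the next process."""
        return next(self._next_host)


pool = MachinePool("reporting", ["report01", "report02"])
picks = [pool.pick_host() for _ in range(3)]
```

A real scheduler might instead choose the least-loaded machine; round-robin is used here only because it is the simplest policy to show.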

Agentless Scheduling

Sometimes you don’t want to install agent software on your computers. Flux supports agentless scheduling, which uses SSH to run processes on remote machines.
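
The general technique of running a process over SSH without an agent can be sketched in Python using the standard `ssh` command-line client. This shows the approach only, not how Flux implements it: the function names, host, and user below are hypothetical, and the sketch assumes key-based authentication is already set up.

```python
import shlex
import subprocess
from typing import List


def build_ssh_command(host: str, remote_command: List[str], user: str = "") -> List[str]:
    """Build the argument list for running a command on a remote host via ssh."""
    target = f"{user}@{host}" if user else host
    # BatchMode=yes makes ssh fail rather than prompt for a password,
    # which suits unattended scheduling.
    # shlex.join quotes each argument so the remote shell sees it intact.
    return ["ssh", "-o", "BatchMode=yes", target, shlex.join(remote_command)]


def run_remote(
    host: str,
    remote_command: List[str],
    user: str = "",
    timeout_seconds: float = 60.0,
) -> subprocess.CompletedProcess:
    """Run the command remotely and capture its exit code and output."""
    return subprocess.run(
        build_ssh_command(host, remote_command, user),
        capture_output=True,
        text=True,
        timeout=timeout_seconds,
    )


# Hypothetical host and user, shown only to illustrate the command shape.
command = build_ssh_command(
    "build01.example.com", ["ls", "-l", "/tmp/my files"], user="flux"
)
```

Calling `run_remote` with the same arguments would execute the command and return its exit code, standard output, and standard error.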