Flux Manual

Scheduling Options

Once you have created your flow charts, scheduling it is simple. Call engine.put(flowChart). the flow chart will be added to the Flux engine and scheduled for immediate execution.

To remove your job from the engine, save the job ID returned from engine.put() and call engine.remove(jobId).

To delete all jobs, messages, and publishers from the engine in one step, call engine.clear().

Concurrency Throttles

Flow Chart firings can be single threaded. That means that at most one flow chart can be firing at a time. To allow more than one flow chart to execute simultaneously, you will need to increase the engine's concurrency throttles.

You can define a concurrency throttle that governs a single Flux instance, or you can define a concurrency throttle that governs all flow charts in all Flux instances on your cluster. These coarse-grained concurrency throttles control the parallel execution of large numbers of jobs in one step.

You can also define fine-grained concurrency throttles. For each branch in your flow chart tree, you can define a unique concurrency throttle that governs only the flow charts in that branch.

For example, assume that two long-running flow charts are scheduled to fire one after the other. With a concurrency level of one, the second flow chart cannot run until the first flow chart completes. With a higher concurrency level, on the other hand, both flow charts will execute simultaneously.

Choose a suitable concurrency throttle for your application. Lower concurrency throttles conserve system resources while higher concurrency throttles can finish tasks faster. To prevent your flow charts from consuming all of your resources, Flux limits the concurrency throttle to 150. If you attempt to set a concurrency throttle higher than this, Flux will automatically substitute a concurrency level of 150.

As a general rule of thumb, if your flow charts are mostly I/O bound you should set a concurrency throttle of 2-3 times the number of CPUs in the macine that will execute your flow charts. If, on the other hand, your jobs are mostly CPU bound, you should choose a concurrency throttle of just 1-2 times the number of CPUs in your machine. Generally, it is a good idea to set the concurrency throttle 2 to 3 higher than the number of CPUs in the machine. For example, a quad-CPU machine should have a concurrency throttle of 6 or 7 for CPU bound flow charts, but could have a concurrency throttle of 12-15 for I/O bound flow charts.

Of course, concurrency throttles are tunable parameters, and you will probably need to experiment until you find the right value for your machine.

The default concurrency throttle is 1. To inrease the concurrency throttle to 10, you would adjust the RuntimeConfigurationNode in your engine configuration like so:

// Lookup the engine.

Engine madeEngine = factory.lookupRmiEngine("enginehost");

// Retrieve the runtime configuration node.
RuntimeConfigurationNode rootNode = madeEngine.getRuntimeConfiguration();

// Update the root node with the new concurrency throttle.
rootNode.setConcurrencyThrottle("10");

Or if your engine did not have a previously created configuration, you would use the follow code:

Factory factory = Factory.makeInstance();



// Create a Runtime Configuration Factory.
RuntimeConfigurationFactory runtimeFactory = factory.makeRuntimeConfigurationFactory();

// Create a Runtime Configuration root node.
RuntimeConfigurationNode rootNode = runtimeFactory.makeRuntimeConfiguration();

// Set the concurrency throttle to 10.
rootNode.setConcurrencyThrottle("10");

// Set your Runtime Configuration to your Configuration Object.
config.setRuntimeConfiguration(rootNode);

To set a cluster-wide standard concurrency throttle, replace rootNode.setConcurrencyThrottle("10"); from the code above with the following:

rootNode.setConcurrencyThrottleClusterWide("10");

This call specifies that no more than 10 flow charts across the entire cluster are allowed to fire concurrently.

Different Concurrency Throttles for Different Kinds of Jobs

As mentioned before, flow chart names are hierarchical. They form a tree of jobs.

Suppose your tree of jobs has three branches off the root:

  • /heavyweight For long running, CPU-intensive jobs.

  • /lightweight For short jobs or I/O bound jobs.

  • /misc For a variety of other jobs.

With a standard concurrency throttle as described above, these three kinds of jobs are treated the same. For example, suppose your standard concurrency throttle is 5. Then it is possible to be running 5 heavyweight jobs at once, which could slow your computer to a crawl.

Instead, you need a different concurrency throttle for each kind of job. For heavyweight jobs, a concurrency throttle of 1 is appropriate. For lightweight jobs, a concurrency throttle of 10 is appropriate. Finally, for all other jobs, a concurrency throttle of 5 is appropriate.

With different concurrency throttles for different kinds of jobs, you will not have 5 heavyweight jobs running at once. At most, you will have one such job running at a time, however, it will not prevent lightweight jobs from having an opportunity to run.

To configure the concurrency throttles for your jobs, you must define a runtime configuration. This runtime configuration annotates the tree of jobs with configuration properties.

For example, the /heavyweight branch of the tree of jobs is annotated with a concurrency throttle of 1. Similarly, the /lightweight branch of the tree of jobs is annotated with a concurrency throttle of 5.

For example code, see the Concurrency Level example, which provides a complete code example of how to create and initialize flux.RuntimeConfigurationNode objects, which are used to configure these different concurrency levels. Also, see the Javadoc for flux.Configuration.RUNTIME_CONFIGURATION.

Note that if you configure your concurrency throttles on a per engine basis, the concurrency limitations are enforced on a per engine basis. However, if you configure your concurrency throttles on a cluster-wide basis using the CONCURRENCY_THROTTLE_CLUSTER_WIDE configuration property in your runtime configuration, then the concurrency limitations are enforced cluster-wide.

This flexible enforcement allows you to limit the number of concurrent jobs that are executed across the entire cluster or within a single engine.

Concurrency Throttles Based on Other Running Jobs

Finally, you can define concurrency throttles that are based on other running jobs. For example, suppose you have a "data generation" job and a "data analysis" job. Further suppose that in your system, it is not possible to run the data generation job at the same time as the data analysis job.

You can define a concurrency throttle to enforce this constraint.

Suppose the data generation job is named "/data generation/producer job" and the data analysis job is named "/data analysis/consumer job".

To prevent the producer and consumer jobs from running at the same time, set two concurrency throttles like so:

  • Producer job concurrency throttle: /data analysis/ <= 0

  • Consumer job concurrency throttle: /data generation/ <= 0

The producer job stipulates that it will run only if there are no data analysis jobs currently running. Similarly, the consumer job indicates that it will run only if there are no data generation jobs currently running.

The syntax of this kind of concurrency throttle includes the name of the branch in the job tree, followed by the relational operator "<" or "<=", and followed by a non-negative number.

Note that the relational operator "=" is not supported in order to avoid creating concurrency throttles that are impossible to satisfy.

Pinning Jobs on Specific Flux Instances

You can specify that certain kinds of jobs must run on certain Flux instances in your Flux cluster.

For example, if only one of the computers in your Flux cluster has report generation software installed on it, you can specify that all of your reporting jobs have to be executed on that computer only.

In general, you can specify that a particular Flux instance will, or will not, run certain kinds of jobs. By pinning certain jobs on certain computers, you can make sure that your jobs run on the computers that have the resources that those jobs need.

For example, suppose there are Unix and Windows computers in your Flux cluster. Further suppose that you have three kinds of jobs:

  • Windows batch files that can run on Windows only.

  • Unix shell scripts that can run on Unix only.

  • Java and JEE jobs that can run on any computer.

You can create concurrency throttles to enforce these constraints.

First, organize your jobs into different branches in the tree of jobs. For example:

/windows/batch file 1

/windows/batch file 2

/unix/shell script A

/unix/shell script B

/java/java job 1

/java/java job 2

On the Windows computers, use a Flux runtime configuration like so:

/windows/CONCURRENCY_THROTTLE=5

/unix/CONCURRENCY_THROTTLE=0

/java/CONCURRENCY_THROTTLE=5

The above runtime configuration allows up to 5 Windows jobs to run at the same time, zero Unix jobs to run, and up to 5 Java jobs to run concurrently on the Windows computers.

Similarly, on the Unix computers, use a Flux runtime configuration like so:

/windows/CONCURRENCY_THROTTLE=0

/unix/CONCURRENCY_THROTTLE=5

/java/CONCURRENCY_THROTTLE=5

The above Unix runtime configuration is the opposite of the Windows runtime configuration. It allows up zero Windows jobs to run, up to 5 Unix jobs to run at the same time, and up to 5 Java jobs to run concurrently on the Unix computers.

By using a different runtime configuration on each kind of computer in your cluster, you can control the kinds of jobs that the different kinds of computers will execute.

See the "job pinning" example in the examples/end_users/job_pinning directory for details on how to pin certain jobs on certain computers in your cluster. Also, see the “concurrency_throttle” and “concurrency_throttle_cluster_wide” examples located in the examples/software_developers/concurrency_throttle and examples/software_developers/concurrency_throttle_cluster_wide directories under the Flux installation directory.

As an alternative to job pinning, you can use Flux Agents. Briefly, an agent is a lightweight Flux component located on a remote computer that executes Process Actions on behalf of a Flux engine. Using agents, it’s easy to specify that Unix processes should execute on agents residing on Unix computers and Windows processes should run on Windows computers, by way of a Flux agent on each of those computers.

For more information on Flux Agents, see (cross reference).

Pausing and Resuming Jobs

If you need to temporarily prevent a flow chart from firing, you can pause it. Some time later, you can resume it, and it will begin firing again. Paused flow charts do not fire.

To pause a flow chart, call Engine.pause(String flowChartId). Likewise, to resume a paused job, call Engine.resume(String flowChartId). You can also pause all jobs that are of a certain type by calling Engine.pauseByType(String type). Similarly, call Engine.resumeByType(String type) to resume all paused job that are of a given type.