The Trident RAS (Resource Aware Scheduler) API provides a mechanism to allow users to specify the resource consumption of a Trident topology. The API looks exactly like the base RAS API, only it is called on Trident Streams instead of Bolts and Spouts.
In order to avoid duplication and inconsistency in documentation, the purpose and effects of resource setting are not described here, but are instead found in the Resource Aware Scheduler Overview
First, an example:
TridentTopology topo = new TridentTopology();
TridentState wordCounts =
topology
.newStream("words", feeder)
.parallelismHint(5)
.setCPULoad(20)
.setMemoryLoad(512,256)
.each( new Fields("sentence"), new Split(), new Fields("word"))
.setCPULoad(10)
.setMemoryLoad(512)
.each(new Fields("word"), new BangAdder(), new Fields("word!"))
.parallelismHint(10)
.setCPULoad(50)
.setMemoryLoad(1024)
.groupBy(new Fields("word!"))
.persistentAggregate(new MemoryMapState.Factory(), new Count(), new Fields("count"))
.setCPULoad(100)
.setMemoryLoad(2048);
Resources can be set for each operation (except for grouping, shuffling, partitioning). Operations that are combined by Trident into single Bolts will have their resources summed.
Every Bolt is given at least the default resources, regardless of user settings.
In the above case, we end up with
Split
and BangAdder
The API can be called as many times as is desired.
It may be called after every operation, after some of the operations, or used in the same manner as parallelismHint()
to set resources for a whole section.
Resource declarations have the same boundaries as parallelism hints. They don't cross any groupings, shufflings, or any other kind of repartitioning.