Component Accumulator
Purpose of the Component Accumulator
The ComponentAccumulator is a container for storing the configuration of components. It may contain the configuration of multiple components that form a consistent set (e.g. an algorithm and the tools/services it needs for execution). The job configuration is built by merging several ComponentAccumulators.
...
topCA.merge(alg1CA)
topCA.merge(alg2CA)
...
Deduplication
During merging it may occur that ComponentAccumulators (here topCA and alg1CA) contain the configuration of the same component.
The ComponentAccumulator has functionality to unify their settings. So you don't need to worry about adding a service/tool twice. On the contrary, ComponentAccumulator instances are supposed to be as much as possible self-contained, so it is recommended to add services/tools needed for an algorithm to work to the ComponentAccumulator and not to rely on the user to add what is necessary.
The settings unification process is called deduplication and is applied to every component in the merged ComponentAccumulators. It works as follows:
- Components that have the same type but different names are left intact (i.e. both instances are kept), as this is the regular case.
- Components that have the same type, the same name and identical properties are silently ignored, i.e. the duplicate is dropped.
- Components that have the same type, the same name and differently set properties are subject to the unification process. Unification is attempted for every property that differs. It relies on semantics that can be defined separately for each configurable parameter. By default, though, any difference is considered a mistake in the configuration and results in a configuration failure.
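The three rules above can be sketched in plain Python. This is a toy model only, not the Athena implementation: the names Component, deduplicate, unify_default and unify_merge_lists are hypothetical, and the sketch assumes that semantics are looked up by property name and that a property set on only one side is simply taken over.

```python
from dataclasses import dataclass, field


class DeduplicationFailure(Exception):
    """Toy stand-in for the error raised when two settings cannot be unified."""


@dataclass
class Component:
    type_name: str
    name: str
    props: dict = field(default_factory=dict)


def unify_default(prop, old, new):
    # Default semantics: any difference is treated as a configuration error.
    raise DeduplicationFailure(f"{prop}: {old!r} != {new!r}")


def unify_merge_lists(prop, old, new):
    # Example of a custom semantics: merge lists, keep order, drop repeats.
    return old + [x for x in new if x not in old]


def deduplicate(existing, new, semantics=None):
    """Add `new` to the list `existing`, following the three rules above."""
    semantics = semantics or {}
    for comp in existing:
        if (comp.type_name, comp.name) != (new.type_name, new.name):
            continue  # rule 1: a different component, keep looking
        for prop, value in new.props.items():
            if prop not in comp.props:
                comp.props[prop] = value  # assumption: take over unset properties
            elif comp.props[prop] != value:
                unify = semantics.get(prop, unify_default)
                comp.props[prop] = unify(prop, comp.props[prop], value)
        return existing  # rules 2 and 3: no second copy is appended
    existing.append(new)
    return existing
```

With this sketch, adding Component("Svc", "A", {"x": 1}) twice leaves a single entry, while adding it again with {"x": 2} raises DeduplicationFailure unless a unification function is registered for "x".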
Writing Configuration Methods
Methods that configure pieces of the job instantiate a ComponentAccumulator and add to it services, algorithms, etc. as needed. They can call other configuration methods (to obtain the configuration of components they depend on) and merge the result with their own ComponentAccumulator.
As parameters they always take a configuration flags container as the first argument and potentially other args and kwargs (positional and keyword arguments), as discussed in the naming conventions mentioned earlier.
A typical implementation looks like this:
from AthenaConfiguration.ComponentAccumulator import ComponentAccumulator
from AthenaConfiguration.ComponentFactory import CompFactory

def MyAlgoCfg(flags, name="MyAlgo", **kwargs):
    acc = ComponentAccumulator()
    # Call the config method of a service that we need:
    from ARequiredSvcPack.ARequiredSvcPackConfig import SuperServiceCfg
    # We get an accumulator containing possibly other components that our SuperService
    # depends upon and the service itself
    svcAcc = SuperServiceCfg(flags)
    # SuperService is the primary component that SuperServiceConfig configures
    # Get it, so we can attach it to the ServiceHandle of an algorithm
    kwargs.setdefault("svcHandle", svcAcc.getPrimary())
    # Merge its accumulator with our accumulator to absorb all required dependencies
    acc.merge(svcAcc)
    # NOTE: A shorthand for the above three lines is (more details below)
    # kwargs.setdefault("svcHandle", acc.getPrimaryAndMerge(SuperServiceCfg(flags)))
    # Set additional properties
    kwargs.setdefault("isData", not flags.Input.isMC)
    # instantiate algorithm configuration object setting its properties
    acc.addEventAlgo(CompFactory.MyAlgo(name, **kwargs))
    # Return our accumulator (containing SuperService and its dependencies)
    return acc
Please note that, as per the naming conventions, the function is named MyAlgoCfg after the main component it configures (the algorithm MyAlgo in this case).
The first argument passed is flags (as always), the second is a name (which can be defaulted and potentially overridden by the user) and, since this function configures one main component, we also pass kwargs that set the properties of this MyAlgo.
ComponentAccumulators with private tools
Private AlgTools are special because they can't exist without a parent. There is no meaningful way of accumulating them elsewhere.
However, a configuration method that configures a private AlgTool with all its dependencies, yet without a parent, is still a valid use case.
To allow this, the ComponentAccumulator class has the methods setPrivateTools and popPrivateTools.
A function configuring an AlgTool returns an instance of ComponentAccumulator that has the tool attached via setPrivateTools and contains all the auxiliary components (services, conditions algorithms, etc.) that the tool needs to work.
The caller then obtains the private tool via the popPrivateTools method, assigns it to the PrivateToolHandle of the parent, and merges the returned ComponentAccumulator with its own ComponentAccumulator.
This works for a single AlgTool as well as for lists of AlgTools that are typically assigned to a PrivateToolHandleArray.
Merging a ComponentAccumulator that still has private tools attached (e.g. because popPrivateTools was never called) will raise an exception complaining about a dangling private tool.
The ComponentAccumulator provides a handy shortcut method popToolsAndMerge (and getPrimaryAndMerge, see the next section) that does the aforementioned two operations at once.
The example below illustrates how this works:
def PrivateToolCfg(flags, **kwargs):
    acc = ComponentAccumulator()
    ...
    # merge/add dependencies
    acc.addService(CompFactory.SomeService(...))
    kwargs.setdefault("PropertyA", 1.0)
    # it is recommended not to give a custom name to the private tool
    acc.setPrivateTools(CompFactory.ToolA(**kwargs))
    return acc

def AlgorithmCfg(flags, **kwargs):
    acc = ComponentAccumulator()
    tool = acc.popToolsAndMerge(PrivateToolCfg(flags))
    # or the longer alternative:
    # toolAcc = PrivateToolCfg(flags)
    # tool = toolAcc.popPrivateTools()
    # acc.merge(toolAcc)
    kwargs.setdefault("Tool", tool)
    acc.addEventAlgo(CompFactory.Alg("MyAlg", **kwargs))
    return acc
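To make the set/pop/merge contract concrete without an Athena installation, here is a minimal stand-in. ToyAccumulator and tool_cfg are hypothetical, components are plain strings, and the use of RuntimeError for the dangling-tool case is an assumption of the sketch; the real class may raise a different exception type.

```python
class ToyAccumulator:
    """Hypothetical stand-in, not the Athena ComponentAccumulator."""

    def __init__(self):
        self.services = []
        self._private_tools = None

    def setPrivateTools(self, tools):
        # Temporarily attach a tool (or a list of tools).
        self._private_tools = tools

    def popPrivateTools(self):
        # Hand the tools to the caller and detach them from the accumulator.
        tools, self._private_tools = self._private_tools, None
        return tools

    def merge(self, other):
        if other._private_tools is not None:
            # Assumed exception type; the text only says "an exception".
            raise RuntimeError("dangling private tool(s) in merged accumulator")
        self.services += [s for s in other.services if s not in self.services]

    def popToolsAndMerge(self, other):
        # The two-step pattern (pop, then merge) in one call.
        tools = other.popPrivateTools()
        self.merge(other)
        return tools


def tool_cfg():
    acc = ToyAccumulator()
    acc.services.append("SomeService")  # dependency of the tool
    acc.setPrivateTools("ToolA")
    return acc


parent = ToyAccumulator()
tool = parent.popToolsAndMerge(tool_cfg())
```

After the last line, tool holds the private tool and parent has absorbed the dependency; calling parent.merge(tool_cfg()) directly would fail in this sketch because the tool was never popped.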
Designating the primary components
When a ComponentAccumulator is the result of merging several smaller pieces of configuration, it may be useful to designate the component that is the primary concern of a configuration function. This way client code does not need to discover the component by name.
The primary component may also change depending on the flags. In this situation, too, it is convenient to shield the client code by designating the primary component. Such an example is shown below:
def ToolCfg(flags):
    # Here we are not passing kwargs, since it is not obvious what the primary component will be
    acc = ComponentAccumulator()
    # this adds some public tools
    acc.merge(OtherToolsCfg(flags))
    # not a primary component
    acc.addPublicTool(CompFactory.ToolX("X", setting=...))
    # configure different tools depending on the flag value
    # here we will designate the primary component
    if flags.addA:
        acc.addPublicTool(CompFactory.ToolA("ToolA", settingX=...), primary=True)
    else:
        acc.addPublicTool(CompFactory.ToolA("ToolB", settingY=...), primary=True)
    return acc

def ConsumerCfg(flags, **kwargs):
    acc = ComponentAccumulator()
    # instead of code like this:
    # toolAcc.getPublicTool("ToolA" if flags.addA else "ToolB")
    # we can do the following (no need to agree on the name of the tool
    # configured by ToolCfg):
    tool = acc.getPrimaryAndMerge(ToolCfg(flags))
    # it is possible to do the above in steps:
    # tempAcc = ToolCfg(flags)
    # tool = tempAcc.getPrimary()
    # acc.merge(tempAcc)
    # however the shortcut method getPrimaryAndMerge is provided for your convenience
    kwargs.setdefault("ToolX", tool)
    acc.addEventAlgo(CompFactory.MyAlg("MyAlg", **kwargs))
    return acc
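The idea of a primary component can be sketched as follows. ToyTool, ToyAccumulator and tool_cfg are hypothetical names, and the choice that merge takes over the tools but not the primary marker is a simplification made for this sketch, not a statement about Athena.

```python
from dataclasses import dataclass


@dataclass
class ToyTool:
    type_name: str
    name: str


class ToyAccumulator:
    """Hypothetical stand-in, not the Athena ComponentAccumulator."""

    def __init__(self):
        self.public_tools = {}
        self._primary = None

    def addPublicTool(self, tool, primary=False):
        self.public_tools[tool.name] = tool
        if primary:
            self._primary = tool

    def getPrimary(self):
        return self._primary

    def merge(self, other):
        # Sketch-level choice: tools are taken over, the primary marker is not.
        self.public_tools.update(other.public_tools)

    def getPrimaryAndMerge(self, other):
        primary = other.getPrimary()
        self.merge(other)
        return primary


def tool_cfg(add_a):
    acc = ToyAccumulator()
    acc.addPublicTool(ToyTool("ToolX", "X"))  # not a primary component
    name = "ToolA" if add_a else "ToolB"      # the name depends on the flag
    acc.addPublicTool(ToyTool("ToolA", name), primary=True)
    return acc


consumer = ToyAccumulator()
tool = consumer.getPrimaryAndMerge(tool_cfg(add_a=False))
```

The consumer never mentions "ToolA" or "ToolB"; it receives whichever tool the configuration function designated.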
Caching of configuration results
Configuration methods that are called many times may profit from caching their result:
from AthenaConfiguration.AccumulatorCache import AccumulatorCache
@AccumulatorCache
def MyAlgoCfg(flags, name="MyAlgo", **kwargs):
    ...
This caches the result of MyAlgoCfg (similar to Python's lru_cache) if the function is called with the same flags (and other parameters) multiple times. The decorator has a few (mostly expert) options that are documented in AccumulatorCache. It can also print the cache hit/miss statistics via:
from AthenaConfiguration.AccumulatorCache import AccumulatorDecorator
AccumulatorDecorator.printStats()
Warning
Do not blindly apply this decorator. Only use it for methods that are known to be hot spots.
There is some more discussion about how to find hotspots which could benefit from caching in "Profiling and optimising configuration" later.
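For readers unfamiliar with the lru_cache comparison made above, the following standard-library sketch shows the behaviour being alluded to. It uses functools.lru_cache, not AccumulatorCache, and my_algo_cfg together with the tuple-based flags is a hypothetical stand-in chosen because lru_cache requires hashable arguments.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def my_algo_cfg(flags, name="MyAlgo"):
    # Stand-in for an expensive configuration function.
    return {"name": name, "isData": not dict(flags)["isMC"]}


flags = (("isMC", True),)  # hashable stand-in for a locked flags container

first = my_algo_cfg(flags)                 # miss: the function body runs
second = my_algo_cfg(flags)                # hit: the cached result is returned
third = my_algo_cfg(flags, name="Other")   # miss: different arguments

info = my_algo_cfg.cache_info()            # hit/miss statistics
```

Note that lru_cache hands back the very same object on a hit (first is second), so a caller that mutates the result affects every later caller. That is one reason caching configuration results needs care.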
ComponentAccumulator API
The ComponentAccumulator has the following methods to add components to it:
merge(other, sequenceName=None)
Merge in another instance of ComponentAccumulator. Deduplication is applied. All algorithms from the other top sequence are added to the destination sequence if the second argument is provided. Otherwise the sequence structure is merged.
addSequence(sequence, parentName=None)
Add a sequence, by default to the top sequence of the accumulator. If the second argument is provided, the sequence is added as a subsequence of the sequence with that name. Handy methods to create various types of sequences (parallel/serial with AND/OR logic) are defined in CFElements.
addEventAlgo(algo, sequenceName=None, primary=False)
Add one event-processing algorithm, by default to the top sequence of the accumulator. If the sequenceName argument is provided, the algorithm is added to this sequence.
addCondAlgo(algo, primary=False)
Add one conditions-processing algorithm. Subject to deduplication.
addService(newSvc, primary=False, create=False)
Add one service. Subject to deduplication. If create is set, the service is added to the set of services forcibly created by Athena early in the job, even without any client requiring it.
addPublicTool(tool, primary=False)
Add one public tool. Subject to deduplication. Note: Public tools are deprecated for run 3. This feature will be removed.
setPrivateTools(tool or list of tools)
Temporarily attach a private AlgTool (or a list of private AlgTools) to the accumulator. They need to be removed before merging.
For an explanation of the primary option see above.
Exceptions of type ConfigurationError, DeduplicationFailure or plain TypeError are raised in case of misuse of these methods.
The ComponentAccumulator can be queried with these methods:
getEventAlgo(name)
Get an event-processing algorithm by name.
getEventAlgos(seqName=None)
Get all event algorithms (if a sequence name is provided, all algorithms in this and nested sequences).
getCondAlgo(name)
Get a conditions-processing algorithm by name.
getService(name)
Get a service by name.
getPublicTool(name)
Get a public tool by name. Note: Public tools are deprecated for run 3. This feature will be removed.
getSequence(SequenceName=None)
Returns a sequence (by searching the tree of sequences). By default returns the top sequence of the accumulator.
popPrivateTools()
Returns the AlgTool or list of AlgTools previously attached to the accumulator.
getPrimary()
Returns the component that is designated to be the primary one (see above for explanation).
Additional methods are available for use in top-level scripts for running the configuration contained in the ComponentAccumulator:
run(maxEvents=None, OutputLevel=INFO)
Starts the Athena execution.
store(outfile)
Saves the configuration in the Python pickle format to a file (the file needs to be opened with open and closed after the invocation of store).
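The file handling that store expects can be illustrated with the standard pickle module alone. In this sketch the dictionary config is a hypothetical stand-in for the stored configuration; with a real accumulator the pickle.dump line would be replaced by the call acc.store(outfile). Pickle data is binary, so the file is opened in "wb" mode, and the with block takes care of closing it.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for the configuration held by an accumulator.
config = {"MyAlgo": {"isData": True}}

path = os.path.join(tempfile.mkdtemp(), "config.pkl")

# Open in binary mode; the file is closed when the with block ends.
with open(path, "wb") as outfile:
    pickle.dump(config, outfile)  # with Athena: acc.store(outfile)

# Reading the file back gives an equal object.
with open(path, "rb") as infile:
    restored = pickle.load(infile)
```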
Content of the configuration can be printed with:
printConfig(withDetails=False, summariseProps=False, onlyComponents = [], printDefaults=False, printComponentsOnly=False)
The various flags define the level of detail printed by this function. Their meaning should be obvious from their names.
Running configuration stored in ComponentAccumulator / self testing
The top level configuration file would need to:
- set up flags:
  - import the flags (see code around the initConfigFlags function below),
  - change their values as desired (typically set the input file),
  - possibly add some new flags to make your job configurable from the command line (see code around RunThis/RunThat below),
  - update flag values from the command line (see fillFromArgs()),
  - lock the flags,
- set up main services (see MainServicesCfg),
- add the components you need,
- run this configuration by calling acc.run(),
- handle possible execution errors (basically check the StatusCode returned from Athena).
A "Hello World" example can be found in Athena HelloAlg example config.
Coincidentally, a self-test for a configuration fragment can be set up in an identical way.
It is advised that such a test is part of each file defining a function that generates configuration. The only difference is that the setup has to be wrapped in the if __name__ == "__main__": clause, which prevents it from being executed when the file is imported as a module.
If the configuration contains algorithms a short test job can be run.
Configurations without the algorithms can still be tested in this way to some extent.
# assume this is the content of MyAlgConfig.py
from AthenaConfiguration.ComponentAccumulator import ComponentAccumulator

def MyAlgCfg(flags):
    acc = ComponentAccumulator()
    ...
    return acc

if __name__ == "__main__":  # typically not needed in top level script
    # import the flags and set them
    from AthenaConfiguration.AllConfigFlags import initConfigFlags
    flags = initConfigFlags()
    # potentially add flags that can be modified via the command line
    # and that make this script more universal
    flags.addFlag("RunThis", False)
    flags.addFlag("RunThat", False)
    ...
    flags.Exec.MaxEvents = 3
    ...
    # use one of the predefined files
    from AthenaConfiguration.TestDefaults import defaultTestFiles
    flags.Input.Files = defaultTestFiles.RAW
    flags.fillFromArgs()  # make the job understand command line options
    # lock the flags
    flags.lock()
    # create basic infrastructure
    from AthenaConfiguration.MainServicesConfig import MainServicesCfg
    acc = MainServicesCfg(flags)
    # add the algorithm to the configuration
    acc.merge(MyAlgCfg(flags))
    # or make it conditional on your flags:
    if flags.RunThis:
        pass  # ... add/do something
    if flags.RunThat:
        pass  # ... add/do something else
    # debug printout
    acc.printConfig(withDetails=True, summariseProps=True)
    # run the job
    status = acc.run()
    # report the execution status (0 ok, else error)
    import sys
    sys.exit(not status.isSuccess())
This script is then runnable with the command:
python -m PackageName.MyAlgConfig
It can also be registered as a unit test in CMakeLists.txt:
atlas_add_test( MyAlgConfig
                SCRIPT python -m PackageName.MyAlgConfig
                POST_EXEC_SCRIPT noerror.sh )
# another variant of the test differing via flag
atlas_add_test( MyAlgConfigRunThat
                SCRIPT python -m PackageName.MyAlgConfig RunThat=1
                POST_EXEC_SCRIPT noerror.sh )
The functionality to interpret command line options as flags, flags.fillFromArgs(), is documented separately.
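To give a feel for what an argument such as RunThat=1 amounts to, here is a toy parser for key=value strings. It is not the fillFromArgs implementation, which supports more than this: fill_from_args is a hypothetical helper, the flags are held in a plain dict, and the use of ast.literal_eval to interpret values is an assumption of the sketch.

```python
import ast


def fill_from_args(flags, argv):
    """Update the dict `flags` in place from `key=value` strings."""
    for arg in argv:
        key, sep, raw = arg.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {arg!r}")
        if key not in flags:
            raise KeyError(f"unknown flag {key!r}")
        try:
            # Numbers, booleans, lists and quoted strings.
            value = ast.literal_eval(raw)
        except (ValueError, SyntaxError):
            value = raw  # fall back to the plain string
        flags[key] = value
    return flags


flags = {"RunThis": False, "RunThat": False, "Input.Files": []}
fill_from_args(flags, ["RunThat=1", "Input.Files=['a.root']"])
```

Rejecting unknown keys mirrors the requirement that a flag such as RunThat must be added before the command line is processed.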
More examples of top level applications: Run3DQTestingDriver.py, RecoSteering.py.