User Configuration¶
CP Algorithms are configured primarily through YAML configuration files, which provide a declarative way to specify analysis settings. Python configuration is also available for integration into larger Athena jobs where programmatic control is needed.
For more information on the various blocks you can check:
- the beginners tutorial
- the TopCPToolKit documentation
- the actual blocks in the repository (look for *Config.py files)
Example YAML Configuration¶
Here is a simple YAML configuration that sets up a typical analysis with muons, electrons, jets, missing transverse energy (MET), and overlap removal:
CommonServices: {}
PileupReweighting: {}
EventCleaning:
  runEventCleaning: True
Muons:
  - containerName: AnaMuons
    WorkingPoint:
      - selectionName: medium
        quality: Medium
        isolation: Loose_VarRad
Electrons:
  - containerName: AnaElectrons
    WorkingPoint:
      - selectionName: loose
        identificationWP: LooseBLayerLH
        isolationWP: Tight_VarRad
Jets:
  - containerName: AnaJets
    jetCollection: AntiKt4EMPFlowJets
    JVT: {}
MissingET:
  - containerName: AnaMET
    jets: AnaJets
    electrons: AnaElectrons.loose
    muons: AnaMuons.medium
OverlapRemoval:
  inputLabel: preselectOR
  outputLabel: passesOR
  jets: AnaJets.baselineJvt
  electrons: AnaElectrons.loose
  muons: AnaMuons.medium
Output:
  treeName: analysis
  containers:
    mu_: AnaMuons
    el_: AnaElectrons
    jet_: AnaJets
    met_: AnaMET
    "": EventInfo
Configuration Structure¶
Top-Level Sections¶
Each top-level key in the YAML file corresponds to a configuration block (ConfigBlock). Configuration blocks are the building blocks of CP Algorithm configuration, each handling a specific aspect of the analysis (e.g., muon reconstruction, jet calibration, overlap removal).
Some sections are defined as lists, even though we expect most users to define just one entry. The list form means that if you are unsure about a setting, e.g. which muon calibration mode to use, you can define a second muon container as a second list entry, get both in the same n-tuple, and directly compare the results.
Sometimes you want to use all options at their default value. The YAML
way of doing so is to add {} on the same line. E.g. above we use a
default-configured CommonServices block like this:
CommonServices: {}
Note that there are a couple of blocks you'll always need:
- CommonServices: this defines the services needed for systematics and selection tracking
- PileupReweighting: there may be some situations in which it can be skipped, but in general you will need it to set up the RandomRunNumber on Monte Carlo
- if you want to calculate MissingET you will always need Jets and Muons, just based on how MissingET is calculated (at least the calorimeter-based MET)
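The dependencies listed above can be sketched as a small pre-flight check. This is only an illustration, not part of CP Algorithms: the `config` dictionary stands in for whatever a YAML parser would return for your file, and `missing_blocks` is a hypothetical helper name. It also treats PileupReweighting as always required, although the text notes it can sometimes be skipped.

```python
# Illustrative only: `config` mimics the dictionary a YAML parser would
# return for a configuration file; it is not produced by CP Algorithms.
config = {
    "CommonServices": {},
    "PileupReweighting": {},
    "Muons": [{"containerName": "AnaMuons"}],
    "Jets": [{"containerName": "AnaJets", "jetCollection": "AntiKt4EMPFlowJets"}],
    "MissingET": [{"containerName": "AnaMET", "jets": "AnaJets", "muons": "AnaMuons"}],
}


def missing_blocks(config):
    """Return the blocks described above as needed that are absent from `config`.

    Hypothetical helper: PileupReweighting is treated as always required,
    which is a simplification of the rule stated in the text.
    """
    required = ["CommonServices", "PileupReweighting"]
    # MissingET relies on jets and muons, so require those blocks as well.
    if "MissingET" in config:
        required += ["Jets", "Muons"]
    return [name for name in required if name not in config]


print(missing_blocks(config))  # []
```

A check like this only looks at top-level keys; it does not verify that the containers referenced inside MissingET actually exist.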
Container Names¶
Container names (e.g., AnaMuons, AnaElectrons, AnaJets) are user-defined
identifiers for object collections. These names are used to:
- Reference objects in subsequent configuration blocks
- Identify containers in the output tree
- Connect related objects (e.g., linking electrons to MET calculation)
Working Points¶
Working points define named selections with specific quality and isolation
criteria. Each working point has a selectionName that can be used to reference
the selection later:
Muons:
  - containerName: AnaMuons
    WorkingPoint:
      - selectionName: medium
        quality: Medium
        isolation: Loose_VarRad
      - selectionName: tight
        quality: Tight
        isolation: Tight_VarRad
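To see which names the configuration above makes available for later reference, the following sketch walks the same structure as a plain Python dictionary and lists one ContainerName.selectionName string per working point. The `muons` data mirrors the YAML shown here; `selection_references` is a hypothetical helper and not a CP Algorithms function.

```python
# Plain-Python mirror of the Muons section above (illustrative data only).
muons = [
    {
        "containerName": "AnaMuons",
        "WorkingPoint": [
            {"selectionName": "medium", "quality": "Medium", "isolation": "Loose_VarRad"},
            {"selectionName": "tight", "quality": "Tight", "isolation": "Tight_VarRad"},
        ],
    }
]


def selection_references(entries):
    """Build 'ContainerName.selectionName' strings for every working point.

    Hypothetical helper for illustration; entries without a WorkingPoint
    list simply contribute nothing.
    """
    refs = []
    for entry in entries:
        for working_point in entry.get("WorkingPoint", []):
            refs.append(f"{entry['containerName']}.{working_point['selectionName']}")
    return refs


print(selection_references(muons))  # ['AnaMuons.medium', 'AnaMuons.tight']
```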
Object References¶
In some cases you need to refer to selected or "good" objects, e.g. to specify
the objects to run MET, OR, or event selection on. These are usually
referenced using the syntax ContainerName.selectionName:
- AnaElectrons.loose - electrons passing the "loose" working point
- AnaMuons.medium - muons passing the "medium" working point
- AnaJets.baselineJvt - jets passing baseline JVT selection (default selection name in the JVT block)
When no selection is specified (e.g., just AnaJets), all objects in
the container are used. For more details see the page on
selections.
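As a rough illustration of how such a reference breaks down, the sketch below splits a string into its container and selection parts. `split_reference` is a hypothetical name used only here, and the sketch covers just the simple ContainerName.selectionName form described above; any richer selection syntax is outside its scope.

```python
def split_reference(reference):
    """Split 'ContainerName.selectionName' into (container, selection).

    Hypothetical helper: returns None for the selection when the reference
    names only a container, meaning all objects in it are used.
    """
    container, _, selection = reference.partition(".")
    return container, (selection or None)


print(split_reference("AnaElectrons.loose"))  # ('AnaElectrons', 'loose')
print(split_reference("AnaJets"))             # ('AnaJets', None)
```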
Output Section¶
The Output section configures the output n-tuple tree. For
comprehensive documentation on all output options, systematics handling
in output, MET variable configuration, and how to write config blocks
that register output variables, see the N-Tuple
Output page.
Basic example:
Output:
  treeName: analysis
  containers:
    mu_: AnaMuons
    el_: AnaElectrons
    jet_: AnaJets
    met_: AnaMET
    "": EventInfo
Configuration blocks automatically register output variables as they are
configured, so many variables will be written without explicit
configuration. You can use the commands option to enable, disable, or
rename specific variables, and the vars option to add variables not
automatically registered. See N-Tuple Output for
details.
Expert Options¶
Some advanced options are marked as "expert mode" and require explicit opt-in. These options are typically for CP studies, cross-checks, or unusual configurations that should not be used in standard physics analyses.
To enable expert mode options, set enableExpertMode: True in CommonServices:
CommonServices:
  enableExpertMode: True
When expert mode is not enabled and you try to use an expert option, an error will be raised. When expert mode is enabled, a warning will still be generated for each expert option used.
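The behaviour described above (an error without opt-in, a warning with it) can be pictured with a minimal sketch. This is not the actual CP Algorithms implementation: `check_expert_option`, the exception type, and the message texts are all assumptions made for illustration.

```python
import warnings


def check_expert_option(option_name, expert_mode_enabled):
    """Mimic the gating described above for a single expert option.

    Hypothetical helper: the real framework may use different exception
    types and messages.
    """
    if not expert_mode_enabled:
        # Without the opt-in, using an expert option is an error.
        raise ValueError(
            f"Option '{option_name}' is expert-only; "
            "set enableExpertMode: True in CommonServices to use it"
        )
    # With the opt-in, the option is accepted but still flagged.
    warnings.warn(f"Expert option '{option_name}' is in use", UserWarning)


# With expert mode enabled this emits a warning and continues.
check_expert_option("propertyOverrides", expert_mode_enabled=True)
```

Calling the same function with `expert_mode_enabled=False` raises instead of warning.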
Property Overrides¶
The propertyOverrides option allows directly setting properties on the
underlying algorithms and tools. This is an expert-only feature and should be
used sparingly:
Muons:
  - containerName: AnaMuons
    propertyOverrides:
      MuonCalibrationAndSmearingAlg.calibrationAndSmearingTool.calibMode: 3
The format is AlgorithmName.toolName.propertyName: value. Note that this
bypasses the normal configuration system, so other algorithms that depend on
the overridden value will not be updated accordingly.
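To make the key format concrete, the sketch below splits an override key into its three parts. `split_override` is a hypothetical helper, not something the framework provides, and it accepts only the three-part AlgorithmName.toolName.propertyName form stated above; whether other key shapes are valid is not covered here.

```python
def split_override(key):
    """Split 'AlgorithmName.toolName.propertyName' into its three parts.

    Hypothetical helper: rejects anything that is not exactly three
    dot-separated names, since only that form is described in the text.
    """
    parts = key.split(".")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"Expected AlgorithmName.toolName.propertyName, got '{key}'")
    algorithm, tool, prop = parts
    return algorithm, tool, prop


key = "MuonCalibrationAndSmearingAlg.calibrationAndSmearingTool.calibMode"
print(split_override(key))
# ('MuonCalibrationAndSmearingAlg', 'calibrationAndSmearingTool', 'calibMode')
```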
Warning
Property overrides should only be used when no regular option exists for the setting you need to change. If you find yourself needing a property override regularly, consider requesting a proper option be added to the configuration block.
Python Configuration¶
Python configuration is available for integration into larger Athena jobs where
programmatic control is needed. The basic pattern uses ConfigFactory to create
configuration blocks and ConfigSequence to assemble them:
from AnalysisAlgorithmsConfig.ConfigFactory import ConfigFactory
from AnalysisAlgorithmsConfig.ConfigSequence import ConfigSequence
configSeq = ConfigSequence()
factory = ConfigFactory()
# Add muon configuration
subConfig = factory.makeConfig('Muons')
subConfig.setOptionValue('.containerName', 'AnaMuons')
configSeq += subConfig
# Add muon working point
subConfig = factory.makeConfig('Muons.WorkingPoint')
subConfig.setOptionValue('.containerName', 'AnaMuons')
subConfig.setOptionValue('.selectionName', 'medium')
subConfig.setOptionValue('.quality', 'Medium')
subConfig.setOptionValue('.isolation', 'Loose_VarRad')
configSeq += subConfig
For a complete real-world example of Python configuration, see the PHYSLITE derivation.
Notes from the Developers¶
We redesigned the configuration with the goal of making a text file the primary configuration format, as that was the predominant configuration style for a number of popular frameworks at the time. Unlike most of those frameworks, though, we settled on YAML instead of a completely free-form, custom file format. A number of design choices in the configuration system were made with that choice in mind.
We originally designed the configuration file by first deciding how we wanted that file to look. Only then did we look at how to map that onto the configuration mechanisms and algorithm structure, and what adjustments to the original design were necessary. Where possible that is the preferred direction, but it is more work than simply wrapping your algorithms into blocks and accepting whatever configuration layout results from that. In practice, though, both approaches can yield the same result, particularly if you follow existing examples.
The Python configuration is less fully developed than the text configuration, in part because we only expected it to be used in some edge cases. It may be worth completely redesigning the interface.