User Configuration¶

CP Algorithms are configured primarily through YAML configuration files, which provide a declarative way to specify analysis settings. Python configuration is also available for integration into larger Athena jobs where programmatic control is needed.

For more information on the various blocks you can check:

the beginners tutorial
the TopCPToolKit documentation
the actual blocks in the repository (look for *Config.py files)

Example YAML Configuration¶

Here is a simple YAML configuration that sets up a typical analysis with muons, electrons, jets, missing transverse energy (MET), and overlap removal:

CommonServices: {}

PileupReweighting: {}

EventCleaning:
  runEventCleaning: True

Muons:
  - containerName: AnaMuons
    WorkingPoint:
      - selectionName: medium
        quality: Medium
        isolation: Loose_VarRad

Electrons:
  - containerName: AnaElectrons
    WorkingPoint:
      - selectionName: loose
        identificationWP: LooseBLayerLH
        isolationWP: Tight_VarRad

Jets:
  - containerName: AnaJets
    jetCollection: AntiKt4EMPFlowJets
    JVT: {}

MissingET:
  - containerName: AnaMET
    jets: AnaJets
    electrons: AnaElectrons.loose
    muons: AnaMuons.medium

OverlapRemoval:
  inputLabel: preselectOR
  outputLabel: passesOR
  jets: AnaJets.baselineJvt
  electrons: AnaElectrons.loose
  muons: AnaMuons.medium

Output:
  treeName: analysis
  containers:
    mu_: AnaMuons
    el_: AnaElectrons
    jet_: AnaJets
    met_: AnaMET
    "": EventInfo

Configuration Structure¶

Top-Level Sections¶

Each top-level key in the YAML file corresponds to a configuration block (ConfigBlock). Configuration blocks are the building blocks of CP Algorithm configuration, each handling a specific aspect of the analysis (e.g., muon reconstruction, jet calibration, overlap removal).

Some sections are defined as lists, even though we expect most users to define just one entry. Using a list means that if you are unsure about a setting, e.g. which muon calibration mode to use, you can define a second muon container as a second list entry, get both in the same n-tuple, and directly compare the results.

Sometimes you want to use all options at their default values. The YAML way of doing so is to add {} on the same line. For example, above we use a default-configured CommonServices block like this:

CommonServices: {}

Note that there are a couple of blocks you'll always need:

CommonServices: this defines the services needed for systematics and selection tracking
PileupReweighting: there may be some situations in which it can be skipped, but in general you will need it to set up the RandomRunNumber on Monte Carlo
Jets and Muons: if you want to calculate MissingET you will always need both, simply because of how MissingET is calculated (at least the calorimeter-based MET)
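
The rules in this list can be sketched as a small pre-flight check on a parsed configuration. This is only an illustration: missing_blocks is a hypothetical helper, not part of the CP Algorithms code base, and it assumes the YAML file has already been loaded into a plain Python dict. Since PileupReweighting can be skipped in some situations, the result is best read as advice rather than as a hard error.

```python
# Hypothetical helper for illustration; not part of the CP Algorithms code base.
REQUIRED_BLOCKS = ("CommonServices", "PileupReweighting")
MET_PREREQUISITES = ("Jets", "Muons")


def missing_blocks(config):
    """Return the block names that the rules above ask for but that are absent."""
    missing = [name for name in REQUIRED_BLOCKS if name not in config]
    # MissingET needs Jets and Muons to be configured as well.
    if "MissingET" in config:
        missing += [name for name in MET_PREREQUISITES if name not in config]
    return missing


# A configuration (already parsed into a dict) that lacks two blocks.
config = {
    "CommonServices": {},
    "MissingET": [{"containerName": "AnaMET"}],
    "Jets": [{"containerName": "AnaJets"}],
}
print(missing_blocks(config))  # ['PileupReweighting', 'Muons']
```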

Container Names¶

Container names (e.g., AnaMuons, AnaElectrons, AnaJets) are user-defined identifiers for object collections. These names are used to:

Reference objects in subsequent configuration blocks
Identify containers in the output tree
Connect related objects (e.g., linking electrons to MET calculation)

Working Points¶

Working points define named selections with specific quality and isolation criteria. Each working point has a selectionName that can be used to reference the selection later:

Muons:
  - containerName: AnaMuons
    WorkingPoint:
      - selectionName: medium
        quality: Medium
        isolation: Loose_VarRad
      - selectionName: tight
        quality: Tight
        isolation: Tight_VarRad
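
To make the naming concrete, the sketch below mirrors the YAML above as plain Python data and lists the references that the two working points make available, using the ContainerName.selectionName form described under Object References. The function selection_references is a hypothetical helper written for this page, not something the framework provides.

```python
# Plain-Python mirror of the Muons YAML section above.
muons = [
    {
        "containerName": "AnaMuons",
        "WorkingPoint": [
            {"selectionName": "medium", "quality": "Medium", "isolation": "Loose_VarRad"},
            {"selectionName": "tight", "quality": "Tight", "isolation": "Tight_VarRad"},
        ],
    }
]


def selection_references(entries):
    """List the ContainerName.selectionName strings defined by the working points."""
    refs = []
    for entry in entries:
        for working_point in entry.get("WorkingPoint", []):
            refs.append(f"{entry['containerName']}.{working_point['selectionName']}")
    return refs


print(selection_references(muons))  # ['AnaMuons.medium', 'AnaMuons.tight']
```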

Object References¶

In some cases you need to refer to selected or "good" objects, e.g. to specify the objects on which to run MET, overlap removal (OR), or event selection. These are usually referenced using the syntax ContainerName.selectionName:

AnaElectrons.loose - electrons passing the "loose" working point
AnaMuons.medium - muons passing the "medium" working point
AnaJets.baselineJvt - jets passing baseline JVT selection (default selection name in the JVT block)

When no selection is specified (e.g., just AnaJets), all objects in the container are used. For more details see the page on selections.
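
As a minimal sketch of this syntax, the hypothetical helper below splits a reference into its container and selection parts, returning an empty selection when only the container is given. It illustrates the naming convention only; the framework's own parsing is not shown here and may handle more cases.

```python
def split_reference(reference):
    """Split 'Container.selection' into (container, selection).

    The selection is an empty string when the reference names only a container.
    """
    container, _, selection = reference.partition(".")
    return container, selection


print(split_reference("AnaMuons.medium"))  # ('AnaMuons', 'medium')
print(split_reference("AnaJets"))          # ('AnaJets', '')
```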

Output Section¶

The Output section configures the output n-tuple tree. For comprehensive documentation on all output options, systematics handling in output, MET variable configuration, and how to write config blocks that register output variables, see the N-Tuple Output page.

Basic example:

Output:
  treeName: analysis
  containers:
    mu_: AnaMuons
    el_: AnaElectrons
    jet_: AnaJets
    met_: AnaMET
    "": EventInfo

Configuration blocks automatically register output variables as they are configured, so many variables will be written without explicit configuration. You can use the commands option to enable, disable, or rename specific variables, and the vars option to add variables not automatically registered. See N-Tuple Output for details.
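
The containers mapping in the basic example can be read as "branch prefix: container name". The sketch below assumes that the prefix is simply prepended to each variable name; the variable names (pt, eta, runNumber) are illustrative only, and real branch names may carry additional parts, such as a systematics suffix, that this sketch ignores. branch_names is a hypothetical helper, not a framework function.

```python
# Mirror of the containers mapping from the basic example above.
containers = {
    "mu_": "AnaMuons",
    "el_": "AnaElectrons",
    "jet_": "AnaJets",
    "met_": "AnaMET",
    "": "EventInfo",
}


def branch_names(containers, variables):
    """Prepend each prefix to the (illustrative) variable names of its container."""
    names = []
    for prefix, container in containers.items():
        for variable in variables.get(container, []):
            names.append(prefix + variable)
    return names


# Illustrative variable lists; the registered variables depend on the blocks used.
variables = {"AnaMuons": ["pt", "eta"], "EventInfo": ["runNumber"]}
print(branch_names(containers, variables))  # ['mu_pt', 'mu_eta', 'runNumber']
```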

Expert Options¶

Some advanced options are marked as "expert mode" and require explicit opt-in. These options are typically for CP studies, cross-checks, or unusual configurations that should not be used in standard physics analyses.

To enable expert mode options, set enableExpertMode: True in CommonServices:

CommonServices:
  enableExpertMode: True

When expert mode is not enabled and you try to use an expert option, an error will be raised. When expert mode is enabled, a warning will still be generated for each expert option used.
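
The behaviour described above amounts to a simple gate: an error without opt-in, a warning with it. The following sketch shows that logic in isolation. ExpertModeError, check_expert_option, and the option name someExpertOption are all hypothetical and chosen for illustration; the actual error type and message used by the configuration code may differ.

```python
import warnings


class ExpertModeError(Exception):
    """Hypothetical error type used only in this sketch."""


def check_expert_option(option_name, expert_mode_enabled):
    """Raise if expert mode is off; otherwise warn that an expert option is in use."""
    if not expert_mode_enabled:
        raise ExpertModeError(
            f"option '{option_name}' requires enableExpertMode: True in CommonServices"
        )
    warnings.warn(f"expert option '{option_name}' is in use")


# With expert mode enabled, the option is accepted but a warning is emitted.
check_expert_option("someExpertOption", expert_mode_enabled=True)

# Without expert mode, using the option is an error.
try:
    check_expert_option("someExpertOption", expert_mode_enabled=False)
except ExpertModeError as err:
    print(err)
```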

Property Overrides¶

The propertyOverrides option allows directly setting properties on the underlying algorithms and tools. This is an expert-only feature and should be used sparingly:

Muons:
  - containerName: AnaMuons
    propertyOverrides:
      MuonCalibrationAndSmearingAlg.calibrationAndSmearingTool.calibMode: 3

The format is AlgorithmName.toolName.propertyName: value. Note that this bypasses the normal configuration system, so other algorithms that depend on the overridden value will not be updated accordingly.

Warning

Property overrides should only be used when no regular option exists for the setting you need to change. If you find yourself needing a property override regularly, consider requesting a proper option be added to the configuration block.
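
To illustrate the AlgorithmName.toolName.propertyName key format, the sketch below splits an override key into the component path and the property name. split_override is a hypothetical helper for this page; it makes no claim about how the configuration system resolves the key internally.

```python
def split_override(key):
    """Split 'Alg.tool.property' into (list of component names, property name)."""
    path, separator, prop = key.rpartition(".")
    if not separator or not path or not prop:
        raise ValueError(
            f"expected AlgorithmName.toolName.propertyName, got {key!r}"
        )
    return path.split("."), prop


components, prop = split_override(
    "MuonCalibrationAndSmearingAlg.calibrationAndSmearingTool.calibMode"
)
print(components)  # ['MuonCalibrationAndSmearingAlg', 'calibrationAndSmearingTool']
print(prop)        # calibMode
```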

Python Configuration¶

Python configuration is available for integration into larger Athena jobs where programmatic control is needed. The basic pattern uses ConfigFactory to create configuration blocks and ConfigSequence to assemble them:

from AnalysisAlgorithmsConfig.ConfigFactory import ConfigFactory
from AnalysisAlgorithmsConfig.ConfigSequence import ConfigSequence

configSeq = ConfigSequence()
factory = ConfigFactory()

# Add muon configuration
subConfig = factory.makeConfig('Muons')
subConfig.setOptionValue('.containerName', 'AnaMuons')
configSeq += subConfig

# Add muon working point
subConfig = factory.makeConfig('Muons.WorkingPoint')
subConfig.setOptionValue('.containerName', 'AnaMuons')
subConfig.setOptionValue('.selectionName', 'medium')
subConfig.setOptionValue('.quality', 'Medium')
subConfig.setOptionValue('.isolation', 'Loose_VarRad')
configSeq += subConfig

For a complete real-world example of Python configuration, see the PHYSLITE derivation.

Notes from the Developers¶

We redesigned the configuration with the goal of making the primary configuration format a text file, as that was the predominant configuration style for a number of popular frameworks at the time. Unlike most of those frameworks, though, we settled on YAML rather than a completely free-form, custom file format. A number of design choices in the configuration system were made with that choice in mind.

When we originally designed the configuration file, we started by deciding what we wanted the file to look like. Only then did we look at how to map that onto the configuration mechanisms and algorithm structure, and what adjustments to the original design were necessary. Where possible that is the preferred direction, but it is more work than simply wrapping your algorithms into blocks and accepting whatever configuration layout results. In practice, though, both approaches can yield the same result, particularly if you follow existing examples.

The Python configuration is less fully developed than the text configuration, in part because we only expected it to be used in some edge cases. It may be worth completely redesigning the interface.