Tips and tricks¶
Under Construction
This final part of the tutorial aims to capture various tips and tricks that may not be documented elsewhere, but that are regularly used by experienced developers to increase their productivity.
Athena command line options¶
If you didn't do so already, take a look at the command line options part of the configuration documentation.
acmd suite of tools¶
The command acmd.py, which is available in the ATLAS software environment, opens a box of tricks that are extremely useful in a number of different circumstances. You can get the list of available commands by doing
acmd.py -h
chk-file read a POOL file and dump its content.
chk-sg read a POOL file and dump the DataHeader's content
chk-rflx a script to check the definitions of (reflex) plugins
diff-pool diff two POOL files (containers and sizes)
diff-root diff two ROOT files (containers and sizes)
dump-root dump the content of a ROOT file into an ASCII format.
list-events
gen-klass helper script to generate header and cxx files
filter-files filter multiple input (pool/bs) files
cmake a group of sub-commands
jira a group of sub-commands
Each of these commands has its own list of options which can be accessed via the -h option, e.g.
acmd.py chk-file -h
Here we list the ones that we find are particularly important for our daily work.
diff-root and diff-pool¶
These commands are essential for comparing files made before and after a change, especially when no output changes are expected. diff-root is particularly important because it compares files based on xAOD data structures, including AOD, DAOD_PHYS and DAOD_PHYSLITE. The basic usage is:
acmd.py diff-root -t CollectionTree file1 file2
Important options include
-t: the name of the tree(s) to compare, usually CollectionTree for AOD/DAOD files
--order-trees: compare events by event number rather than assuming that both files store them in the same order; this is especially important if the files were made by AthenaMT
--mode: how much detail to print about the differences. The available options are summary, which only reports the number of differences; semi-detailed, which reports the number of differences and the leaves that differ; and detailed, which prints the full numerical values of the differing leaves as well as the magnitude of the differences
--error-mode: should be set to resilient unless you want the job to stop at the first difference
--nan-equal: needs to be included if the files are expected to contain NaN values (this includes flavour tagging output); otherwise the script will complain about NaN entries
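The options described here can be combined into a single comparison command. The following is a minimal Python sketch, not something shipped with the ATLAS software: the helper name `build_diff_root_command` and the file names `before.pool.root` and `after.pool.root` are hypothetical. It only uses the options listed in this section, and it only runs the command if `acmd.py` is found on the `PATH` (that is, inside an ATLAS software environment).

```python
import shutil
import subprocess

# Modes documented in this section for the --mode option
VALID_MODES = ("summary", "semi-detailed", "detailed")


def build_diff_root_command(file1, file2, tree="CollectionTree",
                            mode="semi-detailed", nan_equal=True):
    """Assemble an 'acmd.py diff-root' call as an argument list.

    Hypothetical helper for illustration; it is not provided by Athena.
    """
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode {mode!r}, expected one of {VALID_MODES}")
    cmd = [
        "acmd.py", "diff-root",
        "-t", tree,                    # tree to compare
        "--order-trees",               # match events by event number
        "--mode", mode,                # level of detail in the report
        "--error-mode", "resilient",   # do not stop at the first difference
    ]
    if nan_equal:
        cmd.append("--nan-equal")      # needed if NaN values are expected
    cmd += [file1, file2]
    return cmd


if __name__ == "__main__":
    cmd = build_diff_root_command("before.pool.root", "after.pool.root")
    print(" ".join(cmd))
    # Only attempt the comparison when acmd.py is actually available
    if shutil.which("acmd.py"):
        subprocess.run(cmd, check=False)
```

Building the command as a list rather than a single string avoids shell quoting problems and makes it easy to switch `--mode` between a quick summary and a detailed report.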
list-events¶
acmd.py list-events -f file
This prints a list of event and run numbers contained within the file.
Exploring Athena-produced files¶
There are several ways to explore Athena-produced files without actually running an Athena algorithm over them.
Checker scripts¶
The checker scripts include checkFile.py, checkSG.py, checkxAOD.py and meta-reader.py. They are extremely useful and are available in the ATLAS software environment. All have the same syntax - the command followed by the name of the file that is to be inspected.
checkFile.py displays the number of events and then prints the on-disk and in-memory sizes of each container. The granularity with which containers are listed depends on whether the file is xAOD or non-xAOD; this command is most useful for non-xAOD files. See checkxAOD.py for xAOD files.
checkSG.py displays a compact list of the container types and the SG key for each instance of that type in the file (without any size information).
checkxAOD.py is specifically for xAOD files such as AOD and DAOD. It provides information in three sections: a size breakdown by container, as with checkFile.py but better formatted for xAOD containers; a breakdown of size by container category (e.g. tracking, trigger); and a breakdown of the metadata content.
meta-reader.py displays a formatted tree of the metadata contents (which includes the item list of the actual data payload, e.g. a list of container names).
checkTriggerxAOD.py lists the various trigger containers, including size information (this is not a record of which triggers fired for which events).
Exploring files with ROOT¶
xAOD files can be opened directly in ROOT, just as if they were n-tuple files. Open a file (either from the command line or via the browser), open the CollectionTree (for the event payload) or one of the metadata trees, and then browse as normal. For instance, to scan a variable and then dump a full list of the branches in an xAOD file to a text file, one can do the following in root -b:
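// Open the file and retrieve the event-payload tree
TFile f("xaod.pool.root");
TTree* CollectionTree = (TTree*)f.Get("CollectionTree");
// Scan a single variable
CollectionTree->Scan("InDetTrackParticleAuxDyn.pt");
// Redirect output to a text file, dump the list of branches, then restore output to the terminal
.> CollectionTree.txt
CollectionTree->Print();
.>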
You can also plot xAOD variables directly in ROOT via the TBrowser just by opening the CollectionTree and then clicking on the relevant branch name. Note that for DAODs, you need to use the AuxDyn branch as shown above, as opposed to AODs where the Aux branch should be used. Similarly, the xAOD CollectionTree can also be read in using RDataFrame, more or less in the same way as a normal n-tuple.
Interactive Athena¶
Athena can be configured interactively via its Python interface, which can be handy for making quick checks when you don't want to set up a fully working job, or when you want to explore the Python layer of Athena without having to run the event loop. To launch Athena in interactive mode, run:
athena --interactive="init"
athena> from AthenaConfiguration.AllConfigFlags import initConfigFlags
athena> flags = initConfigFlags()
athena> flags.dump()
Component accumulator configuration can also be run interactively, allowing checks to be made as you build the job.
You can also pass a configuration file to the interactive command in order to investigate the settings that the file applies, e.g.
athena --interactive="init" runargs.RAWtoALL.py
In principle, Athena jobs can also be executed (as well as initialised) interactively, but this isn't a common workflow even amongst developers.
Command-line component accumulator configuration¶
Although production jobs run via job transforms can be modified by changing the Python source, it is sometimes extremely useful to be able to modify the configuration without touching the scripts in the release, especially if some kind of special production needs to be run on the grid.
In such cases one can "grab" the component accumulator and adjust the configuration in the postExec of the command - for example the following changes the output contents of a DAOD via the command line:
Derivation_tf.py --inputAODFile AOD.pool.root --outputDAODFile test.pool.root --maxEvents 1 --formats PHYS --postExec "cfg.getEventAlgo('StreamDAOD_PHYS').ItemList.extend(['xAOD::TruthParticleContainer#TruthParticles','xAOD::TruthVertexContainer#TruthVertices','xAOD::AuxContainerBase!#TruthParticlesAux.','xAOD::AuxContainerBase!#TruthVerticesAux.'])"
cfg is known to the job already, so to adjust the configuration one has to grab the relevant event processing algorithm (StreamDAOD_PHYS in this case) and then alter its properties as shown. A similar syntax can be applied to any of the methods of the component accumulator.
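A postExec string of this kind becomes hard to read as more containers are added. As a rough sketch, assuming the item-list pattern used in the example (one `Type#Key` entry plus one `xAOD::AuxContainerBase!#KeyAux.` entry per container), the string could be assembled in Python before being passed to the transform. The helper name `build_post_exec` is hypothetical and not part of Athena, and the container types and keys are whatever the user supplies.

```python
import shlex


def build_post_exec(stream, containers):
    """Build a postExec string that extends the ItemList of an output stream.

    Hypothetical helper for illustration; 'containers' is a list of
    (container_type, key) pairs chosen by the user.
    """
    # Interface containers first, then the matching auxiliary stores
    items = [f"{ctype}#{key}" for ctype, key in containers]
    items += [f"xAOD::AuxContainerBase!#{key}Aux." for _, key in containers]
    return f"cfg.getEventAlgo({stream!r}).ItemList.extend({items!r})"


if __name__ == "__main__":
    post_exec = build_post_exec(
        "StreamDAOD_PHYS",
        [("xAOD::TruthParticleContainer", "TruthParticles")],
    )
    cmd = [
        "Derivation_tf.py",
        "--inputAODFile", "AOD.pool.root",
        "--outputDAODFile", "test.pool.root",
        "--maxEvents", "1",
        "--formats", "PHYS",
        "--postExec", post_exec,
    ]
    # shlex.join quotes the postExec argument so it can be pasted into a shell
    print(shlex.join(cmd))
```

Printing the command rather than executing it keeps the sketch usable outside an ATLAS environment; the printed line can then be pasted into a shell where `Derivation_tf.py` is available.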