Introduction to Gaudi and Athena¶
Set up the runtime against the latest Athena main nightly build¶
It is assumed that you have already followed the preparatory instructions. As explained in the ATLAS Git Workflow Tutorial, you should have:
- Forked the Athena repository in GitLab
- Cloned your fork locally into a directory named athena
- Added the main repository as upstream
Start from the athena directory and set up the latest nightly release of the Athena main branch:
cd ..
setupATLAS
lsetup git
asetup Athena,main,latest
Compile the AthExHive package¶
As explained here, the first thing you need to do is to create a space that will hold your code changes in an easily identifiable way. In Git this is done by creating a new branch:
cd athena
git fetch upstream
git checkout -b main-AthExHive upstream/main --no-track
cd ..
After that, you need to list all packages you want to build locally (for this particular example it is only one package, AthExHive) in your private copy of package_filters.txt, or whatever you called it. You can either use the commands from the script below or simply edit your existing file so that it contains only two active lines (a leading + includes matching packages in the build, while - .* excludes everything else):
+ Control/AthenaExamples/AthExHive
- .*
A script to create a new package_filters.txt file may look as follows:
if [ -f package_filters.txt ]; then rm package_filters.txt;fi
echo '+ Control/AthenaExamples/AthExHive' >> package_filters.txt
echo '- .*' >> package_filters.txt
Create an empty build directory, configure your build area using cmake, and launch the build:
if [ -d build ]; then rm -rf build;fi
mkdir build
cd build
cmake -DATLAS_PACKAGE_FILTER_FILE=../package_filters.txt ../athena/Projects/WorkDir
make -j
source x86_64-*/setup.sh
Note
The last line in the above example is required for the Athena runtime to pick up the locally built packages (e.g. shared libraries, Python scripts) from your build directory instead of the corresponding packages from the nightly release installation area. You don't have to do this every time you make changes in the code and rebuild your packages; it is usually enough to source this script once, after the first build. If you are using an ARM machine, use source aarch64-*/setup.sh instead.
Data dependency declaration in Athena¶
Algorithms in Athena don't communicate directly with each other, that is, they don't call each other's interface functions. Instead, the Algorithms consume (read) and produce (write) Data Objects via the transient Event Store. To be executed properly in AthenaMT - i.e. so that a consumer Algorithm runs only after all of its input data objects have been recorded in the Event Store by the upstream producer Algorithms - the Algorithms need to declare their data dependencies, both read and write, to the AthenaMT Scheduler. The Algorithms do that by using smart handles for data objects.
Let's assume you are developing an algorithm ExampleAlg which needs to declare an input dependency on ObjectA and an output dependency on ObjectB. You then need to declare two private data members of the ExampleAlg class as follows:
class ExampleAlg : public AthReentrantAlgorithm
{
public:
...
private:
SG::ReadHandleKey<ObjectA> m_readKey
{this, "ReadKey", "In", "StoreGate Key for the ObjectA"};
SG::WriteHandleKey<ObjectB> m_writeKey
{this, "WriteKey", "Out", "StoreGate Key for the ObjectB"};
...
};
You need to pass the following arguments to the constructors of Read/Write handle key objects:
- A pointer to the owner object (usually this)
- Name of the property, which will allow you to modify the default value of the key at configuration time (e.g. ExampleAlg.ReadKey="NonDefaultValue")
- Default value of the property
- Documentation string
The key objects need to be initialized in the initialize() method of ExampleAlg:
StatusCode ExampleAlg::initialize()
{
ATH_CHECK( m_readKey.initialize() );
ATH_CHECK( m_writeKey.initialize() );
return StatusCode::SUCCESS;
}
Finally, in the execute() method of ExampleAlg, you can access the instances of ObjectA and ObjectB as follows:
StatusCode ExampleAlg::execute (const EventContext& ctx) const
{
// Construct handles from the keys.
SG::ReadHandle<ObjectA> h_read (m_readKey, ctx);
SG::WriteHandle<ObjectB> h_write (m_writeKey, ctx);
// Now we can dereference the read handle to access input data.
int newval = h_read->val()+1;
// We make a new object, held by a unique_ptr, and record it
// in the store using the record method of the handle.
ATH_CHECK( h_write.record (std::make_unique<ObjectB> (newval)) );
return StatusCode::SUCCESS;
}
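If there is a chance that the input object is not present in the store, you can check the read handle before dereferencing it. Below is a minimal sketch, reusing the hypothetical ExampleAlg from above and the isValid() method provided by SG::ReadHandle:
StatusCode ExampleAlg::execute (const EventContext& ctx) const
{
  SG::ReadHandle<ObjectA> h_read (m_readKey, ctx);
  // Bail out with an explicit error if the input object is not in the store.
  if (!h_read.isValid()) {
    ATH_MSG_ERROR("Could not retrieve " << m_readKey.key() << " from the event store");
    return StatusCode::FAILURE;
  }
  // ... proceed as in the example above ...
  return StatusCode::SUCCESS;
}
Note that when the dependencies are declared correctly, the AthenaMT Scheduler only runs the consumer after its inputs have been recorded, so such a check is mainly defensive.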
Data dependency of algorithms in the AthExHive package¶
The AthExHive package contains several example algorithms, one tool, and a few services. The algorithms are trivial: each sleeps for a random amount of time and reads and writes instances of the HiveDataObj class to the event store. The schematic below visualizes the data dependencies between most of the algorithms in this package. Here you can see the AthExHive algorithms (e.g. HiveAlgA, HiveAlgB, HiveAlgC) and the data object keys (e.g. a1, b1, c2) forming the data dependency graph.

Exercise 1: Run it¶
You can find a job configuration script for running this test in the git repository at Control/AthenaExamples/AthExHive/python/AthExHiveConfig.py. You can also see that this script is installed as a Python module under the build directory:
% ls -l build/x86_64-*/python/AthExHive
total 92
-rw-r--r-- 1 tsulaia zp 93821 Jan 24 00:52 AthExHiveConf.py
lrwxrwxrwx 1 tsulaia zp 77 Jan 24 00:51 AthExHiveConfig.py -> ../../../../athena/Control/AthenaExamples/AthExHive/python/AthExHiveConfig.py
-rw-r--r-- 1 tsulaia zp 0 Jan 24 00:52 __init__.py
Run with one thread¶
To run the test, first make an empty run directory, say, next to your build directory. Go to the run directory and run this command:
python -m AthExHive.AthExHiveConfig --threads=1 2>&1 | tee log
Examine the log file. Note the data dependency listing, which will look as follows:
AvalancheSchedulerSvc 0 INFO Data Dependencies for Algorithms:
BeginIncFiringAlg
none
IncidentProcAlg1
none
SGInputLoader
none
EventInfoCnvAlg
o INPUT ( 'EventInfo' , 'StoreGateSvc+McEventInfo' )
o OUTPUT ( 'SG::AuxElement' , 'StoreGateSvc+EventInfo' )
o OUTPUT ( 'xAOD::EventInfo' , 'StoreGateSvc+EventInfo' )
HiveAlgA
o INPUT ( 'xAOD::EventInfo' , 'StoreGateSvc+EventInfo' )
o OUTPUT ( 'HiveDataObj' , 'StoreGateSvc+a1' )
o OUTPUT ( 'HiveDataObj' , 'StoreGateSvc+a2' )
HiveAlgB
o OUTPUT ( 'HiveDataObj' , 'StoreGateSvc+b1' )
HiveAlgC
o INPUT ( 'HiveDataObj' , 'StoreGateSvc+a1' )
o OUTPUT ( 'HiveDataObj' , 'StoreGateSvc+C1' )
o OUTPUT ( 'HiveDataObj' , 'StoreGateSvc+c2' )
HiveAlgD
o INPUT ( 'HiveDataObj' , 'StoreGateSvc+a2' )
o OUTPUT ( 'HiveDataObj' , 'StoreGateSvc+d1' )
HiveAlgE
o INPUT ( 'HiveDataObj' , 'StoreGateSvc+C1' )
o INPUT ( 'HiveDataObj' , 'StoreGateSvc+b1' )
o OUTPUT ( 'HiveDataObj' , 'StoreGateSvc+e1' )
HiveAlgF
o INPUT ( 'HiveDataObj' , 'StoreGateSvc+C1' )
o INPUT ( 'HiveDataObj' , 'StoreGateSvc+a1' )
o INPUT ( 'HiveDataObj' , 'StoreGateSvc+b1' )
o INPUT ( 'HiveDataObj' , 'StoreGateSvc+c2' )
o INPUT ( 'HiveDataObj' , 'StoreGateSvc+d1' )
o INPUT ( 'HiveDataObj' , 'StoreGateSvc+e1' )
HiveAlgG
o INPUT ( 'HiveDataObj' , 'StoreGateSvc+d1' )
o OUTPUT ( 'HiveDataObj' , 'StoreGateSvc+g1' )
HiveAlgV
o INPUT ( 'HiveDataObj' , 'StoreGateSvc+C1' )
o INPUT ( 'HiveDataObj' , 'StoreGateSvc+a1' )
o INPUT ( 'HiveDataObj' , 'StoreGateSvc+a2' )
o INPUT ( 'HiveDataObj' , 'StoreGateSvc+d1' )
o INPUT ( 'HiveDataObj' , 'StoreGateSvc+e1' )
o OUTPUT ( 'HiveDataObj' , 'StoreGateSvc+V1' )
o OUTPUT ( 'HiveDataObj' , 'StoreGateSvc+V2' )
o OUTPUT ( 'HiveDataObj' , 'StoreGateSvc+V3' )
Run with multiple threads¶
Now try running with more than one thread (4 threads used in the example below):
python -m AthExHive.AthExHiveConfig --threads=4 2>&1 | tee log
Note how both the event number and the slot number are shown in the output. For example, if event #2 gets assigned to slot #1, you will see something similar to this in the log file:
...
HiveAlgD 2 1 DEBUG execute HiveAlgD
HiveAlgD 2 1 INFO sleep for: 8 ms
HiveAlgD 2 1 INFO read: a2 = 10052
HiveAlgD 2 1 INFO write: d1 = 40000
...
Exercise 2: Break it¶
Modify the Python script AthExHiveConfig.py in the git repository directory by changing the key of one of the objects produced by HiveAlgC, so that the dependency graph becomes broken. For example, in the function HiveAlgCConf() replace the matching lines with:
alg = CompFactory.HiveAlgC("HiveAlgC",
                           OutputLevel=DEBUG,
                           Time=190,
                           Key_W1="C1-BAD",
                           Cardinality=flags.Concurrency.NumThreads)
Rerun the test with one or more threads and see what happens.
Note
There is no need to rebuild the package if you only change your Python configuration script.
Exercise 3: Introduce a new data dependency between two algorithms¶
Introduce a data dependency between HiveAlgB (OUTPUT) and HiveAlgA (INPUT). Modify HiveAlgB so that it writes a new data object with key b2 to the store, and HiveAlgA so that it reads that object from the store.
The following steps will need to be done (a sketch of what the changes could look like is shown after this list):
- Step 1. Introduce a new WriteHandleKey for b2 inside HiveAlgB.h. Tip: see how this has already been done for the b1 key.
- Step 2. Make the necessary modifications in HiveAlgB.cxx for writing an object with key b2 to the store. Tip: see how this has already been done for the b1 key.
  - Initialize the new WriteHandleKey inside HiveAlgB::initialize()
  - Write a new HiveDataObj via a WriteHandle for the key b2 inside HiveAlgB::execute()
- Step 3. Introduce a new ReadHandleKey for b2 inside HiveAlgA.h. Tip: see how this has already been done for the a1 key inside HiveAlgC.h
- Step 4. Make the necessary changes in HiveAlgA.cxx for reading the object with key b2 from the store. Tip: see how this has already been done for the a1 key inside HiveAlgC.cxx
  - Initialize the new ReadHandleKey inside HiveAlgA::initialize()
  - Read the b2 object via a ReadHandle inside HiveAlgA::execute()
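For a rough idea of what these changes could look like, here is a sketch following the pattern of the ExampleAlg code above. The member names (m_wrh_b2, m_rdh_b2), the property names, and the HiveDataObj constructor and val() accessor used here are illustrative assumptions; use the existing b1/a1 code in the package as the authoritative reference.
// HiveAlgB.h - declare a second write handle key (names are illustrative):
SG::WriteHandleKey<HiveDataObj> m_wrh_b2 {this, "Key_W2", "b2", "second output of HiveAlgB"};

// HiveAlgB::initialize():
ATH_CHECK( m_wrh_b2.initialize() );

// HiveAlgB::execute() - assuming HiveDataObj can be constructed from an int,
// as the existing b1 code does. If execute() has no EventContext argument,
// construct the handle from the key alone.
SG::WriteHandle<HiveDataObj> wh_b2 (m_wrh_b2, ctx);
ATH_CHECK( wh_b2.record (std::make_unique<HiveDataObj> (20000)) );

// HiveAlgA.h - declare the corresponding read handle key:
SG::ReadHandleKey<HiveDataObj> m_rdh_b2 {this, "Key_R2", "b2", "second input of HiveAlgA"};

// HiveAlgA::initialize():
ATH_CHECK( m_rdh_b2.initialize() );

// HiveAlgA::execute() - read the object back; val() is assumed to return the
// stored integer, matching the "read: ... = ..." messages in the log above.
SG::ReadHandle<HiveDataObj> rh_b2 (m_rdh_b2, ctx);
ATH_MSG_INFO("read: " << rh_b2.key() << " = " << rh_b2->val());
After rebuilding and rerunning, the b2 key should appear in the scheduler's data dependency listing as an OUTPUT of HiveAlgB and an INPUT of HiveAlgA.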
Rebuild the package by going to the build directory and simply typing make.
Rerun the same test as before and examine the log. Verify that the new dependencies have been successfully declared, the new objects have been written and then read, and the job ran to successful completion.