CPGridRun
CPGridRun.py is a script that submits an analysis job to the PanDA grid (a job run on this remote computing service is called a grid job), which you can then monitor on bigPanDA. You want to submit a grid job when your ROOT files are too big for a local machine, or when you are working with MC samples officially produced by the ATLAS production team.
Submitting a grid job yourself has a steep learning curve because it exposes you to a whole set of grid errors, and while debugging you will often be swamped by computing-server technicalities. CPGridRun.py is a centralized script that helps you submit the job in a working, recommended way. The script has many default settings; in particular, it is designed to work smoothly with CPRun.py. In this section we focus on running CPGridRun.py with CPRun.py. The core of CPGridRun.py is generating a working prun (PanDA run) command.
Let's run a demonstration first!
setupATLAS
asetup AnalysisBase,main,latest
touch gridinput.txt
echo "mc20_13TeV.410470.PhPy8EG_A14_ttbar_hdamp258p75_nonallhad.deriv.DAOD_PHYS.e6337_s3681_r13167_r13146_p6490" >> gridinput.txt
echo "mc20_13TeV.700341.Sh_2211_Wmunu_maxHTpTV2_BFilter.deriv.DAOD_PHYS.e8351_s3681_r13145_p6490" >> gridinput.txt
CPGridRun.py -i gridinput.txt --testRun --exec "CPRun.py -t test_configuration_Run2.yaml -e 50" --prefix myTutorial
Py:CPGridRun INFO
Input: mc20_13TeV.410470.PhPy8EG_A14_ttbar_hdamp258p75_nonallhad.deriv.DAOD_PHYS.e6337_s3681_r13167_r13146_p6490
Datasetname: mc20_13TeV.410470.PhPy8EG_A14_ttbar_hdamp258p75_nonallhad.deriv.DAOD_PHYS.e6337_s3681_r13167_r13146_p6490
Projectname: mc20_13TeV
Campaign: mc20
Energy: 13TeV
Dsid: 410470
Main: PhPy8EG_A14_ttbar_hdamp258p75_nonallhad
Step: deriv
Format: DAOD_PHYS
Tags: ['e6337', 's3681', 'r13167', 'r13146', 'p6490']
Etag: e6337
Stag: s3681
Rtag: r13146
Ptag: p6490
Py:CPGridRun INFO Command:
...
test_configuration_Run2.yaml can be referenced without a path because it is a test configuration installed with AnalysisBase. It is useful for testing your code on your own machine, and it is good practice to use it when you are not sure whether your code is working properly.
The first part of the output is the metadata of your input sample; for details, see the ATLAS Production naming format section below.
The second part starts with the prun command, which is the grid submission command you learned in the previous tutorial. CPGridRun.py generates a working prun command for you, so that you can run your CP algorithms on the grid with CPRun.py.
Py:CPGridRun INFO Command:
prun \
--inDS mc20_13TeV.410470.PhPy8EG_A14_ttbar_hdamp258p75_nonallhad.deriv.DAOD_PHYS.e6337_s3681_r13167_r13146_p6490 \
--outDS user.$USER.myTutorial.410470.DAOD_PHYS.e6337_s3681_r13167_r13146_p6490.test_214093 \
--useAthenaPackages \
--cmtConfig x86_64-el9-gcc13-opt \
--writeInputToTxt IN:in.txt \
--outputs output:output.root \
--exec "CPRun.py --input-list in.txt --output-name output --max-events 50 --text-config test_configuration_Run2.yaml --merge-output-files" \
--memory 2000 \
--addNthFieldOfInDSToLFN 2,3,6 \
--mergeOutput \
--outTarBall cpgrid.tar.gz \
--nEventsPerFile 300 \
--nFiles 10
This is a working prun command line that you can copy and paste on lxplus; of course, you can also let CPGridRun.py run it for you. A few flags deserve discussion.
- --outDS user.$USER.myTutorial.410470.DAOD_PHYS.e#####.test_#####: the name starts with the user identity (user or group) and the username, followed by the prefix myTutorial. The suffix at the end, test_#####, is set automatically because we passed --testRun.
- --exec "CPRun.py --input-list in.txt --output-name output --max-events 50 --text-config test_configuration_Run2.yaml --merge-output-files":
  - The --exec line is different from what we entered. CPGridRun.py sets the input and output correctly and makes sure the necessary flags are set.
  - It sets --input-list to in.txt, which comes from --writeInputToTxt IN:in.txt. After the grid receives the MC samples you requested, it looks through its database, finds all the related .root files, and writes them into in.txt, which is a format that CPRun.py can read.
  - --outputs output:output.root is another preset that ensures the IO is set correctly.
- --outTarBall asks prun to (re)compress the repository into cpgrid.tar.gz. If you see --inTarBall instead, the existing cpgrid.tar.gz is used without re-compressing.
- --nEventsPerFile 300 and --nFiles 10 appear because we have --testRun enabled. Sometimes you want to test your code on the grid without waiting a long time for the results. --testRun limits the number of files per job to 10 and the number of events per file to 300, which is useful for a small test run on the grid.
At the end you will see a confirmation prompt; press y to submit the job to the grid.
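To make the structure of the printed command easier to see, here is a minimal sketch that assembles a similar prun command as a list of arguments. It is not the code used inside CPGridRun.py; the function name build_prun_command and the username jdoe are hypothetical, and only a subset of the flags from the demonstration is included.

```python
import shlex


def build_prun_command(in_ds, out_ds, exec_line, test_run=False):
    """Assemble a prun argument list (illustrative subset of flags only)."""
    args = [
        "prun",
        "--inDS", in_ds,
        "--outDS", out_ds,
        "--useAthenaPackages",
        "--writeInputToTxt", "IN:in.txt",
        "--outputs", "output:output.root",
        "--exec", exec_line,
        "--mergeOutput",
    ]
    if test_run:
        # Limits taken from the demonstration output for --testRun.
        args += ["--nEventsPerFile", "300", "--nFiles", "10"]
    return args


if __name__ == "__main__":
    command = build_prun_command(
        in_ds=("mc20_13TeV.410470.PhPy8EG_A14_ttbar_hdamp258p75_nonallhad"
               ".deriv.DAOD_PHYS.e6337_s3681_r13167_r13146_p6490"),
        # "jdoe" is a placeholder username, not a real account.
        out_ds=("user.jdoe.myTutorial.410470.DAOD_PHYS"
                ".e6337_s3681_r13167_r13146_p6490.test_v1"),
        exec_line=("CPRun.py --input-list in.txt --output-name output "
                   "--max-events 50 --text-config test_configuration_Run2.yaml "
                   "--merge-output-files"),
        test_run=True,
    )
    # shlex.join quotes the --exec value so the line can be pasted into a shell.
    print(shlex.join(command))
```

Keeping the command as a list until the last moment avoids quoting mistakes around the --exec value, which contains spaces.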
ATLAS Production naming format (Optional)
One challenge in setting up a grid job is getting the naming format right.
Input names follow the format that the ATLAS Production team uses to name the samples it produces. Getting the name right is crucial because it is the name used on the grid, and it is the format that CPGridRun.py recognizes and uses to streamline the submission.
The ATLAS Production naming format is as follows:
* Project name: either mc##_%%TeV or data_##.
* DSID: the dataset ID, a 6-digit unique number that characterizes your sample. It may be a Standard Model or an exotic simulation.
* Main: can be quite arbitrary, but usually contains the simulator information and the process.
* Step: deriv stands for derivation; other steps include simul, evgen, recon, etc.
* Format: the file storage format; each format has its own purpose and benefits. Examples are AOD, EVNT, etc.
* Tags: the simulation configuration, i.e., the settings used in the different steps, which are documented by the Particle model group. Check the link above for more information.
The full format usually follows:
ProjectName.DSID.Main.Step.FORMAT.tags
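The dot-separated format can be taken apart with a few string splits. The sketch below reproduces the metadata fields printed in the demonstration; it is not the parser used by CPGridRun.py, the function name parse_dataset_name is hypothetical, and the choice to report the last tag of each letter is an assumption based on the Rtag shown in the demonstration output.

```python
def parse_dataset_name(name: str) -> dict:
    """Split a dataset name of the form ProjectName.DSID.Main.Step.FORMAT.tags."""
    project, dsid, main, step, data_format, tag_string = name.split(".")
    campaign, energy = project.split("_", 1)  # e.g. "mc20" and "13TeV"
    tags = tag_string.split("_")

    def last_tag(letter: str):
        # Assumption: when several tags share a letter (two r-tags here),
        # report the last one, as the demonstration output does for Rtag.
        matches = [tag for tag in tags if tag.startswith(letter)]
        return matches[-1] if matches else None

    return {
        "project": project,
        "campaign": campaign,
        "energy": energy,
        "dsid": dsid,
        "main": main,
        "step": step,
        "format": data_format,
        "tags": tags,
        "etag": last_tag("e"),
        "stag": last_tag("s"),
        "rtag": last_tag("r"),
        "ptag": last_tag("p"),
    }


if __name__ == "__main__":
    sample = ("mc20_13TeV.410470.PhPy8EG_A14_ttbar_hdamp258p75_nonallhad"
              ".deriv.DAOD_PHYS.e6337_s3681_r13167_r13146_p6490")
    for key, value in parse_dataset_name(sample).items():
        print(f"{key}: {value}")
```

A name that does not have exactly six dot-separated fields raises a ValueError in this sketch, which is a simple way to catch a mistyped sample name before submission.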
CPGridRun arguments (Optional)
Let's look at the help message:
setupATLAS
asetup AnalysisBase,main,latest
CPGridRun.py -h
The help message has two parts: one lists the CPGridRun.py arguments, the other is extracted from CPRun.py.
The CPGridRun.py part is divided into 4 subsections. Some argument help messages contain "(PanDA)", which means the flag is identical to the one in prun.
Important Input/Output file configuration
- -i or --input-list: this is NOT identical to the CPRun.py input list. It takes two formats:
  - A name recognized by the PanDA grid, which should follow the ATLAS Production team naming convention. See the sub-section above.
  - A text file containing multiple names that follow the ATLAS Production team naming convention.
  - Users may also use their own files on the grid, but that is outside the scope of this tutorial.
- --output-files: on the grid, NOT all generated files can be downloaded, because it takes extra effort for the grid to collect your files from multiple computing servers into one location. Users need to tell the grid in advance what to download. --output-files "A.root,B.txt,B.root" results in outDS/A/A.root, outDS/B/B.txt, outDS/B/B.root in the output directory. If you are using CPRun.py, you don't need to set it.
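As a rough illustration of the layout described for --output-files, the following sketch maps a comma-separated file list to the paths from the example above. The function name expected_output_paths is hypothetical, and the rule (one sub-directory named after the file stem) is inferred only from that single example, so treat it as an assumption rather than a description of the grid's actual behaviour.

```python
from pathlib import PurePosixPath


def expected_output_paths(out_ds: str, output_files: str) -> list:
    """Map a comma-separated --output-files value to outDS/<stem>/<file> paths."""
    paths = []
    for name in output_files.split(","):
        name = name.strip()
        if not name:
            continue  # ignore empty entries such as a trailing comma
        # Assumption: the sub-directory is the file name without its extension.
        stem = PurePosixPath(name).stem
        paths.append(f"{out_ds}/{stem}/{name}")
    return paths


if __name__ == "__main__":
    for path in expected_output_paths("outDS", "A.root,B.txt,B.root"):
        print(path)
```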
Important Input/Output naming configuration
Each time a user submits a grid job, it must have a unique outDS. The outDS is a unique identifier on the grid, and every specified file will be put under the directory outDS. If a duplicate outDS is submitted, the grid will return an error and ask you to change the outDS, even if your previous submission with the same outDS has FAILED. We offer a commonly used preset to simplify the process.
outDS preset: {group/user}.{username}.{prefix}.{DSID}.{format}.{tags}.{suffix}
username is obtained automatically; DSID, format, and tags are derived from your input samples. Users only need to set the prefix and suffix.
- --prefix: normally a fixed name that the user wants to keep using for that sample, for example ttbar2WWnunu.
- --suffix: mainly for version control; a name that the user is happy to change to keep the outDS unique, like test_v1, v_05, etc. If a submission failed for v_03, the user can change the suffix to v_04 and submit again.
- --outDS: the user can override the whole preset and set the name manually.
- --gridUsername: obtained automatically for a single user. If the user is submitting an official group production, it can be set to e.g. --gridUsername PHYS-HMBS.
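The preset can be read as a simple join of fields with dots. The sketch below builds a name following the {group/user}.{username}.{prefix}.{DSID}.{format}.{tags}.{suffix} pattern; it is not the CPGridRun.py implementation, and the function name build_out_ds and the username jdoe are hypothetical.

```python
def build_out_ds(username, prefix, dsid, data_format, tags, suffix, group=False):
    """Join the preset fields into an outDS name."""
    scope = "group" if group else "user"
    # Tags are joined with underscores, as in the input dataset name.
    fields = [scope, username, prefix, dsid, data_format, "_".join(tags), suffix]
    return ".".join(fields)


if __name__ == "__main__":
    tags = ["e6337", "s3681", "r13167", "r13146", "p6490"]
    # Bumping only the suffix gives a new, unique outDS after a failed submission.
    for suffix in ("v_03", "v_04"):
        print(build_out_ds("jdoe", "myTutorial", "410470", "DAOD_PHYS", tags, suffix))
```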
Grid configuration
- --groupProduction enables some presets for group production, including naming and computing-resource arrangement. The user is expected to have the proper authentication.
- --exec: the command line that the user wants to run on the grid. It must be enclosed in double quotes "". There are a few things the user should know before using the CPRun.py preset:
  - The user should not set the input and output flags; they are set automatically to make sure the grid navigation is correct.
  - A working example is simply --exec "CPRun.py -t analysis_config.yaml"
  - Run a custom script: --exec "customRun.py -i inputs -o output --text-config config.yaml --flagA --flagB"
Submission configuration
- --noSubmit will NOT submit anything to the grid.
- --testRun submits jobs to the grid with a random suffix, test_uuid. It also greatly limits the number of files per job (10) and the number of events per file (300). It is useful when you want to test a small run on the grid.
- --recreateTar: when submitting with prun, the user has to ask prun manually to compress the user's repository with its source code and submit it alongside the job. We found that users often forget to re-compress after updating the source code (and it often takes a few hours before they notice the mistake), so CPGridRun.py detects whether anything changed in the source code or build directory. If so, CPGridRun.py asks prun to compress again. The user can also force re-compression with this flag.
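One common way to detect file changes is to hash the directory contents and compare the result with the value stored at the previous submission. The sketch below shows that idea only; how CPGridRun.py actually detects changes is not described here, and the names directory_fingerprint, needs_new_tarball and the stamp file are hypothetical.

```python
import hashlib
from pathlib import Path


def directory_fingerprint(root) -> str:
    """Hash the relative paths and contents of all files under root."""
    root = Path(root)
    digest = hashlib.sha256()
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest.update(path.relative_to(root).as_posix().encode())
            digest.update(path.read_bytes())
    return digest.hexdigest()


def needs_new_tarball(root, stamp_file) -> bool:
    """Return True if root changed since the fingerprint stored in stamp_file.

    The stamp file must live outside root, otherwise writing it would
    itself change the fingerprint.
    """
    stamp = Path(stamp_file)
    current = directory_fingerprint(root)
    previous = stamp.read_text() if stamp.exists() else None
    stamp.write_text(current)
    return current != previous
```

With a check like this, a submission script could pass --outTarBall when the function returns True and --inTarBall otherwise, which matches the two flags discussed in the demonstration.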