CPGridRun
CPGridRun.py is a script that submits analysis jobs to the PanDA grid (we call a job run on this remote computing service a grid job), which you can then monitor on bigPanDA. You typically submit a grid job when your ROOT files are too large to process on a local machine, or when you are working with MC samples officially produced by the ATLAS production team.
Submitting a grid job yourself has a steep learning curve: you are exposed to a whole set of grid errors, and while debugging you will most of the time be swamped by the technicalities of the computing servers. CPGridRun.py is a centralized script that helps you submit jobs in a working, recommended way. It comes with many default settings; in particular, it is designed to work smoothly with CPRun.py. In this section we focus on running CPGridRun.py with CPRun.py. At its core, CPGridRun.py generates a working prun (PanDA run) command.
Let's run a demonstration first!
setupATLAS
asetup AnalysisBase,main,latest
touch gridinput.txt
echo "mc20_13TeV.410470.PhPy8EG_A14_ttbar_hdamp258p75_nonallhad.deriv.DAOD_PHYS.e6337_s3681_r13167_r13146_p6490" >> gridinput.txt
echo "mc20_13TeV.700341.Sh_2211_Wmunu_maxHTpTV2_BFilter.deriv.DAOD_PHYS.e8351_s3681_r13145_p6490" >> gridinput.txt
CPGridRun.py -i gridinput.txt --testRun --exec "CPRun.py -t test_configuration_Run2.yaml -e 50" --prefix myTutorial
Py:CPGridRun INFO
Input: mc20_13TeV.410470.PhPy8EG_A14_ttbar_hdamp258p75_nonallhad.deriv.DAOD_PHYS.e6337_s3681_r13167_r13146_p6490
Datasetname: mc20_13TeV.410470.PhPy8EG_A14_ttbar_hdamp258p75_nonallhad.deriv.DAOD_PHYS.e6337_s3681_r13167_r13146_p6490
Projectname: mc20_13TeV
Campaign: mc20
Energy: 13TeV
Dsid: 410470
Main: PhPy8EG_A14_ttbar_hdamp258p75_nonallhad
Step: deriv
Format: DAOD_PHYS
Tags: ['e6337', 's3681', 'r13167', 'r13146', 'p6490']
Etag: e6337
Stag: s3681
Rtag: r13146
Ptag: p6490
Py:CPGridRun INFO Command:
...
test_configuration_Run2.yaml can be used without specifying a path because it is a test configuration installed with AnalysisBase. It is a very useful configuration for testing your code on your own machine, and it is good practice to use it when you are not sure whether your code is working properly.
The first part of the output is the metadata of your input sample; for details, see the ATLAS Production naming format section below.
The second part starts with the prun command, which is the grid submission command you learned in the previous tutorial. CPGridRun.py generates a working prun command for you that runs your CP algorithms on the grid with CPRun.py.
Py:CPGridRun INFO Command:
prun \
--inDS mc20_13TeV.410470.PhPy8EG_A14_ttbar_hdamp258p75_nonallhad.deriv.DAOD_PHYS.e6337_s3681_r13167_r13146_p6490 \
--outDS user.$USER.myTutorial.410470.DAOD_PHYS.e6337_s3681_r13167_r13146_p6490.test_214093 \
--useAthenaPackages \
--cmtConfig x86_64-el9-gcc13-opt \
--writeInputToTxt IN:in.txt \
--outputs output:output.root \
--exec "CPRun.py --input-list in.txt --output-name output --max-events 50 --text-config test_configuration_Run2.yaml --merge-output-files" \
--memory 2000 \
--addNthFieldOfInDSToLFN 2,3,6 \
--mergeOutput \
--outTarBall cpgrid.tar.gz \
--nEventsPerFile 300 \
--nFiles 10
This is a working prun command line that you can copy and paste on lxplus; of course, you can also let CPGridRun.py run it for you. A few flags are worth discussing.
- --outDS user.$USER.myTutorial.410470.DAOD_PHYS.e#####.test_#####: the user identity (user or group) and the username are set, followed by the prefix myTutorial. The suffix at the end, test_#####, is set automatically because we passed --testRun.
- --exec "CPRun.py --input-list in.txt --output-name output --max-events 50 --text-config test_configuration_Run2.yaml --merge-output-files": the --exec is different from what we entered. CPGridRun sets the input and output correctly for you and makes sure the necessary flags are set.
  - It sets --input-list to in.txt, which, as you may have noticed, comes from --writeInputToTxt IN:in.txt. After the grid receives the MC samples you requested, it looks up all the related .root files in its database and writes them into in.txt, a format that CPRun.py can read.
  - --outputs output:output.root is another preset that ensures the IO is set up correctly.
- --outTarBall asks prun to (re)compress the repository into cpgrid.tar.gz. If you see --inTarBall instead, it means cpgrid.tar.gz is used without re-compressing.
- --nEventsPerFile 300 and --nFiles 10 are set because --testRun is enabled. Sometimes you want to test your code on the grid without waiting a long time for the results. --testRun limits the number of files per job to 10 and the number of events per file to 300, which is useful for a small test run on the grid.
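The rewriting of the --exec line described above can be illustrated with a small Python sketch. This is not the code inside CPGridRun.py: the function name expand_exec is hypothetical, only the two options used in the demonstration (-t and -e) are modelled, and their long forms are read off the generated prun command shown earlier.

```python
import argparse
import shlex


def expand_exec(user_exec, input_list="in.txt", output_name="output"):
    """Illustrative only: turn a short CPRun.py line into a grid-ready one."""
    tokens = shlex.split(user_exec)
    script, args = tokens[0], tokens[1:]

    # Assumption: only -t/--text-config and -e/--max-events are modelled here.
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("-t", "--text-config", required=True)
    parser.add_argument("-e", "--max-events", type=int)
    known, extra = parser.parse_known_args(args)

    # The input list and output name are fixed by the grid setup, not by the user.
    command = [script, "--input-list", input_list, "--output-name", output_name]
    if known.max_events is not None:
        command += ["--max-events", str(known.max_events)]
    command += ["--text-config", known.text_config, "--merge-output-files"]
    command += extra  # anything not modelled is passed through unchanged
    return shlex.join(command)


print(expand_exec("CPRun.py -t test_configuration_Run2.yaml -e 50"))
# CPRun.py --input-list in.txt --output-name output --max-events 50 --text-config test_configuration_Run2.yaml --merge-output-files
```

The point of the sketch is the division of labour: the user supplies only the analysis options, while the input list and output name are injected so that they match --writeInputToTxt and --outputs.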
At the end you will see a confirmation prompt; pressing y is all it takes to submit the job to the grid.
ATLAS Production naming format (Optional)
One challenge in setting up a grid job properly is getting the dataset name format right.
The input name follows the format that the ATLAS Production team uses to name the samples it produces. Getting the name right is crucial because it is the name used on the grid, and it is the format that CPGridRun.py can recognize and use to streamline the submission.
The ATLAS Production naming format is as follows:
* Project name: It is either mc##_%%TeV or data_##.
* DSID: dataset ID, a 6-digit unique number that characterizes your sample. It may be a Standard Model or some exotic simulation.
* Main: can be quite arbitrary, but usually contains the simulator information and the process.
* Step: the production step, e.g. deriv (derivation), simul, evgen, recon.
* Format: the file storage format, e.g. AOD, EVNT. Each format has its own purpose and benefits.
* Tags: the simulation configuration, i.e., the settings used in the different steps, which are documented by the Particle model group. Check the link above for more information.
The full format usually follows:
ProjectName.DSID.Main.Step.FORMAT.tags
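As a rough illustration of this format, the sketch below splits a dataset name into the fields printed in the demonstration output. It is not CPGridRun's own parser: the function name parse_dataset_name is hypothetical, and it assumes exactly six dot-separated fields and that, when a tag type appears twice, the last one is reported (which matches Rtag: r13146 in the output above).

```python
def parse_dataset_name(name):
    """Split ProjectName.DSID.Main.Step.FORMAT.tags into a dict (illustrative only)."""
    fields = name.split(".")
    if len(fields) != 6:
        raise ValueError(f"expected 6 dot-separated fields, got {len(fields)}: {name}")
    project, dsid, main, step, fmt, tag_string = fields

    # Assumption: the project name looks like mc20_13TeV (campaign_energy).
    campaign, _, energy = project.partition("_")
    tags = tag_string.split("_")

    info = {
        "project": project,
        "campaign": campaign,
        "energy": energy,
        "dsid": dsid,
        "main": main,
        "step": step,
        "format": fmt,
        "tags": tags,
    }
    # Later tags overwrite earlier ones of the same type, so "rtag" is the last r-tag.
    for tag in tags:
        info[f"{tag[0]}tag"] = tag
    return info


sample = ("mc20_13TeV.410470.PhPy8EG_A14_ttbar_hdamp258p75_nonallhad"
          ".deriv.DAOD_PHYS.e6337_s3681_r13167_r13146_p6490")
info = parse_dataset_name(sample)
print(info["dsid"], info["format"], info["rtag"])
# 410470 DAOD_PHYS r13146
```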
CPGridRun arguments (Optional)
Let's look at the help message:
setupATLAS
asetup AnalysisBase,main,latest
CPGridRun.py -h
The help message has two parts: one lists the CPGridRun.py arguments, the other is extracted from CPRun.py.
The CPGridRun.py part is divided into 4 subsections. Some argument help messages include "(PanDA)", which means the flag is taken unchanged from prun.
Important Input/Output file configuration
- -i or --input-list: it is NOT identical to the CPRun.py input list. It takes two formats:
  - A name that is recognizable by the PanDA grid; it should follow the ATLAS Production team naming convention. See the sub-section above.
  - A text file containing multiple names that follow the ATLAS Production team naming convention.
  - Users may also use their own files on the grid, but that is outside the scope of this tutorial.
- --output-files: on the grid, NOT all generated files can be downloaded, because it takes extra effort for the grid to collect your files from multiple computing servers into the desired location. Users need to tell the grid in advance which files to download. --output-files "A.root,B.txt,B.root" results in outDS/A/A.root, outDS/B/B.txt, outDS/B/B.root in the output directory. If you are using CPRun.py you don't need to set it.
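The mapping from an --output-files string to the resulting directory layout can be sketched in a few lines of Python. This only reproduces the example given above (each file lands in a sub-directory named after its stem); the function name expected_output_paths is hypothetical and the real layout is decided by the grid, not by this code.

```python
from pathlib import PurePosixPath


def expected_output_paths(out_ds, output_files):
    """Illustrative only: list where each requested file is expected to appear."""
    paths = []
    for entry in output_files.split(","):
        filename = entry.strip()
        if not filename:
            continue  # tolerate stray commas
        # Assumption: the sub-directory is the file name without its extension.
        stem = PurePosixPath(filename).stem
        paths.append(f"{out_ds}/{stem}/{filename}")
    return paths


print(expected_output_paths("outDS", "A.root,B.txt,B.root"))
# ['outDS/A/A.root', 'outDS/B/B.txt', 'outDS/B/B.root']
```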
Important Input/Output naming configuration
Each time a user submits a grid job, it must have a unique outDS. The outDS is a unique identifier on the grid, and every specified file will be put under the directory outDS. If a duplicated outDS is submitted, the grid returns an error and asks you to change the outDS, even if your previous submission with the same outDS has FAILED. We offer a commonly used preset to simplify the process.
outDS preset: {group/user}.{username}.{prefix}.{DSID}.{format}.{tags}.{suffix}
username is obtained automatically; DSID, format and tags are derived from your input samples. Users only need to set the prefix and suffix.
- --prefix: normally a fixed name that the user wants to keep using for that sample, for example ttbar2WWnunu.
- --suffix: mainly for version control; a name that the user is happy to change to keep the outDS unique, like test_v1, v_05, etc. If a submission failed for v_03, the user can change the suffix to v_04 and submit again.
- --outDS: the user can override the whole preset and set the name manually.
- --gridUsername: obtained automatically for a single user. If the user is submitting an official group production, they can set it to --gridUsername PHYS-HMBS etc.
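To make the preset concrete, here is a minimal Python sketch that assembles an outDS from its parts. It follows the preset pattern quoted above, but the function name build_out_ds, its parameters, and the username jdoe are hypothetical and are not taken from CPGridRun.py.

```python
def build_out_ds(username, prefix, dsid, fmt, tags, suffix, group_production=False):
    """Illustrative only: {group/user}.{username}.{prefix}.{DSID}.{format}.{tags}.{suffix}."""
    scope = "group" if group_production else "user"
    parts = [scope, username, prefix, dsid, fmt, "_".join(tags), suffix]
    if any(not part for part in parts):
        raise ValueError("every component of the outDS must be non-empty")
    return ".".join(parts)


tags = ["e6337", "s3681", "r13167", "r13146", "p6490"]
print(build_out_ds("jdoe", "myTutorial", "410470", "DAOD_PHYS", tags, "v_01"))
# user.jdoe.myTutorial.410470.DAOD_PHYS.e6337_s3681_r13167_r13146_p6490.v_01
```

After a failed submission, changing only the suffix argument (for example from v_01 to v_02) yields a new, unique outDS while the rest of the name stays the same.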
Grid configuration
- --groupProduction: enables some presets for group production, including naming and the arrangement of computing resources. The user is expected to have the proper authentication.
- --exec: the command line that the user wants to run on the grid. It must be enclosed in double quotes "". There are a few things users should know before using the CPRun.py preset:
  - Users should not set the input and output flags; they are handled automatically to make sure the grid navigation is correct.
  - A working example is simply --exec "CPRun.py -t analysis_config.yaml"
  - To run a custom script: --exec "customRun.py -i inputs -o output --text-config config.yaml --flagA --flagB"
Submission configuration
- --noSubmit: will NOT submit anything to the grid.
- --testRun: submits jobs to the grid with a random suffix of the form test_uuid. It also greatly limits the number of files per job (10) and the number of events (300). It is useful when you want to test a small run on the grid.
- --recreateTar: during submission with prun, the user has to ask prun manually to compress their repository with its source code and submit it alongside the job. We found that users often forget to re-compress after updating the source code, and it typically takes a few hours before they realize the mistake. CPGridRun.py therefore detects file changes in the source code or build directory and, if anything changed, asks prun to compress again. The user can still force re-compression with this flag.
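One generic way to detect such changes is to hash the contents of the watched directories and compare the result with the value stored at the previous submission. The sketch below shows that idea only; it is an assumption for illustration and not necessarily how CPGridRun.py implements its detection. The names tree_digest, needs_new_tarball and the stamp file are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def tree_digest(*roots):
    """Hash relative paths and contents of all files under the given directories."""
    digest = hashlib.sha256()
    for root in roots:
        for path in sorted(p for p in Path(root).rglob("*") if p.is_file()):
            digest.update(str(path.relative_to(root)).encode())
            digest.update(path.read_bytes())
    return digest.hexdigest()


def needs_new_tarball(stamp_file, *roots, force=False):
    """Return True if the tarball should be rebuilt; also records the new digest."""
    stamp_file = Path(stamp_file)  # must live outside the watched directories
    current = tree_digest(*roots)
    previous = stamp_file.read_text() if stamp_file.exists() else None
    stamp_file.write_text(current)
    return force or previous != current


# Small demonstration in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "source"
    source.mkdir()
    (source / "algo.py").write_text("x = 1\n")
    stamp = Path(tmp) / "tarball.stamp"

    print(needs_new_tarball(stamp, source))              # True: first submission
    print(needs_new_tarball(stamp, source))              # False: nothing changed
    (source / "algo.py").write_text("x = 2\n")
    print(needs_new_tarball(stamp, source))              # True: source was edited
    print(needs_new_tarball(stamp, source, force=True))  # True: forced rebuild
```

In this sketch the force argument plays the role that --recreateTar plays on the command line: it requests a rebuild regardless of what the comparison finds.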