Download Log File
In the previous step, we saw how to find and view a log file in a web browser. But what if we want to download it? We can do this using the Rucio tools.
Once one of your jobs has completed, we can find and download its log file.
Tip
When using Rucio, it is almost always better to run it in a separate terminal from the one where you run your code or submit grid jobs. This minimizes the potential for conflicts between different Python versions.
Set up the Rucio tools if you haven't done so already.
lsetup rucio
Tip
If you are working on a new lxplus node, or on any computer other than the one where you just submitted your PanDA job, you may need to create a VOMS proxy:
voms-proxy-init -voms atlas:/atlas
Unlike pathena and prun, rucio won't do that for you.
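If you are not sure whether you already have a usable proxy, you can ask voms-proxy-info before calling voms-proxy-init. The following is a minimal sketch, assuming the standard VOMS client tools are on your PATH. The function name ensure_proxy and the one-hour validity threshold are illustrative choices, not part of the tutorial setup.

```shell
# ensure_proxy: create an ATLAS VOMS proxy only if no valid one exists.
# (Hypothetical helper name; the one-hour threshold is an arbitrary choice.)
ensure_proxy() {
    # -exists makes voms-proxy-info exit with 0 only if a proxy is present;
    # -valid 1:00 additionally asks for at least one hour of remaining lifetime.
    if voms-proxy-info -exists -valid 1:00 >/dev/null 2>&1; then
        echo "Valid proxy found."
    else
        echo "No valid proxy; creating one."
        voms-proxy-init -voms atlas:/atlas
    fi
}
```

Once the function is defined in your terminal session, typing ensure_proxy before any rucio command should be enough.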
Go back to the BigPanDA web page and find the page with the jediTaskID that we used previously. Search for the Output entry in the Containers table, and note the log file container name, e.g., user.aparker.pruntest.log. Back in your terminal session, try to find this log file on the grid:
$ rucio list-dids 'user.aparker:*pruntest*log*'
+--------------------------------------------------+--------------+
| SCOPE:NAME | [DID TYPE] |
|--------------------------------------------------+--------------|
| user.aparker:user.aparker.pruntest.log | CONTAINER |
| user.aparker:user.aparker.pruntest.log.340520924 | DATASET |
+--------------------------------------------------+--------------+
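If the pattern matches many entries, copying names out of the table by hand gets tedious. The sketch below filters the table for rows of type DATASET. It assumes the output keeps the two-column layout shown above, and dataset_dids is a hypothetical helper name, not a Rucio command.

```shell
# dataset_dids: read a list-dids table on stdin and print only the DATASET names.
# (Hypothetical helper; assumes the "| SCOPE:NAME | [DID TYPE] |" layout shown above.)
dataset_dids() {
    # Split each row on "|": field 2 is the DID, field 3 is its type.
    # Border and header rows never have DATASET in field 3, so they are skipped.
    awk -F'|' '$3 ~ /DATASET/ { gsub(/[ \t]/, "", $2); print $2 }'
}
```

For example, rucio list-dids 'user.aparker:*pruntest*log*' | dataset_dids would print only user.aparker:user.aparker.pruntest.log.340520924 for the table above.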
We now have two options:
- Download the container, and all log files within it (e.g. if the task contained many subjobs)
- Download just the dataset specific to the single set of jobs
Let's do the second:
rucio download user.aparker:user.aparker.pruntest.log.340520924
After it finishes downloading, navigate into the downloaded directory and extract the files from the tarball:
cd user.aparker.pruntest.log.340520924/
tar -xvf user.aparker.pruntest.log.23186476.000001.log.tgz
Tip
A tarball (here, a file with the tgz extension) is a set of files packaged together with tar and compressed using gzip. This is an efficient way to transfer large numbers of small files.
This will give you access to the log file (as well as much more related information) from your job. This can be useful for debugging.
There is a lot of information in here, but once you have extracted the logs, the file you are most likely looking for is payload.stdout.
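When a dataset contains several tarballs, unpacking each one and hunting for payload.stdout can be scripted. This is a minimal sketch under a few assumptions: the tarballs end in .tgz, they sit directly in the downloaded directory, and find_payload_logs is a hypothetical helper name.

```shell
# find_payload_logs: unpack every .tgz in a directory and list payload.stdout files.
# (Hypothetical helper; the directory argument defaults to the current directory.)
find_payload_logs() {
    dir=${1:-.}
    for tarball in "$dir"/*.tgz; do
        # Skip the literal pattern if no tarball matched.
        [ -e "$tarball" ] || continue
        # -C unpacks next to the tarball instead of in the current directory.
        tar -xzf "$tarball" -C "$dir"
    done
    # The extracted logs may sit in a subdirectory, so search recursively.
    find "$dir" -type f -name payload.stdout
}
```

Calling find_payload_logs user.aparker.pruntest.log.340520924 from the directory where you ran rucio download would print the path of each payload.stdout it finds.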
Tip
As you learned earlier, you can also download individual files directly with Rucio. If you go to the PanDA job page that you saw earlier, the table labelled "3 job files" lists the log tarball. You can download that file directly with rucio:
rucio download user.aparker.pruntest.log.23186476.000001.log.tgz
Here rucio inferred the correct scope (user.aparker) from the name of the tarball.
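To illustrate how a scope can be read off a name, here is a rough sketch of the naming convention as we understand it: names starting with user or group use their first two dot-separated fields as the scope, and other names use only the first field. This is an assumption about the convention, not Rucio's actual code, and guess_scope is a hypothetical helper name.

```shell
# guess_scope: derive a scope from a file or dataset name.
# (Hypothetical helper; an assumed convention, not Rucio's own implementation.)
guess_scope() {
    case "$1" in
        # user.<name>.* and group.<name>.* : the scope is the first two fields.
        user.*|group.*) printf '%s\n' "$1" | cut -d. -f1-2 ;;
        # Anything else: the scope is the first field only.
        *)              printf '%s\n' "$1" | cut -d. -f1 ;;
    esac
}

guess_scope user.aparker.pruntest.log.23186476.000001.log.tgz   # prints user.aparker
```

If the scope ever cannot be inferred, you can always spell it out yourself in the scope:name form used earlier, e.g. user.aparker:user.aparker.pruntest.log.23186476.000001.log.tgz.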