Monitor a batch job
For running jobs, there are several ways to monitor their progress, whether that's tailing log files, using sstat, or connecting to the node with srun.
Watch job progress by tailing log files
If your job produces output as it runs, you can use tail on the log file to watch the progress. Using -F causes it to watch for new lines. Use Ctrl-C (or Cmd-C on macOS) to stop monitoring the file. This does not affect the running job.
tail -F slurm-XXX.out
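If you aren't sure which file your job is writing to, scontrol reports the path in the StdOut field of the job record. A quick sketch, using a hypothetical job ID of 1234567 and a hypothetical output path:
scontrol show job 1234567 | grep StdOut    # prints StdOut=/path/to/slurm-1234567.out
tail -F /path/to/slurm-1234567.out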
Check status quickly using sstat
The sstat command can provide information about a running job's use of resources. For best formatting, use the following command:
sstat -a -j <jobid> -o JobID%-15,TRESUsageInTot%-85
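If you want the numbers to refresh on their own, you can wrap the command in watch; this sketch polls every 30 seconds for a hypothetical job 1234567:
watch -n 30 "sstat -a -j 1234567 -o JobID%-15,TRESUsageInTot%-85"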
Ignore the <jobid>.extern step. If you use srun, mpirun, or mpiexec, then the numbered job steps show the usage of that program. The .batch step contains the usage of all of the commands in your batch script.
The following table shows some sample output from a job submitted with -n 32 -N 4 --ntasks-per-node=8, which spread 32 tasks across 4 nodes with 8 cores on each (not a recommended layout):
| JobID | TRESUsageInTot |
|---|---|
| jobid.extern | cpu=00:00:00,energy=0,fs/disk=…,mem=0,pages=0,vmem=0 |
| jobid.batch | cpu=8-11:09:22,energy=0,fs/disk=…,mem=4164440K,pages=0,vmem=4289444K |
| jobid.0 | cpu=25-09:24:52,energy=0,fs/disk=…,mem=12502612K,pages=0,vmem=12532576K |
This job had been running for about 25.5 hours. The usage in .batch only represents the 8 cores on the BatchHost. The .0 step is the usage of the MPI program. The usage here is 609 hours, which is less than the 816 hours expected (32 cores * 25.5 hours), indicating some inefficiency. This could be due to network traffic, I/O wait, or some non-MPI process that ran first, although that alone would not account for all of the time.
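For reference, here is the arithmetic behind those figures, converting Slurm's days-hours:minutes:seconds CPU-time format into hours (a quick shell sketch):
echo $(( 25*24 + 9 ))    # .0 step: 25-09:24:52 is roughly 609 CPU-hours
echo $(( 32*51/2 ))      # expected: 32 cores * 25.5 hours of wall time = 816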
Confirm job utilization using srun
In order to confirm that a job is utilizing all of the nodes, cores, and GPUs requested, you may connect to a node interactively using the following command:
srun --overlap --nodelist=<nodename> --pty --jobid=<jobid> /bin/bash
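For example, with a hypothetical job 1234567 running on node uri-gpu003 (both placeholders), the command looks like this:
srun --overlap --nodelist=uri-gpu003 --pty --jobid=1234567 /bin/bash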
The --nodelist argument should only contain one name and is only required if you want to connect to a node other than the first one. Use the following command to see your job's assigned nodes, cores, and GPUs:
scontrol -d show job <jobid>
In the output, BatchHost is the node your batch script is running on, and NodeList is the list of all nodes allocated to your job. CPU_IDs lists the cores on each node assigned to your job, and the IDX field shows which GPUs are available to it. Sample output:
JobId=... JobName=...
...
NodeList=uri-gpu003
BatchHost=uri-gpu003
JOB_GRES=gpu:a100:4
Nodes=uri-gpu003 CPU_IDs=0-63 Mem=515000 GRES=gpu:a100:4(IDX:0-3)
...
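The full output can be long; to pull out just the fields discussed above, you can filter it (hypothetical job ID; note the pattern also matches the ReqNodeList/ExcNodeList line):
scontrol -d show job 1234567 | grep -E 'NodeList|BatchHost|CPU_IDs'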
Check CPU and memory usage
The recommended tool to see CPU and memory utilization is systemd-cgtop (copy the following command as-is; there is no need to expand the variables yourself):
systemd-cgtop slurm_${HOSTNAME}/uid_${UID}/job_${SLURM_JOBID}
The %CPU column shows the sum of the utilization across all of the cores assigned on this node. This should be close to 100 times the number of cores. The Memory column should show a value close to what you requested. Note that tools like htop may also work, but make sure the CPU numbers are 0-based.
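To confirm which 0-based cores your shell is actually confined to, so that htop's numbering lines up with the CPU_IDs reported by scontrol, taskset can print the affinity list. A minimal sketch, run from inside the job's shell:
taskset -cp $$    # e.g. "pid 12345's current affinity list: 0-63"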
Check GPU usage
The recommended tool to see GPU utilization is nvitop. See Unity GPUs for more information. Note that it doesn't show GPUs for other jobs on the node, even if they're also your jobs.
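If nvitop is unavailable, nvidia-smi (which ships with the NVIDIA driver, so this assumes a GPU node) gives a coarser snapshot of utilization and memory; wrapping it in watch refreshes it periodically:
watch -n 5 nvidia-smi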