On this page

    • unity-slurm-gpu-usage
    • unity-slurm-node-usage
    • unity-slurm-partition-usage
    • unity-slurm-account-usage
    • unity-slurm-job-time-usage
    • unity-slurm-find-nodes
    • unity-slurm-list-features / unity-slurm-list-constraints
    • unity-slurm-account-list
    • unity-slurm-set-default-account
    • unity-slurm-gypsum-access

Helper Scripts

When working with Slurm, you may find that your job submissions are denied for violating resource limits, or that your jobs sit in the queue for a long time. We provide the following commands to help you schedule jobs more effectively:

unity-slurm-gpu-usage

For each model of GPU, how many are there in total, and how many are currently idle? How many are pending to be allocated for future jobs?

    Type   |            Allocated            |  Pending  |   VRAM  |   CC
===========================================================================
 any         [#######           ] 525/1423     106         0         0
 unknown                          0            68          0         0
 gh200       [##################] 1/1          0           80        9.0
 l40s        [##                ] 9/68         0           48        8.9
 a40         [######            ] 4/12         0           48        8.6
 a100        [###############   ] 126/148      12          80,40     8.0
 2080        [############      ] 4/6          0           8         7.5
 rtx8000     [####              ] 10/48        6           48        7.5
 2080ti      [#########         ] 205/426      20          11        7.5
 v100        [####              ] 16/73        0           32,16     7.0
 1080ti      [####              ] 76/342       0           11        6.1
 m40         [######            ] 32/96        0           23        5.2
 titanx      [####              ] 42/203       0           12        5.2

You can use the --sort argument to change the GPU sorting. Valid options for --sort are total, cc/vram, free, and type.
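
For example, to re-sort the table, pass one of those options (the output has the same format as above and is omitted here):

$ unity-slurm-gpu-usage --sort free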

lightbulb
Using Idle Resources
Tailoring your job to use idle (not allocated) resources is a good way to get your job running quickly.
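
For example, the table above shows many idle 1080ti and titanx GPUs; if your code can run on one of those models, you can request it explicitly. A minimal sketch using standard sbatch options, assuming job.sh is your batch script:

$ sbatch --partition=gpu --gpus=1080ti:1 job.sh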

 

unity-slurm-node-usage

For each node, how many CPU cores, gigabytes of RAM, and GPUs are currently idle? Which partitions can I use to schedule a job on that node?

        Hostname       |       Idle CPU Cores      |         Idle Memory         |       Idle GPUs       |              Partitions
===============================================================================================================================================
gpu001                  [#######      ] 9/16        [#            ] 11.8 GB       [             ] 0/2     cpu-preempt,gpu
gpu002                  [###########  ] 14/16       [#############] 190.8 GB      [             ] 0/2     cpu-preempt,gpu
gpu003                  [############ ] 34/36       [######       ] 93.5 GB       [             ] 0/2     cpu-preempt,gpu
gpu004                  [############ ] 34/36       [#######      ] 108.9 GB      [             ] 0/2     cpu-preempt,gpu
gpu005                  [###########  ] 26/32       [############ ] 173.4 GB      [             ] 0/3     cpu-preempt,gpu
gpu006                  [#############] 32/32       [#############] 191.8 GB      [#############] 3/3     cpu-preempt,gpu
gpu007                  [#############] 32/32       [#############] 191.8 GB      [#############] 3/3     cpu-preempt,gpu
gpu009                  [#############] 31/32       [#############] 499.0 GB      [##########   ] 3/4     cpu-preempt,gpu
gpu010                  [#############] 32/32       [#############] 515.4 GB      [#############] 4/4     gpu-preempt
gpu011                  [############ ] 33/36       [######       ] 181.2 GB      [###          ] 1/4     gpu
gpu012                  [###########  ] 30/36       [#######      ] 204.8 GB      [###          ] 1/4     gpu
gypsum-gpu001           [#############] 12/12       [#############] 257.8 GB      [#############] 4/4     cpu,gpu-preempt
gypsum-gpu002           [#############] 12/12       [#############] 257.8 GB      [#############] 4/4     cpu,gpu-preempt
gypsum-gpu003           [#############] 12/12       [#############] 257.8 GB      [#############] 4/4     cpu,gpu-preempt
gypsum-gpu004           [#############] 12/12       [#############] 257.8 GB      [#############] 4/4     cpu,gpu-preempt

You can pipe a list of hostnames into this command to show the usage for only those nodes. Example:

$ printf 'cpu001\ncpu002' | unity-slurm-node-usage
collecting info from slurm...
  Hostname  |      Idle CPU Cores     |        Idle Memory         |  Idle GPUs  |        Partitions
===========================================================================================================
cpu001       [#########    ] 16/24     [#            ] 5.2 GB                     building,cpu
cpu002       [             ] 0/24      [############ ] 371.0 GB                   building,cpu

lightbulb
Sometimes Resources Go to Waste
unity-slurm-gpu-usage may show idle GPUs that no job can be scheduled to use. One reason for this is that the nodes with those idle GPUs don’t have enough idle CPU cores or memory.

 

unity-slurm-partition-usage

For each partition, how many CPU cores and GPUs are currently idle?

    partition name    |          idle CPUs          |         idle GPUs         |  total nodes
================================================================================================
arm-gpu                [#######      ] 40/72         [             ] 0/1         1
arm-preempt            [#############] 240/240                                   3
cpu                    [######       ] 4502/9812     [##########   ] 193/252     167
cpu-preempt            [#####        ] 2688/7736     [#######      ] 13/25       126
gpu                    [#####        ] 1024/2740     [########     ] 417/645     125
gpu-preempt            [#####        ] 1690/4492     [#######      ] 492/934     157
mpi                    [#            ] 24/320                                    5
ood-shared             [             ] 0/228         [##########   ] 59/76       19
power9                 [#############] 256/256                                   2
power9-gpu             [#############] 2144/2144     [#############] 44/44       15
power9-gpu-osg         [#############] 256/256       [#############] 4/4         2
power9-gpu-preempt     [#############] 2144/2144     [#############] 44/44       15
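
Partitions with plenty of idle CPUs or GPUs are good candidates for quick scheduling. A minimal sketch that targets one of the partitions listed above with sbatch's standard --partition option, assuming job.sh is your batch script:

$ sbatch --partition=cpu job.sh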

 

unity-slurm-account-usage

What resources are the members of my PI group currently using? What more can I allocate without violating PI group resource limits?

Current resource allocation under account "account1":
 username | CPUs allocated | GPUs allocated | CPUs pending | GPUs pending
=====================================================================================
user1      24               0                0              0
user2      9                6                128            0
user3      18               0                0              5
user4      9                1                0              0
user5      3                0                0              0
total      63               7                128            5

lightbulb
“Account” vs “PI group”
In Slurm terminology, an “account” refers not to your user account but to your PI group.
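
At submission time, you can choose which account (PI group) a job runs under with sbatch's standard --account option. A minimal sketch, assuming account1 is one of your accounts and job.sh is your batch script:

$ sbatch --account=account1 job.sh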

 

unity-slurm-job-time-usage

Of my recently completed jobs, how long did each job run? How much time did I allocate for each job to run?

   JobName JobID           Elapsed  Timelimit
---------- ------------ ---------- ----------
      bash 15843792       00:04:06   01:00:00
slurm-exp+ 16165455       00:00:08   00:05:00
      bash 16197821       00:11:52   01:00:00
      bash 16197899       00:00:04   01:00:00
      bash 16197903       00:02:29   01:00:00
      bash 16198016       00:09:30   01:00:00

This command takes one optional positional argument num_jobs_printed.
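
For example, assuming num_jobs_printed simply controls how many recent jobs are listed:

$ unity-slurm-job-time-usage 20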

lightbulb
Excess Job Time
unity-slurm-job-time-usage can help you reduce excess job time. If you request time limits closer to your jobs' actual run times, your jobs will be scheduled faster and Slurm will schedule jobs more effectively cluster-wide.
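
For example, the first bash job above ran for about four minutes but reserved a full hour. A minimal batch-script sketch that requests a tighter limit with the standard --time option:

#!/bin/bash
#SBATCH --time=00:15:00
# 15 minutes is still a comfortable margin for a job that usually finishes in ~5.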

 

unity-slurm-find-nodes

Which nodes have a given feature/constraint?

$ unity-slurm-find-nodes ppc64le
power9-gpu001   power9-gpu004   power9-gpu007   power9-gpu010   power9-gpu013   uri-cpu046
power9-gpu002   power9-gpu005   power9-gpu008   power9-gpu011   power9-gpu014   uri-cpu047
power9-gpu003   power9-gpu006   power9-gpu009   power9-gpu012   power9-gpu016
found 17 nodes.

Learn more about features/constraints

lightbulb
Helper Script Synergy
You can pipe the output of unity-slurm-find-nodes into unity-slurm-node-usage to see the usage for just the nodes that meet a given constraint.
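
For example, reusing the ppc64le feature from the previous example:

$ unity-slurm-find-nodes ppc64le | unity-slurm-node-usage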

 

unity-slurm-list-features / unity-slurm-list-constraints

What features/constraints can I use to select nodes for my jobs?

$ unity-slurm-list-constraints
1080_ti		amd7502		haswell		intel6148	p923		sm_89		vram8
1080ti		amd7543		ib		    intel6226r	power9le	sm_90		vram80

See the full list of features/constraints
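
Once you know which feature you need, you can request it at submission time with sbatch's standard --constraint option. A minimal sketch using the ib feature from the list above, assuming job.sh is your batch script:

$ sbatch --constraint=ib job.sh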

 

unity-slurm-account-list

What accounts am I a member of?

$ unity-slurm-account-list
account1
account2

 

unity-slurm-set-default-account

If you are a member of multiple accounts, one of them will be used by default when you schedule a job without specifying an account. This command changes that default account.
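
A hypothetical invocation, assuming the command takes the new default account name as its only argument:

$ unity-slurm-set-default-account account2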

 

unity-slurm-gypsum-access

Determines whether you have access to the Gypsum partitions.

Last modified: Thursday, March 13, 2025 at 10:08 AM. See the commit on GitLab.