Batch Jobs

Introduction to batch jobs

A batch job refers to a task or a series of tasks that can be executed without user intervention. These jobs are submitted to a job scheduler, which manages resources and executes them when the required resources (such as CPUs, memory, etc.) become available. Unity uses Slurm, a popular open-source job scheduler used in many supercomputing clusters and high-performance computing (HPC) setups.

sbatch is the Slurm command used to submit batch jobs. sbatch is non-blocking: it returns as soon as the job is submitted, rather than waiting for the job to start. If the resources requested in the batch job are unavailable, the job is placed in a queue and starts once resources become available.

This page guides you through the following:

  • Create and submit a batch job
  • Check the status of your job while it’s pending or running
  • Receive emails about your job status
    • Receive a time limit email to prevent a loss of work
  • Check job progress

Create and submit a batch job

There are two parts to submitting a batch job:

  • You need to create a batch script, which is a separate file that contains all of the parameters for your job and the commands you want to run.
  • You need to use the sbatch command to submit the batch job you created.

The following steps will guide you through how to create and submit a batch job in more detail.

  1. Create a batch script file in your preferred location.

  2. In the first line of the batch script, write #!/bin/bash, or the shebang line for whichever interpreter you need. If you are unsure which interpreter to use, use #!/bin/bash.

  3. After the #!/bin/bash line, specify your #SBATCH parameters. These parameters specify important information about your batch job, such as the number of cores per task or the amount of memory you are requesting.

    The following example is a simple batch script that contains common sbatch parameters.

    #!/bin/bash
    #SBATCH -c 4  # Number of Cores per Task
    #SBATCH --mem=8192  # Requested Memory
    #SBATCH -p gpu  # Partition
    #SBATCH -G 1  # Number of GPUs
    #SBATCH -t 01:00:00  # Job time limit
    #SBATCH -o slurm-%j.out  # %j = job ID
    
    module load cuda/11.8  # load the CUDA toolkit module
    nvcc --version  # print the CUDA compiler version
    

    Note that these lines are contained within the batch script file. Any parameters specified on the command line when submitting your job will override those in the file.

    As defined by its parameters, this example script allocates four CPU cores and one GPU in the gpu partition for one hour. The last two lines load the CUDA module and print the version of the CUDA compiler; the job's output is written to the file given by -o (slurm-JOBID.out, where %j expands to the job ID). Feel free to remove or modify any of the parameters in the script to suit your needs. Slurm also provides a wide variety of other parameters for use with sbatch.

  4. To submit your batch job, use the command sbatch BATCH_SCRIPT. Be sure to replace BATCH_SCRIPT with the file name of your batch script.
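
    For example, submitting a script saved as batch_script.sh (the file name is illustrative) returns immediately with the new job's ID, which reflects sbatch's non-blocking behavior:

    sbatch batch_script.sh
    # Slurm responds right away with the assigned job ID, for example:
    # Submitted batch job 1234567

    The printed job ID is what you pass to squeue or sacct when checking on the job later.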

Check the status of your job while it’s pending or running

To check the status of all your jobs while they are pending or running, use the squeue --me command.

Alternatively, to see the status of a specific job at any time, use the command sacct -j YOUR_JOB_ID. Be sure to replace YOUR_JOB_ID with the actual job ID you received when you submitted your job.
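
For example, with an illustrative job ID (the columns shown are squeue's defaults; the values are a sketch, not literal output):

squeue --me
# Example output (values illustrative):
#  JOBID PARTITION     NAME     USER ST  TIME NODES NODELIST(REASON)
# 1234567      gpu batch_sc youruser PD  0:00     1 (Priority)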

Receive emails about your job status

To receive emails based on the status of your job, use the --mail-type argument. Common mail types are BEGIN, END, FAIL, INVALID_DEPEND, and REQUEUE. For more information on which mail type makes the most sense for you, see Slurm’s sbatch documentation, which covers --mail-type along with the full set of sbatch options.

To check that the email feature works for you with either salloc or sbatch, use the following code samples.

In your terminal:

salloc --mail-type=BEGIN /bin/true

Or, within your batch script:

#!/bin/bash
#SBATCH --mail-type=BEGIN
/bin/true

The BEGIN mail type sends you an email once your job begins.

Tip: If you want Slurm to send mail to an email address other than the one associated with your Unity account, specify it with the --mail-user argument.
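
For example, in a batch script (the address below is a placeholder):

#SBATCH --mail-type=END
#SBATCH --mail-user=yourname@example.edu  # placeholder; replace with your preferred address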

Receive a time limit email to prevent a loss of work

Your job is terminated as soon as it reaches its time limit, regardless of how close it was to finishing. Without checkpointing, those CPU hours are lost, and you have to resubmit the job from the beginning.

Another way to prevent losing your work is to check on your job’s output as the job approaches its time limit. To be notified when that happens, use the --mail-type=TIME_LIMIT_80 argument.

With --mail-type=TIME_LIMIT_80, Slurm emails you once 80% of the time limit has elapsed and your job is still running. You can then check the job’s output and judge whether it will finish in time. If you think it won’t, email us at hpc@umass.edu or ask on the Community Slack and we can extend your job’s time limit.
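
For example, a minimal sketch with an illustrative 10-hour limit; Slurm would send the TIME_LIMIT_80 email around the eight-hour mark if the job were still running:

#!/bin/bash
#SBATCH -t 10:00:00                # time limit (illustrative)
#SBATCH --mail-type=TIME_LIMIT_80  # email once ~80% of the limit has elapsed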

Warning: We can’t guarantee that we can extend your job’s time limit before the job ends. Please try to request enough time up front and ask for an extension only in unforeseen circumstances.

Check job progress

To see the status of all your jobs while they are pending or running, use the squeue --me command. This command shows the state of your jobs (e.g., running, pending, completed), job ID, partition, username, and more.

Alternatively, to see the status of a specific job at any time, use the command sacct -j YOUR_JOB_ID.
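
Illustrative sacct output for a completed job (the job ID and field values are placeholders; your default fields may differ):

sacct -j 1234567
# Example output (values illustrative):
#        JobID    JobName  Partition    Account  AllocCPUS      State ExitCode
# ------------ ---------- ---------- ---------- ---------- ---------- --------
# 1234567       batch_sc+        gpu   pi_group          4  COMPLETED      0:0
# 1234567.bat+       batch              pi_group          4  COMPLETED      0:0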

For an in-depth guide on monitoring batch jobs, see Monitor a batch job.

Articles in Batch Jobs

  • Array Batch Jobs: Documentation for Array Batch Jobs.
  • Large Job Counts: Documentation for Large Job Counts.
  • Monitor a batch job: Documentation for Monitor a batch job.
Last modified: Thursday, March 13, 2025 at 10:08 AM.