Storage
Storage types
Unity provides access to a variety of storage methods for different use cases, including high performance storage, mid performance/large storage, and archival storage.
High performance storage
Unity’s /home, /work, and /scratch directories use the high performance VAST DataStore. This storage is suitable for job I/O (reading and writing files during a job), which requires a fast, parallel filesystem for best job performance. You must store or stage all data for job I/O in a high performance storage location. While it is possible to purchase additional /work space, we strongly advise using /scratch via HPC Workspace or a lower performance storage option (see below) if possible.
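For example, a typical scratch workflow allocates a workspace, stages data into it, runs the job, and releases the workspace afterwards. The sketch below assumes HPC Workspace’s standard ws_allocate/ws_release commands and a Slurm batch script; the workspace name, duration, and paths are placeholders (see the HPC Workspace documentation for Unity-specific details).

```bash
# Allocate a scratch workspace named "myrun" for 7 days; ws_allocate prints its path
SCRATCH=$(ws_allocate myrun 7)

# Stage the inputs this job needs into high performance scratch
cp -a /work/pi_<pi-username>/inputs "$SCRATCH"/

# Run the job inside the workspace; --wait blocks until the job finishes
sbatch --wait --chdir="$SCRATCH" myjob.sh

# Copy results back out, then release the workspace
cp -a "$SCRATCH"/results /work/pi_<pi-username>/
ws_release myrun
```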
/home and /work are snapshotted on a 3-day rolling basis. Other directories on Unity, including /project and /scratch, DO NOT have snapshots! We can’t restore data lost from these directories.

Mid performance, large storage
Often, researchers need “warm” data storage that’s larger than their high performance storage group quotas. We recommend storing the bulk of your data in /project and staging the portions you need for a particular workload in /work or /scratch as needed (a staging sketch follows below). While the location of /project directories varies by institution, most are housed on the Northeast Storage Exchange (NESE)’s Disk Storage. Storing your data in /project is a cost-effective way to house data in a location with excellent transfer speeds to Unity’s high performance storage. Most campuses provide a base allocation free of charge to research groups upon request.
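As an illustration of that staging pattern, the sketch below pulls one subset out of /project before a run and archives the results afterwards. All paths are placeholders, and it assumes your /project directory follows the same pi_<pi-username> naming as /work.

```bash
# Stage only the subset needed for this workload into high performance storage
rsync -a /project/pi_<pi-username>/big-dataset/subset/ /work/pi_<pi-username>/subset/

# ... run your jobs against /work/pi_<pi-username>/subset ...

# Archive the output back to /project and free the /work quota for the next run
rsync -a /work/pi_<pi-username>/results/ /project/pi_<pi-username>/results/
rm -rf /work/pi_<pi-username>/subset
```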
In addition to NESE Disk, Unity researchers can request access to the Open Storage Network (OSN) S3 storage pods. UMass Amherst and URI own pods with storage available upon request (cost varies), or researchers can request an allocation of 10T to 50T through the NSF’s ACCESS program.
To request /project or OSN storage, email hpc@umass.edu.
Archival storage
Researchers who need to store data long-term (several years) can purchase archival tape storage through NESE’s Tape Storage. NESE Tape is extremely cost-effective, high-capacity storage meant to house data that’s not often used or modified. To request NESE Tape storage, email hpc@umass.edu.
Storage summary and mountpoint table
Mountpoint | Name | Type | Base quota | Notes |
---|---|---|---|---|
/home | Home directories | HDD | 50 GB | Home directories should be used only for user init files. |
/work/pi_<pi-username> | Work directories | SSD | 1 TB | Work is the primary location for running cluster jobs. This is a shared folder for all users in the PI group. |
/project | Project directories | HDD | As needed | Project directories are available to PIs upon request. A common use case is generating job output in /work and copying it to /project afterwards. Not for job I/O. |
/scratch | Scratch space | SSD | N/A | See the HPC Workspace scratch documentation. |
/nese | NESE mounts | HDD | Varies | DEPRECATED: Legacy location for mounts from the Northeast Storage Exchange (NESE). Not for job I/O. |
/nas | Buy-in NAS mounts | Varies | Varies | DEPRECATED: Location for legacy buy-in NAS hardware. |
/gypsum | Gypsum devices | HDD | Varies | DEPRECATED: Storage from the former UMass Amherst CICS Gypsum cluster. |
I need more storage!
To request additional storage on Unity:
- Check out our storage management information to determine whether you can reduce your storage use without an expansion (see the sketch after this list).
- Determine the amount, duration, and type of storage needed using our handy flowchart and our storage descriptions.
- If you’re requesting a storage expansion that requires payment (see the storage expansion options table), identify the appropriate institutional payment method (e.g., speedtype, Chartfield string) for your payment source, along with the name and email of the finance representative within your department. If you’re unsure what to use, contact your institution’s representative for institution-specific information.
- Email hpc@umass.edu. If you’re not the PI (head) of your research group, this must be done by your PI or with your PI’s consent.
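For step 1, a couple of standard commands can show where the space is going before you commit to an expansion. This is a sketch using plain coreutils; the storage management page may describe cluster-specific tooling.

```bash
# List the largest top-level directories in your group's work directory
du -h --max-depth=1 /work/pi_<pi-username> | sort -rh | head -n 15

# Check your home directory usage against the 50 GB base quota
du -sh /home/$USER
```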
Storage expansion options
Resource | Free Tier Threshold | Notes |
---|---|---|
PI group work directories | 1T | Free tier: automatically allocated on PI account creation. Purchasing: available in 1T increments on 6 month intervals, up to 3 years at a time. |
PI group project directories | 5T (URI, UMassD threshold may vary) | Free tier: allocated upon request via the storage form. Purchasing: available in 5T increments on 1 year intervals, up to 5 years at a time. |
Scratch space | 50T soft cap | No purchasing necessary, see our scratch documentation. |
NESE Tape | N/A | Free tier: none available. Purchasing: available in 10T increments on 5 year intervals. |
Open Storage Network (OSN) S3 from URI and UMass Amherst | TBD | Purchasing: TBD |
Storage expansion flowchart
The following flowchart is intended to help you decide what type of storage you need, or whether your existing data is ideally placed.
```mermaid
flowchart TD
    start("`We need more storage!`")
    quotaCheck("`My group can't reduce space without an increase.`")
    active("`Are the data needed for active jobs?`")
    frequent("`Can you stage subsets of this data in high performance storage as needed for active jobs?`")
    longtermLimited("`Do you need to archive data for a long time without frequent access or modification?`")
    sharing("`Do you need to share this data publicly?`")
    tape("`Request/Purchase NESE Tape archival storage.`")
    osn("`Request/Purchase OSN S3 storage or NESE /project space.`")
    intermediate("`Do you need additional storage for workflows that create temporary intermediate files?`")
    inactiveWorkData("`Does your group have inactive data in /work that could be moved to other storage?`")
    scratch("`Try Unity's scratch space: HPC Workspace.`")
    publicData("`Do you need additional storage to store a public, open-access dataset?`")
    email("`Email hpc@umass.edu about /datasets.`")
    purchaseWork("`Purchase additional /work storage.`")
    start --> quotaCheck
    quotaCheck --> intermediate
    intermediate -- NO --> active
    intermediate -- YES --> scratch
    active -- YES --> publicData
    active -- NO --> longtermLimited
    frequent -- NO --> purchaseWork
    frequent -- YES --> osn
    longtermLimited -- YES --> sharing
    sharing -- YES --> osn
    sharing -- NO --> tape
    longtermLimited -- NO --> osn
    inactiveWorkData -- YES --> osn
    inactiveWorkData -- NO --> frequent
    publicData -- YES --> email
    publicData -- NO --> inactiveWorkData
    click scratch "/documentation/managing-files/hpc-workspace/" "Scratch space link"
    click email "mailto:hpc@umass.edu" "Help email"
    click quotaCheck "/documentation/managing-files/quotas/" "Space management link"
    click osn "#mid-performance-large-storage" "Mid performance storage"
    click purchaseWork "#high-performance-storage" "High performance storage"
    click tape "#archival-storage" "Tape storage"
```
Snapshots
Backups are not available on the Unity cluster. However, temporary snapshots are created each day at 5am UTC, and snapshots older than three days are deleted. You can restore files yourself from the read-only snapshots (see the table below).
Filesystem | Name | Snapshot location |
---|---|---|
/home/<username> | Home directory | /snapshots/home/unity_<timestamp>/<username> |
/work/pi_<pi-username> | Work directory | /snapshots/work/unity_<timestamp>/pi_<pi-username> |
Restore files from a snapshot
The following code sample shows how to restore a specific directory from a snapshot. The example copies into a separate restore directory first to ensure that current files aren’t overwritten.

```bash
# Stage the restore in a separate directory so nothing in place gets clobbered
mkdir ~/restore

# -a (archive) preserves permissions, timestamps, and directory structure
cp -a /snapshots/home/unity_2023-02-08_05_00_00_UTC/<username>/path/to/file/or/directory ~/restore/
```
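If you don’t know the exact timestamp, you can list the available snapshots first; each subdirectory corresponds to one daily snapshot, following the layout in the table above.

```bash
# Show the snapshot timestamps currently available for /home and /work
ls /snapshots/home/
ls /snapshots/work/
```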