1. Storage Systems

RCIC supports several different storage systems, each with its own “sweet spot” for price and performance.

The storage systems depicted below are all available from the HPC3 cluster. Campus Storage (CRSP) is unique in that it can also be accessed from desktops, laptops, and other systems without going through HPC3.

The two major parallel file systems are DFS and CRSP; the CRSP vs. DFS section below can help you choose the right system (or combination of systems) to store your data.

For each of the HPC3 storage systems, connectivity, file system architecture, and physical hardware all contribute to performance.

Fig. 1.1 HPC3 Storage pictogram

Attention

Storage is shared among all users. The nature of networked storage makes it possible for a single user to render a file system unusable for all.

The following summary explains what each storage system provides and what it should be used for, with links to in-depth usage guides.

Home
See details in HOME storage guide.
Provides convenient access on all nodes via an NFS mount
Slowest performance, but sufficient when used properly
Use to keep small source code or compiled binaries
Use for small data files (on the order of MBs)
Do not use for data intensive batch jobs
Scratch
See details in Scratch storage guide.
Local disk space unique to each compute node
Fastest performance; data is removed when the job completes
Use as scratch storage for batch jobs that repeatedly access many small files or make frequent small reads/writes (see the sketch after this list)
Not available on login nodes
Parallel
See details in DFS storage guide.
Provides convenient access on all nodes via mount
Performance is best for processing medium/large data files (on the order of hundreds of MBs to GBs)
Use for batch jobs; the most common place for data used in batch jobs
Use to keep source code and binaries
Do not use for writing/reading many small files
Campus Storage
See details in CRSP storage guide.
Provides convenient access on all nodes via an NFS mount
Performance is best for processing medium/large data files (on the order of hundreds of MBs to GBs)
Can be used for batch jobs, but DFS or local $TMPDIR storage is usually a better choice
Use to keep source code and binaries
Do not use for writing/reading many small files
Campus Storage Annex
See details in CRSP ANNEX storage guide.
Provides convenient access on all nodes via a BeeGFS mount
Performance is best for processing medium/large data files (on the order of hundreds of MBs to GBs)
Do not use for writing/reading many small files
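
The scratch guidance above is the key pattern for jobs that touch many small files: stage inputs into node-local $TMPDIR, work there, and write a consolidated result back to the parallel file system. Below is a minimal Python sketch of that pattern, assuming the scheduler sets $TMPDIR on the compute node; the /dfs6/pub/... paths and the *.txt input pattern are hypothetical placeholders, not real locations.

    import os
    import shutil
    from pathlib import Path

    def stage_and_process(src_dir, results_dir):
        """Copy many small inputs to node-local scratch, work there, copy results back."""
        scratch = Path(os.environ["TMPDIR"])   # node-local scratch assigned to the job
        workdir = scratch / "inputs"
        workdir.mkdir(parents=True, exist_ok=True)

        # Stage the small files once instead of reading them repeatedly
        # over the network from DFS/CRSP/home.
        for f in Path(src_dir).glob("*.txt"):  # hypothetical input pattern
            shutil.copy2(f, workdir / f.name)

        # ... process files in workdir on fast local disk ...
        combined = scratch / "combined.txt"
        with combined.open("w") as out:
            for f in sorted(workdir.iterdir()):
                out.write(f.read_text())

        # Write one consolidated result back to the parallel file system.
        dest = Path(results_dir)
        dest.mkdir(parents=True, exist_ok=True)
        shutil.copy2(combined, dest / combined.name)

    if __name__ == "__main__":
        # Hypothetical DFS paths; substitute your own lab/user areas.
        stage_and_process("/dfs6/pub/myuser/small_inputs", "/dfs6/pub/myuser/results")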

1.1. CRSP vs. DFS

The largest capacity storage systems available are CRSP and DFS. Both are parallel filesystems but have different cost, availability, and usage models. The table below highlights the key differences and similarities between these two systems.

Table 1.2 Compare DFS and CRSP

Cost
    CRSP: $50/TB/Year
    DFS: $100/TB/5 Years
Availability
    CRSP: Highly available. No routinely planned outages. Can survive many types of hardware failures without downtime.
    DFS: Routine maintenance outages about 4 times per year. Survives disk failures (RAID) only.
Access
    CRSP: Access from any campus IP or VPN-connected laptop.
    DFS: Access only from HPC3.
Snapshots
    CRSP: Daily file system snapshots allow users to self-recover from deletions or overwrites of files.
    DFS: No snapshots.
Backups
    CRSP: Backed up daily offsite with 90-day retention of deleted/changed files.
Quota Management
    CRSP: Labs have a space/#files quota. Users and groups can have (sub)quotas set within the lab.
    DFS: All users share the same group quota. All files must be written with the same Unix group ID to access the quota'ed space (see the sketch after this table).
Performance
    CRSP: High performance, but DFS is a better match for direct use from HPC3.
    DFS: High performance. The most common storage used on HPC3.
Encryption at rest
    CRSP: All data is encrypted at rest.
    DFS: Only dfs3b is encrypted at rest.
File System
    CRSP: IBM Storage Scale (aka GPFS)
    DFS: BeeGFS with ThinkParQ support (see Details)
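
Because DFS quota is accounted by Unix group ID, it can be useful to check that files in a lab area carry the expected group. Below is a minimal Python sketch of such a check; the group name "mylab" and the /dfs6/pub/myuser path are hypothetical placeholders.

    import grp
    from pathlib import Path

    def files_with_wrong_group(top, expected_group):
        """Yield files under 'top' whose group differs from the expected lab group."""
        expected_gid = grp.getgrnam(expected_group).gr_gid
        for path in Path(top).rglob("*"):
            try:
                if path.stat().st_gid != expected_gid:
                    yield path
            except OSError:
                continue  # file vanished or cannot be stat'ed; skip it

    if __name__ == "__main__":
        # Hypothetical lab group and DFS path; substitute your own.
        for p in files_with_wrong_group("/dfs6/pub/myuser", "mylab"):
            print(p)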