Use scratch
A scratch filesystem is a temporary space where you can copy input data and write intermediate, temporary and output results from a job. Working with a scratch is not mandatory, but it offers some advantages:
- larger space compared to $HOME or group-shared datasets (/Xnfs/$GROUP),
- greater IOPS [1].
However, scratches are shared spaces. Their performance depends on good use. They should NOT contain:
- documentation (too many small files, and… why?) -> $HOME, /Xnfs/$GROUP
- symbolic links (very tiny files, useless IO)
- source code or software (small files, useless IO) -> $HOME, /Xnfs/$GROUP
- conda, modules or libraries, virtualenv… -> $HOME, /Xnfs/$GROUP
Hint
Only input data, temporary data and output data.
Warning
Scratches are not meant for long-term storage. Clean up as soon as your job finishes.
DO NOT STORE source code (including conda, virtualenv…), small files or symbolic links in $SCRATCH. It degrades performance very quickly, FOR EVERYONE.
When scratches are full, PSMN staff will erase files and directories blindly.
You have been warned.
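As an illustration of the rules above, here is a minimal sketch that counts items that should not live on scratch. The function name `audit_scratch`, the 4096-byte "small file" threshold and the environment markers it looks for (`conda-meta`, `pyvenv.cfg`) are assumptions made for this example, not PSMN tools or policy.

```shell
#!/usr/bin/env bash
# Sketch only: audit_scratch is a hypothetical helper, not a PSMN tool.
# It counts items that the rules above say do not belong on scratch.

audit_scratch() {
    local dir="$1"
    local max_bytes="${2:-4096}"   # assumed "small file" threshold, in bytes

    if [ ! -d "$dir" ]; then
        echo "not a directory: $dir" >&2
        return 1
    fi

    local links small envs
    # Symbolic links anywhere below the directory.
    links=$(find "$dir" -type l | wc -l | tr -d ' ')
    # Regular files strictly smaller than the threshold.
    small=$(find "$dir" -type f -size "-${max_bytes}c" | wc -l | tr -d ' ')
    # Markers left by conda environments (conda-meta) and Python venvs (pyvenv.cfg).
    envs=$(find "$dir" \( -name conda-meta -o -name pyvenv.cfg \) | wc -l | tr -d ' ')

    echo "symlinks=${links} small_files=${small} env_markers=${envs}"
}

# Example call (adapt the path to your own scratch directory):
# audit_scratch "/scratch/$CLUSTER/$USER"
```

Non-zero counts do not necessarily mean something is wrong (a job may legitimately write a few small output files), but they are a useful prompt to move code and environments back to $HOME or /Xnfs/$GROUP.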
Two types of scratch are available: global to a partition (or cluster), or local to a node. See Login nodes and Clusters/Partitions overview for access, distribution and paths.
| Name | Type | Snapshots | Quotas | Performance | Purpose |
|---|---|---|---|---|---|
| | glusterfs | no | no | high | |
| local scratch (ssd, disk) | ext4, zfs | no | no | medium, high | job specific output requiring IOPS [1] (120 days lifetime residency) |
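Since scratch data has a limited residency, it can help to list old files before they are purged. The sketch below assumes that age is measured from the last modification time, which this page does not state; `list_stale` is a hypothetical helper name and the 120-day default simply mirrors the figure in the table.

```shell
#!/usr/bin/env bash
# Sketch only: list_stale is a hypothetical helper, not a PSMN tool.
# Assumption: "residency" is judged by modification time (not confirmed here).

list_stale() {
    local dir="$1"
    local days="${2:-120}"   # default mirrors the 120-day figure in the table

    # Print regular files last modified more than $days full days ago.
    find "$dir" -type f -mtime "+${days}" -print
}

# Example call (adapt the path to your own scratch directory):
# list_stale "/scratch/$CLUSTER/$USER" 120
```

Review the list, copy anything worth keeping to long-term storage, then delete the rest yourself.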
The PSMN network topology diagram shows how scratches connect to clusters.
Examples
Here are two ways of using scratch:
- Manual copy: from a login node, you can create, copy and delete files in /scratch/$CLUSTER/$USER/whatever/, before and after a job (or set of jobs).
- Automated copy: inside your batch script, you can copy input data to scratch and tell your software which $TMPDIR or $SCRATCHDIR to use, then clean up at the end of a successful job. This is mandatory for local scratch, on specific nodes. See Clusters/Partitions overview for paths.
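The automated pattern can be sketched as a small shell function to call from a batch script: stage input in, run the program with $TMPDIR pointing at scratch, copy results back, and clean up only on success. Everything named here is an assumption for illustration: the function `run_on_scratch`, the `input/` and `output/` subdirectory layout, and the fallback to /tmp when neither $SCRATCHDIR nor $TMPDIR is set. Scheduler directives are left out on purpose, since they depend on the cluster.

```shell
#!/usr/bin/env bash
# Sketch only: run_on_scratch is a hypothetical helper, not a PSMN tool.
# Usage: run_on_scratch INPUT_DIR RESULT_DIR COMMAND [ARGS...]

run_on_scratch() {
    local input_dir="$1" result_dir="$2"
    shift 2

    # Prefer $SCRATCHDIR, then $TMPDIR; /tmp is an assumed last resort.
    local base="${SCRATCHDIR:-${TMPDIR:-/tmp}}"
    local work
    work=$(mktemp -d "${base%/}/job.XXXXXX") || return 1

    # Stage in: copy the input data into a private working directory.
    mkdir -p "$work/input" "$work/output" || { rm -rf "$work"; return 1; }
    cp -R "$input_dir/." "$work/input/" || { rm -rf "$work"; return 1; }

    # Run the command from the working directory, with TMPDIR on scratch.
    local status=0
    ( cd "$work" && TMPDIR="$work" "$@" ) || status=$?

    # Stage out: copy results back only if the command succeeded.
    if [ "$status" -eq 0 ]; then
        mkdir -p "$result_dir" && cp -R "$work/output/." "$result_dir/" || status=$?
    fi

    # Clean up after a successful job; keep the directory for debugging otherwise.
    if [ "$status" -eq 0 ]; then
        rm -rf "$work"
    else
        echo "job failed (exit $status); scratch kept at $work" >&2
    fi
    return "$status"
}

# Example call with hypothetical paths and program name:
# run_on_scratch "$HOME/project/input" "$HOME/project/results" \
#     my_solver input/data.in output/
```

Keeping the working directory on failure makes debugging easier, but it also means you must delete it by hand afterwards so that it does not linger on the shared scratch.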
See our repository of example scripts.
If you do not feel comfortable with scripts and scratch, please ask us over a coffee.
Note
These machines were set up thanks to the preparatory work, recipes and integrations carried out on the CBP experimental platform.