Computing resources
Our clusters are grouped by CPU generation, available RAM size and InfiniBand network. They are then sliced into partitions (see Clusters/Partitions overview).
Big picture
Hardware specifications per node:
Clusters | CPU family | nb cores | RAM (GB) | Network | main Scratch | Best use case |
---|---|---|---|---|---|---|
E5 | E5 | 16 | 62, 124, 252 | 56Gb/s | /scratch/E5N | training, sequential, small parallel |
Lake | E5 + GPU | 8 | 124 | 56Gb/s | /scratch/Lake | sequential, small parallel, GPU computing |
 | Sky Lake | 32 | 94, 124, 190, 380 | | | medium parallel, sequential |
 | Cascade Lake | | | | | |
Epyc | AMD Epyc | 128 | 510 | 100Gb/s | /scratch/Lake | large parallel |
Cascade | Cascade Lake | 96 | 380 | 100Gb/s | /scratch/Cascade | large parallel |
See Clusters/Partitions overview for more hardware details and partition slicing. Available RAM sizes may vary slightly (not all RAM is available for computing, GB vs GiB, etc.).
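To target one of these clusters, select the matching partition in your submit script. A minimal sketch (job and program names are hypothetical, the options are standard Slurm ones):
#!/bin/bash
#SBATCH --job-name=my_first_job
#SBATCH --partition=Cascade          # one of the partitions from the table above
#SBATCH --time=01:00:00              # requested wall-time (hours:minutes:seconds)
#SBATCH --ntasks=96                  # number of cores
srun ./my_parallel_program           # hypothetical executable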
Available resources
Use the sinfo [1] command to get a dynamic view of the partitions (the default one is noted with a ‘*’; see also sinfo -l, sinfo -lNe and sinfo --summarize):
$ sinfo
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
E5* up 8-00:00:00 4 idle c82gluster[1-4]
Cascade up 8-00:00:00 77 idle s92node[02-78]
Or the state of a particular partition:
$ sinfo -p Epyc
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
Epyc up 8-00:00:00 1 mix c6525node002
Epyc up 8-00:00:00 12 alloc c6525node[001,003-006,008-014]
Epyc up 8-00:00:00 1 idle c6525node007
To see more information (CPU count and organization, RAM size [in MiB], state/availability), use one of these:
$ sinfo --exact --format="%9P %.8z %.8X %.8Y %.8c %.7m %.5D %N"
PARTITION S:C:T SOCKETS CORES CPUS MEMORY NODES NODELIST
E5* 2:8:1 2 8 16 128872 4 c82gpgpu[31-34]
E5* 2:8:1 2 8 16 64328 3 c82gluster[2-4]
E5-GPU 2:4:1 2 4 8 128829 1 r730gpu20
Lake 2:16:1 2 16 32 385582 3 c6420node[172-174]
Cascade 2:48:1 2 48 96 385606 77 s92node[02-78]
$ sinfo --exact --format="%9P %.8c %.7m %.5D %.14F %N"
PARTITION CPUS MEMORY NODES NODES(A/I/O/T) NODELIST
E5* 16 128872 4 3/1/0/4 c82gpgpu[31-34]
E5* 16 64328 3 3/0/0/3 c82gluster[2-4]
E5-GPU 8 128829 1 0/1/0/1 r730gpu20
Lake 32 385582 3 1/2/0/3 c6420node[172-174]
Cascade 96 385606 77 47/26/4/77 s92node[02-78]
$ sinfo --exact --format="%9P %.8c %.7m %.20C %.5D %25f" --partition E5,E5-GPU
PARTITION CPUS MEMORY CPUS(A/I/O/T) NODES AVAIL_FEATURES
E5* 16 256000 248/120/16/384 24 local_scratch
E5* 16 128828 354/30/0/384 24 (null)
E5* 16 257852 384/0/0/384 24 (null)
E5* 32 257843 384/0/0/384 12 (null)
E5* 16 64328 48/0/0/48 3 (null)
E5* 16 128872 64/0/0/64 4 (null)
E5-GPU 8 127000 32/128/0/160 20 gpu
A/I/O/T stands for Allocated/Idle/Other/Total (counted in nodes for NODES(A/I/O/T) and in CPUs for CPUS(A/I/O/T)).
$ sinfo -lN | less
NODELIST NODES PARTITION STATE CPUS S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON
[...]
c82gluster4 1 E5* idle 16 2:8:1 64328 0 1 (null) none
s92node02 1 Cascade idle 96 2:48:1 385606 0 1 (null) none
[...]
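To quickly spot free resources, node states can also be filtered directly; a sketch using standard sinfo options:
$ sinfo --states=idle --partition=E5,Lake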
Basic defaults
default partition: E5
default time: 10 minutes
default cpu(s): 1 core
default memory size: 4GiB / core
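These defaults can be cross-checked against the live Slurm configuration; one possible check, assuming the default E5 partition (the field names are standard scontrol output, not site-specific):
$ scontrol show partition E5 | grep -E 'DefaultTime|MaxTime|DefMemPer'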
Features
Some nodes have features [3] (gpu, local_scratch, etc.).
To request a feature/constraint, you must add the following line to your submit script: #SBATCH --constraint=<feature>. Example:
#!/bin/bash
#SBATCH --job-name=my_job_needs_local_scratch
#SBATCH --time=02:00:00
#SBATCH --ntasks=8
#SBATCH --mem-per-cpu=4096M
#SBATCH --constraint=local_scratch
Only nodes having features matching the job constraints will be used to satisfy the request.
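The same constraint can also be passed on the command line at submission time; a sketch, assuming a submit script named my_job.sh (hypothetical name):
$ sbatch --constraint=local_scratch my_job.sh
The AVAIL_FEATURES column of the sinfo examples above shows which features each group of nodes advertises.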
Maximums
Here are some maximums of usable resources per job:
maximum wall-time: 8 days (‘8-0:0:0’, as ‘days-hours:minutes:seconds’)
maximum nodes per job and/or maximum cores per job:
Partition | nodes | cores | gpu |
---|---|---|---|
E5 | 24 | 384 | |
E5-GPU | 18 | 144 | 18 |
Lake | 24 | 768 | |
Epyc | 14 | 1792 | |
Cascade | 76 | 7296 | |
Anything more must be justified using our contact forms.
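For illustration, a sketch of a job header staying within these limits on the Cascade partition (script body and program name are hypothetical):
#!/bin/bash
#SBATCH --job-name=large_parallel_job
#SBATCH --partition=Cascade
#SBATCH --time=8-00:00:00            # maximum wall-time (days-hours:minutes:seconds)
#SBATCH --nodes=4                    # well below the 76-node limit
#SBATCH --ntasks-per-node=96         # whole Cascade nodes (96 cores each)
srun ./my_mpi_program                # hypothetical MPI executable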