Computing resources

Our clusters are grouped by CPU generation, available RAM size and InfiniBand network. They are then sliced into partitions (see Clusters/Partitions overview).

Big picture

Hardware specifications per node:

Clusters | CPU family             | nb cores | RAM (GB)          | Network | main Scratch     | Best use case
---------|------------------------|----------|-------------------|---------|------------------|-------------------------------------------
E5       | E5                     | 16       | 62, 124, 252      | 56Gb/s  | /scratch/E5N     | training, sequential, small parallel
Lake     | E5 + GPU               | 8        | 124               | 56Gb/s  | /scratch/Lake    | sequential, small parallel, GPU computing
Lake     | Sky Lake, Cascade Lake | 32       | 94, 124, 190, 380 | 56Gb/s  | /scratch/Lake    | medium parallel, sequential
Epyc     | AMD Epyc               | 128      | 510               | 100Gb/s | /scratch/Lake    | large parallel
Cascade  | Cascade Lake           | 96       | 380               | 100Gb/s | /scratch/Cascade | large parallel

See Clusters/Partitions overview for more hardware details and partition slicing. Available RAM sizes may vary slightly (not all RAM is available for computing, GB vs GiB, etc.).
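
The exact per-node values as seen by Slurm (sockets, cores, usable memory, features, state) can be checked with scontrol for any node; for instance, for one of the Cascade nodes (s92node02, which also appears in the sinfo listings below):

$ scontrol show node s92node02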

Available resources

Use the sinfo [1] command to get a dynamic view of the partitions (the default partition is marked with a ‘*’; see also sinfo -l, sinfo -lNe and sinfo --summarize):

$ sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
E5*          up 8-00:00:00      4   idle c82gluster[1-4]
Cascade      up 8-00:00:00     77   idle s92node[02-78]

Or the state of a particular partition:

$ sinfo -p Epyc
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
Epyc         up 8-00:00:00      1    mix c6525node002
Epyc         up 8-00:00:00     12  alloc c6525node[001,003-006,008-014]
Epyc         up 8-00:00:00      1   idle c6525node007

To see more information (CPUs and CPU layout, RAM size [in MiB], state/availability), use one of the following:

$ sinfo --exact --format="%9P %.8z %.8X %.8Y %.8c %.7m %.5D %N"
PARTITION    S:C:T  SOCKETS    CORES     CPUS  MEMORY NODES NODELIST
E5*          2:8:1        2        8       16  128872     4 c82gpgpu[31-34]
E5*          2:8:1        2        8       16   64328     3 c82gluster[2-4]
E5-GPU       2:4:1        2        4        8  128829     1 r730gpu20
Lake        2:16:1        2       16       32  385582     3 c6420node[172-174]
Cascade     2:48:1        2       48       96  385606    77 s92node[02-78]

$ sinfo --exact --format="%9P %.8c %.7m %.5D %.14F %N"
PARTITION     CPUS  MEMORY NODES NODES(A/I/O/T) NODELIST
E5*             16  128872     4        3/1/0/4 c82gpgpu[31-34]
E5*             16   64328     3        3/0/0/3 c82gluster[2-4]
E5-GPU           8  128829     1        0/1/0/1 r730gpu20
Lake            32  385582     3        1/2/0/3 c6420node[172-174]
Cascade         96  385606    77     47/26/4/77 s92node[02-78]

$ sinfo --exact --format="%9P %.8c %.7m %.20C %.5D %25f" --partition E5,E5-GPU
PARTITION     CPUS  MEMORY        CPUS(A/I/O/T) NODES AVAIL_FEATURES
E5*             16  256000       248/120/16/384    24 local_scratch
E5*             16  128828         354/30/0/384    24 (null)
E5*             16  257852          384/0/0/384    24 (null)
E5*             32  257843          384/0/0/384    12 (null)
E5*             16   64328            48/0/0/48     3 (null)
E5*             16  128872            64/0/0/64     4 (null)
E5-GPU           8  127000         32/128/0/160    20 gpu

A/I/O/T stands for Allocated/Idle/Other/Total, counted in nodes for NODES(A/I/O/T) and in CPUs for CPUS(A/I/O/T).
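
For a quick per-partition load overview, the same fields can be combined in a single command (a sketch reusing only the %P, %C and %F format specifiers already used above):

$ sinfo --format="%9P %.20C %.14F"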

$ sinfo -lN | less
NODELIST     NODES PARTITION       STATE CPUS    S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON
[...]
c82gluster4      1       E5*        idle 16      2:8:1  64328        0      1   (null) none
s92node02        1   Cascade        idle 96     2:48:1 385606        0      1   (null) none
[...]

Important

  • Hyper-Threading [2] is enabled on all Intel nodes, but logical cores are not exposed as computing resources (only physical cores can be requested).

  • RAM sizes are given in MiB, and you cannot reserve more than 94% of a node's RAM (see the example below).
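
As an illustration (a sketch, assuming the 94% limit applies to the MEMORY values reported by sinfo above): a Cascade node reports 385606 MiB, and 0.94 × 385606 ≈ 362469 MiB, so a whole-node memory request on Cascade should stay below that, e.g.:

#SBATCH --partition=Cascade
#SBATCH --mem=362000M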

Basic defaults

  • default partition: E5

  • default time: 10 minutes

  • default cpu(s): 1 core

  • default memory size: 4 GiB per core (see the override example below)
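
These defaults apply when nothing else is requested; a minimal submission script overriding them could look like the following (the partition name E5 comes from the table above, the other values are only an example):

#!/bin/bash
#SBATCH --job-name=my_small_job
#SBATCH --partition=E5
#SBATCH --time=01:00:00
#SBATCH --ntasks=4
#SBATCH --mem-per-cpu=4096M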

Features

Some nodes have features [3] (gpu, local_scratch, etc.).

To request a feature/constraint, you must add the following line to your submit script: #SBATCH --constraint=<feature>. Example:

#!/bin/bash
#SBATCH --job-name=my_job_needs_local_scratch
#SBATCH --time=02:00:00
#SBATCH --ntasks=8
#SBATCH --mem-per-cpu=4096M
#SBATCH --constraint=local_scratch

Only nodes whose features match the job constraints will be used to satisfy the request.
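
To see which nodes currently advertise a given feature, the AVAIL_FEATURES column can be listed node by node (a sketch reusing the %N and %f format fields shown earlier):

$ sinfo --Node --format="%20N %30f"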

Maximums

Here are the maximum usable resources per job:

  • maximum wall time: 8 days (‘8-0:0:0’, in ‘days-hours:minutes:seconds’ format)

  • maximum nodes and/or cores (and GPUs) per job, per partition (see the table and the example below):

Partition | nodes | cores | gpu
----------|-------|-------|----
E5        | 24    | 384   | -
E5-GPU    | 18    | 144   | 18
Lake      | 24    | 768   | -
Epyc      | 14    | 1792  | -
Cascade   | 76    | 7296  | -
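
As an illustration of these limits, a large parallel job on the Cascade partition (96 cores per node) staying within the 8-day wall time could be requested as follows; the node count here is only an example:

#!/bin/bash
#SBATCH --job-name=large_parallel_job
#SBATCH --partition=Cascade
#SBATCH --time=8-00:00:00
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=96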

Anything beyond these limits must be justified using our contact forms.