Transfer protocols
A number of methods allow transferring data in and out of the PSMN computing center. For most cases, we recommend using SSH-based file transfer commands, such as scp, sftp, or rsync. They will provide the best performance for data transfers to and from the computing center.
For the rest of this documentation, replace mylogin with your login as provided by PSMN.
Note
All examples below are based on the following configuration and login nodes.
SCP (Secure Copy)
The easiest command to use to transfer files to/from PSMN is scp. It works like the cp command, except that it works over the network to copy files from one computer to another, using the SSH protocol.
For instance, the following command will copy the file named myfile from my local machine to the mydir directory in my home directory on PSMN (on the allo-psmn gateway):
$ scp myfile mylogin@allo-psmn.psmn.ens-lyon.fr:~/mydir/
Important
While this is handy, for large transfer operations it is better to use multi-hops (see related documentation and login nodes):
$ scp myfile mylogin@x5570comp1:~/mydir/
(replace mylogin with your login as provided by PSMN, and x5570comp1 with your preferred login node).
You can copy myfile under a different name, or to another directory, with the following commands:
$ scp myfile mylogin@x5570comp1:~/inputfile
$ scp myfile mylogin@x5570comp1:~/mydir/subdir/foofile
To copy back files from PSMN to your local machine, you just need to reverse the order of the arguments, as in this example:
$ scp mylogin@x5570comp1:~/inputfile local_inputfile
scp also supports recursive copying of directories, with the -r option:
$ scp -r mydir/ mylogin@x5570comp1:~/
SCP from outside ENS network
To transfer your files between your PC and allo-psmn from outside the ENS network, you have to use the ssh.psmn gateway as a proxy (see Connection on PSMN servers). In a terminal on your workstation, you could execute:
# your PC -> your PSMN home:
$ scp -oProxyCommand="ssh mylogin@ssh.psmn.ens-lyon.fr netcat -w1 allo-psmn %p" source_file mylogin@allo-psmn:~/destination_file
# your PSMN home -> your PC:
$ scp -oProxyCommand="ssh mylogin@ssh.psmn.ens-lyon.fr netcat -w1 allo-psmn %p" mylogin@allo-psmn:~/source_file destination_file
where source_file and destination_file should be changed as needed. To transfer a directory rather than a file, add the -r option to scp (i.e. scp -r -oProxyCommand=...).
Important
While this is handy, for large transfer operations it is better to use multi-hops (see related documentation and login nodes). The command then reduces to:
$ scp source_file x5570comp1:~/destination_file
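The short form above only works once the OpenSSH client knows how to reach the login node through the gateways. The following is a minimal sketch of such a client configuration, not necessarily the one PSMN recommends: the file name psmn_ssh_config.example is hypothetical, the Host aliases are illustrative, and it assumes an OpenSSH client recent enough to support ProxyJump. It writes the sample to a scratch file for review rather than touching ~/.ssh/config.

```shell
# Sketch: write a sample OpenSSH client configuration with chained jump hosts.
# The output file name is hypothetical; review the content, then merge it
# into ~/.ssh/config by hand.
cfg="${TMPDIR:-/tmp}/psmn_ssh_config.example"
cat > "$cfg" <<'EOF'
# Outer gateway, reachable from outside the ENS network
Host ssh.psmn
    HostName ssh.psmn.ens-lyon.fr
    User mylogin

# Inner gateway, reached through the outer one
# (the ProxyJump line is an assumption; it can be dropped inside the ENS network)
Host allo-psmn
    HostName allo-psmn.psmn.ens-lyon.fr
    User mylogin
    ProxyJump ssh.psmn

# Login node, reached through the inner gateway
Host x5570comp1
    User mylogin
    ProxyJump allo-psmn
EOF

# Show the result so it can be checked before use.
cat "$cfg"
```

With such entries in place, scp, sftp and rsync can all address x5570comp1 directly, because they reuse the ssh client configuration.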
SFTP (Secure File Transfer Protocol)
SFTP clients are interactive file transfer programs (similar to FTP), which perform all operations over an encrypted transport.
A variety of graphical SFTP clients are available, such as FileZilla, MobaXterm and WinSCP.
When setting up your connection to PSMN in one of these clients, use the following information:
host: x5570comp1
port: 22
ssh gateway (or jump host): allo-psmn.psmn.ens-lyon.fr
port: 22
username: your login at PSMN
password: your password at PSMN (if needed)
ssh key: your personal ssh private key file (preferred method)
MobaXterm or WinSCP (via PuTTY/KiTTY) can use ssh keys and ssh-agent, and multi-hops.
However, as FileZilla has no native support for SSH tunnelling (aka jump hosts/port forwarding), you will have to set up an SSH tunnel on your local machine:
$ ssh -L 3322:x5570comp1:22 mylogin@allo-psmn.psmn.ens-lyon.fr
then configure FileZilla to connect to localhost:3322 using your PSMN credentials. This will also work for MobaXterm or WinSCP on Windows (using OpenSSH). It will also be necessary when more than one hop is needed (localhost -> ssh.psmn -> allo-psmn -> x5570comp1).
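For the case with more than one hop, the tunnel can be combined with a jump host. The helper below is a hedged sketch: the function name psmn_tunnel_cmd is hypothetical, it only prints the command so that it can be reviewed first, and it assumes an OpenSSH client that supports the -J (ProxyJump) option and that ssh.psmn can resolve the short name allo-psmn.

```shell
# Build (and only print) a tunnel command for the two-gateway case:
# localhost -> ssh.psmn -> allo-psmn -> login node.
psmn_tunnel_cmd() {
    login=$1          # PSMN login
    node=$2           # login node, e.g. x5570comp1
    port=${3:-3322}   # local port the graphical client will connect to
    # -N: open the tunnel without running a remote command
    # -L: forward local $port to port 22 of the login node
    # -J: go through the outer gateway first
    echo "ssh -N -L ${port}:${node}:22 -J ${login}@ssh.psmn.ens-lyon.fr ${login}@allo-psmn"
}

psmn_tunnel_cmd mylogin x5570comp1
```

Run the printed command in a terminal, leave it open, and point the graphical client at localhost on the chosen port, as described above.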
OpenSSH also provides a command-line SFTP client, named sftp, which can take advantage of ssh-agent, ssh keys and a configured ProxyJump. Example of use:
$ sftp mylogin@x5570comp1
Connected to x5570comp1.
sftp>
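sftp can also be driven non-interactively from a batch file, which is convenient for repeated transfers. The sketch below assumes key-based or agent-based authentication (batch mode cannot prompt for a password); the batch file name and the file names inside it are placeholders. The sftp command itself is only printed here, not executed.

```shell
# Write a hypothetical sftp batch file; all paths are placeholders.
batch="${TMPDIR:-/tmp}/psmn_sftp_batch.txt"
cat > "$batch" <<'EOF'
cd mydir
put myfile
get inputfile local_inputfile
bye
EOF

# Print the command instead of running it; remove "echo" to execute it.
echo sftp -b "$batch" mylogin@x5570comp1
```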
There are many tutorials online containing more information about SFTP clients. Here’s one.
rsync
If you have complex hierarchies of files to transfer, or if you need to synchronize a set of files and directories between your local machine and PSMN storage, rsync will be one of the best tools to do the job. It efficiently transfers and synchronizes files across systems by checking the timestamp and size of files, which means that it won’t re-transfer files that have not changed since the last transfer, and will complete faster.
Also, if a transfer is interrupted for any reason, you might end up with only some of the files transferred. Rather than restarting the transfer from scratch, rsync will only transfer what still needs to be transferred: missing files, modified files, etc.
For large transfer operations, it is better to use multi-hops (see related documentation and login nodes).
For instance, to transfer the whole ~/test/ folder tree from my local machine to my home directory on PSMN, I can use the following command:
$ rsync -n -avzP -e ssh ~/test/ mylogin@x5570comp1:~/test
Refer to the rsync manual for more options, such as:
--dry-run (-n)
--archive
--verbose
--recursive
--itemize-changes
--append-verify
--progress
--bwlimit=56K
--numeric-ids
Warning
Always test with a dry run first!
As it is very easy to rsync empty or non-existent data over existing data (thereby erasing it), we recommend testing with -n/--dry-run first.
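The effect of a dry run can be tried safely on a scratch directory before touching real data. The following local-only sketch contacts no PSMN host. It uses the --delete option, which is not mentioned above and is introduced here as an assumption: it is the rsync option that makes a real run remove destination files that are missing from the source. All directory and file names are created under a temporary directory.

```shell
# Local-only demonstration: an empty source synchronized onto a
# destination that holds data.
demo=$(mktemp -d)
mkdir -p "$demo/src" "$demo/dst"
echo "precious data" > "$demo/dst/keep.txt"   # exists only on the destination

if command -v rsync >/dev/null 2>&1; then
    # -n makes this a dry run: rsync reports what it would do and changes nothing.
    # Without -n, --delete would remove keep.txt from the destination.
    rsync -n -av --delete "$demo/src/" "$demo/dst/"
else
    echo "rsync not installed; skipping demonstration"
fi

# The destination is untouched after the dry run.
ls "$demo/dst"
```

Reading the dry-run listing before repeating the command without -n is what catches a mistyped or empty source directory.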
fpart (+rsync)
fpart generates lists of files that can be fed to rsync, working around some of rsync’s default behaviours on large file trees:
no parallelism -> small parallelism (3 to 4 processes, don’t be greedy),
large batches that don’t fit in memory -> small batches (start early, fit in memory),
decreasing use of bandwidth over time -> frequent ‘restarts’ maintaining maximum use of bandwidth over time.
See fpart documentation.
$ cd /Xnfs/planetary
$ fpart -L -v -f 2000 -Z -o /tmp/planetary.part.out -W \
'parallel --semaphore -j 4 \
"rsync -e ssh -az --numeric-ids --files-from=${FPART_PARTFILENAME} /Xnfs/planetary/ user@external_server:/data/planetary"' .
This example will scan the /Xnfs/planetary file tree, creating lists of 2000 files each, and feed them to 4 parallel rsync processes that copy these files, from a PSMN login node, over ssh, to external_server.
Refer to the fpart manual for more options and use cases.
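To make the partitioning idea concrete, here is a small local sketch that imitates it with find and split only. It is not how fpart is implemented and it does not replace fpart (which also streams lists as they are produced); the scratch tree, the batch size of 10 and the file names are all made up for the illustration. The rsync commands are printed, not executed.

```shell
# Create a throwaway tree with 25 empty files.
demo=$(mktemp -d)
mkdir -p "$demo/tree/sub"
i=1
while [ "$i" -le 25 ]; do
    : > "$demo/tree/sub/file_$i.dat"
    i=$((i + 1))
done

# List every file relative to the tree root, then cut the list
# into batches of at most 10 entries (part.aa, part.ab, ...).
(cd "$demo/tree" && find . -type f | sort > "$demo/all_files.txt")
split -l 10 "$demo/all_files.txt" "$demo/part."

nparts=$(ls "$demo"/part.* | wc -l | tr -d ' ')
echo "parts: $nparts"

# Each batch could then drive one rsync through --files-from.
for part in "$demo"/part.*; do
    echo "rsync -e ssh -az --numeric-ids --files-from=$part $demo/tree/ user@external_server:/data/planetary"
done
```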
Unison
unison is a file-synchronization tool that is available on PSMN clusters.
SSHFS
Sometimes, moving files in and out of the cluster, and maintaining two copies of each of the files you work on, both on your local machine and on PSMN, may be painful. Fortunately, PSMN offers the ability to mount its home filesystem on your local machine, using a secure and encrypted connection (and vice-versa, if your workstation exposes an SSH server).
With SSHFS, a FUSE-based filesystem implementation used to mount remote SSH-accessible filesystems, you can access your files on PSMN as if they were locally stored on your own computer.
Hint
Be aware that, while very convenient, SSHFS is also quite slow, due to FUSE.
This comes in particularly handy when you need to access those files from an application that is not available on PSMN, but that you already use or can install on your local machine: for example, a data processing program that you have licensed for your own computer but that can’t be used on PSMN, a specific text editor that only runs on macOS, or any data-intensive 3D rendering software that wouldn’t work comfortably enough over a forwarded X11 connection (see also Visualization server).
SSHFS is available for all platforms (Linux, macOS and Windows).
Warning
SSHFS on macOS
SSHFS on macOS is known to try to automatically reconnect filesystem mounts after resuming from sleep or suspend, even without any valid credentials. As a result, it will generate a lot of failed connection attempts and will likely get your IP address blacklisted on ssh.psmn.ens-lyon.fr or allo-psmn.psmn.ens-lyon.fr.
Make sure to unmount your SSHFS drives before putting your macOS system to sleep to avoid this situation.
For instance, on a Linux machine with SSHFS installed, you could mount your PSMN home directory with the following commands:
$ mkdir ~/PSMN_home
$ sshfs mylogin@allo-psmn.psmn.ens-lyon.fr:~/ ~/PSMN_home
(replace mylogin with your login as provided by PSMN).
And to unmount it:
$ umount ~/PSMN_home
or:
$ fusermount -u ~/PSMN_home
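Since a forgotten mount can cause trouble after suspend, it can help to wrap the unmount step in a small helper that first checks whether the directory is really mounted. This is a Linux-oriented sketch: the function name psmn_umount is hypothetical, and it assumes the mountpoint utility (from util-linux) and fusermount are installed.

```shell
# Unmount an SSHFS mount only if the directory is currently a mount point.
psmn_umount() {
    mnt=${1:-$HOME/PSMN_home}   # defaults to the directory used above
    if command -v mountpoint >/dev/null 2>&1 && mountpoint -q "$mnt"; then
        fusermount -u "$mnt" && echo "$mnt unmounted"
    else
        # Either nothing is mounted there, or mountpoint(1) is unavailable.
        echo "$mnt: not a mount point, nothing to do"
    fi
}

# Example call on a scratch directory that is not mounted.
scratch=$(mktemp -d)
psmn_umount "$scratch"
```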
For more details and examples about using SSHFS on your local machine, you can refer to this tutorial.