User's guide of the Boréale cluster¶
Connection, environment and workspaces¶
The connection is done by SSH to the front-end nodes grouped under the name boreale.criann.fr.

Command line syntax: ssh -l mylogin boreale.criann.fr (replacing mylogin by your login). Linux and macOS environments natively include the SSH protocol via a terminal. Under Windows, we recommend the MobaXterm software, which provides a complete working environment based on the SSH protocol (screen export, file transfer).
When you connect for the first time, you will be asked to change your password. Read carefully what is asked:
(current) Password is the current password, not the new one you want to set...
Customizations (environment variables, aliases) are made in a startup file (~/.bashrc) that you create.
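For example, a minimal ~/.bashrc might contain (the aliases and values below are purely illustrative):

```shell
# Illustrative ~/.bashrc additions (names and values are examples)
export EDITOR=vim                # preferred text editor
alias ll='ls -lh'                # long, human-readable listings
alias myjobs='squeue -u $USER'   # shortcut for listing your own jobs
```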
The user has a personal workspace in his or her home directory. By default, a user disk quota of 50 GB is set in this space; the mmlsquota gpfs1:home command shows the quota and the space the user occupies there.
We encourage you to compute in the temporary scratch folders (/dlocal/run/<jobid>, available via the LOCAL_WORK_DIR environment variable) created by the Slurm batch tool for each job. A quota of 10 million files is enforced on the /dlocal folder; the mmlsquota gpfs1:dlocal command shows the quota and the number of files belonging to you in this space.
If you feel that these limits (quotas) are too restrictive for you, feel free to open a ticket with support. These limits can be increased upon justified request.
For more information about this workspace and recommended commands for managing your files, see this page.
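A typical staging pattern through the per-job scratch space can be sketched as follows. This is a minimal sketch: the file names are illustrative, and the mktemp fallback only makes the sketch runnable outside a job, where Slurm has not set LOCAL_WORK_DIR.

```shell
# Illustrative staging through the per-job scratch space. On Boréale,
# LOCAL_WORK_DIR is set by Slurm for each job; the mktemp fallback only
# lets this sketch run outside a job. File names are examples.
WORKDIR=${LOCAL_WORK_DIR:-$(mktemp -d)}
SUBMIT_DIR=${SLURM_SUBMIT_DIR:-$PWD}

echo "input data" > "$SUBMIT_DIR/input.dat"   # stand-in for real input files
cp "$SUBMIT_DIR/input.dat" "$WORKDIR/"        # stage inputs into scratch
cd "$WORKDIR"
# ... run the solver here, e.g.: mpirun ./my_code.exe ...
echo "result" > output.dat                    # stand-in for computed results
cp output.dat "$SUBMIT_DIR/"                  # copy results back before job end
```

Staging this way keeps heavy I/O off the home space and inside the scratch area that the batch system manages for the job.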
If your structure has a firewall limiting the outgoing flows from your site, here are the ports to open:
- SSH connections (port 22) to the 2 front-end nodes behind the name boreale.criann.fr
- IPv4 :
- IPv4 :
Concerning the remote viewing sessions, to know the IPs and ports to open, please contact support by email at firstname.lastname@example.org
A hardware description of the Boréale cluster is available.
This machine targets vectorized applications, ideally with a memory-bound profile: the memory bandwidth of an NEC SX-Aurora TSUBASA 20B card (or Vector Engine, VE) is high at 1.53 TB/s, while its double-precision peak performance of 2.45 TFlop/s is moderate compared to other types of coprocessors. For comparison, the CPU power of a Boréale compute node (or Vector Host, VH), which hosts 8 VE 20B cards, is 2.97 TFlop/s.
Since a VE has 8 cores, application parallelization via OpenMP and/or MPI is required to use this processor fully. Each core has 64 long vector registers, each holding 256 double-precision elements (16384 bits). Vectorization of the code's inner loops is therefore crucial.
Three modes of use of this machine are possible:
- Use of the VEs in native mode or automatic offloading
- The programming relies only on standard languages (FORTRAN, C, C++), OpenMP and/or MPI. The application is compiled by the NEC compiler and the system automatically and completely offloads the execution to the VEs
- Use of VE and VH in hybrid mode
- If parts of the algorithm or I/O calls are not vectorizable, the native mode may not deliver good performance. The NEC programming environment allows hybrid programming for CPU and VE with several APIs: main program on VE and kernel offload on VH (reverse offload mode, VHcall API), or main program on VH and kernel offload on VE (accelerator mode: VEO, AVEO and VEDA APIs). Refer to the training document for getting started with Boréale, and the following pages:
- Use of VH
- Running CPU versions of a code is naturally useful for comparing numerical and performance results with the VE versions under development
Application and library environments are accessible through modules (see the module avail and module load commands).
For libraries, loading a module activates the environment variables that can be included in a Makefile: see the [modules](modules.en.md) documentation page.
Libraries or software compiled for the vector architecture have their module in the /soft/Modules/Modules-boreale/vecto directory of the module folder (shown by module avail). The rest allow the use of host CPUs, for hybrid VH-VE or pure CPU computation modes.
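A typical module session might look like this (the compilers/nec/3.5.1 module name is the one cited in this guide; the subcommands are standard Environment Modules usage):

```shell
# Illustrative module workflow on the front-end nodes
module avail                        # list the available modules
module load compilers/nec/3.5.1     # load the NEC compiler environment
module list                         # show the currently loaded modules
module purge                        # unload everything to start clean
```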
- Quantum Espresso 6.4 for VE, with and without ELPA (Eigenvalue soLvers for Petaflop Application)
Data format libraries
- HDF5 1.10.5, sequential and parallel, for VE and VH (CPU version compiled by Gnu and NEC MPI)
- NETCDF C 4.7.4 and NETCDF FORTRAN 4.5.3, sequential and parallel, for VE and VH (CPU version compiled by Gnu and NEC MPI)
NEC Numeric Library Collection (NLC)
NEC's SDK for the VE architecture includes the NLC suite, which provides optimized versions of BLAS, LAPACK, ScaLAPACK and FFTW (sequential, OpenMP or MPI), among others:
For the CPU architecture (x86), the Intel compiler and MPI library environment (OneAPI) and the Gnu compiler version 12.2.0 are available (see module avail).
For the NEC vector architecture, a compilers/nec/3.5.1 module is available for the compiler (commands nfort / ncc / nc++ for FORTRAN / C / C++).
For MPI, the following three modules allow compilation (wrappers mpinfort / mpincc / mpinc++ for FORTRAN / C / C++) and execution (mpirun in Slurm submission scripts), respectively for hybrid (VE-VH) or VH mode with Gnu's compiler, hybrid or VH mode with Intel's compiler, and native (VE) mode:
For MPI code parts compiled for VH, the -vh compilation option is required with NEC MPI.
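As a sketch, compiling the same MPI source for VE and for VH could look like this (the mpinfort wrapper and the file names are assumptions; adapt them to the module actually loaded):

```shell
# Sketch: building an MPI code with NEC MPI (names are illustrative)
mpinfort -o solver_ve solver.f90        # native VE executable
mpinfort -vh -o solver_vh solver.f90    # VH (CPU) executable: -vh is required
```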
Sample of useful compiler options with the NEC compiler
- Third-level optimization, OpenMP directive interpretation and compiler diagnostics (vectorization report):
-O3 -fopenmp -report-all -fdiag-vector=3
- Add runtime profiling (ftrace):
-O3 -fopenmp -report-all -fdiag-vector=3 -ftrace
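Put together, a compile-and-profile session might be sketched as follows (the nfort command and file names are assumptions; the profile produced at run time is examined with NEC's ftrace tooling, see the SDK documentation):

```shell
# Sketch: compile with diagnostics and ftrace instrumentation, then run
nfort -O3 -fopenmp -report-all -fdiag-vector=3 -ftrace -o my_code my_code.f90
./my_code    # also writes a runtime profile alongside the normal results
```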
The Boréale Getting Started training document provides more information on the use of these compiler and profiler options.
Reference documentation on these topics can be found at https://sxauroratsubasa.sakura.ne.jp/Documentation#SDK
Submission environment (Slurm)¶
The /soft/boreale/slurm/criann_scripts_templates directory contains the generic submission scripts for sequential or OpenMP, MPI, and MPI+OpenMP codes in native (job*_VE.sl), hybrid (job_VE-VH.sl), hybrid using the VEDA API (job_*VEDA*), or purely CPU mode.

The #SBATCH --mem directive controls for a job:
- The resident memory (RSS) per VE: the limit requested by --mem applies to each VE of the job.
- The resident memory (RSS) per VH: the limit requested by --mem applies to the job's sequential process, sum of threads, or sum of MPI tasks on CPU in each compute node (Vector Host) of the job.
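An illustrative script header combining these directives might be (all values are examples, not recommendations):

```shell
#!/bin/bash
#SBATCH --partition compute      # submission class
#SBATCH --time 01:00:00          # time limit
#SBATCH --ntasks 16              # number of MPI tasks
#SBATCH --mem 40000              # MB; applies per VE / per VH as explained above
mpirun ./my_code.exe             # illustrative executable name
```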
This table provides commands useful for job submission.
|Action|Command|
|---|---|
|Characteristics of partitions (classes)|sinfo|
|Submit a job|sbatch script_submission.sl|
|Submit a job in hold|sbatch -H script_submission.sl|
|Release a job in hold|scontrol release job_id|
|List all jobs|squeue|
|List your own jobs|squeue -u login|
|Show job characteristics|scontrol show job_id|
|Show the estimated start time of a queued job|squeue --start --job job_id|
|Check the syntax and scheduling of a job without submitting it|sbatch --test-only script_submission.sl|
|Show the estimated start times of your own queued jobs|squeue -u login --start|
|Kill a job|scancel job_id|
The following utility variables (non-exhaustive list) can be used in the user commands (Shell) of a submission script.
|Variable|Description|
|---|---|
|$SLURM_JOB_ID|Job id (example: 64549)|
|$SLURM_JOB_NAME|Job name (as specified by the #SBATCH -J directive)|
|$SLURM_SUBMIT_DIR|Name of the initial directory (where the sbatch command was run)|
|$SLURM_NTASKS|Number of MPI processes of the job|
|$LOCAL_WORK_DIR|Name of the temporary scratch directory based on the job id: /dlocal/run/<jobid>|
This folder will be deleted 45 days after the end of the job.
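As a sketch, a submission script can log these variables at the start of the run (standard Slurm variable names; they are only defined inside a job):

```shell
# Print the job context at the start of a submission script
echo "Job id:            $SLURM_JOB_ID"
echo "Job name:          $SLURM_JOB_NAME"
echo "Submit directory:  $SLURM_SUBMIT_DIR"
echo "MPI tasks:         $SLURM_NTASKS"
echo "Scratch directory: $LOCAL_WORK_DIR"
```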
Partitions (submission classes)¶
The compute partition runs computation jobs.
The partition is specified in the submission script with the #SBATCH --partition directive (or on the command line: sbatch --partition compute script_submission.sl).
Limits per job: 9 nodes (288 cores and 243000 MB per node).
The startvisu command submits a visu partition job, requesting 64 GB of memory, 4 cores and a 4-hour time limit. The time request can be extended if needed with the --time option (example for 6 hours: startvisu --time 6:00:00). The time limit of the visu partition is 8 hours.
A specific documentation for visualization jobs is available.
Handling signals sent by Slurm¶
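As a general Slurm pattern (not Boréale-specific), a job can ask Slurm for an early-warning signal with a directive such as #SBATCH --signal=B:USR1@300 (SIGUSR1 sent to the batch shell 300 s before the time limit; the 300 s value is an example) and trap it in the script to checkpoint and copy results back before being killed. The kill line below only simulates Slurm's signal so the sketch can run anywhere:

```shell
# Sketch: trapping an early-warning signal in a batch script.
# In a real job one would add, e.g.:  #SBATCH --signal=B:USR1@300
cleanup_done=0
on_usr1() {
    # checkpoint and copy results back here before the job is killed
    cleanup_done=1
}
trap on_usr1 USR1
kill -USR1 $$                       # simulate Slurm sending the signal
echo "cleanup_done=$cleanup_done"   # the trap has run by this point
```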
Use email address email@example.com for user requests