NeuLAND program with HPC clusters

HPC clusters consist of an enormous number of computation cores (CPUs) that can be utilized as a whole when executing computationally heavy programs. Put simply, an HPC cluster can be thought of as a large number of "computers" connected to each other and sharing the computation payload through those connections. Each "computer" is called a node, and each node possesses a certain number of processors.
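
On a Slurm-managed cluster (such as GSI's, described below), this node and core layout can be inspected with sinfo. For example, the following prints the partition name, the number of nodes, and the number of CPUs per node:

sinfo -o "%P %D %c"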

Simulations related to the NeuLAND detector usually take a significant amount of time due to the heavy computation of particle interactions in the Geant4 simulation framework. They can take days or even weeks if more than a million events need to be simulated. The way to reduce the computation time is to run the simulation on an HPC cluster such that each core runs only a fraction of the total events, independently and simultaneously. For example, if the simulation is run with 20 nodes and each node uses 50 cores, the real simulation time would be roughly 1000 ( = 20 x 50 ) times shorter.
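
To make the arithmetic concrete, the following shell sketch computes how the events would be split in the example above (all numbers are illustrative):

#!/bin/bash
nodes=20
cores_per_node=50
total_events=1000000
total_cores=$((nodes * cores_per_node))          # 1000 cores in total
events_per_core=$((total_events / total_cores))  # 1000 events per core
echo "${total_cores} cores, ${events_per_core} events per core"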

How to run

Here are the steps to run the NeuLAND CLI application on HPC clusters:

  1. Login to a submit node:

    ssh username@virgo.hpc.gsi.de

    See this subsection below about how to get access to the HPC submit node.

  2. Download the NeuLAND Apptainer image to any folder under /lustre:

    apptainer pull -F neuland library://yanzhaow/r3bdev/neuland:latest

    This could take a few minutes on a slow internet connection. See this section below for more details about Apptainer images.

  3. Create a submit script: The file should contain the following content:

    #!/bin/bash
    #SBATCH --nodes=[number of nodes]
    #SBATCH --ntasks-per-node=[number of cores per node]
    #SBATCH --account=[slurm account name]
    #SBATCH --job-name=[job name]
    #SBATCH --output=[STDOUT output text file name]
    #SBATCH --chdir=[path to working directory]
    srun [path to image dir]/neuland sim -c neuland_sim_config.json

    The location of the submit script, as well as all the paths specified inside the script, must be under the /lustre folder. See this section below for an explanation of each option and a filled-in example script.

  4. Submit the task:

    sbatch -p [partition] submit_script.txt

    The [partition] option can be either debug, main, long or new.

  5. Check the status of the running task:

    squeue --me
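
Putting the steps together, a minimal end-to-end session could look like the following (the /lustre path is only a placeholder):

ssh username@virgo.hpc.gsi.de
cd /lustre/land/username
apptainer pull -F neuland library://yanzhaow/r3bdev/neuland:latest
# create submit_script.txt as shown in step 3, then submit it:
sbatch -p main submit_script.txt
squeue --me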

NeuLAND program as an Apptainer image

Attention
Using the NeuLAND Apptainer image requires the apptainer software to be installed on the server. If it is not, please contact the IT department and ask them to install it.

An Apptainer image can be thought of as a bundle that contains the program and everything the program needs, such as the operating system, compilers and third-party libraries. The operating system used in the NeuLAND Apptainer image is Fedora 41, with gcc 14 as the main C++ compiler. Here is the version information of the contained compilers and libraries:

  • gcc: 14.2
  • FairSoft: jan24p4
  • FairRoot: dev branch
  • ucesb: dev branch

The dev branches in the list above contain the latest commits up to the time when the container was built.
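
To verify the toolchain shipped inside a downloaded image, one can, for example, query the compiler version (assuming the image file is named neuland, as in the pull command used throughout this page):

apptainer exec ./neuland gcc --version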

How the image is built

The build process of the NeuLAND Apptainer image can be summarized in the following steps:

  1. Build the docker image yanzhaowang/r3bdev:fedora41, containing the compiler and FairSoft, using the Fedora 41 base image. The build script (i.e. Dockerfile) can be found here.
  2. Build the docker image yanzhaowang/r3bdev:r3broot, which contains the dev version of FairRoot and the edwin_dev version of R3BRoot, using the previous r3bdev:fedora41 as the base image. Its build script can be found on this webpage.
  3. Build the Apptainer image yanzhaow/r3bdev/neuland:latest, which specifies the execution script of the image, using the previous docker image r3bdev:r3broot as the base image. The Apptainer build script can be found in the file neuland.def.

Steps 2 and 3 are done automatically by this CI/CD workflow whenever a new commit is pushed to the edwin_dev branch. Both docker images, r3bdev:fedora41 and r3bdev:r3broot, can be found in this Docker Hub repo and the Apptainer image can be found in this Sylabs repo.
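
For illustration, an Apptainer definition file that builds on a Docker base image and sets an execution script typically has the following shape. This is only a sketch; the actual neuland.def may differ, and the executable name inside the container is an assumption here:

Bootstrap: docker
From: yanzhaowang/r3bdev:r3broot

%runscript
    # forward all command line arguments to the NeuLAND executable
    # (executable name assumed for illustration)
    exec neuland "$@"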

Testing the validity of the image

To test whether the image still works, first download it if you have not already done so:

apptainer pull -F neuland library://yanzhaow/r3bdev/neuland:latest

then run a simple simulation like:

./neuland sim

HPC submit node and Slurm

HPC clusters have some special nodes that are only used to submit tasks from users. These nodes are called "submit nodes". Please visit this website to check all the available nodes at GSI. All actions, such as submitting a task and querying its status, are done with a software tool called Slurm. The available Slurm commands can be found in the official Slurm documentation.
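
The most commonly used Slurm commands are:

sbatch submit_script.txt   # submit a job script
squeue --me                # list your pending and running jobs
scancel [jobid]            # cancel a job
sinfo                      # show partitions and node states
sacct -j [jobid]           # show accounting information for a job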

Registration

To get access to the submit node, users have to complete the registration using this link (a GSI web account is required). During the registration, please provide the following information:

  • Linux group: land
  • Collaboration/Experiment/Department: r3b
  • Slurm account name: r3b
  • Slurm account coordinator: Spokesperson's name

Login to a submit node

Note
All available HPC submit nodes, such as virgo.hpc.gsi.de, are behind the GSI network firewall and can only be accessed via other servers in the GSI network acting as jump servers (e.g. lx-pool.gsi.de).

To log in via a jump server:

ssh -J username@lx-pool.gsi.de username@virgo.hpc.gsi.de

A simpler way to log in to a submit node is to add the following configuration to the ~/.ssh/config file (please create it if it does not exist):

Host gsigate
    HostName lx-pool.gsi.de
    User username

Host gsihpc
    HostName virgo.hpc.gsi.de
    User username
    ProxyJump gsigate
    ForwardAgent no

Then, the login can simply be done with:

ssh gsihpc
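
With this configuration in place, files can also be copied to the cluster through the jump host in a single step. For example (the destination path is only a placeholder):

scp neuland_sim_config.json gsihpc:/lustre/land/username/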

Submit script

The submit script specifies the configuration needed to run a task, such as the number of nodes and the account information, as well as the execution command that launches the program. Each configuration entry must start with #SBATCH, followed by an option and its value:

#!/bin/bash
#SBATCH --nodes=[number]
#SBATCH --ntasks-per-node=[number]
#SBATCH --account=[string]
#SBATCH --job-name=[string]
#SBATCH --output=[string]
#SBATCH --chdir=[string]

The meanings of these options are:

  • --nodes: The number of nodes for the task.
  • --ntasks-per-node: The number of cores used in each node.
  • --account: The Slurm account used in the registration.
  • --job-name: The name of your task (job).
  • --output: The name of the output file, which contains all STDOUT prints from the program.
  • --chdir: The path to the working directory when the task is run on the HPC cluster.

All other Slurm options can be found on its official documentation website.

After specifying the options, the user needs to add the execution command that runs the program:

srun [path to image dir]/neuland sim -c neuland_sim_config.json

Again, all the files and folders mentioned above must be under /lustre, as it is the only file partition that is mounted to the GSI cluster nodes.

Important
To run the NeuLAND program simultaneously and independently on each core, enable-mpi in the general JSON configuration must be set to true.
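
For illustration, a fully filled-in submit script could look like the following. The account name comes from the registration section above; all paths, names and numbers are placeholders to adapt:

#!/bin/bash
#SBATCH --nodes=20
#SBATCH --ntasks-per-node=50
#SBATCH --account=r3b
#SBATCH --job-name=neuland_sim
#SBATCH --output=neuland_sim.out
#SBATCH --chdir=/lustre/land/username/sim
srun /lustre/land/username/neuland sim -c neuland_sim_config.json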

Job status

The command

squeue --me

returns information about the running jobs that belong to you. The ST and NODELIST(REASON) columns indicate the status of each job and the "reason" why the job is in that status.
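
A typical squeue printout has the following shape; the job row below is purely hypothetical:

JOBID  PARTITION  NAME         USER      ST  TIME  NODES  NODELIST(REASON)
12345  main       neuland_sim  username  PD  0:00  20     (Priority)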

The status of a job can be one of:

Status      Code  Explanation
COMPLETED   CD    The job has completed successfully.
COMPLETING  CG    The job is finishing but some processes are still active.
FAILED      F     The job terminated with a non-zero exit code and failed to execute.
PENDING     PD    The job is waiting for resource allocation. It will eventually run.
PREEMPTED   PR    The job was terminated because of preemption by another job.
RUNNING     R     The job currently is allocated to a node and is running.
SUSPENDED   S     A running job has been stopped with its cores released to other jobs.
STOPPED     ST    A running job has been stopped with its cores retained.

The job reason code can be one of:

Reason code              Explanation
Priority                 One or more higher-priority jobs are in the queue. Your job will eventually run.
Dependency               This job is waiting for a dependent job to complete and will run afterward.
Resources                The job is waiting for resources to become available and will eventually run.
InvalidAccount           The job's account is invalid. Cancel the job and rerun with the correct account.
InvalidQoS               The job's QoS is invalid. Cancel the job and rerun with the correct QoS.
QOSGrpCpuLimit           All CPUs assigned to your job's specified QoS are in use; the job will run eventually.
QOSGrpMaxJobsLimit       The maximum number of jobs for your job's QoS has been met; the job will run eventually.
QOSGrpNodeLimit          All nodes assigned to your job's specified QoS are in use; the job will run eventually.
PartitionCpuLimit        All CPUs assigned to your job's specified partition are in use; the job will run eventually.
PartitionMaxJobsLimit    The maximum number of jobs for your job's partition has been met; the job will run eventually.
PartitionNodeLimit       All nodes assigned to your job's specified partition are in use; the job will run eventually.
AssociationCpuLimit      All CPUs assigned to your job's specified association are in use; the job will run eventually.
AssociationMaxJobsLimit  The maximum number of jobs for your job's association has been met; the job will run eventually.
AssociationNodeLimit     All nodes assigned to your job's specified association are in use; the job will run eventually.

Note: The tables above are copied from this website.
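
For reason codes such as InvalidAccount or InvalidQoS, the job has to be cancelled before it is resubmitted. This is done with scancel, using the job ID shown by squeue:

scancel [jobid]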

Example

To be added ...