Singularity
Last Updated: December 6, 2024
Overview
Singularity is provided on Genkai as a container environment.
Docker itself is not available, but Docker containers can be used through Singularity.
Containers can also be used on login nodes; however, because memory usage of processes on login nodes is limited, large containers may fail to run due to insufficient memory. In such cases, please use compute nodes.
For details on Singularity usage, and for functions and options not described on this page, please refer to the Singularity manual or the singularity help command.
Preparation for use
To use Singularity, you must load its module first.
[ku40000105@genkai0001 ~]$ module load singularity-ce/4.1.3
[ku40000105@genkai0001 ~]$ module list
Currently Loaded Modulefiles:
1) singularity-ce/4.1.3(default)
[ku40000105@genkai0001 ~]$ singularity --version
singularity-ce version 4.1.3
[ku40000105@genkai0001 ~]$
Obtaining and building container images
You can obtain and build container images with the singularity pull or singularity build command.
To fetch and use a Docker container from Docker Hub, specify the image path with the docker:// prefix, as follows.
[ku40000105@genkai0001 ~]$ singularity build ubuntu_23.10.sif docker://ubuntu:23.10
INFO: Starting build...
INFO: Fetching OCI image...
26.0MiB / 26.0MiB [==========================================================] 100 % 2.2 MiB/s 0s
INFO: Extracting OCI image...
INFO: Inserting Singularity configuration...
INFO: Creating SIF file...
INFO: Build complete: ubuntu_23.10.sif
[ku40000105@genkai0001 ~]$
(You can obtain the same image with singularity pull ubuntu_23.10.sif docker://ubuntu:23.10.)
Pre-built GPU-related containers can be obtained from the NVIDIA NGC Catalog.
Note that some containers are large and may take a long time to fetch and build.
(The pull in the example below took about 17 minutes on a login node; the steps after
"Extracting OCI image" appeared to take longer than downloading the data.)
[ku40000105@genkai0001 ~]$ singularity pull docker://nvcr.io/nvidia/nvhpc:24.5-devel-cuda_multi-ubuntu22.04
INFO: Converting OCI blobs to SIF format
INFO: Starting build...
INFO: Fetching OCI image...
2.4MiB / 2.4MiB [==============================================================] 100 % 4.5 MiB/s 0s
139.2MiB / 139.2MiB [==========================================================] 100 % 4.5 MiB/s 0s
28.2MiB / 28.2MiB [============================================================] 100 % 4.5 MiB/s 0s
6.4GiB / 6.4GiB [==============================================================] 100 % 4.5 MiB/s 0s
1.9MiB / 1.9MiB [==============================================================] 100 % 4.5 MiB/s 0s
3.9GiB / 3.9GiB [==============================================================] 100 % 4.5 MiB/s 0s
INFO: Extracting OCI image...
INFO: Inserting Singularity configuration...
INFO: Creating SIF file...
[ku40000105@genkai0001 ~]$
Using container images
To use an obtained container image (that is, to execute a program inside the container), use the singularity exec or singularity shell command.
Executing commands on a container
The singularity exec command executes a specified command inside the container. In the following example, the OS information displayed is that of the container, not that of the OS installed on Genkai.
[ku40000105@genkai0001 ~]$ singularity exec ubuntu_23.10.sif cat /etc/os-release
PRETTY_NAME="Ubuntu 23.10"
NAME="Ubuntu"
VERSION_ID="23.10"
VERSION="23.10 (Mantic Minotaur)"
VERSION_CODENAME=mantic
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=mantic
LOGO=ubuntu-logo
[ku40000105@genkai0001 ~]$
Running commands interactively on containers
The singularity shell command lets you run commands interactively inside the container.
[ku40000105@genkai0001 ~]$ singularity shell ubuntu_23.10.sif
Singularity> cat /etc/os-release
PRETTY_NAME="Ubuntu 23.10"
NAME="Ubuntu"
VERSION_ID="23.10"
VERSION="23.10 (Mantic Minotaur)"
VERSION_CODENAME=mantic
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=mantic
LOGO=ubuntu-logo
Singularity> exit
exit
[ku40000105@genkai0001 ~]$
Using containers when executing batch jobs
To use containers in batch jobs, add the -L jobenv=singularity option when submitting the job.
Example of batch job
#!/bin/sh
module load singularity-ce
singularity exec container.sif command
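The jobenv option can also be written in the job script itself as a #PJM directive rather than on the pjsub command line. The following is a minimal sketch; the resource group name b-batch is a placeholder, so substitute the resource group and resource values appropriate for your project.
#!/bin/sh
#PJM -L rscgrp=b-batch
#PJM -L elapse=1:00:00
#PJM -L jobenv=singularity
# Load the Singularity module and run the command inside the container
module load singularity-ce
singularity exec container.sif command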
Note that if you execute the singularity shell command in a batch job, the job will sit waiting for command input.
When executing an interactive job, the same option must be added to the pjsub command.
[ku40000105@genkai0001 ~]$ pjsub --interact -L rscgrp=b-inter,elapse=1:00:00,gpu=1,jobenv=singularity
[INFO] PJM 0000 pjsub Job 14528 submitted.
[INFO] PJM 0081 .connected.
[INFO] PJM 0082 pjsub Interactive job 14528 started.
$ module load singularity-ce
$ singularity shell ubuntu_23.10.sif
Singularity> cat /etc/debian_version
trixie/sid
Singularity> exit
$ singularity exec ubuntu_23.10.sif cat /etc/os-release
PRETTY_NAME="Ubuntu 23.10"
NAME="Ubuntu"
VERSION_ID="23.10"
VERSION="23.10 (Mantic Minotaur)"
VERSION_CODENAME=mantic
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=mantic
LOGO=ubuntu-logo
$ exit
[INFO] PJM 0083 pjsub Interactive job 14528 completed.
[ku40000105@genkai0001 ~]$
Note that if you forget to add the option to the pjsub command, an error occurs when you try to run the container.
The following is an example with the option omitted.
[ku40000105@genkai0001 ~]$ pjsub --interact -L rscgrp=b-inter,elapse=1:00:00,gpu=1
[INFO] PJM 0000 pjsub Job 14501 submitted.
[INFO] PJM 0081 .connected.
[INFO] PJM 0082 pjsub Interactive job 14501 started.
$ module load singularity-ce
$ singularity shell ubuntu_23.10.sif
ERROR : Failed to create mount namespace: mount namespace requires privileges, check Singularity installation
$ singularity exec ubuntu_23.10.sif cat /etc/os-release
ERROR : Failed to create mount namespace: mount namespace requires privileges, check Singularity installation
$ exit
exit
[INFO] PJM 0083 pjsub Interactive job 14501 completed.
[ku40000105@genkai0001 ~]$
Using GPUs from containers
To use GPUs from a container in node groups B or C, add the --nv option when you run singularity exec or singularity shell.
If the --nv option is not specified, the container cannot recognize the GPU.
[ku40000105@genkai0001 ~]$ pjsub --interact -L rscgrp=b-inter,elapse=1:00,gpu=1,jobenv=singularity
[INFO] PJM 0000 pjsub Job 14529 submitted.
[INFO] PJM 0081 .connected.
[INFO] PJM 0082 pjsub Interactive job 14529 started.
$ module load singularity-ce
$ singularity shell --nv nvhpc_24.5-devel-cuda_multi-ubuntu22.04.sif
Singularity> nvidia-smi
Tue Jun 18 16:17:41 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.154.05 Driver Version: 535.154.05 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA H100 On | 00000000:1C:00.0 Off | 0 |
| N/A 20C P0 68W / 700W | 0MiB / 95830MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| No running processes found |
+---------------------------------------------------------------------------------------+
Singularity> nvidia-smi -L
GPU 0: NVIDIA H100 (UUID: GPU-03fb7359-eec6-36a3-9746-0bc092721223)
Singularity> pgaccelinfo
CUDA Driver Version: 12020
NVRM version: NVIDIA UNIX x86_64 Kernel Module 535.154.05 Thu Dec 28 15:37:48 UTC 2023
Device Number: 0
Device Name: NVIDIA H100
Device Revision Number: 9.0
Global Memory Size: 99875094528
Number of Multiprocessors: 132
Concurrent Copy and Execution: Yes
Total Constant Memory: 65536
Total Shared Memory per Block: 49152
Registers per Block: 65536
Warp Size: 32
Maximum Threads per Block: 1024
Maximum Block Dimensions: 1024, 1024, 64
Maximum Grid Dimensions: 2147483647 x 65535 x 65535
Maximum Memory Pitch: 2147483647B
Texture Alignment: 512B
Clock Rate: 1980 MHz
Execution Timeout: No
Integrated Device: No
Can Map Host Memory: Yes
Compute Mode: default
Concurrent Kernels: Yes
ECC Enabled: Yes
Memory Clock Rate: 1593 MHz
Memory Bus Width: 6144 bits
L2 Cache Size: 62914560 bytes
Max Threads Per SMP: 2048
Async Engines: 5
Unified Addressing: Yes
Managed Memory: Yes
Concurrent Managed Memory: Yes
Preemption Supported: Yes
Cooperative Launch: Yes
Cluster Launch: Yes
Unified Function Pointers: Yes
Default Target: cc90
Singularity> exit
$ singularity shell nvhpc_24.5-devel-cuda_multi-ubuntu22.04.sif
Singularity> nvidia-smi
bash: nvidia-smi: command not found
Singularity> nvidia-smi -L
bash: nvidia-smi: command not found
Singularity> pgaccelinfo
No accelerators found.
Try pgaccelinfo -v for more information
Singularity> pgaccelinfo -v
libcuda.so not found
No accelerators found.
Check that you have installed the CUDA driver properly
Check that your LD_LIBRARY_PATH environment variable points to the CUDA runtime installation directory
Singularity> exit
$
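The --nv option is also needed in batch jobs. Below is a minimal sketch of a GPU batch job script, again assuming the hypothetical resource group name b-batch:
#!/bin/sh
#PJM -L rscgrp=b-batch
#PJM -L elapse=1:00:00
#PJM -L gpu=1
#PJM -L jobenv=singularity
module load singularity-ce
# --nv exposes the host GPU driver and devices to the container
singularity exec --nv nvhpc_24.5-devel-cuda_multi-ubuntu22.04.sif nvidia-smi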
Build a container from a recipe file
You can build a customized container from a recipe file (the equivalent of a Dockerfile in Docker).
This can also be done on a login node.
Note that some containers require GPU access during the build (otherwise a GPU-enabled container environment cannot be built).
In such a case, please build the container on node group B or C.
# Example: build a container with the dnsutils package installed, so that the nslookup
# command, which is no longer installed by default in recent Ubuntu, can be used.
# Check the recipe file
[ku40000105@genkai0001 ~]$ cat ubuntu.def
Bootstrap: docker
From: ubuntu:24.04
%post
apt-get update -y
apt-get install -y dnsutils
[ku40000105@genkai0001 ~]$
# Build containers using recipe files
[ku40000105@genkai0001 ~]$ singularity build -f test.sif ./ubuntu.def
INFO: Starting build...
INFO: Fetching OCI image...
28.3MiB / 28.3MiB [====================================================] 100 % 8.2 MiB/s 0s
INFO: Extracting OCI image...
INFO: Inserting Singularity configuration...
INFO: Running post scriptlet
+ apt-get update -y
Get:1 http://archive.ubuntu.com/ubuntu noble InRelease [256 kB]
(snip)
Get:16 http://archive.ubuntu.com/ubuntu noble-backports/universe amd64 Packages [7519 B]
Fetched 22.9 MB in 5s (4567 kB/s)
Reading package lists... Done
+ apt-get install -y dnsutils
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
(snip)
0 upgraded, 19 newly installed, 0 to remove and 0 not upgraded.
Need to get 14.1 MB of archives.
After this operation, 46.3 MB of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu noble/main amd64 krb5-locales all 1.20.1-6ubuntu2 [13.8 kB]
(snip)
Processing triggers for libc-bin (2.39-0ubuntu8.2) ...
INFO: Creating SIF file...
INFO: Build complete: test.sif
[ku40000105@genkai0001 ~]$
# Check the container you built
[ku40000105@genkai0001 ~]$ singularity exec test.sif nslookup google.com
Server: 172.16.0.21
Address: 172.16.0.21#53
Non-authoritative answer:
Name: google.com
Address: 142.250.206.238
Name: google.com
Address: 2404:6800:400a:804::200e
[ku40000105@genkai0001 ~]$
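Recipe files can contain more than the %post section. The sketch below adds %environment (variables set each time the container runs) and %runscript (the command executed by singularity run); the variable and run command are illustrative, not required.
Bootstrap: docker
From: ubuntu:24.04

%post
    # Run once at build time inside the container
    apt-get update -y
    apt-get install -y dnsutils

%environment
    # Set each time the container starts
    export LC_ALL=C

%runscript
    # Executed by "singularity run <image> <args>"
    exec nslookup "$@"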
Rebuild container (edit container image)
Instead of using a recipe file, a container can be edited and rebuilt interactively. However, this operation is restricted by the file system and must be performed on the local SSD of a compute node. (In node groups B and C, the SSD can be accessed via ${PJM_SSD_DIR} once the job has started.)
The procedure is: build the container as an expanded (sandbox) directory, start the container with special options, make the desired changes, and then repackage the directory into a container file.
A concrete example is shown below.
# As an example of editing a container, we will add the nslookup program,
# which is not included in recent Ubuntu. First, confirm that the stock container lacks it.
[ku40000105@genkai0002 ~]$ singularity build ubuntu.sif docker://ubuntu:latest
[ku40000105@genkai0002 ~]$ singularity shell ./ubuntu.sif
Singularity> nslookup google.com
bash: nslookup: command not found
Singularity> exit
[ku40000105@genkai0002 ~]$
# Now edit and rebuild the container.
# Launch an interactive job with the jobenv=singularity option (using one MIG sub-GPU
# via the resource group with the lowest point consumption)
[ku40000105@genkai0002 ~]$ pjsub --interact -L rscgrp=b-inter-mig,elapse=1:00:00,gpu=1,jobenv=singularity
[INFO] PJM 0000 pjsub Job 116222 submitted.
[INFO] PJM 0081 .connected.
[INFO] PJM 0082 pjsub Interactive job 116222 started.
# change current directory to SSD
[ku40000105@b0037 ~]$ cd $PJM_SSD_DIR
# load singularity-ce module and build ubuntu container with -s option
# (directories are created instead of container files)
[ku40000105@b0037 116222]$ module load singularity-ce
[ku40000105@b0037 116222]$ singularity build -s ./ubuntu docker://ubuntu:latest
INFO: Starting build...
INFO: Fetching OCI image...
28.3MiB / 28.3MiB [===============================================================] 100 % 0.0 b/s 0s
INFO: Extracting OCI image...
INFO: Inserting Singularity configuration...
INFO: Creating sandbox directory...
INFO: Build complete: ./ubuntu
# Start the singularity shell with the -f (fakeroot) and -w (writable) options,
# so that operations inside the container update the sandbox files
[ku40000105@b0037 116222]$ singularity shell -f -w ./ubuntu
WARNING: Skipping mount /etc/localtime [binds]: /etc/localtime doesn't exist in container
WARNING: Skipping mount /etc/libibverbs.d [binds]: /etc/libibverbs.d doesn't exist in container
# Update package management information
Singularity> apt update -y
(snip)
# Install dnsutils including nslookup command
Singularity> apt install -y dnsutils
(snip)
Singularity> exit
# Rebuild the sandbox into a container file
[ku40000105@b0037 116222]$ singularity build -f ${HOME}/ubuntu_dnsutils.sif ./ubuntu
INFO: Starting build...
INFO: Creating SIF file...
INFO: Build complete: /home/pj24001603/ku40000105/ubuntu_dnsutils.sif
# Terminate the interactive job (all files created on the SSD will be lost)
[ku40000105@b0037 116222]$ exit
[INFO] PJM 0083 pjsub Interactive job 116222 completed.
[ku40000105@genkai0002 ~]$
# Make sure that the built container can run the nslookup program
[ku40000105@genkai0002 ~]$ singularity shell ./ubuntu_dnsutils.sif
Singularity> nslookup google.com
Server: 172.16.0.21
Address: 172.16.0.21#53
Non-authoritative answer:
Name: google.com
Address: 172.217.25.174
Name: google.com
Address: 2404:6800:400a:80a::200e
Singularity> exit
[ku40000105@genkai0002 ~]$
How to deal with errors when downloading containers
When downloading (pull or build) a large container, the download may abort partway through, producing an error message such as INTERNAL_ERROR; received from peer.
This is usually not a serious problem: running the same download command again continues the process.
The problem may also be avoided by setting the following environment variable; please try it.
export GODEBUG=http2client=0
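For example, set the variable in the same shell before retrying the failed download (shown here with the NGC image used earlier; any image works the same way):
# Fall back from Go's HTTP/2 client to HTTP/1.1, then retry the pull
export GODEBUG=http2client=0
singularity pull docker://nvcr.io/nvidia/nvhpc:24.5-devel-cuda_multi-ubuntu22.04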