Containers Guide

Containers are a new and generally exciting development for HPC workloads. Containers rely on existing kernel features to give users greater control over what applications can see and interact with at any given time. For HPC workloads, these features are usually restricted to the mount namespace. Slurm allows container developers to create SPANK plugins that can be called at various points of job execution to support containers. Slurm is generally agnostic to containers and can be made to start most, if not all, container types.
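
Because containers are built on kernel namespaces, one quick way to check what a process "sees" is to compare namespace identifiers. The following is a minimal sketch, assuming a Linux system that exposes /proc/<pid>/ns/. The helper name namespace_id is hypothetical and is not part of Slurm or of any container runtime.

```python
import os


def namespace_id(kind="mnt", pid="self"):
    """Return the kernel identifier of a process namespace, or None.

    On Linux, /proc/<pid>/ns/<kind> is a symbolic link whose target looks
    like "mnt:[4026531840]". Two processes that share a namespace report
    the same identifier.
    """
    path = f"/proc/{pid}/ns/{kind}"
    try:
        return os.readlink(path)
    except OSError:
        # Not Linux, /proc not mounted, unknown namespace kind, or the
        # process is not accessible to the caller.
        return None


if __name__ == "__main__":
    # Run once on the host and once inside a container: differing values
    # for "mnt" indicate that the container has its own mount namespace.
    for kind in ("mnt", "user", "pid"):
        print(kind, namespace_id(kind))
```

Running the script from a plain job step and again from inside a containerized step gives a rough picture of which namespaces a given container system actually changes.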

Several container varieties are described below. Please note this list is not exhaustive, as new container types are being created all the time.


Container Types

Charliecloud

Charliecloud is a user namespace container system sponsored by LANL to provide HPC containers. Charliecloud supports the following:

Docker (running as root)

Docker currently has multiple design points that make it unfriendly to HPC systems. The issue that usually stops most sites from using Docker is the requirement that "only trusted users should be allowed to control your Docker daemon" [Docker Security], which is not acceptable on most HPC systems.

Sites with trusted users can add them to the docker Unix group and allow them to control Docker directly from inside of jobs. There is currently no direct support for starting or stopping Docker containers in Slurm.

UDOCKER

UDOCKER is a clone of a subset of Docker features, designed to allow execution of docker commands without increased user privileges.

Rootless Docker

Rootless Docker (>=v20.10) requires no extra permissions for users and currently (as of January 2021) has no known security issues with users gaining privileges. Each user will need to run an instance of the dockerd server on each node of the job in order to use Docker. There are currently no helper scripts or plugins for Slurm to automate the startup or teardown of the Docker daemons.
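
As a rough illustration of the per-node daemon requirement, the sketch below builds an srun command line that would launch one rootless daemon task on every node of the current allocation. It assumes that Docker's dockerd-rootless.sh launcher is on the user's PATH on the compute nodes and that the script runs inside a job where SLURM_JOB_NUM_NODES is set. The function name is hypothetical, and this is not a helper shipped with Slurm or Docker.

```python
import os
import shlex


def rootless_dockerd_cmd(num_nodes, launcher="dockerd-rootless.sh"):
    """Build an srun argv that starts one daemon task per allocated node.

    `launcher` is assumed to be Docker's rootless launcher script.
    """
    if num_nodes < 1:
        raise ValueError("num_nodes must be at least 1")
    return [
        "srun",
        f"--nodes={num_nodes}",
        "--ntasks-per-node=1",
        launcher,
    ]


if __name__ == "__main__":
    # SLURM_JOB_NUM_NODES is set by Slurm inside a job allocation;
    # fall back to a single node when run elsewhere.
    nodes = int(os.environ.get("SLURM_JOB_NUM_NODES", "1"))
    # Only print the command here; a real job script would run it in the
    # background and stop it once the container work is finished.
    print(shlex.join(rootless_dockerd_cmd(nodes)))
```

The daemons run in the foreground, so a job script using this approach would have to start the step in the background and handle the teardown itself.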

Kubernetes Pods (k8s)

Kubernetes is a container orchestration system that uses Pods, which are generally a logical grouping of containers for a single purpose.

There is currently no support for Kubernetes Pods in Slurm. Users wishing to run OCI images contained in Pods via Slurm might consider one of the following instead:

Kubernetes requires root privileges, but users could consider using rootless Kubernetes inside of jobs:

Shifter

Shifter is a container project out of NERSC to provide HPC containers with full scheduler integration.

Singularity

Singularity is a hybrid container system that supports:

  • Slurm integration (for singularity v2.x) via a plugin. A full description of the plugin was provided in the SLUG17 Singularity Presentation.
  • User namespace containers via sandbox mode that require no additional permissions.
  • Users directly calling singularity via setuid executable outside of Slurm.

ENROOT

Enroot is a user namespace container system sponsored by NVIDIA that supports:

  • Slurm integration via pyxis
  • Native support for NVIDIA GPUs
  • Faster Docker image imports
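
With pyxis installed, a containerized job step is requested through extra srun options. The following is a minimal sketch, assuming the pyxis --container-image and --container-mounts options are available on the cluster. The helper name, the image name, and the mount paths are placeholders chosen for illustration.

```python
import shlex


def pyxis_srun_cmd(image, command, mounts=None):
    """Build an srun argv that runs `command` inside `image` via pyxis.

    `mounts` is an optional list of (source, destination) path pairs.
    """
    argv = ["srun", f"--container-image={image}"]
    if mounts:
        pairs = ",".join(f"{src}:{dst}" for src, dst in mounts)
        argv.append(f"--container-mounts={pairs}")
    return argv + list(command)


if __name__ == "__main__":
    cmd = pyxis_srun_cmd(
        "ubuntu:22.04",                     # placeholder image name
        ["cat", "/etc/os-release"],
        mounts=[("/scratch", "/scratch")],  # placeholder paths
    )
    # Print the command rather than running it, so the sketch is safe to
    # try on a machine without Slurm or pyxis.
    print(shlex.join(cmd))
```

Building the argument list separately from running it keeps the sketch easy to inspect; on a cluster the list could be passed to subprocess.run instead of being printed.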

Podman

Podman is a user namespace container system sponsored by Red Hat/IBM that supports:

  • Drop-in replacement for Docker
  • Direct invocation by users (currently lacks direct Slurm support)
  • Rootless image building via buildah
  • Native OCI Image support

Sarus

Sarus is a privileged container system sponsored by ETH Zurich CSCS that supports:

Overview slides of Sarus are here.


Last modified 21 January 2021