Abstract
In this work, we introduce a framework for creating, testing, versioning and archiving portable applications for analyzing neuroimaging data organized and described in compliance with the Brain Imaging Data Structure (BIDS). The portability of these applications (BIDS Apps) is achieved by using container technologies that encapsulate all binary and other dependencies in one convenient package. BIDS Apps run on all three major operating systems with no need for complex setup and configuration and thanks to the richness of the BIDS standard they require little manual user input. Previous containerized data processing solutions were limited to single user environments and not compatible with most multi tenant High Performance Computing systems. BIDS Apps overcome this limitation by taking advantage of the Singularity container technology. As a proof of concept, this work is accompanied by 19 ready to use BIDS Apps, packaging a diverse set of commonly used neuroimaging algorithms.
Introduction
The last 25 years have witnessed a proliferation of methods for imaging the human brain (including structural, diffusion and functional Magnetic Resonance Imaging, Positron Emission Tomography, Electroencephalography and Magnetoencephalography). These methods have been accompanied by literally thousands of different algorithms for signal denoising, normalization, feature extraction, and statistical analysis. Modern analysis pipelines for neuroimaging often consist of dozens of steps and rely on software developed by multiple external groups with each group often developing their own idiosyncratic parameter settings even when using the same software packages. The increasing complexity of neuroimaging data analysis has led to many discoveries, and the flexibility provided by the plethora of feature extraction methods has allowed cognitive and clinical neuroscientists to develop new theories about the relationships between brain and behavior in healthy and diseased populations.
However, due to the intrinsic heterogeneity of scientific software (arising from the fact that it rarely is developed for widespread distribution), installing, configuring, and running many of the available methods is often difficult. Most neuroimaging software packages run natively on Linux (Hanke & Halchenko 2011) and (to a lesser extent) Mac OS X; however, Windows users are often left without that option. Operating system aside, many scientific packages depend on external libraries (often requiring a particular, sometimes outdated version), and require complex configurations of environment variables and/or data files.
There have been some attempts in the field of neuroinformatics to solve this issue. The most notable is the NeuroDebian project (Halchenko & Hanke 2012), which provides Debian and Ubuntu Linux distributions with packages containing many of the popular neuroimaging software tools. Installing packages prepared by the NeuroDebian team is very easy, as it can be performed with the built-in Debian/Ubuntu package management system. However, this solution only applies to Linux systems, as Mac OS X and Windows users are required to install Linux inside a Virtual Machine (VM). Additionally, creating new Debian/Ubuntu packages is a non-trivial task, which may limit the rate at which new software is added to NeuroDebian.
Another consequence of these deployment and installation issues is that they make it very difficult to perfectly recreate analysis pipelines, which exacerbates reproducibility issues in neuroimaging. This is partly due to journal space limitations, which typically preclude a full accounting of the scientific software stack used to generate results. But even knowing the versions of all of the software tools employed in an analysis is rarely sufficient to completely reproduce a workflow. Studies have shown that operating system type, version, and even hardware architecture can influence results in a significant manner (Mackenzie-Graham et al. 2008; Gronenschild et al. 2012; Glatard et al. 2015). Such issues are troubling in and of themselves, and call for thorough investigations into the sources of this variability. They also raise serious practical concerns for extended longitudinal studies that need to maintain the same software (and sometimes even hardware) stack along the time span of an experiment (e.g. a software upgrade midway through the analysis could lead to spurious differences between groups processed at different points in time).
One potential solution to ease the deployment problem proposed in the bioinformatics community is to create Virtual Machines (VMs) capturing all of the necessary dependencies for a workflow (Angiuoli et al. 2011; Krampis et al. 2012). Running a Virtual Machine does however come with significant performance overhead; furthermore, only a few High Performance Computing (HPC) systems (which have become the primary computational resource for many academics in the last decade) allow their users to run VMs on large clusters. Building on the virtual machine concept, a more lightweight solution has recently become more prevalent in the industry, known as ‘operating-system-level-virtualization’ or (more commonly) containers. In contrast to VMs, containers share the same kernel with the host operating system and thus deliver much better performance. The bioinformatics community has once again been at the forefront of adoption of this new technology, with the aim of improving research reproducibility (Folarin et al. 2015; Moreews et al. 2015; Devisetty et al. 2016; Belmann et al. 2015). Most proposed solutions have been based on a particular implementation of the container concept: ‘Docker’, which, due to its kernel and security requirements, is difficult or impossible to use in a multi-tenant environment such as an HPC system (which is often the most cost effective computational resource available to researchers).
Finally, most existing neuroimaging data processing pipelines expect input datasets to be organized and described in different and idiosyncratic ways. To account for variability in input data organization and a lack of consensus in terms of metadata description, data processing workflows often require users to input metadata manually - in a different fashion for each pipeline. This manual step can cause errors and lead to incorrect results, while also making it harder to integrate processing pipelines into automated analysis platforms such as those developed to provide “Science as a Service” (Jordan et al. 2011).
After careful evaluation of existing solutions for the specific problems faced by the neuroimaging community, we have developed a framework for sharing and executing neuroimaging analysis pipelines that improves ease of use, accessibility and reproducibility for users of all three major operating systems, as well as for researchers using HPC or cloud computing systems. The framework overcomes Docker’s inability to run on multi-tenant HPC systems by capitalizing on the Singularity (Kurtzer 2016) container technology developed specifically for HPC use. To minimize the number of manual inputs required from researchers--and hence reduce the number of errors arising from misinterpretation of those inputs--the framework capitalizes on the recently introduced Brain Imaging Data Structure (BIDS) standard (Gorgolewski et al. 2016) for organizing and describing datasets. Correspondingly, the proposed framework is named BIDS Apps. We describe here the anatomy of a BIDS App, the infrastructure used for building, testing and archiving BIDS Apps, as well as steps necessary to run BIDS Apps in various scenarios.
Results
What is a BIDS App?
A BIDS App is a container image¹ capturing a neuroimaging pipeline that takes a BIDS-formatted dataset as input. Each BIDS App has the same core set of command line arguments, making them easy to run and integrate into automated platforms. BIDS Apps are constructed in a way that does not depend on any software outside of the container image other than the container engine.
BIDS Apps rely upon two technologies for container computing:
Docker - for building, hosting as well as running containers on local hardware (running Windows, Mac OS X or Linux) or in the cloud.
Singularity - for running containers on HPCs (Kurtzer 2016).
BIDS Apps are deposited in the Docker Hub (http://hub.docker.com) repository, making them openly accessible. Each app is versioned and all of the historical versions are available to download. By reporting the BIDS App name and version in a manuscript, authors can provide others with the ability to exactly replicate their analysis workflow.
Docker is used for its excellent documentation, maturity, and the Docker Hub service for storage and distribution of the images. Docker containers are easily run on personal computers and cloud services. However, the Docker Engine was originally designed to run different components of web services (HTTP servers, databases etc.) using cloud resources. Docker thus requires root or root-like permissions, as well as modern versions of the Linux kernel (to perform user mapping and management of network resources). Though this is not a problem in the context of renting cloud resources (which are not shared with other users), it makes Docker difficult or impossible to use in a multi-tenant environment such as an HPC system, which is often the most cost effective computational resource available to researchers. Singularity, on the other hand, is a container technology designed from the ground up with the encapsulation of binary dependencies and HPC use in mind. Its main advantage over Docker is that it does not require root access for container execution and thus is safe to use on multi-tenant systems. In addition, it does not require recent Linux kernel functionalities (such as namespaces, cgroups and capabilities), making it easy to install on legacy systems.
How to create a BIDS App?
BIDS Apps Forge
Inspired by the conda-forge project (https://conda-forge.github.io/), BIDS Apps development is centered around a GitHub organization (http://github.com/BIDS-Apps) that maintains a code repository for each BIDS App (see Figure 1). Every repository hosts a Dockerfile describing how to build the container image, a lightweight wrapper for providing a unified command line interface as well as parsing the BIDS input, and brief documentation. BIDS Apps are an evolution of lightweight wrappers such as the Interface class developed within Nipype (Gorgolewski et al. 2011), with the added advantage of being programming language agnostic. Since BIDS Apps merely serve the purpose of capturing dependencies and providing a unified way of calling the relevant program (through the command-line interface), the repositories in the github.com/BIDS-Apps organization do not actually host the source code or data of the pipelines and workflows being wrapped: these are contained in the built Docker image stored on DockerHub. Building a BIDS App starts with creating a new repository and populating it with a Dockerfile and run script.
Dockerfile creation
A Dockerfile is a script written in a domain specific language that describes the steps necessary to build a Docker image. The Docker project provides excellent documentation and a tutorial on how to write Dockerfiles (https://docs.docker.com/engine/reference/builder/). However, because the container images built in the BIDS-Apps organization are ultimately intended to also run under Singularity, there are some additional requirements that must be followed:
Apps cannot rely on having elevated permissions inside the container image (in contrast to Docker, processes inside a Singularity container run with the privileges of the user running the container).
Environment variables must be set using the ENV statement within the relevant Dockerfile, rather than relying on config files (such as /root/.bashrc).
Apps should not write anywhere outside of /tmp, $HOME, and the specific output folder provided as a command-line argument.
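The write-location requirement above can be illustrated with a small check that a wrapper script might apply before creating a file. This is only a sketch: the function name `is_allowed_write_path` is hypothetical and not part of any BIDS Apps tooling, and it assumes that all paths are absolute POSIX paths as seen from inside the container.

```python
import posixpath


def is_allowed_write_path(target, output_dir, home):
    """Return True if `target` is inside /tmp, `home`, or `output_dir`.

    Hypothetical helper: all arguments are assumed to be absolute POSIX
    paths as seen inside the container.
    """
    # Normalise first so that ".." components cannot escape an allowed root.
    target = posixpath.normpath(target)
    for root in ("/tmp", home, output_dir):
        root = posixpath.normpath(root)
        # `target` is inside `root` when their common path is `root` itself.
        if posixpath.commonpath([target, root]) == root:
            return True
    return False
```

For instance, with `/outputs` as the output folder, a path such as `/outputs/sub-01/stats.tsv` would be accepted, whereas anything under the read-only input dataset would be rejected.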
To facilitate the process of writing Dockerfiles, we have created a set of templates that include installation steps for the most popular neuroimaging tools (FSL, FreeSurfer, AFNI, ANTs etc.): https://github.com/BIDS-Apps/dockerfile-templates. Those templates are also available as container images that can be used directly as a base for new BIDS Apps Dockerfiles using the FROM statement.
Command-line interface
To improve user experience and ability to integrate BIDS Apps into various computational platforms, each App follows a set of core command-line arguments:
runscript input_dataset output_folder analysis_level
For example:
runscript /data/ds114 /scratch/outputs participant
input_dataset provides a path to the dataset to be analyzed (read-only), which must conform to the BIDS standard
output_folder is the folder where results of the analysis will be stored
analysis_level denotes the stage of the analysis that will be performed
To facilitate easy and efficient execution, analyses in BIDS Apps can be split into stages (see Materials and Methods). In the simplest design, an App would run in two stages: ‘participant’ and ‘group’. The ‘participant’ stage runs first level analysis that can be performed independently for each subject in the dataset (and thus can be executed in parallel). Analysis may optionally be restricted to a subset of participants using the --participant_label argument. The group level analysis runs on the outputs from the participant level analysis and cannot be split into independent parallel jobs. This scheme is inspired by the MapReduce programming model (Dean & Ghemawat 2008). Multiple such MapReduce steps can be defined for a single pipeline (full specification of the command-line scheme can be found in the Supplementary Materials).
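The two-stage scheme can be sketched in a few lines of Python. The example below is purely illustrative and uses toy data: the participant stage averages a list of hypothetical measurements for one participant, and the group stage averages the participant-level results. Function names and the use of a thread pool are assumptions made for the sketch. In practice each participant job would typically be a separate container invocation.

```python
from concurrent.futures import ThreadPoolExecutor


def participant_stage(item):
    """Toy first-level analysis: uses the data of one participant only."""
    label, measurements = item
    return label, sum(measurements) / len(measurements)


def group_stage(participant_results):
    """Toy group-level analysis: needs every participant-level result."""
    means = [mean for _, mean in participant_results]
    return sum(means) / len(means)


def run_pipeline(dataset, max_workers=4):
    # Map: participant jobs are independent, so they may run concurrently.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(participant_stage, dataset.items()))
    # Reduce: a single job that consumes all participant-level outputs.
    return group_stage(results)


if __name__ == "__main__":
    toy_dataset = {"01": [1.0, 3.0], "02": [2.0, 4.0]}
    print(run_pipeline(toy_dataset))
```

The point of the split is visible in `run_pipeline`: only the map step can be distributed, while the reduce step has to wait for all of its inputs.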
There are no restrictions on what language is used to write the wrapper script as long as it conforms to the prescribed interface. However, to make it easier to generate new BIDS Apps we have created a basic implementation in Python that can be imported into a new script and filled with App-specific options: https://github.com/BIDS-Apps/bidscmd. We also provide several utilities that make it easier to work with BIDS-compatible directory structures--most notably, the PyBIDS Python package (https://github.com/INCF/pybids), which provides tools for simple but powerful logical queries over entities defined in the BIDS specification (e.g., retrieving a list of all unique subjects; getting the fieldmap files for all subjects with a valid first scanning run; etc.).
In addition to conforming to a standardized command-line argument scheme, run scripts are also responsible for validation of the input data before running any analysis. To facilitate the process we have developed a command line validator that checks whether the input datasets are compliant with the BIDS standard https://github.com/INCF/bids-validator. Because not all BIDS compatible datasets can be analyzed by all BIDS Apps (e.g. a surface reconstruction pipeline requires a high-resolution T1 weighted image), the validator can be configured to reject datasets with particular properties. Integrating the validator is as easy as calling an external command; Dockerfiles and container images with the validator pre-installed are also available. BIDS Apps developers can also choose to implement validation steps themselves if the requirements of their pipelines cannot be easily checked by the standard validator.
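As a rough illustration of calling the validator as an external command from a run script, the sketch below wraps the call in a small function. It assumes that the validator is installed as an executable named `bids-validator` on the PATH, that it takes the dataset directory as its argument, and that a non-zero exit status indicates a dataset that failed validation. The function name `validate_bids_dataset` and the `runner` parameter are hypothetical and exist only to keep the example testable.

```python
import subprocess


def validate_bids_dataset(bids_dir, runner=subprocess.run):
    """Run the validator on `bids_dir`; return True when it exits with status 0.

    Assumptions: an executable called `bids-validator` is on the PATH and
    signals an invalid dataset through a non-zero exit status.
    """
    command = ["bids-validator", bids_dir]
    completed = runner(command)
    return completed.returncode == 0
```

A run script could call this before starting any analysis and exit early with an informative message when it returns False.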
Building and testing container images
For each BIDS App a Docker image is built and run on a set of lightweight example BIDS datasets (Gorgolewski et al. 2013). This execution tests that the command-line interface and BIDS support are correctly implemented. The tests are run forcing read-only containers, to ensure compatibility with Singularity (which imposes read-only mode). If the Docker image builds successfully and passes all tests, it is assigned a unique version (based on the tag obtained from the corresponding GitHub repository) and uploaded to Docker Hub with a version tag. Since version tags are unique and Docker images stored on Docker Hub are never overwritten, all historical versions of each BIDS App will always be accessible. Building, testing and archiving of BIDS Apps is performed automatically through a Continuous Integration service (CircleCI). Automating testing and versioning improves reliability by minimizing human error.
To facilitate the process of creating new BIDS Apps we have made an example App that can also be used as a template for new Apps: https://github.com/BIDS-Apps/example. Additional documentation and tutorials are available at http://bids-apps.neuroimaging.io. Developers seeking help are also encouraged to subscribe to the bids-apps-dev mailing list at https://groups.google.com/d/forum/bids-apps-dev.
Available BIDS Apps
At the moment there are 19 BIDS Apps (see Table 1) in the repository, most of which were developed by participants in the 2016 sprint (see Materials and Methods). The Apps span different imaging modalities (structural, functional and diffusion MRI) as well as languages (Python, C++, MATLAB/Octave, OpenCL).
Running a BIDS App locally
Running a BIDS App on a local system can be performed using Docker, which is easy to install on all three major operating systems. To run the first stage of the example BIDS App for participant number 01 the user needs to open a console (terminal or cmd) and type:
docker run -ti --rm \
  -v /Users/cajal/data/ds005:/bids_dataset:ro \
  -v /Users/cajal/outputs:/outputs \
  bids/example:0.0.4 \
  /bids_dataset /outputs participant --participant_label 01
where /Users/cajal/data/ds005 is the path to the input dataset and /Users/cajal/outputs the path where results should be stored. If the BIDS App was not run before on this machine, the Docker image will be automatically downloaded from the Docker Hub.
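For platforms that launch BIDS Apps programmatically, the same invocation can be assembled as an argument list. The following is a minimal sketch: the function name `docker_run_command` is hypothetical, and the container-side mount points `/bids_dataset` and `/outputs` simply mirror the example command rather than being required by the framework.

```python
def docker_run_command(image, bids_dir, output_dir, analysis_level,
                       participant_labels=()):
    """Build a `docker run` argument list for a BIDS App (illustrative)."""
    command = [
        "docker", "run", "-ti", "--rm",
        # Mount the input dataset read-only and the output folder read-write.
        "-v", f"{bids_dir}:/bids_dataset:ro",
        "-v", f"{output_dir}:/outputs",
        image,
        # Core BIDS App arguments, expressed as paths inside the container.
        "/bids_dataset", "/outputs", analysis_level,
    ]
    if participant_labels:
        command += ["--participant_label", *participant_labels]
    return command


if __name__ == "__main__":
    print(" ".join(docker_run_command(
        "bids/example:0.0.4",
        "/Users/cajal/data/ds005",
        "/Users/cajal/outputs",
        "participant",
        ["01"],
    )))
```

The resulting list can be passed to `subprocess.run` without going through a shell, which avoids quoting problems with paths that contain spaces.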
Running a BIDS App on a cluster (HPC)
On many academic clusters, Singularity can be used to run containers². In these settings, to run a BIDS App, it first needs to be saved to a Singularity-compatible image file. This step needs to be performed outside of the cluster (for example on a laptop) and requires Docker:
docker run --privileged -ti --rm \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v D:\singularity_images:/output \
  filo/docker2singularity \
  bids/example:0.0.4
where D:\singularity_images is a path where the Singularity image will be stored. After transferring the .img file to a cluster it can be run like any other executable:
./bids_example-0.0.4.img /bids_dataset /outputs participant --participant_label 01
Discussion
We have proposed a new way of distributing easy-to-use, reproducible neuroimaging analysis workflows that can run on all three major operating systems as well as multi-tenant clusters. Each BIDS App encapsulates all of its binary dependencies providing the means for reproducible analysis as well as an ultimate source of provenance information. Thanks to the BIDS standard for the organization of input data, errors caused by manually provided metadata are minimized. Finally, the unified command-line interface structure combined with flexible MapReduce-style execution schemes lends BIDS Apps to easy integration into data analysis platforms as well as efficient execution on computational clusters independently of the particular scheduling software. To support the
To prove the viability of the BIDS Apps concept we have developed 18 Apps representing a diverse set of neuroimaging software originating from many different labs. We expect the number of available Apps to grow in the future. At the time of writing this paper there were two more Apps being developed (see Table 2). We actively encourage neuroimaging methods developers to deploy their tools as BIDS Apps to further grow the library of available pipelines.
Even though similar solutions based on Docker have been proposed in the past, none have addressed the problem of running Docker containers on HPCs. For example, a study evaluating computational overhead running Docker images on an HPC (Di Tommaso et al. 2015) had to limit access to the Docker-enabled cluster only to “trusted” users due to security concerns (personal communication). The solution proposed here combines mature container building tools provided for multiple operating systems by the Docker project with HPC compatibility through Singularity.
Additionally, in contrast to previous proposals, the presented solution puts a strong emphasis on clear versioning of container images and the ability to access all previous versions. We envisage that this feature will be very valuable in the context of longitudinal studies (where the same software stack needs to be used over many years) as well as for accurately reporting a set of computational methods in a publication and later replicating them (which can be achieved by simply referencing a BIDS App with a corresponding version). Strict versioning of BIDS Apps is achieved by careful management of Docker images on Docker Hub via a Continuous Integration service which also is responsible for testing the Docker images, further reducing potential errors.
It is also worth noting that even though containers increase the reproducibility of scientific results, they do not solve the problem of sensitivity of some results to different operating systems, architectures or third party libraries. In particular, numerical and statistical instabilities create artificial dependence of results on hardware details that can only be addressed algorithmically. Further research is necessary to assess the robustness of published results to these factors. BIDS Apps can help in this endeavour to a certain extent - for example, a single analysis can be run using different versions of the same BIDS App to see the variance in results (different flavours of the same App using different Linux distributions could be created for this purpose). However, this approach has limits: further work is needed to better understand the potential for variance in results due to different Linux kernel versions across different systems and hardware architectures, since containers do not encapsulate these.
Another advantage of container-based solutions is that the user can run software in an almost identical software ecosystem as the one used for its development. This reduces the number of problems experienced by users due to nuanced differences in system configuration, such as system libraries or software versions. It also makes maintaining the software significantly easier for the developers, who do not need to support a variety of different configurations, and can easily reproduce errors.
Even though BIDS Apps as a form of neuroimaging software distribution scheme can be perceived as performing a similar task to the NeuroDebian project, the two initiatives in fact complement each other. Many of the example BIDS Apps presented in this manuscript use Debian as their base distribution and benefit from the ease of installation provided by the NeuroDebian project. It not only makes Dockerfiles shorter and easier to maintain, but due to the network of NeuroDebian mirror servers the build process is more reliable than downloading software from its original locations. On the other hand, the NeuroDebian project benefits from the BIDS App distribution scheme by exposing software previously limited to Debian-based distributions to all flavours of Linux. This is important considering that many HPCs run on RedHat and CentOS Linux distributions rather than Debian.
Future work will involve engaging more developers of neuroimaging methods to create BIDS Apps. Additionally, there are plans for developing a repository that would be independent of Docker Hub to provide Open Container Initiative as well as Singularity-compatible images. This would improve the sustainability of the project and also remove the conversion step (from Docker to Singularity) currently necessary for running BIDS Apps on clusters.
Work is currently underway to facilitate the integration of BIDS Apps in other platforms such as XNAT (Marcus et al. 2007), CBRAIN (Sherif et al. 2014), or cloud based services like Amazon Web Services (AWS). The apps will include a machine readable description of input arguments, their descriptions and acceptable values (based on the Boutiques application descriptor (Glatard et al. 2015)). Another benefit of such a description is to allow developers integrating BIDS Apps into their platforms to automatically generate user interfaces, and to improve validation of input parameters.
Singularity is not the only solution proposed to handle containers on HPCs. “Shifter” (as of September 2016 only available as pre-release beta version) has been discussed in the context of running containerized academic software (Hale et al. 2016). The principles behind Shifter are similar to Singularity, but Shifter also attempts to tackle the problem of managing the container images. Because of this, it depends on several services (Redis and MongoDB servers, worker processes, etc.) that make setting it up and maintaining it more involved than Singularity. However, despite the differences all BIDS Apps would be able to run on clusters running Shifter.
It is also worth mentioning that even though the proposed framework is focused on analysis of neuroimaging data, a similar scheme could be applied to other types of data. The only element that is specific to neuroimaging in BIDS Apps is the format of the input data. Other fields with well established data standards can easily adopt the same way of constructing, testing and archiving pipelines with their software dependencies.
BIDS Apps is a framework to help neuroimaging practitioners deploy the complex data processing workflows that they require in their research. Leveraging existing technologies such as GitHub and CircleCI and extensively using containers (Docker and Singularity), the proposed ecosystem automatically generates the appropriate container images with a minimized impact on the researcher’s development flow. This paper shows how these apps are created, stored, and executed either locally or in HPCs. To maximize interoperability and reduce manual metadata handling, BIDS Apps require that input data are in the BIDS organization format. The ultimate goal of BIDS Apps is reproducibility, thus this framework is particularly focused on archiving and versioning of neuroimaging workflows.
Materials and Methods
Engaging the community
To kickstart the repository of BIDS Apps, a four day long coding sprint was organized at Stanford University in August 2016. Leading neuroimaging methods and workflow developers were invited to learn about BIDS, Docker, and Singularity. In addition the sprint was advertised on the Stanford Center for Reproducible Neuroscience and Twitter, and outreach was performed during the 2016 OHBM Meeting in Geneva. The workshop consisted of one day of training; during the remaining three days, hands-on support was provided by experienced Docker developers.
Command line specification
Running workflows in this way requires adhering to three standards: 1) a common command-line interface, 2) a Docker container to ensure portability, and 3) a standard for organizing input data. Containers created this way can be easily integrated in OpenfMRI as well as other data analysis platforms. Thanks to Docker-to-Singularity conversion they can also be easily run on High Performance Computers (clusters) without the need to install all of the dependencies.
Each workflow/pipeline will be run independently for each subject (the map step). Results of this execution (arranged in whatever way the pipeline prefers) can be optionally processed in a group level analysis (reduce step).
Command line interface
Each pipeline should have a simple wrapper script used to run it. The script should provide a command line interface and accept (at minimum) the following command line arguments:
For the participant level (aka map) step:
bids_dir - (positional argument #1) the directory with the input dataset formatted according to the BIDS standard. This directory is read only.
output_dir - (positional argument #2) the directory where the output files should be stored. This is the only directory the pipeline should write to. Can be used to store intermediate files, but they should be removed after the pipeline finishes. This directory is shared across all of the participant level jobs - it’s up to the script to create subfolders for each subject.
“participant” - (positional argument #3) indicates that this is a participant level analysis.
--participant_label - (optional) label of the participant that should be analyzed. The label corresponds to sub-<participant_label> from the BIDS spec (so it does not include “sub-”). If this parameter is not provided all subjects should be analyzed. Multiple participants can be specified with a space separated list.
Example:
Run processing for every subject independently (map step). Each of these operations can be performed in parallel. There are no restrictions or specification of how data should be organized inside the output_dir.
./my_pipeline /data/my_dataset /scratch/outputs participant --participant_label 01
./my_pipeline /data/my_dataset /scratch/outputs participant --participant_label 02
./my_pipeline /data/my_dataset /scratch/outputs participant --participant_label 03
…
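A scheduler-agnostic way to produce such a list of map-step jobs is to enumerate the participant folders of the dataset. The sketch below assumes that participants appear as top-level `sub-<participant_label>` directories in the BIDS dataset. The helper names `participant_labels` and `participant_jobs` are hypothetical.

```python
import os


def participant_labels(bids_dir):
    """Return sorted participant labels (without the "sub-" prefix)."""
    labels = []
    for name in sorted(os.listdir(bids_dir)):
        # Only top-level directories named sub-<label> count as participants.
        if name.startswith("sub-") and os.path.isdir(os.path.join(bids_dir, name)):
            labels.append(name[len("sub-"):])
    return labels


def participant_jobs(pipeline, bids_dir, output_dir):
    """Build one participant-level command per participant (map step)."""
    return [
        [pipeline, bids_dir, output_dir, "participant", "--participant_label", label]
        for label in participant_labels(bids_dir)
    ]
```

Each returned command is independent of the others, so the list can be handed to whatever job scheduler the cluster provides.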
(Optional) For the group level (aka reduce) step:
bids_dir - (positional argument #1) the directory with the input dataset formatted according to the BIDS standard. This directory is read only.
output_dir - (positional argument #2) the directory where the output files should be stored. This is the only directory the pipeline should write to. Can be used to store intermediate files, but they should be removed after the pipeline finishes. This directory is the same one that was used in the participant level analysis and should be prepopulated with participant level results before running the group level.
“group” - (positional argument #3) indicates that this is a group level analysis.
--participant_label - (optional) labels of the participants that should be analyzed. The label corresponds to sub-<participant_label> from the BIDS spec (so it does not include “sub-”). If this parameter is not provided all subjects should be analyzed. Multiple participants can be specified with a space separated list. This can be useful if you want to do a group level analysis on a subset of all participants in your dataset
Example: ./my_pipeline /data/my_dataset /scratch/outputs group
The script can also accept other arguments specific to your pipeline (see --template_name in FreeSurfer App). Mind that the same set of extra arguments will be passed to the map (single subject level) and the reduce (group level) stage.
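The core interface described above maps naturally onto Python's standard `argparse` module. The following is a minimal sketch of a wrapper's argument parsing, not the implementation used by any particular App. The function name `build_parser` and the help strings are illustrative, and pipeline-specific arguments are omitted.

```python
import argparse


def build_parser():
    """Create a parser for the core BIDS App command line (sketch)."""
    parser = argparse.ArgumentParser(description="Example BIDS App wrapper.")
    parser.add_argument(
        "bids_dir",
        help="Input dataset formatted according to the BIDS standard (read only).")
    parser.add_argument(
        "output_dir",
        help="The only directory the pipeline should write to.")
    parser.add_argument(
        "analysis_level", choices=["participant", "group"],
        help="Stage of the analysis to perform.")
    parser.add_argument(
        "--participant_label", nargs="+", default=None,
        help="Space separated labels without the 'sub-' prefix; "
             "if omitted, all participants are analyzed.")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.bids_dir, args.output_dir, args.analysis_level, args.participant_label)
```

Because `--participant_label` accepts one or more values, it should be given after the three positional arguments, as in the examples above.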
Advanced use cases
Multiple map reduce steps. In case your pipeline needs to do multiple map reduce steps the analysis_level (third positional argument) can take additional arguments: participant2, group2, participant3, group3 etc. In the description of your app please specify how many map reduce steps are necessary.
Within-job multi CPU parallelization. Each job (on any given level: participant or group) might be run on a multi CPU machine; in such case a parameter --n_cpus followed by an integer will be passed with the number of CPUs available.
Memory-aware execution. If your app is capable of adapting its workflow depending on how much memory is available in the environment it is running in, you can implement an optional --mem_mb flag. When running your app the execution system will pass the available memory in megabytes.
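The advanced options can be handled in the same wrapper. The sketch below shows one possible way to parse numbered analysis levels such as group2 together with the --n_cpus and --mem_mb flags. It is an illustration under stated assumptions: treating a bare level name as step 1, defaulting --n_cpus to 1, and the names `parse_analysis_level` and `build_advanced_parser` are all choices made here rather than part of the specification.

```python
import argparse
import re

# "participant" or "group", optionally followed by a positive step number.
_LEVEL_PATTERN = re.compile(r"^(participant|group)([1-9][0-9]*)?$")


def parse_analysis_level(value):
    """Split e.g. 'group2' into ('group', 2); a bare name counts as step 1."""
    match = _LEVEL_PATTERN.match(value)
    if match is None:
        raise argparse.ArgumentTypeError(f"invalid analysis level: {value!r}")
    stage, step = match.groups()
    return stage, int(step) if step else 1


def build_advanced_parser():
    """Parser covering multi-step levels and resource hints (sketch)."""
    parser = argparse.ArgumentParser(description="BIDS App wrapper with advanced options.")
    parser.add_argument("bids_dir")
    parser.add_argument("output_dir")
    parser.add_argument("analysis_level", type=parse_analysis_level)
    # Assumed default: a single CPU when the execution system passes nothing.
    parser.add_argument("--n_cpus", type=int, default=1,
                        help="Number of CPUs available to this job.")
    parser.add_argument("--mem_mb", type=int, default=None,
                        help="Available memory in megabytes.")
    return parser
```

With this parser, an invocation ending in `group2 --n_cpus 8 --mem_mb 16000` yields the stage, the step number, and both resource hints as typed values.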
Footnotes
krzysztof.gorgolewski{at}gmail.com, falmagro{at}fmrib.ox.ac.uk, tibor.auer{at}rhul.ac.uk, pierre.bellec{at}criugm.qc.ca, mihai.capota{at}intel.com, mallar{at}cobralab.ca, nchurchill.research{at}gmail.com, ccraddock{at}nki.rfmh.org, gdevenyi{at}gmail.com, anders.eklund{at}liu.se, phd{at}oscaresteban.es, g.flandin{at}ucl.ac.uk, satra{at}mit.edu, swaroopgj{at}gmail.com, mark{at}fmrib.ox.ac.uk, anisha.keshavan{at}ucsf.edu, gkiar{at}jhu.edu, praamana{at}research.baycrest.org, david.raffelt{at}florey.edu.au, steele.christopher.j{at}gmail.com, poq{at}criugm.qc.ca, robert.smith{at}florey.edu.au, sstrother{at}research.baycrest.org, gael.varoquaux{at}inria.fr, tyarkoni{at}utexas.edu, yida.wang{at}intel.com, russpold{at}stanford.edu
¹ Images vs. containers. A container image is a serialization of binary dependencies and can be used to run containers. In other words, a container is a particular instantiation of an image. There can be many running containers of the same image.
² Singularity software needs to be installed on the cluster for users to be able to use Singularity images. However, owing to its minimal dependencies and its attention to security concerns, Singularity is more likely than Docker to be approved for use on multi-tenant systems.