Sheffield R meetup - 6th March 2018 (Updated for SITraN journal club - 2nd April 2019)
Longer version of materials prepared for CRUK Cambridge available here
Mark Dunning (@DrMarkDunning) Bioinformatics Core Director
web : sbc.shef.ac.uk
twitter: @SheffBioinfCore
email: bioinformatics-core@sheffield.ac.uk
https://docs.docker.com/engine/docker-overview
Docker is an open platform for developers to build and ship applications, whether on laptops, servers in a data center, or the cloud.
(may require some messing around with virtualisation or Hyper-V)
Once you have installed Docker using the insructions above, you can open a terminal (Mac) or command prompt (Windows) and run the following to download an image for the Ubuntu operating system from Dockerhub;
docker pull ubuntu
N.B. on Linux, you may need to run docker with sudo, unless you apply this fix
To run a command inside this new environment software we can do;
docker run ubuntu echo "Hello World"
echo
command within this operating systemTo use the container in interactive mode we have to specify a -it
argument. Which basically means that it doesn’t exit straight away, but instead runs the bash
command to get a terminal prompt
docker run -it --rm ubuntu
--rm
means that the container is deleted on exit, otherwise your disk could get clogged up with lots of exited containersif no command is specified, you get a shell prompt
ubuntu
image (or centos
) is often used as a base image upon which other more complicated images are basedubuntu
againYou’ll notice that when you launch a container, you don’t automatically have access to the files on your OS. In Docker, we can mount volumes using the -v argument to make files accessible e.g. -v /PATH/TO/YOUR/data:/data
inside the container.
## should say that no file or directory exists
docker run --rm ubuntu ls /data
## If on Windows, need correct path separator
docker run --rm -v c:\work:/data ubuntu ls /data
## On Unix it would be something more sensible, like
docker run --rm -v c/home/USER/work:/data ubuntu ls /data
The latest version of R and R devel are provided by the rocker project https://github.com/rocker-org/rocker
docker run --rm -it r-base R
r-base
image, if you don’t have it-it
)r-base
docker imageR
executableFor latest developmental version of R:-
docker run --rm -it r-devel R
Can also get previous versions of R
RStudio is also supported. See https://github.com/rocker-org/rocker/wiki/Using-the-RStudio-image
docker run -p 8787:8787 rocker/rstudio
-p
argument opens a port in the docker containerhttp://localhost:8787
rstudio
password rstudio
You can install whatever R packages you need in this container and analyse your data
N.B. Python fans needn’t feel left out; there are docker containers for jupyter too.
-v
.
Once a docker container has quit, you can jump back in with docker start
and docker attach
docker ps ##not name of container that just quit
docker start <name-of-container-that-just-exited>
docker attach <name-of-container-that-just-exited>
You can then build a new image
docker commit <name-of-container-that-just-exited> <new image>
There may already be a docker container for popular sets of tools
docker run --rm -p 8787:8787 -e PASSWORD=PASSWORD bioconductor/release_core2
Dockerfile
The creation of Docker images is specified by a Dockerfile. This is a text file containing the sequence of instructions required to re-create your image from some starting point, which could be the standard Ubuntu image. Essentially we list the commands we use, step-by-step to install all the software required. If you already have a shell script to install your software, then translating to a Dockerfile is relatively painless.
A useful reference is the official Docker documentation on Dockerfiles, which goes into far more detail than we will here.
The example below shows the Dockerfile used to create a Ubuntu image use git to clone a repository and install some packages
FROM ubuntu
MAINTAINER YOU NAME<your.name@sheffield.ac.uk>
RUN apt-get update
RUN apt-get install -y wget build-essential git
RUN git clone.....
RUN R -e 'install.packages(....)'
The docker build
command will build a new image from a Dockerfile
. With docker push
you can distribute this on dockerhub once you have a user name.
docker build -t=my_username/my_new_image .
docker push
Several headaches can emerge when preparing the materials for a training course
docker pull markdunning/rnaseq-toolkit
docker run --rm -p 6080:80 markdunning/rnaseq-toolkit
docker run -d -p 8787:8787 sje30/waverepo
Issue: doing several analyses at same time, some of which may require latest version of R etc. How can we ensure that previous analyses still run. Within each project, create a Dockerfile
to build a container for the analysis.
FROM bioconductor/release_base2:R3.5.3_Bioc3.8
MAINTAINER Mark Dunning<m.j.dunning@sheffield.ac.uk>
RUN R -e 'install.packages("BiocManager")'
RUN R -e 'BiocManager::install("tidyverse")'
RUN R -e 'BiocManager::install("DEXSeq")'
To see what containers you have run recently
docker ps -a
If you find your disk filling up with docker images, there are convenient one-liners for removing all containers and images.
Don’t run this now, unless you want everything you’ve been working on to be deleted!
docker rm $(docker ps -a -q)
docker rmi $(docker images -q)
You can go back into the environment of a container that has been exited. Firstly, we make sure the container is running by using docker start:-
docker start <container_ID>
We can then use docker attach
. Note that you will have to press ENTER twice in order to get a new command prompt within the container.
docker attach <container_ID>
Sounds great so far! But…
There is an alternative….
Singularity
You can build a singularity image from a docker container
markdunning/dexseq-analysis
(build from Dockerfile
above) to build an image. You need sudo access to do this.sharc
exec
command to Run R
with an analysis script### Run where you have sudo access
sudo singularity build singularity/dexseq docker://markdunning/dexseq-analysis
## Copy the image to sharc /shared/bioinformatics_core1/Shared/software/singularity/dexseq
## On sharc
singularity exec /shared/bioinformatics_core1/Shared/software/singularity/dexseq R -f dexseq_analysis.R
More documentation is available for using singularity on sharc