This guide gives a brief overview of how to turn a workflow into a ProtProtocol Docker container. It shows you how to add a new tool to the existing IsoProt container.
Tip: In case your new workflow only requires a different type of statistical analysis, have a look at creating a new Jupyter notebook first.
This guide assumes that the following terms mean something to you:
- Docker
- Dockerfile
- Linux
Setup
We will use the IsoProt container as a starting point. To clone the latest version from GitHub, execute:
git clone https://github.com/ProtProtocols/IsoProt.git
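Both the clone above and the build steps later in this guide depend on git and docker being available on your PATH. The following is a minimal sketch of such a check; the helper name missing_tools is made up for this example and is not part of IsoProt.

```shell
# Hypothetical helper: print every given command that is NOT found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Check the two tools this guide relies on.
missing=$(missing_tools git docker)
if [ -n "$missing" ]; then
  echo "Please install first:" $missing
else
  echo "git and docker are available"
fi
```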
Dockerfile
ProtProtocol protocols are shipped as Docker containers. Docker containers are created following a set of instructions found in a Dockerfile. Conveniently, Dockerfiles are by default named Dockerfile.
The official Dockerfile reference in the Docker documentation describes all available instructions.
A Dockerfile starts with a FROM command, telling the Docker builder which image to use as a starting point:
FROM protprotocols/protprotocols_template
We created our own protprotocols_template base image. It comes with Jupyter Notebook and several dependencies pre-installed and serves as a common base for all ProtProtocol images.
Installing software
Next, we change to the user root
to install additionally required software for the IsoProt
image:
USER root
RUN apt-get update \
&& apt-get install -y --no-install-recommends mono-complete libxml2-dev libnetcdf-dev \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*
Tip: The Docker builder creates a new layer for every command in the Dockerfile. Therefore, to reduce the total size of the image, it is a good idea to combine several commands into one and always clean up afterwards. In the above case, this is achieved by chaining several apt-get commands with the && operator.
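To make the tip concrete, the sketch below writes two variants of the same installation step into a temporary directory and counts their RUN commands. The file names Dockerfile.separate and Dockerfile.combined are only illustrative, and a single package is used to keep the example short.

```shell
workdir=$(mktemp -d)

# Variant A: three RUN commands, hence three layers. The package lists
# deleted in the last step still occupy space in the first layer.
cat > "$workdir/Dockerfile.separate" <<'EOF'
FROM protprotocols/protprotocols_template
USER root
RUN apt-get update
RUN apt-get install -y --no-install-recommends libxml2-dev
RUN rm -rf /var/lib/apt/lists/*
EOF

# Variant B: one RUN command, hence one layer that never contains
# the temporary package lists.
cat > "$workdir/Dockerfile.combined" <<'EOF'
FROM protprotocols/protprotocols_template
USER root
RUN apt-get update \
 && apt-get install -y --no-install-recommends libxml2-dev \
 && apt-get clean \
 && rm -rf /var/lib/apt/lists/*
EOF

separate_runs=$(grep -c '^RUN' "$workdir/Dockerfile.separate")
combined_runs=$(grep -c '^RUN' "$workdir/Dockerfile.combined")
echo "separate: $separate_runs RUN commands, combined: $combined_runs"
```

Once an image has been built, docker history followed by the image name lists its layers together with their sizes, which is a quick way to see the effect.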
Additional files from the current directory can be copied into the image using the ADD command:
# Setup R
ADD DockerSetup/install_packages.R /tmp/
RUN Rscript /tmp/install_packages.R && rm /tmp/install_packages.R
In the code above, we copy our install_packages.R script into the image and then run it with the Rscript command to install all required R packages.
As you saw, the commands after RUN are ordinary Linux commands executed in the container's shell. Installing tools such as SearchGUI and PeptideShaker requires slightly more complex commands:
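# Create the directory for the tools (-p: no error if it already exists)
RUN mkdir -p /home/biodocker/bin
# Download and unpack PeptideShaker, then create an executable launcher script.
# printf fills in the version while the launcher is written; "$@" stays literal
# so that the launcher forwards its own arguments to java.
RUN PVersion=1.16.27 && ZIP=PeptideShaker-${PVersion}.zip && \
wget -q http://genesis.ugent.be/maven2/eu/isas/peptideshaker/PeptideShaker/${PVersion}/$ZIP -O /tmp/$ZIP && \
unzip /tmp/$ZIP -d /home/biodocker/bin/ && rm /tmp/$ZIP && \
printf '#!/bin/bash\njava -jar /home/biodocker/bin/PeptideShaker-%s/PeptideShaker-%s.jar "$@"\n' "$PVersion" "$PVersion" > /home/biodocker/bin/PeptideShaker && \
chmod +x /home/biodocker/bin/PeptideShaker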
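The final step of the RUN statement above writes a small launcher script so that the tool can be started by name. The following sketch reproduces only that step outside of Docker. It assumes a temporary directory as a stand-in for /home/biodocker/bin and uses printf, which is one way to insert the version number while keeping "$@" literal in the generated file.

```shell
PVersion=1.16.27
bin_dir=$(mktemp -d)             # stand-in for /home/biodocker/bin
wrapper="$bin_dir/PeptideShaker"

# %s is replaced with the version when the file is written.
# "$@" sits inside single quotes, so it is written unchanged and only
# expands later, when the launcher itself is called with arguments.
printf '#!/bin/bash\njava -jar %s/PeptideShaker-%s/PeptideShaker-%s.jar "$@"\n' \
  "$bin_dir" "$PVersion" "$PVersion" > "$wrapper"
chmod +x "$wrapper"

# Show the generated launcher
cat "$wrapper"
```

Inspecting the generated file this way is a cheap check before spending time on a full image build.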
Adding additional software
To add additional software, simply add the required RUN commands to the Dockerfile.
If the software you need is available in the distribution's package repository (that is, it can be installed through apt-get), the easiest approach is to adapt the first RUN command used to install mono and related libraries. Simply add the packages you need to this command.
If the software you need is available as a biocontainer image, have a look at the respective Dockerfile in Biodocker’s GitHub repository. The simplest method is to copy the install command from there.
Alternatively, you need to manually add the required files for your tool to the Docker image using the ADD or COPY commands and then execute the required shell commands using RUN statements.
Tip: Always delete temporary files at the end of a RUN statement to keep your image as small as possible.
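As an illustration of the manual route, the sketch below writes a Dockerfile fragment to a temporary file so that it can be reviewed before being pasted into the real Dockerfile. The tool name my_tool and the archive my_tool.tar.gz are placeholders invented for this example; the archive would have to sit next to the Dockerfile.

```shell
fragment=$(mktemp)

cat > "$fragment" <<'EOF'
# Hypothetical tool shipped as an archive next to the Dockerfile
COPY my_tool.tar.gz /tmp/
# Unpack and remove the archive within a single RUN statement
RUN mkdir -p /home/biodocker/bin/my_tool \
 && tar -xzf /tmp/my_tool.tar.gz -C /home/biodocker/bin/my_tool \
 && rm /tmp/my_tool.tar.gz
EOF

# Review the fragment before adding it to the Dockerfile
cat "$fragment"
```

Note that the archive is still stored in the layer created by COPY, so the rm here mainly keeps the final file system tidy. If the tool can be downloaded instead, fetching, unpacking, and deleting it within one RUN statement, as done for PeptideShaker, avoids that extra layer.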
Building your container
To build your container, execute the following command in the directory containing the Dockerfile:
sudo docker build -t yourname/my_container .
sudo is needed on most Linux distributions since, by default, the docker command can only be executed by root.
The -t yourname/my_container parameter assigns a tag to the new container. The format for these tags is {username}/{container name}:{version}. If {version} is omitted, as in the above example, the default :latest version is automatically added to the tag. Launching or pulling a container without specifying the version also defaults to the :latest version.
Warning: Do not use the :latest version for your final workflows. It does not reference a fixed version and may change in the future, so it cannot serve as a stable point of reference.
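The tag format and its :latest default can be sketched with a small helper. The function name image_ref and the version 0.1 are made up for this example; any version label of your choice works the same way.

```shell
# Hypothetical helper: compose {username}/{container name}:{version}.
# If no version is given, fall back to "latest", mirroring Docker's default.
image_ref() {
  printf '%s/%s:%s\n' "$1" "$2" "${3:-latest}"
}

image_ref yourname my_container        # prints yourname/my_container:latest
image_ref yourname my_container 0.1    # prints yourname/my_container:0.1

# Usage (requires Docker): build with an explicit version instead of :latest
#   sudo docker build -t "$(image_ref yourname my_container 0.1)" .
```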
Testing your container
To launch your container, simply call:
sudo docker run yourname/my_container
Since you will want to access the Jupyter web interface and don’t want the container to exit immediately, you also need to map the respective port and enable an interactive session:
sudo docker run -p 8888:8888 -ti yourname/my_container
For more information on how to start a container using the docker command directly, have a look at Run IsoProt without docker-launcher.
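In -p 8888:8888, the first number is the port on your machine and the second is the port inside the container. If port 8888 is already in use locally, only the first number needs to change. The sketch below assembles the arguments for an arbitrary local port; the helper name jupyter_run_args and the port 8889 are assumptions made for this example.

```shell
# Hypothetical helper: print the docker run arguments for a given local
# port and image. The container side stays 8888, as in the command above.
jupyter_run_args() {
  host_port=$1
  image=$2
  printf '%s\n' "-p ${host_port}:8888 -ti ${image}"
}

jupyter_run_args 8889 yourname/my_container
# prints: -p 8889:8888 -ti yourname/my_container

# Usage (requires Docker); Jupyter is then reachable on local port 8889:
#   sudo docker run $(jupyter_run_args 8889 yourname/my_container)
```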
Debugging your container
Sometimes, you will want to access your container directly using a shell. There are two options to do this:
Start the container to open a shell directly:
sudo docker run -ti -p 8888:8888 yourname/my_container bash
To have root access in the session:
sudo docker run -ti -p 8888:8888 -u root yourname/my_container bash
Connect to a running container:
Run sudo docker container ls to find the appropriate container ID. Then, run sudo docker exec -ti -u root <container_id> bash:
# Get the running container id
$> sudo docker container ls
CONTAINER ID IMAGE COMMAND CREATED [...]
c4f023536c56 protprotocols/isoprot:release-0.2 "/bin/sh -c 'jupyter…" [...]
# Open a session in this container
$> sudo docker exec -ti -u root c4f023536c56 bash
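Looking up the container ID by eye becomes tedious when several containers are running. The following sketch extracts the ID from the listing with awk. The helper name container_id_for_image is hypothetical, and the sketch assumes that the IMAGE column shows the image name exactly as it was passed to docker run.

```shell
# Hypothetical helper: read `docker container ls` output on stdin and print
# the ID (first column) of every container whose IMAGE column matches.
container_id_for_image() {
  awk -v image="$1" '$2 == image { print $1 }'
}

# Sample listing copied from the output shown above
sample=$(mktemp)
cat > "$sample" <<'EOF'
CONTAINER ID IMAGE COMMAND CREATED [...]
c4f023536c56 protprotocols/isoprot:release-0.2 "/bin/sh -c 'jupyter…" [...]
EOF

container_id=$(container_id_for_image protprotocols/isoprot:release-0.2 < "$sample")
echo "$container_id"

# Usage (requires Docker and a running container):
#   id=$(sudo docker container ls | container_id_for_image yourname/my_container)
#   sudo docker exec -ti -u root "$id" bash
```

If more than one container was started from the same image, the helper prints one ID per line, so pick the one you need before calling docker exec.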