This project actively encourages developers to submit their assemblers for benchmarking. A basic API is required so that all Docker assembler images can be tested in the same way. The ENTRYPOINT for each assembler image should therefore be a script, in any language, that accepts the following three arguments:
1. The mode in which to run the assembler, for example single-cell or default.
2. The location of the reads in the container filesystem.
3. A directory with write access, in which the final assembly should be written to the file contigs.fa.
The following are examples of Docker images for two assemblers:
Once you have created your Docker image, its source code should be hosted in a publicly viewable repository on either github.com or bitbucket.org, the two code hosting sites currently supported by Docker. Once your code is uploaded, the next step is to create a docker.com repository: from your docker.com account, select create new 'Automated Build' and choose your repository. This triggers Docker to build your assembler image, and to rebuild it whenever a change is made. The build may take up to an hour. Any errors will be highlighted in the build log for your repository.
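Before setting up the automated build, it can help to check locally that an image honours the three-argument interface. The following is a minimal sketch, not the command the benchmarks themselves use: the function name smoke_test_image, the mount points /reads and /out, and the image tag in the usage note are all assumptions made for illustration.

```shell
#!/bin/bash
# Sketch only: run an assembler image with the three-argument interface
# and check that it produced contigs.fa. The mount points /reads and /out
# are illustrative assumptions, not paths defined by the project.
smoke_test_image() (
  image=$1     # tag of a locally built image
  proc=$2      # mode to run, for example "default"
  reads=$3     # host path to a reads file
  out_dir=$4   # host directory that will receive the assembly

  # Docker volume mounts need absolute host paths.
  reads_dir=$(cd "$(dirname "${reads}")" && pwd) || exit 1
  reads_name=$(basename "${reads}")
  mkdir -p "${out_dir}" || exit 1
  out_dir=$(cd "${out_dir}" && pwd) || exit 1

  # Reads are mounted read-only; the output directory is writable.
  docker run --rm \
    --volume "${reads_dir}:/reads:ro" \
    --volume "${out_dir}:/out" \
    "${image}" "${proc}" "/reads/${reads_name}" /out || exit 1

  # Succeed only if a non-empty assembly was written.
  [ -s "${out_dir}/contigs.fa" ]
)
```

For example, after building the image locally with docker build --tag my-assembler . (a hypothetical tag), running smoke_test_image my-assembler default ./reads.fq.gz ./out returns success only if out/contigs.fa exists and is not empty.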
Once your assembler image has been successfully built on docker.com it is ready for use in the nucleotid.es benchmarks. To have your assembler included in the benchmarks, fork and clone the list of assemblers, then run ./script/add in this repository. This will add an empty list item to the file assembly.yml. Update this YAML entry with as much detail as you can. Finally, commit your changes and create a pull request. Once this request is merged your assembler will be included in all further benchmarks.
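The submission steps above can be sketched as two small shell functions. This is an illustration under stated assumptions rather than an official workflow: the function names prepare_entry and publish_entry are hypothetical, and the fork URL, working directory and commit message are placeholders supplied by the caller. Only ./script/add and assembly.yml come from the instructions above.

```shell
#!/bin/bash
# Sketch only: the fork URL, working directory and commit message are
# placeholders passed in by the caller, not values defined by the project.

# Clone your fork and add an empty list item to assembly.yml.
prepare_entry() (
  fork_url=$1
  work_dir=$2
  git clone "${fork_url}" "${work_dir}" || exit 1
  cd "${work_dir}" || exit 1
  ./script/add
)

# After filling in the new entry by hand, commit it and push to your fork.
publish_entry() (
  work_dir=$1
  message=$2
  cd "${work_dir}" || exit 1
  git commit --all --message "${message}" || exit 1
  git push origin HEAD
)
```

A typical session would call prepare_entry with the URL of your fork, edit the new entry in assembly.yml, call publish_entry with a commit message, and then open the pull request from the code hosting site.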
All developers are encouraged to join the nucleotid.es mailing list. This list is used to communicate updates and changes in how the benchmarks are run. It can also be used to ask questions or get help debugging your images.
Below is an ENTRYPOINT bash script for running velvet. Detailed comments explain each line. Bash is used here as an example, but any language preferred by the author can be used, as this script will run inside the container.
#!/bin/bash
# It's good practice to set these variables. This
# will make it easier to debug your assembler if
# you encounter any problems.
set -o errexit
set -o xtrace
set -o nounset
# The first argument is the mode to run the assembler.
# This should match an entry in the Procfile.
readonly PROC=$1
# The second argument is the location of the reads
# in the container filesystem. These will be present
# in a read-only directory.
readonly READS=$2
# The third argument is a directory with
# write-access where the final assembly should be
# written to.
readonly DIR=$3
# The assembly should be written to the file
# "contigs.fa" in the output directory.
readonly ASSEMBLY=$DIR/contigs.fa
# Set up logging. This will make it easier to debug any problems.
LOG="${DIR}/log.txt"
exec > >(tee "${LOG}")
exec 2>&1
# Create a temporary directory to run the assembly
readonly TMP_DIR=$(mktemp -d)
# Determine which command to run by pulling from the Procfile.
# "cut -f 2-" keeps the whole command even if it contains a colon.
CMD=$(grep -E "^${PROC}:" /Procfile | cut -f 2- -d ':')
if [[ -z ${CMD} ]]; then
  echo "Abort, no proc found for '${PROC}'."
  exit 1
fi
# Here is where the assembler is actually run
eval "${CMD}"
# Copy contigs to target directory
cp "${TMP_DIR}/contigs.fa" "${ASSEMBLY}"
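The Procfile lookup used by this script can be tried in isolation. The following is a minimal sketch, assuming a Procfile in which each line has the form name: command. The two entries, their commands and the helper name lookup_proc are hypothetical placeholders, not real assembler invocations.

```shell
#!/bin/bash
set -o errexit
set -o nounset

# Hypothetical Procfile, written to a temporary file for this sketch.
# In an assembler image it would live at /Procfile.
PROCFILE=$(mktemp)
cat > "${PROCFILE}" <<'EOF'
default: echo "assemble with default settings"
careful: echo "assemble: careful mode"
EOF

# Print the command registered for a process name,
# or nothing if the name has no entry.
lookup_proc() {
  grep -E "^$1:" "${PROCFILE}" | cut -f 2- -d ':' | sed 's/^ *//'
}

CMD=$(lookup_proc careful)
if [ -z "${CMD}" ]; then
  echo "Abort, no proc found." >&2
  exit 1
fi

# Run the selected command.
eval "${CMD}"
```

Selecting fields with cut -f 2- rather than cut -f 2 matters for the second entry: its command contains a colon, so taking only the second field would cut it off after "assemble".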