Infrastructure and software environment

IFB National Network of Computing Resources

Jacques van Helden

2019-11-28

Command-line tools in bioinformatics

Working environments

Most bioinformatics tools can be installed on Unix-like operating systems (Linux, macOS) and used in different environments.

Virtual machines

Container-based virtualisation

Virtual machines versus containers

**Comparison of virtualisation solutions.** Left: Virtual Machine; center: Docker container; right: Singularity container. Source: Greg Kurtzer keynote at HPC Advisory Council 2017 @ Stanford


Installation of software tools in the local operating system

Conda packages

Computer cluster

A cluster is a set of computers (called nodes) that work together and can be seen as a single system. Clusters are generally used for parallel computing.

**Server cluster.** In the foreground: *Homo sapiens* attempting to establish a physical interaction with the machines. Source: <https://en.wikipedia.org/wiki/Parallel_computing>


Parallel computing

Parallel computing consists of running several processes simultaneously on a computer system.

Tasks can be distributed across several Central Processing Units (CPUs) of the same computer and/or across several computers (cluster).
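
As an illustration of distributing independent tasks across the CPUs of a single computer, here is a minimal Python sketch using the standard `multiprocessing` module. The task (computing the GC content of DNA sequences) and the function names `gc_content` and `gc_content_all` are assumptions chosen for the example, not part of the original course material.

```python
from multiprocessing import Pool


def gc_content(sequence):
    """Return the fraction of G and C bases in a DNA sequence (0.0 if empty)."""
    sequence = sequence.upper()
    if not sequence:
        return 0.0
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


def gc_content_all(sequences, workers=1):
    """Compute the GC content of each sequence, optionally on several CPUs.

    With workers <= 1 the sequences are processed one after the other.
    With workers > 1 a pool of worker processes is started and the
    sequences are distributed among them; results keep the input order.
    """
    if workers <= 1:
        return [gc_content(s) for s in sequences]
    with Pool(processes=workers) as pool:
        return pool.map(gc_content, sequences)
```

When calling `gc_content_all(sequences, workers=4)` from a script, place the call under an `if __name__ == "__main__":` guard, which `multiprocessing` requires on platforms that start worker processes by re-importing the script. For tasks this small, the cost of starting processes probably outweighs the gain; parallel execution pays off when each task is computationally heavy.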

The distribution of tasks on nodes and CPUs relies on a job scheduler. Users submit jobs (in the form of command lines or scripts) to the scheduler, which manages their execution on the different nodes and CPUs.
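
To make the role of the scheduler more concrete, here is a toy Python sketch of the allocation step: jobs are taken in submission order and placed on the first node that has enough free CPUs. The function name `schedule`, the job names and the node names are hypothetical, and the first-fit rule is an assumption made for simplicity; production schedulers also handle priorities, memory, time limits and job completion.

```python
from collections import deque


def schedule(jobs, nodes):
    """Assign queued jobs to nodes with enough free CPUs (first fit).

    jobs:  list of (job_name, cpus_requested) in submission order
    nodes: dict mapping node_name -> number of free CPUs
    Returns (assignments, pending), where assignments maps each placed
    job to its node and pending lists the jobs that did not fit.
    """
    free = dict(nodes)      # copy, so the caller's dict is not modified
    queue = deque(jobs)
    assignments = {}
    pending = []
    while queue:
        name, cpus = queue.popleft()
        for node, available in free.items():
            if available >= cpus:
                free[node] = available - cpus   # reserve the CPUs
                assignments[name] = node
                break
        else:
            # No node has enough free CPUs: the job stays in the queue.
            pending.append(name)
    return assignments, pending


# Hypothetical jobs and nodes, for illustration only.
jobs = [("align", 4), ("sort", 2), ("assemble", 8), ("index", 2)]
nodes = {"node1": 4, "node2": 4}
assignments, pending = schedule(jobs, nodes)
```

In this example, `assemble` requests more CPUs than any single node offers, so it remains pending while the other three jobs are placed.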