Jacques van Helden
2019-11-28
A large proportion of bioinformatics tools are available only on the command line. Moreover, even for tools equipped with a graphical user interface (e.g. BLAST, clustal, …) the use of command-line can be necessary for some projects
Enables to automate the tasks
High performance computing (HPC)
Traceability, reproducibility, reusability
Most bioinformatics tools can be installed on Unix-like operating systems (Linux, Mac OS X), and can be used in different environments.
ssh
connection)Components
Typical applications
Examples of hypervisors
Applications run on a shared operating system without requiring a virtual machine
Advantages
Container management software
Comparaison of virtualisation solutions. Right: Virtual Machine; Center: Docker container; right: Singularity container. Source: Greg Kurtzer keynote at HPC Advisory Council 2017 @ Stanford
Advantages
Weaknesses
Doc : https://conda.io/docs/
Advantages
Weaknesses
A cluster is a set of computers (denotes as nodes) that work together and can be seen as a single system. Clusters are generally used to run parallel computing
Grappe de serveurs. En avant-plan: Homo sapiens tentant d’établir une interaction physique avec les machines. Source: https://en.wikipedia.org/wiki/Parallel_computing
Parallel computing consists in running simultaneously a series of processes on a computer system.
Tasks can be distributed on several Computer Processing Units (CPUs) of a same computer and/or on several computers (cluster).
The distribution of tasks on nodes and CPUs relies on a job scheduler. Users submit jobs (in the form of command lines or scripts) to the scheduler, which manages their execution on the different nodes and CPUs.