In this article, we will introduce the concept of molecular dynamics simulations and run a basic example using GROMACS on Qarnot’s cloud computing platform.
Molecular dynamics
Introduction
Molecular dynamics simulations are computational methods aiming to emulate the behaviour of atoms and molecules over time and space and thereby to study the behaviour of the global system they constitute.
These methods are mainly found in chemical physics, materials science and biophysics. They are used to test models of interactions between atoms, molecules and macromolecules (such as proteins, nucleic acids, etc.), which are phenomena that cannot be observed directly. They thus make it possible to refine the models and to better understand the functions of molecules in their biological system.
Working principle
Molecular systems typically consist of a very large number of particles, so their dynamics cannot be solved analytically. Numerical methods, however, are well suited to this complexity.
Molecular dynamics simulations consist of iteratively applying the equations of motion to each particle over discretized time. At each iteration, the acceleration of the particle is computed from its mass and from the forces applied to it according to the given force field. From the prior position and velocity of the particle, its new position after a small time step is then determined. In this way, starting from a given configuration, the trajectories of all particles are computed over the duration of the simulation.
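To make this loop concrete, here is a minimal sketch for a single particle in one dimension. It assumes a toy "force field" (a harmonic spring, chosen only for illustration) and the velocity Verlet integration scheme; production codes use far richer force fields and may rely on a different integrator, so the function names and parameters below are hypothetical.

```python
def harmonic_force(x, k=1.0):
    """Toy force field (assumption): a spring pulling the particle back to x = 0."""
    return -k * x


def simulate(x0, v0, mass, dt, n_steps, force=harmonic_force):
    """Integrate one particle in 1D with the velocity Verlet scheme."""
    x, v = x0, v0
    a = force(x) / mass  # acceleration from mass and applied force
    trajectory = [x]
    for _ in range(n_steps):
        x = x + v * dt + 0.5 * a * dt * dt  # new position after one time step
        a_new = force(x) / mass             # forces re-evaluated at the new position
        v = v + 0.5 * (a + a_new) * dt      # new velocity
        a = a_new
        trajectory.append(x)
    return trajectory, v


# Arbitrary reduced units: the particle starts at rest, away from equilibrium.
trajectory, v_final = simulate(x0=1.0, v0=0.0, mass=1.0, dt=0.01, n_steps=1000)
```

A real simulation repeats the same update for every particle in three dimensions, with forces that depend on the positions of all neighbouring particles.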
Simulation workflow
Molecular dynamics simulations are usually divided into the following steps:
- System definition
  - Provide the molecule structure
  - Provide the positions of atoms
  - Specify the force field that models interactions between atoms and molecules
  - Set spatial limitations
  - Fill in with solvent
  - Set the ion concentration
- System equilibration
  - Minimize the energy of the molecule
  - Stabilize the temperature of the solvent with an NVT simulation
  - Stabilize the pressure of the solvent with an NPT simulation
- Run the actual molecular dynamics simulation
- Analyze and visualize the resulting trajectories
High Performance Computing
High computational power needs
The amount of computation needed to perform a molecular dynamics simulation depends on the following parameters:
- The simulation size (number of particles)
- The number of snapshots (iterations) of the dynamics evolution, which is determined by:
  - The time span (total duration of the simulation)
  - The time step (time interval between two consecutive iterations)
These parameters must be chosen with care to obtain usable results:
- The number of particles depends on the size of the system, which should be large enough to avoid errors induced by boundary conditions.
- The time span should be long enough to encompass the complete dynamic phenomenon, usually nanoseconds (10⁻⁹ s) to microseconds (10⁻⁶ s).
- The time step should be small enough to avoid discretization errors, usually a few femtoseconds (10⁻¹⁵ s).
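These orders of magnitude translate directly into a number of iterations, which is simply the time span divided by the time step. The short calculation below assumes a time step of 2 fs, an illustrative value chosen for this example.

```python
FEMTOSECOND = 1e-15  # seconds
NANOSECOND = 1e-9    # seconds
MICROSECOND = 1e-6   # seconds


def number_of_steps(time_span_s, time_step_s):
    """Number of iterations needed to cover the time span with the given time step."""
    return round(time_span_s / time_step_s)


time_step = 2 * FEMTOSECOND  # assumed value, for illustration only
steps_1ns = number_of_steps(1 * NANOSECOND, time_step)   # 500,000 iterations
steps_1us = number_of_steps(1 * MICROSECOND, time_step)  # 500,000,000 iterations
print(f"1 ns: {steps_1ns:,} steps, 1 µs: {steps_1us:,} steps")
```

Since every iteration requires evaluating the interactions of all particles, going from a nanosecond to a microsecond multiplies the cost by a thousand for the same system.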
Consequently, molecular dynamics simulations may require from several CPU-days to several CPU-years of computational time. A trade-off must be found between model complexity and computational cost.
Parallel computing
During a classical molecular dynamics simulation, the most CPU-intensive task is the evaluation of the energies and forces acting on all particles at each iteration.
Parallel algorithms allow the load to be distributed. For instance, the domain decomposition method distributes sub-domains of the system to be computed separately by cores working in parallel. This can dramatically reduce the computational time.
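The partitioning idea behind domain decomposition can be sketched in a few lines. The example below is a simplified illustration, not GROMACS's actual implementation: it assumes a cubic box cut into a regular grid and only assigns particles to sub-domains, leaving out the communication between neighbouring sub-domains that a real parallel code needs. The box size and particle count are hypothetical.

```python
import random


def decompose(positions, box_length, cells_per_side):
    """Assign each particle index to a cubic sub-domain of the simulation box."""
    cell_size = box_length / cells_per_side
    domains = {}
    for index, (x, y, z) in enumerate(positions):
        # Grid coordinates of the sub-domain containing the particle;
        # min() keeps particles lying exactly on the upper wall inside the grid.
        cell = tuple(
            min(int(c / cell_size), cells_per_side - 1) for c in (x, y, z)
        )
        domains.setdefault(cell, []).append(index)
    return domains


random.seed(0)
box_length = 7.0  # nm, hypothetical box size
positions = [
    tuple(random.uniform(0.0, box_length) for _ in range(3))
    for _ in range(10_000)
]

# 2 x 2 x 2 grid: each of the 8 sub-domains could be handled by one core.
domains = decompose(positions, box_length, cells_per_side=2)
for cell, members in sorted(domains.items()):
    print(cell, len(members))
```

With particles spread uniformly, each sub-domain receives a similar share of the work, which is what allows the cores to progress at the same pace.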
Lysozyme in water example
This section relies on Justin Lemkul’s Lysozyme in water GROMACS Tutorial to showcase the use of molecular dynamics on Qarnot.
We will consider a lysozyme protein in a box of water and analyze the dynamics of this system. Lysozymes are proteins found for instance in egg white or secretions like tears, saliva, milk and mucus. They also play a part in the immune system.
GROMACS
The lysozyme simulation will be run with GROMACS. It is one of the fastest and most popular molecular dynamics software packages. It is free and open-source and can run on central processing units (CPUs) and graphics processing units (GPUs).
Data
The following files are needed to run this use case:
- ions.mdp: parameters to generate an atomic description of the system
- minim.mdp: parameters to relax the system (energy minimization)
- nvt.mdp: parameters to stabilize the temperature of the system (NVT simulation)
- npt.mdp: parameters to stabilize the pressure of the system (NPT simulation)
- md.mdp: parameters to run the MD simulation
- 1aki.pdb: protein structure file (egg white lysozyme) from the RCSB protein data bank
- run_md.sh: GROMACS instructions to run the simulation (see below)
You can download all the files at once here.
Once you have downloaded the structure file (1aki.pdb), you can render it using a visualization program such as VMD, Chimera or PyMOL.
Lysozyme 1aki protein – RCSB Protein Data Bank, rendered by VMD
Simulation
The following script runs the simulation:
run_md.sh
For more details on this script and how to make the best use of GROMACS, please consult Justin Lemkul’s tutorial.
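As a rough guide to what such a script does, the sequence of GROMACS commands from the tutorial can be laid out as a plan in Python. This is a sketch written for this article and not the contents of run_md.sh: the intermediate file names, the OPLS-AA force field and the SPC/E water model are assumptions taken from the tutorial's choices, and the tutorial also strips crystal water from the PDB file first, a step left out here.

```python
import shlex
import subprocess


def build_commands(pdb="1aki.pdb", force_field="oplsaa", water="spce"):
    """Return (argv, stdin_text) pairs, one per GROMACS step, in execution order."""

    def grompp(mdp, coords, out, extra=()):
        # Assemble a binary run input file (.tpr) from parameters, coordinates and topology.
        argv = ["gmx", "grompp", "-f", mdp, "-c", coords, "-p", "topol.top", "-o", out]
        return (argv + list(extra), None)

    def mdrun(name):
        return (["gmx", "mdrun", "-deffnm", name], None)

    return [
        # System definition: topology, box, solvent, ions
        (["gmx", "pdb2gmx", "-f", pdb, "-o", "processed.gro",
          "-ff", force_field, "-water", water], None),
        (["gmx", "editconf", "-f", "processed.gro", "-o", "newbox.gro",
          "-c", "-d", "1.0", "-bt", "cubic"], None),
        (["gmx", "solvate", "-cp", "newbox.gro", "-cs", "spc216.gro",
          "-o", "solv.gro", "-p", "topol.top"], None),
        grompp("ions.mdp", "solv.gro", "ions.tpr"),
        # genion asks which group to replace with ions: answer "SOL" on stdin
        (["gmx", "genion", "-s", "ions.tpr", "-o", "solv_ions.gro", "-p", "topol.top",
          "-pname", "NA", "-nname", "CL", "-neutral"], "SOL\n"),
        # Equilibration: energy minimization, then NVT, then NPT
        grompp("minim.mdp", "solv_ions.gro", "em.tpr"),
        mdrun("em"),
        grompp("nvt.mdp", "em.gro", "nvt.tpr", ["-r", "em.gro"]),
        mdrun("nvt"),
        grompp("npt.mdp", "nvt.gro", "npt.tpr", ["-r", "nvt.gro", "-t", "nvt.cpt"]),
        mdrun("npt"),
        # Production run
        grompp("md.mdp", "npt.gro", "md_0_1.tpr", ["-t", "npt.cpt"]),
        mdrun("md_0_1"),
    ]


def run_all(commands):
    """Execute the plan; requires a working GROMACS installation (gmx on the PATH)."""
    for argv, stdin_text in commands:
        subprocess.run(argv, input=stdin_text, text=True, check=True)


commands = build_commands()
for argv, _ in commands:
    print(shlex.join(argv))  # dry run: only display the commands
```

Printing the plan first makes it easy to check each step against the workflow described earlier before calling run_all(commands) on a machine where GROMACS is installed.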
Molecular dynamics on Qarnot
Qarnot’s cloud platform provides on-demand computational power that can reduce the running time of a molecular dynamics simulation.
Prerequisites
The first step to use Qarnot’s platform is to create a Qarnot account. Then, retrieve your personal configuration file qarnot.conf from the Access token section of your Qarnot account.
Follow these steps to set up a Python virtual environment and install the Qarnot Python SDK.
Launching the use case
To launch the use case simulation on Qarnot, unzip the lysozyme-in-water.zip archive and rename the folder containing the files to input.
Then, copy and paste the following script and save it next to your input folder and qarnot.conf configuration file.
gromacs.py
Your working directory should look like this:
input/
    1aki.pdb
    ions.mdp
    md.mdp
    minim.mdp
    npt.mdp
    nvt.mdp
    run_md.sh
qarnot.conf
gromacs.py
Once the environment is ready, run python3 gromacs.py from a terminal to launch the computation on Qarnot (it can take up to 1 hour).
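For readers curious about what a launch script of this kind typically contains, here is a hedged sketch based on the Qarnot Python SDK. It is not the gromacs.py script of this use case: the bucket and task names, the docker-batch profile, the single instance, and the Docker image and tag are all assumptions made for illustration.

```python
def prepare_task(conn, input_dir="input",
                 docker_repo="gromacs/gromacs", docker_tag="latest"):
    """Create buckets and a task on the given Qarnot connection (not yet submitted)."""
    # Upload the local input folder to a storage bucket (hypothetical bucket name).
    input_bucket = conn.create_bucket("lysozyme-input")
    input_bucket.sync_directory(input_dir)
    output_bucket = conn.create_bucket("lysozyme-output")

    # One instance of a generic Docker profile (assumed profile and instance count).
    task = conn.create_task("lysozyme-in-water", "docker-batch", 1)
    task.resources.append(input_bucket)
    task.results = output_bucket

    # Container image and command to run (image and tag are assumptions).
    task.constants["DOCKER_REPO"] = docker_repo
    task.constants["DOCKER_TAG"] = docker_tag
    task.constants["DOCKER_CMD"] = "bash run_md.sh"
    return task


def main():
    """Connect with qarnot.conf, submit the task and download results to output/."""
    import qarnot  # imported here so the sketch can be read without the SDK installed

    conn = qarnot.connection.Connection("qarnot.conf")
    task = prepare_task(conn)
    task.run(output_dir="output")  # blocks until the task ends, then fetches results
```

Calling main() from the working directory described above would submit the task. Keeping the task preparation in its own function makes it possible to inspect the constants and buckets before anything is sent to the platform.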
Results
At any time, the task can be monitored from the terminal or from Tasq. Once the task is completed, all output files will automatically be downloaded to your local computer in an output folder. You can, for instance, open the md_0_1.gro file to render the final result using a visualization program such as VMD or an online .gro file viewer such as Groview.
Lysozyme 1aki protein – after simulation, rendered by VMD
For more information on how to analyze the results of this simulation, see the Analysis part of Justin Lemkul’s tutorial.
Conclusion
Molecular dynamics simulations are essential to broaden our understanding of biological systems at the microscopic level, but they are inevitably demanding in computing power. If you do not have sufficient computational resources to meet your needs, using a cloud computing platform could be the solution.
We have seen how to run a molecular dynamics simulation with GROMACS on Qarnot’s high performance computing cloud platform. However, since Qarnot’s platform is based on containerization, you could use other software of your choosing to run a computation. You can find the list of already supported payloads on the blog, in the Documentation section. You can also use one of your own Docker images or one from a public repository on Docker Hub. If you have any questions, don’t hesitate to consult Qarnot’s documentation or to contact us at qlab@qarnot.com.
Want more bioinformatics content? You could read one of our other blog posts, and learn how to run a molecular docking simulation or a nucleotide sequence alignment on Qarnot.