Tiphaine Bonniot
HPC Innovation Engineer
Engineer at Qarnot, specialized in HPC, environmental footprint and operations research

The journey of a protein in a glass of water

October 26, 2021 - Biotech

In this article, we will introduce the concept of molecular dynamics simulations and run a basic example using GROMACS on Qarnot's computing cloud platform.

Molecular dynamics

Introduction

Molecular dynamics simulations are computational methods that emulate the behaviour of atoms and molecules over time and space, and thereby make it possible to study the behaviour of the global system they constitute. These methods are mainly found in chemical physics, materials science and biophysics. They are used to test models of interactions between atoms, molecules and macromolecules (such as proteins and nucleic acids), which are phenomena that cannot be observed directly. They thus make it possible to refine the models and to better understand the functions of molecules in their biological system.

Working principle

Molecular systems typically consist of a very large number of particles, so it is intractable to study their dynamics analytically. Numerical methods, however, are well suited to this complexity. Molecular dynamics simulations consist of iteratively applying the equations of motion to each particle over discretized time. At each iteration, the acceleration of the particle is computed from its mass and from the forces applied to it according to the given force field. From the previous position and velocity of the particle, its new position after a small time step is then determined. In this way, starting from a given configuration, the trajectories of all particles are computed over the duration of the simulation.
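To make the iteration concrete, here is a minimal sketch of one common integration scheme, velocity Verlet, applied to a single particle in one dimension. The scheme and the toy "force field" (a spring, `harmonic_force`) are choices made for this illustration only; a package such as GROMACS uses far richer force fields and its own integrators.

```python
def harmonic_force(x, k=1.0):
    """Toy force field (assumption): a spring pulling the particle back to x = 0."""
    return -k * x


def velocity_verlet(x, v, mass, dt, n_steps, force=harmonic_force):
    """Advance one particle by n_steps of size dt and return its trajectory."""
    trajectory = [(x, v)]
    a = force(x) / mass  # acceleration from the force and the mass
    for _ in range(n_steps):
        # New position from the previous position, velocity and acceleration
        x = x + v * dt + 0.5 * a * dt * dt
        # Forces are re-evaluated at the new position
        a_new = force(x) / mass
        # New velocity from the average of old and new accelerations
        v = v + 0.5 * (a + a_new) * dt
        a = a_new
        trajectory.append((x, v))
    return trajectory


# With k = mass = 1 the oscillation period is 2*pi, i.e. about 628 steps of 0.01
traj = velocity_verlet(x=1.0, v=0.0, mass=1.0, dt=0.01, n_steps=628)
x_end, v_end = traj[-1]
print(f"after about one period: x = {x_end:.3f}, v = {v_end:.3f}")
```

After roughly one period the particle should be back near its starting point, which is a quick sanity check that the time step is small enough for this system.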

Simulation workflow

Molecular dynamics simulations are usually divided into the following steps:

  1. System definition
    • Provide the molecule structure
    • Provide the positions of atoms
    • Specify the force field that models interactions between atoms and molecules
    • Set spatial limitations
    • Fill in with solvent
    • Set the ion concentration
  2. System equilibration
    • Minimize the energy of the molecule
    • Stabilize the temperature of the solvent with NVT simulation
    • Stabilize the pressure of the solvent with NPT simulation
  3. Run of the actual molecular dynamics simulation
  4. Analysis and visualization of the resulting trajectories

High Performance Computing

High computational power needs

The amount of computations needed to perform a molecular dynamics simulation depends on the following parameters:

  • The simulation size (number of particles)
  • The number of snapshots (iterations) of the dynamics evolution, which breaks into:
    • The time span (total time duration of the simulation)
    • The time step (time interval between two consecutive iterations)

These parameters must be chosen with care to obtain exploitable results:

  • The number of particles depends on the size of the system, which should be large enough to avoid errors induced by boundary conditions.
  • The time span should be long enough to encompass the complete dynamic phenomenon, usually nanoseconds (10⁻⁹ s) to microseconds (10⁻⁶ s).
  • The time step should be small enough to avoid discretization errors, usually a few femtoseconds (10⁻¹⁵ s).

Consequently, molecular dynamics simulations may require several CPU-days to CPU-years of computational time. A compromise has to be found between model complexity and computational power needs.
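The orders of magnitude above translate directly into iteration counts. The short sketch below divides the time span by the time step; the 2 fs time step is an illustrative value chosen for this example, not a recommendation.

```python
def n_iterations(time_span_s, time_step_s):
    """Number of iterations needed to cover time_span_s with steps of time_step_s."""
    return round(time_span_s / time_step_s)


time_step = 2e-15  # 2 femtoseconds (illustrative assumption)
for label, span in [("1 ns", 1e-9), ("1 µs", 1e-6)]:
    print(f"{label}: {n_iterations(span, time_step):,} iterations")
```

Each of these iterations requires the forces on every particle to be evaluated, which is why the total cost grows so quickly with both the system size and the simulated duration.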

Parallel computing

During a classical molecular dynamics simulation, the most CPU-intensive task is the evaluation of the interaction energies and forces for all particles at each iteration. Parallel algorithms allow this load to be distributed. For instance, the domain decomposition method splits the system into sub-domains that are computed separately by cores working in parallel. This can dramatically reduce the computational time.
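The idea behind domain decomposition can be sketched in a few lines: the simulation box is cut into a regular grid of cells and each particle is assigned to the cell that contains it. The box size, particle count and function name `decompose` are hypothetical values for illustration. A real implementation also has to exchange particles near cell boundaries between neighbouring cores, which this sketch leaves out.

```python
import random
from collections import defaultdict


def decompose(positions, box_length, cells_per_axis):
    """Group particle indices by the sub-domain (grid cell) that contains them."""
    cell_size = box_length / cells_per_axis
    domains = defaultdict(list)
    for index, coords in enumerate(positions):
        # Clamp so that a particle lying exactly on the upper wall stays in the last cell
        cell = tuple(min(int(c / cell_size), cells_per_axis - 1) for c in coords)
        domains[cell].append(index)
    return domains


random.seed(0)
box = 7.0  # box edge in nm (hypothetical)
positions = [tuple(random.uniform(0, box) for _ in range(3)) for _ in range(1000)]

# 2 cells per axis gives up to 8 sub-domains, e.g. one per core
domains = decompose(positions, box, cells_per_axis=2)
for cell, members in sorted(domains.items()):
    print(cell, len(members), "particles")
```

Each sub-domain could then be handed to a different core, so that every core only evaluates interactions for its own share of the particles.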

Lysozyme in water example

This section relies on Justin Lemkul’s Lysozyme in water GROMACS Tutorial to showcase the use of molecular dynamics on Qarnot. We will consider a lysozyme protein in a box of water and analyze the dynamics of this system. Lysozymes are proteins found for instance in egg white or secretions like tears, saliva, milk and mucus. They also play a part in the immune system.

GROMACS

The lysozyme simulation will be run with GROMACS. It is one of the fastest and most popular molecular dynamics software packages. It is free and open-source and can run on central processing units (CPUs) and graphics processing units (GPUs).

Data

The following files are needed to run this use case:

  • ions.mdp: parameters to generate an atomic description of the system
  • minim.mdp: parameters to relax the system (energy minimization)
  • nvt.mdp: parameters to stabilize the temperature of the system (NVT simulation)
  • npt.mdp: parameters to stabilize the pressure of the system (NPT simulation)
  • md.mdp: parameters to run the MD simulation
  • 1aki.pdb: protein structure file (egg white lysozyme) from the RCSB protein data bank
  • run_md.sh: GROMACS instructions to run the simulation (see below)

You can download all the files at once here. Once you have downloaded the structure file (1aki.pdb), you can render it using a visualization program such as VMD, Chimera or PyMOL.

 

Lysozyme 1aki protein - RCSB Protein Data Bank, rendered by VMD

Simulation

The run_md.sh script runs the simulation. For more details on this script and how to make the best use of GROMACS, please consult Justin Lemkul’s tutorial.
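As a rough outline of what such a script does, the sketch below lists the sequence of GROMACS commands used in Justin Lemkul's tutorial, grouped by workflow stage. It is not the actual run_md.sh: the intermediate file names follow the tutorial and may differ from those in the downloaded archive. The commands are only printed, not executed. Note that gmx pdb2gmx and gmx genion normally ask interactive questions (force field choice, group to replace with ions), so a non-interactive script has to supply those answers.

```python
# (stage, command) pairs; file names follow the tutorial and are assumptions here
WORKFLOW = [
    ("Remove crystal water", "grep -v HOH 1aki.pdb > 1AKI_clean.pdb"),
    ("Build topology", "gmx pdb2gmx -f 1AKI_clean.pdb -o 1AKI_processed.gro -water spce"),
    ("Define the box", "gmx editconf -f 1AKI_processed.gro -o 1AKI_newbox.gro -c -d 1.0 -bt cubic"),
    ("Add solvent", "gmx solvate -cp 1AKI_newbox.gro -cs spc216.gro -o 1AKI_solv.gro -p topol.top"),
    ("Prepare ions", "gmx grompp -f ions.mdp -c 1AKI_solv.gro -p topol.top -o ions.tpr"),
    ("Add ions", "gmx genion -s ions.tpr -o 1AKI_solv_ions.gro -p topol.top -pname NA -nname CL -neutral"),
    ("Prepare minimization", "gmx grompp -f minim.mdp -c 1AKI_solv_ions.gro -p topol.top -o em.tpr"),
    ("Energy minimization", "gmx mdrun -v -deffnm em"),
    ("Prepare NVT", "gmx grompp -f nvt.mdp -c em.gro -r em.gro -p topol.top -o nvt.tpr"),
    ("NVT equilibration", "gmx mdrun -deffnm nvt"),
    ("Prepare NPT", "gmx grompp -f npt.mdp -c nvt.gro -r nvt.gro -t nvt.cpt -p topol.top -o npt.tpr"),
    ("NPT equilibration", "gmx mdrun -deffnm npt"),
    ("Prepare production run", "gmx grompp -f md.mdp -c npt.gro -t npt.cpt -p topol.top -o md_0_1.tpr"),
    ("Production run", "gmx mdrun -deffnm md_0_1"),
]

for number, (stage, command) in enumerate(WORKFLOW, start=1):
    print(f"{number:2d}. {stage}: {command}")
```

Each gmx grompp call combines a parameter file (.mdp), the current structure and the topology into a run input file, which the following gmx mdrun call then executes.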

Molecular dynamics on Qarnot

Qarnot's cloud platform provides on-demand computational power that can reduce the running time of a molecular dynamics simulation.

Prerequisites

The first step in using Qarnot’s platform is to create a Qarnot account. Then, retrieve your personal configuration file qarnot.conf from the Access token section of your Qarnot account. Follow these steps to set up a Python virtual environment and install the Qarnot Python SDK.

Launching the use case

To launch the use case simulation on Qarnot, unzip the lysozyme-in-water.zip archive and rename the folder containing the files to input. Then, copy the gromacs.py launch script and save it next to your input folder and your qarnot.conf configuration file. Your working directory should look like this:

  • input/
    • 1aki.pdb
    • ions.mdp
    • md.mdp
    • minim.mdp
    • npt.mdp
    • nvt.mdp
    • run_md.sh
  • qarnot.conf
  • gromacs.py

Once the environment is ready, run python3 gromacs.py from a terminal to launch the computation on Qarnot (it can take up to 1h).
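For readers who want a feel for what a launch script of this kind involves, here is a minimal sketch based on the Qarnot Python SDK. It is not the original gromacs.py: the task name, bucket names, Docker image and tag, and the command are all assumptions made for illustration and should be checked against Qarnot's documentation. The submission call is left commented out so that the block can be read and run without an account.

```python
def build_constants(docker_repo, docker_tag, command="bash run_md.sh"):
    """Constants telling the platform which container to start and what to run in it."""
    return {
        "DOCKER_REPO": docker_repo,
        "DOCKER_TAG": docker_tag,
        "DOCKER_CMD": command,
    }


def launch(constants, conf_path="qarnot.conf"):
    """Upload input/, run the task and download the results into output/."""
    import qarnot  # requires the Qarnot Python SDK to be installed

    conn = qarnot.connection.Connection(conf_path)
    # Name and instance count are hypothetical; "docker-batch" runs a Docker container
    task = conn.create_task("gromacs-lysozyme", "docker-batch", 1)

    # Send the local input folder to a storage bucket attached to the task
    input_bucket = conn.create_bucket("gromacs-input")
    input_bucket.sync_directory("input")
    task.resources.append(input_bucket)

    # Bucket that will receive the files produced by the simulation
    task.results = conn.create_bucket("gromacs-output")

    for key, value in constants.items():
        task.constants[key] = value

    # Blocks until the task ends, then downloads the results locally
    task.run(output_dir="output")


# Image name and tag are assumptions, not necessarily those used in the article
constants = build_constants("gromacs/gromacs", "latest")
print(constants)
# launch(constants)  # uncomment once qarnot.conf and the input folder are in place
```

Keeping the container settings in a separate function makes it easy to swap in another image or command without touching the submission logic.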

Results

At any time, the task can be monitored from the terminal or from Tasq. Once the task is completed, all output files are automatically downloaded to your local computer in an output folder. You can, for instance, open the md_0_1.gro file to render the final result using a visualization program such as VMD, or with an online .gro file viewer such as Groview.

 

Lysozyme 1aki protein - after simulation, rendered by VMD

For more information on how to analyze the results of this simulation, see the Analysis part of Justin Lemkul's tutorial.
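Before turning to dedicated analysis tools, it can help to know that a .gro file such as md_0_1.gro is plain text in a fixed-width layout: a title line, the number of atoms, one line per atom, and a final line with the box dimensions. The sketch below reads that layout from a small made-up sample rather than from real simulation output; the function name `parse_gro` and the sample coordinates are hypothetical.

```python
def parse_gro(text):
    """Parse the title, atoms and box of a .gro structure given as a string."""
    lines = text.splitlines()
    title = lines[0].strip()
    n_atoms = int(lines[1])
    atoms = []
    for line in lines[2:2 + n_atoms]:
        atoms.append({
            "residue": line[5:10].strip(),   # residue name, columns 6-10
            "atom": line[10:15].strip(),     # atom name, columns 11-15
            # x, y, z in nanometres, 8 characters each
            "position_nm": (float(line[20:28]), float(line[28:36]), float(line[36:44])),
        })
    box_nm = tuple(float(value) for value in lines[2 + n_atoms].split())
    return title, atoms, box_nm


# Made-up two-atom sample written with the fixed-width .gro atom format
atom_format = "%5d%-5s%5s%5d%8.3f%8.3f%8.3f"
sample = "\n".join([
    "Hypothetical two-atom sample",
    "    2",
    atom_format % (1, "LYS", "N", 1, 3.536, 2.234, 1.198),
    atom_format % (1, "LYS", "CA", 2, 3.612, 2.288, 1.082),
    "   7.00000   7.00000   7.00000",
])

title, atoms, box_nm = parse_gro(sample)
print(title, len(atoms), "atoms, box (nm):", box_nm)
```

To read real output, the same function could be called on the contents of the downloaded file, for example `parse_gro(open("output/md_0_1.gro").read())`, assuming the file sits in the output folder.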

Conclusion

Molecular dynamics simulations are essential to broaden our understanding of biological systems at the microscopic level, but they are inevitably highly demanding in computing power. If you do not have sufficient computational resources to meet your needs, a cloud computing platform could be the solution.

We have seen how to run a molecular dynamics simulation with GROMACS on Qarnot's high performance computing cloud platform. However, since Qarnot's platform is based on containerization, you could use other software of your choosing to run a computation. The list of already supported payloads is available in the Documentation section of the blog. You can also use one of your own Docker images or one from a public repository on Docker Hub.

If you have any questions, don't hesitate to consult Qarnot's documentation or to contact us at qlab@qarnot.com. Want more bioinformatics content? You can read our other blog posts and learn how to run a molecular docking simulation or a nucleotide sequence alignment on Qarnot.
