Add new Edit Delete History Source Add new post
Prose.io: Add new Edit Add new post

4 June 2021

Basics of a Metadynamics Simulation

Performing metadynamics in GROMACS using PLUMED2.

About this tutorial

This tutorial assumes that you are familiar with UNIX commands, navigating a terminal, performing a molecular dynamics simulation in GROMACS, and the basics of collective variables. Here, we will be running metadynamics simulations on alanine dipeptide in vacuum. PLUMED runs in conjunction with GROMACS commands nicely and provides a vast library of open-source post processing tools for MD simulations.

Introduction

Historically, the movement and behavior of molecules in molecular dynamics simulations has been described by the values of slow degrees of freedom, often called Collective Variables (CVs). One of the greatest challenges in computational biochemistry is picking the correct CVs that move slow enough to quantify, and also adequately describe what changes in the system over the simulation time. There are a vast number of collective variables that can be utilized in PLUMED, but today we will be focusing on the backbone dihedral angles as a CV, called TORSION in PLUMED. While this CV is not very effective on its own, the backbone dihedral angles in this small system are easy to visualize; plus, the whole range of possible values can be quickly explored. Considering all of this, how can we assure that the system adopts every possible psi and phi value? What would simulations tell us if we simply allowed them to float by themselves in a vacuum with no external force (bias)? Given enough time, this sort of approach would tell us something, but we don’t have all day and unlimited computational resources. To overcome this, many enhanced sampling methods have been developed to decrease computational time and increase the chance of observing a rare configuration.

One of the most widely used enhanced sampling techniques is metadynamics. The goal of metadynamics is to push reactions over energy barriers to speed up the exploration of reactions and protein configurations. In the context of biochemistry, we apply a position dependent bias to selected CVs to encourage the protein to adopt unusual, rarely observed configurations and secondary structures. Today, we are applying that bias to the backbone dihedral angles of alanine dipeptide shown below. Then, we will develop a free energy surface as a function of this CV.

_{Figure 1: Alanine dipeptide}

With some organic chemistry intuition, you can see how this trans configuration would be the most favorable. Those oxygen atoms do not want to be right next to each other and the methyl group. We can also see a chiral center at the alpha carbon. In the interest of this tutorial, we want to describe the free energy surface as a function of the angle phi. In order to do this, we need to explore all possible configurations, including the sterically unfavorable ones to get a clear picture.

Part 1: Setup and Classic Metadynamics

The necessary files to run this tutorial can be found here. Inside the metadtutorial directory, you will find two files, diala_md.tpr (topology file) and diala.mdp (parameter file). The topology file contains the starting structure of the protein, steps that were run, and parameters that were used along the way in a non-readable format, while the .mdp file was generated by the group that created the Trieste Tutorial and contains atom labels and starting coordinates. This protein was equilibrated in a vacuum and minimized for 100ps.
Below is a sample plumed.dat file that we will use to initialize the metadynamics run:

#labeling the dihedral atoms for phi, atoms C-N-CA-C
phi: TORSION ATOMS=5,7,9,15
#labeling the dihedral angle psi, atoms N-CA-C-N. See if you can fill these in yourself from the .pdb
psi: TORSION ATOMS=

#This is one way to format the metadynamics. In this case, we are applying the bias to the angle phi
METAD ...
        ARG=phi #indicating which CV to apply the bias
        SIGMA=0.1 #The width of the gaussians being deposited. Lower number = more computational time
        HEIGHT=0.6 #start depositing gaussians at 1.2 kJ/mol
        PACE=500 #interval at which Gaussians added in steps. one gaussian added every 500 steps
        LABEL=restraint #this label shows up in the file we indicate below, 'COLVAR'
... METAD
#all of that could be written on one line, as long as its sandwiched between the '...'

PRINT ARG=phi,psi,restraint.bias FILE=COLVAR STRIDE=10

This .pdb file acts as a great template for basic metadynamics simulations. The top section is where the CVs are indicated. Below the METAD… label lies all the parameters for the metadynamics simulation. Most of these reflect computational load and accuracy. Depositing larger Gaussians (SIGMA) at greater intervals (PACE) would decrease computational time at the the expense of fine detail. While there are many other keywords to parameterize metadynamics simulations, these are the basics. Enter the following into the command line to initiate a 5ns metadynamics simulation, which may take a few minutes.

gmx mdrun -v -s diala_md.tpr -nsteps 2500000 -plumed plumed.dat

Once finished, you will generate a variety of files, HILLS and COLVAR are what we are most interested in. In the COLVAR file, we will find 4 columns:

To see the angle that we biases as a function of simulation time, enter:

>gnuplot
>p "COLVAR" u 1:2

_{Figure 2: backbone dihedral angle during the metadynamics run}

Looking at the figure above, we get a rough idea of what our free energy surface will look like. We can see early in the simulation, the angle phi does not occupy ~2.1 rad (120 degrees) and roughly 0 rad. As the simulation progresses and more energy is added to the system, the dihedral angle phi is free to rotate and explore all possible values. Finally, we must generate our free energy surface. While there are many methods to do this, we are going to use the easy way with:

plumed sum_hills --hills HILLS

This will generate a file called fes.dat which shows how the free energy of a system varies in relation the angle phi. The reason we can estimate the free energy is because we explored every possible configuration of this angle and determined which were the most and least favorable. You can plot the free energy surface with:

xmgrace fes.dat

_{Figure 3: free energy of the phi angle rotation from metadynamics simulation}

Here, we can see that some of this data lines up with the data in Figure 2! We see the largest peak of free energy at about 2 radians, consistent with the unfavorable configuration that wasn’t explored until later in the configuration. Another broad peak can be observed at about 0 rad, also consistent with one of the peaks in Figure 2.

Part 2: Well-Tempered Metadynamics

Make a new directory for the following simulations!
Since metadynamics was first described by Michele Parrinello and Alessandro Laio in 2002, many improvements have been made by way of additional parameters, or entirely different algorithms. Fortunately for us, metadynamics is now as easy as typing a few lines in a .dat file in PLUMED. You should recall from the first plumed.dat file that we wrote contained a parameter called SIGMA. This can be thought of as the size of the “gaussian rain” that is added to the system. This must be chosen carefully because deposited gaussians that are too large can decrease the definition of your results. When producing figure 2, I initially chose a SIGMA value of 1, which resulted in Figure 3:

_{Figure 4}

Another value that needs to be chosen carefully is the HEIGHT factor. If you don’t add enough energy to the system, your system won’t converge and the CV won’t be adequately explored. Add too much energy, and your system might fall apart or adopt physically impossible conformations.

To address some of these problems, well-tempered metadynamics was created, in part by Parrinello in 2008. Where metadynamics adds a stream of constant free energy, indicated by the HEIGHT value, well-tempered metadynamics adjusts the HEIGHT value adaptively as the simulation progresses. To visualize this, I plotted the height of the gaussian added at the specific simulation time below.

_{Figure 5}

Notice the force added to simulation decreases gradually with the simulation time, in an almost exponential fashion. With all of this considered, lets take a look at the .dat file that will perform well tempered metadynamics:

phi: TORSION ATOMS=5,7,9,15
psi: TORSION ATOMS=7,9,15,17

METAD ...
       ARG=phi PACE=500
       GRID_MIN=-pi    #the GRID values are optional, but they specify  the upper and outer bounds
       GRID_MAX=pi     #of the grid
       TEMP=300.0      #TEMP must be specified in well-tempered metad
       LABEL=restraint
       HEIGHT=1.2
       SIGMA=0.1
       BIASFACTOR=20
... METAD
PRINT ARG=phi,psi,restraint.bias FILE=COLVAR STRIDE=1

Notice that all of our parameters in this new file, plumed_welltemp.dat are mostly the same. The SIGMA value is the same, along with the HEIGHT value, the starting point of the energy to be added. I added the GRID_MIN and GRID_MAX values, which are optional, but logical in this case because this grid size covers all possible values of phi. TEMP must be included in a well tempered case, along with BIASFACTOR. In this case, I chose a bias factor of 20. As the bias factor approaches 1, you are essentially performing an unbiased simulation. But as the bias factor approaches infinity, you will just perform a standard metadynamics simulation. To run the 5ns well-tempered metadynamics simulation, enter the following:

gmx mdrun -v -s diala_md.tpr -nsteps 2500000 -plumed plumed_welltemp.dat

This command will generate the same types of files as the standard metadynamics simulations. We are mainly interested in the COLVAR and HILLS files. To generate the free energy surface, again, just enter

plumed sum_hills --hills HILLS

xmgrace fes.dat

_{Figure 6}

The free energy surface you plotted from the well-tempered case should look similar to Figure 5. You should also notice that this free energy surface is strikingly similar to the one produced by the classic metadynamics simulation with one exception: The values of the free energy on the y axis. The highest peak on the free energy surface in Figure 2 is at about -75 kJ/mol, while that same peak is at about -45 kJ/mol in the well-tempered free energy surface. How can we compare these two free energy surfaces together and evaluate the accuracy of our well tempered simulation when they appear to be so different?
To do this, lets revisit PLUMED’s sum_hills function. Execute the following on the HILLS file from the well-tempered case AND the classic metadynamics simulation:

plumed sum_hills --hills (hills file) --mintozero --outfile fes_mintozero.dat
xmgrace fes_mintozero.dat

The ‘mintozero’ flag essentially makes the lowest point on the free energy surface the new zero point so the magnitudes and locations of the peaks on the free energy surface can be accurately compared. Below, the two free energy surfaces are plotted together and we can see that our well-tempered metadynamics simulation did a great job at estimating the free energy surface.

We can see some slight variations between the two simulations, which means we could have probably chosen a better bias factor. Choosing bias factors can be done methodically and you can read more about it here. While comparing the two methods this way shows how similar the methods are, it doesn’t really get at the main benefits of well-tempered metadynamics over standard metadynamics. For well-tempered simulations, you do not need to play with unbiased simulations to determine appropriate height factors, since it is adjusted adaptively and more likely to converge (I am unsure about this). Another great benefit of well-tempered metadynamics is reduced computational resources. To understand just how much time can be saved through well-tempered metadynamics, take a look at the md.log file for both cases with:

tail -15 md.log

On my machine, the well-tempered simulation ran over 5 times faster than the standard metadynamics simulation. While that isn’t much when looking at a small system like this, this could add up to days in real applications.

Conclusion

By the end of this tutorial you should have:

Applied ideas from organic chemistry to computational biochemical applications.
Become more familiar with basic PLUMED commands/syntax and Collective Variables.
Ran your first metadynamics and well-tempered metadynamics simulations.
Critically analyzed the free energy surfaces generated by both methods.
Formed a general understanding of classic metadynamics and why it can be improved.

By performing all of the steps that I have highlighted in this tutorial, you should now have a basic understanding of how metadynamics works and why it is important. Since it’s inception, there have been dozens of variants of metadynamics designed that come with their own pros and cons. Today, we took a look at the original methodology and one of the most commonly used flavors of metadynamics. For more information on PLUMED syntax, check out the metadynamics and well tempered metadynamics documentation pages.

Acknowledgements

This tutorial was heavily inspired by the Trieste Tutorial on the PLUMED website. I greatly benefited from this tutorial in a number of ways, but I felt it was a little outdated. I reformatted the plumed syntax to match up with the current syntax on the documentation page. I generated all of my own data and only borrowed the .pdb file from the tutorial’s tarball. I decided to take the end of the tutorial in a different direction than the original and generate some different figures. Not only will this help familiarize new group members to the programs that are in our machines in a familiar format, building this tutorial was a great exercise for me. It was a crash course in markdown, a good refresher in python after a decent break, and it completely reinforced my understanding of metadynamics.