CDK5/(R)-roscovitine Complex
In this tutorial we will use the Antechamber package of AmberTools to create topology and structure files that can be used in a GROMACS molecular dynamics simulation. The Generalized Amber Force Fields (GAFF) are a general set of force field parameters to describe small organic molecules such as drugs. In this tutorial we will parameterize the small molecule (R)-roscovitine in complex with the cyclin-dependent kinase, CDK5. The starting pdb file can be downloaded from the Protein Data Bank (PDB code: 1UNL).
Software needed to complete this tutorial:
The pdb file 1UNL contains the p25 activator, the CDK5 protein, and the drug (R)-roscovitine. We need to prepare two separate topologies: (1) the protein CDK5 topology using GROMACS pdb2gmx and (2) the ligand topology using AmberTools.
First, we need to extract just the chain A coordinates:
And we will also extract the (R)-roscovitine coordinates:
To generate the protein topology file we use the GROMACS pdb2gmx tool
When prompted, choose (6) AMBE99SB-ILDN for the force field, and (1) TIP3P for the water model. The file cdk5.gro contains the coordinates (with hydrogens added) of the CDK5 protein, and the file cdk5.top contains the topology and force field parameters for the protein and water molecules. Now we must add the ligand, but the parameters for the ligand are not included in the protein force fields. We must add these parameters ourselves. In this tutorial, we will use the Generalized Amber Force Fields (GAFF) for small molecules.
Make sure your AMBER_PREFIX is set by typing in the termimial:
where /path/to/amber18 is the path to ambertools on your machine. First, we use the reduce command to add hydrogens to the pdb file for (R)-roscovitine.
Our next step is to use antechamber to create a mol2 file which is required to build our topology file.
Here the -i signals the input file and -fi specifies that the input file is in pdb format. The -o specifies the name of the output file to write and we use -fo to specify that we want the output to be written in mol2 format. The -c bcc option tells antechamber to use the AM1-BCC semi-empirical charge model to calculate the partial charges on each atom. Normally, we would calculate partial charges from the output of a quantum chemistry calculation using the restrained electrostatic potential (RESP) procedure, but for the purpose of this tutorial the semi-empirical AM1 model is sufficient. The -s 2 defines the verbosity - how much information antechamber will print to the screen during the procedure.
A number of new files are produced by this program. The sqm.xxx files are the input and out files for the sqm quantum chemistry code used by Antechamber to compute the point charges on each atom. Check to make sure the procedure completed with no errors by looking at the end of the sqm.out file. The last line should read
The output mol2 file, roscovitine_h.mol2, contains the definition of our drug including all of the charges and atom types. The GAFF parameters are located in $AMBERHOME/dat/leap/parm/gaff.dat. At this stage we want to check to make sure there are no missing parameters in our molecule. We can use the utility parmchk to test if all the parameters required are available
The above command produces the output roscovitine_h.frcmod file which is a parameter file that we can later load to add missing parameters. It will contain all of the missing parameters. Antechamber will try and fill in these missing parameters by analogy to similar structures. Looking at our frcmod file we see that 8 improper angles had missing parameters. For the sake of this tutorial, we will assume that the parameters Antechamber has suggested for us are acceptable. If you see any parameters listed with the comment “ATTN: NEEDS REVISION” then it means that Antechamber could not determine suitable parameters and so you must manually provide these before you can proceed with the simulation.
Next we load the amber software LEaP and load in the GAFF force field parameters.
followed by
Now we can load our drug molecule (roscovitine_h.mol2)
We now need to load our frcmod file in order to tell tleap the how to treat the missing parameters.
We can save a library file for the (R)-roscovitine molecule (roscovitine.lib) and a topology file prmtop and inpcrd file (roscovitine.prmtop, roscovitine.inpcrd)
Now we can exit tleap
We now have the Amber files roscovitine.prmtop and roscovitine.inpcrd which contain the force field parameters and input coordinates in Amber format. We now need to convert these into a GROMACS topology and structure file using the Python interface ACPYPE
You should now have the Gromacs files RRC_GMX.gro and RRC_GMX.top which contain the GROMACS topology and coordinates for (R)-roscovitine
Now we have our protein files cdk5.gro and cdk5.top and the ligand files RRC_GMX.gro and RRC_GMX.top. In order to have one set of GROMACS files we can combine them with the python script ligand2gro.py.
The output files will be a GROMACS structure file for both the ligand and protein called complex.gro and a topology file complex.top containing the force field parameters.
You can now create a simulations box and add solvent molecules
We also add ions as before using the ions.mdp file:
Finally, we perform an energy minimization with em.mdp:
At the end of this run we now have an energy minimized structure of the protein/ligand complex called em.gro. From this structure file and with our topology file complex.top we can continue performing an NVT equilibration simulation as described in Basics of a GROMACS simulation.