Blog
Protein-Ligand MD Simulation in GROMACS: A Step-by-Step Tutorial for Beginners
- June 27, 2026
- Posted by: Stem Skills Lab
- Category: Molecular Modeling

To run a protein-ligand MD simulation in GROMACS, you first generate a force-field topology for the ligand (with CGenFF for CHARMM36 or ACPYPE for AMBER/GAFF), merge it into the protein topology, build a single complex coordinate file, then solvate, add ions, energy minimize, equilibrate with position restraints under NVT and NPT, and finally run unrestrained production MD. The extra step versus a plain protein run is the ligand topology.
You have a docked pose from AutoDock Vina and a binding score, but a score is a static snapshot. The question your supervisor will ask next is whether the ligand actually stays put when the protein moves in water at body temperature. Molecular dynamics answers that. This guide walks through simulating a protein-ligand complex from start to finish using the T4 lysozyme L99A protein with the JZ4 ligand (PDB entry 3HTB), the same system used in the most widely followed GROMACS protein-ligand tutorial.
This is a spoke in our learn molecular dynamics with GROMACS series, and it assumes you have already run a plain protein simulation. If you are mapping where MD fits into a research career, see the full computational biology skills roadmap.
What is different about simulating a protein with a ligand?
A standard force field already knows every atom in the 20 amino acids, so gmx pdb2gmx can build a protein topology automatically. A drug-like ligand is not a standard residue, so the force field has no parameters for it. The one new task in a protein-ligand simulation is generating those missing parameters (atom types, charges, bonds, angles, dihedrals) and splicing them into the protein topology. Everything after that, solvation, ions, minimization, equilibration, and production, follows the same pattern as a protein-only run.
You also have to keep the ligand from being kicked out of the pocket during equilibration. The fix is to apply position restraints to both the protein and the ligand while the water and ions relax around them, then release the restraints for the production run.
How do you prepare the protein and ligand files?
Download the complex from the Protein Data Bank and split it into two files. Open the PDB in a text editor or a viewer like PyMOL and separate the protein from the ligand. Strip crystallographic water unless you have a specific reason to keep it, and remove any crystallization additives that are not part of your system.
# keep only protein atoms (ATOM records) for pdb2gmx
grep "^ATOM" 3htb.pdb > protein_only.pdb
# the JZ4 ligand sits in HETATM records
grep "JZ4" 3htb.pdb > jz4.pdbThe ligand from a crystal structure has no hydrogen atoms. You must add them before parameterizing, because the charges depend on the full set of atoms. Add hydrogens with a tool such as Open Babel, then save the ligand as a MOL2 file:
obabel jz4.pdb -O jz4.mol2 -hCheck the result. Open Babel can occasionally assign the wrong bond order or protonation state, so confirm the structure matches the chemistry you expect before moving on.
How do you generate the ligand topology with CGenFF?
The CHARMM General Force Field (CGenFF) extends CHARMM36 to drug-like organic molecules. Upload your ligand MOL2 file to the CGenFF server, which returns a stream file (.str) containing atom types, charges, and a penalty score for each assignment. The CGenFF method was published by Vanommeslaeghe and colleagues in the Journal of Computational Chemistry (DOI: 10.1002/jcc.21367), and the paper is worth reading because it explains what the penalty values mean.
High penalty scores are a warning. As the CGenFF documentation states, charge penalties above roughly 50 and parameter penalties above 50 indicate analogy that is poor enough to warrant manual validation or dedicated parameterization. For a first project, pick a ligand whose penalties stay low so you can trust the automated assignment.
Convert the stream file into GROMACS format with the conversion script that ships with the CHARMM36 port for GROMACS:
python cgenff_charmm2gmx_py3_nx2.py JZ4 jz4.mol2 jz4.str charmm36-jul2022.ffThis produces jz4.itp (the ligand topology), jz4.prm (any new bonded parameters), and a GROMACS coordinate file for the ligand. Download the CHARMM36 force field port for GROMACS from the MacKerell lab force field page and unpack it into your working directory.
Can you parameterize the ligand without CGenFF?
Yes. If you prefer the AMBER family of force fields, use ACPYPE, which wraps Antechamber and assigns the General AMBER Force Field (GAFF) with AM1-BCC charges. ACPYPE was described by Sousa da Silva and Vranken in BMC Research Notes (DOI: 10.1186/1756-0500-5-367) and outputs ready-to-use GROMACS files in one command:
acpype -i jz4.mol2 -b JZ4The trade-off is consistency: your ligand force field must match your protein force field family. Use CGenFF with CHARMM36, or GAFF with an AMBER protein force field such as ff14SB. Do not mix a CHARMM protein with a GAFF ligand.
| Step | CGenFF route (CHARMM36) | ACPYPE route (AMBER/GAFF) |
|---|---|---|
| Protein force field | CHARMM36 | AMBER ff14SB or ff19SB |
| Ligand charges | CGenFF analogy charges | AM1-BCC (via Antechamber) |
| Input you provide | MOL2 uploaded to CGenFF server | MOL2 or PDB to local ACPYPE |
| Runs offline | No (server step) | Yes |
| Quality flag to watch | Charge and parameter penalties | Antechamber warnings on net charge |
How do you build the protein topology and merge the ligand?
Generate the protein topology with the CHARMM36 force field. When prompted, choose CHARMM36 and the TIP3P water model:
gmx pdb2gmx -f protein_only.pdb -o protein_processed.gro -water tip3pNow combine the protein and ligand coordinates into one file. Open protein_processed.gro, copy the ligand atom lines from the ligand .gro produced by the conversion script, paste them in just before the final box-vector line, and increase the atom count on the second line of the file by the number of ligand atoms. Save this as complex.gro.
Then edit topol.top in three places. Add an include for the ligand parameters near the top, just after the force field include line:
; Include ligand parameters
#include "jz4.prm"Add an include for the ligand topology after the protein topology include:
; Include ligand topology
#include "jz4.itp"Finally, list the ligand in the [ molecules ] section at the very bottom:
[ molecules ]
; Compound #mols
Protein_chain_A 1
JZ4 1Order matters here. The molecules must be listed in the same order their coordinates appear in complex.gro.
Want the guided, hands-on version?
Our live Molecular Modeling & MD Simulations cohort bootcamp takes you from zero to running real docking and MD workflows, with a portfolio project for your grad-school applications.
How do you solvate the complex and add ions?
Define a simulation box around the complex, leaving at least 1.0 nm between the protein and the box edge so it never sees its own periodic image:
gmx editconf -f complex.gro -o newbox.gro -bt dodecahedron -d 1.0Fill the box with water:
gmx solvate -cp newbox.gro -cs spc216.gro -p topol.top -o solv.groNeutralize the system and add physiological salt. First assemble a run input file using an ions parameter file, then run genion:
gmx grompp -f ions.mdp -c solv.gro -p topol.top -o ions.tpr -maxwarn 1
gmx genion -s ions.tpr -o solv_ions.gro -p topol.top -pname NA -nname CL -neutralWhen prompted for a continuous group to replace with ions, choose SOL. The -neutral flag adds just enough counter-ions to bring the net charge to zero.
How do you restrain the ligand during equilibration?
The protein already has position restraints from pdb2gmx. You need to create matching restraints for the ligand so it does not drift while the solvent equilibrates. Make an index group for the ligand heavy atoms, then generate the restraint file:
gmx make_ndx -f jz4.gro -o index_jz4.ndx
# at the prompt, type: 0 & ! a H* then q
gmx genrestr -f jz4.gro -n index_jz4.ndx -o posre_jz4.itp -fc 1000 1000 1000Reference this restraint file in topol.top right after the ligand topology include, wrapped so it activates only when position restraints are switched on:
; Ligand position restraints
#ifdef POSRES_LIG
#include "posre_jz4.itp"
#endifYou will pass -DPOSRES -DPOSRES_LIG through the .mdp files during equilibration, then drop them for production.
How do you run minimization, equilibration, and production?
Energy minimization removes bad contacts introduced during setup:
gmx grompp -f minim.mdp -c solv_ions.gro -p topol.top -o em.tpr
gmx mdrun -v -deffnm emFor temperature coupling, group the protein and ligand together and the solvent together. Create the combined groups with make_ndx on the full system so your .mdp tc-grps can reference Protein_JZ4 and Water_and_ions:
gmx make_ndx -f em.gro -o index.ndx
# type: 1 | 13 (protein OR ligand) then qRun NVT equilibration (constant volume and temperature) with restraints on, followed by NPT (constant pressure) to settle the density:
gmx grompp -f nvt.mdp -c em.gro -r em.gro -p topol.top -n index.ndx -o nvt.tpr
gmx mdrun -deffnm nvt
gmx grompp -f npt.mdp -c nvt.gro -r nvt.gro -t nvt.cpt -p topol.top -n index.ndx -o npt.tpr
gmx mdrun -deffnm nptThe -r flag supplies the reference coordinates for the position restraints. Finally, launch the unrestrained production run:
gmx grompp -f md.mdp -c npt.gro -t npt.cpt -p topol.top -n index.ndx -o md_0_10.tpr
gmx mdrun -deffnm md_0_10The complete worked example, including every .mdp file, is Justin Lemkul’s protein-ligand tutorial, part of the peer-reviewed suite published in the Living Journal of Computational Molecular Science (DOI: 10.33011/livecoms.1.1.5068). Use those parameter files rather than writing your own from memory.
How do you check the ligand stayed bound?
After production, correct periodic boundaries with gmx trjconv -pbc mol -center, then measure two things. Compute the protein backbone RMSD to confirm the simulation equilibrated, and compute the RMSD of the ligand alone, fitting on the protein, to see whether the pose held. A ligand RMSD that stays low and flat means the docked pose is stable in dynamics; a ligand RMSD that jumps and climbs means the ligand rearranged or left the pocket. The exact commands are covered in our guide on analyzing RMSD and RMSF from a GROMACS trajectory.
Troubleshooting common protein-ligand setup errors
- “Atomtype X not found” during grompp. The ligand parameter include is missing or in the wrong place. Make sure
#include "jz4.prm"appears after the force field line and before any molecule definitions. - “Number of coordinates in coordinate file does not match topology.” The atom count on line two of
complex.growas not updated, or the ligand is listed in the wrong order in[ molecules ]. Recount and match the order to the coordinate file. - grompp warns about a non-zero total charge. You skipped or under-counted ions. Rerun
gmx genionwith-neutral, and read the note grompp prints about the system charge. - The ligand drifts out of the pocket during NVT. Ligand position restraints are not active. Confirm
posre_jz4.itpis included under the right#ifdefand that your equilibration .mdp files define bothPOSRESandPOSRES_LIG. - High CGenFF penalties. The automated parameters may be unreliable for your molecule. Choose a simpler ligand for a first project, or budget time for manual parameter validation against the CGenFF paper’s guidance.
Frequently asked questions
Do I have to start from a docked pose?
For a binding study, yes. The simulation tests whether a proposed pose is stable, so it must begin from a sensible starting position, usually the top pose from docking or an experimental co-crystal structure. MD relaxes and validates the pose; it does not search for it the way docking does.
CGenFF or ACPYPE: which should a beginner use?
Use CGenFF if your protein is set up with CHARMM36, and ACPYPE if you are using an AMBER force field. CGenFF needs a free server upload but is well documented for the standard GROMACS tutorial; ACPYPE runs entirely offline. The key rule is to keep the protein and ligand in the same force field family.
How long should a protein-ligand production run be?
For a student project, runs of 50 to 100 ns are common and usually enough to see whether the pose is stable, though the right length depends on the system. Report the length you ran and judge stability from the shape of the RMSD curve rather than a fixed cutoff.
Why restrain the ligand and not just the protein?
During equilibration the water and ions are still settling, and an unrestrained ligand can be pushed out of a shallow pocket before the system stabilizes. Restraining the ligand holds the starting pose until the solvent is equilibrated, after which you release it for the production run.
Can I run this on a laptop?
A short protein-ligand run is feasible on a laptop but slow. A free GPU runtime on Google Colab is far faster, and our companion guide on running GROMACS on Google Colab shows how to set that up without installing anything locally.
Where this fits in your MD workflow
A stable protein-ligand trajectory is the foundation for the analyses that make a thesis chapter: ligand RMSD, hydrogen-bond occupancy between ligand and pocket residues, contact maps, and ultimately binding free energy estimates with MM-PBSA or MM-GBSA. Get the topology and equilibration right first, because every downstream number depends on a trajectory you set up correctly.
For the simulation engine itself, the reference is Abraham et al., “GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers,” SoftwareX 1-2 (2015) 19-25 (DOI: 10.1016/j.softx.2015.06.001).
Written by the StemSkills Lab team, with 10+ years in sequence and structural bioinformatics, drug discovery and design, and multiscale molecular modeling.
Want the guided, hands-on version?
Our live Molecular Modeling & MD Simulations cohort bootcamp takes you from zero to running real docking and MD workflows, with a portfolio project for your grad-school applications.