Your First GROMACS MD Simulation: A Step-by-Step Tutorial for Beginners
- June 26, 2026
- Posted by: Stem Skills Lab
- Category: Molecular Modeling

To run your first GROMACS MD simulation, you take a protein PDB file and move it through eight stages: build the topology with gmx pdb2gmx, define a box with gmx editconf, add water with gmx solvate, neutralize with gmx genion, energy-minimize, then equilibrate under NVT and NPT before the production run. The classic worked example is lysozyme in water, and it finishes with a real trajectory.
A molecular dynamics simulation sounds intimidating until you realize it is a fixed recipe. Every protein-in-water run follows the same eight steps in the same order, and after you have worked through it once on a small protein, you can repeat it for almost any system. This tutorial walks you through that recipe end to end using hen egg-white lysozyme, the structure that the standard GROMACS tutorial uses, so you finish with a production trajectory you can analyze and cite in your thesis or CV.
This is a core spoke in our pillar guide on how to learn molecular dynamics with GROMACS, and it sits on the wider computational biology skills roadmap. If GROMACS is not installed yet, start with our GROMACS installation guide first, then come back here.
What is a molecular dynamics simulation, and what will you build?
Molecular dynamics (MD) computes how every atom in a system moves over time by solving Newton’s equations of motion in tiny steps, typically two femtoseconds each. A force field supplies the equations that describe bonds, angles, and the electrostatic and van der Waals forces between atoms. Run enough steps and you get a trajectory: a movie of the protein breathing, wobbling, and settling into stable conformations.
In this tutorial you will simulate lysozyme, a 129-residue enzyme, surrounded by explicit water and a few ions, then run 1 nanosecond of dynamics. The reference workflow is Justin Lemkul’s widely used GROMACS tutorial, published as “From Proteins to Perturbed Hamiltonians: A Suite of Tutorials for the GROMACS-2018 Molecular Simulation Package” in the Living Journal of Computational Molecular Science (2019). We follow the same canonical commands here, with explanations of what each one does and why.
What do you need before you start?
You need three things. First, a working GROMACS install with the gmx command on your path (confirm with gmx --version). Second, a protein structure: download PDB entry 1AKI (hen egg-white lysozyme) from the RCSB Protein Data Bank. Third, the set of parameter files (the .mdp files) that control each stage; these come with the Lemkul tutorial and you should download them rather than type them by hand.
Before anything else, strip the crystallographic water out of the PDB file so GROMACS can add its own solvent cleanly:
grep -v HOH 1aki.pdb > 1AKI_clean.pdb
That single line removes every line containing the HOH water residue and writes a clean structure. Work in a fresh, empty folder, because this process generates a dozen intermediate files and you do not want them mixed with anything else.
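To see what that filter does before trusting it on a real structure, the sketch below applies it to a tiny stand-in file. The file name demo.pdb and all coordinates are invented for illustration and are not taken from 1AKI.

```shell
# Work in a throwaway directory so nothing real is overwritten.
workdir=$(mktemp -d)

# Hypothetical mini-PDB: two protein atoms plus two crystallographic waters.
cat > "$workdir/demo.pdb" <<'EOF'
ATOM      1  N   LYS A   1       3.287  10.092  10.329  1.00  5.89           N
ATOM      2  CA  LYS A   1       2.445  10.457   9.182  1.00  8.16           C
HETATM 1002  O   HOH A 130      -1.204  11.532   7.781  1.00 22.16           O
HETATM 1003  O   HOH A 131      10.432  -4.010   3.333  1.00 25.00           O
END
EOF

# Same filter as in the tutorial: drop every line that contains "HOH".
grep -v HOH "$workdir/demo.pdb" > "$workdir/demo_clean.pdb"

# Count water lines before and after. grep -c exits non-zero when it finds
# zero matches, hence the "|| true" guard.
waters_before=$(grep -c HOH "$workdir/demo.pdb" || true)
waters_after=$(grep -c HOH "$workdir/demo_clean.pdb" || true)
echo "water lines before: $waters_before, after: $waters_after"
```

Because grep matches the string anywhere on a line, any header line that happens to mention HOH is dropped as well. The atom records of the protein are untouched.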
How do you prepare the protein topology with pdb2gmx?
The first real GROMACS step converts the raw structure into a topology: a description of every atom, bond, and charge that the force field understands. Run:
gmx pdb2gmx -f 1AKI_clean.pdb -o 1AKI_processed.gro -water spce
GROMACS will list the available force fields and ask you to choose one. For this tutorial, pick the OPLS-AA/L all-atom force field. The -water spce flag selects the SPC/E water model, a common choice paired with OPLS. This command produces three files: 1AKI_processed.gro (the structure in GROMACS format), topol.top (the master topology), and posre.itp (position restraints used later in equilibration). The force field is the single most important scientific choice you make, so when you move beyond this tutorial, read about which force field suits your molecule type.
How do you build the box and add water and ions?
A protein in a vacuum is not realistic. You need to place it in a box and fill that box with water, then add ions to neutralize the net charge. This takes three commands.
Define a cubic box that keeps the protein at least 1.0 nm from every edge, so it never sees its own periodic image:
gmx editconf -f 1AKI_processed.gro -o 1AKI_newbox.gro -c -d 1.0 -bt cubic
Fill the box with SPC/E water:
gmx solvate -cp 1AKI_newbox.gro -cs spc216.gro -o 1AKI_solv.gro -p topol.top
Lysozyme carries a net positive charge, so the system needs chloride ions to balance it. Ion addition is a two-step process: first assemble a run input file with gmx grompp using the ions.mdp parameter file, then run gmx genion:
gmx grompp -f ions.mdp -c 1AKI_solv.gro -p topol.top -o ions.tpr
gmx genion -s ions.tpr -o 1AKI_solv_ions.gro -p topol.top -pname NA -nname CL -neutral
When genion asks which group to replace with ions, choose the SOL (solvent) group so it swaps water molecules, never protein atoms. The -neutral flag adds just enough sodium or chloride to bring the total charge to zero. You now have a solvated, neutral system ready for energy minimization.
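One quick sanity check after genion is to read the [ molecules ] section at the end of topol.top, which should now list the ions alongside the protein and water. The sketch below runs that check on a made-up topology fragment. The file name topol_demo.top, the molecule names, and every count in it are placeholders, not the numbers your run will produce.

```shell
workdir=$(mktemp -d)

# Hypothetical tail of a topology after solvate and genion (counts invented).
cat > "$workdir/topol_demo.top" <<'EOF'
[ system ]
; Name
LYSOZYME in water

[ molecules ]
; Compound        #mols
Protein_chain_A     1
SOL             10000
CL                  8
EOF

# Print "name count" pairs found after the [ molecules ] header,
# skipping comment lines that start with ';'.
molecules=$(awk '
  /^\[ *molecules *\]/ { in_block = 1; next }
  /^\[/                { in_block = 0 }
  in_block && !/^;/ && NF == 2 { print $1, $2 }
' "$workdir/topol_demo.top")
echo "$molecules"

# Pull out the chloride count specifically.
cl_count=$(echo "$molecules" | awk '$1 == "CL" { print $2 }')
echo "chloride ions listed: $cl_count"
```

If the ion line is missing from your own topol.top, the -p topol.top flag was probably left off the genion call, and the structure and topology no longer agree.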
How do you energy-minimize the system?
Adding water and ions can leave atoms too close together, which would make the simulation explode on the first step. Energy minimization relaxes those clashes before any dynamics begin. As Lemkul’s GROMACS tutorial puts it, the goal is to ensure “the system has no steric clashes or inappropriate geometry.” Assemble the input and run it:
gmx grompp -f minim.mdp -c 1AKI_solv_ions.gro -p topol.top -o em.tpr
gmx mdrun -v -deffnm em
The -deffnm em flag tells mdrun to use em as the base name for every output file (em.gro, em.log, em.edr). Minimization is converged when the maximum force (Fmax) drops below the target set in the parameter file, commonly 1000 kJ/mol/nm. Check the end of em.log for a line reporting the final maximum force and potential energy; a negative, stable potential energy means you are ready to move on.
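The convergence check described above can be scripted. The following is a minimal sketch that assumes the summary lines at the end of the log look like the sample below; the file em_demo.log and all numbers in it are invented, and the 1000 kJ/mol/nm target is assumed to match the emtol value in your minim.mdp.

```shell
workdir=$(mktemp -d)

# Hypothetical end-of-log summary (values invented for illustration).
cat > "$workdir/em_demo.log" <<'EOF'
Steepest Descents converged to Fmax < 1000 in 800 steps
Potential Energy  = -5.9000000e+05
Maximum force     =  9.5000000e+02 on atom 700
Norm of force     =  2.2000000e+01
EOF

fmax_target=1000   # assumed emtol from minim.mdp, in kJ/mol/nm

# Take the text after '=' and add 0 so awk keeps only the leading number.
epot=$(awk -F'=' '/^Potential Energy/ { print $2 + 0 }' "$workdir/em_demo.log")
fmax=$(awk -F'=' '/^Maximum force/ { print $2 + 0 }' "$workdir/em_demo.log")

# Converged here means: Fmax under the target and a negative potential energy.
verdict=$(awk -v f="$fmax" -v t="$fmax_target" -v e="$epot" \
  'BEGIN { print ((f < t && e < 0) ? "converged" : "check again") }')
echo "Epot=$epot kJ/mol, Fmax=$fmax kJ/mol/nm -> $verdict"
```

To use it on a real run, point the two awk calls at em.log instead of the demo file.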
Want the guided, hands-on version?
Our live Molecular Modeling & MD Simulations cohort bootcamp takes you from zero to running real docking and MD workflows, with a portfolio project for your grad-school applications.
How do you equilibrate with NVT and NPT?
Before the real simulation, the solvent needs to settle around the protein at the right temperature and pressure. This happens in two phases, and during both you keep position restraints on the protein (that is what posre.itp was for) so the solvent and ions can relax around it without distorting the protein structure.
NVT equilibration holds the number of particles, volume, and temperature constant while the system reaches your target temperature, usually 300 K:
gmx grompp -f nvt.mdp -c em.gro -r em.gro -p topol.top -o nvt.tpr
gmx mdrun -deffnm nvt
NPT equilibration then stabilizes the pressure and density, letting the box size adjust. Note that it reads the checkpoint file nvt.cpt so the velocities carry over:
gmx grompp -f npt.mdp -c nvt.gro -r nvt.gro -t nvt.cpt -p topol.top -o npt.tpr
gmx mdrun -deffnm npt
Each equilibration phase is typically 100 picoseconds in this tutorial. You can verify temperature and pressure converged by extracting them with gmx energy from the .edr files, but for a first run, a completed NPT step that reports a stable density near 1000 kg/m³ is the green light to start production.
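Once gmx energy has written a term to an .xvg file, the average is easy to compute with awk. The sketch below assumes the usual .xvg layout, where header lines start with # or @ and data rows hold time in the first column and the value in the second. The file temperature_demo.xvg and its three data points are invented for illustration.

```shell
workdir=$(mktemp -d)

# Hypothetical .xvg excerpt: time (ps) and temperature (K), values invented.
cat > "$workdir/temperature_demo.xvg" <<'EOF'
# This file was created by gmx energy
@    title "GROMACS Energies"
@    xaxis  label "Time (ps)"
@    yaxis  label "(K)"
    0.000000  298.5
    1.000000  301.2
    2.000000  300.3
EOF

# Skip header lines (# and @), then average the second column.
avg_temp=$(awk '!/^[#@]/ && NF >= 2 { sum += $2; n++ }
                END { if (n > 0) printf "%.1f", sum / n }' \
           "$workdir/temperature_demo.xvg")
echo "average temperature: $avg_temp K"
```

The same one-liner works for pressure or density files, since only the second column changes meaning.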
How do you run the production MD simulation?
This is the step that produces your actual data. With the protein equilibrated and restraints released, you run free dynamics. Build the input from the NPT output and launch:
gmx grompp -f md.mdp -c npt.gro -t npt.cpt -p topol.top -o md_0_1.tpr
gmx mdrun -deffnm md_0_1
The md.mdp file in the tutorial sets a 2 femtosecond timestep (dt = 0.002) and 500,000 steps, which is 1 nanosecond of simulation. On a CPU-only laptop this can take several hours; with a GPU it finishes far faster, because GROMACS offloads the heavy nonbonded calculations to the card. When it completes you have md_0_1.xtc (the trajectory) and md_0_1.tpr (the run input), which are exactly the two files you feed into analysis.
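The run length follows directly from the timestep and the step count. This short sketch reproduces that arithmetic and then inverts it for a longer run; the 10 ns target is a hypothetical example, not part of the tutorial.

```shell
dt_ps=0.002      # dt in md.mdp, in picoseconds (2 fs)
nsteps=500000    # number of steps set in md.mdp

# Simulated time = timestep x number of steps.
total_ps=$(awk -v dt="$dt_ps" -v n="$nsteps" 'BEGIN { printf "%.0f", dt * n }')
total_ns=$(awk -v ps="$total_ps" 'BEGIN { printf "%.1f", ps / 1000 }')
echo "$nsteps steps x $dt_ps ps = $total_ps ps = $total_ns ns"

# Steps needed for a longer run at the same timestep (hypothetical target).
target_ns=10
steps_needed=$(awk -v ns="$target_ns" -v dt="$dt_ps" \
  'BEGIN { printf "%.0f", ns * 1000 / dt }')
echo "$target_ns ns at $dt_ps ps per step needs $steps_needed steps"
```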
Your next move is to turn that trajectory into results. Start with our step-by-step guide on how to analyze RMSD and RMSF from a GROMACS trajectory, the first two checks every thesis committee expects.
The full GROMACS MD pipeline at a glance
Here is the entire workflow in order, with the command and the file each stage produces. Bookmark this table as your checklist for every future simulation.
| Stage | GROMACS command | What it does | Key output |
|---|---|---|---|
| 1. Topology | gmx pdb2gmx | Builds topology and applies the force field | topol.top |
| 2. Box | gmx editconf | Defines the simulation box | 1AKI_newbox.gro |
| 3. Solvate | gmx solvate | Fills the box with water | 1AKI_solv.gro |
| 4. Add ions | gmx genion | Neutralizes net charge | 1AKI_solv_ions.gro |
| 5. Minimize | gmx mdrun -deffnm em | Removes steric clashes | em.gro |
| 6. NVT | gmx mdrun -deffnm nvt | Equilibrates temperature | nvt.gro |
| 7. NPT | gmx mdrun -deffnm npt | Equilibrates pressure and density | npt.gro |
| 8. Production | gmx mdrun -deffnm md_0_1 | Runs free dynamics | md_0_1.xtc |
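The table maps onto a single script. The version below is a sketch, not a definitive driver: the run and pipeline helpers and the DRY_RUN switch are names introduced here for illustration. By default it only prints the commands, so you can read the sequence without GROMACS installed. Two choices differ from the interactive walkthrough: -ff oplsaa preselects the OPLS-AA/L force field instead of answering the menu, and SOL is fed to genion on standard input.

```shell
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 to call gmx for real (needs GROMACS and the tutorial .mdp files).
DRY_RUN=${DRY_RUN:-1}

# Hypothetical helper: print the command in dry-run mode, otherwise execute it
# and stop at the first failure.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@" || { echo "failed: $*" >&2; exit 1; }
  fi
}

pipeline() {
  # 1-4: topology, box, water, ions
  run gmx pdb2gmx -f 1AKI_clean.pdb -o 1AKI_processed.gro -water spce -ff oplsaa
  run gmx editconf -f 1AKI_processed.gro -o 1AKI_newbox.gro -c -d 1.0 -bt cubic
  run gmx solvate -cp 1AKI_newbox.gro -cs spc216.gro -o 1AKI_solv.gro -p topol.top
  run gmx grompp -f ions.mdp -c 1AKI_solv.gro -p topol.top -o ions.tpr
  # genion reads the group to replace from standard input.
  run gmx genion -s ions.tpr -o 1AKI_solv_ions.gro -p topol.top \
      -pname NA -nname CL -neutral <<EOF
SOL
EOF

  # 5: energy minimization
  run gmx grompp -f minim.mdp -c 1AKI_solv_ions.gro -p topol.top -o em.tpr
  run gmx mdrun -v -deffnm em

  # 6-7: NVT then NPT equilibration, with position restraints
  run gmx grompp -f nvt.mdp -c em.gro -r em.gro -p topol.top -o nvt.tpr
  run gmx mdrun -deffnm nvt
  run gmx grompp -f npt.mdp -c nvt.gro -r nvt.gro -t nvt.cpt -p topol.top -o npt.tpr
  run gmx mdrun -deffnm npt

  # 8: production run
  run gmx grompp -f md.mdp -c npt.gro -t npt.cpt -p topol.top -o md_0_1.tpr
  run gmx mdrun -deffnm md_0_1
}

pipeline
```

Running it as is lists the thirteen gmx calls behind the eight stages. Inspect the output of each stage by hand at least once before letting a script run the whole sequence unattended.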
What are the common errors, and how do you fix them?
Three failures trip up nearly every beginner. First, “Residue ‘XXX’ not found in residue topology database” from pdb2gmx usually means your PDB still has heteroatoms, ligands, or unusual residues the force field does not recognize; clean the file or pick a force field that covers them. Second, “number of coordinates in coordinate file does not match topology” almost always means you skipped updating topol.top during solvate or genion, or you mixed files from different stages; rerun the stage so the structure and topology agree. Third, a minimization or run that crashes with “LINCS warning” or segmentation faults points to bad starting geometry, so confirm energy minimization actually converged before moving to NVT. When in doubt, the official GROMACS common errors documentation explains each message in detail.
Frequently asked questions
How long does a first GROMACS MD simulation take?
The 1 nanosecond lysozyme production run takes a few hours on a typical CPU-only laptop and minutes to under an hour on a machine with a supported GPU. The setup steps (topology through equilibration) run quickly; the production step is the time-consuming one.
Do I need a GPU to run this tutorial?
No. GROMACS can run entirely on CPU, and this tutorial completes without a GPU. A GPU only shortens the run time, most noticeably for the production run. If your laptop is slow, you can run the same workflow on a free Google Colab GPU instead.
Why lysozyme and the 1AKI structure?
Lysozyme is small (129 residues), well-behaved, and has no cofactors or unusual residues, which keeps the setup clean. It is the structure used in the standard GROMACS tutorial, so you can cross-check every step against a trusted reference.
What is the difference between NVT and NPT equilibration?
NVT holds volume and temperature fixed to bring the system to your target temperature. NPT then allows the box volume to change so pressure and density stabilize. You run NVT first, then NPT, before production.
What do I do with the trajectory once the run finishes?
You analyze it. The standard first analyses are backbone RMSD (is the structure stable?) and per-residue RMSF (which regions are flexible?), both covered in our dedicated RMSD and RMSF guide.
Run this once and the recipe stops being abstract. Every protein MD project you tackle afterward, including ligand complexes and membrane systems, is a variation on these same eight stages.
Written by the StemSkills Lab team, computational scientists with 10+ years in sequence and structural bioinformatics, drug discovery, and multiscale molecular modeling.