How to Interpret Molecular Docking Results: Binding Affinity and Scores Explained

June 16, 2026
Posted by: Stem Skills Lab
Category: Molecular Modeling

A molecular docking score is a predicted binding affinity in kilocalories per mole (kcal/mol), and a more negative value means stronger predicted binding. Use it to rank poses and ligands within one consistent setup, not as an exact experimental energy. The real test of whether you can trust a result is re-docking a known ligand and checking how closely the predicted pose reproduces the crystal structure.

You have run your first docking job and the software hands back a table of negative numbers. Now what? Knowing how to interpret molecular docking results, separating a meaningful prediction from numerical noise, is the skill that turns a finished run into an actual conclusion. This guide explains what each number means, how to compare them, and how to prove to yourself that a result is real.

It assumes you have already produced some output, so it pairs directly with our AutoDock Vina tutorial for beginners and our PyRx tutorial for molecular docking. It sits inside our pillar guide on how to learn molecular docking, a key stop on the computational biology skills roadmap.

What does the binding affinity score in kcal/mol mean?

The binding affinity is the docking program’s estimate of how favourable the interaction between your ligand and the receptor is, expressed as a free energy of binding in kilocalories per mole (kcal/mol). It is calculated by a scoring function, a mathematical model that adds up contributions like hydrogen bonds, hydrophobic contacts, and steric clashes for a given pose, then returns a single number.

The scoring function in the most widely used free docking tool, AutoDock Vina, is described in its canonical paper by Oleg Trott and Arthur J. Olson, “AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading” (Journal of Computational Chemistry, vol. 31, no. 2, 2010, pp. 455-461). The authors report that Vina “achieves an approximately two orders of magnitude speed-up” compared with AutoDock 4 while also improving the accuracy of binding-mode predictions, which is why it became the default engine for student projects and large virtual screens alike.

The crucial caveat: this number is a prediction, not a measurement. It is calibrated to approximate experimental binding energies, but it is not the same thing as a measured dissociation constant. Treat it as a ranking signal, not a lab result.

Why is a more negative score stronger?

In thermodynamics, a process that proceeds spontaneously has a negative change in free energy, and the more negative that change, the more strongly the bound state is favoured. The scoring function follows that convention: the more favourable the predicted interaction, the more negative the number. So a pose scoring -9.2 kcal/mol is predicted to bind more strongly than one scoring -6.1 kcal/mol.

This is the single most common point of confusion for beginners. You are not looking for the biggest number; you are looking for the most negative one. When the software sorts results, the best pose sits at the top with the lowest (most negative) value.
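
To get a feel for what a gap between two scores would mean if the numbers were true free energies, the standard relation ΔG = RT ln(Kd) can be applied. The sketch below is illustrative only: it assumes a temperature of 298.15 K, and it treats a docking score as if it were an exact ΔG, which it is not. The function name is made up for this example.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def kd_from_delta_g(delta_g_kcal: float, temperature_k: float = 298.15) -> float:
    """Dissociation constant (molar) implied by a binding free energy.

    Uses dG = RT ln(Kd). Assumption: the input is treated as a true free
    energy; a docking score is only a rough prediction of one.
    """
    return math.exp(delta_g_kcal / (R_KCAL * temperature_k))


# The two example scores from the text above.
for score in (-6.1, -9.2):
    print(f"{score:5.1f} kcal/mol -> implied Kd ~ {kd_from_delta_g(score):.1e} M")
```

At 25 °C each 1.36 kcal/mol or so corresponds to a tenfold change in the implied Kd, which is why small score differences between ligands should not be over-read.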

How do I read the AutoDock Vina output table?

When Vina finishes, it prints a ranked table of binding modes. Each row is one candidate pose, sorted from best to worst. There are four columns to understand:

Column	What it reports	How to use it
mode	The rank of the pose (1 is the best-scoring).	Pose 1 is your primary prediction; lower-ranked modes are alternatives.
affinity (kcal/mol)	The predicted binding free energy for that pose.	More negative = stronger. This is the headline number.
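rmsd l.b.	Lower-bound RMSD of this pose relative to the top pose. Each atom is matched to the nearest atom of the same element type, so symmetry-equivalent arrangements are not penalised.	Measures how different this pose is from mode 1, not from any experiment.
rmsd u.b.	Upper-bound RMSD of this pose relative to the top pose. Each atom is matched to itself, ignoring symmetry.	A second, stricter way of measuring that same difference.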

The most important thing to internalise here is what the RMSD columns are not: they do not tell you whether the docking is correct. Both RMSD values are measured against the best pose in the same run, so they describe how spread out your poses are, not how close any of them is to reality. To answer the “is it correct?” question you need a separate control, covered below.
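
If you would rather work with the table programmatically than read it by eye, a few lines of Python are enough. This is a minimal sketch, not an official parser: the sample text is invented for illustration, and the layout is assumed to match the results table Vina prints to the terminal, which may differ slightly between versions. The names `Mode`, `ROW`, and `parse_vina_table` are hypothetical.

```python
import re
from typing import NamedTuple


class Mode(NamedTuple):
    mode: int        # rank of the pose, 1 is best
    affinity: float  # predicted binding free energy, kcal/mol
    rmsd_lb: float   # lower-bound RMSD vs. mode 1, in angstroms
    rmsd_ub: float   # upper-bound RMSD vs. mode 1, in angstroms


# A data row is assumed to be: integer, then three decimal numbers.
ROW = re.compile(r"^\s*(\d+)\s+(-?\d+\.\d+)\s+(\d+\.\d+)\s+(\d+\.\d+)\s*$")


def parse_vina_table(text: str) -> list[Mode]:
    """Collect the data rows of a Vina-style results table, skipping headers."""
    modes = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            modes.append(Mode(int(m[1]), float(m[2]), float(m[3]), float(m[4])))
    return modes


# Invented example output, for illustration only.
SAMPLE = """\
mode |   affinity | dist from best mode
     | (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
   1         -9.2      0.000      0.000
   2         -8.9      1.412      2.037
   3         -7.6      4.880      7.215
"""

modes = parse_vina_table(SAMPLE)
best = min(modes, key=lambda m: m.affinity)  # most negative = best
print(best)
```

Note that the best pose is picked with `min`, not `max`, because the most negative affinity is the strongest predicted binding.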

What counts as a good docking score?

There is no universal cutoff that means “this binds.” A score is only meaningful relative to other scores produced with the same receptor, the same box, and the same settings. That said, beginners want a rough orientation, so here is an honest one:

Around -6 kcal/mol or weaker (less negative): treat with caution, many of these are weak or non-specific.
Roughly -7 to -9 kcal/mol: a typical range for plausible drug-like binders in many systems, but system-dependent.
-10 kcal/mol or stronger (more negative): a strong predicted interaction worth a closer look, and worth double-checking for artefacts.

Use these as orientation, not gospel. The numbers shift with the size of the ligand, the pocket, and the target, so a “-8” against one protein is not directly comparable to a “-8” against another. What is always valid is comparison within one consistent experiment.

Want the guided, hands-on version?

Our live Molecular Modeling & MD Simulations cohort bootcamp takes you from zero to running real docking and MD workflows, with a portfolio project for your grad-school applications.

Join the waitlist (free) →

How do I compare poses and ligands correctly?

There are two different comparisons, and mixing them up is a classic mistake.

Comparing poses of one ligand. Look at mode 1 first, the top-ranked pose. Then check whether the next few modes cluster near it (small RMSD, similar score) or scatter into different parts of the pocket. A tight cluster of similar poses with similar scores is a confident prediction; a wide spread of very different poses with similar scores means the docking is uncertain and you should not over-read the top pose.
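
The cluster-versus-scatter check above can be sketched as a small helper that flags poses scoring almost as well as mode 1 while sitting somewhere else. The thresholds here (1.0 kcal/mol and 2.0 Å) are assumptions chosen for illustration, not values prescribed by Vina, and both sets of numbers are invented.

```python
def competing_modes(modes, energy_window=1.0, rmsd_cutoff=2.0):
    """Return modes that score within `energy_window` kcal/mol of mode 1
    but whose lower-bound RMSD to mode 1 exceeds `rmsd_cutoff` angstroms.

    Each mode is a tuple: (rank, affinity, rmsd_lb, rmsd_ub).
    Both thresholds are illustrative assumptions.
    """
    top_affinity = modes[0][1]
    return [
        m for m in modes[1:]
        if m[1] - top_affinity <= energy_window and m[2] > rmsd_cutoff
    ]


# Invented examples: one tight cluster, one scattered result.
tight = [(1, -9.2, 0.0, 0.0), (2, -9.0, 1.1, 1.6), (3, -8.8, 1.4, 2.2)]
scattered = [(1, -8.1, 0.0, 0.0), (2, -8.0, 5.3, 8.1), (3, -7.9, 6.7, 9.4)]

print(len(competing_modes(tight)))      # no rival poses: a more confident result
print(len(competing_modes(scattered)))  # rival poses exist: treat mode 1 with caution
```

An empty list does not prove the pose is right; it only says the run did not find a competing placement at a similar score.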

Comparing different ligands. To rank candidate molecules against each other, compare their best (mode 1) scores, but only if every ligand was docked into the same prepared receptor, with the same grid box, using the same parameters. Change any of those and the comparison is no longer fair. This is why careful, identical preparation matters; work through how to prepare a protein and ligand for docking so every input in a screen is built the same way.
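
One way to enforce the "same setup" rule in a small screen is to store the setup next to each score and refuse to rank when the setups differ. This is a hedged sketch: the class names, field names, file name, and scores are all hypothetical, and a real record might track more settings than these.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Setup:
    receptor: str        # prepared receptor file (hypothetical name)
    box_center: tuple    # grid box centre (x, y, z), angstroms
    box_size: tuple      # grid box dimensions (x, y, z), angstroms
    exhaustiveness: int  # search effort setting


@dataclass(frozen=True)
class Result:
    ligand: str
    best_affinity: float  # mode 1 score, kcal/mol
    setup: Setup


def rank_ligands(results):
    """Sort ligands by mode 1 score, most negative first.

    Raises ValueError if the results were not all produced with one setup,
    because scores from different setups are not comparable.
    """
    if len({r.setup for r in results}) > 1:
        raise ValueError("results come from different docking setups")
    return sorted(results, key=lambda r: r.best_affinity)


# Invented data for illustration.
setup = Setup("receptor_prepared.pdbqt", (10.0, 12.5, -3.0), (20.0, 20.0, 20.0), 8)
screen = [
    Result("ligand_A", -7.4, setup),
    Result("ligand_B", -9.1, setup),
    Result("ligand_C", -8.2, setup),
]
for r in rank_ligands(screen):
    print(r.ligand, r.best_affinity)
```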

Why is a re-docking RMSD control the real test of trust?

The single most convincing check on a docking setup is re-docking: take a protein that was crystallised with a ligand already bound, remove that ligand, then dock it back in and see whether the program puts it where the experiment found it. Because you know the true answer (the crystal pose), you can measure how far off the prediction is.

That distance is the root-mean-square deviation (RMSD) between the docked pose and the experimental pose. The widely used convention in the docking literature is that an RMSD below 2.0 Å counts as a successful reproduction of the binding mode. If your setup re-docks a known ligand to under 2 Å, you have evidence that your protein preparation, grid box, and parameters are sound, and that gives you reason to trust the scores for new ligands run the same way.
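
The RMSD itself is simple to compute once you have matching coordinates. The sketch below makes several assumptions that you should check against your own files: both poses list the same heavy atoms in the same order, both sit in the same receptor coordinate frame (so no superposition is applied), and symmetry-equivalent atoms are not handled. The coordinates are invented and the function name is hypothetical.

```python
import math


def pose_rmsd(docked, crystal):
    """Root-mean-square deviation between two poses, in the input units.

    Assumes identical atom order and a shared coordinate frame; no alignment
    and no symmetry correction are performed.
    """
    if not docked or len(docked) != len(crystal):
        raise ValueError("poses must be non-empty and have the same atom count")
    squared = sum(
        (a - b) ** 2
        for p, q in zip(docked, crystal)
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(docked))


# Invented heavy-atom coordinates in angstroms, for illustration only.
crystal_pose = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
docked_pose = [(0.5, 0.0, 0.0), (2.0, 0.0, 0.0), (2.0, 1.5, 0.0)]

rmsd = pose_rmsd(docked_pose, crystal_pose)
print(f"RMSD = {rmsd:.2f} Å, below 2.0 Å: {rmsd < 2.0}")
```

For ligands with symmetric groups, a symmetry-aware RMSD tool will give a fairer number than this plain atom-by-atom version.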

You can find suitable protein-ligand complexes and their experimental coordinates on the RCSB Protein Data Bank, and the official AutoDock Vina documentation describes the input formats and options you will use to set the control up. Run this re-docking test before you report any result. It is the difference between “the software gave me a number” and “I have a validated docking workflow.”

Common mistakes when interpreting docking results

Reading the score backwards. More negative is stronger. The most negative value is the best pose, not the worst.
Treating the score as a measured affinity. It is a predicted, relative ranking signal, not an experimental binding constant.
Comparing scores across different setups. A score is only comparable to others from the same receptor, box, and parameters.
Confusing the RMSD columns with accuracy. In the Vina table, RMSD is measured against the top pose, not against the crystal structure.
Skipping the re-docking control. Without it, you have no evidence your setup reproduces reality.

Frequently asked questions

What units are docking scores in?

Kilocalories per mole (kcal/mol). Favourable poses score below zero, and a more negative number indicates a stronger predicted interaction. Treat them as a ranking signal rather than an exact experimental binding energy.

Is a higher or lower docking score better?

Lower, meaning more negative. A pose scoring -9 kcal/mol is predicted to bind more strongly than one scoring -6 kcal/mol. The best pose is the one with the most negative affinity.

What is a good RMSD for re-docking?

Below 2.0 Å is the conventional threshold for a successful reproduction of the experimental binding mode. If your setup re-docks a known ligand to under 2 Å against its crystal pose, that is strong evidence your workflow is trustworthy.

Can I compare docking scores between two different proteins?

Not directly. Scores depend on the receptor, the search box, and the parameters, so a value against one target is not comparable to the same value against another. Compare scores only within one consistent experiment.

Why do my top poses all have similar scores but different shapes?

That usually signals an uncertain prediction. A confident result shows the top poses clustering in a similar orientation with similar scores; a wide spread of different shapes at similar scores means you should not over-interpret the single top pose.

Your next step

Take a protein-ligand complex you trust, remove the ligand, and re-dock it. If the predicted pose lands within 2 Å of the crystal structure, you have evidence that your protein preparation, grid box, and parameters are sound, and the scores that setup produces for new ligands become far easier to defend. That one control is what separates a confident result from a guess.

From here, sharpen the run itself with our AutoDock Vina tutorial for beginners, tighten your inputs with preparing a protein and ligand for docking, and use the molecular docking pillar guide and the wider computational biology skills roadmap to plan what comes next.

Written by the StemSkills Lab team, computational scientists with 10+ years of combined experience in sequence and structural bioinformatics, drug discovery and design, and multiscale molecular modeling.
