15 Computational Biology Project Ideas for MSc & Final-Year Students (Docking + MD)
- June 10, 2026
- Posted by: Stem Skills Lab
- Category: Career Guide

Answer: The best computational biology projects for MSc and final-year students use free tools to answer one focused question, like “does this molecule bind my target, and is the complex stable?” Start with a molecular docking project (AutoDock Vina), then add a short molecular dynamics run (GROMACS). Each project below names what you’d do, the free tools, and the skill it proves to a PI or admissions committee.
By the StemSkills Lab team, a group with 10+ years in sequence and structural bioinformatics, drug discovery and design, and multiscale molecular modeling. This guide is written for BSc and MSc students (in India and elsewhere) who want concrete, no-cost project ideas that look credible on a CV, statement of purpose, or research proposal.
Why does a computational project matter for grad school and your CV?
A self-driven project is the single most persuasive line on an early-career CV: it shows you can do research without being told exactly how. Coursework proves you can follow instructions; a finished docking or molecular dynamics (MD) project proves you can frame a question, pick a method, generate results, and interpret them honestly. Admissions committees read dozens of similar transcripts; a finished project gives yours something concrete to point to.
Computational structural biology is unusually friendly to students because the barrier to entry is almost zero. The structures are public (the RCSB Protein Data Bank holds over 254,000 experimentally determined structures), and the gold-standard docking and simulation engines are free and open source. A normal laptop and a clear question are enough to produce something defensible.
The 15 ideas below are grouped into three tiers: beginner docking, molecular dynamics, and combined docking-plus-MD. We describe ideas and what each demonstrates, not command-by-command steps, because exact syntax changes between tool versions; always follow the official tutorial for your tool.
What are good beginner molecular docking projects?
Docking is the ideal first project: it runs on any laptop, finishes in minutes to hours, and produces a clear, visual result. The standard free engine is AutoDock Vina (Trott & Olson, 2010), usually paired with the free GUI wrapper PyRx and a viewer such as PyMOL or UCSF ChimeraX. The Vina paper reports it “achieves an approximately two orders of magnitude speed-up compared to the molecular docking software previously developed in our lab (AutoDock 4)” while improving accuracy, which is why it is the default starting point for students. For the full setup path, see our guide on how to learn molecular docking.
1. Reproduce a known drug-target complex (the redocking benchmark)
What you’d do: Take a crystal structure with a co-crystallized ligand, remove the ligand, dock it back, and measure how close your top pose is to the real one (RMSD). Free tools: AutoDock Vina, PyRx, ChimeraX. What it proves: that you validate a method before trusting it, the hallmark of a careful researcher. A low redocking RMSD is evidence your pipeline works, and PIs value a candidate who benchmarks first.
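The redocking check comes down to a single number. Here is a minimal sketch of that RMSD calculation, assuming both poses list the same heavy atoms in the same order and already share the receptor's coordinate frame, so no superposition is applied. The function name `pose_rmsd` and the coordinates are illustrative and do not come from a real structure.

```python
import math


def pose_rmsd(reference, docked):
    """RMSD in angstroms between two poses given as lists of (x, y, z).

    Assumes identical atom order in both lists and a shared coordinate
    frame, so the poses are compared in place without any fitting.
    """
    if len(reference) != len(docked) or not reference:
        raise ValueError("poses must be non-empty and have the same number of atoms")
    squared = sum(
        (xr - xd) ** 2 + (yr - yd) ** 2 + (zr - zd) ** 2
        for (xr, yr, zr), (xd, yd, zd) in zip(reference, docked)
    )
    return math.sqrt(squared / len(reference))


# Illustrative coordinates only (not taken from a real complex).
crystal = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
top_pose = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (1.5, 1.5, 1.0)]
print(f"redocking RMSD = {pose_rmsd(crystal, top_pose):.2f} angstrom")
```

A cutoff of about 2 Å is a commonly used convention for calling a redocked pose a success. For ligands with symmetric groups, a symmetry-aware RMSD tool is a safer choice than this plain version.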
2. Virtual screen of phytochemicals against a disease target
What you’d do: Pull 50-200 natural compounds from PubChem, dock them against a relevant target (a kinase or microbial enzyme), and rank by predicted affinity. Free tools: PyRx (batch docking), Vina, PubChem. What it proves: that you can run a real virtual-screening pipeline at small scale and reason about hit selection, directly relevant to drug-discovery labs.
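Ranking the screen can be scripted once the docking runs finish. The sketch below assumes each result is available as text that follows the `REMARK VINA RESULT:` convention of AutoDock Vina's PDBQT output, where the first number on the line is the predicted affinity in kcal/mol. The compound names and scores are made up, and `best_vina_score` and `rank_compounds` are hypothetical helper names.

```python
def best_vina_score(pdbqt_text):
    """Return the most negative affinity (kcal/mol) found in Vina output text."""
    scores = [
        float(line.split()[3])
        for line in pdbqt_text.splitlines()
        if line.startswith("REMARK VINA RESULT:")
    ]
    if not scores:
        raise ValueError("no 'REMARK VINA RESULT:' line found")
    return min(scores)


def rank_compounds(outputs):
    """Sort {compound name: Vina output text} from strongest to weakest predicted binder."""
    scored = [(name, best_vina_score(text)) for name, text in outputs.items()]
    return sorted(scored, key=lambda pair: pair[1])


# Hypothetical compounds with made-up scores, for illustration only.
outputs = {
    "compound_A": "REMARK VINA RESULT:    -6.4      0.000      0.000\n"
                  "REMARK VINA RESULT:    -6.1      1.812      2.507\n",
    "compound_B": "REMARK VINA RESULT:    -8.2      0.000      0.000\n",
    "compound_C": "REMARK VINA RESULT:    -7.0      0.000      0.000\n",
}
ranked = rank_compounds(outputs)
for name, score in ranked:
    print(f"{name}\t{score:.1f} kcal/mol")
```

Predicted affinities that differ by only a few tenths of a kcal/mol are best treated as ties, so the ranking is a shortlist to inspect visually rather than a final answer.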
3. Repurpose approved drugs against a new target
What you’d do: Dock a library of FDA-approved drugs (available as curated sets, or filtered from ChEMBL) against an emerging target to find candidates worth a second look. Free tools: Vina, PyRx, ChEMBL. What it proves: that you understand the logic of drug repurposing and can connect a computational result to a translational question, a strong angle for an SOP.
4. Compare two docking engines on the same system
What you’d do: Dock the same ligand set with AutoDock Vina and a second free engine (such as the web server SwissDock, or AutoDock4), then compare rankings and poses. Free tools: Vina, SwissDock, AutoDock4. What it proves: methodological maturity. You know that results are method-dependent and shouldn’t be taken on faith from a single tool.
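Because different engines report scores on different scales, one reasonable way to compare them is by rank rather than by raw value. The sketch below computes a Spearman rank correlation between two score tables. It assumes that lower scores mean stronger predicted binding in both engines and that there are no tied scores; the ligand names and numbers are made up.

```python
def ranks(scores):
    """Map each ligand to its rank, where 1 is the lowest (best) score."""
    ordered = sorted(scores, key=scores.get)
    return {name: position for position, name in enumerate(ordered, start=1)}


def spearman(scores_a, scores_b):
    """Spearman rank correlation for two {ligand: score} tables without ties."""
    if set(scores_a) != set(scores_b):
        raise ValueError("both engines must score the same ligands")
    n = len(scores_a)
    if n < 2:
        raise ValueError("need at least two ligands")
    rank_a, rank_b = ranks(scores_a), ranks(scores_b)
    d_squared = sum((rank_a[name] - rank_b[name]) ** 2 for name in rank_a)
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Made-up scores for four hypothetical ligands.
engine_a = {"lig1": -9.0, "lig2": -8.0, "lig3": -7.0, "lig4": -6.0}
engine_b = {"lig1": -8.5, "lig2": -7.0, "lig3": -7.9, "lig4": -5.0}
rho = spearman(engine_a, engine_b)
print(f"Spearman rho = {rho:.2f}")
```

A value near 1 means the two engines largely agree on the ordering; a low value is itself a result worth reporting and discussing.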
5. Map a binding site with blind docking
What you’d do: When the binding pocket is unknown, dock against the whole protein surface (blind docking) to predict where a ligand prefers to sit, then cross-check against the literature. Free tools: Vina, ChimeraX, optionally a deep-learning docker like DiffDock for comparison. What it proves: hypothesis generation. You can produce a testable prediction about where on a protein something acts.
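For blind docking the search box has to enclose the whole protein. Here is a minimal sketch of how that box could be derived from atom coordinates and written out using Vina's `center_*` and `size_*` configuration keys. The 5 Å padding is an arbitrary assumption, and the coordinates are illustrative rather than parsed from a real PDB file.

```python
def blind_docking_box(coords, padding=5.0):
    """Bounding box (angstroms) around all atoms, enlarged by `padding` on each side."""
    if not coords:
        raise ValueError("no coordinates given")
    box = {}
    for axis, values in zip("xyz", zip(*coords)):
        low, high = min(values), max(values)
        box[f"center_{axis}"] = (low + high) / 2
        box[f"size_{axis}"] = (high - low) + 2 * padding
    return box


def to_vina_config(box):
    """Render the box as 'key = value' lines for a Vina configuration file."""
    keys = ("center_x", "center_y", "center_z", "size_x", "size_y", "size_z")
    return "\n".join(f"{key} = {box[key]:.3f}" for key in keys)


# Illustrative coordinates only; a real run would read them from the receptor file.
protein_atoms = [(0.0, 0.0, 0.0), (10.0, 20.0, 30.0), (5.0, 5.0, 5.0)]
box = blind_docking_box(protein_atoms)
config_text = to_vina_config(box)
print(config_text)
```

A box this large usually needs more search effort than a pocket-sized one, so it is worth checking the Vina documentation on the `exhaustiveness` setting before trusting a blind-docking result.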
6. Profile a small mutation’s effect on binding
What you’d do: Introduce a point mutation in the binding site (in silico), redock the same ligand, and report how the predicted affinity and pose shift. Free tools: ChimeraX (mutate residues), Vina, PyRx. What it proves: that you can connect sequence to structure to function, the central narrative of structural biology and a great fit for a genetics or biochemistry lab.
Want a mentor-guided project for your applications?
Our live Molecular Modeling & MD Simulations cohort bootcamp ends with a real portfolio project you can put on your CV and SOP.
What are good molecular dynamics (MD) projects?
Molecular dynamics adds the dimension docking lacks: time. Instead of one static pose, MD simulates how atoms move, so you can ask whether a structure is stable, how flexible it is, and what holds it together. The standard free, open-source engine is GROMACS (Abraham et al., 2015), which the developers describe as delivering “high performance molecular simulations through multi-level parallelism from laptops to supercomputers”; short runs on a normal laptop are feasible. The classic free starting point is the GROMACS lysozyme-in-water tutorial (Lemkul). For the full learning path, see our guide on how to learn molecular dynamics with GROMACS.
7. Simulate a protein in water and measure stability
What you’d do: Solvate a small protein, run a short equilibrium MD simulation, and analyze RMSD (overall drift) and RMSF (per-residue flexibility) to describe which regions are rigid and which are floppy. Free tools: GROMACS, ChimeraX/VMD for visualization. What it proves: that you can run and interpret a full MD pipeline, the foundational MD skill every simulation lab expects.
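In practice GROMACS computes these quantities for you (its `gmx rms` and `gmx rmsf` tools), but it helps to know what the numbers mean. The sketch below computes per-atom RMSF from a handful of frames, assuming the frames have already been superimposed on a common reference. The two-atom "trajectory" is made up, and the units are whatever the input coordinates use (GROMACS works in nm).

```python
import math


def rmsf(frames):
    """Per-atom RMSF for a list of frames, each a list of (x, y, z) tuples.

    Assumes every frame has the same atoms in the same order and that
    the frames were fitted to a common reference beforehand.
    """
    if not frames:
        raise ValueError("no frames given")
    n_frames = len(frames)
    n_atoms = len(frames[0])
    result = []
    for i in range(n_atoms):
        # Average position of atom i over the trajectory.
        mean = [sum(frame[i][k] for frame in frames) / n_frames for k in range(3)]
        # Mean squared deviation from that average position.
        msd = sum(
            sum((frame[i][k] - mean[k]) ** 2 for k in range(3)) for frame in frames
        ) / n_frames
        result.append(math.sqrt(msd))
    return result


# Made-up trajectory: atom 0 never moves, atom 1 swings along x.
trajectory = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
]
fluctuations = rmsf(trajectory)
print(fluctuations)
```

Plotted against residue number, high values mark the floppy regions (often loops and termini) and low values mark the rigid core.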
8. Compare a protein’s flexibility across two conditions
What you’d do: Run the same protein at two temperatures (or wild-type vs. a mutant) and compare RMSF and radius of gyration to show how dynamics change. Free tools: GROMACS, Grace/Python (Matplotlib) for plots. What it proves: that you can design a controlled comparison and turn raw trajectories into a clear, plotted argument.
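Radius of gyration is a compactness measure: the mass-weighted root-mean-square distance of the atoms from their center of mass. GROMACS reports it directly through `gmx gyrate`; the sketch below only illustrates the definition for a single frame, using made-up coordinates and equal masses to stand in for the two conditions.

```python
import math


def radius_of_gyration(coords, masses):
    """Mass-weighted radius of gyration for one frame of (x, y, z) coordinates."""
    if len(coords) != len(masses) or not coords:
        raise ValueError("coords and masses must be non-empty and the same length")
    total_mass = sum(masses)
    center = [
        sum(m * c[k] for m, c in zip(masses, coords)) / total_mass for k in range(3)
    ]
    weighted = sum(
        m * sum((c[k] - center[k]) ** 2 for k in range(3))
        for m, c in zip(masses, coords)
    )
    return math.sqrt(weighted / total_mass)


# Made-up two-atom frames representing a compact and an expanded state.
masses = [12.0, 12.0]
compact = [(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
expanded = [(-2.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
rg_compact = radius_of_gyration(compact, masses)
rg_expanded = radius_of_gyration(expanded, masses)
print(f"compact Rg = {rg_compact:.2f}, expanded Rg = {rg_expanded:.2f}")
```

For the comparison project, the useful figure is the same quantity plotted over time for both conditions on one set of axes.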
9. Watch a protein’s secondary structure over time
What you’d do: Track how helices and sheets persist or unravel during a simulation (using a secondary-structure analysis such as DSSP within GROMACS). Free tools: GROMACS, DSSP. What it proves: that you can extract a biologically meaningful signal from a trajectory rather than just running the software.
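Once you have a secondary-structure assignment per frame, turning it into a time series is straightforward. The sketch below assumes each frame is available as a string of one-letter DSSP codes, one per residue, and counts H, G and I as helix and E and B as strand. The three frames are made up, and "C" is used here as a placeholder for coil.

```python
HELIX_CODES = set("HGI")   # alpha, 3-10 and pi helices
STRAND_CODES = set("EB")   # extended strand and isolated bridge


def ss_fractions(dssp_string):
    """Return (helix fraction, strand fraction) for one frame's DSSP string."""
    n = len(dssp_string)
    if n == 0:
        raise ValueError("empty DSSP string")
    helix = sum(code in HELIX_CODES for code in dssp_string) / n
    strand = sum(code in STRAND_CODES for code in dssp_string) / n
    return helix, strand


# Made-up 10-residue assignments for three frames.
frames = ["HHHHHHEEEE", "HHHHCCEEEE", "HHCCCCEEEC"]
timeline = [ss_fractions(frame) for frame in frames]
for index, (helix, strand) in enumerate(timeline):
    print(f"frame {index}: helix {helix:.0%}, strand {strand:.0%}")
```

In this toy example the helix content falls from frame to frame while the sheet mostly persists, which is the kind of statement a secondary-structure plot should let you make.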
10. Build and simulate a membrane protein system
What you’d do: Use CHARMM-GUI (free for academics) to embed a membrane protein in a lipid bilayer, then run a short MD simulation in GROMACS. Free tools: CHARMM-GUI Membrane Builder, GROMACS. What it proves: that you can handle a genuinely complex, realistic system; membrane simulations are a clear step above a protein-in-water box.
11. Reproduce a published simulation result
What you’d do: Pick a small, well-documented MD study, rerun a comparable short simulation, and check whether you see the same qualitative behavior (stability, a key contact, a conformational change). Free tools: GROMACS, the original paper’s supplementary files. What it proves: reproducibility literacy, which is increasingly valued. It is also a credible, low-risk project because you have a target answer to compare against.
What about combined docking + MD projects?
The most impressive student projects chain the two methods: dock to find a binding pose, then run MD to test whether it holds up over time. This “dock-then-simulate” workflow mirrors real computer-aided drug design and is the through-line of our computational biology skills roadmap. These projects are the most directly relevant to active computational drug-design labs.
12. Validate a docking pose with MD stability
What you’d do: Dock a ligand with Vina, take the top pose, build the complex, and run a short MD simulation to see whether the ligand stays bound or drifts out. Free tools: Vina, GROMACS, a ligand-topology tool (e.g., a CHARMM-GUI ligand reader). What it proves: that you understand docking’s biggest weakness, static scoring, and know how to address it. This is the strongest beginner-to-intermediate project for a CV or SOP.
13. Rank candidate binders by how stable their complexes are
What you’d do: From a small virtual screen, take the top 3-5 docked hits and run short MD on each, then re-rank them by how stably each ligand stays in the pocket (not just by docking score). Free tools: PyRx/Vina, GROMACS. What it proves: that you can build a multi-step pipeline and make a defensible, evidence-based recommendation, the essence of applied drug design.
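One simple way to turn "how stably the ligand stays in the pocket" into a number is the fraction of frames in which the ligand's RMSD from its docked pose stays under a cutoff. The sketch below re-ranks hits by that fraction and uses the docking score only as a tie-breaker. The 0.3 nm cutoff is an arbitrary assumption, and the hit names, scores, and RMSD series are made up.

```python
def bound_fraction(ligand_rmsd, cutoff=0.3):
    """Fraction of frames in which ligand RMSD (nm) is at or below the cutoff."""
    if not ligand_rmsd:
        raise ValueError("empty RMSD series")
    return sum(value <= cutoff for value in ligand_rmsd) / len(ligand_rmsd)


def rerank(hits, cutoff=0.3):
    """Sort hits by stability (high first), then by docking score (low first).

    `hits` maps a name to (docking score, list of per-frame ligand RMSD values).
    """
    table = [
        (name, score, bound_fraction(series, cutoff))
        for name, (score, series) in hits.items()
    ]
    return sorted(table, key=lambda row: (-row[2], row[1]))


# Made-up data for three hypothetical hits.
hits = {
    "hit_1": (-9.2, [0.1, 0.2, 0.5, 0.8]),
    "hit_2": (-8.1, [0.1, 0.1, 0.2, 0.2]),
    "hit_3": (-8.8, [0.1, 0.2, 0.2, 0.4]),
}
reranked = rerank(hits)
for name, score, fraction in reranked:
    print(f"{name}: docking {score:.1f} kcal/mol, bound in {fraction:.0%} of frames")
```

Here the best docking score (hit_1) ends up last because its ligand drifts, which is exactly the kind of disagreement this project is meant to surface.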
14. Estimate relative binding strength with MM-PBSA / MM-GBSA
What you’d do: After a short MD run on a complex, use a free MM-PBSA/MM-GBSA tool (such as the open-source gmx_MMPBSA) to estimate binding free energy and compare two ligands. Free tools: GROMACS, gmx_MMPBSA. What it proves: that you can go beyond a docking score to a more physically grounded energy estimate, an advanced skill for an MSc student and a solid SOP addition.
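End-point methods such as MM-PBSA estimate binding free energy as the energy of the complex minus that of the receptor and ligand, averaged over trajectory frames. The sketch below covers only the last step, comparing two ligands. It assumes you have already extracted one binding-energy estimate per frame from your tool's output; the values are made up and `summarize` is a hypothetical helper name.

```python
import math
import statistics


def summarize(per_frame_dg):
    """Mean and standard error of per-frame binding energy estimates (kcal/mol)."""
    if len(per_frame_dg) < 2:
        raise ValueError("need at least two frames")
    mean = statistics.mean(per_frame_dg)
    sem = statistics.stdev(per_frame_dg) / math.sqrt(len(per_frame_dg))
    return mean, sem


# Made-up per-frame estimates for two hypothetical ligands.
ligand_a = [-30.0, -32.0, -28.0, -30.0]
ligand_b = [-25.0, -27.0, -23.0, -25.0]

mean_a, sem_a = summarize(ligand_a)
mean_b, sem_b = summarize(ligand_b)
ddg = mean_a - mean_b  # negative means ligand A is predicted to bind more strongly

print(f"ligand A: {mean_a:.1f} +/- {sem_a:.1f} kcal/mol")
print(f"ligand B: {mean_b:.1f} +/- {sem_b:.1f} kcal/mol")
print(f"difference (A - B): {ddg:.1f} kcal/mol")
```

Consecutive MD frames are correlated, so a standard error computed this way tends to be optimistic. These estimates are more defensible for ranking ligands against each other than as absolute binding energies.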
15. Probe how a mutation changes a complex (dock + simulate)
What you’d do: Dock a ligand into both wild-type and a mutant binding site, then run MD on both complexes to show how the mutation alters binding stability or interaction network. Free tools: ChimeraX (mutagenesis), Vina, GROMACS. What it proves: that you can integrate sequence variation, structure, and dynamics into one coherent story, exactly the kind of question disease-mechanism and protein-engineering labs care about.
How do I choose the right project for my level?
Match the project to the time you have and the skills you want to show. Docking is achievable in a week or two and produces a clean, visual result; MD takes longer and proves analysis stamina; a combined project takes the most effort but produces the most evidence. Use the table to pick.
| Project type | Difficulty | Rough time | Core free tools | What it best demonstrates |
|---|---|---|---|---|
| Docking (ideas 1-6) | Beginner | 1-2 weeks | AutoDock Vina, PyRx, ChimeraX | A clean, visual, validated result |
| Molecular dynamics (ideas 7-11) | Intermediate | 2-4 weeks | GROMACS, CHARMM-GUI, Python | Trajectory analysis and interpretation |
| Docking + MD combined (ideas 12-15) | Intermediate-advanced | 4-8 weeks | Vina + GROMACS (+ gmx_MMPBSA) | An end-to-end, publication-shaped story |
Two rules of thumb. First, finish something small before you attempt something big: a completed redocking benchmark beats an abandoned MM-PBSA pipeline every time. Second, always validate: redock a known complex, or reproduce a published result, so you have evidence your method works before you trust a novel result.
How do I present the project to a PI or admissions committee?
A project only counts if you can communicate it. Write a short report or one-page GitHub README with five parts: the question; the method and tools (cite the tool papers: Trott & Olson 2010 for Vina and Abraham et al. 2015 for GROMACS); the result (one or two clear figures, such as a pose or an RMSD plot); the limitations (be honest: short simulations, single replicate, scoring-function caveats); and the next step you would take with more time. That limitations section is what separates a student who ran software from a researcher who understands it.
Put the code and inputs in a public repository and link it from your CV and SOP. When a PI can click through and see a reproducible, honestly-described project, you have already answered their biggest question: can this person do research?
Frequently asked questions
Do I need a powerful computer or a GPU for these projects?
No, not to start. Classic docking with AutoDock Vina runs comfortably on an ordinary laptop, and GROMACS is explicitly designed to scale “from laptops to supercomputers”. Short MD runs on small systems are feasible on a normal machine, though longer simulations and large systems benefit from a GPU or free cloud/HPC time.
Are these tools really free for students?
Yes. AutoDock Vina and GROMACS are free and open source. PyRx, ChimeraX, PubChem, ChEMBL, and the RCSB PDB are free to use, and CHARMM-GUI is free for academic users after a quick registration. You can complete every project on this list at zero software cost.
Can I publish a paper from a student docking or MD project?
Sometimes. A careful virtual-screening or dock-plus-MD study can become a short paper, especially with a mentor. But publication should not be the goal for a first project: a finished, well-documented, honestly-described project is valuable for grad-school applications whether or not it ever becomes a paper.
Should I learn docking or molecular dynamics first?
Docking first. It is faster, gives immediate visual results, and builds the structural-biology intuition (binding sites, poses, interactions) that makes MD easier to understand. Once you have a docking project finished, the natural next step is to validate a pose with a short MD simulation, which is also the most CV-worthy combined project on this list.
References and primary sources
- Trott, O. & Olson, A. J. (2010). AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. Journal of Computational Chemistry, 31(2), 455-461. doi:10.1002/jcc.21334
- Abraham, M. J., Murtola, T., Schulz, R., Páll, S., Smith, J. C., Hess, B. & Lindahl, E. (2015). GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers. SoftwareX, 1-2, 19-25. doi:10.1016/j.softx.2015.06.001
- RCSB Protein Data Bank, structure archive and statistics. rcsb.org