What Are Reproduction Steps? How to Write Steps That Actually Repro
Reproduction steps are the ordered actions that reliably trigger a bug from a known starting state. Here is how to write steps that actually repro, and why about 17% of bug reports still end up non-reproducible.

Definition
Reproduction steps are the ordered, numbered actions that reliably trigger a bug, written from a known starting state so anyone can recreate the failure. They are paired with the expected result and the actual result observed at the point things break.
You will also see them as repro steps, steps to reproduce, or the acronym STR. The triad is always the same: a numbered path of actions, what you expected to happen, and what actually happened. GitHub bakes exactly these three fields into its issue-form templates — 'Steps to reproduce', 'Expected behavior', 'Actual behavior' — making the structure the de facto standard enforced at file time across millions of repositories. This page defines STR and the reasons it fails; for the full six-field bug report it lives inside, see the bug report template guide.
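To make the triad concrete, here is a minimal sketch of it as a data structure, with a renderer that numbers the steps. The class name, field names, and the sample cart scenario are illustrative assumptions, not the schema of GitHub, Mozilla, or any particular tracker.

```python
from dataclasses import dataclass


@dataclass
class ReproReport:
    """The STR triad plus a known starting state (field names are illustrative)."""

    start_state: str
    steps: list[str]
    expected: str
    actual: str

    def render(self) -> str:
        # Number the actions so a reader can say "it breaks at step 3".
        numbered = [f"{i}. {step}" for i, step in enumerate(self.steps, start=1)]
        return "\n".join(
            [
                f"Starting state: {self.start_state}",
                "Steps to reproduce:",
                *numbered,
                f"Expected behavior: {self.expected}",
                f"Actual behavior: {self.actual}",
            ]
        )


# Hypothetical example report.
report = ReproReport(
    start_state="Logged out, fresh tab",
    steps=["Open the cart page", "Enter -1 in the Quantity field", "Click Checkout"],
    expected="A validation error on the Quantity field",
    actual="A blank page",
)
print(report.render())
```

Keeping the starting state as its own field, rather than folding it into step 1, makes it harder to file a report that silently assumes a precondition.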
Why it matters
Mozilla's Bug Writing Guidelines are blunt about it: 'Steps to reproduce are the most important part of any bug report. If a developer is able to reproduce the bug, the bug is very likely to be fixed.' The same page warns of the inverse — 'if the steps are unclear, it might not even be possible to know whether the bug has been fixed.' Reproduction is the gate. Everything downstream, from triage priority to the regression test, depends on a developer being able to make the failure happen on demand.
Good steps share three traits. They start from a known entry state (logged out, a fresh tab, a specific URL) so the reader is not guessing at preconditions. They use unambiguous actions that name the exact element and value — 'enter -1 in the Quantity field', not 'add some items'. And they state expected versus actual at the failure point, because ambiguous expectations are not a cosmetic problem: the data-fusion study of 576 non-reproducible reports found ambiguous or outdated expected behavior drove about 8% of non-reproducibility on its own. Length is a trap in both directions — Mozilla advises minimizing to the shortest sequence that still triggers the bug. Over-described reports bury the signal; under-described ones bounce back as 'need more info'.
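Minimizing to the shortest sequence can be done mechanically when you have a way to check whether a candidate sequence still fails. The sketch below is one simple approach, a greedy pass that drops one step at a time; it is not a procedure Mozilla prescribes. The `triggers_bug` oracle and the sample steps are hypothetical stand-ins: in practice the oracle means re-running the steps from the known starting state.

```python
from typing import Callable, Sequence


def minimize_steps(
    steps: Sequence[str],
    triggers_bug: Callable[[list[str]], bool],
) -> list[str]:
    """Greedily drop steps one at a time while the bug still reproduces."""
    current = list(steps)
    if not triggers_bug(current):
        raise ValueError("the full sequence does not reproduce the bug")
    i = 0
    while i < len(current):
        candidate = current[:i] + current[i + 1:]
        if triggers_bug(candidate):
            current = candidate  # step i was noise; retry the same index
        else:
            i += 1  # step i is needed; keep it and move on
    return current


def fake_oracle(steps: list[str]) -> bool:
    """Hypothetical oracle: the bug needs these three actions, in this order."""
    required = ["open cart", "enter -1 in Quantity", "click Checkout"]
    remaining = iter(steps)
    # `step in remaining` consumes the iterator, so this is a subsequence check.
    return all(step in remaining for step in required)


recorded = [
    "open home page",
    "open cart",
    "scroll to footer",
    "enter -1 in Quantity",
    "hover over logo",
    "click Checkout",
]
print(minimize_steps(recorded, fake_oracle))
# ['open cart', 'enter -1 in Quantity', 'click Checkout']
```

The result is minimal only in the sense that no single remaining step can be removed. It also costs one re-run of the steps per removal attempt, so it suits bugs that reproduce quickly.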
Here is the part most guides skip. Even well-written steps bounce, because steps capture actions, not state. The same five clicks can pass or fail depending on the API response, a feature flag, timing, cached data, the viewport, or the account. The 'Works for me!' empirical study of Firefox and Eclipse quantified the cost: non-reproducible reports are about 17% of all bug reports and stay active roughly three months longer than reproducible ones. The data-fusion follow-up found roughly 14% of Eclipse's non-reproducible reports simply lacked the information required to reproduce. Notably, 66% of the non-reproducible reports that were eventually fixed had in fact been reproduced once enough information finally arrived, which suggests the missing ingredient is usually state, not effort.
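One way to see the "actions, not state" problem is to record the surrounding state for two runs of the same steps and compare them. The sketch below assumes state can be flattened into a simple key-value snapshot; the keys and values are hypothetical examples, not a standard schema.

```python
from typing import Any

MISSING = "<absent>"  # marker for a key recorded in only one of the two runs


def diff_state(
    passing: dict[str, Any], failing: dict[str, Any]
) -> dict[str, tuple[Any, Any]]:
    """Return each key whose value differs, as (passing_value, failing_value)."""
    diffs: dict[str, tuple[Any, Any]] = {}
    for key in sorted(passing.keys() | failing.keys()):
        a = passing.get(key, MISSING)
        b = failing.get(key, MISSING)
        if a != b:
            diffs[key] = (a, b)
    return diffs


# Same clicks, different surrounding state (all values are made up).
developer_run = {
    "viewport": "1440x900",
    "flag_new_checkout": False,
    "cart_items": 2,
    "account": "admin",
}
reporter_run = {
    "viewport": "1440x900",
    "flag_new_checkout": True,
    "cart_items": 0,
    "account": "admin",
}
print(diff_state(developer_run, reporter_run))
# {'cart_items': (2, 0), 'flag_new_checkout': (False, True)}
```

Here the steps are identical in both runs, and only the feature flag and the cart contents separate "works for me" from the failure.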
How this shows up in a real BugMojo bug report
In a BugMojo report the steps do not arrive alone. The browser extension records the failure with the surrounding state attached — an rrweb session replay of exactly what the user did, the console output, and the network request that fed the data. So 'Step 4: click Checkout, expected order confirmation, actual blank page' sits next to the precise POST /api/cart response that returned an empty cart, and the replay shows the click that triggered it. The prose steps become a recipe you can re-run, not a description you have to reconstruct. That is the difference between a developer reproducing on the first try and a ticket that ping-pongs for three months.
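As a rough illustration of pairing a step with the request it triggered, the sketch below matches a step's timestamp to the first network event inside a short window. This is a generic technique under stated assumptions, not BugMojo's actual data format or matching logic: the `NetworkEvent` fields, the 2000 ms window, and the sample events are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NetworkEvent:
    """Hypothetical shape of one captured request/response pair."""

    timestamp_ms: int
    method: str
    url: str
    status: int
    body: str


def request_triggered_by(
    step_timestamp_ms: int,
    events: list[NetworkEvent],
    window_ms: int = 2000,
) -> Optional[NetworkEvent]:
    """Return the earliest request that started within window_ms after the step."""
    candidates = [
        e for e in events if 0 <= e.timestamp_ms - step_timestamp_ms <= window_ms
    ]
    return min(candidates, key=lambda e: e.timestamp_ms, default=None)


events = [
    NetworkEvent(900, "GET", "/api/user", 200, '{"id": 7}'),
    NetworkEvent(5120, "POST", "/api/cart", 200, '{"items": []}'),
    NetworkEvent(9800, "GET", "/api/ping", 200, "{}"),
]

# Step 4: the click on Checkout was recorded at t = 5000 ms.
match = request_triggered_by(5000, events)
print(match.method, match.url, match.body)
```

A fixed time window is a crude heuristic; it can mis-attribute requests when several fire close together, which is one reason to keep the full replay alongside the match.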
State matters even more once the reader is an agent. Cursor shipped a /debug command in April 2026 aimed squarely at bugs that are 'hard to reproduce or understand', where the agent generates hypotheses, adds log statements, and uses runtime information to localize the fault — and its Bugbot reports that '70%+ of flags get resolved before merge'. Reproduction is becoming an agent task, not just a human one. But an agent reading prose steps alone still guesses at the triggering state. BugMojo's MCP server hands the agent (Claude Code, Cursor) the steps plus the replay, console, and network bundle, which is the gap between 'reproduce this for me' as a hunt and the same request as a deterministic task.
| Capability | BugMojo | Issue tracker + test tool (Jira/TestRail) |
|---|---|---|
| Structured STR + expected/actual fields | ✓ | ✓ |
| rrweb session replay captured with the steps | ✓ | — |
| Console + exact network request attached | ✓ | Manual attachment |
| Steps + state handed to an AI agent over MCP | ✓ | — |
| Formal test-case library with run history | — | ✓ |
| Deep workflow / sprint / Jira admin | — | ✓ |
Frequently asked questions
Sources
- Bug Writing Guidelines — Steps to reproduce are 'the most important part of any bug report' — Mozilla / Bugzilla (2025)
- Works for me! Characterizing non-reproducible bug reports (Firefox + Eclipse empirical study) — Mozilla Foundation / Empirical Software Engineering (2022)
- Why are Some Bugs Non-Reproducible? An Empirical Investigation using Data Fusion — arXiv (ICSME 2020) (2021)
- CLI Debug Mode and /btw Support — /debug for bugs 'hard to reproduce or understand' — Cursor / Anysphere (2026-04-14)
- Configuring issue templates — Steps to reproduce / Expected / Actual fields — GitHub Docs (2026)
- Cursor Bugbot — '70%+ of flags get resolved before merge' — Cursor / Anysphere (2026)

