Ship the safety net before the change.
You're about to modify existing behavior and want to ensure you don't break anything, but adequate test coverage doesn't exist yet. This technique is the responsible precursor to any risky refactoring or behavior change.
You'll recognize the need when you look at a piece of code you need to modify and realize you can't answer the question "how will I know if I broke something?" If the answer is "I won't know," then the first PR is adding tests — not making the change.
This applies whenever you're dealing with legacy code that grew without tests, critical business logic that has never been verified, or a module that "just works" but nobody is sure why. The instinct to jump straight into the refactor is almost always wrong. The investment in characterization tests is the thing that makes every subsequent PR safe to ship.
The discipline is in PR 1: you're not writing tests for how the code should work — you're writing tests that describe how it currently works, including any quirks and edge cases. If every test in PR 1 passes without touching a single line of production code, you've succeeded.
How this looks in your git history:
The team needs to refactor a billing calculation module that computes invoice totals with complex rules for prorated charges, volume discounts, and tax exemptions. The module has no tests and the team doesn't fully understand all the edge cases.
Before touching the logic, we'll add comprehensive characterization tests that document the current behavior. Then we'll refactor with confidence — knowing the tests will catch any regression.
Before changing a single line of production code, we write a comprehensive test suite that documents exactly how calculateInvoice() behaves today — including edge cases we might not fully understand.
Every test is written to pass against the existing, untouched implementation. The goal isn't to verify that the code is correct; it's to create a regression net. If any of these tests fail after a code change, something changed — and now we know about it.
Note the test helpers (makeCustomer, makeItem) that reduce boilerplate and make each test case express only what's different. The helpers also make it easy to add more cases later.
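A minimal sketch of what such a suite might look like. The `Customer` and `Item` shapes, the 100-unit discount tier, and the body of `calculateInvoice()` are all assumptions made so the example runs on its own; in a real PR 1 you would import the untouched production function and record whatever it actually returns.

```typescript
// Hypothetical shapes: the real module's types are not shown in this write-up.
interface Customer { taxExempt: boolean; taxRateBps: number } // basis points, 1000 = 10%
interface Item { unitPriceCents: number; quantity: number; daysActive: number; daysInPeriod: number }

// Stand-in for the legacy function so this sketch is self-contained.
// In a real PR 1 you would import the production function instead.
function calculateInvoice(customer: Customer, items: Item[]): number {
  let totalCents = 0;
  for (const item of items) {
    let line = item.unitPriceCents * item.quantity;
    line = Math.round((line * item.daysActive) / item.daysInPeriod); // proration
    if (item.quantity >= 100) line = Math.round((line * 90) / 100); // assumed 10% volume discount
    totalCents += line;
  }
  if (!customer.taxExempt) totalCents += Math.round((totalCents * customer.taxRateBps) / 10000);
  return totalCents;
}

// Helpers supply defaults so each case states only what differs.
function makeCustomer(overrides: Partial<Customer> = {}): Customer {
  return { taxExempt: false, taxRateBps: 1000, ...overrides };
}
function makeItem(overrides: Partial<Item> = {}): Item {
  return { unitPriceCents: 1000, quantity: 1, daysActive: 30, daysInPeriod: 30, ...overrides };
}

interface Case { name: string; customer: Customer; items: Item[]; expectedCents: number }

// Expected values are recorded from the current output, not derived from a spec.
const cases: Case[] = [
  { name: "single item, full period", customer: makeCustomer(), items: [makeItem()], expectedCents: 1100 },
  { name: "half-period proration", customer: makeCustomer(), items: [makeItem({ daysActive: 15 })], expectedCents: 550 },
  { name: "just below discount tier", customer: makeCustomer(), items: [makeItem({ quantity: 99 })], expectedCents: 108900 },
  { name: "at discount tier", customer: makeCustomer(), items: [makeItem({ quantity: 100 })], expectedCents: 99000 },
  { name: "tax-exempt customer", customer: makeCustomer({ taxExempt: true }), items: [makeItem()], expectedCents: 1000 },
  { name: "empty invoice", customer: makeCustomer(), items: [], expectedCents: 0 },
];

// Returns the names of cases whose output no longer matches the recording.
function runCharacterization(): string[] {
  return cases
    .filter((c) => calculateInvoice(c.customer, c.items) !== c.expectedCents)
    .map((c) => c.name);
}
```

An empty result from `runCharacterization()` means nothing observable has changed. In practice the same table would likely sit inside your test runner of choice; the plain function here only keeps the sketch free of dependencies.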
With the characterization tests as a safety net, we can now decompose the monolithic calculateInvoice() function into named, testable units.
Three helper functions are extracted — calculateProration(), applyVolumeDiscount(), and calculateTax() — each responsible for one concern. The public calculateInvoice() becomes a clean pipeline that reads like a description of what it does.
The critical validation: all 131 characterization tests pass without modification, so the refactor preserves every behavior the suite pins down. A reviewer can understand the new structure in seconds rather than needing to trace through 40 lines of interleaved logic.
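To make the shape of that refactor concrete, here is a hedged sketch. The three helper names come from the description above; their signatures, the data shapes, the discount rule, and the `calculateInvoiceLegacy` stand-in are assumptions for illustration, not the module's real code.

```typescript
// Hypothetical shapes and rules, chosen only so the sketch runs on its own.
interface Customer { taxExempt: boolean; taxRateBps: number } // basis points
interface Item { unitPriceCents: number; quantity: number; daysActive: number; daysInPeriod: number }

// Before: one function with all three concerns interleaved.
function calculateInvoiceLegacy(customer: Customer, items: Item[]): number {
  let totalCents = 0;
  for (const item of items) {
    let line = item.unitPriceCents * item.quantity;
    line = Math.round((line * item.daysActive) / item.daysInPeriod);
    if (item.quantity >= 100) line = Math.round((line * 90) / 100);
    totalCents += line;
  }
  if (!customer.taxExempt) totalCents += Math.round((totalCents * customer.taxRateBps) / 10000);
  return totalCents;
}

// After: one helper per concern. Math.round is kept as-is, because a
// pure refactor must not change behavior.
function calculateProration(amountCents: number, item: Item): number {
  return Math.round((amountCents * item.daysActive) / item.daysInPeriod);
}
function applyVolumeDiscount(amountCents: number, quantity: number): number {
  return quantity >= 100 ? Math.round((amountCents * 90) / 100) : amountCents;
}
function calculateTax(subtotalCents: number, customer: Customer): number {
  return customer.taxExempt ? 0 : Math.round((subtotalCents * customer.taxRateBps) / 10000);
}

// The public function becomes a pipeline: prorate, discount, sum, tax.
function calculateInvoice(customer: Customer, items: Item[]): number {
  const subtotalCents = items
    .map((item) => calculateProration(item.unitPriceCents * item.quantity, item))
    .map((line, i) => applyVolumeDiscount(line, items[i].quantity))
    .reduce((sum, line) => sum + line, 0);
  return subtotalCents + calculateTax(subtotalCents, customer);
}

// Returns the indices of inputs where old and new implementations disagree.
function findDivergences(inputs: Array<[Customer, Item[]]>): number[] {
  const divergent: number[] = [];
  inputs.forEach(([customer, items], i) => {
    if (calculateInvoiceLegacy(customer, items) !== calculateInvoice(customer, items)) divergent.push(i);
  });
  return divergent;
}
```

`findDivergences` is an optional extra for the period when both versions exist side by side; the characterization suite from PR 1 remains the actual gate.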
During code review of PR 2, a teammate spots an issue: calculateProration() uses Math.round, which always rounds half-cent amounts upward. Over millions of invoices, this introduces a systematic upward bias in prorated charges.
The fix is a bankersRound() utility (round-half-to-even): when a value falls exactly halfway between two cents, it rounds to the even neighbor, so ties round down about as often as they round up and the bias cancels out. calculateProration() is updated to use it.
Running the test suite: all 131 tests pass. The cases in our characterization suite don't contain sub-cent unit prices, and none of them lands on a half cent where the two rounding rules disagree, so existing expectations are unchanged. With the safety net in place, we add one targeted regression test to document the corrected behavior and lock it in against future changes.
Without the safety net from PR 1, making this change would have required manual spot-checking. With it, we merge with confidence.
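One possible shape for the rounding fix, as a sketch rather than the team's actual utility. The `EPSILON` tolerance and the `calculateProration` signature are assumptions; code that handles real money may prefer integer or decimal arithmetic over floating point altogether.

```typescript
// Assumed tolerance for floating-point noise around an exact half.
const EPSILON = 1e-9;

// Round to the nearest integer, sending exact halves to the even neighbor.
function bankersRound(value: number): number {
  const lower = Math.floor(value);
  const fraction = value - lower;
  if (Math.abs(fraction - 0.5) < EPSILON) {
    return lower % 2 === 0 ? lower : lower + 1;
  }
  return Math.round(value); // not a tie, so ordinary rounding applies
}

// Hypothetical signature: prorate an amount in cents over part of a period.
function calculateProration(amountCents: number, daysActive: number, daysInPeriod: number): number {
  return bankersRound((amountCents * daysActive) / daysInPeriod);
}

// Four ties whose exact sum is 8.
const halfCentAmounts = [0.5, 1.5, 2.5, 3.5];
// Math.round sends every tie upward: 1 + 2 + 3 + 4 = 10.
const sumHalfUp = halfCentAmounts.map((x) => Math.round(x)).reduce((a, b) => a + b, 0);
// bankersRound alternates: 0 + 2 + 2 + 4 = 8, matching the exact sum.
const sumHalfEven = halfCentAmounts.map((x) => bankersRound(x)).reduce((a, b) => a + b, 0);
```

A targeted regression test could then pin a case where the two rules disagree, for example that `calculateProration(5, 15, 30)` (an exact 2.5 cents) yields 2, where `Math.round` would have produced 3.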
Writing tests that are too coupled to implementation details. Characterization tests should test inputs and outputs, not internal structure. If your test asserts that a specific private function was called, or breaks because you renamed an internal variable, it's testing the wrong thing. The entire value of the safety net comes from tests that survive structural changes to the code. Test the observable behavior: given these inputs, the function returns these outputs. Nothing more.
Skipping edge cases in the characterization tests. The whole point is comprehensive coverage of current behavior. If you only test the happy path, you'll miss regressions in edge cases during refactoring — which is exactly where bugs hide. Force yourself to think about: empty inputs, zero quantities, negative values, boundary conditions in your discount tiers, customers with unusual tax configurations. The effort to find and test edge cases in PR 1 is the effort that saves you from a 2am incident in PR 3.
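One way to make edge-case coverage systematic is to sweep a grid of awkward inputs and record whatever the current code does with each, thrown errors included. The sketch below assumes a hypothetical `legacyLineTotal` with an invented discount rule; only the recording approach is the point.

```typescript
// Stand-in for a piece of legacy logic (hypothetical rule: 10% off at 100+ units).
function legacyLineTotal(unitPriceCents: number, quantity: number): number {
  if (!Number.isFinite(quantity)) throw new Error("quantity must be finite");
  const gross = unitPriceCents * quantity;
  return quantity >= 100 ? Math.round((gross * 90) / 100) : gross;
}

// An observation is either a returned value or the message of a thrown error.
type Outcome = { ok: true; value: number } | { ok: false; error: string };

function observe(fn: () => number): Outcome {
  try {
    return { ok: true, value: fn() };
  } catch (e) {
    return { ok: false, error: e instanceof Error ? e.message : String(e) };
  }
}

// Negative, zero, both sides of the tier boundary, and a non-number.
const edgeQuantities = [-1, 0, 1, 99, 100, 101, Number.NaN];

// Record current behavior for every edge input, without judging it.
function recordSnapshot(fn: (unitPriceCents: number, quantity: number) => number): Record<string, Outcome> {
  const snapshot: Record<string, Outcome> = {};
  for (const quantity of edgeQuantities) {
    snapshot[`quantity=${quantity}`] = observe(() => fn(1000, quantity));
  }
  return snapshot;
}

// Returns the keys whose recorded outcome differs between two snapshots.
function diffSnapshots(before: Record<string, Outcome>, after: Record<string, Outcome>): string[] {
  return Object.keys(before).filter((key) => JSON.stringify(before[key]) !== JSON.stringify(after[key]));
}

// Taken once against the untouched code, then compared after every change.
const golden = recordSnapshot(legacyLineTotal);
```

Note that the snapshot records the negative total for `quantity=-1` without complaint. Whether that is a feature or a bug is a question for later; for now the test only has to notice if it changes.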
Treating the characterization tests as permanent fixtures. Some of these tests document behavior that might be buggy. That's fine: they're fulfilling their purpose as a regression net during the refactor. But after the refactoring is complete, revisit the test suite critically: which tests document intended behavior, and which document bugs that should be fixed? The proration rounding in PR 3 is a good example: round-half-up was the current behavior, and the fix intentionally changed it. No characterization test happened to pin a half-cent amount, so nothing failed, but any test that had would have needed its expectation updated rather than the fix reverted. Review your characterization tests after each major change and either graduate them into permanent unit tests or update them to match the corrected behavior.