I've started a lot of refactors. Most of them were quietly a waste of everyone's time.
You know the type. You stare at some ugly code, feel a moral urge to fix it, spend two weeks making it "cleaner," and at the end the software does exactly what it did before, except now everyone has to relearn it. Nobody's life improved. That's most refactors.
But one of them genuinely, measurably paid off — and the difference between that one and all my vanity rewrites taught me when refactoring is actually worth doing. Here's the story and the rule I took from it.
The refactor that paid off worked because it was tied to a concrete pain that the code was actively causing, not to an aesthetic itch. Refactoring pays off when messy code is provably slowing down real work — every new feature in an area is brutal, bugs cluster there, people fear touching it. It fails when you're just making clean code cleaner for your own satisfaction. Refactor where it hurts, not where it's ugly.
Let me confess to the bad ones first, because they taught me the rule by contrast.
Early on, I refactored for aesthetics. I'd see code that offended my taste — inconsistent naming, a function I'd have structured differently, a pattern I didn't like — and I'd "fix" it. I felt productive. I was producing risk.
Because here's the thing about a refactor of working code: the upside is "it's nicer now" and the downside is "I introduced a bug into something that worked." That's a terrible trade when the code wasn't causing anyone real pain. I was spending genuine risk to buy myself a feeling of tidiness.
The tell, in hindsight, was always the same: I couldn't name the concrete problem the refactor solved. "It's cleaner" is not a problem being solved. It's a preference being indulged. Knowing the difference between cleaning code and creating value is, for me, a defining line in the brutal truth about becoming a senior developer.
A refactor that only makes code prettier spends real risk to buy a feeling. That's a bad trade almost every time.
Photo by Ilya Pavlov on Unsplash
The refactor that paid off didn't start with my taste. It started with a pattern I couldn't ignore.
We had one module — the part that handled how orders moved through their lifecycle — that had become a swamp. And it wasn't theoretically bad. It was measurably bad, in ways the whole team felt:
That's not an aesthetic complaint. That's a tax the business was paying, in slowness and bugs and fear, every single sprint. The code wasn't ugly — it was expensive.
That distinction is everything. Ugly code that's stable and untouched can stay ugly forever; it costs nothing. But this code was on the critical path of half our work, and its mess was directly converting into lost time and broken releases. That's a refactor worth the risk.
I'd learned not to disappear into a heroic two-week stealth rewrite. So instead of starting, I made the case with numbers.
I pulled the data: how long order-related tickets took versus comparable tickets elsewhere, how many bugs traced back to that module, how often it came up in retros. I turned "this code is bad" — an opinion — into "this code costs us roughly X every sprint" — an argument.
That changed the conversation entirely. Refactoring stopped being my personal crusade and became an obvious business decision. We weren't cleaning code for fun; we were removing a tax. Leadership agreed because the cost was visible, not because I find tidy code emotionally satisfying.
| Vanity refactor | The one that paid off |
|---|---|
| "This code is ugly" | "This code costs us X per sprint" |
| Solves my aesthetic itch | Solves a measured, shared pain |
| No way to prove it helped | Before/after numbers prove it |
| Done in stealth, all at once | Scoped, justified, incremental |
| Risk with no clear payoff | Risk with a clear, sized payoff |
The other thing that made it pay off: I didn't do the dangerous thing. No big-bang rewrite where you delete the old module and pray.
The order code had almost no tests, which is why it was so scary to change. So step one wasn't refactoring at all — it was wrapping the existing behavior in tests. I characterized what the messy code actually did, bug-for-bug, and pinned it down with tests. Only then did I have a safety net. This is exactly the payoff I describe in the testing habit I wish I'd started earlier — tests are what make change safe. Martin Fowler's writing on refactoring makes the same point: never restructure working code without a test harness pinning its behavior first.
Then I changed it in small, shippable steps:
Each step was small enough to review, ship, and reverse if needed. At no point was the codebase in a scary half-migrated state for weeks. This is the unglamorous discipline that separates a refactor that lands from one that becomes a six-month death march everyone resents.
// the order I worked in:
// 1. char tests pin the messy behavior (safety net first)
// 2. extract one tangled function (small, green, shipped)
// 3. simplify one state transition (small, green, shipped)
// ...never broken for more than an hour
Photo by Luke Chesser on Unsplash
Because I'd measured the before, I could prove the after. And it was real.
Order-related features stopped being three times slower; within a couple of months they were roughly in line with everything else. The bug cluster shrank noticeably. And the softer signal mattered too: people stopped dreading that area. The thing with a nickname became just… a normal part of the codebase.
That's how you know a refactor paid off — not "it feels cleaner" but "the thing it was costing us, it costs us less now, and here are the numbers." If you can't articulate that payoff before you start, you probably shouldn't start.
The contrast with my vanity refactors was total. Those, I could never have measured, because there was never a problem to measure in the first place. This one announced its own payoff because it started from a real, sized pain.
Q: How do I tell a "worth it" refactor from a vanity one? Ask: can I name the concrete pain this solves, and can I measure it? "Features here take 3x longer" is worth it. "I'd have written this differently" is vanity. If the only problem is that the code offends your taste, leave it alone and ship something users care about.
Q: Should I refactor ugly code that nobody touches? Almost never. Stable code that no one needs to change costs nothing, no matter how ugly. Refactoring it spends risk for zero benefit. Save your effort for the messy code that sits on the critical path of active work.
Q: Is a big rewrite ever the right call over incremental refactoring? Rarely, and only with eyes wide open. Big-bang rewrites are where you reintroduce every old bug and discover the original code was solving problems you forgot about. Incremental, test-backed refactoring is almost always safer and lets you stop the moment the payoff is captured.
Q: How do I get time for a refactor when product wants features? Frame it in their language: cost and velocity, not cleanliness. "This area is why feature X will take three weeks instead of one." Bring the data. A refactor sold as a tax cut gets approved; one sold as tidiness gets deferred forever.
Q: Can AI tools help with big refactors? They can take a lot of the grind out of it — generating characterization tests, suggesting safe extractions, and helping you understand tangled code before you touch it. Used carefully behind a test net, AI assistants make incremental refactoring faster and a bit less terrifying. They don't change the core rule: refactor where it hurts.
For years I refactored to satisfy myself, and most of it was risk I spent on a feeling. The one refactor that genuinely paid off was the one that started somewhere completely different: a real, measured pain the code was inflicting on the whole team.
Don't refactor code because it's ugly. Refactor it because it's expensive — and bring the receipts that prove it.
The skill isn't being able to clean code. Anyone can do that. The skill is knowing which mess is worth the risk of touching, doing it incrementally behind tests, and being able to show the payoff in numbers afterward.
If this stops you from spending real risk on one vanity rewrite, it earned its keep — bring numbers to your next refactor pitch, and the senior-developer pillar has more on judging which messes are worth touching.
So before your next big cleanup, ask the question that separated my one good refactor from all the wasteful ones: what is this mess actually costing us — and can I prove it?
No following, no network, no luck. Just an unglamorous system I ran for eighteen months. Here's exactly what I did.

I went from 200 to 11,000 subscribers without hiring anyone. AI didn't write my newsletter — it did everything around it.

I chased big, audacious goals for years and burned out every time. Then I built my whole life around wins so small they felt like cheating.

Comments
Sign in to join the conversation
No comments yet. Be the first to share your thoughts!