Instead of Rewriting, Refactor
Companies often get into a situation where their code is falling apart. They keep having outages. The product is slow. It can’t handle a heavy load. It takes forever to develop new features. Customers are unhappy. Engineers are unhappy. “Should we rewrite it from scratch?” companies ask themselves.
The answer is no. Rewrites take ages, during which the project may be canceled, leaving you neither here nor there. People can get fired for it. Even if a rewrite succeeds, it takes longer than estimated [1], and you don’t see the benefits of the investment till the end, when it’s ready to deploy. Teams sometimes end up doing parallel development on both the old and the new system, which is wasted effort. At times, the rewrite fizzles out, leaving a half-built system, and nobody is sure whether to work on the new or the old system. It is like a half-built tower that consumed millions of dollars of investment but isn’t generating value for anyone.
Even if a rewrite completes successfully, switching over from the old to the new system can be a project in itself, with its attendant risks like data migration and users rejecting the new system.
When you look at the big picture, rewriting is too risky to be a prudent business decision.
What, then, should you do?
Refactor it instead.
Refactoring means changing the internals of the system without changing how it behaves externally. It has the same features, same UX, and so on. When I get my car serviced, its internals have been cleaned and improved, and some parts replaced, but to me as a layman, the car works the same. It has similar comfort, features, performance and speed. Refactoring is like that.
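To make the definition concrete, here is a minimal Python sketch. The function names and the pricing rule are hypothetical, made up purely for illustration. The point is only that the old and the refactored versions return the same result for every input, which a quick check can confirm.

```python
# Hypothetical pricing rule, assumed for illustration only:
# orders over 100 get a 10% discount.
DISCOUNT_THRESHOLD = 100
DISCOUNT_RATE = 0.1


def invoice_total_old(items):
    # Original version: works, but is hard to read and to change.
    t = 0
    for i in range(len(items)):
        if items[i][1] > 0:
            t = t + items[i][0] * items[i][1]
    if t > 100:
        t = t - t * 0.1
    return t


def invoice_total_new(items):
    # Refactored version: clearer internals, same external behavior.
    subtotal = sum(price * qty for price, qty in items if qty > 0)
    if subtotal > DISCOUNT_THRESHOLD:
        return subtotal - subtotal * DISCOUNT_RATE
    return subtotal


# Each item is a (price, quantity) pair. Both versions must agree on all of them.
cases = [[], [(10, 2), (5, 0), (3, -1)], [(60, 2)], [(19.99, 3), (45.5, 2)]]
for case in cases:
    assert invoice_total_old(case) == invoice_total_new(case)
```

A check like this, run before and after each change, is what lets you keep changing the internals with confidence that users see no difference.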
People sometimes object to refactoring with “But the code is terrible and needs to be completely rewritten.” Even if that’s true, you should get to that outcome by refactoring one component after another [2]. If you have a car that needs to be completely replaced, instead of building a new car, replace its engine. Then its brakes. Then its body. You eventually have a new car, but you got there by changing one component after another, as opposed to building a separate car in parallel to the older one and, when it’s ready, abruptly switching to the new car.
Refactoring yields visible benefits [3] sooner than rewriting. You avoid maintaining two systems in parallel. You make progress every sprint, rather than getting bogged down in a month-long architecture debate. You stay grounded in code-level problems rather than getting carried away by flights of fancy, as a rewrite tends to tempt people to do. You launch code every week, which increases your confidence and exposes bugs sooner, rather than working in a vacuum for quarters and then launching, which is dangerous. When you refactor, since you’re making a bunch of small changes rather than one big change, the risk of failure is low.
So refactor, don’t rewrite [4].
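In code, replacing one component at a time might look like the following Python sketch. All class and method names here are hypothetical, invented for illustration. The assumption is that the component being replaced sits behind a small interface (here `save` and `load_all`), so the rest of the system doesn’t notice the swap.

```python
class LegacyStorage:
    """Old component: keeps records in a plain list."""

    def __init__(self):
        self._rows = []

    def save(self, record):
        self._rows.append(record)

    def load_all(self):
        return list(self._rows)


class NewStorage:
    """Replacement component: different internals, same two methods."""

    def __init__(self):
        self._by_seq = {}
        self._next_seq = 0

    def save(self, record):
        self._by_seq[self._next_seq] = dict(record)
        self._next_seq += 1

    def load_all(self):
        # Return records in the order they were saved, like the old component.
        return [self._by_seq[k] for k in sorted(self._by_seq)]


class OrderService:
    """The rest of the system; it only knows the save/load_all interface."""

    def __init__(self, storage):
        self._storage = storage

    def place(self, order_id, amount):
        self._storage.save({"id": order_id, "amount": amount})

    def total(self):
        return sum(r["amount"] for r in self._storage.load_all())


def run(storage):
    # The same scenario, run against whichever component is plugged in.
    service = OrderService(storage)
    service.place("a", 10)
    service.place("b", 25)
    return service.total()


# Swapping the component leaves the observable behavior unchanged.
assert run(LegacyStorage()) == run(NewStorage())
```

Once the new component is live and stable, you delete the old one and move on to the next part, so there is never a second system to maintain.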
[1] Partly because of irrational exuberance that we can do it quickly. If you could, why did it take so long to build the first version? Partly because you typically don’t know all the edge cases in the first version. And partly because all projects are behind schedule.
[2] High-level languages help, by limiting the number of things that can go wrong even if the code is badly written. Things like memory safety, garbage collection, and language guarantees that hold no matter what you do make it much more pleasant to work with bad Java code than bad C++ code. Put differently, Java code can’t get as bad as C++ code.
[3] Fewer outages, higher productivity, and so on: all the reasons you wanted to rewrite in the first place.
[4] An exception is if the entire tech is rotten, from the mobile apps to the web app to the backend to the DevOps.