When enterprises try to modify or upgrade legacy code, they face several barriers. One is that even a small change to an application’s code can have far-reaching, unexpected consequences. As a result, everyone is a little afraid of disturbing it.
This problem has grown steadily more pressing, for decades in some cases, and it would be a shame to let legacy systems keep holding back growth now that tools finally exist to modernise outdated applications. With recent advances in AI, the gridlock caused by large-scale legacy code can finally be broken.
In many large organisations, software development teams are up to date with the modern code they work on regularly, but they no longer have an accurate picture of the source code behind many of the applications the company relies on. This is how legacy code forms. Legacy code typically lacks documentation and is no longer actively developed, although it may still be patched from time to time. It can be written in any language, though COBOL and Java are common, and nobody is really sure how it works anymore. The general consensus is to leave it alone.
Since these applications don’t have tests, it can be extremely difficult for developers to see how different parts of the code are connected, or to anticipate the impact of a change or addition before it introduces a bug. It can be even harder to trace a bug back to its source and resolve it. As a result, applications built on legacy systems start to feel inaccessible, locked away by a fear of breaking something critical.
Legacy systems can create huge financial burdens for businesses, costing companies millions of dollars each year to work around, without accounting for the opportunity cost of not being able to adopt newer, more efficient tools. In some cases, legacy code is written in languages that most developers don’t know anymore, and editing it means organisations have to entice developers out of retirement to resolve critical issues. It’s not unusual for 80% of an annual IT budget to be allocated to maintaining core legacy applications.
Currently, these challenges are dealt with manually: slogging through the code, refactoring where possible, writing tests as needed and fixing bugs as they appear. This is slow work; in fact, the majority of developers spend less than half of each workday on active development. The rest of their time is taken up by maintenance activities like testing and debugging. This is a poor use of a highly skilled development team, and a lot less interesting than building new features, but somebody has to do it. Don’t they?
When systems are as massive and entwined as those found in enterprise codebases, it can be impossible for people to grasp the connections between them and untangle them into better code, especially given the knowledge that is lost when developers change roles or leave the company. But humans don’t necessarily need to be able to understand these connections.
AI technology applied to software can crawl an entire codebase in a tiny fraction of the time it would take a human, mapping the links between pieces of code as it goes. AI for code is a new category of AI that goes one step further and creates original code based on the code that already exists. This next-level AI has far-reaching applications and implications.
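To make "mapping the links between pieces of code" concrete, here is a minimal sketch of the underlying idea: given a map of which functions call which, work out everything that could be affected by changing one of them. It is a plain graph traversal, not an AI technique, and the call map and function names (`CALLS`, `print_invoice`, `apply_tax` and so on) are hypothetical, invented purely for illustration.

```python
from collections import deque

# Hypothetical call map for illustration: each function -> functions it calls.
CALLS = {
    "print_invoice": ["compute_total", "format_date"],
    "compute_total": ["apply_tax"],
    "monthly_report": ["compute_total"],
    "apply_tax": [],
    "format_date": [],
}


def callers_of(calls):
    """Invert the call map: each function -> set of functions that call it."""
    reverse = {name: set() for name in calls}
    for caller, callees in calls.items():
        for callee in callees:
            reverse.setdefault(callee, set()).add(caller)
    return reverse


def impacted_by(changed, calls):
    """Return every function that directly or indirectly calls `changed`."""
    reverse = callers_of(calls)
    seen = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for caller in reverse.get(current, ()):
            if caller not in seen:
                seen.add(caller)
                queue.append(caller)
    return seen


if __name__ == "__main__":
    # Which parts of the system might break if apply_tax changes?
    print(sorted(impacted_by("apply_tax", CALLS)))
```

In a real legacy system the call map is exactly the part nobody has, and it is far too large to assemble by hand, which is why automated analysis matters here.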
In the case that is perhaps most relevant to the challenges of legacy applications, AI for code can create comprehensive new tests for an entire legacy codebase. These tests make previously impenetrable parts of the company’s software accessible by alerting developers straight away to any issues or bugs their code changes may have caused. Even edge cases and corner cases that a human developer might miss can be spotted by AI.
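The safety net described above can be illustrated with a hand-written characterisation test: record what the legacy code does today, then check any change against that record. The sketch below is only an assumption about what such a test might look like, not the output of any particular tool. `legacy_shipping_fee`, `refactored_shipping_fee` and the helper functions are hypothetical names made up for this example.

```python
from itertools import product


def legacy_shipping_fee(weight_kg, express):
    # Hypothetical stand-in for undocumented legacy logic.
    fee = 5 + 2 * weight_kg
    if weight_kg > 20:
        fee += 10  # surcharge nobody remembers adding
    if express:
        fee = fee * 2
    return fee


def refactored_shipping_fee(weight_kg, express):
    # A "cleaned up" rewrite that accidentally drops the surcharge.
    fee = 5 + 2 * weight_kg
    return fee * 2 if express else fee


def record_behaviour(func, inputs):
    """Run func on every input and record what it currently returns."""
    return {args: func(*args) for args in inputs}


def find_regressions(func, baseline):
    """Return the inputs for which func no longer matches the baseline."""
    return [args for args, expected in baseline.items() if func(*args) != expected]


# Inputs chosen to straddle the weight boundary, for both delivery modes.
inputs = list(product([0, 1, 20, 21, 50], [False, True]))
baseline = record_behaviour(legacy_shipping_fee, inputs)
regressions = find_regressions(refactored_shipping_fee, baseline)

if __name__ == "__main__":
    for args in regressions:
        print("behaviour changed for input", args)
```

Here the refactored version passes for light parcels and fails only above the 20 kg boundary, the kind of edge case that is easy to miss when reading the code by eye. Picking the inputs is the hard part to do manually across a whole codebase, and that is the work the article proposes handing to AI.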
With this feedback, developers can resolve breaking changes at the earliest stages of the software delivery lifecycle, which can save as much as ten times the work it would take to implement fixes further down the line. Legacy code can be refactored and new features can be created without waking the legacy beast. Suddenly, code becomes modifiable again, without fear of negatively impacting mission-critical software.
It’s been a long time in the making, but the technology for AI that can write code has finally arrived. In the same way that automation has revolutionised software delivery, AI for code promises to drastically improve the way developers interact with legacy systems. It offers an efficient way—arguably, one of the only viable ways—to upgrade core business applications, speed up the software delivery lifecycle and remove the costs of legacy code.
Daniel Kroening, Co-founder & Chief Scientist at Diffblue. Professor of computer science at @UniofOxford. Advancing AI and the future of code.