“We need to stop everything!” shouts Dave, a fellow developer, during a meeting with the stakeholders of BigBuckEcommerce, the company you work for. “Our application is a legacy system. We need to rewrite the whole thing before it’s too late! It will explode! We can’t manage this beast any longer!”
If I had received a coin each time the term “legacy” was thrown into a discussion about software development, I would be writing this article from my personal mansion on a private island. We all use the term as if it were a widely known and accepted concept, and we use it as an argument to take action.
But when you ask developers, product managers, or CTOs to define this concept of “legacy”, the answer is pretty blurry. Everybody understands “legacy” differently, even if only slightly.
Using a concept which is not clearly defined can lead to miscommunication, wrong assumptions, and misconceptions. The Three Beasts of the Apocalypse™ I personally swore to eradicate from the surface of the Earth.
That’s why, today, I would like to try to define this idea of “legacy system”. More precisely, we’ll see:
- What “legacy” and “system” really mean.
- How to define legacy systems by their possible internal properties.
- How to define legacy systems by their outcomes.
- How to define legacy systems depending on their contexts.
Defining Legacy System
Let’s begin our descent into the Dungeon of the Legacy System™ with a question: where does this idea of legacy come from?
The Origins of the Legacy
Outside the glorious field of software development, the term legacy is seen as a positive one; it’s a gift. It can be the legacy of artists gifting their fantastic songs or wonderful paintings to the world. It can be the legacy of an ancestor giving you a good reputation, some fame, or a bunch of properties. Kevlin Henney defines it better than me:
A positive meaning, a gift of wealth from the past to the present for the future.
But in our little software development world, full of rainbows and roads made of chocolate cookies, the term “legacy” is negative. I did some research to find out who had the wonderful idea of taking a well-accepted concept and inverting its meaning, but I didn’t find anything. According to Wikipedia, it was “maybe in the 70s”. We’ll have to live with that until the mystery is solved.
The term “legacy” was first used for technologies which are not supported anymore. It includes hardware, software, and everything depending on them. Nowadays, the concept is more general: a legacy system is a “bad” system. It’s a system everybody wants to fix, or to get rid of.
Before trying to define the term a bit more precisely, let’s first see what the “system” part of the concept means.
What’s a System?
First things first: when I speak about legacy systems or software systems in this article, I speak about systems in a very broad sense. A software system includes anything running on a computer: software, OS, web applications, and whatnot.
The concept of “system” is a very interesting one. According to the Oxford dictionary, a system can be defined as:
A set of things working together as parts of a mechanism or an interconnecting network; a complex whole.
There are two main ideas here: “a set of things” and “working together”. A system is indeed made of two sets:
- A set of “objects” in a broad sense.
- A set of relationships between these objects.
There is also this idea of “complex” in the definition: we’ll come to that shortly.
Your codebase can be a system of variables, functions, packages, classes, all more or less interconnected. Your application can be a system of different modules, services, microservices, databases, and so on. If we see a piece of software as a succession of layers of abstraction, each layer can be considered a system and the layer below as a subsystem.
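To make the two sets concrete, here is a minimal sketch in Python. The `System` class and the service names are hypothetical, chosen only for illustration; the point is simply that a system can be written down as a set of objects plus a set of relationships between them.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class System:
    """A system: a set of objects plus a set of relationships between them."""

    objects: set[str] = field(default_factory=set)
    relationships: set[tuple[str, str]] = field(default_factory=set)

    def relate(self, source: str, target: str) -> None:
        """Record that `source` depends on (or talks to) `target`."""
        self.objects.update({source, target})
        self.relationships.add((source, target))

    def neighbours(self, name: str) -> set[str]:
        """Every object directly connected to `name`, in either direction."""
        outgoing = {t for s, t in self.relationships if s == name}
        incoming = {s for s, t in self.relationships if t == name}
        return outgoing | incoming


# Hypothetical application seen as a system of services and databases.
app = System()
app.relate("checkout", "payment")
app.relate("checkout", "orders_db")
app.relate("checkout", "catalog")
app.relate("catalog", "products_db")

print(len(app.objects), "objects,", len(app.relationships), "relationships")
print(sorted(app.neighbours("checkout")))
```

The same structure could describe functions calling each other, or teams talking to each other; only the level of abstraction changes.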
When we code, we design a system. When we draw some architecture, we design a system. When we have an idea for a piece of software, we begin to think in terms of functionalities and their relationships; we design a system.
What we call legacy code or legacy software are all legacy systems. The fact that they sit at different levels of abstraction makes them different, and more or less useful to look at, depending on your needs. That’s what a system is, too: a way to look at the world. A point of view, useful to define and to look at, for fulfilling your goals.
Now that we know what a system is, let’s ask the question: how do we know that a given system is a legacy system?
And the System Became Legacy
Spoiler ahead: there is no consensus about the boundary between a system considered “healthy” and one considered “legacy”. That said, I have seen many supposed legacy systems in my career, and heard many people speak about them. Here’s my experience.
Any System Is a Liability
Some developers, me included, consider every piece of code as a liability. Each line needs to be understood and maintained. Each line makes the system bigger. Each block can be extended with more code. This code can be a source of bugs and, as an indirect result, of wasted time and money.
This doesn’t only apply to code. Each component of a software system could be seen as a liability. Having 10 microservices is better than having 1000, because there are fewer parts to maintain. Developers can’t add a new API, or anything else, around a part which doesn’t exist.
The problem with this definition is that the term “legacy” loses its meaning. If the entire software system is legacy, why do we call it legacy in the first place? Let’s call it “system” and we’re done. It’s a bit like highlighting every single word in a document; what’s the point of highlighting them in the first place?
Don’t get me wrong: it is true that a system is a liability. It’s always better to have less of it (less code, fewer microservices, fewer packages) as long as it does what we want it to do (or close enough). But a legacy system should be something more precise than that.
Systems Difficult to Understand
Let’s try to increase the precision of “legacy” as we move forward. A legacy system can be defined as a system difficult to understand. But difficult to understand for whom? The whole population? Every developer? Some developers? The stakeholders? The users?
Let’s say that you’re a designer of the system: you’re a developer, an architect, or a product owner coming up with new functionalities (and, therefore, contributing to the shape of the application). A system difficult to understand is rarely a system you’ve just designed. It’s something you’ve done in the past (designed by your “old self”, different from the present one) or designed by somebody else.
After all, we all have assumptions, beliefs, biases, and misconceptions at different points in time. Our knowledge is different, too. This baggage will influence the way we make decisions while designing our system. As a result, a system you didn’t design will always be harder to understand.
The creators themselves can be in your immediate surroundings (your company or your team, for example) or outside. If they’re in your company or, even better, in your team, understanding the system is easier: you can ask them directly what they tried to convey. You can even work with them to explore and try to improve the system.
If they are external to your organization, it becomes more difficult (or downright impossible) to ask them what they tried to express with their design.
This problem of expressiveness when designing a system can be fixed to some degree with standardization. As I was saying, we’re all different; standardization is an attempt to mitigate these differences by using common design rules. To take the most boring example: agreeing to use tabs instead of spaces in every codebase of the organization.
Most of the tools we use to design software systems (including programming languages) are made to adapt to the business knowledge we codify in our systems. These tools already provide some sort of standardization, but often a very loose one. After all, they’re meant to be as flexible as possible, to design any type of software.
All we know for sure about a system built with these tools is that it does some computation. That’s important to know, but it doesn’t help much to understand what the system really does.
Can we define a legacy system as a system difficult to understand? It’s definitely one of its properties, but there can be more.
Systems Difficult to Change
A legacy system could also be defined as a system difficult to change and maintain. Often, it will come from two sets of difficulties:
- The system is considered hard because of a lack of knowledge from the ones who try to change or maintain it.
- The system is considered complex because it has too many elements with too many entwined relationships.
I’ve already written about these two notions in my article about complexity and the KISS principle. The first case is actually the easier to solve: if you don’t know something when you look at a system, you can learn it.
The second case is harder to deal with. Our brain power is limited; we can’t think about too many interconnected elements at once. We all have the same brain (more or less), so we’ll all struggle equally when we need to fit the system into our head to scale it or maintain it. Said differently, we’ll have difficulty creating a mental model of a complex system.
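A rough way to see why the second case hurts: the number of possible relationships grows much faster than the number of elements. The sketch below only counts undirected pairs, which is an assumption for illustration; real systems rarely connect everything to everything, but the upper bound shows how quickly a mental model can be overwhelmed. The function name is hypothetical.

```python
def max_relationships(elements: int) -> int:
    """Maximum number of undirected pairwise links between `elements` objects."""
    return elements * (elements - 1) // 2


for n in (5, 10, 50, 100):
    print(f"{n:>3} elements -> up to {max_relationships(n)} relationships")
```

Going from 10 to 100 elements multiplies the parts by 10, but the possible relationships by more than 100.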
Since we speak about “legacy systems” as a general idea most of the time (a legacy system stays legacy whoever is involved in its creation, maintenance, and expansion), we could see the term “legacy” as a synonym of “complex”. A system very often gets bigger and more complex over time, an idea I’ve already covered in my other article about software entropy.
I like the definition of a legacy system as a system difficult to understand, change, and maintain. Is that all?
System Quality
A legacy system could also be seen as a system of poor quality. It raises the question: how do we define quality?
According to the fantastic book Accelerate, the quality of a system is high if it has:
- A low change fail rate.
- A low percentage of time spent on rework.
- A low percentage of time spent on bugs.
- A positive perception among the developers working on the app.
Trying to measure the first three properties can give you a good idea of the level of entropy of a system. It will show you if you need to act now to manage its complexity.
We have a tendency to discard things we believe are hard or impossible to measure, but the fourth point shouldn’t be discarded! If more and more developers complain about a system, it’s a good sign that something’s wrong.
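The first three measures are simple ratios, so they can be tracked with very little tooling. Here is a minimal sketch; the `Period` fields, the function names, and the numbers are all assumptions made up for illustration, not definitions taken from the book.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Period:
    """Hypothetical data collected over one reporting period."""

    deployments: int         # changes shipped to production
    failed_deployments: int  # changes that caused a failure or a rollback
    hours_total: float       # total engineering hours
    hours_rework: float      # hours spent redoing earlier work
    hours_bugs: float        # hours spent fixing bugs


def ratio(part: float, whole: float) -> float:
    """Return part / whole, or 0.0 when there is nothing to measure."""
    return part / whole if whole else 0.0


def quality_report(p: Period) -> dict[str, float]:
    """Compute the three ratios; lower is better for each of them."""
    return {
        "change_fail_rate": ratio(p.failed_deployments, p.deployments),
        "rework_share": ratio(p.hours_rework, p.hours_total),
        "bug_share": ratio(p.hours_bugs, p.hours_total),
    }


# Made-up numbers, for illustration only.
quarter = Period(deployments=40, failed_deployments=6,
                 hours_total=1600, hours_rework=240, hours_bugs=320)

for name, value in quality_report(quarter).items():
    print(f"{name}: {value:.0%}")
```

A single value says little; the trend from one period to the next is more likely to tell you whether complexity is getting out of hand.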
Until now, we tried to define a legacy system by looking inside the system and seeing what’s wrong. What if we try to define “legacy” by looking outside? Could we consider a system as “legacy” according to its impact?
Business Impact
More often than not, the goal of a piece of software is to serve a business and provide valuable outcomes to its users. What’s inside a system is rarely the point: a user doesn’t care what technology or what good principles the designers used.
Present Outcomes
Software should deliver what its users expect, as fast as possible, and without bugs. A legacy system being a bad system, it could be defined as a system which:
- Doesn’t deliver what its users expect.
- Is too slow.
- Has too many bugs.
Remember: these are the conclusions of the people who use the system, not of the ones who design it. Negative outcomes are likely to increase customer churn and to prevent new customers from onboarding.
Measuring the consequences of the negative outcomes for the users and, as a result, for the organization (often in money and time) is crucial.
Future Outcomes
It’s difficult to foresee the future, but prevention is better than cure. Managing the complexity of a system before it impacts the business negatively is always the best solution. Analyzing software systems requires some experience, and some tools which can point out where the complexity should be attacked first.
Preventing a system from growing in complexity while keeping it flexible enough to adapt to fast-changing markets is a delicate balance. On one side, if everything is made extremely flexible, the complexity is likely to increase. On the other, if nothing is flexible, the software can’t change when the business needs to change.
Trying to foresee the future will always be a guess. But a guess has more or less uncertainty; the goal is to reduce this uncertainty. Ask around, try to find out if you really need to abstract this or that. Abstraction often creates indirection, which can result in unbearable complexity.
Legacy Systems and People Systems
Complex, difficult to understand systems with poor outcomes are not created on purpose. Nobody wants to do a bad job on purpose. I have seen, however, people who weren’t motivated enough, who didn’t see any incentive to do their best work, or who didn’t fit the organization due to a poor hiring process. The company’s vision, mission, and culture can also heavily influence the design of an application.
An organization is also a system. People and processes can have complex relationships. This system changes and evolves over time, too.
Because we’re coding the business knowledge into our applications, we should consider the organization we work with as the primary system. The software systems would be its subsystems (or embedded systems).
If the managers of a company don’t know what outcomes they want, how to organize the teams, or what part of the market they want to address, the software itself will get more complex, difficult to understand, and buggy. In one word: legacy.
What about the relationships between the different parts of the organization? If the communication is not flowing as it should, the design of the applications will be affected, too. After all, the specifications result from this communication.
We could see communication in an organization as an ordered hierarchy following the organization chart. Let’s take the example of the famous company BigBuckEcommerce. The different arrows represent the communication flowing between the different parts of the system:
But, in practice, the communication flow is rarely ordered: information is exchanged during coffee breaks, on a chat, or by email. In reality, the communication looks like this:
There’s an interesting question we could ask here: does the way an organization communicates influence the design of its software systems? As Conway showed in his paper (which later became Conway’s Law, a name coined in The Mythical Man-Month), the answer is a big yes.
It’s even more visible in IT: if different teams are responsible for different parts of the codebase, the communication between them will affect its design. For example, you’ll see hacks and shortcuts growing in your codebase if two teams have difficulty agreeing with each other.
It doesn’t only concern the technical side, however. There are many people involved in building an application, and the communication between them can also affect the design of the final system.
Look at the organization you’re working with, the way it is structured, who communicates with whom and in what ways, and look at the design of your applications. You might find some interesting parallels revealing possible problems in the organization itself, which directly affect the systems you’re designing. Without solving the problem in the organization, it’s often impossible to solve the problem in the software systems themselves.
That’s why it’s important for us, developers, to find companies which are not legacy systems themselves.
What is a Legacy Software System?
In the end, drawing the boundary between a “legacy” and a “healthy” system depends on three main things:
- The internal complexity of the system.
- The external outcomes of the system.
- The complexity of the systems around, including the organization itself (the context).
These three components form a multiplicative system: if one of them tends to 0, the result tends to 0, too. One can’t compensate for another: all three need to be good enough for the application to be successful and to be able to evolve over time.
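Here is a small sketch of what “multiplicative” means in practice. It assumes, purely for illustration, that each of the three components is given a score between 0 and 1 where higher means healthier (so a very complex system gets a low internal score). The function names and the scores are hypothetical.

```python
def health(internal: float, outcomes: float, context: float) -> float:
    """Multiply three scores in [0, 1]; higher means healthier."""
    for score in (internal, outcomes, context):
        if not 0.0 <= score <= 1.0:
            raise ValueError("scores must be between 0 and 1")
    return internal * outcomes * context


def average(internal: float, outcomes: float, context: float) -> float:
    """Additive alternative, shown only for comparison."""
    return (internal + outcomes + context) / 3


balanced = (0.7, 0.7, 0.7)   # decent everywhere
lopsided = (1.0, 1.0, 0.1)   # great code and outcomes, broken organization

for name, scores in (("balanced", balanced), ("lopsided", lopsided)):
    print(f"{name}: product={health(*scores):.2f}, average={average(*scores):.2f}")
```

Both examples have the same average, but the product of the lopsided one collapses: an excellent codebase can’t make up for a context close to 0.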
Because a system design is heavily influenced by its context, the boundary between a “healthy” and a “legacy” system should be decided by the designers, the stakeholders, and indirectly by the users of the system. The definition of a legacy system shouldn’t be general to all organizations.
If we really want a general definition of “legacy system”, I would come back to its first meaning: a system which is not supported anymore. It would already avoid much confusion and many subjective judgements.
A final word: it’s easy to label a system as “legacy” because it doesn’t follow the rules you take for granted, or because it doesn’t satisfy your sense of aesthetics. But the definition of a legacy system should be less subjective. Attaching metrics to the three properties above can help in this quest for objectivity.
We’ll see in further articles how to measure the “legacy” of a system, and what to do about it in different scenarios.