Evolution of Software Applications

If you develop software long enough, you notice patterns. One pattern that isn’t talked about enough is how systems evolve over time.

The software industry is so focused on the flavor of the week that we lose perspective. Most of what is “invented” today was created decades ago. Most problems we face today were solved by someone else.

Software developers don’t have a good understanding of our own history.

In the spirit of that, I present to you my take on how software tends to evolve and why.

Overview

Before starting, I must define a term - Software Gravity.

Software Gravity - the force that pulls features, complexity, and resources towards a software system over time.

I believe software gravity is the driving force behind software evolution.

Software tends to grow in complexity over time. Feature requests and user expectations create gravity around software. That gravity pulls complexity towards software. Complexity requires resources.

Katamari Damacy Effect

I call this the Katamari Damacy Effect. Like in the video game, features roll into an ever growing ball of complexity.

What starts small and simple inevitably grows into a giant ball of stuff. Eventually your ball of features (…err software) might be mistaken for a moon.

Periodically, the ball is too complex to work with and is restructured to fit the available resources.

Because of this, software tends to evolve in a somewhat predictable manner.

Stages of Software Evolution

Stage 0: Humans, Paper, and Spreadsheets
Stage 1: Simple Script
Stage 2: Pile Of Files
Stage 3: The Framework
Stage 4: Beyond The Framework
Stage 5: Modularization
Stage 6: Network System

All software exists somewhere on this spectrum. There is a natural progression between stages to deal with the Katamari Damacy Effect. As systems grow in complexity, different approaches are needed to solve problems.

Complexity determines what stage of evolution your system is in. The stage of evolution determines what your code and team looks like. It’s not the other way around.

The one recurring theme in this process is divide and conquer. As complexity increases, the most effective approach is to divide the problem into smaller parts.

Often I see developers arguing for one tool as the best solution to all problems. This has more to do with that developer’s experience with a certain level of application complexity than the needs of your specific application.

There is no one perfect language, tool, database, or framework for every system. As requirements change, your approach will change. Otherwise, you waste time and money.

Stage 0: Humans, Paper, and Spreadsheets

Software doesn’t start with software. Most software exists to automate an existing process or to efficiently communicate information.

Software starts with people solving a problem using some combination of paper, Excel files, and other means of communication. That is what I would call a Stage 0 system.

For example, double entry accounting started as a paper based process. Accountants recorded every transaction twice, as a debit in one account and a matching credit in another, so that errors showed up when the totals did not balance. That is why it’s called double entry accounting.

Software took that and moved it into spreadsheets, databases, and online transaction processing. Fundamentally, it is the same thing. Digital accounting systems are faster or cheaper, but they still give you the same end result.
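To make the "same thing, different medium" point concrete, here is a rough sketch of a paper ledger turned into code. It assumes the usual definition of double entry, where each transaction is posted as a debit to one account and an equal credit to another. The Ledger class and its method names are hypothetical, made up for this illustration.

```python
from collections import defaultdict
from decimal import Decimal


class Ledger:
    """Hypothetical minimal double entry ledger, for illustration only."""

    def __init__(self):
        # Running debit and credit totals per account name.
        self.debits = defaultdict(Decimal)
        self.credits = defaultdict(Decimal)

    def post(self, debit_account, credit_account, amount):
        """Record one transaction as a debit and an equal credit."""
        amount = Decimal(amount)
        if amount <= 0:
            raise ValueError("amount must be positive")
        self.debits[debit_account] += amount
        self.credits[credit_account] += amount

    def is_balanced(self):
        """Total debits must equal total credits, as on paper."""
        return sum(self.debits.values()) == sum(self.credits.values())

    def balance(self, account):
        """Debits minus credits for a single account."""
        return self.debits[account] - self.credits[account]
```

Posting 100.00 from "sales" to "cash" and then 40.00 from "cash" to "rent" leaves a cash balance of 60.00, and is_balanced() stays true the whole time. The check is the same one an accountant would do by hand.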

No matter your system complexity, you are solving the same fundamental problems.

In many cases, a Stage 0 system can be more desirable than a complex software system.

For example, there are a million todo list apps out there. For me, nothing beats a simple pocket notebook and pen for my daily goals.

Software must offer a significant communication or automation benefit over non-software solutions. Otherwise, software has no reason to exist.

We would all be wise to remember we are in the business of improving communication and automation. We are not in the business of writing code.

Stage 1: Simple Script

Given an existing physical system in Stage 0, eventually it gets turned into software. It starts with a simple script.

A simple script is exactly what it sounds like. It starts as a single file written in a scripting language. This could be PHP, Ruby, Python, Perl, or Bash.

It doesn’t matter.

What defines the simple script is that it’s a single file, has a single purpose, and has very little functionality. It isn’t meant to be a public facing product.

It’s something to solve the problem at hand.

The simple script is created and maintained by a single developer. The code style is dictated by the experiences and personality of the developer. Each script is different, but it should be easy to understand.

A quick demo or proof of concept application is usually Stage 1.

In Stage 1 you can keep the whole application in your head. It should be easy to understand and debug.
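As an illustration, a Stage 1 script might look something like the sketch below: one file, one purpose, small enough to read in a minute. The task (totaling an "amount" column in a CSV file) and the function name are hypothetical, picked only to show the shape.

```python
#!/usr/bin/env python3
"""Total the "amount" column of a CSV file. A hypothetical Stage 1 script."""
import csv
import sys
from decimal import Decimal


def total_amounts(lines):
    """Sum the "amount" column from an iterable of CSV lines with a header row."""
    reader = csv.DictReader(lines)
    return sum((Decimal(row["amount"]) for row in reader), Decimal("0"))


# Run as: python total.py expenses.csv
if __name__ == "__main__" and len(sys.argv) == 2:
    with open(sys.argv[1], newline="") as handle:
        print(total_amounts(handle))
```

There is no configuration, no error handling to speak of, and no tests. At this stage that is fine, because the whole thing fits in your head.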

A useful script will get users and feature requests. So begins the Katamari Damacy Effect. Your ball of software starts rolling around, gaining more features.

Feature complexity is the first type of complexity caused by software gravity.

When your simple script becomes too complex, you break it into multiple files. Thus begins Stage 2 - The Pile Of Files.

The ball of software keeps growing.

Stage 2: Pile Of Files

In Stage 1, your software served a single purpose. Over time, software gravity pulls in more features.

As features pile up, the simple script doesn’t cut it. It must be broken up into a pile of files. That is Stage 2.

We break code into files to make it easier to understand. Most people can’t keep a 5,000 line file in their head. But they can understand ten different 500 line files.

Divide and conquer.

A pile of files can be handled by a single programmer for a while. If the software is useful enough, it grows in feature complexity beyond the scope of a single developer.

At that point, a team forms around the system. It begins with “full stack” developers, but specialization happens as the team grows.

In the beginning, all the roles are filled by one developer. Eventually each role is passed off to specialists. Each role has a different title - designer, backend developer, frontend developer, project manager and so on.

Team complexity is the second type of complexity caused by software gravity. It begins in Stage 2. As feature complexity grows, it creates team complexity.

The problems of team complexity are seen in communication processes. Ideas like Agile Software Development were born out of the need to solve communication problems related to team complexity.

Stage 2 apps are designed around a library based approach to tooling. Each system will have a familiar, but slightly different combination of libraries.

Over time common patterns tend to emerge. The system will usually have database objects, system flow control, and a UI view/layout system.

You end up with a homegrown version of something resembling a framework. This is common and becomes apparent as software grows.
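Here is a rough sketch of what that homegrown structure tends to look like. It is squeezed into one listing, with comments marking where the file boundaries would usually fall. The file names, the Task model, and the handle function are all hypothetical, and an in-memory list stands in for a real database.

```python
from dataclasses import dataclass

# --- models.py: database objects (a list stands in for a real database) ---

@dataclass
class Task:
    title: str
    done: bool = False


TASKS = []


def add_task(title):
    task = Task(title)
    TASKS.append(task)
    return task


# --- views.py: the UI view/layout layer ---

def render_tasks(tasks):
    """Render tasks as plain text, one per line."""
    return "\n".join(f"[{'x' if t.done else ' '}] {t.title}" for t in tasks)


# --- app.py: system flow control ---

def handle(command, argument=""):
    """Dispatch a command, then return the rendered task list."""
    if command == "add":
        add_task(argument)
    elif command == "done":
        for task in TASKS:
            if task.title == argument:
                task.done = True
    elif command != "list":
        raise ValueError(f"unknown command: {command}")
    return render_tasks(TASKS)
```

Calling handle("add", "write report") returns "[ ] write report", and handle("done", "write report") then returns "[x] write report". Models, views, and flow control are already there in miniature, which is most of what a framework would hand you.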

As feature complexity grows, developers are less interested in reinventing the wheel and more interested in solving application specific problems.

As team complexity grows, low friction communication demands a common language and toolset to solve problems.

When feature complexity and team complexity become more significant than can be handled with a pile of files, a system will move to Stage 3 and will adopt a framework.

The ball of software keeps growing.

Stage 3: The Framework

Frameworks exist to solve common problems and make communication easier.

A framework is a set of conventions and libraries that work together to solve common problems. There is nothing special about a framework.

It’s a tool to mitigate feature and team complexity. By solving common problems with common patterns, you move much faster as a team.

Frameworks make hiring easier. Stage 3 projects hire around framework based skillsets. Most professional software development happens in Stage 3.

A project being built with a framework doesn’t mean it is a Stage 3 system. I see Stage 1 or 2 projects start with frameworks frequently.

Professional developers are paid to work on Stage 3+ systems. Most are familiar with a particular framework. Thus, they use a framework when it’s not needed.

People tend to reach for what is familiar more often than what is most appropriate.

As always, team and feature complexity push your software into an ever larger system.

Late in Stage 3, you might run into a third kind of complexity - data complexity.

Data complexity is the third type of complexity caused by software gravity. As feature complexity grows, it creates data complexity.

The problems of data complexity are seen in structure, quantity, and usage of data in your system.

Complex features can imply complex data structures beyond what your framework is designed for. You see this in complex database models that require complex or slow queries to retrieve data.

A large quantity of data is equally problematic. Every bit of data you store makes your system slower and harder to manage. Eventually you end up with problems we now call “big data”.

Data usage is a third kind of data complexity. High traffic or complex reporting needs necessitate separate caching or reporting infrastructure. Many frameworks aren’t built with this in mind.

As complexity of a system increases, you bump up against the limits of a framework. When that happens, a project will start to move beyond the framework and into Stage 4.

The ball of software keeps growing.

Stage 4: Beyond The Framework

Every framework has limits. There is no framework that will solve all of your problems.

At some point, your feature, team, and data complexity will push you beyond the abilities of your framework.

You don’t change frameworks in Stage 4.

It is tempting to consider changing frameworks. But changing frameworks is trading one kind of complexity for another.

Stage 4 is murky. It is hard to see when you transition into Stage 4. Here are a few hallmarks of Stage 4.

Teams will start “creating” or discovering new patterns to adopt that don’t come with the framework. This is a natural evolution that must take place on both the front and back ends of the system.

When the UI is complex enough, a front end framework or separate client applications will emerge. Patterns your framework doesn’t include like Presenters or MVVM are discovered in this stage.

On the back end, complexity pushes teams towards an internal service object pattern, data model decorators, and multiple data systems.
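For readers who haven’t met these back end patterns by name, here is a minimal sketch of a service object and a model decorator. Every name in it (Invoice, PayInvoice, InvoicePresenter, FakeGateway) is hypothetical, and the payment gateway is a stand-in rather than any real library.

```python
from dataclasses import dataclass


@dataclass
class Invoice:
    customer: str
    amount_cents: int
    paid: bool = False


class PayInvoice:
    """Service object: one business operation, kept out of the model and controller."""

    def __init__(self, gateway):
        self.gateway = gateway  # assumed: any object with charge(customer, cents)

    def call(self, invoice):
        if invoice.paid:
            return False  # nothing to do, never charge twice
        self.gateway.charge(invoice.customer, invoice.amount_cents)
        invoice.paid = True
        return True


class InvoicePresenter:
    """Decorator: wraps a model and adds display logic without changing it."""

    def __init__(self, invoice):
        self._invoice = invoice

    def __getattr__(self, name):
        # Anything the presenter does not define falls through to the model.
        return getattr(self._invoice, name)

    @property
    def amount(self):
        return f"${self._invoice.amount_cents / 100:.2f}"


class FakeGateway:
    """Stand-in payment gateway that only records charges."""

    def __init__(self):
        self.charges = []

    def charge(self, customer, cents):
        self.charges.append((customer, cents))
```

Neither class needs anything from the framework. That is the point: the team is adding its own conventions on top of the ones the framework shipped with.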

Another “tell” of being in Stage 4 is if you have senior developers worrying about architecture or reading Martin Fowler’s Patterns of Enterprise Application Architecture book.

In many ways, Stage 4 and Stage 2 are similar. There are fewer conventions than in Stage 3. A project can stay in Stage 4 for a long time.

In Stage 4, you will experience another kind of complexity - operational complexity.

Operational complexity is the fourth type of complexity caused by software gravity. As overall complexity grows, it creates operational complexity.

Operational complexity comes in the form of infrastructure and support complexity. What once ran on one server now requires dozens. Managing multiple data systems, backups, security, updates, etc. becomes a project unto itself.

This is where DevOps comes from.

Stage 4 reaches a limit when complexity overwhelms the ability of any individual to meaningfully impact the system. At that point, the cost to add features and fix bugs by any single developer is higher than the value of the features added.

Many people refer to these systems as “A Big Ball Of Mud”.

Once the team realizes the quagmire they are in, they decide to solve the problem in one of two ways. They either rewrite the system, or they cut it into smaller pieces.

The Big Rewrite is a classic blunder. It’s failed many times. Don’t do it.

Cutting the system into smaller pieces is a better solution. This is the Stage 5 approach - Modularization.

The ball of software keeps growing.

Stage 5: Modularization

When a system is large enough that it can’t be reasoned about on a whiteboard, it will be divided into smaller pieces. We use the fancy word modularization for this, but there is nothing special about it.

It’s divide and conquer.

There are natural lines in software around which you can draw boundaries. There are two ways to cut up a system. You will see teams using both.

First, you can divide around distinct functionality. That might be a set of features like a reporting system, a communication system, a document sharing system, and so on.

By the time a project reaches Stage 5, there are often a dozen or more distinct functional “modules” in the system.

Second, you can divide around shared infrastructure. That might be user authentication, file storage, image processing, sending email, queueing, and so on. If you look at cloud providers like AWS, you will see a large system divided around shared infrastructure needs.

In Stage 5, you see teams of 5-10 people working on separate modules. You might have a public API team, individual module teams, mobile app teams, and so on.

The software will grow by adding teams and dividing the system into modules that can be handled by teams of 5-10 people.

Boundaries allow teams to work independently and communicate with other modules via agreed upon communication protocols.

In the beginning, a modularized system will live in the same codebase with a simple folder structure to define boundaries. Eventually, it becomes necessary to define stronger boundaries between modules.
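While the modules still share a codebase, a boundary can be as light as an agreed interface. The sketch below uses Python’s typing.Protocol to show one way that could look. The module and class names (Mailer, ReportingModule, InMemoryMailer) are hypothetical.

```python
from typing import Protocol


class Mailer(Protocol):
    """The agreed boundary: all the reporting team knows about email."""

    def send(self, to: str, subject: str, body: str) -> None: ...


class ReportingModule:
    """Owned by one team. Depends on the Mailer contract, not on its internals."""

    def __init__(self, mailer: Mailer):
        self._mailer = mailer

    def email_summary(self, to, totals):
        # One "name: value" line per total, sorted by name.
        body = "\n".join(f"{name}: {value}" for name, value in sorted(totals.items()))
        self._mailer.send(to, "Weekly summary", body)


class InMemoryMailer:
    """One implementation of the contract. A real one would live in the email module."""

    def __init__(self):
        self.outbox = []

    def send(self, to, subject, body):
        self.outbox.append((to, subject, body))
```

The email team can rewrite everything behind send without the reporting team noticing. When the modules later split into separate systems, the same contract becomes a network call.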

The natural evolution of module boundaries is to physically separate systems. This is a very real manifestation of Conway’s law.

As modules separate into distinct systems, your system will move to Stage 6 - a network system.

The ball of software keeps growing.

Stage 6: Network System

A networked system is a series of smaller systems that communicate using common protocols over a network.

Networks glue smaller systems together to achieve a larger objective. Given enough complexity, all systems become a collection of smaller systems collaborating toward that objective.

In Stage 6, you must understand the rules around building network systems.

A good network system is defined by common standards, good documentation, and ease of use. If you’ve ever worked with a public API, you understand how vital those things are.

In Stage 6, each system must be treated like a public API.

As average system complexity rises, network systems become popular. We call them Service Oriented Architecture in the large, or Microservices in the small.

It’s two approaches to the same problem. Both create a series of smaller systems that talk to each other over the network.

Successful network system design and architecture is less about data modeling and more about designing communication protocols between components.

Language and platform become heterogeneous in Stage 6. It doesn’t matter to the overall system what language an individual system is using.

The protocol between the systems is what matters.
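As a small sketch of what "the protocol is what matters" can mean in practice, here is a versioned JSON message envelope with validation on the receiving side. The envelope fields (version, type, payload) and the function names are assumptions invented for this example, not any standard.

```python
import json

# Hypothetical envelope every service agrees on: field name -> expected type.
REQUIRED_FIELDS = {"version": int, "type": str, "payload": dict}


def encode_message(msg_type, payload, version=1):
    """Serialize a message into the agreed envelope."""
    envelope = {"version": version, "type": msg_type, "payload": payload}
    return json.dumps(envelope, sort_keys=True)


def decode_message(raw):
    """Parse a message and reject anything that breaks the contract."""
    message = json.loads(raw)
    if not isinstance(message, dict):
        raise ValueError("message must be a JSON object")
    for field, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(message.get(field), expected_type):
            raise ValueError(f"missing or invalid field: {field}")
    return message
```

The sender could be written in Go and the receiver in Ruby. As long as both sides honor the envelope, neither has to care.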

A good way to imagine a network system is a series of Stage 1-5 systems working together over a network.

A Stage 6 system will look like dozens of Stage 2 or 3 systems working together as an orchestra. If you look at the systems of companies like Google, Facebook, Microsoft, etc. you will find a very similar design.

The ball of software keeps growing.

Is There A Stage 7?

I haven’t seen a meaningful evolution beyond network systems. The basic pattern of larger systems being a networked collaboration of smaller programs seems to hold.

In the large, you have the internet. It is the very definition of a network system.

In the small, you have operating systems composed of various services or even computer hardware which is composed of smaller processing systems that communicate over a message bus.

If there is something beyond the network, my guess would be a higher level abstraction around networks that allows for the composition of larger systems.

If the network systems of today are like assembly language or machine code, Stage 7 is like a higher level language on top of that.

I don’t think we’ll see Stage 7 develop until system complexity goes beyond Stage 6 and we are forced to manage terribly complex network systems.

Conclusion

I believe that this model is useful to understand our software and our behavior. Many decisions are made with a mistaken understanding of tools, people, and complexity.

Understanding fundamental ideas of software gravity and the Katamari Damacy Effect, we can make better decisions when developing software.

There are other areas I did not cover in this article, such as how the evolution of existing software systems impacts new systems and how software evolves side by side in a marketplace.

Also, I didn’t go into great detail about how software complexity is increasing industry-wide and the downstream effects of that trend.

Perhaps future articles will go into more detail on those areas.

Now, back to rolling the ball of software around, watching it grow.
