Managing Dependencies with 6 Easy Questions
Nothing is free; everything comes with a cost. Many developers might not acknowledge it, but we are always making trade-offs. Every new line of code we write is a compromise: each change affects the system's complexity, stability, performance, flexibility, and cost to implement.
For example, let's say we decide to move duplicated logic into an abstracted method. We are improving stability and efficiency at the cost of development time and potentially increased complexity. Another example is choosing to use a third-party library instead of building the logic ourselves. This reduces development time at the cost of added risk, added complexity, and reduced flexibility.
That being said, the open-source community is a great resource. There are thousands of great pieces of software that can be used as building blocks to create your software systems. They are easily available and can be great tools to quickly create awesome things. However, when used, our projects become dependent on them, which has a cost. As projects grow, they change and so do their dependencies. These changes can create problems of their own like compatibility issues, inflexibility, and slowness for future development.
What is a dependency?
In order to really understand how to manage dependencies, I think it’s best to first understand what a dependency is. In software, most developers would define a dependency as a library or package that is used in a software project. However, there are many other dependencies like your market, employees, or choice of software platform. Because of this, one could define a dependency as anything that your system relies on to operate.
With each dependency type, there are unique strategies for working with them. However, for the purpose of this article, I’ll just focus on the software packages and libraries as dependencies. When considering dependencies, here are a series of questions that can help you make the best decisions about how to manage them.
1) Which dependencies do I have?
Whether you are starting a greenfield project or working on legacy code, you first have to acknowledge which dependencies you have in order to make effective decisions about how to manage them. For projects built on Node or Rails, you have a very discrete list of dependencies logged in your package.json or Gemfile, respectively.
This is a great starting point for understanding the dependencies of your project and is often enough to make the first few decisions. However, it is important to note that many of these packages have dependencies of their own. These secondary dependencies often lock you into certain core library versions that conflict with some of your other dependencies. For this reason, secondary dependencies can have a large impact on your application and should not be ignored.
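As a starting point for that inventory, here is a minimal sketch that lists the direct dependencies declared in a package.json manifest. The function name and the package names in the example manifest are hypothetical and only there for illustration. Note that it reads only what the manifest declares; secondary dependencies are typically recorded in a lockfile (such as package-lock.json or Gemfile.lock), which this sketch does not inspect.

```python
import json


def list_direct_dependencies(package_json_text):
    """Return the direct dependencies declared in a package.json document.

    The result maps each manifest section to a name -> version-range dict,
    sorted by package name. Missing sections come back as empty dicts.
    """
    manifest = json.loads(package_json_text)
    return {
        section: dict(sorted(manifest.get(section, {}).items()))
        for section in ("dependencies", "devDependencies")
    }


# Hypothetical manifest contents, used only for illustration.
EXAMPLE_MANIFEST = """
{
  "name": "example-app",
  "dependencies": {
    "example-web-framework": "^4.2.0",
    "example-date-utils": "~1.9.3"
  },
  "devDependencies": {
    "example-test-runner": "^29.0.0"
  }
}
"""

if __name__ == "__main__":
    for section, packages in list_direct_dependencies(EXAMPLE_MANIFEST).items():
        print(f"{section}: {len(packages)} package(s)")
        for name, version_range in packages.items():
            print(f"  {name} {version_range}")
```

In a real project you would pass in the contents of your own package.json instead of the example string.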
2) Why do I need it?
Now that we have noted our project's dependencies, we need to understand the impact they have. Every dependency that is included in a project has a cost that may not be realized until many months down the road. Projects with lots of dependencies often lose flexibility and can become increasingly complex to manage. When considering a new package, you should first determine the value it brings and weigh that against the cost: not just the cost to implement, but also the future cost of maintenance, increased complexity, and loss of flexibility.
One of the best strategies for simplifying dependency maintenance is to reduce your number of dependencies. This can be done either by removing existing ones or by abstaining from adding new ones.
3) Should I change it?
Assuming you've determined the dependency is worth keeping or adopting, the next question is which version is appropriate. It's everyone's instinct to always use the most current version of the package. This enables the use of the latest features and prevents getting stuck on outdated or unsupported versions.
However, there are also several reasons an older version may be a better choice. For instance, many packages release beta builds that are not stable and can contain several bugs. Alternatively, the new version of a package could include some breaking changes that may require large amounts of rework to function correctly with your project. Furthermore, this new version could include some dependencies that conflict with your current list of packages and cannot be easily adopted.
For some of these reasons, it may be worthwhile to continue to use the existing version rather than updating.
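To make those cases concrete, here is a rough sketch that classifies a candidate upgrade by how risky it is likely to be. It assumes the package follows semantic versioning (MAJOR.MINOR.PATCH with an optional pre-release suffix), which not every package does. The function names are my own, and the sketch deliberately ignores build metadata and any pre-release tag on the currently installed version.

```python
import re

# MAJOR.MINOR.PATCH with an optional "-prerelease" suffix (e.g. 5.0.0-beta.2).
_VERSION = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:-([0-9A-Za-z.-]+))?$")


def parse_version(text):
    """Split a version string into (major, minor, patch, prerelease_or_None)."""
    match = _VERSION.match(text.strip())
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {text!r}")
    major, minor, patch, prerelease = match.groups()
    return int(major), int(minor), int(patch), prerelease


def classify_upgrade(current, candidate):
    """Label a move from `current` to `candidate`.

    Returns one of "prerelease", "major", "minor", "patch",
    or "not-an-upgrade".
    """
    cur = parse_version(current)
    new = parse_version(candidate)
    if new[3] is not None:
        return "prerelease"      # beta/rc builds: may be unstable
    if new[:3] <= cur[:3]:
        return "not-an-upgrade"  # same release or older
    if new[0] != cur[0]:
        return "major"           # may contain breaking changes
    if new[1] != cur[1]:
        return "minor"           # new features, meant to be compatible
    return "patch"               # bug fixes only


if __name__ == "__main__":
    for candidate in ("4.17.2", "4.18.0", "5.0.0", "5.0.0-beta.2"):
        print(candidate, classify_upgrade("4.17.1", candidate))
```

A "major" or "prerelease" result does not mean the upgrade is wrong, only that it deserves a closer look before you adopt it.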
4) When should I change it?
Timing can play a large role in package management as well. It’s important to keep up with the latest developments in your dependencies, as many packages and core libraries only support their latest two major versions. Even if you don't need the newest features, it may be worth updating, especially if you are more than one major version behind the latest. Overall, it is typically easier to maintain your dependencies as you go than to do bulk updates when forced to.
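The "more than one major version behind" rule of thumb is easy to check mechanically. The sketch below is one way to do it, assuming version strings start with a numeric major version. The function names are hypothetical, and the `supported_majors` default of 2 simply mirrors the two-major-version support window mentioned above rather than any specific package's policy.

```python
def major_of(version):
    """Extract the major version number from a string like '4.17.1' or 'v18.2.0'."""
    return int(version.strip().lstrip("v").split(".")[0])


def majors_behind(installed, latest):
    """How many major versions `installed` trails `latest` (never negative)."""
    return max(0, major_of(latest) - major_of(installed))


def needs_attention(installed, latest, supported_majors=2):
    """True when the installed version has fallen outside the support window.

    With the default window of 2, the latest major and the one before it
    are treated as supported; anything older is flagged.
    """
    return majors_behind(installed, latest) >= supported_majors


if __name__ == "__main__":
    # Hypothetical (installed, latest) pairs, for illustration only.
    inventory = {
        "example-web-framework": ("4.17.1", "6.0.0"),
        "example-date-utils": ("5.2.0", "6.1.3"),
    }
    for name, (installed, latest) in inventory.items():
        status = "update soon" if needs_attention(installed, latest) else "ok"
        print(f"{name}: {majors_behind(installed, latest)} major(s) behind, {status}")
```

Running a check like this on a schedule is one way to keep updates small and regular instead of discovering a large gap when an upgrade is forced on you.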
5) What change(s) do I need to make?
After deciding that you need the package and should update it, the next task is to determine the course of action. For many packages, it's as simple as bumping the version in your package manager and reinstalling. However, other cases may demand a more complex upgrade plan. Some packages may require you to update another package or core library at the same time. Furthermore, other packages may have breaking changes that will require significant refactoring before the new version can be properly used.
To determine which case applies, read the documentation ahead of time and consult the changelog to see the impact of the changes you are adopting.
6) How do I change it?
Now that you know the changes that need to be made, the next step is to actually make the changes. For the simple cases, this step is pretty trivial. However, for more complex upgrades, this can be the most important step and the most painful.
In order to make the changes as painless as possible, it's best to make several small changes rather than one giant group of changes. With this concept in mind, I recommend the following approach:
1. Refactor (manipulate logic to be flexible to change)
2. Update core libraries & sub-dependencies
3. Refactor (if necessary)
4. Update primary package version
5. Refactor (to resolve breaking changes)
6. Update documentation & tests (if necessary)
In addition to making the changes, one of the most challenging parts of updating dependencies is determining how to effectively release the updates. Often, package updates have global scope and affect large portions of your application, making effective code review and testing difficult.
Many teams resort to calling all hands on deck to perform global code review and app regression testing to expedite the release process. While this may work for smaller teams, the process is often inefficient, costly, and increasingly ineffective for large applications. Furthermore, as apps adopt more dependencies, the frequency of package updates increases, which often makes this release process impractical.
That being said, here are some alternatives that will help reduce the release pains.
- Incremental Changes: Wherever possible, try to break up the required refactoring and changes into reasonably small chunks. This will expedite the code review process, as it is easier to provide effective code review on small changes. Ideally, these changes are isolated from one another and can be released independently as they are approved.
- Side Branch: Create a side branch with the package update and necessary refactor changes. Have developers pull this branch into their feature work for several days to allow time for unnoticed conflicts to surface. This enables issues to be caught early, before releasing to production, and does not require an app-wide regression. Additionally, using git-flow can greatly increase release confidence. Once most of the issues have been resolved, merge the changes into the development branch. This provides another chance to highlight instabilities or issues before they affect your production environment.
- Segmented Rollout: To further reduce the impact of the changes, applications with distributed systems can consider a segmented rollout plan, selectively choosing which instances of the application should receive the new changes. This can be a great tool for doing A/B testing on the updates and enables you to easily roll back in case of emergency.
- Upgrade Services First: For applications with a distributed architecture, explore updating packages on smaller services first. This will reduce the impact of issues and provide more time to discover instabilities with new packages without affecting your entire application.
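To illustrate the segmented rollout idea from the list above, here is a minimal sketch of one way to pick which instances receive an update. It hashes each instance ID into a stable bucket from 0 to 99 and enables the update for buckets below a chosen percentage. The function names and instance IDs are hypothetical; in practice this decision is usually made by your deployment tooling or load balancer rather than by hand-written code.

```python
import hashlib


def rollout_bucket(instance_id):
    """Map an instance ID to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(instance_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def receives_update(instance_id, rollout_percent):
    """Decide whether this instance should run the updated dependencies.

    The same instance always lands in the same bucket, so raising the
    percentage only ever adds instances, and setting it to 0 rolls
    every instance back.
    """
    if not 0 <= rollout_percent <= 100:
        raise ValueError("rollout_percent must be between 0 and 100")
    return rollout_bucket(instance_id) < rollout_percent


if __name__ == "__main__":
    # Hypothetical instance names, for illustration only.
    instances = [f"web-{n}" for n in range(10)]
    for percent in (0, 25, 100):
        selected = [i for i in instances if receives_update(i, percent)]
        print(f"{percent}%: {len(selected)} of {len(instances)} instances updated")
```

Because the bucket depends only on the instance ID, you can widen the rollout gradually while watching for problems, and an emergency rollback is just a matter of dropping the percentage back to 0.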
These are just a few strategies that may be helpful in releasing the dependency changes of your application.
In Closing
Maintaining dependencies is an often-overlooked challenge of software development. It’s important to remember that even open-source packages are not free. They all come with a cost, and the trade-offs should be considered. Dependency management also looks different for everyone, because software applications differ so much in both form and function. That being said, I hope that asking these few questions will help you determine the value and costs of a dependency and develop an effective process for keeping your packages up to date.