The monorepo controversy: is it worth it?
The monorepo concept is a hot topic these days. Many companies, including Facebook and Google, use monorepos to keep their massive codebases in one repository. A monorepo (mono repository) is a single repository that stores code for more than one logical project.
I have been using monorepos for some time now on a couple of my projects. In this post, I will share my experience and perspective on the concept. It will be up to you to decide whether it is worth it for your organization.
The benefits of separating your codebase into decoupled modules are indisputable. The biggest question is how to organize these modules within your solution. Package managers such as NPM, NuGet, and PIP serve general-purpose libraries to your codebase. While this is a huge time saver, you still need modules specific to your solution. It is possible to use multiple repositories or private packages to share code between applications. However, this approach generally ends up causing more trouble than convenience. Let’s go over the advantages and disadvantages of monorepos compared with the more traditional solutions.
Traditionally, organizations keep their code in separate repositories. This way, it is easy to restrict access to pieces of code inside the organization. However, maintaining these packages becomes cumbersome as the complexity of the functionality grows. Because the whole idea of sharing modules is to share functionality between multiple applications or libraries, each update affects more than one module. Each update creates a chain of pull requests, test runs, and perhaps new deploys. It is also quite common for a change to a shared module to break an app that depends on it.
In a monorepo, all the code is in one place. The repo serves as a single source of truth. With the right tooling, a change that affects multiple packages can be made in one atomic commit and one pull request. Furthermore, some tools can run tests only on the affected packages.
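To make the idea of "affected packages" more concrete, here is a minimal sketch of how such a set could be computed from a dependency graph. The package names and the `DEPENDENCIES` map are hypothetical, and real tools work from project metadata and version control diffs rather than a hand-written dictionary, so treat this as an illustration of the principle only.

```python
from collections import defaultdict, deque

# Hypothetical dependency graph: package -> packages it depends on directly.
DEPENDENCIES = {
    "web-app": ["ui-kit", "business-logic"],
    "mobile-app": ["business-logic"],
    "docs-site": ["ui-kit"],
    "ui-kit": [],
    "business-logic": ["utils"],
    "utils": [],
}


def affected_packages(changed, dependencies):
    """Return the changed packages plus every package that depends on them,
    directly or transitively."""
    # Invert the graph: package -> packages that depend on it directly.
    dependents = defaultdict(set)
    for package, deps in dependencies.items():
        for dep in deps:
            dependents[dep].add(package)

    # Breadth-first walk from the changed packages through their dependents.
    affected = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for dependent in dependents[current]:
            if dependent not in affected:
                affected.add(dependent)
                queue.append(dependent)
    return affected


if __name__ == "__main__":
    # Only these packages would need their tests re-run after a change to "utils".
    print(sorted(affected_packages(["utils"], DEPENDENCIES)))
```

In this sketch, a change to `utils` marks `business-logic` and both apps as affected, while `ui-kit` and `docs-site` are skipped.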
As organizations grow, the codebase grows with them. Changes to the solution architecture come with refactoring. As team size increases, it gets harder and harder to control bugs and maintain consistency. While a monorepo is not a cure for growing pains, it is a tool for orchestrating change. One of the projects I worked on was dealing with these issues. While the mobile and web applications had the same functionality, there was no code sharing between them. After a few prototypes, I decided to move the platform to a monorepo before making any architectural changes. With the codebase in one repo, I was able to communicate the vision easily. We managed to share all business logic between the apps. The approach resulted in faster iterations and fewer bugs. Furthermore, it was more convenient to check progress while the team was getting used to creating decoupled modules.
It is debated whether monorepos encourage tight coupling, since changes across modules are much easier to make. I believe this is not a problem with the strategy but more of a management issue. However, creating meaningful, isolated modules requires an experienced team. So I would recommend that an early-stage startup keep things simple in the first days.
Automated testing and deployment are crucial for stable iterations, especially for keeping a broken build from being released to production. It is much easier to set up these flows in a monorepo environment. While setting up is easy, build times will be much longer due to the large codebase. However, the tooling is improving every day. When things start getting slower, it is possible to optimize the build process with caching and parallel builds.
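As a rough illustration of build caching, the sketch below fingerprints a package directory by hashing its files and skips the build when the fingerprint has not changed. The function names and the in-memory `cache` dictionary are assumptions made for this example. Real build tools also account for dependencies, tool versions, and environment, and usually persist or share the cache.

```python
import hashlib
from pathlib import Path


def package_fingerprint(package_dir):
    """Hash the relative path and contents of every file under package_dir."""
    digest = hashlib.sha256()
    root = Path(package_dir)
    # Sort the paths so the fingerprint does not depend on directory listing order.
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest.update(path.relative_to(root).as_posix().encode("utf-8"))
            digest.update(b"\0")
            digest.update(path.read_bytes())
            digest.update(b"\0")
    return digest.hexdigest()


def build_if_changed(package_dir, cache, build):
    """Call build(package_dir) unless the cached fingerprint still matches.

    Returns True if a build ran, False if it was skipped.
    """
    key = str(package_dir)
    fingerprint = package_fingerprint(package_dir)
    if cache.get(key) == fingerprint:
        return False  # Nothing changed since the last successful build.
    build(package_dir)
    cache[key] = fingerprint
    return True
```

Calling `build_if_changed` twice on an unchanged package runs the hypothetical `build` callback only once. Editing any file in the package changes the fingerprint and triggers a rebuild.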
Monorepos are not a fit for every solution. First, as I mentioned before, they require an experienced team. Not every library, framework, or service works with a monorepo out of the box. Deep knowledge of how their internal build systems work is a requirement for a successful monorepo setup.
Size is another consideration. With the whole codebase in one place, Git performance can be poor for large-scale organizations. However, Facebook is working to resolve VCS scalability issues by patching Mercurial. Probably soon, this will not be such a big issue.
Last but not least is the issue of security. In a monorepo, you will be sharing the whole codebase. Large companies like Google handle access control via internal tooling solutions. Meanwhile, you can check out GitHub’s code owners feature to have granular control over who takes ownership of a subdirectory.
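To show what directory-level ownership means in practice, here is a simplified sketch that maps a changed file to its owning team. The team names and rules are hypothetical, and the matching uses plain path prefixes rather than GitHub’s actual pattern syntax. The one behavior borrowed from GitHub is that the last matching rule takes precedence.

```python
# Hypothetical ownership rules: (directory prefix, owning teams).
# Later rules override earlier ones, so the catch-all rule comes first.
OWNERSHIP_RULES = [
    ("", ["@org/platform"]),  # default owner for everything
    ("apps/web/", ["@org/web-team"]),
    ("apps/mobile/", ["@org/mobile-team"]),
    ("packages/business-logic/", ["@org/core-team"]),
]


def owners_for(path, rules):
    """Return the owners from the last rule whose prefix matches the path."""
    owners = []
    for prefix, rule_owners in rules:
        if path.startswith(prefix):
            owners = rule_owners
    return owners


if __name__ == "__main__":
    for changed in ["apps/web/src/index.ts", "README.md"]:
        print(changed, owners_for(changed, OWNERSHIP_RULES))
```

Note that this kind of rule governs who reviews changes to a subdirectory. It does not restrict who can read the code.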
A monorepo can be as simple as putting all projects into one repository, or as elaborate as a bleeding-edge setup with tools like Bazel or Nx. The first option is definitely not scalable. Here is my overall decision process:
- Is there a need for code sharing between multiple applications or projects? When building an MVP or isolated applications, I always prefer to keep things as simple as possible. The effort should go into the product. The team shouldn’t be distracted by technology.
- Is the development team ready for a monorepo? As I mentioned a couple of times, the seniority level of the development team is crucial. There will be problems during development. If these problems affect the release cycles, a monorepo is not worth it at all. In addition to codebase issues, architectural decisions are crucial. Too much coupling in the code may create spaghetti that is hard to untangle in the future.
- Hybrid or monorepo approach? Even if there is a need and a team able to execute, I still prefer to consider alternatives. Size, access control, release, and time considerations are important factors. Perhaps keeping some code outside the monorepo would improve productivity.
In short, in the world of engineering, there is no one-size-fits-all solution. When making the decision, you need to consider your organization, your goals, and your team. The most important thing is to build the product and deliver it to your users.