Mono or Multirepo: It's All About Quality Engineering

We explored the repository model ecosystem in our previous series of articles, looking at its definitions, its myths, and the practices of other companies. That is still not enough to select a repo model.

We can jump too quickly to hazardous solutions without being clear on the problem to solve, a problem that is not solely technical and that starts with the business.

This article aims to guide your repository model decision using a holistic quality engineering perspective and process.

Start with why, then for whom

It is worth investing time up front to clarify the “Why” of your initiative. Why are you considering the question of your code repo? Why would it be necessary for your company? What are you not doing while working on the repo question?

The first trivial case is a start-up. The first lines of code need storage and, ideally, a proper structure from the start. When starting from scratch, a monorepo is usually the way to go. Exceptions exist when starting with a larger team or a larger number of IT components.

Evaluating the opportunity cost of a repo change for an existing company is more challenging. The added value comes mainly in the mid-term, and while making the changes, you may slow down feature delivery. Your initiative will therefore impact your stakeholders, hence the importance of considering them.

Mono or multirepo: “Start with Why”, then “for Whom”.
Antoine Craske

“For Whom” is the second valuable question after “Start With Why”. Identifying stakeholders is the first step in building empathy. You need to clarify “What’s in it for me?” for each one. Their perception of value is at the root of their potential acceptance. You are a threat to them if your initiative is seen only as slowing down product velocity.

The quotation “Quality is value to someone” takes on its full meaning in that context. The repo question is risky to address through a purely technological prism; the value perception of your stakeholders is vital. We need a more holistic approach to our company and its ecosystem to evaluate the repository model.

Align a holistic quality engineering perspective

We have to combine various perspectives to get a broader view. The order in which we analyze our ecosystem structures our reasoning. We can apply the Quality Engineering model, iterating over enterprise architecture (including business architecture), product, organizational system, and engineering.

We have to deal with our repos as long as the company exists. Organizations live through cycles of birth, growth, maturity, and decline. Those cycles apply to the sector your company operates in and to the company itself. A typical cycle for a company lasts 3 to 5 years. Your company can be at any stage, or even in the transition from one to another. The stage we are in gives clues on how the company will behave and which products and services it will offer.

We then arrive at value-driven product management, translating business priorities into valuable products. Why should we care about product management for our repos? Products translate into software requirements our organizations will have to deliver and, more precisely, iterate on. Our repository model has to enable faster iterations of value creation, whatever the products we have to build.

Our repository model has to enable faster iterations of value creation.
Antoine Craske

Our organizational system will support our company objectives using its capabilities. We have to understand our current, upcoming, and future state for various elements. Organizational structure, interactions, and culture are crucial elements to clarify. Typical questions arise around the number of teams and their expected objectives and results. The model needs alignment with your organization; there is no single correct answer. Google is, for example, a centralized networked organization, whereas Netflix opted for a more decentralized model. Keep in mind that the org chart is not everything; interactions are structuring too.

We can move to engineering once we know what to do, for whom, and by whom. Our engineering system has to clarify its architectural style, codebase organization, technological choices, etc. Your decision can be specific to each context, for example, having a different ecosystem for front-end, back-end, and data engineering. You can even discover that the repo model is not the main topic to address, letting you work on the right priority.

You must have a very clear picture of the current state, as it is the foundation for drawing possible targets and transitions. You will probably need iterations between architecture, product, organization, and engineering. We also have choices to make whichever repo model we decide on.

Decide what you have to decide anyway

Your repo is not everything, even if it’s structuring. You can consult our earlier article for more details about repo myths. The interrelated elements of our repo need clarification to improve our repository model decision.

Your repo is structuring, but not everything.
Antoine Craske

Choices are required. As we will see later on, any option is subject to trade-offs; there is no perfect solution. Our “Why” will guide our decisions, avoiding a never-ending comparison table of pros and cons. We have to focus on the main differentiators that articulate our system.

The significant elements of our repos lie in our communication flows, dependency management, and architectural style. Interactions balance between centralization and decentralization. Your cursor needs adjustment depending on your organizational system. Dependencies exist in any case; we have to choose how to manage them. Here too, we can rely on a centralized model (e.g. shared libraries) or a decentralized one (e.g. code duplication that needs management).

Our architectural style is the last element to integrate. It needs alignment with our quality engineering perspective, as well as with its interactions and dependencies. I strongly suggest being clear on the anti-patterns and pitfalls of each architectural style before deciding here. We can end up with a different architecture for different contexts.

The value of the process is to focus us on the “right product” before trying to get the “product right”. At this stage, we are ready to balance possible options for our repository model.

Be clear about the trade-offs of each repo option

Trade-offs apply to all engineering decisions, and so to our repo decisions. Our objective is not to build the most exhaustive comparison matrix but to select the most appropriate option for our context, clearly aware of both pros and cons, while keeping flexibility for evolution.

We normally need three good reasons to choose one repo model over another. From my point of view, the main inflection points come from the previous step: which company we want (architecture), to deliver what (product), organized in a particular way (organization), to iterate with software increments (engineering).

Three good reasons guide our decision between one repo model and another.
Antoine Craske

Good reasons for a monorepo are centralized communication, governance, dependency management, early integration, and a homogeneous stack. What we gain with a monorepo comes at a cost: we have to manage complex central tooling for both building and deployment.

We can try to soften some trade-offs by implementing a split-repo model, but we are then still dealing with trade-offs of another kind. With a growing codebase, we will benefit from a central view, with easier refactoring at scale, if we invest in the required storage, search, versioning, review, and refactoring mechanisms. Otherwise, we may be better served by a multirepo model.

Multirepos make sense for an organization tending toward decentralized communication, integration, and deployment. Teams in search of faster iterations will benefit from separate repositories. A two-pizza team will have more autonomy in the build, release, and deployment process. But trade-offs are also present, especially in the mid-term.

Siloed teams will tend to optimize for their own context; this is natural in systems. The problem arises when it impacts our initial company objectives. Imagine a multirepo setup growing to 100 repositories, each with its own stack, logging mechanism, API format, language, etc. This can happen with a monorepo, but it is much more likely with a multirepo. Balance is vital to get value from this model. We have to accelerate team velocity while guaranteeing process replicability and technical debt containment.

Our repository choice is fundamental, as we will plug other applications into this system. Flexibility, evolvability, and maintainability are therefore structuring requirements. If you don’t find clear evidence in favor of one option, it is better to start with a monorepo that we can split more easily later on, rather than trying to regroup repos built to be independent in the first place.
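The divergence described above can be made visible with a simple inventory. The following is a minimal sketch in Python, assuming you can list, for each repository, the conventions it follows. The repository names, the dimensions (language, logging, api_format), and the threshold are hypothetical and only serve as an illustration.

```python
from collections import Counter

# Hypothetical inventory: repository name -> conventions it follows.
INVENTORY = {
    "checkout-api": {"language": "java", "logging": "json", "api_format": "rest"},
    "search-api": {"language": "go", "logging": "plain", "api_format": "grpc"},
    "catalog-api": {"language": "java", "logging": "json", "api_format": "rest"},
}


def convention_counts(inventory):
    """Count how many repositories use each value, per dimension."""
    counts = {}
    for conventions in inventory.values():
        for dimension, value in conventions.items():
            counts.setdefault(dimension, Counter())[value] += 1
    return counts


def divergent_dimensions(inventory, max_variants=1):
    """Return the dimensions that have more variants than we accept."""
    counts = convention_counts(inventory)
    return sorted(d for d, c in counts.items() if len(c) > max_variants)


if __name__ == "__main__":
    for dimension, counter in convention_counts(INVENTORY).items():
        print(dimension, dict(counter))
    print("Divergent:", divergent_dimensions(INVENTORY))
```

The accepted number of variants is a choice for your organization to make. The point is only to keep the divergence measurable so that it can be discussed against the company objectives.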

Our repo model decision is still of little value until understood by our stakeholders.

Share, align options and get commitment

Engaging with our stakeholders is key to achieving our repo initiative. We have to work cross-functionally and transversally within the organization to get sponsorship, budget, and implementation support. From early on, we can capitalize on our answers to the “For Whom” question.

We have limited resources to share among the involved parties. We can build a plan using a stakeholder matrix to balance our effort. That way, we clarify the level of investment required, from minimal monitoring to establishing a close relationship.

Our repository initiative impacts the whole engineering value chain, so how do we prioritize the actors? Not all individuals and teams are equal. Power does not come only from a job title; knowing your organization is vital to identify the true influencers in addition to the VP, senior, and C-level roles involved. For the teams, the best candidate is one for which your proposition will solve a real challenge they are facing now.
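A stakeholder matrix can be sketched in a few lines. The example below is a minimal illustration in Python, assuming the common power/interest grid; the 1-to-5 scale, the threshold, and the stakeholder names are hypothetical and should be adapted to your organization.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stakeholder:
    name: str
    power: int     # 1 (low) to 5 (high), hypothetical scale
    interest: int  # 1 (low) to 5 (high), hypothetical scale


def engagement_level(stakeholder, threshold=3):
    """Map a stakeholder to an engagement level on a power/interest grid."""
    high_power = stakeholder.power >= threshold
    high_interest = stakeholder.interest >= threshold
    if high_power and high_interest:
        return "manage closely"
    if high_power:
        return "keep satisfied"
    if high_interest:
        return "keep informed"
    return "monitor"


# Hypothetical stakeholders of a repo initiative.
STAKEHOLDERS = [
    Stakeholder("CTO", power=5, interest=4),
    Stakeholder("Head of Product", power=4, interest=2),
    Stakeholder("Checkout team", power=2, interest=5),
    Stakeholder("Finance", power=1, interest=1),
]

PLAN = {s.name: engagement_level(s) for s in STAKEHOLDERS}

if __name__ == "__main__":
    for name, level in PLAN.items():
        print(f"{name}: {level}")
```

The scores are subjective estimates. Their value lies in making the prioritization explicit so it can be challenged and revisited as the initiative progresses.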

The combination of these individuals and teams is the basis of your guiding coalition. This unit is necessary to lead your initiative up to completion, supporting your organizational capabilities. The implementation will not happen overnight. It will take time, iterations, and perseverance.

Iterate horizontally, then vertically

Your repo change initiative will start a new cycle in your organization. Even if we see this cycle as a single curve, the underlying implementation consists of iterations. Each incremental step has to validate or invalidate assumptions of value creation and scale.

Whatever the repo model, we have to quickly test our assumptions from what I call a “horizontal perspective”, focusing on an end-to-end integration. Our candidate team must be involved at that stage, focusing on the value hypothesis rather than on an internal activity measurement. Once we validate the value, we can see how to expand our model.

Our repo is like a startup: it must find its first users, then expand.
Antoine Craske

Our horizontal perspective then has to switch to a vertical one, looking at the overall repository. One strong driver for evolving a repo is usually to support the growth of the codebase. Scaling is therefore essential, as is proper automation. A poorly designed manual process is inefficient, and even more so when automated. This is where you need to leverage your guiding coalition to expand your model.

The rest of the deployment is traditional project management outside the scope of this article. Keep in mind that you are inside a cycle that evolves and, at some point, can change.

Your repo, a business asset of today and tomorrow

Our repository hosts the organization’s vital asset, its codebase, at the heart of its digital product creation. Its alignment with the business goals, architecture, products, organization, and engineering system is key. At least during the cycles you are in.

At some point in time, we may have to assess our model again. We can decide to stick to it, improve it, or change it again. From a code repository perspective, that means you can evolve between monorepo, multirepo, and their possible variations. We cannot change it every year because of its impacts, hence the value of this process for a mid-term alignment.

One takeaway is that our repository is structuring but not everything. Be clear about the trade-offs to avoid frustration while using your guiding coalition to lead the initiative to completion.

An initiative that will bring value only if valued by someone.

References

The Lean Startup, Eric Ries. https://www.amazon.com/Lean-Startup-Entrepreneurs-Continuous-Innovation/dp/0307887898

https://github.com/joelparkerhenderson/monorepo_vs_polyrepo
