A system performs only as well as its processes.
Code repositories are foundational to how we organize our software development life-cycle and our collaboration mechanisms.
The choice between a mono-repository and a multi-repository is therefore a structural, high-impact decision, hence the debate it stirs in the community.
We had the opportunity to discuss this topic during the round table “How can a choice of mono or multi-repository improve your quality?”
The purpose of this article is to take a step back and look at the developments that led to the emergence of mono- and multi-repositories.
So let’s talk about monoliths, microservices and the perspectives around this hotly debated topic.
The mono-repo was historically the only possible option
Some so-called legacy systems already belong in museums.
The days when everyone had first-hand experience of these systems and their green screens are slowly disappearing.
Some organizations would even rather hide the fact that they own such a system in order to attract new talent.
In short, most of these legacy systems are rooted in the Mainframe and the AS/400.
Built and operated on a single, centralized platform, their components share a common base.
Moreover, the ecosystem was far from offering tools such as Git or Jenkins, or from enabling CI/CD practices.
The context, the maturity of the ecosystem and the lack of interoperability all combined to make the mono-repository the only possible collaboration model.
The monolith and the mono-repo had their own real challenges
Remember that one of the challenges was to run batches within tight time windows because of resource constraints.
It was also difficult to deliver changes in parallel, running up against the limits of the organization.
Onboarding a new team member was also hard, given the volume of information to absorb and the complexity of the system.
We could already have talked about cognitive load.
Overall, it was difficult to contain the entropy of such a system: the complexity had to live somewhere, and it inevitably ended up in the monolith.
These various issues, combined with business needs, led to the creation of more distributed systems.
The ecosystem evolved towards distributed architectures, enabling the multi-repo
New platforms and means of communication then appeared.
Consider, for example, the appearance of remote procedure call technologies such as CORBA, which later evolved into more standard web service protocols.
Keep in mind that their goal was to accelerate delivery by parallelizing work on services across different teams.
Later, SOA appeared with the promise of service reuse, to accelerate delivery and capitalize on investments.
It is in this ecosystem that multi-repo architectures appeared, supporting the distribution of the overall system’s complexity.
So, were we saved?
Not quite.
Distributed systems and the multi-repo came with their own issues
This famous reuse, implemented through tight synchronous couplings between services, produced architectures often described as spaghetti.
Although the term is often associated with microservices, I am convinced that any type of architecture can be badly designed.
Spaghetti architectures are not just for microservices; we could already design batches very badly on a Mainframe.
Antoine Craske
So had mono-repos become completely obsolete?
They survive in a good number of organizations that still run core business processes on a Mainframe today.
The distribution of applications has in fact turned repositories into a concern of the software development system as a whole.
Frequently encountered problems come from purely technological criteria being used to pick one model over the other.
For instance, we decide to have one repository per team and per technology; what could be more logical?
What is missing when weighing which model to retain is precisely this view of the system as a whole and of its objectives.
Then came microservices.
The risks of microservices and multi-repo by default
Microservices have become a fashion, the de facto standard to follow in order to stay up to date.
Since microservices are by nature even more distributed and finer-grained than our famous SOA services, the multi-repository has followed the trend.
One can nevertheless wonder whether the multi-repository is the only valid model for microservices.
I don’t think so; in fact, several points of equilibrium are possible.
You can choose to assign a repository to each microservice, close to a single function, at the finest grain.
This choice can be attractive at first glance; keep in mind, however, that elements shared by several components, such as common libraries or data contracts, will be harder to maintain.
Domain, architecture and organizational perspectives must be aligned
We may therefore prefer one repository per major application, each operating within its functional area, possibly delimited by a Domain-Driven Design (DDD) process.
This type of grouping leads to more consistency within an application, which will also often be maintained by a single team.
This is interesting because, by focusing on technique, we had easily lost sight of the organizational and human aspects of software development.
As with any trend, practitioners may even end up losing sight of the objective they initially pursued.
This seems to happen a lot with microservices, where it is easy to, as the expression goes, “jump on the bandwagon”.
We see that both mono- and multi-repositories are possible even for microservices architectures; stepping back to weigh their challenges therefore remains key.
We could almost feel lost among so many choices.
A return to equilibrium is currently taking place in the ecosystem
For several years now, we have regularly seen successes built on mono-repositories and without microservices.
Coming back from these extremes, it is indeed easier to take a step back.
We can state the problem more clearly.
Our choice of mono or multi-repository must be aligned with the objectives of the system, and evolve to support its development.
Antoine Craske
Concretely, a start-up launching its product with a small team and little maturity in its field will tend to go for a mono-repository.
Less obviously, a company delivering its product as a SaaS platform can benefit from a mono-repo to promote visibility, collaboration and end-to-end integration of its product.
Conversely, a growing company with growing development teams will have every interest in distributing its system, its organization and its repositories.
An organization that has been around for a while will often have accumulated different technologies, and may, depending on its context, opt for a mono- or multi-repo according to its modernization trajectory.
Context is therefore key in the choice of repository model, which must be able to evolve along with the organization.
An AI automating this choice would make our life much easier.
Are automation, AI and other bots going to help us?
That may happen one day; in the meantime, some progress is already visible.
For example, we see the emergence of tools such as Dependabot that automatically handle the update of dependencies shared across different projects.
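As an illustration, here is a minimal sketch of such a configuration, assuming a GitHub repository using Dependabot’s dependabot.yml format and an npm-based project; the ecosystem and schedule shown are assumptions to adapt to your own stack.

```yaml
# .github/dependabot.yml - illustrative sketch, not taken from the article
version: 2
updates:
  - package-ecosystem: "npm"   # assumption: a JavaScript project; use "pip", "maven", etc. as needed
    directory: "/"             # where the dependency manifest lives in the repository
    schedule:
      interval: "weekly"       # how often Dependabot looks for updates
```

In a mono-repo, several entries under updates can point to different directories, which keeps shared dependencies visible and managed in one place.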
The reality is that distributed systems will continue to be necessary, and their integration will keep getting more complex, for example with mobile and IoT.
The appearance of these solutions is therefore similar to that of CI/CD pipelines: a real need answering a structural problem.
As for AI coming into play, keep in mind that data science in production is only starting to become mainstream, and in any case it requires a lot of data to be applicable.
Nevertheless, for some existing repetitive tasks, a good statistical algorithm can already do the job.
At least we will have more time to choose between a mono- and a multi-repository, and to decide whether we really need microservices.
In the meantime, Google remains one of the largest mono-repo organizations.