Broken master. This can happen when CI pipelines run on the master branch (or default branch) but don't
pass all tests. A red cross mark is shown on the project's main page, signalling unstable source
code and eroding the trust of users. Broken master can also block
a continuous deployment/delivery workflow in which deployment jobs
are executed after the test stage passes in master pipelines.
All maintainers want to avoid this critical state,
but how can we prevent it?
Let's look at how master is broken in the first place
Let's say you're one of the maintainers of a project. It's a busy repository with hundreds of merges
to master every day. A developer assigns a merge request (MR) to you. The MR passed all of the tests in the CI pipelines,
has been reviewed thoroughly by code reviewers, all open discussions have been resolved, and the MR has been
approved by the relevant code owners.
You would press the "Merge" button without a second thought, but how confident are you that
a pipeline running on the master branch after the merge will pass all tests again?
If your answer is "It might break the master branch," then
you're right. This could happen, for example, if master has advanced by some
new commits, and one of them changed a lint rule. The MR in question
still contains an invalid coding style, but the latest pipeline on the MR passes,
because the feature branch is based on an old version of master.
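To make the scenario concrete, here is a minimal sketch in Python. Everything in it is a hypothetical model rather than GitLab's implementation: a "tree" is a dict of file names to contents, and the lint rule is a maximum line length stored in an assumed `lint.cfg` file.

```python
def lint_passes(tree):
    """Return True if every source line fits the limit configured in the tree."""
    limit = int(tree["lint.cfg"])
    return all(
        len(line) <= limit
        for name, text in tree.items() if name != "lint.cfg"
        for line in text.splitlines()
    )

old_master = {"lint.cfg": "100", "app.py": "print('hello')"}

# The feature branch was cut from old_master and adds a 90-character line.
feature = {**old_master, "feature.py": "x" * 90}

# Meanwhile, master tightened the lint rule to 80 characters.
new_master = {**old_master, "lint.cfg": "80"}

# What master contains after the merge: the new rule plus the feature's file.
after_merge = {**new_master, "feature.py": feature["feature.py"]}

branch_pipeline_ok = lint_passes(feature)      # checked against the old rule
master_pipeline_ok = lint_passes(after_merge)  # checked against the new rule
print(branch_pipeline_ok, master_pipeline_ok)  # True False
```

The branch pipeline is green only because it never saw the rule that master now enforces.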
Enter two new GitLab features: Pipelines for Merged Results
and Merge Trains.
Let me show you how they work and how to enable them.
How to continually run CI pipelines on the merge commit
Let's break down what went wrong in the scenario above. Even though the pipeline on the
merge request passed all the tests, it ran on a source (feature) branch
which could be based on an outdated version of master. In such a case,
the result of the pipeline can't be trusted, because there may be a large difference
between the eventual merge commit and the commit that was actually tested.
As a boring solution, developers can continually rebase their MR
on the latest master, but this is annoying and inefficient, given the speed of
growth of the master branch.
It causes a lot of friction between developers and maintainers, slowing down the development cycle.
To address this problem, we introduced Pipelines for Merged Results
in GitLab 11.10.
Simply put, the main difference between pipelines for merged results and normal pipelines is that
the former run on merge commits, instead of source branches, before the actual merge happens.
This merge commit is generated from the latest commits of the target branch and
the source branch and is written to a temporary place (refs/merge-requests/:iid/merge).
Therefore, we can run a pipeline on it without interfering with master.
Here is a sample workflow with the above scenario:
- A developer pushes a new commit to a merge request.
- GitLab creates a merge commit from the HEAD of the source branch and HEAD of the target branch.
This merge commit is written in refs/merge-requests/:iid/merge
and does not change the commit history of the master branch.
- GitLab creates a pipeline on the merge commit, but this pipeline fails because the latest master changed a lint rule.
- A maintainer sees a failed pipeline in the merge request.
As you can see, the maintainer was able to hold off merging the dangerous MR
because the latest pipeline on the MR didn't pass. The feature actually saved
master from a broken state.
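The workflow above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how GitLab builds the merge commit: trees are plain dicts, `three_way_merge` is a naive hypothetical helper, the lint rule is a line-length limit in an assumed `lint.cfg`, and the iid `1` is made up.

```python
def three_way_merge(base, source, target):
    """Naive three-way merge of file trees (dicts); raises on conflicting edits."""
    merged = dict(target)  # copy, so the target tree itself is never modified
    for name in set(base) | set(source):
        src, old, tgt = source.get(name), base.get(name), target.get(name)
        if src == old:
            continue  # the source did not touch this file; keep the target's version
        if tgt != old and tgt != src:
            raise ValueError(f"conflict in {name}")
        if src is None:
            merged.pop(name, None)  # the source deleted the file
        else:
            merged[name] = src
    return merged

def lint_passes(tree):
    """Return True if every source line fits the limit configured in the tree."""
    limit = int(tree["lint.cfg"])
    return all(
        len(line) <= limit
        for name, text in tree.items() if name != "lint.cfg"
        for line in text.splitlines()
    )

base = {"lint.cfg": "100", "app.py": "print('hello')"}
refs = {
    "refs/heads/master": {**base, "lint.cfg": "80"},         # rule tightened on master
    "refs/heads/feature": {**base, "feature.py": "x" * 90},  # MR adds a 90-char line
}
master_before = dict(refs["refs/heads/master"])

# Write the expected merge result to a temporary ref instead of master.
merge_ref = "refs/merge-requests/1/merge"  # iid 1 is hypothetical
refs[merge_ref] = three_way_merge(
    base, refs["refs/heads/feature"], refs["refs/heads/master"]
)

pipeline_passed = lint_passes(refs[merge_ref])
master_untouched = refs["refs/heads/master"] == master_before
print(pipeline_passed, master_untouched)  # False True
```

The pipeline fails on the temporary ref while master stays exactly as it was, which is the signal the maintainer needs before pressing "Merge".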
As a bonus, this workflow frees developers from continually
rebasing their merge requests.
All they need to do is develop features with Pipelines for Merged Results.
GitLab automatically creates an expected merge commit and validates the merge request prior to
an actual merge.
How to get started with Pipelines for Merged Results
You can start using this feature
today, with just two steps:
- Edit the .gitlab-ci.yml config file to enable pipelines for merge requests / merge request pipelines.
- Enable the "Merge pipelines will try to validate the post-merge result prior to merging" option at Settings > General > Merge requests in your project.
Note: If the configurations in your .gitlab-ci.yml
file are too complex, you might stumble at the first point.
We're currently working on improving the usability of pipelines for merge requests / merge request pipelines.
Please leave your feedback in the issue if that's the case.
How to avoid race conditions from concurrent merges
With Pipelines for Merged Results,
we can confidently say that MRs are continually tested against the latest master branch.
However, what if multiple MRs are merged at the same time?
For example:
- There are two merge requests: MR-1 and MR-2. The latest pipelines have already passed in both MRs.
- John (maintainer) and Cathy (maintainer) merge MR-1 and MR-2 at the same time, respectively.
Later on, it turns out that MR-2 contains a coding offence which has just been introduced by MR-1.
Maintainers hit merge without knowing that, and
needless to say, this will result in broken master. How can we handle this race condition properly?
In GitLab 12.1, we introduced a new feature,
Merge Trains.
Basically, a Merge Train is a queueing system that allows you to avoid this kind
of race condition.
All you need to do is add merge requests to the merge train, and it
handles the rest of the work for you.
It creates merge commits according
to the sequence of merge requests and runs pipelines on the expected merge commits.
For example, John and Cathy could have avoided broken master with the following workflow:
- John and Cathy add MR-1 and MR-2 to the Merge Train, respectively.
- In MR-1, the Merge Train creates an expected merge commit from HEAD of the source branch and HEAD of the target branch.
It creates a pipeline on the merge commit.
- In MR-2, the Merge Train creates an expected merge commit from HEAD of the source branch and the expected merge commit of MR-1.
It creates a pipeline on the merge commit.
- The pipeline in MR-1 passes all tests, and MR-1 is merged into the master branch.
- The pipeline in MR-2 fails because it violates a lint check which was changed by MR-1. MR-2 is dropped from the Merge Train.
- The developer revisits MR-2, fixes the coding offence, and asks Cathy to add it to the Merge Train again.
As you can see, the Merge Train successfully rejected MR-2 before it could break the master
branch. With this workflow, maintainers can feel more confident when they
decide to merge something. Also, this doesn't slow down the development lifecycle,
because the pipelines are built on an optimistic assumption: in the above case,
the pipeline in MR-1 and the pipeline in MR-2 start almost simultaneously.
MR-2 builds a merge commit as if MR-1 has already been merged, so maintainers
don't need to wait a long time for each pipeline to finish. If one of the
pipelines fails, the problematic merge request is dropped from the merge train
and the train is reconstructed without it.
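The queueing behavior can be sketched as follows. This is a simplified, hypothetical model and not GitLab's implementation: each MR is a dict of file changes, the pipeline is the same assumed line-length lint, MR-3 is an extra made-up merge request, and the function evaluates the train sequentially. It therefore models only the outcome (who merges, who is dropped), not the parallel pipelines.

```python
def lint_passes(tree):
    """Return True if every source line fits the limit configured in the tree."""
    limit = int(tree["lint.cfg"])
    return all(
        len(line) <= limit
        for name, text in tree.items() if name != "lint.cfg"
        for line in text.splitlines()
    )

def run_train(master, train, pipeline):
    """Process queued MRs in order; return (final_master, merged_ids, dropped_ids)."""
    merged, dropped = [], []
    for mr_id, changes in train:
        # Expected merge commit: master plus every earlier MR that stayed on the train.
        candidate = {**master, **changes}
        if pipeline(candidate):
            master = candidate       # the MR is merged
            merged.append(mr_id)
        else:
            dropped.append(mr_id)    # later MRs are rebuilt without this one
    return master, merged, dropped

master = {"lint.cfg": "100", "app.py": "print('hello')"}
train = [
    ("MR-1", {"lint.cfg": "80"}),        # tightens the lint rule
    ("MR-2", {"feature.py": "x" * 90}),  # fine under the old rule, not the new one
    ("MR-3", {"docs.py": "y" * 10}),     # hypothetical unrelated change
]

# Each MR passes when tested against current master on its own.
each_passes_alone = all(lint_passes({**master, **c}) for _, c in train)

final_master, merged, dropped = run_train(master, train, lint_passes)
print(each_passes_alone, merged, dropped)  # True ['MR-1', 'MR-3'] ['MR-2']
```

Every MR is green in isolation, yet only the train, which tests MR-2 on top of MR-1's expected result, catches the conflict before it reaches master.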
How to get started with Merge Trains
You can start using Merge Trains
today if you've already enabled Pipelines for Merged Results. Click the "Start/Add merge train" button in a merge request.
A quick demonstration of Merge Trains
Here is a demonstration video that explains the advantage of the Merge Trains feature.
In this video, we simulate the common problem in a workflow without
Merge Trains, and then resolve the problem by enabling a Merge Train.
Wrap up
Running pipelines on expected merge commits allows us to predict what will happen
in the future and avoid broken master proactively. It soothes the headache of
release managers and gives maintainers and developers more confidence that their code
is reliable enough to be merged and shipped. In addition, Merge Trains allow you
to merge things safely without slowing down the development cycle.
Give this advanced CI/CD feature a try today!
For more information, check out the documentation on merge trains and pipelines for merge requests / merge request pipelines.
Cover image by Dan Roizer on Unsplash