In GitLab 11.5 (to be released on Nov. 22, 2018), GitLab's Elasticsearch integration will support Elasticsearch version 6, and will no longer support versions 5.5 or earlier.
Please make plans to upgrade Elasticsearch to version 5.6 or 6.x immediately before upgrading to GitLab 11.5. After you upgrade GitLab, you will also need to perform a reindex, as the changes required to support these Elasticsearch versions are incompatible with the indexes of previous versions.
In summary, starting with 11.5, GitLab will support:
- Elasticsearch version 5.6
- Elasticsearch version 6.x
If you are using GitLab.com, this does not impact you in any way. This is only relevant
for self-managed GitLab.
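For self-managed installations, the supported range listed above can be checked with a few lines of code. This is a minimal sketch, not part of GitLab: the function name is hypothetical, and it assumes you already have the Elasticsearch version string (for example, the `version.number` value that the cluster's root endpoint reports).

```python
def is_supported_by_gitlab_11_5(version: str) -> bool:
    """Return True for Elasticsearch 5.6.x and 6.x, False otherwise.

    Hypothetical helper for illustration; `version` is a dotted string
    such as "5.6.12" or "6.4.2".
    """
    parts = version.split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    # 5.6 is the only supported 5.x line; every 6.x release is supported.
    return (major == 5 and minor >= 6) or major == 6


if __name__ == "__main__":
    for candidate in ("5.5.3", "5.6.12", "6.4.2"):
        print(candidate, is_supported_by_gitlab_11_5(candidate))
```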
GitLab uses Elasticsearch for Advanced Global Search and Advanced Syntax Search.
Why are we doing this?
Elasticsearch version 6 brings with it two large changes that are incompatible with the way we currently index:
- The removal of mapping types.
- Parent-child relationships are now established via a join datatype.
We'll go into some detail on how each of these changes affects GitLab.
Removal of mapping types
In Elasticsearch 6, all documents under the same index must be of the same 'type.' We need to keep all documents under the same index in order to be able to query based on project membership and permissions, so this change forced us to implement our own type field in order to still be able to query only a single type (for example, issues).
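To illustrate the custom type field described above, here is a minimal sketch of a search request body that matches on text and then filters on that field. It is not GitLab's actual query: the helper name and the `title` field are assumptions made for the example, and only the standard `bool`, `match`, and `term` query clauses are used.

```python
import json


def search_body(doc_type: str, text: str) -> dict:
    """Build a query that only returns documents of one logical type.

    Hypothetical helper: `type` is the custom field stored on every
    document, and `title` is an assumed text field.
    """
    return {
        "query": {
            "bool": {
                # Full-text match on the assumed `title` field.
                "must": [{"match": {"title": text}}],
                # Exact filter on the custom type field, replacing the
                # per-type mappings that Elasticsearch 6 no longer allows.
                "filter": [{"term": {"type": doc_type}}],
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(search_body("issue", "login fails"), indent=2))
```

Because every document lives in the same index, membership and permission filters can be added to the same `bool` query alongside the type filter.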
This removal of mapping types also affected the way parent-child relationships work.
join datatype
With the mapping type change comes a change to the way parent-child relationships are expressed. Elasticsearch 5.6 and 6.x have introduced a join datatype that GitLab 11.5 puts to use. (As of 6.0, it is the required method for defining these relationships.)
When using join, all insertions and deletions must be routed relative to their parent, which means we must send the parent's ID in the routing field. In 5.6, this means that the _parent field is ignored, and in 6.x it is removed.
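The routing requirement above can be sketched as follows. This is an illustration under stated assumptions, not GitLab's implementation: the index name, the single mapping type `doc`, the field name `join_field`, the relation names, and both helper functions are hypothetical. What follows Elasticsearch's documented behavior is the shape of the join mapping, the `name`/`parent` keys on a child document, and the `routing` query parameter.

```python
from urllib.parse import urlencode

# Assumed mapping: one join field relating a parent "project" to its children.
JOIN_MAPPING = {
    "mappings": {
        "doc": {
            "properties": {
                "join_field": {
                    "type": "join",
                    "relations": {"project": ["issue", "note"]},
                }
            }
        }
    }
}


def child_index_request(index, doc_id, parent_id, child_type, source):
    """Return (method, path, body) for indexing a child document."""
    body = dict(source)
    # The child names its relation and its parent in the join field.
    body["join_field"] = {"name": child_type, "parent": parent_id}
    # Routing by the parent's ID keeps parent and child on the same shard.
    path = f"/{index}/doc/{doc_id}?" + urlencode({"routing": parent_id})
    return "PUT", path, body


def child_delete_request(index, doc_id, parent_id):
    """Return (method, path) for deleting a child; routing is needed here too."""
    return "DELETE", f"/{index}/doc/{doc_id}?" + urlencode({"routing": parent_id})


if __name__ == "__main__":
    print(child_index_request("gitlab", "issue_7", "project_1", "issue", {"title": "Bug"}))
    print(child_delete_request("gitlab", "issue_7", "project_1"))
```

The helpers only build the request; sending it with an HTTP client or an Elasticsearch client library is left out so the sketch stays self-contained.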
Why Elasticsearch 5.6 remains compatible
As noted in the schedule for removal of mapping types, version 5.6 is the first Elasticsearch version where the join datatype is available, as well as the first version where single_type behavior can be enabled.
We tested versions 5.5 and below, and unfortunately they have no support for join datatypes, so we need to end support for these versions as of GitLab 11.5.
We're especially looking forward to supporting Elasticsearch version 6 as it brings with it some great improvements, including:
- Major improvements for sparsely populated fields
- Faster query times with sorted indices
- Search scalability across shards