What is seq_no and primary_term in elasticsearch?

the _seq_no and _primary_term as parameter needed to implement the optimistic locking.

Elasticsearch keeps tracks of the sequence number and primary term of the last operation to have changed each of the documents it stores. The sequence number and primary term are returned in the _seq_no and _primary_term fields in the response of the GET API:

primary_term

The primary term increments every time a different shard becomes primary during failover. This helps when resolving changes which occurred on old primaries which come back online vs. changes which occur on the new primary (the new wins).

The primary term for a replication group is just a counter for how many times the primary shard has changed.

These primary terms are incremental and change when a primary is promoted. They’re persisted in the cluster state, thus representing a sort of “version” or “generation” of primaries that the cluster is on.

To ensure an older version of a document doesn’t overwrite a newer version, every operation performed to a document is assigned a sequence number by the primary shard that coordinates that change.

Let’s say your index is made up of 5 primary shards (that was the default prior to version 7). Indexing and Update-Requests are performed against primary shards. If you have multiple primary shards, elasticsearch is able to parallelize/distribute incoming requests (e.g. huge bulk-requests) to multiple shards in order to enhance performance.

So the primary_term gives information about the primary shard (#1, #2, #3, #4 or #5 in this example) that executed/coordinated the change/update.

seq_no

Sequence numbers have been introduced in ES 6.0.0. Just before that release came out.

Once we had the protection of primary terms in place, we added a simple counter and started issuing each operation a sequence number from that counter. These sequence numbers thus allow us to understand the specific order of index operations that happened on the primary

version is a sequential number that counts the number of time a document was updated
_seq_no is a sequential number that counts the number of operations that happened on the index
So if you create a second document, you’ll see that version and _seq_no will be different.


Let's create three documents:

POST test/_doc/_bulk
{"index": {}}
{"test": 1}
{"index": {}}
{"test": 2}
{"index": {}}
{"test": 3}
In the response, you'll get the payload below.

{
  "took" : 166,
  "errors" : false,
  "items" : [
    {
      "index" : {
        "_index" : "test",
        "_type" : "_doc",
        "_id" : "d2zbSW4BJvP7VWZfYMwQ",
        "_version" : 1,
        "result" : "created",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 0,
        "_primary_term" : 1,
        "status" : 201
      }
    },
    {
      "index" : {
        "_index" : "test",
        "_type" : "_doc",
        "_id" : "eGzbSW4BJvP7VWZfYMwQ",
        "_version" : 1,
        "result" : "created",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 1,
        "_primary_term" : 1,
        "status" : 201
      }
    },
    {
      "index" : {
        "_index" : "test",
        "_type" : "_doc",
        "_id" : "eWzbSW4BJvP7VWZfYMwQ",
        "_version" : 1,
        "result" : "created",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 2,
        "_primary_term" : 1,
        "status" : 201
      }
    }
  ]
}

As you can see:

for all documents, version is 1
for document 1, _seq_no is 0 (first index operation)
for document 2, _seq_no is 1 (second index operation)
for document 3, _seq_no is 2 (third index operation)

Reference

  • https://www.elastic.co/blog/elasticsearch-sequence-ids-6-0
Rajesh Kumar
Follow me