Is the Point-in-time (PIT) API available in OpenSearch 1.3.x?

Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):
OpenSearch 1.3.x

Describe the issue:

The Point in Time API guide in the OpenSearch documentation says that OpenSearch 1.3.x has a PIT API, but the release notes say it is only available starting from 2.4.0.

Is this a documentation bug, or does 1.3 really have the PIT feature?

It’s a documentation bug; PIT is not available in 1.3.

Hello gaobinlong, in the Elasticsearch 7.10.x documentation we see that PIT and search after are used together.

The OpenSearch 1.3 documentation describes search after, but PIT was only released in 2.4. Can search after be used normally in OpenSearch 1.3 as documented, or is this also a documentation bug?

Search after can be used in all versions of OpenSearch because it was inherited from ES 7.10. The ES 7.10 documentation covers PIT because PIT was an X-Pack feature in ES, so OpenSearch did not have it in 1.x; the feature was re-implemented and released in 2.4.

Thank you for your reply. I would like to ask another question. The ES 7.10.x documentation describes the role of PIT as follows:

“If a refresh occurs between these requests, the order of your results may change, causing inconsistent results across pages. To prevent this, you can create a point in time (PIT) to preserve the current index state over your searches.”

If I use search after in an OpenSearch 1.3 production environment without PIT, will I run into the problem described above? Can I still use search after in production?

Yes, using search after without PIT has that problem. However, if unstable search results are acceptable, or you are sure the index will not change (no index or update operations), you can use search after in production. It performs better than scroll and consumes fewer resources.

I have another question. I plan to use search_after in OpenSearch 1.3 in production with the DSL query below, where @timestamp is the time of the data and _id is the unique document identifier.

GET my-test/_search
{
   "size": 3,
   "query": {
     "match_all": {}
   },
   "sort": [
     {
       "@timestamp": "desc"
     },
     {
       "_id": "desc"
     }
   ],
   "search_after": [
     1712020085226,
     "my_test_raw+0+123126514"
   ]
}

I’m a little confused now about this usage:

  1. Is it possible to omit the tiebreaker field (tie_breaker_id)?
  2. Can _id be used as the tiebreaker? The documentation for versions before ES 7.10.x says that using _id as a tiebreaker is not recommended, but I did not see that note in the ES 7.10.x documentation.

  1. If you are sure that the @timestamp value is different in every document, a tiebreaker field is not required. If not, it is better to include a tiebreaker field in the sort to avoid unexpected results.
  2. Using _id for sorting is still not recommended because it consumes too much memory. You can choose another field that contains a unique value for each document as the tiebreaker.
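
To show why the tiebreaker matters, here is a small in-memory Python emulation of descending sort plus search after. It is only a sketch of the "strictly after the cursor" behaviour, not OpenSearch code, and event_id is an assumed unique field standing in for whatever tiebreaker you choose.

```python
def search_after_page(docs, sort_keys, size, after=None):
    """Emulate a descending sort with search_after over a list of dicts."""
    def key(doc):
        return tuple(doc[k] for k in sort_keys)

    ordered = sorted(docs, key=key, reverse=True)
    if after is not None:
        # Only documents strictly after the cursor are returned.
        ordered = [d for d in ordered if key(d) < tuple(after)]
    return ordered[:size]


def paginate(docs, sort_keys, size):
    """Walk all pages and return the event_id values seen, in order."""
    seen, after = [], None
    while True:
        page = search_after_page(docs, sort_keys, size, after)
        if not page:
            return seen
        seen.extend(d["event_id"] for d in page)
        after = [page[-1][k] for k in sort_keys]


# Three documents share the same @timestamp.
DOCS = [
    {"event_id": 1, "@timestamp": 1712020085226},
    {"event_id": 2, "@timestamp": 1712020085226},
    {"event_id": 3, "@timestamp": 1712020085226},
    {"event_id": 4, "@timestamp": 1712020085000},
]

without_tiebreaker = paginate(DOCS, ["@timestamp"], size=2)
with_tiebreaker = paginate(DOCS, ["@timestamp", "event_id"], size=2)
print(without_tiebreaker)  # [1, 2, 4] -> event_id 3 is never returned
print(with_tiebreaker)     # [3, 2, 1, 4] -> every document appears once
```

In this emulation, sorting on @timestamp alone drops a document whose timestamp equals the cursor value at the page boundary, while adding the unique field to the sort returns every document exactly once.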

Thank you very much for your patient reply.