Versions:
OpenSearch 2.9.0 cluster
Describe the issue:
We have a service running in a Spark cluster that reads large batches of documents from an OpenSearch cluster. Since the cluster was recently switched to HTTPS, we have observed a sharp increase in open scroll contexts, which eventually exceeds the limit of 500 and stops the service.
Error:
Caused by: org.opensearch.hadoop.rest.OpenSearchHadoopInvalidRequest: org.opensearch.hadoop.rest.OpenSearchHadoopRemoteException: rejected_execution_exception: Trying to create too many scroll contexts. Must be less than or equal to: [500]. This limit can be set by changing the [search.max_open_scroll_context] setting.
This does not happen when the cluster uses HTTP. It seems strange that HTTPS would affect the number of scrolls.
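To compare the HTTP and HTTPS setups, the number of open scroll contexts per node could be read from the node stats API (`GET /_nodes/stats/indices/search`, field `scroll_current`). Below is a minimal Python sketch using only the standard library. The helper names and the sample response are hypothetical, and `fetch_search_stats` assumes an endpoint reachable without authentication or a custom CA, which is probably not true for a secured cluster.

```python
import json
import urllib.request


def open_scroll_contexts(stats: dict) -> dict:
    """Map node name -> number of currently open scroll contexts."""
    return {
        node.get("name", node_id): node["indices"]["search"]["scroll_current"]
        for node_id, node in stats.get("nodes", {}).items()
    }


def fetch_search_stats(base_url: str) -> dict:
    """Fetch per-node search stats (assumption: no auth, default TLS trust)."""
    with urllib.request.urlopen(f"{base_url}/_nodes/stats/indices/search") as resp:
        return json.load(resp)


# Hypothetical, trimmed response used only to illustrate the expected shape.
SAMPLE = {
    "nodes": {
        "abc": {"name": "node-1", "indices": {"search": {"scroll_current": 12}}},
        "def": {"name": "node-2", "indices": {"search": {"scroll_current": 30}}},
    }
}

counts = open_scroll_contexts(SAMPLE)
total = sum(counts.values())
```

Polling this while the Spark job runs, once against the HTTP setup and once against HTTPS, should show whether contexts accumulate instead of being released.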
Method used for fetching data: sparkContext.openSearchJsonRDD(resource, query, cfg)
Query used: { "query": {"query_string": {"query": "someTime:[\"2023-01-01T01:27:00.334Z\" TO \"2023-01-01T01:30:00.334Z\"]"}}}
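For reproduction, the same query body can be built programmatically so the nested quotes are escaped correctly. This is a small Python sketch; the function name `time_range_query` is made up for illustration, and only the field name and timestamps come from the query above.

```python
import json


def time_range_query(field: str, start: str, end: str) -> str:
    """Build a query_string range query as a JSON string."""
    # Quotes around the timestamps are part of the query_string syntax;
    # json.dumps takes care of escaping them inside the JSON body.
    query_string = f'{field}:["{start}" TO "{end}"]'
    return json.dumps({"query": {"query_string": {"query": query_string}}})


body = time_range_query(
    "someTime",
    "2023-01-01T01:27:00.334Z",
    "2023-01-01T01:30:00.334Z",
)
```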
One solution would be to raise the scroll limit, but that impacts cluster performance.
Does anyone know why the number of scrolls increases after the switch to HTTPS? Could this be an API issue?
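For completeness, the limit named in the error message would be raised through the cluster settings API (`PUT /_cluster/settings`). The Python sketch below only builds the request body. The helper name and the value 1000 are illustrative assumptions, not a recommendation.

```python
import json


def scroll_limit_settings(limit: int, persistent: bool = False) -> str:
    """Build a cluster-settings body that changes search.max_open_scroll_context."""
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    # Transient settings are lost on a full cluster restart; persistent ones survive it.
    scope = "persistent" if persistent else "transient"
    return json.dumps({scope: {"search.max_open_scroll_context": limit}})


settings_body = scroll_limit_settings(1000)
```

A transient change is easier to roll back while investigating, since it does not survive a full cluster restart.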
Please let me know if you need more info to clarify the problem.