Reindex job failing with search phase execution exception

OpenSearch 1.0

Our reindex job is failing with search_phase_execution_exception. We tried decreasing the batch size from the default of 1000 to 100 and still see the same issue. Are there any other options to try?


Can you share more information about the problem, such as the full log of the search_phase_execution_exception and the reindex parameters?

The API call is very simple, as shown below. We are running 10 parallel reindex jobs, and we hit this under load.

POST /_reindex
     "size": 100

Below is the response from the Tasks API for that reindex operation:

{
  "completed": true,
  "task": {
    "node": "abc",
    "id": 182462425,
    "type": "transport",
    "action": "indices:data/write/reindex",
    "status": {
      "total": 1629142,
      "updated": 0,
      "created": 128300,
      "deleted": 0,
      "batches": 1283,
      "version_conflicts": 0,
      "noops": 0,
      "retries": { "bulk": 0, "search": 0 },
      "throttled_millis": 0,
      "requests_per_second": -1.0,
      "throttled_until_millis": 0
    },
    "description": "reindex from [sourceIndex] to [destIndex1][_doc]",
    "start_time_in_millis": 1694273514552,
    "running_time_in_nanos": 1339667737279,
    "cancellable": true,
    "headers": {}
  },
  "error": {
    "type": "search_phase_execution_exception",
    "reason": "all shards failed",
    "phase": "query",
    "grouped": true,
    "failed_shards": [
      {
        "shard": -1,
        "index": null,
        "reason": { "type": "search_context_missing_exception", "reason": "No search context found for id [61314384]" }
      }
    ]
  }
}


This likely means that the underlying scroll context expired. You could increase the scroll timeout (which only has to cover the processing of a single page), but the default should already be 5m. So I'm assuming your OpenSearch cluster can't keep up with the load and can't process a page within 5 minutes.
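If you do want to try a longer scroll keep-alive, the reindex endpoint accepts a `scroll` URL parameter. Below is a minimal sketch of how such a request could be built. The base URL, the `30m` value, and the helper names `build_reindex_request` and `start_reindex` are assumptions for illustration; the index names and batch size are taken from the task output above.

```python
import json
import urllib.parse
import urllib.request


def build_reindex_request(base_url, source_index, dest_index,
                          batch_size=100, scroll="30m"):
    """Return (url, body_bytes) for a reindex call with a longer scroll keep-alive.

    `scroll` is the keep-alive applied to each scroll page; "30m" is only an
    example value, not a recommendation.
    """
    query = urllib.parse.urlencode({
        "scroll": scroll,                # keep-alive per scroll page
        "wait_for_completion": "false",  # run as a background task
    })
    url = f"{base_url.rstrip('/')}/_reindex?{query}"
    body = {
        "source": {"index": source_index, "size": batch_size},
        "dest": {"index": dest_index},
    }
    return url, json.dumps(body).encode("utf-8")


def start_reindex(base_url, source_index, dest_index, **kwargs):
    """Send the reindex request and return the parsed JSON response."""
    url, data = build_reindex_request(base_url, source_index, dest_index, **kwargs)
    req = urllib.request.Request(
        url,
        data=data,
        method="POST",
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

For example, `start_reindex("http://localhost:9200", "sourceIndex", "destIndex1")` would submit the job with a 30-minute scroll keep-alive, assuming an unauthenticated cluster at that hypothetical address. A longer keep-alive only hides the symptom if the cluster is overloaded, so it is worth combining with reduced load.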

Maybe you can run fewer reindexing jobs in parallel?
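One way to cap the parallelism on the client side is to push the jobs through a bounded worker pool. This is only a sketch: `run_with_limit` is a hypothetical helper, and it assumes each job is a callable that blocks until its reindex has finished (for example, a call made with `wait_for_completion=true`, or one that polls the Tasks API until the task completes).

```python
from concurrent.futures import ThreadPoolExecutor


def run_with_limit(jobs, max_parallel=2):
    """Run the callables in `jobs`, at most `max_parallel` at a time.

    Results are returned in the same order as `jobs`. An exception raised
    by a job is re-raised here when its result is collected.
    """
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        futures = [pool.submit(job) for job in jobs]
        return [future.result() for future in futures]
```

With 10 reindex jobs and `max_parallel=2`, at most two scrolls are open against the cluster at any time, and the remaining jobs wait in the queue. The right limit depends on the cluster, so it is something to tune rather than a fixed number.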