ScanError while scrolling more than 10k docs

Versions
2.5

Describe the issue:

I am getting a ScanError (ScanError('Scroll request has only succeeded on 7 (+5 skipped) shards out of 15.')) when the search result set is large (mostly when it contains more than 10k documents).
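To make the numbers in that message concrete, here is a minimal sketch that parses it and works out how many shards are left unaccounted for. The regex is an assumption based on this single sample message, not a documented format, and `shard_breakdown` is a hypothetical helper name. Whether the remaining shards actually failed, timed out, or were dropped for another reason cannot be told from the message alone.

```python
import re

# Pattern derived from the one error message quoted above (an assumption,
# not a documented contract of the client library).
PATTERN = re.compile(
    r"succeeded on (\d+) \(\+(\d+) skipped\) shards out of (\d+)"
)


def shard_breakdown(message):
    """Parse the ScanError text and return the shard counts it mentions."""
    match = PATTERN.search(message)
    if match is None:
        raise ValueError("unrecognized ScanError message: %r" % message)
    succeeded, skipped, total = (int(group) for group in match.groups())
    return {
        "succeeded": succeeded,
        "skipped": skipped,
        "total": total,
        # Shards reported neither as succeeded nor as skipped.
        "unaccounted": total - succeeded - skipped,
    }


breakdown = shard_breakdown(
    "Scroll request has only succeeded on 7 (+5 skipped) shards out of 15."
)
print(breakdown["unaccounted"])  # 15 - 7 - 5
```

Read literally, 12 of the 15 shards are accounted for, so the exception is about the 3 that are not.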

I have a few questions about it:

  1. What are the underlying reasons for this issue?
  2. Is it related to the number of shards or to the number of search results?
  3. Can it be handled through an Elasticsearch setting, query parameters, scaling, or a change to the shard count or shard settings?
  4. I found a suggestion to avoid the error by disabling the raise_on_error flag in the Python library, but that only suppresses the exception and returns incomplete results.
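As an illustration of the trade-off in point 4, the sketch below keeps the exception enabled and retries the whole scan instead of accepting partial results. It is only a sketch under assumptions: `scan_all_or_fail`, `run_scan`, and `retry_on` are hypothetical names, the scan is injected as a callable so the snippet runs without a cluster, and a retry only helps if the shard failures are transient, which is not established here.

```python
import time


def scan_all_or_fail(run_scan, retry_on, attempts=3, backoff_seconds=1.0):
    """Collect every hit from run_scan(), retrying the whole scan on error.

    run_scan: zero-argument callable returning an iterable of hits
              (hypothetical; e.g. a lambda wrapping the library's scan helper).
    retry_on: exception class (or tuple of classes) that triggers a retry.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            # Materialize the iterator so that a failure halfway through
            # discards the partial hits instead of returning them.
            return list(run_scan())
        except retry_on as error:
            last_error = error
            if attempt < attempts:
                # Linear backoff before starting a fresh scan.
                time.sleep(backoff_seconds * attempt)
    raise last_error


# Possible usage with a real client (assumes opensearch-py is installed and
# that `client`, `query`, and the index name "my-index" exist; all three are
# placeholders):
#
#   from opensearchpy import helpers
#   hits = scan_all_or_fail(
#       lambda: helpers.scan(client, index="my-index", query=query),
#       retry_on=helpers.ScanError,
#   )
```

Note that this holds all hits in memory, which may not be acceptable for very large result sets; it is meant to show the "all or nothing" behaviour, not a tuned implementation.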

Please let me know the correct way to solve this issue.

Configuration:

Node Config
  Number of nodes: 3
  Storage type: EBS
  EBS volume type: General Purpose (SSD) - gp2
  EBS volume size: 2000 GiB

Dedicated master nodes
  Enabled: Yes
  Instance type: r6g.xlarge.search
  Number of nodes: 3

Warm and cold data storage
  UltraWarm data nodes enabled: No

Relevant Logs or Screenshots: