Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):
OpenSearch version - 2.11
Describe the issue:
We are creating an API to retrieve more than 10k records from OpenSearch. The maximum might be around 500k records, and we are not worried about real-time data.
Which would be the better approach to retrieve the data, considering memory, CPU, and storage: search_after on its own, or the PIT API combined with search_after?
Also, what would happen in case of a single node or shard failure?
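For context, here is a minimal sketch of the PIT plus search_after flow being asked about. It is not a tested implementation against our cluster. `send` is a hypothetical transport callable (method, path, JSON body) standing in for whatever HTTP client is used, and the sort fields are an assumption: they should be fields that exist in the index and together give a unique ordering.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional

# Hypothetical transport: send(method, path, body) returns the parsed JSON response.
Send = Callable[[str, str, Optional[Dict[str, Any]]], Dict[str, Any]]


def scan_with_pit(send: Send, index: str, sort: List[Dict[str, str]],
                  page_size: int = 1000,
                  keep_alive: str = "5m") -> Iterator[Dict[str, Any]]:
    """Yield every hit from `index`, paging with a PIT and search_after."""
    # Create a point in time so every page reads the same view of the index.
    pit = send("POST", f"/{index}/_search/point_in_time?keep_alive={keep_alive}", None)
    pit_id = pit["pit_id"]
    search_after = None
    try:
        while True:
            body: Dict[str, Any] = {
                "size": page_size,
                "query": {"match_all": {}},
                # With a PIT the index is not named in the search path.
                "pit": {"id": pit_id, "keep_alive": keep_alive},
                "sort": sort,
            }
            if search_after is not None:
                body["search_after"] = search_after
            hits = send("POST", "/_search", body)["hits"]["hits"]
            if not hits:
                break
            yield from hits
            # The sort values of the last hit are the cursor for the next page.
            search_after = hits[-1]["sort"]
    finally:
        # Release the PIT so the cluster can free the resources it holds.
        send("DELETE", "/_search/point_in_time", {"pit_id": [pit_id]})
```

With a page size of 5,000, retrieving 500k records this way would take about 100 search requests plus one request each to create and delete the PIT.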
Configuration:
Data nodes
Instance type: r6g.2xlarge.search
Number of data nodes: 3
Storage type: EBS
EBS volume type: General Purpose (SSD) - gp3
EBS volume size (GiB): 200
Provisioned IOPS: 6000
Provisioned throughput (MiB/s): 250
Master nodes
Instance type: m6g.large.search
Number of master nodes: 3
Relevant Logs or Screenshots: