Best approach to retrieve a large data set for a download use case

Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):
OpenSearch version: 2.11

Describe the issue:
We are creating an API to retrieve more than 10k records from OpenSearch. The maximum might be around 500k records, and we do not need real-time data.

Which would be the better approach for retrieving the data: plain search_after, or the PIT API combined with search_after, taking memory, CPU, and storage into consideration?
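
For context, a minimal sketch of the PIT with search_after pattern in question might look like the following. It is only an illustration under stated assumptions: `send` is a hypothetical callable that performs the HTTP request and returns the parsed JSON response, the sort fields `timestamp` and `record_id` are made-up names standing in for any sort that ends in a unique tiebreaker, and the page size and keep-alive values are arbitrary.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional

# Hypothetical transport: send(method, path, body) -> parsed JSON response.
Send = Callable[[str, str, Optional[Dict[str, Any]]], Dict[str, Any]]


def export_with_pit(
    send: Send,
    index: str,
    sort: List[Dict[str, str]],
    page_size: int = 5000,      # assumption; must not exceed index.max_result_window
    keep_alive: str = "2m",     # assumption; only needs to cover one page fetch
) -> Iterator[Dict[str, Any]]:
    """Yield every document's _source, paging with PIT + search_after."""
    # Create the point in time against the target index.
    pit = send("POST", f"/{index}/_search/point_in_time?keep_alive={keep_alive}", None)
    pit_id = pit["pit_id"]
    try:
        search_after = None
        while True:
            body: Dict[str, Any] = {
                "size": page_size,
                "query": {"match_all": {}},
                # Each request extends the PIT lifetime by keep_alive.
                "pit": {"id": pit_id, "keep_alive": keep_alive},
                "sort": sort,
            }
            if search_after is not None:
                body["search_after"] = search_after
            # With a PIT, the index is not given in the search path.
            resp = send("POST", "/_search", body)
            hits = resp["hits"]["hits"]
            if not hits:
                break
            for hit in hits:
                yield hit["_source"]
            # Resume after the sort values of the last hit on this page.
            search_after = hits[-1]["sort"]
            pit_id = resp.get("pit_id", pit_id)
    finally:
        # Release the PIT so the cluster can free the resources it pins.
        send("DELETE", "/_search/point_in_time", {"pit_id": [pit_id]})
```

A call such as `export_with_pit(send, "my-index", [{"timestamp": "asc"}, {"record_id": "asc"}])` would stream the documents page by page, so the API never holds the full 500k records in memory. Dropping the `pit` clause (and searching `/my-index/_search` instead) gives the plain search_after variant, which pins no point-in-time view but can see documents shift between pages if the index changes during the export.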

Also, what would happen in the case of a single node or shard failure?

Configuration:

Data nodes
Instance type: r6g.2xlarge.search
Number of data nodes: 3
Storage type: EBS
EBS volume type: General Purpose (SSD) - gp3
EBS volume size (GiB): 200
Provisioned IOPS: 6000
Provisioned throughput (MiB/s): 250

Master nodes
Instance type: m6g.large.search
Number of master nodes: 3

Relevant Logs or Screenshots: