How to create the following setup easily in OpenSearch

Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):
Any

Describe the issue:
Splitting OpenSearch into write and read clusters.

Hi,

I am currently working on a search system where read latency is very important. Since serving reads and writes from the same nodes increases read latency, I am considering the following setup.

  1. Set up a write cluster
  2. Ingest data into the write cluster
  3. Store the data in external storage such as S3
  4. Use a separate cluster for reads, which loads the data from S3 and creates multiple replicas to serve queries.

We are thinking of building a POC for this and would love ideas on the easiest way to do it. We would potentially contribute back to OpenSearch if there is a cleaner way to address this.

@rogern I think the best approach is to start with the OpenSearch documentation.

  1. Set up a write cluster
    Installing OpenSearch - OpenSearch Documentation
    Installing OpenSearch Dashboards - OpenSearch Documentation

  2. Ingest data into the write cluster
    Logstash - OpenSearch Documentation
    Data Prepper - OpenSearch Documentation

  3. Store the data in external storage such as S3
    I think this would be either S3 with the managed Amazon OpenSearch Service, or a snapshot to an S3 bucket
    Using an OpenSearch Ingestion pipeline with Amazon S3 - Amazon OpenSearch Service
    Take and restore snapshots - OpenSearch Documentation
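
For the snapshot route, the flow could be sketched roughly as below. This is only a minimal sketch, assuming the repository-s3 plugin is installed on both clusters; the repository, bucket, snapshot, and index names are made up for illustration. The functions only build the REST requests (method, path, JSON body), so how they are sent is left open.

```python
import json


def register_s3_repo(repo, bucket, base_path, readonly=False):
    """Request that registers an S3 snapshot repository.

    The read cluster can register the same bucket with readonly=True so
    that only the write cluster ever writes snapshots into it.
    """
    body = {
        "type": "s3",
        "settings": {"bucket": bucket, "base_path": base_path, "readonly": readonly},
    }
    return "PUT", f"/_snapshot/{repo}", body


def create_snapshot(repo, snapshot, indices):
    """Request that snapshots the given indices (run on the write cluster)."""
    body = {"indices": indices, "include_global_state": False}
    return "PUT", f"/_snapshot/{repo}/{snapshot}?wait_for_completion=true", body


def restore_snapshot(repo, snapshot, indices, replicas=1):
    """Request that restores the indices (run on the read cluster).

    index_settings overrides the replica count at restore time, which is
    one way to get more replicas on the read side than on the write side.
    """
    body = {
        "indices": indices,
        "include_global_state": False,
        "index_settings": {"index.number_of_replicas": replicas},
    }
    return "POST", f"/_snapshot/{repo}/{snapshot}/_restore", body


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    steps = [
        register_s3_repo("s3-repo", "my-search-snapshots", "poc"),
        create_snapshot("s3-repo", "snap-1", "products"),
        register_s3_repo("s3-repo", "my-search-snapshots", "poc", readonly=True),
        restore_snapshot("s3-repo", "snap-1", "products", replicas=3),
    ]
    for method, path, body in steps:
        print(method, path, json.dumps(body))
```

The first two requests would go to the write cluster and the last two to the read cluster.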

Thank you @Pablo !

For the snapshot and restore functionality, would the read cluster be able to accept read traffic while a snapshot is being restored?

I want a pattern where the read cluster stays available while the snapshot is being restored, essentially a continual update of the index.
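
One way this pattern is sometimes approximated with snapshots is to restore each new snapshot under a fresh index name and then repoint a read alias at it, so searches keep hitting the old index until the new one is ready. As far as I understand, a restore cannot overwrite an open index, and other indices on the cluster stay searchable while a restore runs. The sketch below only builds the two requests; the index, alias, and snapshot names are hypothetical, and it assumes clients always query the alias rather than the index.

```python
import json


def restore_as(repo, snapshot, index, new_index):
    """Request that restores `index` from the snapshot under a new name.

    rename_pattern / rename_replacement keep the index that is currently
    serving reads untouched while the new copy is being restored.
    """
    body = {
        "indices": index,
        "include_global_state": False,
        "include_aliases": False,
        "rename_pattern": index,
        "rename_replacement": new_index,
    }
    return "POST", f"/_snapshot/{repo}/{snapshot}/_restore", body


def swap_alias(alias, old_index, new_index):
    """Request that moves the read alias to the new index in one call."""
    body = {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }
    return "POST", "/_aliases", body


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for method, path, body in (
        restore_as("s3-repo", "snap-2", "products", "products-v2"),
        swap_alias("products-read", "products-v1", "products-v2"),
    ):
        print(method, path, json.dumps(body))
```

The alias swap should only be sent once the restored index reports green (or at least yellow) health, and the old index can be deleted afterwards. Freshness is then bounded by how often snapshots are taken and restored.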

@rogern Why do you want to update the cluster with a snapshot? Wouldn’t it be better to use cross-cluster replication?
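
For reference, a minimal sketch of what starting cross-cluster replication looks like, assuming the cross-cluster replication plugin is installed on both clusters. The requests are sent to the follower (read) cluster, which then pulls changes from the leader (write) cluster and keeps the follower index read-only. The connection alias, seed address, and index name below are hypothetical.

```python
import json


def connect_leader(connection_alias, seeds):
    """Request that tells the follower cluster how to reach the leader.

    `seeds` are transport addresses (host:port) of leader cluster nodes.
    """
    body = {
        "persistent": {
            "cluster": {"remote": {connection_alias: {"seeds": seeds}}}
        }
    }
    return "PUT", "/_cluster/settings", body


def start_replication(follower_index, connection_alias, leader_index, roles=None):
    """Request that starts replicating leader_index into follower_index.

    `roles` is only needed when the security plugin is enabled; it is a
    (leader_cluster_role, follower_cluster_role) pair.
    """
    body = {"leader_alias": connection_alias, "leader_index": leader_index}
    if roles is not None:
        body["use_roles"] = {
            "leader_cluster_role": roles[0],
            "follower_cluster_role": roles[1],
        }
    return "PUT", f"/_plugins/_replication/{follower_index}/_start", body


if __name__ == "__main__":
    # Hypothetical names and address, for illustration only.
    for method, path, body in (
        connect_leader("write-cluster", ["10.0.0.5:9300"]),
        start_replication("products", "write-cluster", "products"),
    ):
        print(method, path, json.dumps(body))
```

Compared with repeated snapshot restores, this keeps the read cluster searchable the whole time, at the cost of a network connection between the two clusters.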