"store, forward, delete" option to move data from remote clusters to a centralized cluster?

jaimie.livingston · August 28, 2023, 4:46pm

Is there a way to leverage cross-cluster replication to set up a resilient “store & forward” relationship, where remote clusters with limited storage and poor-quality network links forward data to a centralized cluster for long-term storage, query, and management?

The idea is that the remote clusters would keep data locally long enough to confirm transfer to a centralized cluster, then remove the local data to free up storage. The primary goal is to account for “less than 100% reliable” WAN links between the remote clusters and the centralized cluster.

Is there a better method for doing this than trying to use cross-cluster replication?

Thanks,
Jaimie Livingston

radu.gheorghe · September 21, 2023, 10:15am

Maybe you can use something like Kafka to buffer data until it’s delivered to the central cluster?

Or pretty much any log shipper that can buffer. Like Logstash and friends.

jaimie.livingston · September 28, 2023, 4:23pm

One of the scope requirements I have to work with is that the logs be locally available to techs who cycle through the remote facilities during connectivity-loss periods, which can last up to several days.

Otherwise, buffered Logstash would probably work just fine.

radu.gheorghe · October 3, 2023, 3:51pm

How about this: from Logstash/others, send the data to two destinations: a small local OpenSearch and the central OpenSearch. The local one could have low retention.

Actually, Logstash is a bad example, because it will block the queue if one of the destinations isn’t available (i.e. the central OpenSearch), so the local OpenSearch wouldn’t get the data, either.

But it doesn’t have to be Logstash, see the link above. rsyslog comes to mind, because it can have separate queues per action. You’ll see a diagram in this old post. The point is, as long as the action queue is large enough, you’ll be good. And you can decide what rsyslog does when the queue overflows (block, drop only events of a certain severity when the queue is almost full, etc.).
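To make the "separate queue per action" idea concrete, here is a rough Python model of it. This is only a sketch of the behavior, not rsyslog's implementation or API: the `ActionQueue` class, its parameters (`max_size`, `discard_mark`, `discard_severity`), and the `demo` function are all made up for illustration, loosely mirroring the kind of knobs an action queue gives you. The point it shows is that the local destination keeps receiving data while the central one is unreachable, and the central queue sheds low-priority events once it is nearly full.

```python
from collections import deque


class ActionQueue:
    """Hypothetical model of one bounded queue in front of one destination."""

    def __init__(self, send, max_size, discard_mark, discard_severity):
        self.send = send                          # callable(event) -> True if delivered
        self.max_size = max_size                  # hard limit on buffered events
        self.discard_mark = discard_mark          # "almost full" threshold
        self.discard_severity = discard_severity  # syslog severity: 0 = emerg ... 7 = debug
        self.pending = deque()
        self.dropped = 0

    def enqueue(self, event):
        full = len(self.pending) >= self.max_size
        almost_full = len(self.pending) >= self.discard_mark
        low_priority = event["severity"] >= self.discard_severity
        # Drop everything when full; drop only low-priority events when almost full.
        if full or (almost_full and low_priority):
            self.dropped += 1
            return False
        self.pending.append(event)
        return True

    def flush(self):
        """Deliver in order; stop at the first failure and keep the rest buffered."""
        delivered = 0
        while self.pending and self.send(self.pending[0]):
            self.pending.popleft()
            delivered += 1
        return delivered


def demo():
    local_store, central_store = [], []
    wan = {"up": False}  # simulated WAN link state

    def send_local(event):
        local_store.append(event)
        return True

    def send_central(event):
        if not wan["up"]:
            return False
        central_store.append(event)
        return True

    # discard_severity=8 on the local queue means nothing is ever shed early there.
    local = ActionQueue(send_local, max_size=100, discard_mark=100, discard_severity=8)
    central = ActionQueue(send_central, max_size=5, discard_mark=3, discard_severity=6)

    # Odd ids are "info" (6), even ids are "error" (3).
    events = [{"id": i, "severity": 6 if i % 2 else 3} for i in range(8)]
    for event in events:
        for queue in (local, central):  # each destination has its own queue
            queue.enqueue(event)
            queue.flush()               # a stalled central queue never blocks the local one

    wan["up"] = True                    # link comes back
    central.flush()

    return {
        "local_count": len(local_store),
        "central_ids": [e["id"] for e in central_store],
        "central_dropped": central.dropped,
    }


if __name__ == "__main__":
    print(demo())
```

In this toy run the local store gets all eight events during the outage, while the central queue buffers what it can and drops three "info" events once it passes the discard mark. In a real deployment you would size the central queue (ideally disk-backed) for the longest outage you expect, several days in this case.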

system · December 2, 2023, 3:51pm

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.
