Consistent data for backups and snapshot disk space

lindhe · September 11, 2024, 8:39am

Hi!

I’m running an application in Kubernetes, and that application depends on OpenSearch. So now I have to learn how to maintain OpenSearch! Right now, I am trying to wrap my head around how to properly back up OpenSearch.

I found that it’s possible to configure snapshots in OpenSearch. Clearly, if I configure snapshots to run regularly and save them to disk, the snapshot data will be consistent when I back it up using some external solution (for example, Velero).
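For concreteness, here is a minimal sketch of the two REST calls that kind of setup involves: registering a shared file system (`fs`) repository and then taking a snapshot into it. The host, repository name, snapshot name, and location below are placeholder assumptions, and the helper `build_snapshot_requests` is hypothetical; it only builds the requests and does not send them. The location would also need to be listed under `path.repo` in `opensearch.yml` on every node.

```python
import json
from urllib.parse import quote


def build_snapshot_requests(base_url, repo, location, snapshot):
    """Build (method, url, body) tuples for the snapshot REST API.

    Hypothetical helper for illustration: it only constructs the
    requests, so it can be inspected without a running cluster.
    """
    base = base_url.rstrip("/")
    repo_url = f"{base}/_snapshot/{quote(repo, safe='')}"

    # 1. Register a shared file system repository at `location`.
    register = (
        "PUT",
        repo_url,
        json.dumps({"type": "fs", "settings": {"location": location}}),
    )

    # 2. Take a snapshot into that repository and wait until it finishes.
    take = (
        "PUT",
        f"{repo_url}/{quote(snapshot, safe='')}?wait_for_completion=true",
        None,
    )
    return [register, take]


if __name__ == "__main__":
    # All values here are placeholders, not taken from a real cluster.
    for method, url, body in build_snapshot_requests(
        "http://localhost:9200", "backup_repo", "/mnt/snapshots", "snapshot-1"
    ):
        print(method, url, body or "")
```

An external tool such as Velero would then back up the directory behind the repository location rather than the live data directory.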

Snapshots are probably a good thing to use in any case, but just for my own understanding I want to know: if I back up OpenSearch’s PVC data without any snapshots, does that risk data inconsistency?

A related question I have is regarding the incremental nature of snapshots. It says in the docs:

“Snapshots store only incremental changes since the last snapshot. Thus, while taking an initial snapshot may be a heavy operation, subsequent snapshots have minimal overhead.”

I understand that the initial snapshot may be relatively slow. But does this also mean that the initial snapshot takes up extra disk space? So if I expect my live data to be 100 GiB on disk and the snapshots are stored on the same disk, I would need to provision more than 200 GiB of disk to accommodate snapshots. Right?
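The arithmetic behind that guess can be sketched as follows. This is only an illustration under stated assumptions, not something the docs confirm: the snapshot repository shares a volume with the live data, the initial snapshot is roughly as large as the live data, and each later retained snapshot adds only the data that changed. The function name and the example numbers are made up.

```python
def estimate_disk_gib(live_gib, changed_gib_per_snapshot, retained_snapshots):
    """Rough disk estimate when live data and snapshots share one volume.

    Assumptions (illustrative only): the first snapshot is about the
    size of the live data, and every further retained snapshot adds
    just the changed data.
    """
    if retained_snapshots < 1:
        # No snapshots kept: only the live data needs space.
        return float(live_gib)

    initial_snapshot = live_gib
    incremental = changed_gib_per_snapshot * (retained_snapshots - 1)
    return float(live_gib + initial_snapshot + incremental)


if __name__ == "__main__":
    # 100 GiB live data, 2 GiB changed between snapshots, 8 snapshots kept.
    print(estimate_disk_gib(100, 2, 8))
```

Under these assumptions the total lands somewhat above twice the live data size, which is where the "more than 200 GiB" figure comes from.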

Cheers!

system · November 10, 2024, 8:40am

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.
