How to create a policy that maintains the total amount of log documents within interval

fredrik · November 18, 2024, 4:51pm

Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):
using docker images opensearch:latest and opensearch-dashboard:latest

Describe the issue:
I have an OpenSearch cluster that I use as a backend for various tests and experiments. The traffic fluctuates a lot, depending on which experiments are currently running. To ensure that the cluster doesn’t run out of disc space, I want ISM policies that periodically delete the oldest documents based on the total size of the index, alias or index pattern.

Examples:
Let’s say that I have an index called “logs”. I want to ensure that the index doesn’t grow large enough to fill the disc, regardless of how much data is ingested. I also want to keep data as long as possible, so I don’t want to delete data after, for instance, 30 days if the disc is not yet full. One idea would be that, whenever the index gets larger than 100GB, I remove the oldest logs until I’m down to 50GB.

I have tried to use “Rollover” to roll over the index whenever it reaches 50GB. When 100GB of data has been ingested (assuming perfect job intervals), I will have “logs-000001” of 50GB, “logs-000002” of 50GB and “logs-000003” of 0GB. How can I, in this scenario, delete only “logs-000001”?
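The selection logic in that scenario can be sketched in a few lines of Python. This is only an illustration of the idea, not an ISM feature: the `IndexInfo` type, the `indices_to_delete` function and the two thresholds are hypothetical names chosen for this example. The sizes and creation dates are assumed to come from somewhere like `GET _cat/indices/logs-*?format=json&bytes=b&h=index,creation.date,store.size`. The actual deletion would still have to be done by an external job calling the delete index API.

```python
from typing import NamedTuple


class IndexInfo(NamedTuple):
    name: str                     # backing index name, e.g. "logs-000001"
    created_ms: int               # creation time (epoch milliseconds)
    size_bytes: int               # store size in bytes
    is_write_index: bool = False  # index the alias currently writes to


GB = 1024 ** 3


def indices_to_delete(indices, high_bytes=100 * GB, low_bytes=50 * GB):
    """Return the names of the oldest indices to delete, oldest first.

    Nothing is selected until the total size reaches high_bytes. After that,
    the oldest indices are selected until the remaining total is at or below
    low_bytes. The current write index is never selected.
    """
    total = sum(i.size_bytes for i in indices)
    if total < high_bytes:
        return []
    selected = []
    for idx in sorted(indices, key=lambda i: i.created_ms):
        if total <= low_bytes:
            break
        if idx.is_write_index:
            continue
        selected.append(idx.name)
        total -= idx.size_bytes
    return selected


# The scenario from the post: two full 50GB indices and an empty write index.
example = [
    IndexInfo("logs-000001", 1, 50 * GB),
    IndexInfo("logs-000002", 2, 50 * GB),
    IndexInfo("logs-000003", 3, 0, is_write_index=True),
]
print(indices_to_delete(example))  # ['logs-000001']
```

Because whole indices are removed rather than individual documents, the rollover size determines how precisely the lower threshold can be hit.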

I am most likely overcomplicating the problem, so if you have a better suggestion for how to maintain the size of a set of indices without filtering on age, please help me!

fredrik · November 18, 2024, 5:00pm

Sorry, I just realized that the question is a duplicate (even though I tried to find it in advance). But I still need a solution. I have found an old feature request from the opendistro-for-elasticsearch repo called “Transition based on cluster free available space · Issue #260”. Is there anything similar for OpenSearch Index Management?


ssablan · November 20, 2024, 6:37pm

Indexes can be deleted, moved, rolled over, etc. by ISM. So you have to decide on a strategy based on number of documents, age, or size (in bytes). You described something like:

index to logs001
when logs001 reaches 30GB, roll over
index to logs002
when logs002 reaches 30GB, roll over, and so on.
then periodically have ISM check whether an index is more than 60 days old and, if so, delete it.

There are more details, but this is the idea. For example, if you want to roll over when you hit 30GB and then delete an index when it is older than 60 days:

[
    {
        "name": "rollover",
        "actions": [
            {
                "rollover": {
                    "min_index_age": "7d",
                    "min_primary_shard_size": "30gb"
                }
            }
        ],
        "transitions": [
            {
                "state_name": "delete_old_indexes",
                "conditions": {
                    "min_index_age": "60d"
                }
            }
        ]
    },
    {
        "name": "delete_old_indexes",
        "actions": [
            {
                "delete": {}
            }
        ]
    }
]
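The list above is only the `states` part of a policy. A complete policy body also needs a `default_state`, and it is usually attached to new indices through an `ism_template`. A minimal Python sketch of how the full body could be assembled follows. The `build_policy` helper, the `logs-*` pattern and the description text are assumptions made for this example. In this sketch the age condition sits in a transition, and the `delete` action takes no parameters. Rollover also assumes the indices have a rollover alias configured.

```python
import json


def build_policy(index_pattern="logs-*", rollover_size="30gb", delete_after="60d"):
    """Build a full ISM policy body: roll over by size, delete by age."""
    return {
        "policy": {
            "description": "Roll over by size, delete by age (illustrative)",
            "default_state": "rollover",
            "states": [
                {
                    "name": "rollover",
                    "actions": [
                        {"rollover": {"min_primary_shard_size": rollover_size}}
                    ],
                    # Move to the delete state once the index is old enough.
                    "transitions": [
                        {
                            "state_name": "delete_old_indexes",
                            "conditions": {"min_index_age": delete_after},
                        }
                    ],
                },
                {
                    "name": "delete_old_indexes",
                    "actions": [{"delete": {}}],
                    "transitions": [],
                },
            ],
            # Apply the policy automatically to new indices matching the pattern.
            "ism_template": {"index_patterns": [index_pattern], "priority": 100},
        }
    }


body = build_policy()
# The printed JSON is the request body for: PUT _plugins/_ism/policies/<policy_id>
print(json.dumps(body, indent=2))
```

Note that this is still an age-based delete. It bounds the total size only if the ingest rate is roughly predictable.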

reference:

system · January 19, 2025, 6:38pm

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.
