DataStream configuration and rollover

Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):

AWS OpenSearch 2.17

Describe the issue:

I’m new to OpenSearch and have been asked to configure four data streams, roll them over with an ISM policy when an index reaches 50 GB, and delete old data to manage storage. Please help with the configuration.
Data comes from Fluent Bit, with each index reaching around 200 GB daily.
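
As a rough sanity check on those numbers, the time between rollovers can be estimated from the daily volume and the rollover threshold. The sketch below assumes a constant ingest rate; `hours_between_rollovers` is a hypothetical helper for the arithmetic only, not an OpenSearch API.

```python
def hours_between_rollovers(daily_volume_gb: float, rollover_size_gb: float) -> float:
    """Estimate hours needed to fill one index, assuming a constant ingest rate."""
    if daily_volume_gb <= 0 or rollover_size_gb <= 0:
        raise ValueError("volumes must be positive")
    # rollover_size / (daily_volume / 24 hours), rearranged to avoid rounding noise.
    return rollover_size_gb * 24 / daily_volume_gb


# Figures from the post: about 200 GB per day, rollover at 50 GB.
print(hours_between_rollovers(200, 50))  # 6.0
```

At that rate a 50 GB threshold would produce roughly four rollovers per day, so any age-based deletion rule has to be expressed in hours rather than days.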

Hi @Dart,

Here you can find guidance on your task: ISM Error Prevention - OpenSearch Documentation

Data streams: Data streams - OpenSearch Documentation

If my understanding is correct, you are looking for something like the policy below.

It is just a sample for guidance, so please review the docs.

{
  "policy": {
    "description": "Rollover the index at 50GB and delete old indices after a certain period (30 days)",
    "default_state": "hot",
    "states": [
      {
        "name": "hot",
        "actions": [
          {
            "rollover": {
              "min_size": "50gb"
            }
          }
        ],
        "transitions": [
          {
            "state_name": "delete",
            "conditions": {
              "min_index_age": "30d"
            }
          }
        ]
      },
      {
        "name": "delete",
        "actions": [
          {
            "delete": {}
          }
        ]
      }
    ]
  }
}

best,
mj

Thanks @Mantas, I have followed the documentation and set up the initial ISM policy to roll over and delete the index. Is it possible to roll over the index and then delete the data in indexes older than 2 days?

@Dart, are you trying to count the days after the rollover?

i.e., index X reaches 50 GB and rolls over to Y, then X is deleted 2 days after the rollover to Y?

Is my understanding correct here?

Best,
mj

@Mantas Thanks for the response. The volume of incoming logs is very large, so I can’t delete them based on days. I want to preserve the latest 50 GB and delete all logs older than that, so that at any moment the last 50 GB of data is available.
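
To make that requirement concrete, here is a minimal sketch of the selection logic, assuming index names and sizes are already known. `indices_to_delete` is a hypothetical helper and the sizes are invented for illustration; ISM itself works on per-index conditions, so this only shows which indices a "keep the latest 50 GB" rule would retain.

```python
from typing import List, Tuple


def indices_to_delete(indices: List[Tuple[str, float]], keep_gb: float) -> List[str]:
    """Return names of the oldest indices that can go while keeping >= keep_gb.

    `indices` is ordered oldest to newest as (name, size_gb) pairs.
    """
    kept_gb = 0.0
    keep_from = len(indices)
    # Walk from newest to oldest, keeping indices until the budget is covered.
    for position in range(len(indices) - 1, -1, -1):
        if kept_gb >= keep_gb:
            break
        kept_gb += indices[position][1]
        keep_from = position
    return [name for name, _ in indices[:keep_from]]


# Hypothetical sizes: two full rolled-over indices and a partly filled write index.
example = [
    ("prod-data-000092", 50.0),
    ("prod-data-000093", 50.0),
    ("prod-data-000094", 12.0),
]
print(indices_to_delete(example, keep_gb=50))  # ['prod-data-000092']
```

With a 50 GB rollover threshold, the write index is usually below 50 GB, so the most recent rolled-over index has to stay alongside it for the latest 50 GB to remain available.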

What about something like this (please let me know if it needs clarification, and keep in mind this is just a sample for the ideas):

{
  "policy": {
    "description": "Rollover index at 50GB and keep the last 2 indices",
    "default_state": "hot",
    "states": [
      {
        "name": "hot",
        "actions": [
          {
            "rollover": {
              "min_size": "50gb"
            }
          }
        ],
        "transitions": [
          {
            "state_name": "warm",
            "conditions": {
              "min_index_age": "1d"
            }
          }
        ]
      },
      {
        "name": "warm",
        "actions": [
          {
            "replica_count": {
              "number_of_replicas": 1
            }
          }
        ],
        "transitions": [
          {
            "state_name": "delete",
            "conditions": {
              "min_index_age": "1d"
            }
          }
        ]
      },
      {
        "name": "delete",
        "actions": [
          {
            "delete": {
              "min_index_count": 2
            }
          }
        ]
      }
    ]
  }
}

This looks good, but 50 GB will be reached within 4-8 hours based on the traffic, so I can’t keep rolled-over data for 1 day. Also, what is the use of min_index_count?

You can find the available options here: Policies - OpenSearch Documentation

What about 8h?
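
A minimal sketch of what the 8h variant could look like, built as a Python dictionary so the values are easy to change. `build_policy` is a hypothetical helper; the policy keeps the structure of the first sample in this thread and only swaps the age to "8h". `min_index_count` is left out because it does not appear to be a documented option of the `delete` action; treat that as an assumption to verify against the Policies documentation.

```python
import json


def build_policy(rollover_size: str, delete_after: str) -> dict:
    """Build an ISM policy body: roll over at a size, delete at an index age."""
    return {
        "policy": {
            "description": f"Rollover at {rollover_size}, delete after {delete_after}",
            "default_state": "hot",
            "states": [
                {
                    "name": "hot",
                    "actions": [{"rollover": {"min_size": rollover_size}}],
                    "transitions": [
                        {
                            "state_name": "delete",
                            "conditions": {"min_index_age": delete_after},
                        }
                    ],
                },
                {"name": "delete", "actions": [{"delete": {}}]},
            ],
        }
    }


policy = build_policy("50gb", "8h")
print(json.dumps(policy, indent=2))
```

Note that `min_index_age` is measured from index creation, not from the rollover. If an index fills in 4-8 hours as reported, an 8h age means it is deleted shortly after it rolls over. The Policies documentation also lists a `min_rollover_age` transition condition, which may be a better fit if the intent is to count time from the rollover.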

best,
mj

Thanks, I will look into it. Please help me with one more thing.

My ISM policy triggers a rollover on reaching 50 GB, but it fails with the error “missing alias or not write index”. I have the rollover alias setting in my index template: index.plugins.index_state_management.rollover_alias: "prod-data"

prod-data is the alias of my index, and prod-data-000094 is the current index, which is also the write index.
The ISM policy was not able to switch the write index from the current index to the new index.

Would you mind sharing the output of the following:

GET /_cat/aliases

and/or confirming that your aliases have "is_write_index": true set?
Thanks,
mj

GET /_cat/aliases
prod-data prod-data-000094 - - - true

GET /_alias/prod-data

{
  "prod-data-000092": {
    "aliases": {
      "prod-data": {
        "is_write_index": false
      }
    }
  },
  "prod-data-000094": {
    "aliases": {
      "prod-data": {
        "is_write_index": true
      }
    }
  },
  "prod-data-000093": {
    "aliases": {
      "prod-data": {
        "is_write_index": false
      }
    }
  }
}
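
The response above can be checked mechanically. The sketch below is a hypothetical helper, `write_indices`, that lists every index flagged as the write index for an alias; it works on the parsed JSON only and makes no calls to the cluster.

```python
from typing import Dict, List


def write_indices(alias_response: Dict[str, dict], alias: str) -> List[str]:
    """List indices flagged as the write index for `alias` in a GET /_alias response."""
    return sorted(
        index
        for index, body in alias_response.items()
        if body.get("aliases", {}).get(alias, {}).get("is_write_index") is True
    )


# Response copied from the GET /_alias/prod-data output above.
response = {
    "prod-data-000092": {"aliases": {"prod-data": {"is_write_index": False}}},
    "prod-data-000094": {"aliases": {"prod-data": {"is_write_index": True}}},
    "prod-data-000093": {"aliases": {"prod-data": {"is_write_index": False}}},
}
print(write_indices(response, "prod-data"))  # ['prod-data-000094']
```

Exactly one entry means the alias itself looks consistent. One thing that may be worth checking next (an assumption, not confirmed in this thread) is which index the failing rollover was reported on, and whether that index carries the rollover_alias setting.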