Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):
OpenSearch 2.14
Describe the issue:
We have two OpenSearch clusters (prod-1 and prod-2) with a lot of indices. Most indices are created on a daily pattern (i.e. index_name-YYYY-MM-DD) and managed by an ISM policy:
{
  "_id": "delete_policy",
  "_seq_no": 917888812,
  "_primary_term": 11,
  "policy": {
    "policy_id": "delete_policy",
    "description": "Delete policy",
    "last_updated_time": 1745321459747,
    "schema_version": 21,
    "error_notification": null,
    "default_state": "hot",
    "states": [
      {
        "name": "hot",
        "actions": [],
        "transitions": [
          {
            "state_name": "delete",
            "conditions": {
              "min_index_age": "718h"
            }
          }
        ]
      },
      {
        "name": "delete",
        "actions": [
          {
            "retry": {
              "count": 3,
              "backoff": "exponential",
              "delay": "1m"
            },
            "delete": {}
          }
        ],
        "transitions": []
      }
    ],
    "ism_template": [
      {
        "index_patterns": [
          "*"
        ],
        "priority": 0,
        "last_updated_time": 1745321459747
      }
    ]
  }
}
For some index patterns we created a new rollover policy and index template.
Policy:
{
  "_id": "ocp-prod-int-prod",
  "_seq_no": 917886289,
  "_primary_term": 11,
  "policy": {
    "policy_id": "ocp-prod-int-prod",
    "description": "Rollover policy for ocp-prod-int-prod",
    "last_updated_time": 1745321440295,
    "schema_version": 21,
    "error_notification": null,
    "default_state": "rollover",
    "states": [
      {
        "name": "rollover",
        "actions": [
          {
            "retry": {
              "count": 3,
              "backoff": "exponential",
              "delay": "1m"
            },
            "rollover": {
              "min_primary_shard_size": "15gb",
              "copy_alias": false
            }
          }
        ],
        "transitions": [
          {
            "state_name": "delete",
            "conditions": {
              "min_rollover_age": "30d"
            }
          }
        ]
      },
      {
        "name": "delete",
        "actions": [
          {
            "retry": {
              "count": 3,
              "backoff": "exponential",
              "delay": "1m"
            },
            "delete": {}
          }
        ],
        "transitions": []
      }
    ],
    "ism_template": [
      {
        "index_patterns": [
          "ocp-prod-int-prod-*"
        ],
        "priority": 10,
        "last_updated_time": 1745321440295
      }
    ]
  }
}
Template:
{
  "index_templates": [
    {
      "name": "ocp-prod-int-prod",
      "index_template": {
        "index_patterns": [
          "ocp-prod-int-prod*"
        ],
        "template": {
          "settings": {
            "index": {
              "number_of_shards": "5",
              "opendistro": {
                "index_state_management": {
                  "policy_id": "ocp-prod-int-prod",
                  "rollover_alias": "ocp-prod-int-prod"
                }
              },
              "plugins": {
                "index_state_management": {
                  "policy_id": "ocp-prod-int-prod",
                  "rollover_alias": "ocp-prod-int-prod"
                }
              }
            }
          }
        },
        "composed_of": [],
        "priority": 5,
        "version": 1
      }
    }
  ]
}
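For context: the rollover action requires the alias to already point at a write index before the first rollover. We bootstrap each pattern roughly like this (a sketch; our actual bootstrap request may differ):

```
PUT ocp-prod-int-prod-000001
{
  "aliases": {
    "ocp-prod-int-prod": {
      "is_write_index": true
    }
  }
}
```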
The indices and the alias were created and initially worked correctly. But after a few rollover cycles, the newest indices get the wrong ISM policy (delete_policy) and just grow in size without rolling over. After we manually change the ISM policy to the correct one (ocp-prod-int-prod), the index rolls over to new indices again; then, after a few more rollovers, it happens again.
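The manual fix we apply is a change_policy call via the ISM API, roughly as follows (the index name here is an example):

```
POST _plugins/_ism/change_policy/ocp-prod-int-prod-000008
{
  "policy_id": "ocp-prod-int-prod",
  "state": "rollover"
}
```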
For indices with the wrongly applied policy, we see a difference in the index settings:
"number_of_shards": "5",
"plugins": {
  "index_state_management": {
    "policy_id": "ocp-prod-int-prod",
    "rollover_alias": "ocp-prod-int-prod",
    "auto_manage": "false"
  }
}
and in the ISM explain output:
{
  "ocp-prod-int-prod-000008": {
    "index.plugins.index_state_management.policy_id": "delete_policy",
    "index.opendistro.index_state_management.policy_id": "delete_policy",
    "index": "ocp-prod-int-prod-000008",
    "index_uuid": "0iAeiA9RTzuCwLmvfNzVCw",
    "policy_id": "delete_policy",
    "policy_seq_no": 874381947,
    "policy_primary_term": 11,
    "index_creation_date": 1745059907830,
    "state": {
      "name": "hot",
      "start_time": 1745223088184
    }
  }
}
As you can see, the index settings contain the correct policy_id (and rollover_alias) from the index template, but ISM matched the index to delete_policy.
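For reference, the explain output above was taken from the ISM explain API:

```
GET _plugins/_ism/explain/ocp-prod-int-prod-000008
```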
index pri.store.size pri
ocp-prod-int-prod-000018 38.6gb 5
ocp-prod-int-prod-000005 75.4gb 5
ocp-prod-int-prod-000004 75.5gb 5
ocp-prod-int-prod-000014 75.6gb 5
ocp-prod-int-prod-000006 75.7gb 5
ocp-prod-int-prod-000017 75.8gb 5
ocp-prod-int-prod-000002 76gb 5
ocp-prod-int-prod-000012 76.1gb 5
ocp-prod-int-prod-000003 76.4gb 5
ocp-prod-int-prod-000011 76.8gb 5
ocp-prod-int-prod-000016 77.5gb 5
ocp-prod-int-prod-000007 78gb 5
ocp-prod-int-prod-000009 78.1gb 5
ocp-prod-int-prod-000010 112.8gb 5
ocp-prod-int-prod-000013 115.5gb 5
ocp-prod-int-prod-000001 121.3gb 5
ocp-prod-int-prod-000015 169.5gb 5
ocp-prod-int-prod-000008 792.6gb 5
As you can see, on cluster prod-2 this happened with indices 000001, 000008, 000010, 000013 and 000015.
On cluster prod-1 it happened only once, with index 000004:
index pri.store.size pri
ocp-prod-int-prod-000016 59.3gb 5
ocp-prod-int-prod-000002 75.3gb 5
ocp-prod-int-prod-000011 75.8gb 5
ocp-prod-int-prod-000007 75.9gb 5
ocp-prod-int-prod-000001 76.3gb 5
ocp-prod-int-prod-000010 76.4gb 5
ocp-prod-int-prod-000013 76.4gb 5
ocp-prod-int-prod-000012 76.5gb 5
ocp-prod-int-prod-000009 77.2gb 5
ocp-prod-int-prod-000015 77.3gb 5
ocp-prod-int-prod-000006 77.5gb 5
ocp-prod-int-prod-000003 77.6gb 5
ocp-prod-int-prod-000014 77.7gb 5
ocp-prod-int-prod-000008 77.7gb 5
ocp-prod-int-prod-000005 78.1gb 5
ocp-prod-int-prod-000004 843.8gb 5
We double-checked the configs, master node stats, etc., and have no clue. Why does this happen, and why so randomly?
The only related entry in the logs is:
[2025-04-21T23:50:52,103][INFO ][o.o.i.i.ManagedIndexCoordinator] [prod-cluster-opensearch-cluster-masters-1] Index [ocp-prod-int-prod-000015] matched ISM policy template and will be managed by delete_policy
Why does ISM periodically match the index to delete_policy, when both the template and the index settings contain
"plugins": {
  "index_state_management": {
    "policy_id": "ocp-prod-int-prod"
  }
}
and the ocp-prod-int-prod policy has the higher ism_template priority (10 vs 0)?