ManagedIndexRunner stops rollover and transition attempts

Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):
OpenSearch 1.3.6
RHEL 8.10

Describe the issue:
I am experiencing an issue where some or all indices fail to roll over despite matching the criteria defined in the ISM policy. I am troubleshooting this across multiple environments that include an OpenSearch component, and the pattern varies slightly with each environment.
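
For context, the per-index ISM status can be checked with the Explain API; the index name below is one of the sanitized placeholders used in the logs:

GET _plugins/_ism/explain/dataindex_primarymetric-000047

The response should include the attached policy ID along with the current state, action, and step status for the index.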

In one environment (a 5-node OpenSearch cluster), I can see a change in the OpenSearch log behavior coinciding with an OS patching event during which the servers were rebooted and the OpenSearch services restarted. The entries prior to patching looked similar to the following:

[2024-10-15T00:00:02,454][INFO ][o.o.i.i.ManagedIndexRunner] [192.0.2.71] Executing attempt_transition_step for dataindex_primarymetric-000044
[2024-10-15T00:00:02,455][INFO ][o.o.i.i.ManagedIndexRunner] [192.0.2.71] Finished executing attempt_transition_step for dataindex_primarymetric-000044
[2024-10-15T00:00:02,489][INFO ][o.o.j.s.JobScheduler     ] [192.0.2.71] Will delay 96923 miliseconds for next execution of job dataindex_secondarymetric-000165
[2024-10-15T00:00:02,616][INFO ][o.o.j.s.JobScheduler     ] [192.0.2.71] Will delay 132407 miliseconds for next execution of job dataindex_tertiarymetric-000161

After patching, the logs only showed the JobScheduler staggering events, with no entries related to ManagedIndexRunner:

[2024-10-16T00:00:05,740][INFO ][o.o.j.s.JobScheduler     ] [192.0.2.71] Will delay 100262 miliseconds for next execution of job dataindex_primarymetric-000047
[2024-10-16T00:00:05,998][INFO ][o.o.j.s.JobScheduler     ] [192.0.2.71] Will delay 121054 miliseconds for next execution of job dataindex_secondarymetric-000169
[2024-10-16T00:00:06,037][INFO ][o.o.j.s.JobScheduler     ] [192.0.2.71] Will delay 31765 miliseconds for next execution of job dataindex_tertiarymetric-000163

For this environment, even after a successful manual rollover of one of the larger indices that had failed to roll over automatically, no new ManagedIndexRunner entries appeared in the logs. My team is looking to do a clean stop and rolling start of services at the earliest opportunity to see whether a fresh start “resolves” the issue.
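
For reference, the manual rollover was done with something along the lines of the following, using the standard rollover API (the write alias name here is a placeholder following the same sanitization as the index names):

POST dataindex_primarymetric-write/_rollover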

In a separate environment (a single-node OpenSearch instance), only one particularly large index had no ManagedIndexRunner rollover/transition evaluation checks in the logs, while all other indices had check entries and were rolling over just fine. In this environment, after a manual rollover of the problem index was performed, the ManagedIndexRunner entries started showing up for that index again.
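
One sanity check worth noting for a problem index is whether the rollover alias setting is still in place (index name is again a placeholder):

GET dataindex_primarymetric-000047/_settings/index.plugins.index_state_management.rollover_alias

As I understand it, a missing rollover_alias would normally surface as an explicit rollover failure in the explain output rather than the silence described above.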

I have not found any bug reports against this version of OpenSearch describing this type of issue, and I am having difficulty finding a common thread as to why sometimes just one index, and sometimes most indices in an environment, stop having rollover/transition evaluations performed by the running OpenSearch instance(s). If anyone has seen this behavior or can provide guidance or insight, it would be greatly appreciated. Happy to provide any further information I can to aid in troubleshooting.
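
To rule out a settings problem, I also plan to verify the ISM job settings in each environment; if I am reading the documentation for this version correctly, the relevant cluster settings can be dumped with:

GET _cluster/settings?include_defaults=true&flat_settings=true

and then checked in the output:

plugins.index_state_management.enabled        (should be true)
plugins.index_state_management.job_interval   (default 5, in minutes)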

Configuration:

{
    "id": "policy_standard_rollover",
    "seqNo": 0,
    "primaryTerm": 1,
    "policy": {
        "policy_id": "policy_standard_rollover",
        "description": "Standard Rollover Policy",
        "last_updated_time": 1713902386945,
        "schema_version": 18,
        "error_notification": null,
        "default_state": "hot",
        "states": [
            {
                "name": "hot",
                "actions": [
                    {
                        "retry": {
                            "count": 3,
                            "backoff": "exponential",
                            "delay": "1m"
                        },
                        "rollover": {
                            "min_size": "50gb"
                        }
                    }
                ],
                "transitions": [
                    {
                        "state_name": "warm"
                    }
                ]
            },
            {
                "name": "warm",
                "actions": [
                    {
                        "retry": {
                            "count": 3,
                            "backoff": "exponential",
                            "delay": "1m"
                        },
                        "replica_count": {
                            "number_of_replicas": 1
                        }
                    }
                ],
                "transitions": [
                    {
                        "state_name": "delete",
                        "conditions": {
                            "min_index_age": "30d"
                        }
                    }
                ]
            },
            {
                "name": "delete",
                "actions": [
                    {
                        "retry": {
                            "count": 3,
                            "backoff": "exponential",
                            "delay": "1m"
                        },
                        "delete": {}
                    }
                ],
                "transitions": []
            }
        ],
        "ism_template": [
            {
                "index_patterns": [
                    "dataindex_primarymetric-*",
                    "dataindex_secondarymetric-*",
                    "dataindex_tertiarymetric-*"
                ],
                "priority": 1,
                "last_updated_time": 1713902386945
            }
        ]
    }
}
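
For completeness, the same policy can also be fetched directly through the ISM API; the JSON above was copied from the Dashboards ISM view, which I believe accounts for the camelCase wrapper fields (id, seqNo, primaryTerm):

GET _plugins/_ism/policies/policy_standard_rollover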

Relevant Logs or Screenshots:
See the log excerpts included in the issue description above.