Action delete with timeout in ISM

Hi, I just started using ISM to auto-delete old indices.

My policy:

{
    "policy_id": "delete_old_index_25d",
    "description": "Delete old index older than 25 days.",
    "last_updated_time": 1630510660300,
    "schema_version": 1,
    "error_notification": null,
    "default_state": "open",
    "states": [
        {
            "name": "open",
            "actions": [],
            "transitions": [
                {
                    "state_name": "delete",
                    "conditions": {
                        "min_index_age": "25d"
                    }
                }
            ]
        },
        {
            "name": "delete",
            "actions": [
                {
                    "delete": {}
                }
            ],
            "transitions": []
        }
    ],
    "ism_template": {
        "index_patterns": [
            "my-index-name-*"
        ],
        "priority": 100,
        "last_updated_time": 1630510660300
    }
}

I checked the job result and it was

    "cause": "failed to process cluster event (delete-index [[my-index-name-2021.08.06/YMF7yT0WTY614TpzBDYSxw]]) within 30s",

My daily index is heavy, almost 3 TB. So whenever I have to delete it manually, I always add a timeout parameter to the request, like this:

DELETE my-index-name?timeout=60s

But how can I do that in ISM?
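For anyone scripting the manual delete above, here is a minimal Python sketch of the same request. It only builds the request and leaves sending to the caller. The base URL `http://localhost:9200` and the helper name `build_delete_index_request` are assumptions for illustration, not anything from ISM or the thread.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_delete_index_request(base_url: str, index: str, timeout: str = "60s") -> Request:
    """Build (but do not send) a DELETE request for one index with a timeout query parameter.

    Hypothetical helper: mirrors `DELETE my-index-name?timeout=60s` from the post.
    """
    path = quote(index, safe="")            # keep the index name URL-safe
    query = urlencode({"timeout": timeout})  # -> "timeout=60s"
    return Request(f"{base_url.rstrip('/')}/{path}?{query}", method="DELETE")


# Assumed local endpoint; replace with your own cluster address.
req = build_delete_index_request("http://localhost:9200", "my-index-name")
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req)
```

This only covers the manual API call; it does not change how ISM issues its own delete.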

I haven’t tested this but can you use timeout as described in the documentation?

LOL, why didn’t I think of that :joy: :joy: :joy: Let me try it and I’ll let you know the result :grin:
I thought the timeout described in the documentation applied to something else :smiley:

        {
            "name": "delete",
            "actions": [
                {
                    "timeout": "1h",
                    "delete": {}
                }
            ],
            "transitions": []
        }

Everything is working well, no timeouts or failed jobs so far on delete :smiley:

Glad it’s working for you!

Semi-related - there is an enhancement currently in the discussion stage about how action settings are handled. Maybe you have some feedback?

https://github.com/opensearch-project/index-management/issues/135

Hmm, that timeout shouldn’t be doing what you want, @BlackMetalz (adding a timeout to the delete operation API call). It is instead a timeout for the action itself: it applies if the action has started and isn’t able to complete within this amount of time.

We don’t currently support modifying the generic parameters of the API calls, such as timeouts. Also, a delete call doesn’t delete all the data before it is treated as successful; it adds an URGENT priority task to the queue on the master node to flag the index as deleted, and the data is cleaned up in the background. If deletes are routinely failing because of the 30s timeout, I’d take a look at your cluster to see what might be going on, e.g. whether your pending tasks are getting backed up, or the master is overloaded and can’t process the tasks quickly enough.
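One way to check whether pending tasks are backing up is the `_cluster/pending_tasks` API. Below is a rough Python sketch that summarizes its response by priority and longest wait. The base URL, the function names, and the sample payload are assumptions made up for illustration; the sample is not real output from any cluster.

```python
import json
from collections import Counter
from urllib.request import urlopen


def fetch_pending_tasks(base_url: str) -> dict:
    """GET _cluster/pending_tasks from the cluster at base_url (assumed reachable, no auth)."""
    with urlopen(f"{base_url.rstrip('/')}/_cluster/pending_tasks") as resp:
        return json.load(resp)


def summarize_pending_tasks(response: dict) -> dict:
    """Count queued cluster-state tasks per priority and report the longest queue time."""
    tasks = response.get("tasks", [])
    return {
        "total": len(tasks),
        "by_priority": dict(Counter(t.get("priority", "UNKNOWN") for t in tasks)),
        "max_wait_ms": max((t.get("time_in_queue_millis", 0) for t in tasks), default=0),
    }


# Illustrative sample only; a live call would be fetch_pending_tasks("http://localhost:9200").
sample = {
    "tasks": [
        {"insert_order": 101, "priority": "URGENT",
         "source": "delete-index [[my-index-name-2021.08.06]]",
         "time_in_queue_millis": 31000},
        {"insert_order": 102, "priority": "HIGH",
         "source": "shard-started", "time_in_queue_millis": 1200},
    ]
}
summary = summarize_pending_tasks(sample)
print(summary)
```

If `max_wait_ms` regularly gets close to 30000 around the time ISM runs the delete, that would be consistent with the queue on the master being the bottleneck rather than the size of the index.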

The cluster mostly wasn’t under much load when I ran the delete. The point is that I was deleting a lot of data at once, around 12-16 TB in a single request like DELETE index-2021-08*. That gave me a timeout, but deleting a single index of about 2-3 TB, as I mentioned above, doesn’t time out. It seems the timeout with ISM only appeared the first time, since it had to delete a lot of data; after that only one index gets deleted at a time xD