Sequence of multiple actions

The documentation is a little unclear on one point about how multiple actions are sequenced. It states:

ISM executes actions in the order in which they are defined. For example, if you define actions: [A,B,C,D], ISM executes A and then enters into a timeout period. After the timeout period is over, ISM executes B, and this process repeats.

What isn’t clear is what happens if the timeout period is not specified. Will OpenSearch execute the actions asynchronously, or will it wait as long as it takes to receive a response before moving to the next action?


Hey @badarsebard, not going to lie, I really had to dive deep on this one, and I am glad I did. I think the documentation might actually be worded strangely here. I’ll link the code I found below; it seems to imply that the timeout is used to stop an execution from running forever.

https://github.com/opensearch-project/index-management/blob/main/src/main/kotlin/org/opensearch/indexmanagement/indexstatemanagement/ManagedIndexRunner.kt#L282

I am going to verify with someone more familiar with the plugins that what I found is the same action timeout mentioned in the documentation.

Back again @badarsebard. I confirmed with @thalurur that what I told you is correct: actions execute synchronously. The timeout is only meant to be used as a circuit breaker so that an action will not run forever. I opened an issue on GitHub to get the documentation clarified, in case you would like to submit a PR! :smiley:

https://github.com/opensearch-project/documentation-website/issues/393

Thanks for highlighting this!
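
To make the "circuit breaker" reading concrete, here is a minimal Python sketch. It is not the plugin's code: the `ActionRun` type, the `has_timed_out` and `next_step` helpers, and the choice to fail the action once the timeout trips are all hypothetical and only illustrate the idea that an unset timeout never interrupts an action.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ActionRun:
    """Hypothetical record of an in-progress action (not the plugin's real type)."""
    name: str
    started_at: float            # seconds, when the action first started
    timeout_s: Optional[float]   # None means no timeout was configured


def has_timed_out(run: ActionRun, now: float) -> bool:
    # No timeout configured: the breaker never trips, so the action keeps
    # running until it completes or fails on its own.
    if run.timeout_s is None:
        return False
    return now - run.started_at > run.timeout_s


def next_step(run: ActionRun, now: float) -> str:
    # Assumption: a tripped timeout is treated as failing the action.
    return "fail_action" if has_timed_out(run, now) else "execute_action"


with_timeout = ActionRun("snapshot", started_at=0.0, timeout_s=3600.0)
without_timeout = ActionRun("snapshot", started_at=0.0, timeout_s=None)

print(next_step(with_timeout, now=1800.0))     # still inside the 1h window
print(next_step(with_timeout, now=7200.0))     # past the 1h window
print(next_step(without_timeout, now=7200.0))  # no timeout, never interrupted
```

Under these assumptions the timeout only decides whether a long-running action is abandoned; it plays no part in the pause between one action and the next.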

@dtaivpp The timeout code you linked is the actual timeout logic that you specify for an action, whereas the documentation snippet quoted by @badarsebard seems to misuse the word “timeout” when it should really just say “sleep”. That is, the job executes A, goes to sleep (based on the job_interval cluster setting), and then wakes back up and executes B. This part has nothing to do with timeouts.
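
For anyone who wants to shorten that sleep while testing, the following Python sketch builds a request body for `PUT _cluster/settings`. It assumes the setting's full key is `plugins.index_state_management.job_interval` and that its value is a whole number of minutes; check both against the ISM settings reference for your version. The `job_interval_settings` helper is hypothetical.

```python
import json


def job_interval_settings(minutes: int) -> str:
    """Build a cluster-settings body that changes the ISM job interval.

    Assumptions: the key name below and the unit (whole minutes, at least 1).
    """
    if minutes < 1:
        raise ValueError("job interval must be at least 1 minute")
    body = {
        "persistent": {
            "plugins.index_state_management.job_interval": minutes,
        }
    }
    return json.dumps(body, indent=2)


# Body to send with: PUT _cluster/settings
print(job_interval_settings(1))
```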

I read this differently than you do. Here is how I interpret it: job_interval seems to be a setting for how frequently ISM checks for transition jobs. For example, if a transition check is scheduled for 5:01 but the job interval is 5 minutes, then that job will be executed at either 5:00 or 5:05. The actions [A,B,C,D] should all execute synchronously, one after the other, as they are all tied to one particular transition.

For example:
Job_Interval = 5 min
Job 1 has actions [A,B,C] and should execute at 5:01
Job 2 has actions [X,Y,Z] and should execute at 5:11

5:00 Job 1 executes actions [A,B,C] – 5 min – 5:05 – 5 min – 5:10 Job 2 executes actions [X,Y,Z]

I may be wrong but just wanted to lay out my thinking.

Take this weird state as an example:

{
        "name": "foobar",
        "actions": [
          { "read_only": {} },
          { "notification": { ... } },
          { "replica_count": { ... } },
          { "snapshot": { ... } }
        ],
        "transitions": [
          {
            "state_name": "baz"
          }
        ]
      }

When ISM enters into this state for an index, what will happen is:

  1. Execute read_only action
  2. job_interval time elapses
  3. Execute notification action
  4. job_interval time elapses
  5. Execute replica_count action
  6. job_interval time elapses
  7. Execute snapshot action
  8. job_interval time elapses
  9. Check transition conditions (in this case always true, so set transition_to to “baz”)
  10. job_interval time elapses
  11. Do the first action in “baz” state
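
The sequence above can be sketched as a small timeline simulation in Python. This only models the description in this thread, not the plugin itself: `simulate_state` is a hypothetical helper, and it assumes that each job run performs exactly one step, that every action completes within a single run, and that runs happen exactly job_interval apart.

```python
from typing import List, Tuple


def simulate_state(
    actions: List[str],
    next_state: str,
    job_interval_min: int,
    start_min: int = 0,
) -> List[Tuple[int, str]]:
    """Return (minute, step) pairs for one pass through a state.

    Assumption: one step per job run, with job_interval_min between runs.
    """
    timeline = []
    t = start_min
    for action in actions:
        timeline.append((t, f"execute {action}"))
        t += job_interval_min  # the job sleeps until its next scheduled run
    timeline.append((t, f"check transitions -> {next_state}"))
    t += job_interval_min
    timeline.append((t, f"first action in {next_state}"))
    return timeline


steps = simulate_state(
    ["read_only", "notification", "replica_count", "snapshot"],
    next_state="baz",
    job_interval_min=5,
)
for minute, step in steps:
    print(f"t+{minute:>2} min: {step}")
```

With a 5-minute interval and these assumptions, the four actions run at t+0, t+5, t+10 and t+15, the transition check happens at t+20, and the first action of “baz” runs at t+25. An action that needs more than one run to finish would push everything after it later.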