OpenSearch transform job being disabled due to error "Failed validation - [The cluster is breaching the jvm usage threshold [85], cannot execute the transform]"

Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):

2.18.0

Describe the issue:

I have a transform job scheduled in my OpenSearch cluster; it runs every 10 minutes.
Sometimes I see the transform job get disabled with this error:
“Failed validation - [The cluster is breaching the jvm usage threshold [85], cannot execute the transform]”

I think this happens when the data nodes are running heavy ISM activities such as force merge.
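
One way to check this hypothesis is to compare per-node heap usage with the threshold while an ISM action is running. Below is a minimal sketch, assuming a response shaped like the output of `GET _nodes/stats/jvm` (where each node reports `jvm.mem.heap_used_percent`). The function name, the sample node names, and the sample values are made up for illustration, and the "at or above" comparison is my assumption, not a statement about how the plugin compares the value.

```python
from typing import Any, Dict, List, Tuple


def nodes_over_threshold(stats: Dict[str, Any], threshold: int = 85) -> List[Tuple[str, int]]:
    """Return (node name, heap_used_percent) for nodes at or above the threshold."""
    hot = []
    for node_id, node in stats.get("nodes", {}).items():
        heap_pct = node.get("jvm", {}).get("mem", {}).get("heap_used_percent")
        if heap_pct is not None and heap_pct >= threshold:
            # Fall back to the node id if the name is missing.
            hot.append((node.get("name", node_id), heap_pct))
    # Highest heap usage first.
    return sorted(hot, key=lambda item: item[1], reverse=True)


# Hypothetical sample shaped like a GET _nodes/stats/jvm response (values made up).
sample_stats = {
    "nodes": {
        "a1": {"name": "data-0", "jvm": {"mem": {"heap_used_percent": 91}}},
        "b2": {"name": "data-1", "jvm": {"mem": {"heap_used_percent": 62}}},
        "c3": {"name": "manager-0", "jvm": {"mem": {"heap_used_percent": 40}}},
    }
}

print(nodes_over_threshold(sample_stats))  # [('data-0', 91)]
```

Running this against real stats captured at the time the job is disabled would show whether the nodes above 85% are the ones doing the force merge.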

What I expect is:

  1. The transform job can run on a specific node other than a data node. However, defining a “transform” node role doesn’t really help; after reading the OpenSearch docs, I understand that “transform” is not actually a node role.

  2. Alternatively, if the data node JVM usage is higher than 85%, the transform job could skip that run and try again at the next scheduled run.
    In any case, the transform job should not be disabled…

  3. I’d like to know whether there is a way to configure this [85] threshold. Is there a parameter I can use to set it to 90 or 95?

Configuration:

Relevant Logs or Screenshots:

Hi @latituder,

On your Point #3, have you tried using indices.breaker.total.limit or setting indices.breaker.total.use_real_memory: false?

Best,
mj

Thanks for your reply, but the circuit breaker settings are the part that confuses me, because my breaker settings are:

    "indices.breaker.fielddata.limit": "40%",
    "indices.breaker.fielddata.overhead": "1.03",
    "indices.breaker.fielddata.type": "memory",
    "indices.breaker.request.limit": "60%",
    "indices.breaker.request.overhead": "1.0",
    "indices.breaker.request.type": "memory",
    "indices.breaker.total.limit": "95%",
    "indices.breaker.total.use_real_memory": "true",
    "indices.breaker.type": "hierarchy",

So I expected the threshold to be 95% of the JVM heap, but my transform job failed at [85].
It looks like the circuit breaker settings don’t override the transform job’s JVM limit?
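
To see whether the [85] comes from a separate, transform-specific setting rather than from `indices.breaker.*`, one option is to dump the cluster settings with defaults (`GET _cluster/settings?include_defaults=true&flat_settings=true`) and filter the keys. Here is a minimal sketch, assuming the flat settings have already been merged into one dict. The helper name is hypothetical, and the key `plugins.transform.circuit_breaker.jvm.threshold` in the sample is an assumption on my part that should be verified against the index management plugin documentation.

```python
from typing import Dict


def settings_matching(flat_settings: Dict[str, str], *needles: str) -> Dict[str, str]:
    """Return the settings whose key contains every given substring."""
    return {
        key: value
        for key, value in flat_settings.items()
        if all(needle in key for needle in needles)
    }


# Sample flat settings: the indices.breaker.* values are the ones from this thread.
sample_settings = {
    "indices.breaker.request.limit": "60%",
    "indices.breaker.total.limit": "95%",
    "indices.breaker.total.use_real_memory": "true",
    # Assumed key and value, not confirmed in this thread.
    "plugins.transform.circuit_breaker.jvm.threshold": "85",
}

print(settings_matching(sample_settings, "transform"))
print(settings_matching(sample_settings, "indices.breaker", "limit"))
```

If a transform-specific key like this shows up in the real output, it would explain why changing `indices.breaker.total.limit` has no effect on the [85] in the error.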