Since we enabled the ml-plugin and deployed a model, we see a high search rate of 8 queries per minute even during inactive hours, when no user is performing any actions. Before enabling the plugin, the search rate would be 0 or near 0 during inactive hours.
Checking the Query Insights plugin, I can see multiple queries per minute to the .plugins-ml-model index. All of them are the same:
Is this behavior normal? Is there a way to change the rate of this query? I don’t understand why it would be necessary, even on a lightly used cluster, to query something so regularly. If it’s not normal behavior, what could be causing it?
(Sorry for the clumped screenshot but as a new user I’m only permitted to post a single image per post… Which undermines my ability to provide good information and get help… But okay…)
@pybot The query that is being performed repetitively is this one.
The model is used to vectorize queries at search time and, in an ingest pipeline, to generate the vectors for indexing. We use a pre-trained model provided by OpenSearch. We did not explicitly use batch_predict; I’m not sure whether anything could trigger it indirectly, though.
I did not configure this manually when deploying the model, nor did I make any changes to my Amazon OpenSearch config. As this has a default value of 3, shouldn’t it be a problem for every user? So is this normal behavior?
Also, what are the impacts of disabling it? Should I not disable it?
@sousu - the SyncUp job runs to keep your ML model status consistent across the data nodes in the cluster. By default it runs every 10 seconds. On each run, it queries the ml-model index to get the model status and syncs that status to all the nodes. This is expected behavior by design. If this query bothers you, please increase the interval through the plugins.ml_commons.sync_up_job_interval_in_seconds setting. See: ML Commons cluster settings - OpenSearch Documentation
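For anyone landing here later, a minimal sketch of how that setting could be changed through the cluster settings API, using only the Python standard library. The endpoint https://localhost:9200, the value 60, and the helper name build_settings_request are assumptions for illustration; a managed domain such as Amazon OpenSearch Service will also need authentication, which is left out here. The block only builds and prints the request; it does not send it.

```python
import json
import urllib.request

# Setting name taken from the reply above.
SETTING = "plugins.ml_commons.sync_up_job_interval_in_seconds"


def build_settings_request(base_url, interval_seconds):
    """Build a PUT _cluster/settings request that changes the sync-up interval.

    base_url and interval_seconds are caller-supplied; nothing is sent here.
    """
    if interval_seconds < 0:
        raise ValueError("interval_seconds must be 0 or greater")
    # "persistent" keeps the value across cluster restarts.
    body = {"persistent": {SETTING: interval_seconds}}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/_cluster/settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Hypothetical endpoint and interval, shown only to inspect the request.
request = build_settings_request("https://localhost:9200", 60)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))

# To actually apply it (requires credentials and TLS setup for your cluster):
# urllib.request.urlopen(request)
```

The same JSON body can be pasted into Dev Tools as a PUT _cluster/settings call if you would rather not script it.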
I set it to 0 and can confirm that the query rate dropped immediately. I have a 1-node cluster, so as I understand it I don’t need to sync anything? So could I leave it disabled with no drawbacks? Maybe it could be disabled by default when discovery.type is set to single-node?