Versions (relevant - OpenSearch/Dashboard/Server OS/Browser): Version 3.6
Describe the issue:
I cannot deploy a local pretrained or local custom model. I am running an OpenSearch cluster with 3 nodes inside a K8s cluster using the official OpenSearch Helm charts. However, when I try to register and deploy a local model, the deployment fails (see logs below). I have the same issue whether I use a pretrained OpenSearch model or a custom one.
Are any components missing that need to be installed to have a model controller available?
Configuration:
POST /_plugins/_ml/models/_register
{
"name": "huggingface/sentence-transformers/all-distilroberta-v1",
"version": "1.0.2",
"model_format": "ONNX",
"model_group_id": "R6FHtZ0BXmIlsP6jWHUm"
}
POST /_plugins/_ml/models/5wyHuZ0B0VCIyCPOv0cY/_deploy
Relevant Logs or Screenshots:
[2026-04-23T08:56:36,404][ERROR][o.o.m.m.MLModelManager ] [opensearch-cluster-master-0] No controller is deployed because the model 5wyHuZ0B0VCIyCPOv0cY is expected not having an enabled model controller. Please use the create model controller api to create one if this is unexpected.
[2026-04-23T08:56:36,413][INFO ][o.o.r.m.c.i.LocalClusterIndicesClient] [opensearch-cluster-master-0] Updating y52OuZ0BY-yMWhefKmhF from .plugins-ml-task
[2026-04-23T08:56:36,417][ERROR][o.o.m.m.MLModelManager ] [opensearch-cluster-master-2] No controller is deployed because the model 5wyHuZ0B0VCIyCPOv0cY is expected not having an enabled model controller. Please use the create model controller api to create one if this is unexpected.
[2026-04-23T08:56:36,431][ERROR][o.o.m.m.MLModelManager ] [opensearch-cluster-master-1] No controller is deployed because the model 5wyHuZ0B0VCIyCPOv0cY is expected not having an enabled model controller. Please use the create model controller api to create one if this is unexpected.
This is the first OpenSearch version in which I have used the ML features. I just tried again with a pretrained model and it now seems to work, although the log shows the same error: “No controller is deployed…”. It seems to recover and deploys the pretrained local model successfully.
My own custom local model does not work: the deployment times out after 600 seconds, with the logs constantly saying “No controller is deployed…”. Maybe there is something wrong with the custom model? However, I could register the model without any problems.
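For anyone following along: one way to see what a deploy is actually doing, rather than relying on the log lines, is to poll the task returned by the _deploy call. The following is only a minimal sketch. It assumes the ML Commons Get Task endpoint (GET /_plugins/_ml/tasks/<task_id>) and a "state" field in its response; the base URL is a placeholder, authentication and TLS settings are omitted, and the helper names are made up for illustration.

```python
import json
import time
import urllib.request

BASE_URL = "https://localhost:9200"  # assumption: adjust to your cluster


def fetch_task(task_id, base_url=BASE_URL):
    """Fetch one ML task document (no auth/TLS handling in this sketch)."""
    url = f"{base_url}/_plugins/_ml/tasks/{task_id}"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def wait_for_task(task_id, fetch=fetch_task, timeout_s=600, poll_s=5,
                  sleep=time.sleep):
    """Poll until the task leaves CREATED/RUNNING or timeout_s has elapsed.

    Returns the last task document; raises TimeoutError if the task is
    still in progress after roughly timeout_s seconds of waiting.
    """
    waited = 0
    while True:
        task = fetch(task_id)
        if task.get("state") not in ("CREATED", "RUNNING"):
            return task
        if waited >= timeout_s:
            raise TimeoutError(f"task {task_id} still {task.get('state')}")
        sleep(poll_s)
        waited += poll_s
```

Calling wait_for_task with the task_id from the _deploy response returns the final task document, which may contain an error message that is more specific than the controller log line.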
@pythagoras I’ve done some testing. I get the same error with TORCH_SCRIPT models.
This is not a breaking error. The OpenSearch ML plugin will still deploy the model and use it successfully.
According to the OpenSearch documentation, the model controller is used for per-user rate limiting.
It is not mandatory to use it.
It seems to me that it has something to do with a timeout. Is there some kind of timeout when deploying models? I just tested another pretrained OpenSearch model:
Registering worked with no problem, but deploying it partially failed. It was deployed on only 2 nodes and got stuck on the third node with “Not Responding”.
Thanks for your help. I just triggered the _deploy API endpoint again for the partially deployed model. Now it is successfully deployed on all 3 nodes inside my cluster.
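The manual workaround of re-triggering _deploy for a partially deployed model could be scripted roughly as follows. This is a sketch under assumptions, not a tested recipe: it assumes that GET /_plugins/_ml/models/<model_id> returns a "model_state" field with values such as DEPLOYING, PARTIALLY_DEPLOYED and DEPLOYED; the base URL, the helper names and the retry counts are placeholders, and authentication is omitted.

```python
import json
import time
import urllib.request

BASE_URL = "https://localhost:9200"  # assumption: adjust to your cluster


def deploy_model(model_id, base_url=BASE_URL):
    """POST the deploy request for a model (no auth/TLS handling here)."""
    url = f"{base_url}/_plugins/_ml/models/{model_id}/_deploy"
    req = urllib.request.Request(url, method="POST")
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def get_model_state(model_id, base_url=BASE_URL):
    """Return the model_state field of the model document, if present."""
    url = f"{base_url}/_plugins/_ml/models/{model_id}"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp).get("model_state")


def deploy_until_complete(model_id, deploy=deploy_model,
                          get_state=get_model_state, max_attempts=3,
                          polls_per_attempt=20, poll_s=15, sleep=time.sleep):
    """Re-trigger deploy until the model reports DEPLOYED.

    Returns the number of deploy attempts that were needed.
    """
    for attempt in range(1, max_attempts + 1):
        deploy(model_id)
        for _ in range(polls_per_attempt):
            sleep(poll_s)
            state = get_state(model_id)
            if state == "DEPLOYED":
                return attempt
            if state != "DEPLOYING":
                break  # e.g. PARTIALLY_DEPLOYED: trigger deploy again
    raise RuntimeError(f"model {model_id} not fully deployed "
                       f"after {max_attempts} attempts")
```

Whether blind retries are a good idea depends on why the third node did not respond; if a node is short on memory, retrying will probably not help.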
I will try to configure the timeout. I assume this is the reason my custom model could not be deployed successfully, as it is double the size of the pretrained models. Maybe that's why it never succeeded.