Hi, I have a question. When I try to upload one of the simple pretrained models through the API in OpenSearch 2.6:
POST /_plugins/_ml/models/_upload
{
"name": "huggingface/sentence-transformers/all-MiniLM-L12-v2",
"version": "1.0.1",
"model_format": "TORCH_SCRIPT"
}
I get a response with the following error:
{
"task_type": "UPLOAD_MODEL",
"function_name": "TEXT_EMBEDDING",
"state": "FAILED",
"worker_node": [
"yBvD1Q_vSLi8si1SKUvm5Q"
],
"create_time": 1681260277521,
"last_update_time": 1681260277551,
"error": "Native Memory Circuit Breaker is open, please check your resources!",
"is_async": true
}
Tasks always run on data nodes with 32 GB RAM (28 GB JVM heap), 4 TB HDD, and 16 CPUs.
The data nodes are under almost no load, so these resources should be more than enough to load the model.
Does anyone know what causes this error?
Thank you, this helped solve the problem.
I set plugins.ml_commons.native_memory_threshold to 100 without restarting the nodes, since this setting is dynamic, and tried loading the model again.
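For reference, this setting can be changed at runtime through the cluster settings API, so no restart is needed (setting it to 100 effectively disables the native memory circuit breaker, since the breaker only trips above that percentage of native memory use):

PUT /_cluster/settings
{
"persistent": {
"plugins.ml_commons.native_memory_threshold": 100
}
}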
But when I execute:
GET /_plugins/_ml/tasks/W2ikc4cB2GhI_wXsfqRg
the response should eventually contain a model_id. After some time, I execute it again:
GET /_plugins/_ml/tasks/W2ikc4cB2GhI_wXsfqRg
and get a response with this error:
{
"task_type": "UPLOAD_MODEL",
"function_name": "TEXT_EMBEDDING",
"state": "FAILED",
"worker_node": [
"G2_UQ118RcyJEbydg2HWtw"
],
"create_time": 1681272372831,
"last_update_time": 1681272503894,
"error": "Connection timed out",
"is_async": true
}
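While debugging this, I poll the task endpoint by hand. A small client-side helper makes the retry loop explicit; this is just a sketch (wait_for_model_id and the fetch_task callable are my own names, not ML Commons APIs), assuming the task response shape shown above:

```python
import time


def wait_for_model_id(fetch_task, poll_seconds=2.0, timeout_seconds=120.0):
    """Poll an ML Commons upload task until it finishes, then return the model_id.

    fetch_task is any callable that returns the parsed JSON body of
    GET /_plugins/_ml/tasks/<task_id> as a dict.
    """
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        task = fetch_task()
        state = task.get("state")
        if state == "COMPLETED":
            # A completed upload task carries the model_id of the stored model.
            return task["model_id"]
        if state == "FAILED":
            raise RuntimeError(task.get("error", "task failed"))
        time.sleep(poll_seconds)
    raise TimeoutError("model upload task did not complete in time")
```

The fetch_task callable would typically wrap an HTTP GET against the cluster; keeping it injectable makes the loop easy to test without a live node.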
Not sure why it's timing out. I assume you restarted OpenSearch after making the configuration change? You have way more heap than I do. I kind of wonder if there is a cache or something involved.
Not sure if this will help, but here is what I did after applying those configs shown above.
The following example request uploads version 1.0.0 of a natural language processing (NLP) sentence transformation model named all-MiniLM-L6-v2:
POST /_plugins/_ml/models/_upload
{
"name": "all-MiniLM-L6-v2",
"version": "1.0.0",
"description": "test model",
"model_format": "TORCH_SCRIPT",
"model_config": {
"model_type": "bert",
"embedding_dimension": 384,
"framework_type": "sentence_transformers"
},
"url": "https://github.com/opensearch-project/ml-commons/raw/2.x/ml-algorithms/src/test/resources/org/opensearch/ml/engine/algorithms/text_embedding/all-MiniLM-L6-v2_torchscript_sentence-transformer.zip?raw=true"
}
OpenSearch then responds with the task_id and task status:
{
"task_id" : "ew8I44MBhyWuIwnfvDIH",
"status" : "CREATED"
}
This example request uses the task_id from the upload example.
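The request in question is the task-status call with that task_id:

GET /_plugins/_ml/tasks/ew8I44MBhyWuIwnfvDIH

Once the task state is COMPLETED, the response includes a model_id. Per the 2.x ML Commons docs, that model_id can then be loaded onto the nodes (replace <model_id> with the actual value from the task response):

POST /_plugins/_ml/models/<model_id>/_load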
This is odd, not sure what's going on. It looks like you have the upload already, judging from the screenshot. Have you checked all your logs for a clue?
EDIT: Maybe better yet, just an idea: remove those models and start over?