Error When Loading Embedding Model Into Memory

Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):

  • AWS OpenSearch Service 2.7 Dashboard
  • Google Chrome

Describe the issue:

I’m attempting to load a pretrained embedding model into memory. First, I register it with:

POST /_plugins/_ml/models/_register
{
  "name": "huggingface/sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2",
  "version": "1.0.1",
  "model_format": "TORCH_SCRIPT"
}
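The register call itself is asynchronous and responds with a task ID (a sketch; the status value is illustrative, and the task ID matches the one polled below):

{
  "task_id": "k4w9r4kBBiUmBL-z11AC",
  "status": "CREATED"
}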

The registration completes successfully, and I’m provided with a model ID when I check the task:

GET /_plugins/_ml/tasks/k4w9r4kBBiUmBL-z11AC

Response:

{
  "model_id": "lIw9r4kBBiUmBL-z3FC-",
  "task_type": "DEPLOY_MODEL",
  "function_name": "TEXT_EMBEDDING",
  "state": "COMPLETED",
  "worker_node": [
    "l8xgkWdfQNGhuez3MPMAqw"
  ],
  "create_time": 1690862212827,
  "last_update_time": 1690862310966,
  "is_async": true
}

However, when I try to load this model into memory using this:

POST /_plugins/_ml/models/lIw9r4kBBiUmBL-z3FC-/_load

I’m given this response (from polling the resulting task):

{
  "model_id": "lIw9r4kBBiUmBL-z3FC-",
  "task_type": "DEPLOY_MODEL",
  "function_name": "TEXT_EMBEDDING",
  "state": "FAILED",
  "worker_node": [
    "l8xgkWdfQNGhuez3MPMAqw",
    "NNPgI_0_QUaENi_weYIkNQ"
  ],
  "create_time": 1690863161315,
  "last_update_time": 1690863312176,
  "error": "{\"l8xgkWdfQNGhuez3MPMAqw\":\"model content changed\",\"NNPgI_0_QUaENi_weYIkNQ\":\"model content changed\"}",
  "is_async": true
}

I cannot find any information about what this error message means. Also, I was previously able to load this model successfully and use it to create vector embeddings, which I still have saved in an index.

Are you running on macOS? There is a known issue: [BUG] Model content hash can't match original hash value · Issue #844 · opensearch-project/ml-commons · GitHub

No, this is being run on AWS. I saw the post you referenced as well, but I think it concerns something different.

Got it. Can you share your cluster settings, such as how many data nodes and which EC2 instance types? We need to reproduce the error to dive deep.

NOTE: I just changed this to 4 nodes but it was 2 nodes during the issue outlined.

Did you use dedicated master node?

Did changing to 4 nodes solve the problem?

The problem seems intermittent: sometimes the model loads successfully, other times it doesn’t. I’ve also tried using the _deploy API rather than _load, which I read somewhere is an alternative approach, but it doesn’t seem to matter.
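For reference, the _deploy call mirrors _load (same model ID as above) and likewise returns a task ID to poll:

POST /_plugins/_ml/models/lIw9r4kBBiUmBL-z3FC-/_deploy

GET /_plugins/_ml/tasks/<task_id returned by _deploy>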

I was able to reproduce the issue on my end. In my case, the first time I invoked the _load API the model was partially loaded and I was able to generate embeddings. But when I invoked the _unload API and then invoked _load again, I saw the issue.

I’ll try to deep dive more into this issue.

In the meantime, I tried to reproduce this issue with bigger instances, but couldn’t reproduce it there.

We are transitioning from _load to _deploy. In the long run, _load will be deprecated.

On a related note, another thing I’ve noticed is that a model loaded into memory (verified using GET /_plugins/_ml/profile/models) will sometimes seemingly unload on its own after a given amount of time (again verified using GET /_plugins/_ml/profile/models, which then returns an empty response of {}). The exact calls are sketched after the questions below.

  • Is this behavior expected?
  • Is this due to idle time?
  • Is it best practice to _unload a model after use?
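For reference, here are the calls in question, using the model ID from earlier in the thread; the explicit _unload at the end is what I mean by unloading after use:

GET /_plugins/_ml/profile/models

POST /_plugins/_ml/models/lIw9r4kBBiUmBL-z3FC-/_unload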

@dhrubo may help reproduce and dive deep into this problem too.

I haven’t done any research yet, but I guess it may be caused by the small EC2 instance type. t3.small.search has just 2 vCPUs and 2 GB of memory, which looks too constrained to run a model. That could cause some unexpected error that unloads the model.
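If memory pressure is the cause, one thing worth checking is the ML Commons native memory circuit breaker, plugins.ml_commons.native_memory_threshold (a documented ML Commons cluster setting, default 90; note that managed AWS domains may not allow changing it). A minimal sketch of inspecting and raising it:

GET /_cluster/settings?include_defaults=true&filter_path=defaults.plugins.ml_commons.native_memory_threshold

PUT /_cluster/settings
{
  "persistent": {
    "plugins.ml_commons.native_memory_threshold": 95
  }
}

That said, raising the threshold only masks memory pressure; a bigger instance is the more direct fix.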

@ylwu Yes, I upgraded the instance and it seems to be performing better without throwing the error this time.
