I apologize in advance: I’m just now getting into ML, so this is all very new to me and I’m still trying to get a firm handle on things.
I’ve followed the various OpenSearch tutorials and configured my OpenSearch 2.11.1 instance with a very basic setup to test semantic search. I’ve got the following configured, which appears (for the most part) to be working correctly:
- Have a model group configured for my models
- Have a single model (huggingface/sentence-transformers/msmarco-distilbert-base-tas-b) configured in the model group
- Created my ingest pipeline to map which fields need vector embeddings (a rough sketch of the pipeline and index setup follows this list)
- Created my index, which uses the ingest pipeline
- Have my search pipeline created to normalize “hybrid” search results
- Can successfully perform searches against indexed documents
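For reference, the ingest pipeline and index were created roughly like this. The pipeline name and field names below are simplified stand-ins for my actual configuration; the index name and model ID are the ones that show up in the error further down:

```
PUT /_ingest/pipeline/ml_semantic_search_pipeline
{
  "description": "Generate embeddings for the semantic search fields",
  "processors": [
    {
      "text_embedding": {
        "model_id": "cvZ8X4wBErSzX7VK46s0",
        "field_map": {
          "body_text": "body_text_embedding"
        }
      }
    }
  ]
}

PUT /ml_semantic_search_testing
{
  "settings": {
    "index.knn": true,
    "default_pipeline": "ml_semantic_search_pipeline"
  },
  "mappings": {
    "properties": {
      "body_text": { "type": "text" },
      "body_text_embedding": {
        "type": "knn_vector",
        "dimension": 768,
        "method": {
          "name": "hnsw",
          "engine": "lucene",
          "space_type": "l2"
        }
      }
    }
  }
}
```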
However, when I try to bulk index content, some of my documents fail to be indexed. Examining the output of my bulk index operations, I can see that the failures happen in the inference stage. For the articles that fail, the bulk response contains something like the following error:
{
  "took" : 0,
  "ingest_took" : 45,
  "errors" : true,
  "items" : [
    {
      "index" : {
        "_index" : "ml_semantic_search_testing",
        "_id" : "60329",
        "status" : 500,
        "error" : {
          "type" : "m_l_exception",
          "reason" : "Failed to inference TEXT_EMBEDDING model: cvZ8X4wBErSzX7VK46s0",
          "caused_by" : {
            "type" : "privileged_action_exception",
            "reason" : null,
            "caused_by" : {
              "type" : "translate_exception",
"reason" : "ai.djl.engine.EngineException: The following operation failed in the TorchScript interpreter.\nTraceback of TorchScript, serialized code (most recent call last):\n File \"code/__torch__/sentence_transformers/SentenceTransformer.py\", line 14, in forward\n input_ids = input[\"input_ids\"]\n mask = input[\"attention_mask\"]\n _2 = (_0).forward(input_ids, mask, )\n ~~~~~~~~~~~ <--- HERE\n _3 = {\"input_ids\": input_ids, \"attention_mask\": mask, \"token_embeddings\": _2, \"sentence_embedding\": (_1).forward(_2, )}\n return _3\n File \"code/__torch__/sentence_transformers/models/Transformer.py\", line 11, in forward\n mask: Tensor) -> Tensor:\n auto_model = self.auto_model\n _0 = (auto_model).forward(input_ids, mask, )\n ~~~~~~~~~~~~~~~~~~~ <--- HERE\n return _0\n File \"code/__torch__/transformers/models/distilbert/modeling_distilbert.py\", line 13, in forward\n transformer = self.transformer\n embeddings = self.embeddings\n _0 = (transformer).forward((embeddings).forward(input_ids, ), mask, )\n ~~~~~~~~~~~~~~~~~~~ <--- HERE\n return _0\nclass Embeddings(Module):\n File \"code/__torch__/transformers/models/distilbert/modeling_distilbert.py\", line 38, in forward\n _3 = (word_embeddings).forward(input_ids, )\n _4 = (position_embeddings).forward(input, )\n input0 = torch.add(_3, _4)\n ~~~~~~~~~ <--- HERE\n _5 = (dropout).forward((LayerNorm).forward(input0, ), )\n return _5\n\nTraceback of TorchScript, original code (most recent call last):\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/transformers/models/distilbert/modeling_distilbert.py(130): forward\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/torch/nn/modules/module.py(1182): _slow_forward\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/torch/nn/modules/module.py(1194): _call_impl\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/transformers/models/distilbert/modeling_distilbert.py(578): forward\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/torch/nn/modules/module.py(1182): _slow_forward\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/torch/nn/modules/module.py(1194): _call_impl\n/usr/local/lib/python3.9/site-packages/sentence_transformers/models/Transformer.py(66): forward\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/torch/nn/modules/module.py(1182): _slow_forward\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/torch/nn/modules/module.py(1194): _call_impl\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/torch/nn/modules/container.py(204): forward\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/torch/nn/modules/module.py(1182): _slow_forward\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/torch/nn/modules/module.py(1194): _call_impl\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/torch/jit/_trace.py(976): trace_module\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/torch/jit/_trace.py(759): trace\n/Volumes/workplace/opensearch-py-ml/src/opensearch-py-ml/opensearch_py_ml/ml_models/sentencetransformermodel.py(778): save_as_pt\n/Volumes/workplace/opensearch-py-ml/src/opensearch-py-ml/test.py(34): <module>\nRuntimeError: The size of tensor a (611) must match the size of tensor b (512) at non-singleton dimension 1\n",
"caused_by" : {
"type" : "engine_exception",
"reason" : "The following operation failed in the TorchScript interpreter.\nTraceback of TorchScript, serialized code (most recent call last):\n File \"code/__torch__/sentence_transformers/SentenceTransformer.py\", line 14, in forward\n input_ids = input[\"input_ids\"]\n mask = input[\"attention_mask\"]\n _2 = (_0).forward(input_ids, mask, )\n ~~~~~~~~~~~ <--- HERE\n _3 = {\"input_ids\": input_ids, \"attention_mask\": mask, \"token_embeddings\": _2, \"sentence_embedding\": (_1).forward(_2, )}\n return _3\n File \"code/__torch__/sentence_transformers/models/Transformer.py\", line 11, in forward\n mask: Tensor) -> Tensor:\n auto_model = self.auto_model\n _0 = (auto_model).forward(input_ids, mask, )\n ~~~~~~~~~~~~~~~~~~~ <--- HERE\n return _0\n File \"code/__torch__/transformers/models/distilbert/modeling_distilbert.py\", line 13, in forward\n transformer = self.transformer\n embeddings = self.embeddings\n _0 = (transformer).forward((embeddings).forward(input_ids, ), mask, )\n ~~~~~~~~~~~~~~~~~~~ <--- HERE\n return _0\nclass Embeddings(Module):\n File \"code/__torch__/transformers/models/distilbert/modeling_distilbert.py\", line 38, in forward\n _3 = (word_embeddings).forward(input_ids, )\n _4 = (position_embeddings).forward(input, )\n input0 = torch.add(_3, _4)\n ~~~~~~~~~ <--- HERE\n _5 = (dropout).forward((LayerNorm).forward(input0, ), )\n return _5\n\nTraceback of TorchScript, original code (most recent call last):\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/transformers/models/distilbert/modeling_distilbert.py(130): forward\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/torch/nn/modules/module.py(1182): _slow_forward\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/torch/nn/modules/module.py(1194): _call_impl\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/transformers/models/distilbert/modeling_distilbert.py(578): forward\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/torch/nn/modules/module.py(1182): _slow_forward\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/torch/nn/modules/module.py(1194): _call_impl\n/usr/local/lib/python3.9/site-packages/sentence_transformers/models/Transformer.py(66): forward\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/torch/nn/modules/module.py(1182): _slow_forward\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/torch/nn/modules/module.py(1194): _call_impl\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/torch/nn/modules/container.py(204): forward\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/torch/nn/modules/module.py(1182): _slow_forward\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/torch/nn/modules/module.py(1194): _call_impl\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/torch/jit/_trace.py(976): trace_module\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/torch/jit/_trace.py(759): trace\n/Volumes/workplace/opensearch-py-ml/src/opensearch-py-ml/opensearch_py_ml/ml_models/sentencetransformermodel.py(778): save_as_pt\n/Volumes/workplace/opensearch-py-ml/src/opensearch-py-ml/test.py(34): <module>\nRuntimeError: The size of tensor a (611) must match the size of tensor b (512) at non-singleton dimension 1\n"
              }
            }
          }
        }
      }
    }
  ]
}
Based on that error stack, it seems the root problem is with the TorchScript interpreter:
The size of tensor a (611) must match the size of tensor b (512) at non-singleton dimension 1
From what I can tell, this happens because the text being embedded for that document is longer than the model can handle: DistilBERT-based models like this one have a maximum sequence length of 512 tokens, and the failing field apparently tokenizes to 611 tokens.
I was under the impression that OpenSearch would manage this under the hood (for example, by truncating the input to the model’s maximum length), but perhaps there is a problem with my configuration or my understanding. I’m seeing this kind of failure in roughly 5-10% of the documents I’m trying to index.
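If I wanted to confirm that length is the culprit, I believe I could take the text of a failing document and send it straight to the model via the ML Commons predict API, something like this (assuming I’m using the API correctly; the model ID is the one from the error above):

```
POST /_plugins/_ml/_predict/text_embedding/cvZ8X4wBErSzX7VK46s0
{
  "text_docs": ["<the full text of the failing field from document 60329>"],
  "return_number": true,
  "target_response": ["sentence_embedding"]
}
```

My expectation is that long passages would trigger the same tensor-size error, while shorter passages would return an embedding.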
How should I go about resolving these kinds of issues?
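For what it’s worth, the only workaround I’ve thought of so far is to crudely cap the field length in the ingest pipeline with a script processor before the text_embedding processor runs, along these lines (the field and pipeline names are the same illustrative ones as in my sketch above, and the 2,000-character cutoff is just a rough guess at what stays under 512 tokens):

```
PUT /_ingest/pipeline/ml_semantic_search_pipeline
{
  "description": "Truncate long text, then generate embeddings",
  "processors": [
    {
      "script": {
        "lang": "painless",
        "source": "if (ctx.body_text != null && ctx.body_text.length() > 2000) { ctx.body_text = ctx.body_text.substring(0, 2000); }"
      }
    },
    {
      "text_embedding": {
        "model_id": "cvZ8X4wBErSzX7VK46s0",
        "field_map": {
          "body_text": "body_text_embedding"
        }
      }
    }
  ]
}
```

That feels like a hack, though: characters aren’t tokens, and it silently drops content. Is there a more correct way to handle long documents, such as having the tokenizer truncate automatically or chunking the text before embedding?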