TorchScript interpreter exceptions during index operations

I apologize in advance, because I’m just now getting into ML so this is all very new and I’m still trying to really get a firm handle on things.

I’ve followed the various OpenSearch tutorials and have configured my OpenSearch 2.11.1 instance with a very basic setup to test semantic searches. I’ve got the following configured, which appears (for the most part) to be working correctly:

  • Have a model group configured for my models
  • Have a single model (huggingface/sentence-transformers/msmarco-distilbert-base-tas-b) configured in the model group
  • Created my ingest pipeline to map which fields need vector embeddings
  • Created my index which uses the ingest pipeline
  • Have my search pipeline created to normalize “hybrid” search results
  • Can successfully perform searches against indexed documents.

However, when I try to bulk index content, some of my documents fail to index. Examining the output from my bulk index operations, I’m seeing failures in the inference stage. For articles that fail, the bulk response contains something like the following error:

  "took" : 0,
  "ingest_took" : 45,
  "errors" : true,
  "items" : [
      "index" : {
        "_index" : "ml_semantic_search_testing",
        "_id" : "60329",
        "status" : 500,
        "error" : {
          "type" : "m_l_exception",
          "reason" : "Failed to inference TEXT_EMBEDDING model: cvZ8X4wBErSzX7VK46s0",
          "caused_by" : {
            "type" : "privileged_action_exception",
            "reason" : null,
            "caused_by" : {
              "type" : "translate_exception",
              "reason" : "ai.djl.engine.EngineException: The following operation failed in the TorchScript interpreter.\nTraceback of TorchScript, serialized code (most recent call last):\n  File \"code/__torch__/sentence_transformers/\", line 14, in forward\n    input_ids = input[\"input_ids\"]\n    mask = input[\"attention_mask\"]\n    _2 = (_0).forward(input_ids, mask, )\n          ~~~~~~~~~~~ <--- HERE\n    _3 = {\"input_ids\": input_ids, \"attention_mask\": mask, \"token_embeddings\": _2, \"sentence_embedding\": (_1).forward(_2, )}\n    return _3\n  File \"code/__torch__/sentence_transformers/models/\", line 11, in forward\n    mask: Tensor) -> Tensor:\n    auto_model = self.auto_model\n    _0 = (auto_model).forward(input_ids, mask, )\n          ~~~~~~~~~~~~~~~~~~~ <--- HERE\n    return _0\n  File \"code/__torch__/transformers/models/distilbert/\", line 13, in forward\n    transformer = self.transformer\n    embeddings = self.embeddings\n    _0 = (transformer).forward((embeddings).forward(input_ids, ), mask, )\n                                ~~~~~~~~~~~~~~~~~~~ <--- HERE\n    return _0\nclass Embeddings(Module):\n  File \"code/__torch__/transformers/models/distilbert/\", line 38, in forward\n    _3 = (word_embeddings).forward(input_ids, )\n    _4 = (position_embeddings).forward(input, )\n    input0 = torch.add(_3, _4)\n             ~~~~~~~~~ <--- HERE\n    _5 = (dropout).forward((LayerNorm).forward(input0, ), )\n    return _5\n\nTraceback of TorchScript, original code (most recent call last):\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/transformers/models/distilbert/ forward\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/torch/nn/modules/ _slow_forward\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/torch/nn/modules/ _call_impl\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/transformers/models/distilbert/ forward\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/torch/nn/modules/ 
_slow_forward\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/torch/nn/modules/ _call_impl\n/usr/local/lib/python3.9/site-packages/sentence_transformers/models/ forward\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/torch/nn/modules/ _slow_forward\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/torch/nn/modules/ _call_impl\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/torch/nn/modules/ forward\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/torch/nn/modules/ _slow_forward\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/torch/nn/modules/ _call_impl\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/torch/jit/ trace_module\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/torch/jit/ trace\n/Volumes/workplace/opensearch-py-ml/src/opensearch-py-ml/opensearch_py_ml/ml_models/ save_as_pt\n/Volumes/workplace/opensearch-py-ml/src/opensearch-py-ml/ <module>\nRuntimeError: The size of tensor a (611) must match the size of tensor b (512) at non-singleton dimension 1\n",
              "caused_by" : {
                "type" : "engine_exception",
                "reason" : "The following operation failed in the TorchScript interpreter.\nTraceback of TorchScript, serialized code (most recent call last):\n  File \"code/__torch__/sentence_transformers/\", line 14, in forward\n    input_ids = input[\"input_ids\"]\n    mask = input[\"attention_mask\"]\n    _2 = (_0).forward(input_ids, mask, )\n          ~~~~~~~~~~~ <--- HERE\n    _3 = {\"input_ids\": input_ids, \"attention_mask\": mask, \"token_embeddings\": _2, \"sentence_embedding\": (_1).forward(_2, )}\n    return _3\n  File \"code/__torch__/sentence_transformers/models/\", line 11, in forward\n    mask: Tensor) -> Tensor:\n    auto_model = self.auto_model\n    _0 = (auto_model).forward(input_ids, mask, )\n          ~~~~~~~~~~~~~~~~~~~ <--- HERE\n    return _0\n  File \"code/__torch__/transformers/models/distilbert/\", line 13, in forward\n    transformer = self.transformer\n    embeddings = self.embeddings\n    _0 = (transformer).forward((embeddings).forward(input_ids, ), mask, )\n                                ~~~~~~~~~~~~~~~~~~~ <--- HERE\n    return _0\nclass Embeddings(Module):\n  File \"code/__torch__/transformers/models/distilbert/\", line 38, in forward\n    _3 = (word_embeddings).forward(input_ids, )\n    _4 = (position_embeddings).forward(input, )\n    input0 = torch.add(_3, _4)\n             ~~~~~~~~~ <--- HERE\n    _5 = (dropout).forward((LayerNorm).forward(input0, ), )\n    return _5\n\nTraceback of TorchScript, original code (most recent call last):\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/transformers/models/distilbert/ forward\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/torch/nn/modules/ _slow_forward\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/torch/nn/modules/ _call_impl\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/transformers/models/distilbert/ forward\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/torch/nn/modules/ 
_slow_forward\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/torch/nn/modules/ _call_impl\n/usr/local/lib/python3.9/site-packages/sentence_transformers/models/ forward\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/torch/nn/modules/ _slow_forward\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/torch/nn/modules/ _call_impl\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/torch/nn/modules/ forward\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/torch/nn/modules/ _slow_forward\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/torch/nn/modules/ _call_impl\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/torch/jit/ trace_module\n/Users/dhrubo/Library/Python/3.9/lib/python/site-packages/torch/jit/ trace\n/Volumes/workplace/opensearch-py-ml/src/opensearch-py-ml/opensearch_py_ml/ml_models/ save_as_pt\n/Volumes/workplace/opensearch-py-ml/src/opensearch-py-ml/ <module>\nRuntimeError: The size of tensor a (611) must match the size of tensor b (512) at non-singleton dimension 1\n"

Based on that error stack, it seems the root problem is with the TorchScript interpreter:

The size of tensor a (611) must match the size of tensor b (512) at non-singleton dimension 1

From what I can tell, this is because one of the sentences in the document being indexed is too long.

I was under the impression that this was something that OpenSearch would be managing under the hood, but perhaps I have a problem in my configuration or understanding. I’m seeing this kind of problem in about 5-10% of the documents I’m trying to index.
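For anyone triaging the same thing, here is a minimal Python sketch of how the bulk response could be scanned to list the failing documents and the innermost error. It assumes the response has already been parsed into a dict; the `failed_items` helper, the trimmed `sample` response, and the successful item's ID are hypothetical and only mirror the shape of the output above.

```python
def failed_items(bulk_response: dict) -> list:
    """Return one summary dict per failed item in a parsed _bulk response."""
    failures = []
    for item in bulk_response.get("items", []):
        # Each item has a single key naming the action (index, create, ...).
        for action, result in item.items():
            error = result.get("error")
            if error is None:
                continue
            # Follow the caused_by chain down to the innermost exception.
            root = error
            while isinstance(root.get("caused_by"), dict):
                root = root["caused_by"]
            lines = (root.get("reason") or "").strip().splitlines()
            failures.append({
                "action": action,
                "id": result.get("_id"),
                "status": result.get("status"),
                "root_type": root.get("type"),
                # The last line of the TorchScript trace holds the RuntimeError.
                "root_message": lines[-1] if lines else "",
            })
    return failures


# Trimmed, illustrative response: one success (hypothetical ID) and one failure.
trace = ("The following operation failed in the TorchScript interpreter.\n"
         "...\n"
         "RuntimeError: The size of tensor a (611) must match the size of "
         "tensor b (512) at non-singleton dimension 1\n")
sample = {
    "errors": True,
    "items": [
        {"index": {"_index": "ml_semantic_search_testing", "_id": "60328", "status": 201}},
        {"index": {
            "_index": "ml_semantic_search_testing",
            "_id": "60329",
            "status": 500,
            "error": {
                "type": "m_l_exception",
                "reason": "Failed to inference TEXT_EMBEDDING model: cvZ8X4wBErSzX7VK46s0",
                "caused_by": {
                    "type": "privileged_action_exception",
                    "reason": None,
                    "caused_by": {
                        "type": "translate_exception",
                        "reason": "ai.djl.engine.EngineException: " + trace,
                        "caused_by": {"type": "engine_exception", "reason": trace},
                    },
                },
            },
        }},
    ],
}

failures = failed_items(sample)
for failure in failures:
    print(failure["id"], failure["status"], failure["root_message"])
```

Grouping the failures by `root_message` should make it easy to confirm whether every failing document hits the same 512-position limit or whether some fail for a different reason.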

How should I go about resolving these kinds of issues?

It seems you are using version 1.0.1 of the model “huggingface/sentence-transformers/msmarco-distilbert-base-tas-b”, which doesn’t support auto-truncation. From the error, your input text (611 tokens) is longer than the model’s maximum input of 512 tokens.

I suggest trying the latest version of this model, 1.0.2. You can refer to the pretrained models documentation to check whether a model supports auto-truncation.

Thanks for the response!

I tried to register v1.0.2 using the following:

POST /_plugins/_ml/models/_register
{
  "name": "huggingface/sentence-transformers/msmarco-distilbert-base-tas-b",
  "version": "1.0.2",
  "model_group_id": "X_b9XowBErSzX7VKo6sU",
  "model_format": "TORCH_SCRIPT"
}

When I check the task, I see:

  "task_type" : "REGISTER_MODEL",
  "function_name" : "TEXT_EMBEDDING",
  "state" : "FAILED",
  "worker_node" : [
  "create_time" : 1702485387084,
  "last_update_time" : 1702485387357,
  "error" : "This model is not in the pre-trained model list, please check your parameters.",
  "is_async" : true

So while the Sentence transformers page says 1.0.2 is the version, it appears 1.0.1 is the latest version. The docs also state it should support Auto-truncation.

I only chose msmarco-distilbert-base-tas-b as a starting point because the OpenSearch documentation indicated that it gave them the best results (the docs also state it supports up to 768 max input).

Do you have recommendations for a better model to use?

@dhrubo It seems we haven’t added 1.0.2 to the pretrained model list file for the model huggingface/sentence-transformers/msmarco-distilbert-base-tas-b.

@dswitzer2 You can try huggingface/sentence-transformers/all-mpnet-base-v2, version 1.0.1.


Thanks for the suggestion. The huggingface/sentence-transformers/all-mpnet-base-v2 model seems to be working (basically) like I was expecting.

One situation I did run into was that some of the documents I was trying to index ended up having empty content.

Is there a way to tell the index that it’s okay to just ignore empty content when creating the embedding, instead of having it fail the index operation?

For now, we don’t have an automatic way to filter out empty content. The easiest approach is to filter out documents with empty content before ingesting.

Feel free to create a GitHub issue in opensearch-project/ml-commons if you think filtering empty content would be a good feature.

My only issue is that I may have other fields in the document that do need to be indexed, so I can’t skip the entire document. I really need a way to just ignore the vector embedding mapping when the specific field value is empty.

Any other suggestions?

Got it. @Navneet, do you know if it’s possible to skip a field with empty content in the ingest processor today?

If not, I think we can build a feature to support this.

You mean an empty string and not null, right?



I did mean an empty string, not null, but I can just send null when the property contains no information. That should work for my use case, and it appears to work just fine.

Thanks for the feedback!

Model is released. @dswitzer2 you should be able to register this model now:

POST /_plugins/_ml/models/_register
{
  "name": "huggingface/sentence-transformers/msmarco-distilbert-base-tas-b",
  "version": "1.0.2",
  "model_format": "TORCH_SCRIPT"
}

Please let me know if you see any issue. Thanks.
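If you script the registration, something like the following Python sketch may help. It only builds the request shown above with the standard library and checks a task document for the FAILED state seen earlier in this thread; it does not send anything. The helper names and the `https://localhost:9200` base URL are hypothetical, and authentication and TLS settings are deliberately left out.

```python
import json
import urllib.request


def build_register_request(base_url, name, version,
                           model_format="TORCH_SCRIPT", model_group_id=None):
    """Build (but do not send) a POST to /_plugins/_ml/models/_register."""
    body = {"name": name, "version": version, "model_format": model_format}
    if model_group_id is not None:
        # Optional, as in the earlier request in this thread.
        body["model_group_id"] = model_group_id
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/_plugins/_ml/models/_register",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def registration_error(task: dict):
    """Return the task's error message if its state is FAILED, else None."""
    if task.get("state") == "FAILED":
        return task.get("error", "unknown error")
    return None


# Hypothetical base URL; replace with your own cluster endpoint.
request = build_register_request(
    "https://localhost:9200",
    "huggingface/sentence-transformers/msmarco-distilbert-base-tas-b",
    "1.0.2",
)
print(request.get_method(), request.full_url)

# Task document shaped like the failed one posted earlier in the thread.
failed_task = {
    "task_type": "REGISTER_MODEL",
    "state": "FAILED",
    "error": "This model is not in the pre-trained model list, please check your parameters.",
}
print(registration_error(failed_task))
```

Sending the request (for example with `urllib.request.urlopen`) and fetching the task by its ID are left to you, since they depend on how your cluster is secured.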

Yes, if you set it to null it will work.
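As a rough illustration of that workaround, here is a small Python sketch that replaces empty embedding source fields with null before the document is sent, so the other fields are still indexed. The `blank_to_null` helper and the `title` and `body` field names are hypothetical, and treating whitespace-only strings the same as empty strings is my own assumption.

```python
import json


def blank_to_null(doc: dict, embedding_fields: list) -> dict:
    """Return a copy of doc with empty embedding source fields set to None."""
    cleaned = dict(doc)
    for field in embedding_fields:
        value = cleaned.get(field)
        # Only touch string fields that are empty or whitespace-only.
        if isinstance(value, str) and not value.strip():
            cleaned[field] = None  # serialized as JSON null
    return cleaned


# Hypothetical document: "body" is the field mapped to an embedding.
doc = {"title": "Release notes", "body": ""}
print(json.dumps(blank_to_null(doc, ["body"])))
```

The original dict is left unchanged, and fields missing from the document are not added, so only the listed fields are affected.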


Thanks so much! I can now register msmarco-distilbert-base-tas-b v1.0.2 and it resolves the errors I was having with v1.0.1 in TorchScript!!!
