Deploying a huggingface_transformers model on an OpenSearch cluster

Describe the issue:
I am trying to deploy a plain transformer model (not a sentence-transformer model) on an OpenSearch cluster. The minimal code for saving the model in TorchScript format together with its tokenizer is:

################################################################

# Export to TorchScript

import json

import torch
from transformers import BertTokenizer, BertModel

device = "cpu"
# torchscript=True makes the model return tuples, which torch.jit.trace requires
modelPreTrained = BertModel.from_pretrained("bert-base-uncased", torchscript=True)
modelPreTrained.to(device)
modelPreTrained.eval()
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased", do_lower_case=True)

text = "[CLS] Who was Jim Henson ? [SEP] Jim Henson was a puppeteer [SEP]"
tokenized_text = tokenizer.tokenize(text)

# Masking one of the input tokens

masked_index = 8
tokenized_text[masked_index] = "[MASK]"
indexed_tokens = tokenizer.convert_tokens_to_ids(tokenized_text)
segments_ids = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1]

# Creating a dummy input

tokens_tensor = torch.tensor([indexed_tokens])
segments_tensors = torch.tensor([segments_ids])

# Trace the model with the dummy input and save it
traced_model = torch.jit.trace(modelPreTrained, [tokens_tensor, segments_tensors])
torch.jit.save(traced_model, "traced_bert.pt")

# Save the tokenizer vocabulary (token -> id mapping) as JSON
with open("tokenizer.json", "w") as f:
    json.dump(tokenizer.get_vocab(), f)
################################################################

After running the code I get two files, "traced_bert.pt" and "tokenizer.json", which I pack into a single zip file, "TransformerModel.zip", and then compute its SHA-256 checksum:
shasum -a 256 TransformerModel.zip
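
The packing and checksum step can also be scripted in Python rather than done by hand. The following is only a minimal sketch using the standard library; the helper names `pack_model` and `sha256_of` are hypothetical, and it assumes the two files should sit at the top level of the archive.

```python
import hashlib
import zipfile
from pathlib import Path


def pack_model(files, zip_path):
    """Write the given files into a flat zip archive (no sub-directories)."""
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for f in files:
            # arcname drops any directory prefix so the files land at the archive root
            zf.write(f, arcname=Path(f).name)
    return zip_path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Example usage (file names taken from the post):
# archive = pack_model(["traced_bert.pt", "tokenizer.json"], "TransformerModel.zip")
# print(sha256_of(archive))
```

The digest returned by `sha256_of` should match the first column printed by `shasum -a 256` for the same file.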

Then I prepare a "config.json" file, e.g.:
{
  "name": "my_model",
  "version": "1.0.0",
  "model_format": "TORCH_SCRIPT",
  "model_content_hash_value": "dbb914064ed6cc9617d72747b00616865388a18dc9210dba381fa41be091b9f5",
  "model_config": {
    "model_type": "bert",
    "embedding_dimension": 768,
    "framework_type": "huggingface_transformers"
  }
}
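
To keep the hash in the config in sync with the archive, the config file could be generated instead of edited by hand. This is a sketch only: `build_model_config` and `write_model_config` are hypothetical helpers, and the field values simply mirror the example config in this post.

```python
import json


def build_model_config(name, version, content_hash, embedding_dimension=768):
    """Return a dict with the same fields as the example config.json."""
    return {
        "name": name,
        "version": version,
        "model_format": "TORCH_SCRIPT",
        "model_content_hash_value": content_hash,
        "model_config": {
            "model_type": "bert",
            "embedding_dimension": embedding_dimension,
            "framework_type": "huggingface_transformers",
        },
    }


def write_model_config(path, config):
    """Serialize the config dict to a JSON file."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(config, fh, indent=2)


# Example usage, with the SHA-256 hex digest of the zip file as the hash:
# cfg = build_model_config("my_model", "1.0.0", "<sha256 of TransformerModel.zip>")
# write_model_config("config.json", cfg)
```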

With "TransformerModel.zip" and "config.json" in hand, I can register the model on the OpenSearch cluster in a model group that I created earlier:

import opensearch_py_ml as oml
from opensearchpy import OpenSearch
from opensearch_py_ml.ml_commons import MLCommonClient

host = 'sime.host.com'
port = 3005
auth = ('joedoe', 'qwerty')

client = OpenSearch(
    hosts=[{'host': host, 'port': port}],
    http_auth=auth,
    use_ssl=True,
    verify_certs=False,
    ssl_assert_hostname=False,
    ssl_show_warn=False,
)

ml_client = MLCommonClient(client)

model_path = './TransformerModel.zip'
model_config_path = './config.json'

# Register only; deployment is triggered separately on the cluster
ml_client.register_model(
    model_path,
    model_config_path,
    model_group_id="JzMOKJIBC9ZdJM8aKaCz",
    isVerbose=True,
    deploy_model=False,
    wait_until_deployed=False,
)

The model is uploaded to the cluster correctly, registered, and assigned an ID. Here is the end of the registration log:
uploading chunk 39 of 41
Model id: {'status': 'Uploaded'}
uploading chunk 40 of 41
Model id: {'status': 'Uploaded'}
uploading chunk 41 of 41
Model id: {'status': 'Uploaded'}
Model registered successfully
'CTPrcpIBC9ZdJM8a-rme'

Next, I want to deploy the model on the cluster. First I run:
GET /_plugins/_ml/models/CTPrcpIBC9ZdJM8a-rme

and get the response:
{
  "name": "my_model",
  "model_group_id": "JzMOKJIBC9ZdJM8aKaCz",
  "algorithm": "TEXT_EMBEDDING",
  "model_version": "6",
  "model_format": "TORCH_SCRIPT",
  "model_state": "REGISTERED",
  "model_content_size_in_bytes": 405489551,
  "model_content_hash_value": "dbb914064ed6cc9617d72747b00616865388a18dc9210dba381fa41be091b9f5",
  "model_config": {
    "model_type": "bert",
    "embedding_dimension": 768,
    "framework_type": "HUGGINGFACE_TRANSFORMERS"
  },
  "created_time": 1728504920734,
  "total_chunks": 41,
  "is_hidden": false
}

So I start the deployment:
POST /_plugins/_ml/models/CTPrcpIBC9ZdJM8a-rme/_deploy

and get a "DEPLOY_FAILED" error:
{
  "name": "my_model",
  "model_group_id": "JzMOKJIBC9ZdJM8aKaCz",
  "algorithm": "TEXT_EMBEDDING",
  "model_version": "6",
  "model_format": "TORCH_SCRIPT",
  "model_state": "DEPLOY_FAILED",
  "model_content_size_in_bytes": 405489551,
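
The model document above does not say why the deploy failed. As far as I understand the ML Commons API, the `_deploy` call returns a `task_id`, and `GET /_plugins/_ml/tasks/<task_id>` reports the task `state` and, on failure, an `error` message. The sketch below polls that task through an opensearch-py client; `deploy_and_wait` is a hypothetical helper, and treating every state other than CREATED and RUNNING as final is an assumption.

```python
import time


def deploy_and_wait(client, model_id, poll_seconds=2.0, timeout_seconds=300.0):
    """Start a deploy and poll its task until it finishes; return the final task document."""
    # Assumption: the deploy response contains a "task_id" field.
    response = client.transport.perform_request(
        "POST", f"/_plugins/_ml/models/{model_id}/_deploy"
    )
    task_id = response["task_id"]
    deadline = time.monotonic() + timeout_seconds
    while True:
        task = client.transport.perform_request(
            "GET", f"/_plugins/_ml/tasks/{task_id}"
        )
        # Assumption: CREATED and RUNNING are the only non-final states.
        if task.get("state") not in ("CREATED", "RUNNING"):
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError(f"deploy task {task_id} did not finish in time")
        time.sleep(poll_seconds)


# Example usage with the client and model ID from this post:
# task = deploy_and_wait(client, "CTPrcpIBC9ZdJM8a-rme")
# print(task.get("state"), task.get("error"))
```

The `error` text of the failed task is probably the most useful thing to include when asking for help.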

Deploying sentence-transformer models by following the opensearch-py-ml tutorial works fine.

But how do I correctly deploy a plain transformer model such as BertModel.from_pretrained("bert-base-uncased", torchscript=True)?

I would appreciate any suggestions.