Which rerank models are supported by OS 2.12.0 and how to deploy them?

Hi all,

Which rerank models (from Hugging Face) does OpenSearch 2.12.0 support, and how do I deploy them?

Your response is highly appreciated.

Hi @asfoorial,

Currently we have two pre-trained models in our model server. You can register these two models:

POST /_plugins/_ml/models/_register
{
  "name": "huggingface/cross-encoders/ms-marco-MiniLM-L-6-v2",
  "version": "1.0.2",
  "model_format": "TORCH_SCRIPT"
}

POST /_plugins/_ml/models/_register
{
  "name": "huggingface/cross-encoders/ms-marco-MiniLM-L-12-v2",
  "version": "1.0.2",
  "model_format": "TORCH_SCRIPT"
}

(`ONNX` works as the `model_format` too.)

Deploying these models works the same way as deploying sentence-embedding models.
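For example (a sketch; `<model_id>` is a placeholder for the model ID returned once the register task completes, and the `_predict` body follows the cross-encoder / TEXT_SIMILARITY request format from the OpenSearch docs):

```
POST /_plugins/_ml/models/<model_id>/_deploy

POST /_plugins/_ml/models/<model_id>/_predict
{
  "query_text": "today is sunny",
  "text_docs": ["how are you", "today is sunny"]
}
```

The predict response returns one similarity score per entry in `text_docs`, which you can use to rerank.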

Two articles to follow:

  1. Reranking search results - OpenSearch Documentation

  2. ml-commons/docs/tutorials/rerank/rerank_pipeline_with_Cohere_Rerank_model.md at main · opensearch-project/ml-commons · GitHub

You can also bring your own traced re-ranker BGE model. A related PR to look at: add cross-encoder tracing, config-generating, and uploading by HenryL27 · Pull Request #375 · opensearch-project/opensearch-py-ml · GitHub
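That PR adds a script that traces a Hugging Face cross-encoder and generates the upload config for you. As a rough illustration of the TorchScript tracing step it performs, here is a minimal sketch; `TinyCrossEncoder` is a toy stand-in, not the real BGE re-ranker, and the actual script also handles the tokenizer and model config:

```python
# Sketch of tracing a cross-encoder-style model for upload to OpenSearch.
# TinyCrossEncoder is a hypothetical toy model, NOT the real BGE re-ranker;
# it only mimics the (input_ids, attention_mask) signature that the
# TEXT_SIMILARITY translator expects.
import torch
import torch.nn as nn

class TinyCrossEncoder(nn.Module):
    def __init__(self, vocab_size=100, dim=16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.score = nn.Linear(dim, 1)

    def forward(self, input_ids, attention_mask):
        hidden = self.embed(input_ids)                 # (batch, seq, dim)
        mask = attention_mask.unsqueeze(-1).float()    # (batch, seq, 1)
        pooled = (hidden * mask).sum(1) / mask.sum(1)  # mean-pool over tokens
        return self.score(pooled)                      # one relevance logit per pair

model = TinyCrossEncoder().eval()
input_ids = torch.randint(0, 100, (1, 8))
attention_mask = torch.ones(1, 8, dtype=torch.long)

# TorchScript trace; the saved file is what gets zipped and registered.
traced = torch.jit.trace(model, (input_ids, attention_mask))
traced.save("cross_encoder.pt")
# traced(input_ids, attention_mask).shape -> torch.Size([1, 1])
```

Tracing only records the operations executed on the example inputs, which is why models with Python-level control flow in their forward pass can fail to trace.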

Please let me know if you have any further questions.

Thanks
Dhrubo

Great! Thanks.

When are we expecting the new release of opensearch_py_ml?


The new script in add cross-encoder tracing, config-generating, and uploading by HenryL27 · Pull Request #375 · opensearch-project/opensearch-py-ml · GitHub worked for the re-ranker BGE model but failed to work for

I tried both torch_script and onnx. The torch_script export failed while saving the trace, giving:

Could not export Python function call 'XSoftmax'. Remove calls to Python functions before export. Did you forget to add @script or @script_method annotation? If this is a nn.ModuleList, add it to constants:

The onnx export saved the .onnx and zip files, but prediction failed with:

{
  "error": {
    "root_cause": [
      {
        "type": "translate_exception",
        "reason": "translate_exception: java.lang.IllegalArgumentException: Input mismatch, looking for: [input_ids, attention_mask]"
      }
    ],
    "type": "m_l_exception",
    "reason": "m_l_exception: Failed to inference TEXT_SIMILARITY model: ZwQzA44BK0CtZG_zHaFw",
    "caused_by": {
      "type": "privileged_action_exception",
      "reason": "privileged_action_exception: null",
      "caused_by": {
        "type": "translate_exception",
        "reason": "translate_exception: java.lang.IllegalArgumentException: Input mismatch, looking for: [input_ids, attention_mask]",
        "caused_by": {
          "type": "illegal_argument_exception",
          "reason": "Input mismatch, looking for: [input_ids, attention_mask]"
        }
      }
    }
  },
  "status": 500
}

@asfoorial Thanks for looking into this. Could you please comment in the PR?