Which rerank models are supported by OS 2.12.0 and how to deploy them?

Hi all,

Which rerank models (from Hugging Face) does OpenSearch 2.12.0 support, and how do I deploy them?

Your response is highly appreciated.

Hi @asfoorial,

Currently we have two pre-trained models in our model server. You can register these two models:

POST /_plugins/_ml/models/_register
{
  "name": "huggingface/cross-encoders/ms-marco-MiniLM-L-6-v2",
  "version": "1.0.2",
  "model_format": "TORCH_SCRIPT"
}

POST /_plugins/_ml/models/_register
{
  "name": "huggingface/cross-encoders/ms-marco-MiniLM-L-12-v2",
  "version": "1.0.2",
  "model_format": "TORCH_SCRIPT"
}

(`ONNX` works as the `model_format` too.)

Deploying these models works the same way as deploying sentence-embedding models.
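For example (a sketch; `<model_id>` is a placeholder for the model ID returned once the register task completes, and the `_predict` body follows the cross-encoder / TEXT_SIMILARITY request format from the OpenSearch docs):

```
POST /_plugins/_ml/models/<model_id>/_deploy

POST /_plugins/_ml/models/<model_id>/_predict
{
  "query_text": "today is sunny",
  "text_docs": ["how are you", "today is sunny"]
}
```

The predict response returns one similarity score per entry in `text_docs`, which you can use to rerank.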

Two articles to follow:

  1. Reranking search results - OpenSearch Documentation

  2. ml-commons/docs/tutorials/rerank/rerank_pipeline_with_Cohere_Rerank_model.md at main · opensearch-project/ml-commons · GitHub

You can also bring your own traced re-ranker BGE model. A related PR to look at: add cross-encoder tracing, config-generating, and uploading by HenryL27 · Pull Request #375 · opensearch-project/opensearch-py-ml · GitHub
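That PR adds a script that traces a Hugging Face cross-encoder and generates the upload config for you. As a rough illustration of the TorchScript tracing step it performs, here is a minimal sketch; `TinyCrossEncoder` is a toy stand-in, not the real BGE re-ranker, and the actual script also handles the tokenizer and model config:

```python
# Sketch of tracing a cross-encoder-style model for upload to OpenSearch.
# TinyCrossEncoder is a hypothetical toy model, NOT the real BGE re-ranker;
# it only mimics the (input_ids, attention_mask) signature that the
# TEXT_SIMILARITY translator expects.
import torch
import torch.nn as nn

class TinyCrossEncoder(nn.Module):
    def __init__(self, vocab_size=100, dim=16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.score = nn.Linear(dim, 1)

    def forward(self, input_ids, attention_mask):
        hidden = self.embed(input_ids)                 # (batch, seq, dim)
        mask = attention_mask.unsqueeze(-1).float()    # (batch, seq, 1)
        pooled = (hidden * mask).sum(1) / mask.sum(1)  # mean-pool over tokens
        return self.score(pooled)                      # one relevance logit per pair

model = TinyCrossEncoder().eval()
input_ids = torch.randint(0, 100, (1, 8))
attention_mask = torch.ones(1, 8, dtype=torch.long)

# TorchScript trace; the saved file is what gets zipped and registered.
traced = torch.jit.trace(model, (input_ids, attention_mask))
traced.save("cross_encoder.pt")
# traced(input_ids, attention_mask).shape -> torch.Size([1, 1])
```

Tracing only records the operations executed on the example inputs, which is why models with Python-level control flow in their forward pass can fail to trace.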

Please let me know if you have any further questions.

Thanks
Dhrubo

Great! Thanks.

When are we expecting the new release of opensearch_py_ml?


The new script in add cross-encoder tracing, config-generating, and uploading by HenryL27 · Pull Request #375 · opensearch-project/opensearch-py-ml · GitHub worked for the re-ranker BGE model but failed to work for

I tried both torch_script and onnx. The torch_script export failed while saving the trace, giving:

Could not export Python function call 'XSoftmax'. Remove calls to Python functions before export. Did you forget to add @script or @script_method annotation? If this is a nn.ModuleList, add it to constants:

The onnx export saved the .onnx and zip files, but prediction failed with:

{
  "error": {
    "root_cause": [
      {
        "type": "translate_exception",
        "reason": "translate_exception: java.lang.IllegalArgumentException: Input mismatch, looking for: [input_ids, attention_mask]"
      }
    ],
    "type": "m_l_exception",
    "reason": "m_l_exception: Failed to inference TEXT_SIMILARITY model: ZwQzA44BK0CtZG_zHaFw",
    "caused_by": {
      "type": "privileged_action_exception",
      "reason": "privileged_action_exception: null",
      "caused_by": {
        "type": "translate_exception",
        "reason": "translate_exception: java.lang.IllegalArgumentException: Input mismatch, looking for: [input_ids, attention_mask]",
        "caused_by": {
          "type": "illegal_argument_exception",
          "reason": "Input mismatch, looking for: [input_ids, attention_mask]"
        }
      }
    }
  },
  "status": 500
}

@asfoorial Thanks for looking into this. Could you please comment in the PR?