OpenSearch plugin for named-entity recognition (NER), specifically for extracting names of people

Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):

OpenSearch 2.15

Describe the issue:

I am exploring plugins that integrate with OpenSearch to perform named-entity recognition (NER).
My use case: I have English text and want to extract the names of people that appear in it.
The names are names of people in the United States.
A name can appear inside a sentence or in a sentence that is only a list of names.
Does OpenSearch support a plugin or integration for this machine learning use case?

Configuration:

3-node deployment

Relevant Logs or Screenshots:

Hi @classmates_dev ,

This is possible using the ML Commons plugin. More information is available here: https://opensearch.org/blog/explore-opensearch-2-14/#:~:text=This%20release%20introduces%20an%20ML,model%20to%20enrich%20your%20pipeline.

Leeroy.

Thanks for your reply, @Leeroy.
I am looking for a specific ML model plugin that I can integrate with OpenSearch for NER.

Hi @classmates_dev, you can use an NER model through the ml-commons plugin. Check out the repo: https://github.com/opensearch-project/ml-commons

You can deploy an NER model on SageMaker or another remote model service, then call it during ingest or search through the ML inference processors. Here are the steps for NER, using a SageMaker endpoint as the example.

Create a connector to the endpoint that hosts the model:

POST /_plugins/_ml/connectors/_create
{
  "name": "Sagemaker NER model connector",
  "description": "Connector for NER model dslim-bert-base-NER",
  "version": 1,
  "protocol": "aws_sigv4",
  "parameters": {
    "region": "us-east-1",
    "service_name": "sagemaker"
  },
  "credential": {
    "access_key": "<access_key>",
    "secret_key": "<secret_key>",
    "session_token": "<session_token>"
  },
  "actions": [
    {
      "action_type": "predict",
      "method": "POST",
      "url": "https://runtime.sagemaker.us-east-1.amazonaws.com/endpoints/dslim-bert-base-NER/invocations",
      "headers": {
        "content-type": "application/json"
      },
      "request_body": """{"inputs":"${parameters.inputs}"}"""
    }
  ]
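}

Register a remote model with the connector ID returned by the create call (the ID below is an example; use your own):

POST /_plugins/_ml/models/_register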
{
  "name": "dslim-bert-base-NER",
  "version": "1.0.1",
  "function_name": "remote",
  "description": "test remote NER model",
  "connector_id": "7ySVK48BrpZG38I5wojT"
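}

Deploy the model, using the model ID returned by the register call (the ID below is an example; use your own):

POST /_plugins/_ml/models/9CSVK48BrpZG38I52oj4/_deploy

Test the model with a predict request:

POST /_plugins/_ml/models/9CSVK48BrpZG38I52oj4/_predict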
{
  "parameters": {
    "inputs": "My name is Sarah Jessica Parker but you can call me Jessica"
  }
}
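
Since the goal is names of people, the token-level tags from the model still need to be merged into full names. Below is a minimal Python sketch of that post-processing step. It assumes the endpoint returns the usual Hugging Face token-classification shape for dslim/bert-base-NER (a list of dicts with "entity", "start", and "end" keys) and that you have already pulled that list out of the predict response; the exact response wrapping depends on your endpoint and is not shown here. The function name extract_person_names is hypothetical.

```python
from typing import Dict, List


def extract_person_names(text: str, entities: List[Dict]) -> List[str]:
    """Merge token-level PER tags into full names using character offsets.

    Assumes each entity dict has "entity" (e.g. "B-PER", "I-PER"),
    "start", and "end" keys, as in Hugging Face token-classification output.
    """
    spans: List[List[int]] = []  # [start, end] offsets of each person name
    for ent in sorted(entities, key=lambda e: e["start"]):
        label = ent["entity"]
        if not label.endswith("PER"):
            continue  # skip ORG, LOC, MISC tags
        if spans:
            gap = text[spans[-1][1]:ent["start"]]
            # Extend the previous name when this token is a glued word piece
            # (no gap) or an I-PER token separated only by whitespace.
            if gap == "" or (gap.isspace() and label.startswith("I-")):
                spans[-1][1] = ent["end"]
                continue
        spans.append([ent["start"], ent["end"]])
    return [text[start:end] for start, end in spans]


text = "My name is Sarah Jessica Parker but you can call me Jessica"
# Hand-written sample in the assumed response shape, not real model output.
sample_entities = [
    {"entity": "B-PER", "start": 11, "end": 16},
    {"entity": "I-PER", "start": 17, "end": 24},
    {"entity": "I-PER", "start": 25, "end": 31},
    {"entity": "B-PER", "start": 52, "end": 59},
]
names = extract_person_names(text, sample_entities)
print(names)  # prints ['Sarah Jessica Parker', 'Jessica']
```

Slicing the original text by offsets avoids having to reassemble "##" word pieces by hand. If your endpoint already aggregates tokens into entity groups, this step may not be needed.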

There are similar use cases that combine remote models with ML inference processors; you can find the tutorials here: https://github.com/opensearch-project/ml-commons/tree/main/docs/tutorials/ml_inference
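
To give a rough idea of the ingest-side wiring, here is a small Python sketch that builds the body of an ingest pipeline using the ml_inference processor. The pipeline name ner_pipeline, the document fields text and ner_entities, and the "response" output path are all assumptions for illustration; the right output path depends on what your endpoint returns, so check it against a real predict response. The helper build_ner_ingest_pipeline is hypothetical.

```python
import json
from typing import Any, Dict


def build_ner_ingest_pipeline(model_id: str, source_field: str, target_field: str) -> Dict[str, Any]:
    """Build an ingest pipeline body that sends one document field to the NER model."""
    return {
        "description": "Run the remote NER model on each ingested document",
        "processors": [
            {
                "ml_inference": {
                    "model_id": model_id,
                    # Model input name -> document field. "inputs" matches
                    # ${parameters.inputs} in the connector request body.
                    "input_map": [{"inputs": source_field}],
                    # New document field -> path in the model output.
                    # "response" is an assumption; adjust to your endpoint.
                    "output_map": [{target_field: "response"}],
                }
            }
        ],
    }


# Example model ID from the steps above; replace it with your own.
pipeline = build_ner_ingest_pipeline("9CSVK48BrpZG38I52oj4", "text", "ner_entities")
print("PUT /_ingest/pipeline/ner_pipeline")
print(json.dumps(pipeline, indent=2))
```

The printed request can be pasted into Dev Tools; documents indexed with this pipeline would then carry the model output in the target field.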