Custom ml-commons

Is there a way to create custom plugins based on the ml-commons plugin? For example, can I create some logic or API (not necessarily ML model logic) and have it executed on ML nodes?

Hi Hasan, in 2.4 we’re releasing a framework that will allow you to upload ML models you train externally into OpenSearch and use them as part of your ingest pipelines and to support semantic search queries.

Extensibility is something we plan to improve for ML workloads. For instance, we’re looking to give users the ability to integrate their preferred model-serving technology (e.g. TorchServe, Triton, or TensorFlow Serving) running outside the OpenSearch cluster to support various ML use cases.

Can you describe your use case? What is the solution you are planning to build?


@dylan thanks for your response. My use case is to build an API that extracts entities from existing attributes in an index (or from an API parameter) and either returns the result or stores it in an index attribute. I could have done this outside OpenSearch, but I want to take advantage of the existing nodes running OpenSearch, as well as other aspects such as security.

Hi Hasan,

Thanks for sharing your use case. Can you share the types of entities you’re looking to extract and the type of data you have indexed?

If you’re looking to use a named-entity recognition model, is this something your team has experience training and tuning?


In fact, I am not looking for a trained ML model here, but rather for Java (or Python) NLP-based logic. For example, if I want to build something like Jaccard similarity between multiple indexed documents, would it be possible to extend ml-commons to do that?

Hi Hasan, at the moment we don’t support extending ml-commons in a way that covers your use case. However, we are building out a roadmap to make ml-commons more extensible.

In the interim, you’ll need to perform your similarity scoring at the application layer, or build a custom plugin.
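As a rough illustration of the application-layer approach, here is a minimal Python sketch of Jaccard similarity over token sets. The tokenizer and the document strings are placeholders (assumptions, not part of any OpenSearch API); in practice you would first fetch the documents from your index, e.g. with the opensearch-py client, and likely use a proper analyzer instead of whitespace splitting:

```python
def jaccard_similarity(a: set, b: set) -> float:
    """Jaccard similarity: |A ∩ B| / |A ∪ B|."""
    if not a and not b:
        return 1.0  # two empty documents are considered identical
    return len(a & b) / len(a | b)


def tokenize(text: str) -> set:
    # Naive whitespace tokenization for illustration only;
    # swap in a real analyzer/tokenizer for production use.
    return set(text.lower().split())


# Placeholder document bodies; in practice these would come from
# _source fields of documents retrieved from an OpenSearch index.
doc1 = "the quick brown fox"
doc2 = "the quick red fox"

score = jaccard_similarity(tokenize(doc1), tokenize(doc2))
print(round(score, 2))  # 3 shared tokens / 5 distinct tokens = 0.6
```

This keeps the scoring logic entirely in your application, so the cluster only serves the documents; a custom plugin would instead move this computation onto the OpenSearch nodes.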

Thank you for sharing your use case. We’ll take note of it as we plan improvements to the extensibility of ML workloads on OpenSearch.