[Feedback] Machine Learning Model Serving Framework - Experimental Release

Hi Hasan,

Agreed, I will take this feedback as something we should do on our way to GA.

Thanks for the insight, Hagay. The heavy lifting that you described is exactly what we’re trying to reduce for the development community: the custom logic required to integrate the model into ingest processes, and the application layer required to integrate ML inference and queries. We’re going to continue to eliminate the scaffolding work needed to integrate and operationalize these models on OpenSearch. At the same time, I would like to give users the option to use their existing ML infrastructure, because that can offer better cost optimization and lets you inherit things like your existing operational support infrastructure.

Which raises a further question:

Most of the heavy lifting (code wise, hardware wise it’s definitely the model) is on the tokenizer level, which can include some complex logic at times.

How do you intend to integrate tokenization? Only within the model itself? Or by relying on the existing tokenization framework Lucene already has?

If it can truly be done end-to-end with a single tokenizer on OpenSearch (after some initial configuration), this will really be a dream come true.

Hey Hagay, we’re still evaluating options on many fronts, but we’re going to try to make our community’s dreams come true. Bring us your use cases, pain points and requirements; we’ll prioritize our work based on the community’s demand.

@hagayg Thanks for your suggestion. To clarify, we have preliminary GPU support in the 2.4 release, but there are still some gaps to close before GA. For more details, see this issue: [FEATURE] GPU support for model serving framework · Issue #576 · opensearch-project/ml-commons · GitHub. Feel free to add your comments or suggestions there.

Most of the heavy lifting (code wise, hardware wise it’s definitely the model) is on the tokenizer level, which can include some complex logic at times.

For the tokenizer, we support Hugging Face tokenizers. Will that meet your requirements? A similar question came up on [FEATURE] Proposal - Deep Learning Model Uploading and Inference · Issue #302 · opensearch-project/ml-commons · GitHub. Feel free to add more details and requirements here or on the issue.

I replied in the topic Documentation for new ML features in 2.4, and I am also pasting the doc links here in case someone else reading this has a similar question.

This is the neural search doc: Neural Search plugin - OpenSearch documentation
This is the model serving framework doc (NLP models run in it): Model-serving framework - OpenSearch documentation

@hagayg @asfoorial Thanks for your valuable feedback! We just released 2.5, which supports GPU acceleration; see GPU acceleration - OpenSearch documentation. You are welcome to try it and share your feedback.

We also support ONNX (text embedding NLP models only) starting with 2.5. See this doc for some examples: ml-commons/text_embedding_model_examples.md at 2.x · opensearch-project/ml-commons · GitHub

Thanks @ylwu. Will give it a try. Any expected date for GA?

@asfoorial, we’re currently targeting GA by 2.7, but our plans may change. We have a lot of features planned for 2.7, including the auto-reloading functionality that you requested.


Thank you for this amazing product!
We’ll be deploying around next week and will come back with impressions.
Once again OpenSearch is proving to be an amazing product that actually listens to feedback from the community, and in a rapid manner no less.


I would like to thank the ml-commons team, the neural-search plugin team and the amazing OpenSearch team.

I just read the OpenSearch Project 2022 recap and what's next · OpenSearch and am thrilled and looking forward to new advancements in this amazing product. I was also hoping the blog post would mention the roadmap for both ml-commons and neural-search. I think that these two plugins will have a huge impact in advancing OpenSearch towards becoming a leading next-gen search platform, especially given the marvelous advancements we have been witnessing in AI & Deep NLP.

@asfoorial, thank you for your kind words. Active community members like you are the heart of our product.

I will be writing a blog post for our 2.7 release which is when the neural search plugin and the ML serving framework will be GA. At that time, I will also provide insights into our future direction for OpenSearch ML.


Thank you for these comments @asfoorial - we really appreciate this, and we really appreciate you being here. As @dylan said, active community members like you are the heart of this project.

Something I see all the time is a general-purpose model working OK but the client wanting to tune it. I got to this: Demo Notebook for Sentence Transformer Model Training, Saving and Uploading to OpenSearch — Opensearch-py-ml 1.0.0 documentation (opensearch-project.github.io)

But it would be nice to have an easier way to do it, e.g. sending the docs to the ML node and having the ML node do the training.

@illermaly, thanks for the feedback. We have plans to incrementally build out this framework to simplify the entire ML life-cycle. This includes simplifying the fine-tuning and continuous improvement experience for models that are integrated into OpenSearch.

Hi @dylan and team,

Just want to check if 2.7 will have GA for ML Model serving and neural search features?


@asfoorial, unfortunately, we have to push the GA date to 2.8. We believe that model-level access controls are required for GA, and that feature has slipped to 2.8.

However, we will release the model auto-reloading functionality that you requested in 2.7.

Thanks @dylan. Looking forward to its GA in 2.8.

Here is another interesting type of question-answering model: vblagoje/bart_lfqa · Hugging Face

It expects multiple text sentences or paragraphs and generates an abstractive answer from them. Worth considering for the future of OpenSearch.

@asfoorial, my goal is to make it possible for users to configure the interfaces and bindings. We’ll get there. We’re going to release RAG support, which should make it possible to achieve a SOTA QA type experience. We’re going to make the query workflow more configurable and enable integrations to external APIs and services.

Here’s the feedback form for the extensibility capabilities: [Feedback] Enabling low effort machine learning technology integrations (ml-commons plugin)