I’ve been exploring altering our OpenSearch usage and switching our search over to the Hybrid Search plugin, so we can leverage both BM25 and k-NN neural searches.
One of the issues I’m trying to figure out is how to update our provisioning process to make sure that the necessary model groups, models, and pipelines are all correctly registered and deployed.
Having gone through the documentation for v2.11.1 and the examples, it looks like the model_id of the model you’re using is needed when creating the various pipelines, indices, etc. However, there does not appear to be a way to create the model groups or models with a specific ID.
IDs are auto-generated when you register the model group/model.
Is there a way to specify the ID to use or perhaps use the name of the model group/model instead of the ID?
What I want is a provisioning script that I can run to configure an environment the same way every time. That way the model group and model IDs are known in advance.
However, from reading the docs, that does not look possible. It looks like I’d have to write my provisioning scripts generically so that they track the IDs as they are created, and every environment would end up with different IDs for the models and groups. Then in our code I’d have to look up the IDs to use for the current instance.
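For readers in the same position, here is a minimal sketch of the "track the IDs" approach described above: the provisioning script records whatever model ID the cluster generated and injects it into the pipeline definition. The file layout, the function names, and the field names `passage_text` and `passage_embedding` are all hypothetical; only the general shape of a `text_embedding` ingest processor is taken from the neural search documentation.

```python
import json
from pathlib import Path


def text_embedding_pipeline(model_id: str) -> dict:
    """Build an ingest pipeline body that embeds text with the given model."""
    return {
        "description": "Embeds passage_text for hybrid search",
        "processors": [
            {
                "text_embedding": {
                    # The auto-generated ID captured at registration time.
                    "model_id": model_id,
                    # Hypothetical source and target field names.
                    "field_map": {"passage_text": "passage_embedding"},
                }
            }
        ],
    }


def save_ids(path: Path, ids: dict) -> None:
    """Persist the per-environment IDs so application code can read them."""
    path.write_text(json.dumps(ids, indent=2, sort_keys=True))


def load_ids(path: Path) -> dict:
    """Read back the IDs recorded by the provisioning run."""
    return json.loads(path.read_text())
```

The pipeline body would then be sent with `PUT /_ingest/pipeline/<name>`, so the pipeline name stays stable across environments even though the model ID inside it differs.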
If I can’t create a model group or model with a specific ID, is there a way to back that information up and then restore it as part of a provisioning process?
Model group name is unique → Not sure if that can help you or not.
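Since the name is unique, one possible workaround is to resolve the ID from the name at startup. The sketch below assumes the model group search endpoint (`POST /_plugins/_ml/model_groups/_search`) returns a standard search response whose hits expose the group ID as `_id` and the name under `_source.name`. The `post` callable is a hypothetical stand-in for whatever HTTP client you use, and the exact-name check is done client-side because the mapping of the `name` field is not assumed here.

```python
from typing import Callable, Optional

SEARCH_PATH = "/_plugins/_ml/model_groups/_search"


def find_model_group_id(
    name: str, post: Callable[[str, dict], dict]
) -> Optional[str]:
    """Return the ID of the model group called `name`, or None if absent.

    `post(path, body)` must send a POST request to the cluster and return
    the decoded JSON response (hypothetical helper supplied by the caller).
    """
    query = {"query": {"match": {"name": name}}, "size": 50}
    response = post(SEARCH_PATH, query)
    for hit in response.get("hits", {}).get("hits", []):
        # A match query can return near matches, so compare names exactly.
        if hit.get("_source", {}).get("name") == name:
            return hit["_id"]
    return None
```

With a lookup like this, application code only needs to know the group name, which can be the same in every environment.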
We are gathering ideas for a model alias concept. This is the RFC: [RFC][FEATURE] Model alias · Issue #1217 · opensearch-project/ml-commons · GitHub
Please feel free to vote or share more about your use case so that we can extend our model alias idea.
I didn’t clearly understand what you meant by “back that information up”. Could you please explain this a little more?
Sure! First, thanks for taking the time to respond.
I was wondering if there’s a way to take the model and group configurations and export or back them up so they could be restored to different instances. That way, when they’re restored, they’d have the same IDs. If I could do this, then I could just deploy the exported configuration as part of our provisioning process instead of having to call all the various API endpoints to create everything.
So if I could spin up a new instance, take a snapshot/backup of all the ML settings, I could just deploy the snapshot/backup of the ML settings as part of our provisioning. That would at least get us to the point where all the IDs being used are the same across all our deployment locations.
The model alias concept indeed looks like exactly the type of functionality I need to solve my problem. I did as you suggested and added a comment to the issue to reflect our use case.
I see, thanks for the clarification. Unfortunately we don’t have this feature now, but we have a related issue for it: [FEATURE] Download model from cluster · Issue #1207 · opensearch-project/ml-commons · GitHub
Please feel free to comment there and share your use case so that we can treat that issue as a customer priority.
Issue 1217 actually meets my needs, so that’s definitely the higher priority. If that’s implemented, I don’t need the ability to back up or export the model configurations.