Does OpenSearch support, or plan to support, Matryoshka Representation Learning (MRL) embedding models? Jina AI recently released such a model with excellent performance at a very small dimension. I have noticed significant effort in OpenSearch to improve vector compression, and I believe that supporting MRL would greatly contribute to those efforts.
It is more about how the model is called. Once we upload a model to ml-commons, we should be able to configure it to return only the top N values of the embedding it normally generates. For example, if the model generates 1024 values, then the ml-commons register/deploy API should accept a config that sets the returned dimension. If we set that dimension to 64, then whenever the model is called it will always return 64 values, whether the call comes from the _predict API, an ingest pipeline, or a neural-search query.
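To make the truncation step concrete, here is a minimal sketch (not the ml-commons implementation, just an illustration of what such a config would do under the hood): MRL models are trained so that a prefix of the full embedding is itself a usable embedding, so "returning 64 values" amounts to keeping the first 64 dimensions and re-normalizing. The helper name and the re-normalization choice are assumptions for illustration.

```python
import numpy as np

def truncate_embedding(embedding, dim):
    """Keep the first `dim` values of a Matryoshka embedding and
    re-normalize so the shorter vector is still unit length.
    (Hypothetical helper; sketches what a 'return dimension'
    config on the model could do.)"""
    truncated = np.asarray(embedding[:dim], dtype=np.float64)
    norm = np.linalg.norm(truncated)
    return truncated / norm if norm > 0 else truncated

# A stand-in for a full 1024-d embedding from the model...
full = np.random.default_rng(0).normal(size=1024)
# ...with the return dimension configured to 64:
short = truncate_embedding(full, 64)
print(short.shape)                                # (64,)
print(round(float(np.linalg.norm(short)), 6))     # 1.0
```

Because the truncation happens at the model-call boundary, every caller (_predict, ingest pipeline, neural-search query) would see the same 64-value output.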
In all of the above, there is only one vector per document. So if the model is configured to return 64 values, then the k-NN index should have a single vector field of dimension 64.
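The matching index side can be sketched as follows: a `knn_vector` field whose `dimension` agrees with the configured return dimension. The field name `my_embedding` is illustrative, not prescribed anywhere above.

```python
# Sketch of an index body whose vector field matches a model
# configured to return 64 values. Built as a plain dict so the
# dimension agreement is easy to check programmatically.
RETURN_DIMENSION = 64  # the value set in the (hypothetical) model config

index_body = {
    "settings": {"index.knn": True},
    "mappings": {
        "properties": {
            "my_embedding": {          # illustrative field name
                "type": "knn_vector",
                "dimension": RETURN_DIMENSION,
            }
        }
    },
}

field = index_body["mappings"]["properties"]["my_embedding"]
print(field["dimension"] == RETURN_DIMENSION)  # True
```

If the two numbers diverge (say the model is re-registered at 128 but the index still declares 64), ingestion would fail on a dimension mismatch, so keeping them in one shared constant or config is the safer design.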