Supporting Late Interaction in OpenSearch Vector Engine as a Field Type

Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):

3.4 and above

Describe the issue:

OpenSearch recently added support for rescoring with late interaction models using Painless script scoring. Meanwhile, Lucene 10.3 includes native support for rescoring results with late interaction models. Lucene’s implementation exposes a new LateInteractionField that can index multi-vectors and rescore the results of any query using maxSim similarity against a provided target multi-vector.
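To make the maxSim scoring concrete, here is a minimal Python sketch. It assumes the common sum-of-max formulation with dot products (for each query vector, take its best match among the document's vectors, then sum those maxima). The names `dot` and `max_sim` are hypothetical, and this is only an illustration of the idea, not Lucene's or OpenSearch's actual implementation.

```python
from typing import Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product; both vectors must have the same dimension."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum(x * y for x, y in zip(a, b))


def max_sim(query: Sequence[Vector], doc: Sequence[Vector]) -> float:
    """Sum, over query vectors, of the best dot product against any doc vector.

    Assumption: sum-of-max over dot products; other per-vector
    similarities could be substituted for `dot`.
    """
    if not query or not doc:
        raise ValueError("query and doc multi-vectors must be non-empty")
    return sum(max(dot(q, d) for d in doc) for q in query)


# Two query token vectors scored against a document with two token vectors.
query_mv = [[1.0, 0.0], [0.0, 1.0]]
doc_mv = [[0.5, 0.5], [1.0, 0.0]]
print(max_sim(query_mv, doc_mv))  # 1.5 (best match 1.0 for the first query vector, 0.5 for the second)
```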

Using the Lucene-based support in OpenSearch has two advantages: 1) it works for users who cannot run Painless scripts, and 2) any optimizations or vectorization improvements made in Lucene carry over directly to OpenSearch. This issue proposes adding the required support to OpenSearch.

We see two main buckets of work:

  1. Add support to index multi-vectors into Lucene’s LateInteractionField.
  2. Add support to rescore query results using the configured similarity (maxSim by default) between a provided target query multi-vector and the indexed multi-vectors.
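The two buckets can be pictured with a small in-memory Python sketch: a dictionary stands in for the indexed multi-vectors, and a rescoring step re-ranks the top hits of an arbitrary first-pass query. Everything here is hypothetical (the `rescore` function, the `window` parameter, the dictionary "index") and assumes every candidate document has a stored multi-vector; it does not reflect the eventual OpenSearch API or Lucene's classes.

```python
from typing import Dict, List, Sequence, Tuple

MultiVector = Sequence[Sequence[float]]
Hit = Tuple[str, float]  # (document id, score)


def max_sim(query: MultiVector, doc: MultiVector) -> float:
    """Assumed sum-of-max dot-product similarity between two multi-vectors."""
    return sum(
        max(sum(x * y for x, y in zip(q, d)) for d in doc)
        for q in query
    )


def rescore(
    first_pass: List[Hit],
    multi_vectors: Dict[str, MultiVector],
    query_mv: MultiVector,
    window: int,
) -> List[Hit]:
    """Re-rank the top `window` first-pass hits by maxSim against `query_mv`.

    The first-pass score is discarded; hits beyond the window are dropped.
    """
    rescored = [
        (doc_id, max_sim(query_mv, multi_vectors[doc_id]))
        for doc_id, _ in first_pass[:window]
    ]
    # Python's sort is stable, so ties keep their first-pass order.
    rescored.sort(key=lambda hit: hit[1], reverse=True)
    return rescored


# Bucket 1 (stand-in): multi-vectors stored per document.
index = {
    "a": [[1.0, 0.0]],
    "b": [[0.0, 1.0], [1.0, 1.0]],
    "c": [[0.2, 0.2]],
}
# Bucket 2: rescore the top 2 hits of some first-pass query.
hits = [("a", 3.0), ("b", 2.0), ("c", 1.0)]
print(rescore(hits, index, [[1.0, 0.0], [0.0, 1.0]], window=2))
# [('b', 2.0), ('a', 1.0)]
```

Limiting rescoring to a window keeps the more expensive multi-vector comparison off the bulk of the result set, which is the usual reason late interaction is applied as a second pass.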

More details can be found here:

Author: vigyasharma (Vigya Sharma) · GitHub
