Compute similarity score while ingesting data

wassim.dhib · January 14, 2021, 9:26am

Hi,

I have a pipeline that contains:

a file parser that embeds some text fields as vectors
an Elasticsearch ingest pipeline that applies additional transformations to other numerical/date fields

I’d like to use k-NN for a classification use case, i.e. computing the similarity score of each embedded vector against a set of labelled vectors and adding a label to this vector.

Is it possible to do this with a script processor in an ingest pipeline?

Vijay · January 22, 2021, 1:01am

@wassim.dhib apologies for the late response. I believe the script processor is used to perform an operation on an individual document, like adding fields, replacing the score, etc., from the same index.
In your case, I would build another pipeline: for every document that contains an embedded vector, use that vector as the query vector for a search query on the index containing the labelled vectors, calculate the label from the search results, and update the document with the label.
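
A minimal sketch of that second pipeline, in Python. The field names `embedding` and `label`, the majority-vote rule, and the `search` callable are assumptions for illustration, not something the plugin prescribes. The query body follows the k-NN plugin's `knn` query shape; `search` stands in for whatever client call you use against the labelled index.

```python
from collections import Counter


def build_knn_query(vector, k=5, field="embedding"):
    """Build a k-NN search body; `field` is a hypothetical knn_vector field name."""
    return {
        "size": k,
        "query": {"knn": {field: {"vector": list(vector), "k": k}}},
    }


def majority_label(hits, label_field="label"):
    """Return the most common label among the neighbours, or None if there are none."""
    labels = [
        hit["_source"][label_field]
        for hit in hits
        if label_field in hit.get("_source", {})
    ]
    if not labels:
        return None
    return Counter(labels).most_common(1)[0][0]


def label_document(doc, search, k=5, vector_field="embedding"):
    """Query the labelled index with the document's vector and attach a label.

    `search` is any callable that takes a query body and returns a search
    response dict (with the usual hits.hits list).
    """
    body = build_knn_query(doc[vector_field], k=k, field=vector_field)
    hits = search(body)["hits"]["hits"]
    # Return a copy so the caller decides when to index or update the document.
    return {**doc, "label": majority_label(hits)}
```

With the opensearch-py client, `search` could be something like `lambda body: client.search(index="labelled-vectors", body=body)`, where the index name is made up. You would then index the returned document, or send it as a partial update, from the same job.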
