Data ingestion into a vector index is taking too much time

Versions (relevant - OpenSearch):
2.11

Issue:
We are using an m5.large.search single-node cluster for normal keyword searches in our application, but now we want to add vector search to improve relevance. For this I have created an ingest pipeline for the fields I want to vectorise; we have around 8 fields to vectorise. When I ingest data into this index, it takes far too long: ingesting around 1300-1400 documents into a normal index takes around 20-40 seconds, but ingesting the same data into the vector index takes around 15-20 minutes.
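
For reference, here is a minimal sketch of the kind of setup being described: a `text_embedding` ingest processor mapping source fields to vector fields, and a k-NN index wired to that pipeline. All names here (the pipeline ID, index name, field names, model ID, and dimension) are illustrative placeholders, not taken from the original post:

```python
from opensearchpy import OpenSearch

# Hypothetical connection details; adjust for your domain endpoint and auth.
client = OpenSearch(
    hosts=["https://localhost:9200"],
    http_auth=("admin", "admin"),
    verify_certs=False,
)

MODEL_ID = "your-model-id"  # placeholder: ID of the deployed embedding model

# Ingest pipeline: one text_embedding processor that maps each source text
# field to a corresponding vector target field.
client.ingest.put_pipeline(
    id="text-embedding-pipeline",
    body={
        "description": "Generate embeddings for text fields at ingest time",
        "processors": [
            {
                "text_embedding": {
                    "model_id": MODEL_ID,
                    "field_map": {
                        "title": "title_embedding",
                        "description": "description_embedding",
                        # ... remaining fields, one entry per field to embed
                    },
                }
            }
        ],
    },
)

# Vector index attached to the pipeline; the dimension must match the
# embedding model's output size.
client.indices.create(
    index="my-vector-index",
    body={
        "settings": {
            "index.knn": True,
            "default_pipeline": "text-embedding-pipeline",
        },
        "mappings": {
            "properties": {
                "title_embedding": {
                    "type": "knn_vector",
                    "dimension": 384,  # assumption: depends on your model
                    "method": {
                        "name": "hnsw",
                        "engine": "lucene",
                        "space_type": "cosinesimil",
                    },
                },
                # ... one knn_vector field per embedded source field
            }
        },
    },
)
```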

Can anyone suggest what is causing this? Is it the embedding generation, or something else I need to look at? And if the issue is in my strategy for creating the vectors, please let me know. One way to check the embedding side is sketched below.
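
One way to isolate whether embedding generation is the bottleneck is to time the model directly through the ML Commons predict API, outside of any ingest. A rough sketch, reusing the placeholder model ID from above:

```python
import time

from opensearchpy import OpenSearch

client = OpenSearch(
    hosts=["https://localhost:9200"],
    http_auth=("admin", "admin"),
    verify_certs=False,
)

MODEL_ID = "your-model-id"  # placeholder

# Time a single embedding call. With 8 embedded fields, each indexed
# document triggers roughly 8 of these, so per-call latency adds up fast.
start = time.perf_counter()
resp = client.transport.perform_request(
    "POST",
    f"/_plugins/_ml/_predict/text_embedding/{MODEL_ID}",
    body={
        "text_docs": ["a representative piece of text from one of your fields"],
        "return_number": True,
        "target_response": ["sentence_embedding"],
    },
)
elapsed = time.perf_counter() - start
print(f"one embedding took {elapsed * 1000:.0f} ms")

# Back-of-envelope check: 1400 docs * 8 fields * per-call latency gives a
# floor for total ingest time if embeddings are computed serially.
```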

Configuration:
So basically our use case is this: we want to let users query in a prompt format, and based on that we return the data related to their query. To do this, we transform the desired fields into vectors and then search the prompt against those vectors (see the query sketch below). Is this the right approach? Please let me know, as I am very new to this.
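
That is the standard neural-search pattern in OpenSearch: embed the fields at ingest, then query with a `neural` clause so the prompt is embedded with the same model at search time. A minimal sketch, using the same placeholder index, field, and model ID as above:

```python
from opensearchpy import OpenSearch

client = OpenSearch(
    hosts=["https://localhost:9200"],
    http_auth=("admin", "admin"),
    verify_certs=False,
)

resp = client.search(
    index="my-vector-index",
    body={
        "query": {
            "neural": {
                "title_embedding": {
                    "query_text": "the user's prompt goes here",
                    "model_id": "your-model-id",  # same model used at ingest
                    "k": 10,
                }
            }
        },
        "_source": ["title", "description"],  # return the text, not the vectors
    },
)
for hit in resp["hits"]["hits"]:
    print(hit["_score"], hit["_source"].get("title"))
```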

Thanks a lot in advance!

Hi Hemendra,

What type of model are you using to generate the embeddings during ingestion - is it hosted locally on the cluster, or remotely?
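
If you are not sure, you can check by fetching the model's metadata from ML Commons: a locally hosted model reports a function name such as TEXT_EMBEDDING, while a remotely hosted one reports REMOTE and references a connector. A small sketch (model ID is a placeholder):

```python
from opensearchpy import OpenSearch

client = OpenSearch(
    hosts=["https://localhost:9200"],
    http_auth=("admin", "admin"),
    verify_certs=False,
)

MODEL_ID = "your-model-id"  # placeholder

info = client.transport.perform_request("GET", f"/_plugins/_ml/models/{MODEL_ID}")
print(info.get("function_name"))  # e.g. "TEXT_EMBEDDING" (local) or "REMOTE"
print(info.get("model_state"))    # e.g. "DEPLOYED"
```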
