Vectorizing a big chunk of data returns errors

Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):

OpenSearch 2.11.0, currently deployed on my Windows 11 machine (not a Docker deployment).

Describe the issue:
I have successfully deployed the ML model using the Neural Search plugin tutorial.

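For reference, I registered and deployed the model with the ML Commons API, roughly like this (a sketch of my setup; the host and security settings are assumptions, and the task polling is simplified):

```python
import time
import requests

HOST = "http://localhost:9200"  # assumption: local cluster, security disabled

# Register the pretrained model (version 1.0.1, as in the tutorial)
resp = requests.post(
    f"{HOST}/_plugins/_ml/models/_register",
    json={
        "name": "huggingface/sentence-transformers/msmarco-distilbert-base-tas-b",
        "version": "1.0.1",
        "model_format": "TORCH_SCRIPT",
    },
).json()
task_id = resp["task_id"]

# Poll the registration task until it reports a model_id
while True:
    task = requests.get(f"{HOST}/_plugins/_ml/tasks/{task_id}").json()
    if task.get("state") == "COMPLETED":
        model_id = task["model_id"]
        break
    time.sleep(2)

# Deploy the model so it can serve embedding requests
requests.post(f"{HOST}/_plugins/_ml/models/{model_id}/_deploy")
```
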
But when ingesting the data, I get this error:
The size of tensor a (2792) must match the size of tensor b (512) at non-singleton dimension 1
The model I used is huggingface/sentence-transformers/msmarco-distilbert-base-tas-b, from the documentation.

The one thing I could figure out is that the document I am trying to vectorize is huge, around 13,000 characters.

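For completeness, the ingest setup follows the tutorial: a pipeline with a text_embedding processor, then indexing the document through it. The index, pipeline, field names, and source file below are just my placeholders:

```python
import requests

HOST = "http://localhost:9200"                 # assumption: same local cluster
MODEL_ID = "<model_id from the deploy step>"   # placeholder

# Ingest pipeline with a text_embedding processor, as in the tutorial
requests.put(
    f"{HOST}/_ingest/pipeline/nlp-pipeline",
    json={
        "description": "Embed the text field with the deployed model",
        "processors": [
            {
                "text_embedding": {
                    "model_id": MODEL_ID,
                    "field_map": {"text": "passage_embedding"},
                }
            }
        ],
    },
)

# Indexing the ~13,000-character document through the pipeline
# is the call that fails with the tensor-size error.
long_document = open("big_doc.txt", encoding="utf-8").read()  # placeholder file
requests.post(
    f"{HOST}/my-index/_doc",
    params={"pipeline": "nlp-pipeline"},
    json={"text": long_document},
)
```
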
Please let me know how I can vectorize such a large document.

Hi @rathankalluri, I assume you are using version 1.0.1 of the huggingface/sentence-transformers/msmarco-distilbert-base-tas-b model, which does not have the truncation feature, so any input longer than the model's 512-token limit fails. Could you please use version 1.0.2?

Please let me know if you are still facing the issue after upgrading to 1.0.2.
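
Re-registering at 1.0.2 is the same flow as the initial registration; a minimal sketch (host is an assumption):

```python
import requests

HOST = "http://localhost:9200"  # assumption: local cluster

# Re-register the same model at version 1.0.2, which truncates
# long inputs to the model's maximum sequence length
resp = requests.post(
    f"{HOST}/_plugins/_ml/models/_register",
    json={
        "name": "huggingface/sentence-transformers/msmarco-distilbert-base-tas-b",
        "version": "1.0.2",
        "model_format": "TORCH_SCRIPT",
    },
).json()

# Then resolve the task to the new model_id, deploy it, and update
# the ingest pipeline's model_id to point at the new model.
```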

Thanks a lot @dhrubo … It worked!

