Thanks for the blog. @Navneet
Is it possible to store only byte vectors instead of the FP32 vectors, so that the storage size is reduced further? Does OpenSearch provide such a configuration?
Also, is there a mechanism to keep only the byte vectors, or does OpenSearch have an API that returns the byte-quantized vectors produced from the FP32 vectors? That could still reduce the disk storage.
I am a bit confused by the question, so I will answer based on my best understanding. If you still have questions, feel free to post them.
OpenSearch does support different data types for vectors, including fp32, byte, and binary. If your vectors already fit within the byte or binary range, I would suggest setting the data_type field directly while creating the index mapping. This reduces the memory and disk footprint, since the vectors are stored as byte or binary from the start.
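For example, a minimal sketch using the opensearch-py client (the index and field names here are made up for illustration):

```python
from opensearchpy import OpenSearch

client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])

# Vectors indexed into this field must already fit in the signed-byte
# range [-128, 127]; OpenSearch stores them as bytes, not fp32.
client.indices.create(
    index="my-byte-index",  # hypothetical index name
    body={
        "settings": {"index.knn": True},
        "mappings": {
            "properties": {
                "my_vector": {  # hypothetical field name
                    "type": "knn_vector",
                    "dimension": 8,
                    "data_type": "byte",
                    "method": {
                        "name": "hnsw",
                        "engine": "lucene",
                        "space_type": "l2",
                    },
                }
            }
        },
    },
)
```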
If your vectors are in fp32 but you want to quantize them to byte, binary, or fp16, OpenSearch provides different quantization techniques for that. This also reduces the memory and disk footprint, but not as much as the first option, since we store both the fp32 vectors and the quantized vectors on disk. During search, only the quantized vectors are used.
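As one example, here is a sketch of Faiss scalar quantization to fp16: you send fp32 vectors and OpenSearch quantizes them internally. Again, the index and field names are hypothetical:

```python
from opensearchpy import OpenSearch

client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])

# The "sq" encoder with type fp16 tells the Faiss engine to scalar-quantize
# incoming fp32 vectors to fp16 for the search structure.
client.indices.create(
    index="my-quantized-index",  # hypothetical index name
    body={
        "settings": {"index.knn": True},
        "mappings": {
            "properties": {
                "my_vector": {  # hypothetical field name
                    "type": "knn_vector",
                    "dimension": 8,
                    "method": {
                        "name": "hnsw",
                        "engine": "faiss",
                        "space_type": "l2",
                        "parameters": {
                            "encoder": {
                                "name": "sq",
                                "parameters": {"type": "fp16"},
                            }
                        },
                    },
                }
            }
        },
    },
)
```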
As for quantization: if you are asking whether you, as a user, can view the values of the quantized vectors via some API, then the answer is no, you cannot.
I was asking because the pretrained models return and store only FP32 vectors. Is there any way we can capture these FP32 vectors, quantize them to byte/binary, and store only the quantized form (see the sketch below for roughly what I mean)?
Would this cause a loss of accuracy at search time?
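To make the question concrete, this is the kind of client-side quantization I have in mind before indexing into a byte field. This is just a naive sketch of one possible scaling scheme, not anything OpenSearch does itself:

```python
import numpy as np

def quantize_to_byte(vec: np.ndarray) -> np.ndarray:
    """Linearly map an fp32 vector into the signed-byte range [-128, 127]."""
    max_abs = np.max(np.abs(vec))
    if max_abs == 0:
        return np.zeros_like(vec, dtype=np.int8)
    scaled = np.round(vec / max_abs * 127.0)
    return scaled.astype(np.int8)

# Made-up embedding values for illustration.
fp32_vec = np.array([0.12, -0.87, 0.45, 0.03], dtype=np.float32)
print(quantize_to_byte(fp32_vec).tolist())  # [18, -127, 66, 4]
```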