Versions
OpenSearch 2.11 managed by AWS
Describe the issue:
I currently have an index with a knn_vector field (Lucene engine). I want to create a new index with the same vectors, but stored as byte vectors. For this purpose I would like to use Painless scripting, since it is faster than downloading the documents, transforming them, and indexing them into the target index.
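For context, the target index maps the field as a byte knn_vector on the Lucene engine, roughly like this (the dimension and space type shown here are only illustrative, not necessarily my exact settings):

PUT target-index
{
  "settings": {
    "index.knn": true
  },
  "mappings": {
    "properties": {
      "vector": {
        "type": "knn_vector",
        "dimension": 512,
        "data_type": "byte",
        "method": {
          "name": "hnsw",
          "engine": "lucene",
          "space_type": "l2"
        }
      }
    }
  }
}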
This is what I have so far:
POST _reindex?refresh
{
  "source": {
    "index": "source-index",
    "query": {
      "match_all": {}
    },
    "_source": ["vector"]
  },
  "dest": {
    "index": "target-index",
    "op_type": "create"
  },
  "conflicts": "proceed",
  "script": {
    "lang": "painless",
    "source": """
      byte[] quantized = new byte[512];
      for (int i = 0; i < 512; i++) {
        quantized[i] = (byte) ctx._source.vector[i];
      }
      ctx._source.vector = quantized;
    """
  }
}
This is a minimal example and doesn’t contain the scaling I want to do with the vector yet. I just want to get it running and then improve from there.
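For reference, the scaling I eventually want to add is roughly a linear rescale of each component into the byte range before quantizing. A sketch of that script body (the assumed [-1.0, 1.0] input range and the list-based collection are only for illustration, not my real data or final code):

"source": """
  // Sketch only: rescale each component from an assumed [-1.0, 1.0]
  // range into [-127, 127] before quantizing. Values are collected
  // into a list here purely for illustration.
  def scaled = new ArrayList();
  for (int i = 0; i < ctx._source.vector.size(); i++) {
    double v = ctx._source.vector[i];
    scaled.add((int) Math.round(v * 127));
  }
  ctx._source.vector = scaled;
"""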
Unfortunately, I am unable to get this to work. This is the error I am getting:
"failures": [
{
"index": "target-index",
"id": "my-id",
"cause": {
"type": "mapper_parsing_exception",
"reason": "failed to parse field [vector] of type [knn_vector] in document with id 'my-id'. Preview of field's value: 'AAAAA [...] AAAAA='",
"caused_by": {
"type": "illegal_argument_exception",
"reason": "Vector dimension mismatch. Expected: 512, Given: 0"
}
},
"status": 400
},
I don’t know why it receives only a 0-dimensional vector. Can someone point me to a resource where I can read more about Painless scripting and vectors?