Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):
OpenSearch 3.0.0
Describe the issue:
I need hybrid search with inner hits so that I can isolate the relevant chunks of a document. My index structure already works: each record has a nested field “chunks”, and each element of it holds a text field and a knn vector field, so every chunk’s text stays matched with its embedding. With OpenSearch 2.19.1 I could use hybrid search to retrieve the record, but without inner hits, so I had no information about which chunk actually matched. Since version 3.0.0 includes changes that make inner hits retrievable in hybrid queries, I first upgraded the server without changing the query structure.
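For reference, the index structure described above can be sketched as the following index creation body. This is only an illustration, not the exact mapping of my index: the field names `chunks.text` and `chunks.embedding` are the ones used in the queries below, while the `dimension` value of 768 is an assumed placeholder that has to match the embedding model actually in use.

```json
{
  "settings": {
    "index.knn": true
  },
  "mappings": {
    "_meta": {
      "note": "Illustrative sketch; dimension 768 is an assumed placeholder"
    },
    "properties": {
      "chunks": {
        "type": "nested",
        "properties": {
          "text": { "type": "text" },
          "embedding": {
            "type": "knn_vector",
            "dimension": 768
          }
        }
      }
    }
  }
}
```

Keeping the text and the vector inside the same nested element is what ties each embedding to its own chunk.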
Now I have another problem. For some searches I get the error “Sub-iterators of ConjunctionDISI are not on the same document!”, even though the queries are exactly the same as before the upgrade. I found an issue that discusses this error, but most of the conversation is about an interval query; only the last comment mentions hybrid search, and it gives no further details. Here is the link to the issue:
Example of a query that returns the error:
{
  "query": {
    "hybrid": {
      "queries": [
        {
          "nested": {
            "path": "chunks",
            "query": {
              "query_string": {
                "fields": ["chunks.text"],
                "query": "tipi~1 AND pagamento~1"
              }
            }
          }
        },
        {
          "nested": {
            "path": "chunks",
            "query": {
              "neural": {
                "chunks.embedding": {
                  "model_id": "sWzKyJYBCnSNyPkYXI9N",
                  "query_text": "tipi di pagamento"
                }
              }
            }
          }
        }
      ]
    }
  }
}
Example of a query that works fine:
{
  "query": {
    "hybrid": {
      "queries": [
        {
          "nested": {
            "path": "chunks",
            "query": {
              "query_string": {
                "fields": ["chunks.text"],
                "query": "scheda~1 AND tecnica~1 AND prodotto~1"
              }
            }
          }
        },
        {
          "nested": {
            "path": "chunks",
            "query": {
              "neural": {
                "chunks.embedding": {
                  "model_id": "sWzKyJYBCnSNyPkYXI9N",
                  "query_text": "scheda tecnica prodotto"
                }
              }
            }
          }
        }
      ]
    }
  }
}
Since the query structure is identical in both cases, the problem seems to depend on the records that end up in the result…
Can anyone help?
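For context, this is roughly the query I am aiming for once the error is sorted out: the same hybrid query with `inner_hits` added to each nested clause. It is an untested sketch. The inner hits names `lexical_chunks` and `semantic_chunks` are hypothetical labels I chose so that the two clauses on the same path do not share the default name; everything else is copied from the working query above.

```json
{
  "query": {
    "hybrid": {
      "queries": [
        {
          "nested": {
            "path": "chunks",
            "query": {
              "query_string": {
                "fields": ["chunks.text"],
                "query": "scheda~1 AND tecnica~1 AND prodotto~1"
              }
            },
            "inner_hits": { "name": "lexical_chunks" }
          }
        },
        {
          "nested": {
            "path": "chunks",
            "query": {
              "neural": {
                "chunks.embedding": {
                  "model_id": "sWzKyJYBCnSNyPkYXI9N",
                  "query_text": "scheda tecnica prodotto"
                }
              }
            },
            "inner_hits": { "name": "semantic_chunks" }
          }
        }
      ]
    }
  }
}
```

Each hit would then be expected to carry an `inner_hits` section listing the chunks that matched the lexical and the semantic clause respectively.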