Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):
OpenSearch_2_13_R20240520-P4
opensearch-py: 2.6.0
Describe the issue:
I’m having trouble understanding how to properly limit the number of search results returned by the search() method.
Should I be using the from_ and size parameters in the method call (and why is it named from_, with a trailing underscore)?
Or should I be setting size alongside the "query" entry in the dictionary I pass as the request body?
I’ve tried a variety of such settings and I’m still getting the full result set back, up to thousands of matches.
Of note: I’m running k-NN queries. I pass the following to the search() method:
knn_query = {"knn": {"vector": {"vector": query_vector, "k": k}}}
query = {"query": knn_query}
I have also tried query = {"size": k, "query": knn_query}, among other permutations, and I either get an error or the full result set.
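To make the two options concrete, here is a minimal sketch of the request shapes as I understand them. The helper name build_knn_body, the index name "my-index", and the client variable are placeholders for illustration, not names from my actual code. In the first shape, "size" sits at the top level of the body as a sibling of "query". In the second, size and from_ are keyword arguments to search(); from_ carries the underscore because from is a reserved word in Python.

```python
from typing import Any, Dict, List


def build_knn_body(query_vector: List[float], k: int, size: int,
                   field: str = "vector") -> Dict[str, Any]:
    """Build a k-NN search body with the hit limit as a top-level key."""
    # "k" controls how many neighbors the k-NN search retrieves;
    # "size" caps how many hits come back in the response.
    knn_query = {"knn": {field: {"vector": query_vector, "k": k}}}
    # "size" is a sibling of "query", not nested inside it.
    return {"size": size, "query": knn_query}


body = build_knn_body([0.1, 0.2, 0.3], k=5, size=5)

# Hypothetical calls, assuming an existing opensearch-py client and index:
# response = client.search(index="my-index", body=body)
# response = client.search(index="my-index",
#                          body={"query": body["query"]},
#                          size=5, from_=0)
```

The sketch only builds the dictionary, so it runs without a cluster; the commented calls show where the body would be passed.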
For now, I’m applying the limit in my own code by short-circuiting once the desired number of results is reached. However, it would be more efficient to apply the limit on the OpenSearch side, which could also reduce latency.
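For context, the client-side workaround amounts to something like the sketch below. The helper name take_hits and the sample data are made up for illustration; in the real code the iterable would be the hits list from the search response.

```python
from itertools import islice
from typing import Any, Dict, Iterable, List


def take_hits(hits: Iterable[Dict[str, Any]], limit: int) -> List[Dict[str, Any]]:
    """Return at most `limit` hits, stopping iteration early."""
    return list(islice(hits, limit))


# Stand-in for response["hits"]["hits"]; these documents are fabricated.
sample_hits = [{"_id": str(i), "_score": 1.0 / (i + 1)} for i in range(1000)]
top = take_hits(sample_hits, 10)
```

This trims what my code processes, but every hit has still been serialized and sent over the network by then, which is why I would rather limit on the server.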
I’d appreciate any pointers or recommendations.
Configuration:
Master nodes: 3 x m6g.large.search
Data nodes: 6 x r6g.4xlarge.search
Relevant Logs or Screenshots:
None for now