I prepare and run a query against an index and get back 10000 rows/documents. OK, I can live with the limit, as I understand the implications.
But which rows are returned? If my filter/query would have matched 1 million records but I only get back 10K, are they the first ones? Sorted how? It looks like a random sampling from within the time interval.
For clarification, I am specifically asking about a query that matches 1 million documents, where the CSV report export then gives me 10000 rows. I know why the report is limited to 10K. But which rows are returned? The first 10K in order of match score? Date? Random?
Thanks
Dean
@dangelic0 By default, search results in OpenSearch are sorted by the relevance score `_score`, highest first.
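To make the returned subset predictable rather than score-ordered, one option is to put an explicit `sort` in the search body. When every hit has the same score (as tends to happen with filter-only queries), the order among ties is not meaningful, which may explain the "random sampling" impression. Below is a minimal sketch that only builds the request body; the field name `@timestamp` and the helper name `build_sorted_query` are assumptions for illustration, and whether the CSV report export honors this sort is not confirmed in this thread.

```python
import json


def build_sorted_query(time_field="@timestamp", page_size=10000):
    """Build a search body that returns the oldest documents first.

    time_field is a hypothetical date field; replace it with a real
    date field from your own index mapping.
    """
    return {
        "size": page_size,                 # number of hits requested
        "query": {"match_all": {}},        # placeholder; put your real filter here
        "sort": [
            {time_field: {"order": "asc"}},  # explicit order instead of _score
        ],
    }


# Print the body so it can be pasted into Dev Tools or sent with a client.
print(json.dumps(build_sorted_query(), indent=2))
```

With a sort like this, the 10000 rows you get back are the first 10000 by date, not an arbitrary pick among equally scored hits.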
Why would you need to return all the documents? You should narrow your results by building efficient queries.
I don’t really need all those rows. I need to perform aggregation across a large set of rows/documents. Like a bucket aggregation, or a cardinality aggregation. But I found that even the aggregations only accept 10000 rows from the query.
So I was going to do the aggregation outside of OpenSearch by exporting the rows. Hence my question about which rows are returned under the 10K limit.
How can I do an aggregation on 1 million rows? I see the doc count in the aggregation set to 10K…
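For reference, as far as I understand it, aggregations in OpenSearch are computed over every document the query matches, not only over the hits that are returned, while the reported total hit count is capped at 10000 by default unless `track_total_hits` is set. Here is a minimal sketch of a request body that asks for aggregations only; the field names `status` and `user_id` and the helper name `build_aggregation_body` are hypothetical, and this is not confirmed as the cause of the 10K doc count seen above.

```python
import json


def build_aggregation_body(bucket_field="status", distinct_field="user_id"):
    """Build a search body that returns aggregations and no hits.

    bucket_field and distinct_field are hypothetical keyword fields;
    replace them with fields from your own index mapping.
    """
    return {
        "size": 0,                   # return no documents, only aggregation results
        "track_total_hits": True,    # report the exact match count, not capped at 10000
        "query": {"match_all": {}},  # placeholder; put your real filter here
        "aggs": {
            # bucket aggregation: document count per distinct value
            "by_bucket": {"terms": {"field": bucket_field, "size": 50}},
            # cardinality aggregation: approximate number of distinct values
            "distinct_values": {"cardinality": {"field": distinct_field}},
        },
    }


# Print the body so it can be pasted into Dev Tools or sent with a client.
print(json.dumps(build_aggregation_body(), indent=2))
```

Because `size` is 0, the 10000-row result window does not come into play; the bucket counts should add up across all matching documents.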