Neural Sparse Search and bucket aggregation filtering

grunt-solaces.0h · April 1, 2024, 6:01pm

Hi,

We are using neural sparse search and we want to keep the best x% of matches. To do this, we first run the query, get the max score, and then run the query again with min_score = x% * max_score. All the filtering clauses (ranges, terms, etc.) are in a filter context and only the neural_sparse clause is in a query context (must), so the score is affected only by the sparse search.
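For reference, the two-pass approach described above could be sketched roughly like this. The field name `passage_embedding`, the helper names, and the way the request bodies are assembled are illustrative assumptions, not taken from our actual setup; only `neural_sparse`, `min_score`, and the bool `must`/`filter` layout come from the description.

```python
def build_base_query(query_text, model_id, filters, sparse_field="passage_embedding"):
    """Bool query: neural_sparse in `must` (scored), everything else in `filter` (unscored).

    `sparse_field` is a hypothetical field name used for illustration.
    """
    return {
        "bool": {
            "must": [
                {"neural_sparse": {sparse_field: {"query_text": query_text, "model_id": model_id}}}
            ],
            "filter": list(filters),
        }
    }


def build_first_pass(base_query):
    """First pass: only the top hit is needed to read hits.max_score from the response."""
    return {"size": 1, "query": base_query}


def build_second_pass(base_query, max_score, keep_ratio):
    """Second pass: same query, keeping only hits scoring at least keep_ratio * max_score."""
    return {"min_score": keep_ratio * max_score, "query": base_query}


# Example: keep hits scoring at least 50% of the best score.
base = build_base_query("wireless headphones", "my-model-id", [{"term": {"brand": "acme"}}])
first = build_first_pass(base)
second = build_second_pass(base, max_score=10.0, keep_ratio=0.5)
```

Each body would be sent as a normal `_search` request; `max_score` is read from `hits.max_score` in the first response.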

However, the problem is that we want to aggregate the data based on a term and we want to keep the top x% results per bucket, not top x% of all the results returned by the initial filtering.

What we have now: filter → keep top x% based on _score → split into buckets
What we would want: filter → split into buckets → keep top x% per bucket based on the _score inside the bucket

How can this be achieved?

Thanks!

Note: Using AOS 2.11

zhichao-aws · May 31, 2024, 3:39am

Hi @grunt-solaces.0h, is min_score applied as a filter in the second query, or do you want to calculate a separate min_score for each bucket?

zhichao-aws · June 7, 2024, 2:18am

Have you taken a look at the collapse feature? It lets you collapse the search results on a field, which gives you buckets and the hits within each bucket.
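A rough sketch of how collapse might be combined with a per-bucket cutoff, applied client side. This is only one possible approach under several assumptions: the bucket field is a keyword or numeric field with doc values, the inner hits name `per_bucket` and both helper names are made up for illustration, and the response shape (collapsed hits carrying `fields` and `inner_hits`) should be verified against your AOS version. Note that `inner_hits.size` caps how many hits come back per bucket, so very large buckets would be truncated.

```python
def build_collapse_query(base_query, bucket_field, per_bucket_size=100):
    """Collapse on bucket_field and return up to per_bucket_size scored hits per bucket."""
    return {
        "query": base_query,
        "collapse": {
            "field": bucket_field,
            "inner_hits": {
                "name": "per_bucket",  # hypothetical name, reused when parsing the response
                "size": per_bucket_size,
                "sort": [{"_score": "desc"}],
            },
        },
    }


def keep_top_ratio_per_bucket(response, bucket_field, keep_ratio, inner_name="per_bucket"):
    """For each bucket, keep hits scoring at least keep_ratio * that bucket's max score."""
    kept = {}
    for hit in response["hits"]["hits"]:
        inner = hit["inner_hits"][inner_name]["hits"]["hits"]
        if not inner:
            continue
        bucket_key = hit["fields"][bucket_field][0]
        threshold = keep_ratio * max(h["_score"] for h in inner)
        kept[bucket_key] = [h for h in inner if h["_score"] >= threshold]
    return kept


# Hand-written response fragment, shaped like a collapsed search result.
fake_response = {"hits": {"hits": [
    {"fields": {"category": ["a"]}, "inner_hits": {"per_bucket": {"hits": {"hits": [
        {"_id": "1", "_score": 10.0}, {"_id": "2", "_score": 6.0}, {"_id": "3", "_score": 2.0}]}}}},
    {"fields": {"category": ["b"]}, "inner_hits": {"per_bucket": {"hits": {"hits": [
        {"_id": "4", "_score": 3.0}, {"_id": "5", "_score": 1.0}]}}}},
]}}
result = keep_top_ratio_per_bucket(fake_response, "category", keep_ratio=0.5)
```

With this, the threshold is computed from each bucket's own best score rather than the global max, which matches the "filter → split into buckets → keep top x% per bucket" order asked about.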
