Rank based on rarity of a field value

pheeria · January 19, 2023, 11:23pm

Hi

I’d like to know how I can rank items lower when they have field values that appear frequently among the results.
Say, we have a similar result set:

"name": "Red T-Shirt"
"store": "Zara"

"name": "Yellow T-Shirt"
"store": "Zara"

"name": "Red T-Shirt"
"store": "Bershka"

"name": "Green T-Shirt"
"store": "Benetton"

I’d like to rank the documents so that those with frequently occurring field values,
“store” in this case, are deboosted and appear lower in the results.
This is to achieve a bit of variety, so that the search doesn’t yield top results from the same store.

In the example above, if I search for “T-Shirt”, I want to see one Zara T-Shirt at the top, and the remaining
Zara T-Shirts should appear lower, after the results from all the other stores.

So far I have looked into sorting by aggregation buckets and script-based sorting, but without success.
Is it possible to achieve this inside of the search engine?

Many thanks in advance!

radu.gheorghe · January 23, 2023, 2:16pm

Hello,

I don’t think you can get this exact result natively, but there are some options that are close enough, IMO. Here’s one:

First, you can collapse search results, for example to show one T-shirt per unique store.
Then, on a second query, you can show the rest, maybe excluding the ones you already showed.
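As a rough sketch of those two steps, the request bodies could be built like this. The field names `name` and `store` come from the example above; the helper names are made up for illustration, and I'm assuming `store` is mapped as a `keyword` field (collapsing needs a single-valued keyword or numeric field). You'd send these dicts as the search body with whatever client you use.

```python
import json


def first_pass_query(text, collapse_field="store"):
    """Search body that returns only the top hit per store (hypothetical helper)."""
    return {
        "query": {"match": {"name": text}},
        # Field collapsing: keep one document per distinct value of the field.
        "collapse": {"field": collapse_field},
    }


def second_pass_query(text, shown_ids):
    """Search body for the remaining hits, skipping documents already shown."""
    return {
        "query": {
            "bool": {
                "must": [{"match": {"name": text}}],
                # Exclude the _id values returned by the first pass.
                "must_not": [{"ids": {"values": list(shown_ids)}}],
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(first_pass_query("T-Shirt"), indent=2))
    # The IDs here are placeholders; in practice they come from the first response.
    print(json.dumps(second_pass_query("T-Shirt", ["1", "3"]), indent=2))
```

The application then shows the first response followed by the second, which gives one item per store at the top and everything else below.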

We can think of others that are similar (e.g. the use of the top_hits aggregation). Note that in general, the default similarity de-boosts words that appear more often, if they match words from your query. So if you search for “zara OR bershka”, then Bershka T-shirts will come on top, because they’re “more specific” to your query. But if you just want variance in search results, then you’ll want to do some field collapsing or maybe inject some random score via the function score query: Function score query | Elasticsearch Guide [7.10] | Elastic
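For the random-score idea, a minimal sketch of the request body might look like the following. The helper name is made up, the `name` field comes from the example above, and using `_seq_no` as the field behind `random_score` is an assumption on my part, so check it against the docs for your version. Note that this only shuffles results a bit; it doesn't guarantee one item per store.

```python
import json


def randomized_query(text, seed, boost_mode="sum"):
    """Search body that adds a seeded random component to the text score."""
    return {
        "query": {
            "function_score": {
                "query": {"match": {"name": text}},
                "functions": [
                    # Same seed -> same ordering, so paging stays stable per user/session.
                    {"random_score": {"seed": seed, "field": "_seq_no"}}
                ],
                # How the random value is combined with the relevance score.
                "boost_mode": boost_mode,
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(randomized_query("T-Shirt", seed=42), indent=2))
```

With `boost_mode` set to `sum`, the random value is added to the relevance score, so it mostly reorders documents whose text scores are close to each other.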
