Searching multiple fields with custom scoring (e.g. Hamming)

If I have documents with several binary vector fields and I want to compute Hamming distance against any of them (in other words, I want my score to be the max score over all the fields), can I do that in one query, or is the only way to run a separate query for each field?

The use case is PDQ hashes, which are fuzzy hashes of images. The output is a binary vector. However, you get hashes for various transformations of the image (e.g. reflection and rotation). So to match an image against others, ideally you’d compare its main hash with all of the other hashes.

So the data is indexed as below
[image]

If I have a candidate vector, can I search all of these fields at once via Hamming distance?

If not, and I don’t care about preserving the original field names (e.g. all the hashes could be given the same name), could I set up the mapping in a way that lets me search them all at once?
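
To make the score I'm after concrete, here is a minimal Python sketch of "max over fields" computed client-side. The field names (original, reflect_h, rotate_90) and the 1 / (1 + distance) similarity are assumptions for illustration only, and the one-byte hashes stand in for real 256-bit PDQ hashes.

```python
from typing import Dict, Tuple


def hamming_distance(a: bytes, b: bytes) -> int:
    """Count the bits that differ between two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("hashes must have the same length")
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def hamming_score(a: bytes, b: bytes) -> float:
    """Turn a distance into a similarity (assumed form: 1 / (1 + distance))."""
    return 1.0 / (1 + hamming_distance(a, b))


def best_field(candidate: bytes, hashes: Dict[str, bytes]) -> Tuple[str, float]:
    """Return the field with the highest score, plus that score."""
    name = max(hashes, key=lambda k: hamming_score(candidate, hashes[k]))
    return name, hamming_score(candidate, hashes[name])


# Hypothetical document: one hash per image transformation.
doc_hashes = {
    "original": bytes([0b11110000]),
    "reflect_h": bytes([0b00001111]),
    "rotate_90": bytes([0b11110001]),
}
candidate = bytes([0b11110011])

print(best_field(candidate, doc_hashes))
```

What I'd like is for the search engine to compute this per document, so a single query returns documents ranked by their best-matching hash.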

@gdd314596 Currently, custom scoring doesn’t support more than one field. IIRC, function_score can be used to get the max score out of multiple scores from a document. Alternatively, you could create multiple documents, each with a single field (hash), add metadata to distinguish original vs reflect_h vs rotate… etc, and perform custom scoring using hammingbit with size = 1 to get the max score across documents.
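
A rough sketch of the one-document-per-hash approach, written as a Python dict for the search request body. It assumes the k-NN plugin's scoring script (`knn_score` with `lang` set to `knn`), a binary field hypothetically named `hash` that stores the base64-encoded hash, and a base64-encoded query value. Check the field name and encoding against your own mapping before relying on it.

```python
import base64
import json


def build_hamming_query(candidate: bytes, field: str = "hash") -> dict:
    """Build a script_score request body that ranks documents by hammingbit.

    `field` is a hypothetical field name; each document holds one hash plus
    metadata saying which transformation (original, reflect_h, ...) it is.
    """
    return {
        "size": 1,  # only the best-scoring document across all hashes
        "query": {
            "script_score": {
                "query": {"match_all": {}},
                "script": {
                    "source": "knn_score",
                    "lang": "knn",
                    "params": {
                        "field": field,
                        # Assumed encoding: base64 string for a binary field.
                        "query_value": base64.b64encode(candidate).decode("ascii"),
                        "space_type": "hammingbit",
                    },
                },
            }
        },
    }


body = build_hamming_query(bytes.fromhex("f0f1"))
print(json.dumps(body, indent=2))
```

The metadata on the top hit then tells you which transformation produced the best match.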
