Aggregation on similarity

Describe the issue:
I am indexing documents into OpenSearch that contain a field ‘MessageText’. I want to aggregate on this field and show the most common groups of similar message texts. I would consider two messages similar if their similarity score is high enough.

For example, given these messages:

text #1 → hello my name is Jhon
text #2 → Hello my name is Lisa
text #3 → Helo i want pizza.

I would like to get:

Hello my name is Jhon → Count = 2
Hello i want pizza → Count = 1
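
To make the desired behaviour concrete, here is a rough sketch of the grouping done client-side in Python, outside OpenSearch. It is only an illustration of what I mean by "similar": the function name group_similar, the use of difflib.SequenceMatcher as the similarity score, and the 0.6 threshold are all assumptions of mine, not anything OpenSearch provides.

```python
from difflib import SequenceMatcher


def group_similar(messages, threshold=0.6):
    """Greedily group messages whose similarity to a group's first
    message is at least `threshold` (hypothetical cut-off)."""
    groups = []  # each entry is [representative_text, count]
    for msg in messages:
        for group in groups:
            # Case-insensitive character-level similarity in [0, 1].
            score = SequenceMatcher(None, msg.lower(), group[0].lower()).ratio()
            if score >= threshold:
                group[1] += 1
                break
        else:
            # No existing group was close enough: start a new one.
            groups.append([msg, 1])
    # Most common groups first.
    groups.sort(key=lambda g: g[1], reverse=True)
    return [(text, count) for text, count in groups]


if __name__ == "__main__":
    texts = [
        "hello my name is Jhon",
        "Hello my name is Lisa",
        "Helo i want pizza.",
    ]
    for text, count in group_similar(texts):
        print(f"{text} -> Count = {count}")
```

With the three example messages, the first two land in one group and the third in its own. The first message seen becomes the label of its group, so the result depends on input order. What I am looking for is a way to get this kind of result from an OpenSearch aggregation instead of pulling all documents to the client.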