Help needed with index sharding for running script score k-NN

Hello, we need to use script score k-NN, which computes cosine similarity by brute force.
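For context, here is a sketch of the kind of query in question. It assumes the OpenSearch k-NN plugin's scoring script (`knn_score` with `lang: knn`). The field name `my_vector`, the helper name, and the sample vector are hypothetical placeholders, not our real mapping.

```python
import json


def build_script_score_knn_query(query_vector, field="my_vector", k=10):
    """Build an exact (brute-force) k-NN search body scored by cosine similarity.

    Assumption: OpenSearch k-NN plugin scoring script. `field` is a
    hypothetical knn_vector field name used only for illustration.
    """
    return {
        "size": k,
        "query": {
            "script_score": {
                # Every document matched by this inner query gets scored,
                # so match_all means a full scan of each shard.
                "query": {"match_all": {}},
                "script": {
                    "source": "knn_score",
                    "lang": "knn",
                    "params": {
                        "field": field,
                        "query_value": list(query_vector),
                        "space_type": "cosinesimil",
                    },
                },
            }
        },
    }


body = build_script_score_knn_query([0.1, 0.2, 0.3])
print(json.dumps(body, indent=2))
```

Since the script scores every document the inner query matches, a more selective inner query than `match_all` would shrink the set that has to be scored.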

Our cluster configuration is as follows:
Data node count: 6
Instance type: r6g.xlarge
Number of docs in index: 3.7M
Number of shards: 12
Size of each shard: 4.5 GB
Number of replicas: 1
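
To make these numbers easier to reason about, here is a rough back-of-the-envelope sketch. It assumes that the 12 shards are primaries, that 4.5 GB is the size of one primary shard, that shards are spread evenly across nodes, and that r6g.xlarge provides 4 vCPUs per node. The function and key names are illustrative only.

```python
def summarize_cluster(data_nodes, vcpus_per_node, total_docs,
                      primary_shards, replicas, shard_size_gb):
    """Derive per-shard and per-node figures from the stated configuration."""
    shard_copies = primary_shards * (1 + replicas)
    return {
        # Documents a brute-force script must score on each shard
        "docs_per_shard": total_docs / primary_shards,
        # Primaries plus replicas across the whole cluster
        "shard_copies": shard_copies,
        # Shard copies hosted per node, assuming even allocation
        "copies_per_node": shard_copies / data_nodes,
        # One copy of each shard is searched per query
        "shards_searched_per_node": primary_shards / data_nodes,
        "total_vcpus": data_nodes * vcpus_per_node,
        "primary_data_gb": primary_shards * shard_size_gb,
        "total_data_gb": shard_copies * shard_size_gb,
    }


summary = summarize_cluster(
    data_nodes=6,
    vcpus_per_node=4,      # assumption: r6g.xlarge has 4 vCPUs
    total_docs=3_700_000,
    primary_shards=12,
    replicas=1,
    shard_size_gb=4.5,
)
for name, value in summary.items():
    print(f"{name}: {value:,.1f}")
```

Under these assumptions each shard holds roughly 308,000 documents, there are 24 shard copies in total (4 per node), and a single query touches about 2 shards per node.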

With the above configuration, we have been seeing an average latency of 1.8 s to 2.8 s.

How can we reduce this latency to under one second? Should we add more shards, reduce the number of shards, or upgrade the instance type? Any recommendations would really help.

The Profile API shows that the script score query and the collector operation account for most of this latency.