OpenSearch primary shard allocation

Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):
2.13

Describe the issue:
I have 4 hosts with JBOD NVMe disks. Each disk runs its own instance of OpenSearch (a containerized data node). I set a hostname attribute on each node to prevent a shard from replicating to another disk on the same host. Note: I have since found ‘cluster.routing.allocation.same_shard.host’.

This cluster does heavy indexing and light searching. To reduce CPU, I enabled ‘SEGMENT’ replication, which helped greatly (glad the bug where index templates skipped this setting was fixed in 2.11!). My next problem is that some hosts get more ‘Primary’ shards than others. This increases CPU on those hosts, because with segment replication only the primary indexes the documents and the replica no longer repeats that work. Is there any way to have OpenSearch weight primaries using a host attribute?

i.e. 4 hosts, 6 disks per host = 24 OpenSearch data nodes
Index = 12 shards. Place 3 primaries on each host by weighting on the host ‘attribute’, then work out where the replicas should go?
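
To see how uneven the spread actually is, one option is to tally primaries per host from the shard listing. The following is a minimal sketch, assuming rows shaped like the output of GET _cat/shards?format=json (the 'prirep' and 'node' fields) and a node-to-host mapping that you maintain yourself. The node names, host names, index name and sample placement are made up for illustration.

```python
from collections import Counter


def primaries_per_host(shard_rows, node_to_host):
    """Count primary shards per host.

    shard_rows: iterable of dicts with 'prirep' ('p' or 'r') and 'node' keys,
                the shape returned by GET _cat/shards?format=json.
    node_to_host: hypothetical mapping of node name -> host name.
    """
    # Start every known host at zero so hosts without primaries still show up.
    counts = Counter({host: 0 for host in set(node_to_host.values())})
    for row in shard_rows:
        # Unassigned shards have no node and are skipped.
        if row["prirep"] == "p" and row.get("node") in node_to_host:
            counts[node_to_host[row["node"]]] += 1
    return dict(counts)


# Made-up layout: 4 hosts x 6 disks = 24 data nodes (host1-disk1 ... host4-disk6).
node_to_host = {
    f"host{h}-disk{d}": f"host{h}" for h in range(1, 5) for d in range(1, 7)
}

# Made-up, deliberately skewed placement of the 12 primaries.
sample_rows = (
    [{"index": "logs", "shard": str(i), "prirep": "p", "node": f"host1-disk{i + 1}"}
     for i in range(6)]
    + [{"index": "logs", "shard": str(i + 6), "prirep": "p", "node": f"host2-disk{i + 1}"}
       for i in range(4)]
    + [{"index": "logs", "shard": "10", "prirep": "p", "node": "host3-disk1"},
       {"index": "logs", "shard": "11", "prirep": "p", "node": "host4-disk1"}]
)

for host, count in sorted(primaries_per_host(sample_rows, node_to_host).items()):
    print(host, count)
```

In this sample, host1 carries 6 primaries while host3 and host4 carry 1 each, which is the kind of skew described above; the target would be 3 per host.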

I have seen the ‘cluster.routing.allocation.balance.prefer_primary’ setting, but as each disk instance is essentially a node, this isn't helping.
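
For reference, the two cluster-level settings named so far could be applied together with a request body like the one built below. This is only a sketch: the setting names are the ones quoted in this post, the values are illustrative, and whether "persistent" is the right scope depends on your setup.

```python
import json

# Setting names as quoted in the post; values are illustrative assumptions.
cluster_settings = {
    "persistent": {
        # Avoid placing two copies of the same shard on nodes that share a host.
        "cluster.routing.allocation.same_shard.host": True,
        # Ask the balancer to even out primaries across nodes (not hosts).
        "cluster.routing.allocation.balance.prefer_primary": True,
    }
}

# Request body for: PUT _cluster/settings
print(json.dumps(cluster_settings, indent=2))
```

Neither setting takes a host attribute as a weight, so they balance per node rather than per host.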

I think you can try increasing the number of primary shards to 24. Each node could then hold an equal number of primary shards, and the hosts would be balanced. The best practice for sharding is to make the number of primary shards a multiple (N times) of the number of nodes.

I'm wondering how the balancing algorithm works.

24 nodes.
24 primaries, 24 replicas

Would the primary shards get balanced first and then the replicas? If so, your suggestion might work. If not, then I could still end up with more primaries on the same node. I was hoping for some way to influence the balancing decision a little.

If the number of primary shards is a multiple (N times) of the number of nodes, there is a better chance that each node ends up with an equal number of primary shards, even though both primaries and replicas are considered in shard allocation.
For your case, the index-level setting index.routing.allocation.total_shards_per_node can be used to control the number of shards of an index that are allocated to a single node. For example: (1) set the number of replicas to 0; (2) set total_shards_per_node to 1; (3) after the shard movement finishes, set the number of replicas back to 1. Each node then holds only one shard of the index, which has 12 primary shards and 1 replica.
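
The three steps above can be sketched as an ordered list of REST requests. This is a rough illustration, not a tested procedure: the index name my-index and the helper rebalance_steps are hypothetical, and the health check with wait_for_no_relocating_shards is just one way to wait for shard movement to finish. The code only builds and prints the requests; sending them is left to whatever client you use.

```python
import json


def rebalance_steps(index, replicas=1, shards_per_node=1):
    """Return the (method, path, body) requests for the steps above, in order."""
    settings_path = f"/{index}/_settings"
    return [
        # 1. Drop replicas so only primaries remain.
        ("PUT", settings_path, {"index.number_of_replicas": 0}),
        # 2. Cap how many shards of this index a single node may hold.
        ("PUT", settings_path,
         {"index.routing.allocation.total_shards_per_node": shards_per_node}),
        # Wait for shard movement to finish before restoring replicas.
        ("GET", "/_cluster/health?wait_for_no_relocating_shards=true&timeout=60s", None),
        # 3. Bring the replicas back.
        ("PUT", settings_path, {"index.number_of_replicas": replicas}),
    ]


# "my-index" is a placeholder index name.
for method, path, body in rebalance_steps("my-index"):
    print(method, path, "" if body is None else json.dumps(body))
```

Keep in mind that with 24 shards on 24 nodes and a limit of 1 per node there is no headroom: if a node goes down, its shard cannot be reassigned until the node returns or the limit is raised.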