How to limit the number of threads for ONNX in OpenSearch

Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):
OpenSearch 2.17.0

Describe the issue:
I am running OpenSearch on a system with 64 cores, and pstack on the OpenSearch JVM shows a large number of ONNX threads. These threads use a lot of CPU time during a neural search. Is there any way to limit the number of ONNX threads in OpenSearch?

Configuration:
All default. I tried setting -XX:ActiveProcessorCount=1 in jvm.options, but that didn’t reduce the number of ONNX threads. It seems these ONNX threads are not managed by the JVM (they are started by the ONNX library).
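For anyone who wants to check whether a setting actually changes the thread count, here is a minimal sketch in Python that counts the native threads of a process by name. It assumes Linux, where each thread appears under /proc/<pid>/task, and the function name count_threads_by_name is only illustrative. It makes no assumption about what the ONNX threads are called, since a native thread may simply inherit the name of the thread that created it.

```python
import os
from collections import Counter
from pathlib import Path


def count_threads_by_name(pid: int) -> Counter:
    """Count the native threads of a process, grouped by thread name (Linux only)."""
    counts = Counter()
    task_dir = Path("/proc") / str(pid) / "task"
    for task in task_dir.iterdir():
        try:
            name = (task / "comm").read_text().strip()
        except OSError:
            # The thread exited between listing and reading; skip it.
            continue
        counts[name] += 1
    return counts


if __name__ == "__main__":
    # Demo on the current process; pass the OpenSearch JVM's PID instead.
    if Path(f"/proc/{os.getpid()}/task").is_dir():
        for name, n in count_threads_by_name(os.getpid()).most_common():
            print(f"{n:5d}  {name}")
```

Running it before and after a configuration change, against the PID of the OpenSearch JVM, gives a quick comparison of the totals without reading through a full pstack dump.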

Relevant Logs or Screenshots:

Which queries were you running when your OpenSearch cluster had the CPU usage problem?

The first time your cluster encounters ONNX is when you deploy ML models to the nodes.

I ran neural queries, each of which takes a one-line question, generates its embedding on the fly (using ONNX), and then runs a search using that embedding.
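For reference, a query of this kind has roughly the following shape. This is a minimal sketch in Python that only builds the request body for an OpenSearch neural query. The field name passage_embedding, the question text, and the placeholder model ID are assumptions for illustration, not values from my cluster.

```python
import json


def build_neural_query(field: str, query_text: str, model_id: str, k: int = 10) -> dict:
    """Build the body of a neural query against one vector field."""
    return {
        "query": {
            "neural": {
                field: {
                    "query_text": query_text,  # embedded on the fly by the deployed model
                    "model_id": model_id,      # ID of the deployed text-embedding model
                    "k": k,                    # number of nearest neighbors to return
                }
            }
        }
    }


# Hypothetical field name and model ID, for illustration only.
body = build_neural_query("passage_embedding", "how do I limit ONNX threads?", "<model_id>")
print(json.dumps(body, indent=2))
```

Each such request makes the node run the model once to embed query_text, which is the step where the ONNX threads become busy.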

I ran the query repeatedly, and the high CPU usage appeared whenever the query was executing. The threads that consumed the most CPU were the (many) ONNX threads, and I want to find a way to limit their number. Currently it appears that the number of ONNX threads is based on the number of cores on the system, which is why I got so many of them (I have 64 cores).