Describe the issue:
I am running OpenSearch on a system with 64 cores, and pstack on the OpenSearch JVM shows a large number of ONNX threads. These ONNX threads use a lot of CPU time during a neural search. Is there any way to limit the number of ONNX threads in OpenSearch?
Configuration:
All default. I tried setting -XX:ActiveProcessorCount=1 in jvm.options, but that didn’t reduce the number of ONNX threads. It seems these ONNX threads are not managed by the JVM (they are started by the ONNX library).
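For reference, here is a minimal sketch of what that flag is documented to control: the processor count the JVM itself reports through `Runtime.availableProcessors()`. The class and method names (`ProcessorCountCheck`, `jvmVisibleProcessors`) are made up for illustration. That a native library would ignore this value and size its thread pool from the OS core count is my assumption, based on the behavior described above.

```java
public class ProcessorCountCheck {

    // Processor count as seen by the JVM. -XX:ActiveProcessorCount overrides
    // this value, but only for code that asks the JVM. Assumption: a native
    // library loaded into the process may query the OS directly instead.
    static int jvmVisibleProcessors() {
        return Runtime.getRuntime().availableProcessors();
    }

    public static void main(String[] args) {
        System.out.println("JVM-visible processors: " + jvmVisibleProcessors());
    }
}
```

Running this with `java -XX:ActiveProcessorCount=1 ProcessorCountCheck.java` should report 1 even on the 64-core machine. That would be consistent with the flag taking effect inside the JVM while the ONNX thread count stays unchanged.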
I ran neural queries, each of which takes a one-line question, generates its embedding on the fly (using ONNX), and then runs a search with that embedding.
I ran the query repeatedly, and CPU usage was high whenever the query was executing. The threads that consumed the most CPU were the (many) ONNX threads, and I want to find a way to limit their number. Currently the number of ONNX threads appears to be based on the number of cores on the system, which would explain why I get so many of them (I have 64 cores).