I was wondering if there is any way to enforce a limit on the number of docs a reranker can handle per request? After registering a model, I would like to be able to set the maximum number of docs that a single request to that model can contain.
As you know, rerank models are compute-intensive, and having no per-model limit could slow down or even crash ML nodes.
At the moment, no. The best workaround I can recommend today is to include a truncation processor in your search pipeline before the rerank processor, so that the number of docs in the search response is capped before they reach the model.
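In case it helps, here is a rough sketch of what such a pipeline body could look like, built in Python so it can be printed as JSON. It assumes the response processors are named `truncate_hits` (with a `target_size` option) and `rerank` (with `ml_opensearch.model_id` and `context.document_fields`), as I recall them from the OpenSearch docs, so please check them against your version. The helper name `build_rerank_pipeline`, the `<your-model-id>` placeholder, and the `passage_text` field are made up for illustration.

```python
import json


def build_rerank_pipeline(model_id, max_docs, document_fields):
    """Build a search pipeline body that truncates hits before reranking.

    Processor and option names are assumptions taken from the OpenSearch
    documentation as remembered; verify them for your cluster version.
    """
    if max_docs < 1:
        raise ValueError("max_docs must be at least 1")
    return {
        "response_processors": [
            # Runs first: keep at most max_docs hits in the response.
            {"truncate_hits": {"target_size": max_docs}},
            # Runs second: the reranker only sees the truncated hits.
            {
                "rerank": {
                    "ml_opensearch": {"model_id": model_id},
                    "context": {"document_fields": list(document_fields)},
                }
            },
        ]
    }


if __name__ == "__main__":
    # Hypothetical values; replace with your own model ID and field name.
    body = build_rerank_pipeline("<your-model-id>", 20, ["passage_text"])
    print(json.dumps(body, indent=2))
```

The printed JSON would be the body of a `PUT /_search/pipeline/<pipeline-name>` request. Note that this caps docs per pipeline, not per model, so any request that reaches the model without going through this pipeline is not limited.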