The tasks that I find helpful are:
1. QA
2. Document summarization
3. Zero/few-shot classification
4. Image embedding
I also suggest providing a multi-predict API, similar to multi-search. Batching several predictions into one call would reduce the number of requests to the API and thus the load.
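To make the multi-predict idea concrete, here is a rough sketch of what a batched request body could look like. It assumes "multi-search" means an NDJSON-style request made of header/body line pairs, and it borrows that layout. The function name `build_multi_predict_body` and the fields `model_id` and `docs` are hypothetical, chosen only for illustration; no such endpoint or schema exists as far as I know.

```python
import json


def build_multi_predict_body(requests):
    """Serialize several predict requests into one NDJSON body.

    Hypothetical format: each request becomes two lines, a header line
    naming the model and a body line carrying the inputs.
    """
    lines = []
    for req in requests:
        header = {"model_id": req["model_id"]}  # hypothetical field name
        body = {"docs": req["docs"]}            # hypothetical field name
        lines.append(json.dumps(header))
        lines.append(json.dumps(body))
    # NDJSON bodies end with a trailing newline.
    return "\n".join(lines) + "\n"


# Two predictions for two different tasks, sent as a single payload.
example_requests = [
    {
        "model_id": "qa-model",
        "docs": [{"question": "Who wrote it?", "context": "Ann wrote it."}],
    },
    {
        "model_id": "summarization-model",
        "docs": [{"text": "A long document to summarize."}],
    },
]

payload = build_multi_predict_body(example_requests)
print(payload)
```

With something like this, a client that needs QA and summarization results for the same page would make one round trip instead of two.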
The out-of-the-box HuggingFace QA was very good in a general context. I haven’t tested it deeply in specialized contexts such as healthcare, but I can see there will be a need for fine-tuning at some point.