Hi @zhichao-aws,
Our main pain points are around ingestion throughput (1) and search latency (3), mainly caused by the shared thread pool in OpenSearch.
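For anyone hitting the same issue, the contention is easy to observe with the `_cat/thread_pool` API. Here's a minimal sketch (endpoint URL and credentials are placeholders for a local dev cluster, adjust for your setup) that watches the search and write pools queue up while a bulk ingest runs alongside neural sparse queries:

```python
import requests

OPENSEARCH_URL = "https://localhost:9200"  # assumption: local dev cluster
AUTH = ("admin", "admin")                  # assumption: default dev credentials

# Poll thread pool stats; rising queue/rejected counts on the search and
# write pools indicate ingestion and search competing for the same resources.
resp = requests.get(
    f"{OPENSEARCH_URL}/_cat/thread_pool",
    params={"v": "true", "h": "node_name,name,active,queue,rejected"},
    auth=AUTH,
    verify=False,  # dev-only: skip TLS verification for self-signed certs
)
resp.raise_for_status()
print(resp.text)
```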
Another one was 4b: it was not obvious from the docs that the model cannot be deployed inside OpenSearch and has to be deployed in SageMaker instead.
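Roughly, the flow we ended up with was: host the sparse encoder on a SageMaker endpoint, then register it in OpenSearch as a *remote* model through an ML Commons connector. A hedged sketch below; the connector name, endpoint URL, credentials, and `request_body` template are all placeholders, so check the ML Commons connector docs for the exact payload your model expects:

```python
import requests

OPENSEARCH_URL = "https://localhost:9200"
AUTH = ("admin", "admin")  # assumption: basic-auth dev cluster

# 1) Create a connector pointing at the SageMaker inference endpoint.
connector = requests.post(
    f"{OPENSEARCH_URL}/_plugins/_ml/connectors/_create",
    auth=AUTH,
    verify=False,  # dev-only
    json={
        "name": "sagemaker-sparse-encoder",  # hypothetical name
        "protocol": "aws_sigv4",
        "parameters": {"region": "us-east-1", "service_name": "sagemaker"},
        "credential": {"access_key": "...", "secret_key": "..."},
        "actions": [{
            "action_type": "predict",
            "method": "POST",
            "url": "https://runtime.sagemaker.us-east-1.amazonaws.com/endpoints/<your-endpoint>/invocations",
            "request_body": "${parameters.input}",  # template depends on your model
        }],
    },
).json()

# 2) Register the remote model against that connector. The response carries
# a model_id (or a task to poll), which you then pass to
# /_plugins/_ml/models/<model_id>/_deploy before using it in a pipeline.
model = requests.post(
    f"{OPENSEARCH_URL}/_plugins/_ml/models/_register",
    auth=AUTH,
    verify=False,  # dev-only
    json={
        "name": "neural-sparse-remote",  # hypothetical name
        "function_name": "remote",
        "connector_id": connector["connector_id"],
    },
).json()
```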
Relevant forum threads with additional details:
- model registration / deployment and a bit of performance details: How to register sparse encoding model in AWS OpenSearch - #15 by darvel
- ingestion and search performance: How to scale neural sparse ingestion pipeline
Thanks for working on this!