Best Practices for Deploying OpenSearch in a Large-Scale Environment

Hello Everyone :hugs:,

I’m working on a project that involves deploying OpenSearch at large scale, and I’m looking for best practices from people who have run similar deployments.

We intend to use OpenSearch for its full-text search, log analytics, and monitoring features because our system generates a large volume of data.

Here are the details of our environment:

  • Data Volume: We expect to index multiple terabytes of data every day.
  • Cluster Size: We will start with a 20-node cluster and scale it up as necessary.
  • Data Retention: We must keep data for at least a year, which raises storage and performance concerns.
  • Query Load: Our system should handle thousands of queries per second, of varying complexity.
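
To give a rough sense of scale, here is a back-of-the-envelope sketch of what these requirements might imply. Every number in it other than the 20 nodes and the one-year retention is a placeholder I made up for illustration (5 TB/day, one replica, a 40 GB target shard size), and it ignores compression, index overhead, and disk headroom, so please treat it as a starting point for discussion rather than our actual sizing.

```python
import math
from dataclasses import dataclass


@dataclass
class SizingAssumptions:
    # Placeholder values for illustration only, not measured figures.
    daily_ingest_tb: float = 5.0    # assumed; we only know "multiple TB/day"
    retention_days: int = 365       # from the requirement: at least one year
    replicas: int = 1               # assumed replica count
    target_shard_gb: float = 40.0   # assumed target size per primary shard
    nodes: int = 20                 # from the requirement: initial cluster size


def estimate(a: SizingAssumptions) -> dict:
    """Naive estimate: raw data size only, no compression or overhead."""
    primary_tb = a.daily_ingest_tb * a.retention_days
    total_tb = primary_tb * (1 + a.replicas)
    primaries_per_day = math.ceil(a.daily_ingest_tb * 1024 / a.target_shard_gb)
    total_shards = primaries_per_day * (1 + a.replicas) * a.retention_days
    return {
        "primary_tb": primary_tb,
        "total_tb": total_tb,
        "per_node_tb": total_tb / a.nodes,
        "primaries_per_day": primaries_per_day,
        "total_shards": total_shards,
    }


if __name__ == "__main__":
    for key, value in estimate(SizingAssumptions()).items():
        print(f"{key}: {value}")
```

Even with these made-up inputs, the per-node storage and total shard count get large quickly if everything stays on the same 20 nodes for a full year, which is why the retention and archiving questions below matter to us.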

I have some questions in light of these requirements:

Cluster Configuration: How should a large OpenSearch cluster be configured? :thinking: Are there any particular settings or optimisations we should be aware of to ensure stability and performance? :thinking:

Indexing Strategy: What is the best way to index this volume of data? Are there best practices for indexing throughput, replication, and shard allocation? :thinking:

Data Retention & Management: Which techniques do you suggest for managing long-term data retention without hurting performance? :thinking: Are there effective methods for archiving older data? :thinking:

Monitoring and Maintenance: Which tools and methods are best suited to monitoring the performance and health of an OpenSearch cluster this size? :thinking: How should we handle maintenance tasks such as node failures and shard rebalancing? :thinking:

I followed this :point_right: https://opensearch.org/docs/latest/search-plugins/knn/performance-tuning/minitab

Any insights, firsthand experience, or helpful links would be greatly appreciated. Our goal is to build a stable and efficient OpenSearch deployment, and we would love to learn from the community.

Thank you :pray: in advance.