OpenSearch Query Timeout Error When Setting Up Alerting

I’m encountering timeout errors while executing queries in my OpenSearch environment, and I could use some guidance on troubleshooting and resolving this issue.
I’m using the built-in alerting functionality in OpenSearch to monitor specific conditions, such as messages in the data where Level=“error”.
The query is below:
{
  "timeout": "60m",
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "fields.Context.TimeStamp": {
              "from": "now-60m",
              "to": "now",
              "include_lower": true,
              "include_upper": true
            }
          }
        },
        {
          "term": {
            "level.keyword": "Error"
          }
        }
      ]
    }
  }
}
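
For comparison, here is a minimal Python sketch that builds what should be an equivalent request body using the documented `gte`/`lte` range bounds instead of `from`/`to`/`include_lower`/`include_upper`. The function name `build_alert_query`, the `30s` timeout, the switch from `must` to `filter`, and `"size": 0` are illustrative assumptions, not part of the original monitor. It only constructs and prints the JSON, so no cluster connection is needed.

```python
import json


def build_alert_query(level="Error", window="60m", timeout="30s"):
    """Build the monitor query body as a plain dict (hypothetical helper)."""
    return {
        # Assumption: a much shorter search timeout than the original 60m.
        "timeout": timeout,
        # Assumption: the trigger only needs the hit count, not the documents.
        "size": 0,
        "query": {
            "bool": {
                # Filter context skips relevance scoring, which neither
                # clause needs for a yes/no match.
                "filter": [
                    {
                        "range": {
                            "fields.Context.TimeStamp": {
                                "gte": f"now-{window}",
                                "lte": "now",
                            }
                        }
                    },
                    # A term query on a keyword field matches the exact
                    # stored value, so "Error" and "error" are different.
                    {"term": {"level.keyword": level}},
                ]
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_alert_query(), indent=2))
```

The printed JSON can be pasted into the monitor's extraction query editor or into Dev Tools to compare its behaviour with the original query.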

Specific Questions:

  1. What could be causing these timeout errors in my OpenSearch queries?
  2. Are there any optimizations I can apply to my queries or cluster configuration to improve performance and prevent timeouts?
  3. How can I effectively monitor and diagnose performance issues in my OpenSearch cluster?
  4. Are there any best practices for handling frequent data inserts and querying in OpenSearch?
  • I’ve already checked the cluster’s resource utilization and confirmed that there are no significant spikes in CPU or memory usage during query execution.
  • I’ve reviewed the OpenSearch documentation and forums but haven’t found a solution that addresses my specific issue.

Any insights or suggestions would be greatly appreciated. Thank you in advance for your assistance!

Hey @esmart2024

Do you have enough heap configured on your instance?

Can you share your logs from when this issue occurs, and what do your resources look like?

Yes.
I’m testing with sample data. The problem is just that the alert count is not increasing.

Hey @esmart2024

So I fired up a new OS/OSD instance, copied your query into mine, and I received this…

Do you have the same? If so what are your resources on your instance?

During my test I can’t reproduce your timeout error.

I forgot to ask: in OpenSearch or OpenSearch-Dashboards, do you see anything in the log files that would pertain to this issue?

A timeout issue could be caused by a couple of things.

No.

You’re getting this because you don’t have any data with level=error.
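
One way to check this is to list which values actually exist in `level.keyword` for the same time window. A `term` query on a keyword field matches the exact stored value, including case (assuming no normalizer is configured on the field), so `Error` and `error` are different terms. The sketch below builds a terms aggregation request and summarizes its response. The helper names `build_level_breakdown` and `summarize_levels` and the aggregation name `levels` are made up for illustration.

```python
import json


def build_level_breakdown(window="60m", field="level.keyword"):
    """Request body that counts documents per distinct level value."""
    return {
        "size": 0,  # only the aggregation buckets are needed
        "query": {
            "range": {
                "fields.Context.TimeStamp": {
                    "gte": f"now-{window}",
                    "lte": "now",
                }
            }
        },
        "aggs": {"levels": {"terms": {"field": field, "size": 20}}},
    }


def summarize_levels(response):
    """Map each level value to its document count from a search response."""
    buckets = response.get("aggregations", {}).get("levels", {}).get("buckets", [])
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}


if __name__ == "__main__":
    # Paste this body into Dev Tools against the index the monitor uses.
    print(json.dumps(build_level_breakdown(), indent=2))
```

If the summary shows no `Error` key, or only a differently cased one such as `error`, the monitor query returns zero hits and the trigger never fires, which would explain an alert count that does not increase.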

Hey @esmart2024

Can you show us what your full configuration looks like?