After reading the alerting docs, I understand how to monitor for a particular event in a doc and then fire an alert depending on how many docs match that event.
An example is below: it matches env_name and level in any index with my_index in its name. If it finds more than 50 occurrences in 10 minutes, it fires an alert.
{
  "type": "monitor",
  "name": "api ERRORs",
  "enabled": false,
  "schedule": {
    "period": {
      "interval": 10,
      "unit": "MINUTES"
    }
  },
  "inputs": [{
    "search": {
      "indices": ["*my_index*"],
      "query": {
        "size": 0,
        "query": {
          "bool": {
            "must": [
              { "match": { "env_name": "dev" }},
              { "match": { "level": "ERROR" }}
            ],
            "filter": [
              { "range": { "@timestamp": { "gte": "{{period_end}}||-10m", "lte": "{{period_end}}", "format": "epoch_millis", "boost": 1.0 }}}
            ]
          }
        }
      }
    }
  }],
  "triggers": [{
    "name": "api ERRORs > 50",
    "severity": "5",
    "condition": {
      "script": {
        "source": "ctx.results[0].hits.total.value > 50",
        "lang": "painless"
      }
    },
    "actions": [{
      "name": "Send Slack to channel alerts-dev",
      "destination_id": "<destination removed >",
      "message_template": {
        "source": "Monitor {{ctx.monitor.name}} just entered alert status. Please investigate the issue.\n - Trigger: {{ctx.trigger.name}}\n - Severity: {{ctx.trigger.severity}}\n - Period start: {{ctx.periodStart}}\n - Period end: {{ctx.periodEnd}}"
      },
      "throttle_enabled": true,
      "throttle": {
        "value": 120,
        "unit": "MINUTES"
      },
      "subject_template": {
        "source": " api ERRORS > 50 in the last 10 minutes"
      }
    }]
  }]
}
What I am trying to work out is whether it is possible to alert on a percentage increase in the number of docs with level=ERROR, e.g. when, in the last 5 minutes, there has been a 20% increase in the number of docs that match the query.
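To make the condition concrete, here is a rough sketch in plain Python of the check I have in mind. It is not alerting-plugin code. The function names are hypothetical, and comparing against the immediately preceding 5-minute window is my own assumption about what the baseline should be.

```python
def percent_increase(previous_count: int, current_count: int) -> float:
    """Percentage change from previous_count to current_count."""
    if previous_count == 0:
        # Assumption: with no baseline, any new errors count as an
        # unbounded increase, and zero errors count as no increase.
        return float("inf") if current_count > 0 else 0.0
    return (current_count - previous_count) * 100.0 / previous_count


def should_alert(previous_count: int, current_count: int,
                 threshold_pct: float = 20.0) -> bool:
    """True when the increase meets or exceeds the threshold percentage."""
    return percent_increase(previous_count, current_count) >= threshold_pct


# previous_count: matching docs in the earlier 5-minute window
# current_count:  matching docs in the most recent 5-minute window
print(should_alert(previous_count=100, current_count=130))
print(should_alert(previous_count=100, current_count=110))
```

So the question is really how to get both counts into a single monitor and express this comparison in the trigger condition.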