Count terms in features of anomaly detectors

Elmux · November 14, 2020, 8:47pm

Hi guys

I have a use case to detect anomalies in log files of denied remote access. I would create a detector with a filter on the corresponding log message and add a feature with a “count” aggregation on the client_ip field.

According to the YouTube video, it is not possible to count terms such as IP addresses in features, only numeric values. Is that still true? I think that with the “count” aggregation, which is mapped to the Elasticsearch “value_count” aggregation, it should be possible to count such non-numeric fields.

Thanks for the clarification.

Kind regards
Elmar

ylwu · November 18, 2020, 5:57pm

You can use any field type that the ES count aggregation supports by defining the feature with a custom expression.
[Screenshot: Screen Shot 2020-11-18 at 9.56.10 AM]
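
For readers without access to the screenshot, here is a minimal sketch of what such a feature definition might look like, written as a Python dict that serializes to the JSON the detector expects. The `feature_name` / `feature_enabled` / `aggregation_query` layout is assumed from the anomaly detection plugin's detector API, and the name `denied_access_count` is made up for the example; only `client_ip` and `value_count` come from the question above.

```python
import json


def count_feature(feature_name: str, field: str) -> dict:
    """Build a detector feature that counts the values of `field`.

    value_count returns a single number per interval, which is the
    shape a feature query needs.
    """
    return {
        "feature_name": feature_name,
        "feature_enabled": True,
        "aggregation_query": {
            # The aggregation name is arbitrary; reusing the feature name
            # keeps the response easy to read.
            feature_name: {"value_count": {"field": field}}
        },
    }


# "denied_access_count" is a hypothetical name; client_ip is the field
# from the original question.
feature = count_feature("denied_access_count", "client_ip")
print(json.dumps(feature, indent=2))
```

Note that `value_count` counts how many values were seen in the interval, not how many distinct IP addresses there were.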

Elmux · November 20, 2020, 7:54am

Thanks for the answer.
Is it also possible to use a terms aggregation?

"status_codes": {
  "terms": {
    "field" : "status_code"
  }
}

Is the anomaly detection engine able to handle a buckets array with values? See result:

"status_codes": {
  "doc_count_error_upper_bound" : 0,
  "sum_other_doc_count" : 53,
  "buckets" : [
    {
      "key" : 200,
      "doc_count" : 4583
    },
    {
      "key" : 301,
      "doc_count" : 4501
    },
    [...]
  ]
}

ylwu · December 1, 2020, 2:08pm

Currently the feature query only supports single-value aggregations. That means the aggregation must return exactly one numeric value, e.g. max/min/sum/average/count. You can’t use a terms aggregation, because it returns a bucket array.
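
To make the distinction concrete, here is a small sketch that compares the two response shapes. The helper `is_single_value` is hypothetical and is not part of the plugin; it only illustrates the rule described above. The terms result reuses the two buckets quoted earlier in the thread, and the count value is an illustrative number.

```python
from numbers import Number


def is_single_value(agg_result: dict) -> bool:
    """Return True if an aggregation result carries exactly one numeric value."""
    if "buckets" in agg_result:
        return False
    value = agg_result.get("value")
    # bool is a Number subclass in Python, so exclude it explicitly.
    return isinstance(value, Number) and not isinstance(value, bool)


# Shape of a value_count response: one number (the figure is illustrative).
count_result = {"value": 9084}

# Shape of the terms response quoted earlier in the thread (first two buckets).
terms_result = {
    "doc_count_error_upper_bound": 0,
    "sum_other_doc_count": 53,
    "buckets": [
        {"key": 200, "doc_count": 4583},
        {"key": 301, "doc_count": 4501},
    ],
}

print(is_single_value(count_result))  # True
print(is_single_value(terms_result))  # False
```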

Elmux · December 2, 2020, 6:26am

Thanks for the clarification!

bpavani · February 8, 2021, 3:07pm

Hi @Elmux,

Have you tried the high-cardinality feature? I believe that should address your requirement. Let me know if it doesn’t.

Thanks,
Pavani

llermaly · November 14, 2022, 5:15pm

What about adding a category breakdown on top of the count() function?

I’m trying to detect anomalies in the document counts of security logs for each category (INDEX, login failed, BAD SSL, etc.).

Thanks

kaituo · December 13, 2022, 9:37pm

You can use our high-cardinality detector by specifying the error category as the category field.
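
A rough sketch of such a detector body follows, assuming the OpenSearch detector API (`POST _plugins/_anomaly_detection/detectors`). The index pattern `security-logs-*`, the timestamp field `@timestamp`, and the category field `error_category` are all hypothetical names; substitute whatever the logs actually use. As far as the documentation indicates, the category field needs to be of type keyword or IP.

```python
import json

# Hypothetical names: index pattern, timestamp field, and category field.
detector = {
    "name": "security-log-count-by-category",
    "description": "Document count per error category",
    "time_field": "@timestamp",
    "indices": ["security-logs-*"],
    "feature_attributes": [
        {
            "feature_name": "doc_count",
            "feature_enabled": True,
            # Single-value aggregation: counts documents that have a timestamp.
            "aggregation_query": {
                "doc_count": {"value_count": {"field": "@timestamp"}}
            },
        }
    ],
    "detection_interval": {"period": {"interval": 10, "unit": "Minutes"}},
    "window_delay": {"period": {"interval": 1, "unit": "Minutes"}},
    # Each distinct value of this field is modeled separately.
    "category_field": ["error_category"],
}

print(json.dumps(detector, indent=2))
```

The feature itself stays a plain count; the per-category breakdown comes from `category_field`, so no terms aggregation is needed in the feature query.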
