Hi,
I have indices where we record users’ accesses to our system, and I would like to roll them up into aggregated statistics that count the number of accesses per user (per hour).
Technically, this means ‘select username, count(*) group by username’ or, in ES terms, a ‘terms’ aggregation on ‘username’ followed by a count.
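For illustration, here is a rough sketch of that aggregation as a search body against the raw index, written as a Python dict. The timestamp field name (`@timestamp`), the bucket limit and the aggregation names `per_hour` / `per_user` are assumptions made for the example; only `username` comes from the description above. Each `per_user` bucket's `doc_count` would be the number of accesses for that user in that hour.

```python
import json


def accesses_per_user_query(timestamp_field="@timestamp",
                            user_field="username",
                            max_users=1000):
    """Build a search body that buckets documents by hour, then by user."""
    return {
        "size": 0,  # return aggregation results only, no individual hits
        "aggs": {
            "per_hour": {
                "date_histogram": {
                    "field": timestamp_field,  # assumed field name
                    "fixed_interval": "1h",
                },
                "aggs": {
                    # one bucket per user; doc_count is the access count
                    "per_user": {
                        "terms": {"field": user_field, "size": max_users}
                    }
                },
            }
        },
    }


print(json.dumps(accesses_per_user_query(), indent=2))
```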
Is there a way to implement this using rollup jobs? From what I see in the UI, it only seems possible to aggregate over numeric fields, not to count events or values.
Thanks!
Although the UI validation only allows numeric fields for the value_count aggregation, you can create a rollup job with a value_count aggregation on a keyword field using the REST API. I think the UI needs to be corrected to allow value_count on keyword fields.
The example above can be implemented as a rollup job as follows:
{
  "rollup": {
    "enabled": true,
    "schedule": {
      "interval": {
        "period": 1,
        "unit": "Minutes",
        "start_time": 1602100553
      }
    },
    "last_updated_time": 1602100553,
    "description": "A sample rollup",
    "source_index": "sample_source_index",
    "target_index": "sample_target_index",
    "page_size": 1000,
    "delay": 0,
    "continuous": false,
    "dimensions": [
      {
        "date_histogram": {
          "source_field": "sample_timestamp_field",
          "fixed_interval": "60m",
          "timezone": "America/Los_Angeles"
        }
      },
      {
        "terms": {
          "source_field": "username"
        }
      }
    ],
    "metrics": [
      {
        "source_field": "username",
        "metrics": [{ "value_count": {} }]
      }
    ]
  }
}
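As a rough sketch of the REST call, the snippet below builds the PUT request with Python's standard library. It assumes the OpenSearch endpoint `_plugins/_rollup/jobs/<rollup_id>` (Open Distro uses `_opendistro/_rollup/jobs/<rollup_id>` instead), an unsecured cluster at `http://localhost:9200`, and a made-up job ID `user-access-hourly`; adjust all three for your setup. The job body is abbreviated to the fields that matter for the count.

```python
import json
import urllib.request


def build_put_rollup_request(base_url, rollup_id, job):
    """Return a PUT request that would create the rollup job `rollup_id`."""
    url = f"{base_url.rstrip('/')}/_plugins/_rollup/jobs/{rollup_id}"
    return urllib.request.Request(
        url,
        data=json.dumps(job).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Abbreviated body; in practice send the full job definition shown above.
job = {
    "rollup": {
        "source_index": "sample_source_index",
        "target_index": "sample_target_index",
        "dimensions": [
            {"date_histogram": {"source_field": "sample_timestamp_field",
                                "fixed_interval": "60m"}},
            {"terms": {"source_field": "username"}},
        ],
        "metrics": [
            # value_count on the keyword field gives the per-user count
            {"source_field": "username", "metrics": [{"value_count": {}}]}
        ],
    }
}

# Host and job ID are placeholders for this example.
request = build_put_rollup_request("http://localhost:9200",
                                   "user-access-hourly", job)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```

The request is only built here, not sent, so the snippet can be run without a cluster to check the URL and payload first.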
@thalurur, in my case, with OpenSearch 1.3.2, “value_count” is not working.
Here is my rollup job definition:
{
  "rollup": {
    "rollup_id": "java-cattle-summary",
    "enabled": true,
    "schedule": {
      "cron": {
        "expression": "53 08 * * *",
        "timezone": "America/Sao_Paulo"
      }
    },
    "description": "Eventos de log Java",
    "source_index": "java-*",
    "target_index": "history-java-cattle",
    "page_size": 1000,
    "delay": 0,
    "continuous": false,
    "dimensions": [
      {
        "date_histogram": {
          "fixed_interval": "1d",
          "source_field": "@timestamp",
          "timezone": "UTC"
        }
      },
      {
        "terms": {
          "source_field": "cattle_env"
        }
      },
      {
        "terms": {
          "source_field": "level"
        }
      },
      {
        "terms": {
          "source_field": "stack_service"
        }
      },
      {
        "terms": {
          "source_field": "logger"
        }
      }
    ],
    "metrics": [
      {
        "source_field": "logger",
        "metrics": [
          {
            "value_count": {}
          }
        ]
      }
    ]
  }
}