In trying out the new rollup APIs, I’m having trouble understanding what is actually happening.
What I want is to roll up an hour's worth of documents into an hourly index where I can then see the results of each individual hour's rollup. It would be fine if these were in separate indices (for example, hourly-2021-05-11-0300 for 0300 UTC today) or if they were all in hourly and I had to use some kind of time-based query on them.
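To make the two options concrete, here is a minimal Python sketch of what I have in mind. The function names are hypothetical, and the range query is ordinary query DSL; I am assuming the rolled-up documents would still carry a timestamp under the same field name, which is part of what I am unsure about.

```python
from datetime import datetime, timedelta, timezone


def hourly_index_name(ts: datetime, prefix: str = "hourly") -> str:
    """Name of the per-hour index for the UTC hour containing ts.

    ts must be timezone-aware. The naming scheme is my own illustration.
    """
    ts_utc = ts.astimezone(timezone.utc)
    return f"{prefix}-{ts_utc:%Y-%m-%d-%H}00"


def hour_range_query(ts: datetime, field: str = "@timestamp") -> dict:
    """Range query body selecting the UTC hour containing ts.

    Assumes the target index keeps a date field called `field`.
    """
    start = ts.astimezone(timezone.utc).replace(minute=0, second=0, microsecond=0)
    end = start + timedelta(hours=1)
    return {
        "query": {
            "range": {field: {"gte": start.isoformat(), "lt": end.isoformat()}}
        }
    }


when = datetime(2021, 5, 11, 3, 27, tzinfo=timezone.utc)
print(hourly_index_name(when))
print(hour_range_query(when))
```

Either approach would work for me: one index per hour, or a single hourly index filtered with a query like the second function builds.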
I tried to configure this, but it doesn’t seem to be doing that.
I used source-* as my source index pattern, where the * matches the rollover IDs. I used hourly for the target index. I set the time aggregation to use the @timestamp field with a calendar interval of 1 hour in timezone UTC.
I added a couple of fields to the aggregations and metrics and told it to run every hour, with an execution delay of 5 minutes to account for delays in processing incoming records.
It’s been running for over 24 hours. There are at least a billion records in the source indices, but probably fewer than 2 billion.
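For clarity, this is how I am assuming the delay is meant to work; it is my mental model, not confirmed behavior of the rollup job. The sketch below (hypothetical function name) computes the most recent hour that is fully closed once the delay is subtracted from the current time.

```python
from datetime import datetime, timedelta, timezone


def latest_closed_hour(now: datetime, delay: timedelta = timedelta(minutes=5)):
    """Return (start, end) of the newest UTC hour that ended at least `delay` ago.

    now must be timezone-aware. This models my assumption about the delay
    setting; the real job may pick its window differently.
    """
    cutoff = now.astimezone(timezone.utc) - delay
    end = cutoff.replace(minute=0, second=0, microsecond=0)
    return end - timedelta(hours=1), end


# At 04:03 UTC the 03:00 hour ended only 3 minutes ago, so it is not eligible yet.
print(latest_closed_hour(datetime(2021, 5, 11, 4, 3, tzinfo=timezone.utc)))
# At 04:05 UTC the 03:00 hour has been closed for the full 5 minutes.
print(latest_closed_hour(datetime(2021, 5, 11, 4, 5, tzinfo=timezone.utc)))
```

If the job instead treats the delay as something else, that alone might explain part of my confusion.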
It’s not clear to me what is happening now.
Did it start at the most recent hour, or the earliest one it found? Is it doing every hour it has in every index? Will these rollups be a single rollup aggregating all the hours together, or will I be able to see the rollups for individual hours?
What’s the best way to “see” what’s in the rollup index? I can’t use Kibana Discover with the index, and all I seem to be able to do with the _search API is look for the aggregation fields, but I can’t tell which hour the data is from (or whether that information has been discarded).
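This is roughly the request I would expect to be able to send to _search on the rollup index to get one bucket per hour. It is a sketch using a standard date_histogram aggregation; the helper name and the aggregation name per_hour are mine, and I am assuming the rollup index still exposes @timestamp under that name, which I have not been able to confirm.

```python
import json


def per_hour_search_body(field: str = "@timestamp") -> dict:
    """Build a search body that buckets documents by UTC calendar hour.

    size=0 suppresses the hits so only the aggregation buckets come back.
    """
    return {
        "size": 0,
        "aggs": {
            "per_hour": {
                "date_histogram": {
                    "field": field,
                    "calendar_interval": "1h",
                    "time_zone": "UTC",
                }
            }
        },
    }


# Print the JSON body that would be posted to <rollup index>/_search.
print(json.dumps(per_hour_search_body(), indent=2))
```

If a query like this returns one bucket per hour, that would answer my question about whether the hour information is kept.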
What happens if the rollups cannot keep pace? Is there a way to go above 10,000 pages per execution? Is there any way to get an idea of what the rollup API is doing, such as which hour it is working on? If the rollups start at the beginning, what happens when the job catches up to the time it was started and there are new records? Many new records will have arrived in the 24 hours it has been running so far. Does it keep going, or does it stop and start again at the next scheduled execution time?