Anomaly Detection for Continuously Ingested Data

Hi, I am currently ingesting 20-25 GB of data per day into Open Distro. I created multiple detectors with the aggregation method set to AVG, but they fail after a couple of hours with a message saying there is not enough data to initialize the detector. When I set the aggregation method to SUM, the detectors run successfully. Please let me know what I am doing wrong. What should the settings be for detector interval, window delay, and window size?

AD detects anomalies in real-time streaming data. Can you check the date range of your data? SUM returns 0 when no data is found in an interval, but AVG returns null. So with SUM the detector can run successfully, because it still gets a value (0) even when an interval has no data.
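To illustrate the difference, here is a minimal Python sketch of the behaviour described above. It is not the plugin's code: `bucket_sum` and `bucket_avg` are hypothetical helpers that only mimic how an empty interval yields 0 for SUM but null (`None`) for AVG, so the AVG detector ends up with fewer usable data points.

```python
from typing import List, Optional, Sequence


def bucket_sum(values: Sequence[float]) -> float:
    # Mimics SUM: an empty interval still produces a value (0).
    return float(sum(values))


def bucket_avg(values: Sequence[float]) -> Optional[float]:
    # Mimics AVG: an empty interval has no average, so the result is null.
    if not values:
        return None
    return sum(values) / len(values)


# Three detector intervals; the middle one received no documents.
intervals: List[List[float]] = [[3.0, 5.0], [], [4.0]]

sums = [bucket_sum(v) for v in intervals]
avgs = [bucket_avg(v) for v in intervals]

# Only non-null results can serve as data points for the detector.
usable_avg_points = [a for a in avgs if a is not None]

print(sums)
print(avgs)
print(len(usable_avg_points), "usable AVG points out of", len(intervals))
```

Note that with SUM the detector keeps running on empty intervals, but the 0 it sees there is a placeholder rather than a measured value.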

I’m getting the following error message for the detector I created: "The detector is not initialized because no sufficient data is ingested.
Make sure your data is ingested correctly. If your data source has infrequent ingestion, increase the detector time interval and try again."

I have also shared a screenshot of the detector settings. Window Size: 8

@ylwu Please check my previous message. Can you help me with it?

From the screenshot, the detector interval is 40 minutes. The error message means you do not have enough data to train the model. The model needs about 150 data points to train. Here, one data point is the aggregated result of one 40-minute interval. If you don’t have historical data, the detector will wait 150 intervals to collect enough training data. That is 150 * 40 = 6000 minutes (100 hours). To make the model finish training faster, you can ingest enough historical data or reduce the detector interval. Either way, make sure every interval (40 minutes in this case) has data.
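The arithmetic above can be sketched in a few lines of Python. The function `minutes_to_initialize` and its parameters are hypothetical names used only for illustration; the default of 150 points is the approximate figure quoted above, and the sketch assumes every interval, historical or future, actually contains data.

```python
def minutes_to_initialize(
    interval_minutes: int,
    required_points: int = 150,   # approximate figure quoted in this thread
    historical_points: int = 0,   # intervals of usable data already in the index
) -> int:
    """Estimate how long a detector waits before it has enough training data."""
    missing = max(required_points - historical_points, 0)
    return missing * interval_minutes


# No historical data, 40-minute interval: 150 * 40 minutes.
wait_40 = minutes_to_initialize(40)

# Same detector with a 10-minute interval.
wait_10 = minutes_to_initialize(10)

# Enough historical data already ingested: no waiting in this simple model.
wait_with_history = minutes_to_initialize(40, historical_points=150)

print(wait_40, wait_10, wait_with_history)
```

This makes the trade-off visible: shortening the interval or backfilling history both reduce the wait, but a shorter interval is also more likely to contain no data.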

make sure every interval (40 minutes in this case) has data.

If an interval, or multiple intervals, has no data can the real time detector ever recover? I have a detector that seems to start normally with historical data but if there is ever a lull in the index receiving data the real time detector appears to stop working. The only way the detector recovers is with a restart of the OpenSearch node.

The real-time detector job assumes there is always some data that matches your interval and data filter settings. It’s OK if data is missing in some intervals; the detector job may skip them. Once data is back to normal (there is data in all or most intervals), the detector job can work correctly again. Can you explain why you have to restart the OpenSearch nodes? Does restarting the detector’s real-time job work? If you don’t restart the nodes, what exception do you see?
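One way to check whether a lull in ingestion lines up with the detector stalling is to list the intervals that contain no documents. The sketch below is only an offline illustration, not part of the plugin: `empty_intervals` is a hypothetical helper, and it assumes you have already exported the document timestamps from the index.

```python
from datetime import datetime, timedelta
from typing import List, Sequence


def empty_intervals(
    timestamps: Sequence[datetime],
    start: datetime,
    end: datetime,
    interval: timedelta,
) -> List[datetime]:
    """Return the start time of each interval in [start, end) with no documents."""
    n = (end - start) // interval          # number of whole intervals
    counts = [0] * n
    for ts in timestamps:
        idx = (ts - start) // interval     # floor division gives the interval index
        if 0 <= idx < n:
            counts[idx] += 1
    return [start + i * interval for i, c in enumerate(counts) if c == 0]


# Example with a 40-minute detector interval and four intervals in total.
start = datetime(2021, 1, 1, 0, 0)
interval = timedelta(minutes=40)
end = start + 4 * interval
timestamps = [
    start + timedelta(minutes=5),    # falls in interval 0
    start + timedelta(minutes=45),   # falls in interval 1
    start + timedelta(minutes=130),  # falls in interval 3
]

gaps = empty_intervals(timestamps, start, end, interval)
print(gaps)
```

If the gaps found this way match the time the detector stopped producing results, that is useful detail to include when reporting the problem.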

Restarting the OpenSearch nodes is the only thing that seems to get the detector back up and running correctly. I tried to just restart the detector itself but the detector just gets stuck in an Initializing state even when normal data is flowing in again. I have had detectors stuck in Initializing for hours and hours when trying to stop and restart them. After restarting the node, the detector goes into the Initializing state but will fairly quickly (10-15 minutes) go into a Running state and I can see results coming in for each interval. If it can restart within 20 minutes after a node restart, I would assume it would do the same without having to be restarted. The issue is I’m not seeing any exceptions or particularly useful logging.

To clarify, there are two issues I see: the first is that the detector appears to stop working properly if it doesn’t receive any data in the index for a while. There are no errors or exceptions in the API calls/logs that I can see. The second is that once the detector is in that state there does not seem to be any recovery when normal data begins to flow again without a restart of the node itself.

@zack, that seems like a bug. Can you create a GitHub issue (Issues · opensearch-project/anomaly-detection · GitHub)? You can add details such as which OpenSearch version you are running, how many nodes are in your cluster, how many detectors are running, your detector configuration (you can get it with the get detector API), your OS version, etc.