Further models for anomaly detection?

Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):
2.6.0

Describe the issue:
Hi all,

Is there a way to create additional models in the anomaly detection module?
Elastic, for example, offers outlier detection. Is something like that also possible in OpenSearch?

Is there documentation / instructions on how to develop and implement such a model?

Thanks and regards,
Philipp

Hi @phipiship1, can you describe the type of anomaly detection models that you would like to create and the use cases that you would like to support? Details about the data sources, algorithms, and other characteristics such as offline versus streaming would be helpful.

In terms of the roadmap, we do have plans to add forecasting to the anomaly detection plugin. We also plan to continue building our ML framework to support more models beyond text embedding models. In the future, we would like you to be able to bring your own anomaly detection model, integrate it through the framework, and use it through our various query interfaces (e.g. DSL, SQL, PPL).

Furthermore, you have the option to implement outlier detection and anomaly detection using our existing VectorDB capabilities. If you have the ML expertise, you could create an embedding model that represents, for instance, user behavior activity logs, build a k-NN index, and run vector similarity queries. You can then detect outliers and anomalies by identifying vectors that are significantly dissimilar from their nearest neighbors.
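To make the idea of "significantly dissimilar" vectors a bit more concrete, here is a minimal sketch in plain Python. It does not call any OpenSearch API: it assumes the embeddings are already available as tuples in memory and does a brute-force neighbor search, whereas in practice the neighbor lookup would be a query against a k-NN index. The function names, the sample vectors, and the `factor` threshold are all hypothetical and chosen only for illustration.

```python
import math
import statistics


def knn_outlier_scores(vectors, k=3):
    """Score each vector by its mean Euclidean distance to its k nearest neighbors.

    Brute-force stand-in for a k-NN index query (assumption: small in-memory data).
    """
    if k < 1 or k >= len(vectors):
        raise ValueError("k must be between 1 and len(vectors) - 1")
    scores = []
    for i, v in enumerate(vectors):
        # Distances from v to every other vector, smallest first.
        dists = sorted(math.dist(v, w) for j, w in enumerate(vectors) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores


def flag_outliers(vectors, k=3, factor=3.0):
    """Return indices whose score exceeds `factor` times the median score.

    The median-based threshold is an illustrative heuristic, not a recommendation.
    """
    scores = knn_outlier_scores(vectors, k)
    threshold = factor * statistics.median(scores)
    return [i for i, s in enumerate(scores) if s > threshold]


# Hypothetical 2-D "embeddings": five similar points and one far away.
sample = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
    (0.1, 0.1), (0.05, 0.05), (5.0, 5.0),
]

if __name__ == "__main__":
    print(flag_outliers(sample, k=2))
```

With real embeddings you would likely tune `k` and the threshold against labeled or historical data, and the distance metric would need to match the one configured on the index.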

Hi @dylan
thanks for your answer.

One use case would be, for example, detecting atypical logons of a user in relation to time and location.

Unfortunately, I have no experience with developing such ML models. Is there a tutorial you would recommend for getting started?

Thanks and regards,
Philipp

@phipiship1, I don’t have a tutorial for implementing a solution like this, but we are working on building application samples and templates. Your use case is something we should keep on the radar (@wbeckler). I’ll also take note that a tutorial and/or blog post on this use case would be useful.

On that note, we have customers and partners like Graylog who have implemented your use case using our existing anomaly detection capabilities. Here’s a presentation by Graylog on your use case (UEBA):

Regards,
-Dylan
