Using Security Analytics is hard, mostly because of a lack of documentation

The Security Analytics plugin is one of the most exciting plugins for us. The ideas behind it seem good, and the design principles sound. But how to ingest data in a way that is actually useful and generates alerts is unclear. The documentation and GUI offer little guidance on log sources and mapping.

See also this post on Slack.

Arnold's post is spot on, in my experience.

At the university we have been using OpenSearch for multiple use cases, and in general it's been more than fine. Fantastic software, great community. But we find it hard to implement this particular feature, even though it would help us greatly.

What does not help is that the Security Analytics team is more difficult to reach than some other teams. The Slack channel does not seem that well monitored, the GitHub issues get even less attention, and there are no triage meetings that I'm aware of.

We really appreciate all the hard work that has gone into this plugin. Maybe it's about not having enough time, which would be very understandable. But solving these things would make it easier to actually benefit from all that hard work. Interacting more with the community would also help improve the feedback cycle.

2 Likes

Totally agree with you @neographikal. On Slack I've seen zero responses so far. Over the last weeks I found out a lot by reviewing the actual source code, but proper documentation is still missing. Especially things like which log sources are preferred: it apparently supports ECS, but what exactly should we expect?
TL;DR

It would be great if most detection rules came with scenario guidance, for example which sources and formats they were tested against; call it a collection reference. This alone would help a lot with the provided rules, and it would make acceptance and creation of detectors much easier, especially when working with security analysts, who look for functionality rather than technology challenges.
For AWS environments I see a lot of good samples, but elsewhere it is thin. To give some examples:

Windows logs, AD/LDAP logs, and Apache logs are supported. Great, but how do we gather them? How did you validate the support? Can we use Sysmon, filebeat-oss, or maybe better fluent-bit? The format should be ECS, but then what? It would also be great to additionally have a repo with samples available.

We believe Security Analytics is an excellent extension to the foundation, but as with every product, adoption needs to be smooth and reliable to deliver an outcome.

You can see this as an offer to help work on the documentation and make the product better for the community.

I couldn't agree more. It's like having a race car in the garage and only driving it at night when no one sees you crashing it. I had been using SigmaHQ and OpenSearch before, and I was most enthusiastic when the Security Analytics module was announced.

My use case is Windows log analytics, so I used the pre-existing tooling for this (namely Winlogbeat). I created a page (in German) on how to adjust its templates for OpenSearch, and I have an updated script that also modifies the ingest pipelines and imports them.
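
For illustration, a minimal sketch of the idea behind such a pipeline, not my actual script. The field pair is just an example: `winlog.event_id` is what Winlogbeat ships, `event.code` is its ECS equivalent.

```
# Sketch: copy the Winlogbeat event ID into the ECS field name the rules expect.
PUT _ingest/pipeline/winlogbeat-ecs-fixup
{
  "description": "Copy winlog.event_id into ECS event.code (illustrative)",
  "processors": [
    {
      "set": {
        "field": "event.code",
        "value": "{{winlog.event_id}}",
        "ignore_empty_value": true
      }
    }
  ]
}
```

You can then wire the pipeline up via `index.default_pipeline` in the index template so it runs on every incoming document.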

The process of mapping Sigma rules to OpenSearch fields is no fun at all. Often there is more than one possible target field. Although my Java knowledge is more than limited, I even dig through the Security Analytics code on GitHub for the missing pieces of documentation.

I hope this thread will help to bring interested parties together and aid in creating better documentation.

In my opinion we need:

  1. a common schema (like ECS)
  2. a transparent overview of the field mapping of external sources (like Sigma) to the common schema (see the sketch below for what one entry could look like)

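For point 2, a hedged sketch of what a single entry in such an overview could look like. This format is purely hypothetical; nothing like it exists today.

```
{
  "sigma_field": "EventID",
  "sigma_logsource": { "product": "windows", "service": "security" },
  "ecs_field": "event.code",
  "example_value": "4625",
  "validated_with": ["winlogbeat-oss 7.12"]
}
```
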
Thanks for creating this issue - I was almost feeling I am the only one :smile:

Hi @neographikal, @arnold79, @adn77,

Thank you for sharing your concerns. I'm taking this feedback to my team, and we will work on a plan to improve the documentation in the coming weeks. If there are specific areas that need immediate attention, feel free to highlight those so we can prioritize.

Regarding better engagement on Slack, GitHub issues, etc., we will work on a plan to improve that in the coming weeks as well.

While I don't have better news immediately, I just wanted to let you know that your voices are important and are being heard, and that they help make this a better product.

Thanks
Jimish

1 Like

Thank you Jimish. I didn't expect a solution overnight, but it's good to know it's being looked at. I appreciate you responding here in this fashion; receiving feedback isn't always easy. Please know that everything comes from a good place and from wanting to help the project. If I can help at all, no problem.

As for the focus, in my opinion the most pressing issues are:

  1. How to ingest logs in such a way that the plugin can process them is completely unclear. Which log shippers to use, do you need certain ingest pipelines, how should the data be mapped? For each log type, the flow needs to be described end to end.
  2. The names of the default log types are inconsistent and unclear. A few examples: Microsoft365 (which log exactly?). Network isn't network in general; from the field names I gather it supports Zeek logs. Then it should be clear that it is about Zeek and how to ingest that data. In our SIEM setup we name things logtype-system, so something like network - zeek. Then you could also have network - suricata, network - fortigate, etc.
  3. Security Analytics is based on ECS, from what I've read, but ECS support in OpenSearch is not fully implemented: you can't just import all the component templates from the ECS repo. If the plugin claims to use ECS, it should be clear how to use it and what does and does not work (or, in an ideal world, the missing pieces get implemented so ECS works out of the box). See the template sketch after this list.
  4. In the past year I have tried to use the plugin several times, but even starting out on a clean cluster I ran into errors multiple times. It has gotten better, but it's still not error free; I recently reported two issues on GitHub. The testing procedures could be more extensive.
  5. Interaction with the community, like you addressed in your post.

The first point is absolutely essential. Without data there is no SIEM, and without the SIEM there is no detection. Getting data in (and monitoring that it keeps coming in) is the foundation of the whole thing.
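
To make point 3 concrete: what seems workable today is hand-trimming the ECS definitions into a small component template that only uses field types OpenSearch supports. A minimal sketch; the template name and field subset are my own choices, not an official artifact.

```
# Hand-trimmed ECS subset for network-style logs (illustrative).
PUT _component_template/ecs-network-base
{
  "template": {
    "mappings": {
      "properties": {
        "source":      { "properties": { "ip": { "type": "ip" }, "port": { "type": "integer" } } },
        "destination": { "properties": { "ip": { "type": "ip" }, "port": { "type": "integer" } } },
        "event":       { "properties": { "code": { "type": "keyword" }, "action": { "type": "keyword" } } }
      }
    }
  }
}
```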

Hope this helps. I'm always open for a call and to help out.

2 Likes

Thank you @neographikal. This is great user feedback. Also, 100% agreed on your call-out about guidance on bringing data into OpenSearch and using Security Analytics with it.

We designed Security Analytics to use the same ingestion patterns as you would for any other data source ingested into OpenSearch. But I understand that this is not clearly highlighted in the documentation. We will improve that.

We will definitely take up your offer on a feedback conversation.

Thank you for the feedback. We understand some of the challenges, from ingesting data to setting up detectors and creating alerts. As we continue to add and harden the foundational pillars of detection, visualization, findings/alerts, and the correlation engine, we have not lost sight of the friction around documentation, ingestion of security event logs, mappings, and other related items.
Please continue to provide feedback and we will prioritize accordingly.

Agreed! We hope to have a story around data ingestion that enables users to seamlessly ingest data into OpenSearch clusters and have detectors created.

Can you help and create an issue on our GitHub documentation repo?

@praveensameneni thanks for your response. I will continue providing feedback; let me know if additional help is needed. Out-of-the-box features are important, and they should work as expected. I think this is an excellent goal, but it indeed requires user feedback from the community. I'm familiar with customising and working with indices, but security analysts tend to be less familiar with this kind of in-depth engineering knowledge.

I do see some good communication, questions, and feedback coming in, also from Slack, which is great. I will continue with my research and keep improving my Security Analytics repo.

1 Like

Thanks, no problem to do a call, and I suppose others are willing as well. To be clear, I'm less skilled with the internals and index mappings than Arnold, but I can usually manage on my own. With Security Analytics I've hit a wall, which is a shame.

About the ingest patterns: I totally agree they are no different from an architectural standpoint. It's still log collectors, forwarders, and OpenSearch as a sink, no matter whether you're doing logs, OTel, IoT data, etc.

Where this doesn't work out is in how the data correlates with the Sigma rules and the detectors. That's one layer deeper, at the field level, and even one more layer down: the format of the data. And that comes down to which ingest option you use and what it outputs by default.

Every field that a log type needs has to be 100% clear: what should go in this field, and what is the easiest way to fill it? You can gather wineventlogs (is that the correct log type name?) with fluent-bit if you are a masochist, but there are probably easier ways to fill the 152(!) fields the log type supports.
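
Per log type, even a single annotated sample document would go a long way. A minimal sketch of a failed-logon event using ECS-style names; I'm not claiming these are exactly the fields the Windows log type expects, they're just illustrative.

```
{
  "@timestamp": "2024-01-15T09:30:00Z",
  "event":  { "code": "4625", "provider": "Microsoft-Windows-Security-Auditing" },
  "winlog": { "channel": "Security", "event_id": 4625 },
  "host":   { "name": "ws-0042" },
  "user":   { "name": "j.doe" }
}
```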

But I'm really happy we have this conversation going; let's see where it takes us. I'll try to find some time to open two documentation issues:

  • End-to-end ingest documentation
  • Health monitoring on three levels (monitoring the log shipper on each source/server, monitoring each log source within that shipper, and health checks on the mapping of the data)

The latter comes from experience with a large SOC vendor, where the importance of this became clear during multiple tests. Same mantra: no data, no SIEM, no detection.
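
The first of those three levels can already be covered with the alerting plugin today. A minimal sketch, assuming the standard query-level monitor API; the index pattern and interval are examples.

```
# Fire when a source goes quiet: zero hits in the last 15 minutes trips the trigger.
POST _plugins/_alerting/monitors
{
  "type": "monitor",
  "name": "windows-logs-still-flowing",
  "monitor_type": "query_level_monitor",
  "enabled": true,
  "schedule": { "period": { "interval": 15, "unit": "MINUTES" } },
  "inputs": [{
    "search": {
      "indices": ["windows-*"],
      "query": {
        "size": 0,
        "query": { "range": { "@timestamp": { "gte": "now-15m" } } }
      }
    }
  }],
  "triggers": [{
    "name": "no-data-in-15m",
    "severity": "1",
    "condition": {
      "script": {
        "source": "ctx.results[0].hits.total.value == 0",
        "lang": "painless"
      }
    },
    "actions": []
  }]
}
```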

I already opened an issue for the network logs: [FEATURE] Rename network events to Network events: Zeek · Issue #512 · opensearch-project/security-analytics · GitHub

Good afternoon. I'm happy to see the lively discussion re: Security Analytics today. I've been struggling with this component myself for several weeks and wanted to share some of my experience as well. I offer these comments only to aid in improving the product, and I sincerely appreciate all the development efforts to date. It's turning into a great product!

  • Documentation: successful implementation of this module requires a deeper understanding of the underlying processes within SA. Some areas that are unclear: how fields are mapped, what the relevant indices for SA are and what each does, how correlations work, etc.

  • Multi-tenancy: I've really been struggling to figure out how to enable a multi-tenancy use case. For example, all findings/alerts go into one index; it would be better to split them out. Although the documentation indicates that filtering by role should isolate tenants, so far I have been unable to successfully isolate tenants across detectors, alerts, findings, and correlations.

  • Rules: the default rules seem to include everything from Sigma, including the many experimental rules that are poorly written or generate a lot of false positives. Rules are updated with each release, but I think that fundamentally it would be better to have a way to update the rules more easily and more frequently. We are considering a local repo of vetted rules that we push into OpenSearch as custom rules; in fact, we are looking for a way to remove the default rules and categories. The current rules are also out of date, but I believe this is being addressed in the upcoming 2.12 release.

  • Rule mappings: this is a very misunderstood process, and we've run into lots of issues here. The default mappings don't seem to follow ECS. We have Winlogbeat feeding data into our OpenSearch, but hardly any of the default field mappings work. We found that the GUI creates a mapping template (based on the name of the log collection), but if you ever change a field in your index, it breaks horribly because that field doesn't exist in the older indexes. That makes it impossible to bring in new data without reindexing or deleting the old indexes, which has been a real problem for us. We tried to modify the mapping template manually, but then found that the GUI overwrote our changes, which was very frustrating. (See the mappings API sketch after this list.)

  • Query generation from Sigma rules: I saw some bug reports on GitHub today re: the parsing of Sigma keywords and also null values. The way Security Analytics parses the rules and generates the detection queries for use in OpenSearch does not handle these cases correctly, so those rules generate hundreds of false positives or don't work at all.

  • Correlations: it appears that in the last release or two some auto-generation of correlations was implemented, but these are causing massive indexes to accumulate. I think this may be fixed in 2.11, but I cannot find documentation anywhere for this feature.

  • Management of the SA indexes: we could really use better guidance on how to manage the ISM policies (rollover, deletion period, etc.) for the SA indexes (alerts, findings, detections, etc.); a policy sketch follows below. Additionally, we found that when our indexes age out and get deleted, the finding entries still exist, and clicking on one in the GUI causes the GUI to crash. So we need a way to remove the old findings when we remove the old index data.

  • Alert notifications: the message body that can be generated for an alert contains insufficient data right now. It would be great if we could use Mustache variables or something similar to send more data to our external Slack webhook, for example: enough info to construct a hyperlink back to the original alert in the GUI, etc. Right now this is extremely limited.

  • The Security Analytics GUI: this has a number of relatively minor but critical bugs with respect to generating detectors and especially mappings. There are a number of bug reports on GitHub for these issues. A lot have to do with input validation, specifically not properly validating Sigma rule text, which causes mappings to get partially created with errors. Cloned rules also fail to edit properly.
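
On the rule mappings point above: if I'm reading the Security Analytics API reference right, mappings can also be applied through the REST API instead of the GUI, as field aliases that point at whatever your real field names are. That might give more control than the GUI flow; the index name, rule topic, and field names here are illustrative.

```
# Sketch: alias the rule field onto the concrete Winlogbeat field, no reindex needed.
POST _plugins/_security_analytics/mappings
{
  "index_name": "windows-*",
  "rule_topic": "windows",
  "partial": true,
  "alias_mappings": {
    "properties": {
      "event_uid": {
        "type": "alias",
        "path": "winlog.event_id"
      }
    }
  }
}
```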

Documentation would really help with the ingest of data, rule mappings, etc. But fundamentally I would like to see more manual control over rule creation, mapping creation, index management, and especially multi-tenancy separation. I also think a look-back period like Elasticsearch has would be useful, so that new rules don't go all the way back to the beginning of your indexes but only back some configurable number of hours/minutes/whatever.
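
On index management specifically, here is the kind of ISM sketch that guidance could start from. The index pattern is a guess (check what the SA indexes are actually called on your cluster), and dot-prefixed system indexes may need special handling.

```
# Example only: roll over SA data after 7 days, delete after 30.
PUT _plugins/_ism/policies/sa-retention
{
  "policy": {
    "description": "Roll over and eventually delete Security Analytics indexes",
    "default_state": "hot",
    "states": [
      {
        "name": "hot",
        "actions": [{ "rollover": { "min_index_age": "7d" } }],
        "transitions": [{ "state_name": "delete", "conditions": { "min_index_age": "30d" } }]
      },
      {
        "name": "delete",
        "actions": [{ "delete": {} }],
        "transitions": []
      }
    ],
    "ism_template": { "index_patterns": [".opensearch-sap-*"], "priority": 100 }
  }
}
```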

Thanks everyone!

Would love to hear your thoughts on the out-of-the-box features: are you referring to visualizations / dashboards / findings with a knowledge base?
Please do create an issue on our GitHub repo with an enhancement or bug tag so we can prioritize.

1 Like

We will try to clarify in the docs, and in a blog post, the purpose of the correlation engine and its current nascent implementation: it correlates the findings generated on the documents matching Sigma rules by way of a field present across different logs.
We built the foundation of the correlation engine with a bigger vision of increasing the fidelity of the data across various security event logs, and between security event logs and metrics, traces, and logs. That is where the true value will come to the fore.

Apologies. We will add more documentation and also write a detailed blog on the internal workings: how we create indices for the findings, and how Sigma rules are executed against incoming documents (each rule runs as a percolate query of a document-level monitor). There's so much to dive into and share. Together, we can evolve this into a great community-built product.
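
To give a rough idea of the mechanism in the meantime: a percolate query stores the queries themselves in an index and matches each incoming document against all of them at once. A toy example, not our actual internal index layout.

```
# A rule becomes a stored query...
PUT sigma-rule-queries
{
  "mappings": {
    "properties": {
      "query": { "type": "percolator" },
      "event": { "properties": { "code": { "type": "keyword" } } }
    }
  }
}

PUT sigma-rule-queries/_doc/failed-logon-rule
{
  "query": { "term": { "event.code": "4625" } }
}

# ...and each incoming document is matched against all stored rules in one search.
GET sigma-rule-queries/_search
{
  "query": {
    "percolate": {
      "field": "query",
      "document": { "event": { "code": "4625" } }
    }
  }
}
```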

Welcome @tallyoh and thank you for taking the time to post a detailed list of opportunities!

Can you please create an issue on our documentation repo? We would like to track everything that needs to be addressed.
As for the correlations, you may find this blog helpful.

Can you please check out the Index Management documentation on defining policies and their transition states? I also found this blog that may be helpful.

We will review the issues and post updates.

@praveensameneni thanks for the feedback. I'll take a look at those references and create the documentation ticket as well. Thanks.

Hi @praveensameneni. Mainly about detection rules, mapping, and findings. I will create an issue around these topics when I have some spare time next week.

Also, I like the way you and the team are approaching this at the moment. :+1:

@praveensameneni good evening. Could the Security Analytics team host an online meeting where we could ask questions live and maybe also do a deep dive with the developers who are most knowledgeable about how SA works under the hood? I bet such a meeting (maybe recorded too) would really help get more users/implementers comfortable with the plugin and let them ask questions related to their use cases. It would also be good to know the roadmap for the plugin. Thanks.

Good idea, I would join :slight_smile:

As for opening an issue: I found my own, opened in May, while searching through the open issues:

We had some good meetings about this, and I'm curious about the plan going forward and the priorities. Is it possible to shed some light on this, and can we help out?