Hi!
Version: Latest OpenSearch Data Prepper image (opensearchproject/data-prepper:latest)
Describe the issue: I am having problems with conditional routes when sending logs from Fluent Bit to their respective sinks. Even though I define a route like
route:
  - linux-syslog: '/filepath == "/var/log/messages"'
logs with this filepath end up in a seemingly random sink or in the sink that has no route.
Configuration: My Data Prepper pipeline is the following (sensitive data is omitted):
pipeline.yaml:
conditional-routing-linux-pipeline:
  source:
    http:
      port: 2021
      ssl: true
      ssl_key_file: "/usr/share/data-prepper/certs/host-key.pem"
      ssl_certificate_file: "/usr/share/data-prepper/certs/host-crt.pem"
  processor:
    - rename_keys:
        entries:
          - from_key: "log"
            to_key: "message"
  route:
    - linux-syslog: '/filepath == "/var/log/messages"'
    - linux-secure: '/filepath == "/var/log/secure"'
    - linux-audit: '/filepath == "/var/log/audit/audit.log"'
  sink:
    - opensearch:
        hosts: [data nodes]
        cert: "/usr/share/data-prepper/certs/domain-ca-cert.pem"
        username: user
        password: password
        index: linux-syslog
        routes: [linux-syslog]
    - opensearch:
        hosts: [data nodes]
        cert: "/usr/share/data-prepper/certs/domain-ca-cert.pem"
        username: user
        password: password
        index: linux-secure
        routes: [linux-secure]
    - opensearch:
        hosts: [data nodes]
        cert: "/usr/share/data-prepper/certs/domain-ca-cert.pem"
        username: user
        password: password
        index: linux-audit
        routes: [linux-audit]
    - opensearch:
        hosts: [data nodes]
        cert: "/usr/share/data-prepper/certs/domain-ca-cert.pem"
        username: user
        password: password
        index: linux-undefined
My Fluent Bit configuration on a host:
fluent-bit.conf:
[INPUT]
    name tail
    path /var/log/messages,/var/log/secure,/var/log/audit/audit.log
    path_key filepath
    tag logs
    refresh_interval 5
    DB /opt/fluent-bit/dbs/flb.db
    #parser trim
[OUTPUT]
    match *
    name stdout
    #name http
    #host dataprepperhost
    #port 2021
    #match logs
    #URI /log/ingest
    #format json
    #Json_date_key timestamp
    #Json_date_format iso8601
    #tls On
    #tls.verify On
[FILTER]
    name record_modifier
    match *
    Record hostname ${HOSTNAME}
This will generate logs like:
[1] logs: [[1751568204.129640072, {}], {"filepath"=>"/var/log/messages", "log"=>"Jul 3 20:42:34 host systemd[1]: check-mk-agent@33087-922-996.service: Deactivated successfully.", "hostname"=>"host"}]
[2] logs: [[1751568204.130433611, {}], {"filepath"=>"/var/log/audit/audit.log", "log"=>"type=SERVICE_STOP msg=audit(1751568154.692:69736): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=check-mk-agent@33087-922-996 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'UID="root" AUID="unset"", "hostname"=>"host"}]
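To make the expected behavior concrete, the routing these records should get can be sketched in a few lines of Python. This is only a model of the intent, not Data Prepper's implementation: the names ROUTES, SINKS, matched_routes and target_indices are hypothetical, and the sketch assumes that a sink without a routes setting receives every event.

```python
# Hypothetical model of the expected routing; not Data Prepper's actual code.

# Route name -> filepath value it should match (from the route section above).
ROUTES = {
    "linux-syslog": "/var/log/messages",
    "linux-secure": "/var/log/secure",
    "linux-audit": "/var/log/audit/audit.log",
}

# (index, routes) pairs mirroring the sinks above; an empty list means "no routes".
SINKS = [
    ("linux-syslog", ["linux-syslog"]),
    ("linux-secure", ["linux-secure"]),
    ("linux-audit", ["linux-audit"]),
    ("linux-undefined", []),
]


def matched_routes(event):
    """Return the names of all routes whose filepath condition holds for the event."""
    return {name for name, path in ROUTES.items() if event.get("filepath") == path}


def target_indices(event):
    """Return the indices expected to receive the event.

    Assumption: a sink with no routes receives every event.
    """
    matched = matched_routes(event)
    return [index for index, routes in SINKS if not routes or matched & set(routes)]


if __name__ == "__main__":
    for path in ("/var/log/messages", "/var/log/audit/audit.log", "/tmp/other.log"):
        print(path, "->", target_indices({"filepath": path}))
```

Under that assumption, a /var/log/messages record should land in linux-syslog (and in linux-undefined), but never in linux-audit.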
Relevant Logs or Screenshots: As the following screenshot shows, the logs are written either to the linux-undefined index or to the wrong index (a /var/log/messages log should not be in the linux-audit index!):
Am I simply overlooking a problem in my routing configuration? Has anyone had similar problems? Is there a way to properly debug routes?
Thank you,
Thomas