Conditional routes do not forward to correct index

Hi!

Version: Latest opensearch dataprepper image: opensearchproject/data-prepper:latest

Describe the issue: I am having problems with conditional routes when sending logs from fluent-bit to their respective sinks. Even though I define a route like

route:
    - linux-syslog: '/filepath == "/var/log/messages"'

logs with this filepath are sent to a seemingly random sink or end up in the sink without a route.

Configuration: My dataprepper pipeline is the following (sensitive data is omitted):

pipeline.yaml:
conditional-routing-linux-pipeline:
  source:
    http:
      port: 2021
      ssl: true
      ssl_key_file: "/usr/share/data-prepper/certs/host-key.pem"
      ssl_certificate_file: "/usr/share/data-prepper/certs/host-crt.pem"
  processor:
    - rename_keys:
        entries:
        - from_key: "log"
          to_key: "message"
  route:
    - linux-syslog: '/filepath == "/var/log/messages"'
    - linux-secure: '/filepath == "/var/log/secure"'
    - linux-audit: '/filepath == "/var/log/audit/audit.log"'
  sink:
    - opensearch:
        hosts: [data nodes]
        cert: "/usr/share/data-prepper/certs/domain-ca-cert.pem"
        username: user
        password: password
        index: linux-syslog
        routes: [linux-syslog]
    - opensearch:
        hosts: [data nodes]
        cert: "/usr/share/data-prepper/certs/domain-ca-cert.pem"
        username: user
        password: password
        index: linux-secure
        routes: [linux-secure]
    - opensearch:
        hosts: [data nodes]
        cert: "/usr/share/data-prepper/certs/domain-ca-cert.pem"
        username: user
        password: password
        index: linux-audit
        routes: [linux-audit]
    - opensearch:
        hosts: [data nodes]
        cert: "/usr/share/data-prepper/certs/domain-ca-cert.pem"
        username: user
        password: password
        index: linux-undefined

My fluent-bit configuration from a host:
fluent-bit.conf:

[INPUT]
    name tail
    path /var/log/messages,/var/log/secure,/var/log/audit/audit.log
    path_key filepath
    tag logs
    refresh_interval 5
    DB /opt/fluent-bit/dbs/flb.db
    #parser trim
[OUTPUT]
    match *
    name stdout
    #name http
    #host dataprepperhost
    #port 2021
    #match logs
    #URI /log/ingest
    #format json
    #Json_date_key timestamp
    #Json_date_format iso8601
    #tls On
    #tls.verify On
[FILTER]
    name record_modifier
    match *
    Record hostname ${HOSTNAME}

This will generate logs like:

[1] logs: [[1751568204.129640072, {}], {"filepath"=>"/var/log/messages", "log"=>"Jul  3 20:42:34 host systemd[1]: check-mk-agent@33087-922-996.service: Deactivated successfully.", "hostname"=>"host"}]
[2] logs: [[1751568204.130433611, {}], {"filepath"=>"/var/log/audit/audit.log", "log"=>"type=SERVICE_STOP msg=audit(1751568154.692:69736): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=check-mk-agent@33087-922-996 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'UID="root" AUID="unset"", "hostname"=>"host"}]
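
For comparison with what actually arrives in OpenSearch, the routing expected from the three equality routes can be sketched in plain Python. This is only a model of the intended behavior, not Data Prepper code: expected_route and ROUTES are hypothetical names, and the records are shortened copies of the two samples above.

```python
# Model of the intended routing only; this is NOT how Data Prepper evaluates routes.
# ROUTES mirrors the three equality conditions from the pipeline config.
ROUTES = {
    "linux-syslog": "/var/log/messages",
    "linux-secure": "/var/log/secure",
    "linux-audit": "/var/log/audit/audit.log",
}


def expected_route(record):
    """Return the route whose path equals record["filepath"], or None if no route matches."""
    filepath = record.get("filepath")
    for route, path in ROUTES.items():
        if filepath == path:
            return route
    return None


# Shortened copies of the two sample records printed by fluent-bit.
samples = [
    {"filepath": "/var/log/messages", "log": "Jul  3 20:42:34 host systemd[1]: ...", "hostname": "host"},
    {"filepath": "/var/log/audit/audit.log", "log": "type=SERVICE_STOP ...", "hostname": "host"},
]

for record in samples:
    print(record["filepath"], "->", expected_route(record))
```

A record for which this returns None is one I would expect to see only in linux-undefined.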

Relevant Logs or Screenshots: As seen in the following screenshot, the logs are assigned either to the linux-undefined index or to the wrong index (a /var/log/messages log should not be in the linux-audit index!):

Am I simply overlooking a problem in my routing configuration? Has anyone had similar problems? Is there a way to properly debug routes?

Thank you,
Thomas

@ThomasK1730 I believe it’s attempting to interpret the right-hand side of the expression not as a string but perhaps as a field in the document, the same as /filepath.

Can you remove the starting / and confirm if this fixes the issue?

So the route would look like this:

route:
    - linux-syslog: '/filepath == "var/log/messages"'
    - linux-secure: '/filepath == "var/log/secure"'
    - linux-audit: '/filepath == "var/log/audit/audit.log"'

I have not been able to find a way to make it recognise the input as a string that starts with /.

Hello Anthony,

thank you for your reply!

Sadly, your proposed way of defining the routes didn't achieve the desired outcome either. When I omitted the first /, all requests were sent to the linux-undefined sink. From this I conclude that no route matched, so the requests went to the default sink.

Escaping the first / was not an option either; I got the following errors:

line 1:14 token recognition error at: '\'
line 1:13 extraneous input '"' expecting {Function, Integer, Float, Boolean, 'null', JsonPointer, EscapedJsonPointer, VariableIdentifier, String, NOT, '-', '(', '{'}
line 1:32 extraneous input '"' expecting <EOF>
2025-07-06T14:21:13,547 [main] ERROR org.opensearch.dataprepper.core.validation.LoggingPluginErrorsHandler - 1. conditional-routing-linux-pipeline.route. caused by: Route linux-syslog contains an invalid conditional expression '/filepath == "\/var/log/messages"'. See https://opensearch.org/docs/latest/data-prepper/pipelines/expression-syntax/ for valid expression syntax.

I probably misinterpreted the escaping mechanisms, since the docs refer to JSON pointers.

After that I tried to use regex expressions. There I had some problems as well:

line 1:13 mismatched input '"' expecting {JsonPointer, EscapedJsonPointer, String}
line 1:16 token recognition error at: 'messages$'
2025-07-06T14:35:06,520 [main] ERROR org.opensearch.dataprepper.core.validation.LoggingPluginErrorsHandler - 1. conditional-routing-linux-pipeline.route. caused by: Route linux-syslog contains an invalid conditional expression '/filepath =~ ".*messages$"'. See https://opensearch.org/docs/latest/data-prepper/pipelines/expression-syntax/ for valid expression syntax.

Did I overlook something here? The expression
'/filepath =~ ".*messages$"'
should be technically correct. The docs on the expression syntax also reinforced this.

Finally I got my pipelines.yaml running by defining the following routes:

route:
    - linux-syslog: '/filepath =~ ".*messages.*"'
    - linux-secure: '/filepath =~ ".*secure.*"'
    - linux-audit: '/filepath =~ ".*audit.*"'

This is not the ideal use of regular expressions, but my routes are finally working.
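
One caveat with these unanchored patterns is that they match any path containing the substring, not only the three files. A small Python check illustrates this. It uses Python's re.fullmatch as a stand-in for Data Prepper's regex matching, which is an assumption (Data Prepper runs on Java, so details may differ), and /var/log/messages-20250701 is a made-up path for illustration.

```python
import re

# Patterns copied from the working routes. Python's `re` stands in for
# Data Prepper's Java regex engine here, which is an assumption.
PATTERNS = {
    "linux-syslog": r".*messages.*",
    "linux-secure": r".*secure.*",
    "linux-audit": r".*audit.*",
}


def matching_routes(filepath):
    """Return every route whose pattern matches the whole filepath."""
    return [name for name, pattern in PATTERNS.items() if re.fullmatch(pattern, filepath)]


paths = [
    "/var/log/messages",
    "/var/log/secure",
    "/var/log/audit/audit.log",
    "/var/log/messages-20250701",  # hypothetical rotated file, not from the config
]

for path in paths:
    print(path, "->", matching_routes(path))
```

Since the tail input only reads the three configured paths, the broad patterns are sufficient for this setup.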

As I am still intrigued: is there a way to debug pipelines?
I set this in my docker compose:

    environment:
      - LOG_LEVEL=DEBUG

and this in my log4j2.properties:

status = error
name = PropertiesConfig

appender.console.type = Console
appender.console.name = STDOUT
appender.console.layout.type = PatternLayout
appender.console.layout.pattern = [%d{ISO8601}] [%t] %-5level %logger{36} - %msg%n

rootLogger.level = TRACE
rootLogger.appenderRefs = stdout
rootLogger.appenderRef.stdout.ref = STDOUT

logger.pipeline.name = org.opensearch.dataprepper.pipeline
logger.pipeline.level = debug

Yet the output was inconclusive: I could not determine from it how the routes were evaluated or how requests were matched to them.
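
One way to narrow this down without relying on debug logs might be to post a few hand-written records to the http source and then check which index each one lands in. The sketch below only builds such a request with Python's standard library. The host dataprepperhost, port 2021 and URI /log/ingest are taken from the commented-out fluent-bit output above, while build_request and the test records are hypothetical. It assumes the source accepts a JSON array of records, as fluent-bit's json format sends.

```python
import json
import urllib.request


def build_request(host, port, records, uri="/log/ingest"):
    """Build an HTTPS POST carrying `records` as a JSON array."""
    body = json.dumps(records).encode("utf-8")
    return urllib.request.Request(
        url=f"https://{host}:{port}{uri}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# One record per route, so each routed index should receive exactly one
# document if the routes behave as intended.
test_records = [
    {"filepath": "/var/log/messages", "log": "route test: messages"},
    {"filepath": "/var/log/secure", "log": "route test: secure"},
    {"filepath": "/var/log/audit/audit.log", "log": "route test: audit"},
]

request = build_request("dataprepperhost", 2021, test_records)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Sending it would then be a call to urllib.request.urlopen(request, context=...) with an ssl context that trusts the certificate used by the source.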

Kind regards,
Thomas

@ThomasK1730 The issue with $ has already been reported here

I have also updated another issue with further comments and asked about logging here