Help with drop_events to filter traces, please!

Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):

Describe the issue

I have several Java microservices using the OTel agent to export traces and logs to Data Prepper. Data Prepper needs to filter out the "noise": endpoints like /actuator, calls to discovery-server, and fetches of resources from webjars should not be sent to OpenSearch.

I have spent hours trying to figure out how to use drop_when to filter them out, but have been unsuccessful. It is NOT clear to me whether the filtering belongs with the source or the sink, but based on intuition I would say the source.

Examples are RARE, and AI knowledge of this does NOT EXIST — it is all guesswork.

This is my current pipeline, trying to filter out /actuator and discovery-server calls at the moment:

otel-trace-pipeline:
  source:
    otlp:
      port: 21890
      ssl: false

  processor:
    - drop_events:
        drop_when: 'contains(/attributes/url.full, "discovery-server") or contains(/attributes/url.path, "/actuator/health")'
    - otel_traces: {}  # Standardizes incoming spans
    # - service_map_stateful: {}  # Aggregates spans for Service Map
    - otel_trace_group:
        hosts: ["http://opensearch:9200"]  # Trace grouping for OpenSearch

  sink:
    - stdout: {}
    - opensearch:
        hosts: ["http://opensearch:9200"]
        index_type: trace-analytics-raw
        dlq_file: /var/log/data-prepper/dlq.log

This is a source message:

[otel.javaagent 2026-01-28 16:15:00:003 +0000] [http-nio-8080-exec-2] INFO io.opentelemetry.exporter.logging.LoggingSpanExporter - 'GET /timesheet-monitor/actuator/health' : 080ac8d54350d39a40dffa885905b155 73d16441c039b10a SERVER [tracer: io.opentelemetry.tomcat-10.0:2.23.0-alpha] AttributesMap{data={thread.id=89, http.response.status_code=200, url.scheme=http, thread.name=http-nio-8080-exec-2, network.peer.port=34782, url.path=/timesheet-monitor/actuator/health, client.address=10.0.0.2, network.peer.address=10.0.0.2, network.protocol.version=1.0, http.route=/timesheet-monitor/actuator/health, http.request.method=GET}, capacity=128, totalAddedValues=11}

This is the sink event message (from a different service/trace):

{"traceId":"51df92fd2bcbaefd9f0e39b7827953ae","droppedLinksCount":0,"instrumentationScope":{"name":"io.opentelemetry.tomcat-10.0","droppedAttributesCount":0,"version":"2.23.0-alpha"},"resource":{"droppedAttributesCount":0,"attributes":{"telemetry.distro.version":"2.23.0","service.name":"timesheet-producer","telemetry.distro.name":"opentelemetry-java-instrumentation","process.command_args":["/layers/paketo-buildpacks_bellsoft-liberica/jre/bin/java","org.springframework.boot.loader.launch.JarLauncher"],"process.runtime.version":"21.0.9+15-LTS","os.type":"linux","team":"Connectus","process.pid":1,"container.id":"c619816a94411e0121e356aa502dce6b4da585b94992749c0337f83ca79148d3","telemetry.sdk.name":"opentelemetry","telemetry.sdk.language":"java","process.runtime.name":"OpenJDK Runtime Environment","service.instance.id":"9c00905e-8b0c-4cf9-b989-8ede7b469225","service.version":"2.1.1217-SNAPSHOT","os.description":"Linux 6.8.0-87-generic","process.executable.path":"/layers/paketo-buildpacks_bellsoft-liberica/jre/bin/java","host.arch":"amd64","host.name":"c619816a9441","telemetry.sdk.version":"1.57.0","process.runtime.description":"BellSoft OpenJDK 64-Bit Server VM 21.0.9+15-LTS","deployment.environment":"dev"},"schemaUrl":"https://opentelemetry.io/schemas/1.24.0"},"kind":"SPAN_KIND_SERVER","droppedEventsCount":0,"flags":257,"parentSpanId":"","schemaUrl":"https://opentelemetry.io/schemas/1.37.0","spanId":"009b2149e913be0c","traceState":"","name":"GET /timesheet-producer/actuator/prometheus","startTime":"2026-01-28T16:18:00.599564981Z","attributes":{"user_agent.original":"Prometheus/3.1.0","network.protocol.version":"1.1","network.peer.port":46694,"url.scheme":"http","thread.name":"http-nio-8080-exec-11","server.address":"timesheet-producer","client.address":"10.0.5.248","network.peer.address":"10.0.5.248","url.path":"/timesheet-producer/actuator/prometheus","http.request.method":"GET","http.route":"/timesheet-producer/actuator/prometheus","server.port":8080,"http.response.status_code":200,"thread.id":154},"links":[],"endTime":"2026-01-28T16:18:00.618479777Z","droppedAttributesCount":0,"durationInNanos":18914796,"events":[],"status":{"message":"","code":0},"serviceName":"timesheet-producer"}

I have run out of ideas on what the drop_when expression should look like.

Please help, thanks!

Configuration: Latest versions of OpenSearch, Data Prepper, and the OTel agent (up to date as of 1/2026)

Relevant Logs or Screenshots:

@ofir I tested this end-to-end and the issue appears to be with the drop_when expression syntax.

After otel_trace_source ingests spans, the field paths change: span attributes live under /spanAttributes/, not /attributes/.
Also add handle_failed_events: drop and use contains().

See full example:

otel-trace-pipeline:
  source:
    otel_trace_source:
      port: 21890
      ssl: false
      health_check_service: true
  buffer:
    bounded_blocking:
      buffer_size: 25600
      batch_size: 400
  processor:
    - drop_events:
        drop_when: 'contains(/spanAttributes/url.path, "actuator") or contains(/spanAttributes/url.path, "webjars") or contains(/spanAttributes/http.target, "actuator") or contains(/spanAttributes/http.target, "webjars") or /resource/attributes/service.name == "discovery-server"'
        handle_failed_events: drop
  sink:
    - pipeline:
        name: raw-pipeline
    - pipeline:
        name: service-map-pipeline
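
The two sinks above hand spans off to downstream pipelines that must be defined in the same file. A minimal sketch of what those could look like, following the usual Data Prepper trace-analytics layout (the hosts value is copied from your original post; adjust processors and OpenSearch settings to your setup):

```yaml
# Sketch only: downstream pipelines referenced by the sinks above.
raw-pipeline:
  source:
    pipeline:
      name: "otel-trace-pipeline"   # receives spans from the entry pipeline
  processor:
    - otel_traces: {}               # standardizes raw spans for the raw index
  sink:
    - opensearch:
        hosts: ["http://opensearch:9200"]
        index_type: trace-analytics-raw

service-map-pipeline:
  source:
    pipeline:
      name: "otel-trace-pipeline"
  processor:
    - service_map: {}               # aggregates spans for the Service Map view
  sink:
    - opensearch:
        hosts: ["http://opensearch:9200"]
        index_type: trace-analytics-service-map
```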

Hope this helps

Hi Anthony, I did eventually manage to get it to work with:

otel-trace-pipeline:
  source:
    otlp:
      port: 21890
      ssl: false
  processor:
    - drop_events:
        drop_when: 'contains(/attributes/url.full, "discovery-server") or contains(/attributes/url.path, "/actuator/") or contains(/attributes/url.path, "/webjars/") or contains(/name, "batchPoller") or contains(/name, "batchErrorPoller") or contains(/name, "AzureFileStorage")'

I noticed that you're using the 'old' otel_trace_source method; I'm not sure whether that has an impact here. I actually had to use three different AI tools to finally get this expression working, because documentation, and especially practical examples, is scarce. Conversely, many obsolete examples still exist (which further confuses the AI), and there have been numerous breaking changes as the project evolved. I hope this gets sorted out in the future.
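
For anyone landing here later: what helped me iterate on these expressions was a throwaway pipeline with only a stdout sink, so you can watch which spans survive the filter without touching OpenSearch at all. A minimal sketch (port and paths taken from this thread; the pipeline name is made up):

```yaml
# Debug sketch: tune drop_when against stdout first, then move the
# working expression into the real pipeline.
debug-trace-pipeline:
  source:
    otlp:
      port: 21890
      ssl: false
  processor:
    - drop_events:
        drop_when: 'contains(/attributes/url.path, "/actuator/") or contains(/attributes/url.path, "/webjars/")'
        handle_failed_events: drop   # swallow spans the expression cannot evaluate
  sink:
    - stdout: {}                     # surviving spans are printed to the console
```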

1 Like