Enable telemetry-otel for OpenSearch 2.12

Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):

OpenSearch 2.12
telemetry-otel plugin 2.12

Describe the issue:

It is unclear how to get on-demand traces from OpenSearch.

Configuration:

Based on the documentation here: Distributed tracing - OpenSearch Documentation

  1. "opensearch.experimental.feature.telemetry.enabled": "true" and
    "telemetry.tracer.enabled": "true" in cluster settings.

  2. A search query with the 'trace=true' param fails:

curl -XGET "http://localhost:9200/index/_search?pretty=true&trace=true" -H 'Content-Type: application/json'  -d'
{
  "query": {
    "match": {
      "text_entry": "the"
    }
  }
}'
{
  "error" : {
    "root_cause" : [
      {
        "type" : "illegal_argument_exception",
        "reason" : "request [/index/_search] contains unrecognized parameter: [trace]"
      }
    ],
    "type" : "illegal_argument_exception",
    "reason" : "request [/index/_search] contains unrecognized parameter: [trace]"
  },
  "status" : 400
}
  3. Expectation: the search query succeeds and _otel_traces.log is available immediately after the query.
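
For reference, here is a minimal sketch of how a runtime tracer setting could be applied through the cluster settings API. It assumes an unsecured node at localhost:9200 (as in the curl command above) and that telemetry.tracer.enabled is accepted as a dynamic setting. The feature flag itself may need to go into opensearch.yml instead. The names OS_URL, TRACER_SETTINGS and enable_tracer are made up for this example.

```shell
# Assumed endpoint: an unsecured local node, as in the curl command above.
OS_URL="${OS_URL:-http://localhost:9200}"

# Request body that switches the tracer on at runtime.
# Assumption: telemetry.tracer.enabled is a dynamic cluster setting.
TRACER_SETTINGS='{"persistent": {"telemetry.tracer.enabled": true}}'

# Hypothetical helper: sends the body to the cluster settings API.
enable_tracer() {
  curl -sS -XPUT "$OS_URL/_cluster/settings" \
    -H 'Content-Type: application/json' \
    -d "$TRACER_SETTINGS"
}
```

Calling enable_tracer sends the request. The cluster should answer with an acknowledgement, or with an error naming the setting if it cannot be updated dynamically on that version.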

Set the trace param in the header instead of as a URL param:

curl -XGET "http://localhost:9200/index/_search?pretty=true" -H 'Content-Type: application/json' -H 'Trace: true' -d'
{
  "query": {
    "match": {
      "text_entry": "the"
    }
  }
}'

In addition, set telemetry.feature.tracer.enabled=true in cluster settings.
I see a few log statements now, but not the traces themselves:

2024-04-09 17:09:01,978 INFO [2024-04-09T17:09:01,937][INFO ][i.o.e.l.LoggingSpanExporter] [Umang-Sharan] 'indices:data/read/search[phase/query] 127.0.0.1' : ce32cf9ddb528d9ed780437f253e60aa 33968f64a611ec45 SERVER [tracer: org.opensearch.telemetry:] AttributesMap{data={action=indices:data/read/search[phase/query], target_host=127.0.0.1, thread.name=opensearch[Umang-Sharan][transport_worker][T#9]}, capacity=128, totalAddedValues=3}
2024-04-09 17:09:01,979 INFO [2024-04-09T17:09:01,938][INFO ][i.o.e.l.LoggingSpanExporter] [Umang-Sharan] 'GET /index/_search' : ce32cf9ddb528d9ed780437f253e60aa 146fb8202065432c CLIENT [tracer: org.opensearch.telemetry:] AttributesMap{data={rest.request_id=14, url.query=pretty=true, rest.raw_path=/index/_search, thread.name=opensearch[Umang-Sharan][transport_worker][T#9]}, capacity=128, totalAddedValues=4}
2024-04-09 17:09:01,980 INFO [2024-04-09T17:09:01,939][INFO ][i.o.e.l.LoggingSpanExporter] [Umang-Sharan] 'GET /index/_search' : ce32cf9ddb528d9ed780437f253e60aa c14a6475adc49376 SERVER [tracer: org.opensearch.telemetry:] AttributesMap{data={trace=true, url.query=pretty=true, http.uri=/index/_search?pretty=true, thread.name=opensearch[Umang-Sharan][transport_worker][T#9], http.method=GET, http.version=HTTP_1_1}, capacity=128, totalAddedValues=6}

@umang-glean by default, tracing uses the log exporter:

2024-04-09 17:09:01,978 INFO [2024-04-09T17:09:01,937][INFO ][i.o.e.l.LoggingSpanExporter] [Umang-Sharan] 'indices:data/read/search[phase/query] 127.0.0.1' : ce32cf9ddb528d9ed780437f253e60aa 33968f64a611ec45 SERVER [tracer: org.opensearch.telemetry:] AttributesMap{data={action=indices:data/read/search[phase/query], target_host=127.0.0.1, thread.name=opensearch[Umang-Sharan][transport_worker][T#9]}, capacity=128, totalAddedValues=3}
2024-04-09 17:09:01,979 INFO [2024-04-09T17:09:01,938][INFO ][i.o.e.l.LoggingSpanExporter] [Umang-Sharan] 'GET /index/_search' : ce32cf9ddb528d9ed780437f253e60aa 146fb8202065432c CLIENT [tracer: org.opensearch.telemetry:] AttributesMap{data={rest.request_id=14, url.query=pretty=true, rest.raw_path=/index/_search, thread.name=opensearch[Umang-Sharan][transport_worker][T#9]}, capacity=128, totalAddedValues=4}
2024-04-09 17:09:01,980 INFO [2024-04-09T17:09:01,939][INFO ][i.o.e.l.LoggingSpanExporter] [Umang-Sharan] 'GET /index/_search' : ce32cf9ddb528d9ed780437f253e60aa c14a6475adc49376 SERVER [tracer: org.opensearch.telemetry:] AttributesMap{data={trace=true, url.query=pretty=true, http.uri=/index/_search?pretty=true, thread.name=opensearch[Umang-Sharan][transport_worker][T#9], http.method=GET, http.version=HTTP_1_1}, capacity=128, totalAddedValues=6}

Those are the traces themselves.
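
To make those lines easier to read, something like the following could pull the span name, trace ID, span ID and span kind out of each LoggingSpanExporter line. This is only a sketch based on the log format shown above. The function name parse_spans is made up, and the sample line has its attribute map shortened.

```shell
# Sample line copied from the log output above (attribute map shortened).
line="[2024-04-09T17:09:01,939][INFO ][i.o.e.l.LoggingSpanExporter] [Umang-Sharan] 'GET /index/_search' : ce32cf9ddb528d9ed780437f253e60aa c14a6475adc49376 SERVER [tracer: org.opensearch.telemetry:] AttributesMap{...}"

# Hypothetical helper: reads log lines on stdin and prints
# "<span name> | <trace id> | <span id> | <kind>" for each exporter line.
parse_spans() {
  grep 'LoggingSpanExporter' |
    sed -E "s/.*'([^']*)' : ([0-9a-f]{32}) ([0-9a-f]{16}) ([A-Z]+) .*/\1 | \2 | \3 | \4/"
}

parsed=$(printf '%s\n' "$line" | parse_spans)
echo "$parsed"
```

All three lines in the log share the trace ID ce32cf9ddb528d9ed780437f253e60aa, so they are spans of one trace. The same function could be fed the trace log file on stdin; its exact path depends on the install.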

Great, thanks!
Is telemetry.otel.tracer.span.exporter.class=org.opensearch.telemetry.tracing.exporter.OtlpGrpcSpanExporterProvider expected to work? I don’t see it in OpenSearch/plugins/telemetry-otel/src/main/java/org/opensearch/telemetry/tracing/exporter at main · opensearch-project/OpenSearch · GitHub

I’d like to export the OTEL traces to GCP/AWS.

No, I think the class is not there. You could use the OTEL one(s):

telemetry.otel.tracer.span.exporter.class: io.opentelemetry.exporter.otlp.trace.OtlpGrpcSpanExporter

I think it has limitations with respect to endpoints (it uses the default one at localhost).

That should be okay, thanks for the pointer.
I’m able to export spans to otel-collector; however, metrics are also being logged and exported. I have the following cluster settings:

    "telemetry.feature.metrics.enabled": "false",
    "telemetry.feature.tracer.enabled": "true",
    "telemetry.otel.metrics.exporter.class": "io.opentelemetry.exporter.logging.LoggingMetricExporter",
    "telemetry.otel.tracer.span.exporter.class": "io.opentelemetry.exporter.otlp.trace.OtlpGrpcSpanExporter",

My expectation based on these settings is:

  1. No metrics exported through LoggingMetricExporter since telemetry.feature.metrics.enabled is false.
  2. No metrics exported to otel-collector, only traces.

Are these expectations correct? What settings limit otel-collector to traces/spans only?
Sample metrics exporter logs below:

2024-04-10 13:09:40,020 INFO [2024-04-10T13:09:39,975][INFO ][i.o.e.l.LoggingMetricExporter] [Umang-Sharan] Received a collection of 2 metrics for export.
2024-04-10 13:09:40,021 INFO [2024-04-10T13:09:39,976][INFO ][i.o.e.l.LoggingMetricExporter] [Umang-Sharan] metric: ImmutableMetricData{resource=Resource{schemaUrl=null, attributes={service.name="OpenSearch"}}, instrumentationScopeInfo=InstrumentationScopeInfo{name=io.opentelemetry.exporters.otlp-grpc, version=null, schemaUrl=null, attributes={}}, name=otlp.exporter.exported, description=, unit=, type=LONG_SUM, data=ImmutableSumData{points=[ImmutableLongPointData{startEpochNanos=1712778279966623000, epochNanos=1712779779975091000, attributes={success=false, type="span"}, value=3, exemplars=[]}, ImmutableLongPointData{startEpochNanos=1712778279966623000, epochNanos=1712779779975091000, attributes={success=true, type="span"}, value=12, exemplars=[]}], monotonic=true, aggregationTemporality=CUMULATIVE}}
2024-04-10 13:09:40,021 INFO [2024-04-10T13:09:39,976][INFO ][i.o.e.l.LoggingMetricExporter] [Umang-Sharan] metric: ImmutableMetricData{resource=Resource{schemaUrl=null, attributes={service.name="OpenSearch"}}, instrumentationScopeInfo=InstrumentationScopeInfo{name=io.opentelemetry.exporters.otlp-grpc, version=null, schemaUrl=null, attributes={}}, name=otlp.exporter.seen, description=, unit=, type=LONG_SUM, data=ImmutableSumData{points=[ImmutableLongPointData{startEpochNanos=1712778279966623000, epochNanos=1712779779975091000, attributes={type="span"}, value=15, exemplars=[]}], monotonic=true, aggregationTemporality=CUMULATIVE}}

The expectations are correct for OpenSearch, but I think the metrics you see come from OtlpGrpcSpanExporter itself, not from OpenSearch.
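
On the collector side, one way to accept spans only is to define just a traces pipeline. The sketch below writes such a config. It assumes the collector listens on the default OTLP gRPC port 4317 and that the build ships the debug exporter (older releases call it logging). The file name and the COLLECTOR_CONFIG variable are made up for this example.

```shell
# Hypothetical output path; point it wherever the collector reads its config.
COLLECTOR_CONFIG="${COLLECTOR_CONFIG:-./otel-collector-traces-only.yaml}"

# Write a collector config whose only pipeline handles traces.
cat > "$COLLECTOR_CONFIG" <<'EOF'
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
exporters:
  debug:
    verbosity: basic
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [debug]
EOF
```

The collector could then be started with this file (for example, otelcol --config "$COLLECTOR_CONFIG"). To forward spans to GCP or AWS, the debug exporter would be replaced by the matching exporter for that backend.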

Thanks for your help!
Is there a good example showcasing tracing support within other plugins? It’s called out as a tenet in [RFC] Distributed Tracing · Issue #6750 · opensearch-project/OpenSearch · GitHub, but I couldn’t find one.

Sadly, we don’t have many examples yet, but tracing support is rolling out gradually. The only plugin that uses it is NetworkPlugin (it has a Tracer instance pushed down); others will come for sure at some point. Thank you.