Can structured JSON output from containers override or extend the default log fields in OpenSearch?

Hi everyone,

We’re running Node.js containers on Kubernetes (Rancher-managed CaaS shared clusters) with centralized logging to OpenSearch Dashboards. Currently, our container output is captured and indexed with this structure:

{
  "_source": {
    "cluster_name": "...",
    "kubernetes": {
      "container": { "name": "..." },
      "pod": { "name": "..." },
      "deployment": { "name": "..." }
    },
    "namespace": "...",
    "@timestamp": "...",
    "message": "..."
  }
}

Instead of plain text, we want our application to emit structured JSON logs like this:

console.log(JSON.stringify({
  timestamp: "...",
  level: "...",
  category: "...",
  step: "...",
  message: "...",
  environment: "...",
  additionalDetails: "...",
  app: "..."
}));
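For context, here is a runnable sketch of the logger we have in mind (the service name and sample values are hypothetical):

```javascript
// Sketch of the structured logger we want; "my-service" and the
// sample call at the bottom are hypothetical placeholders.
function buildLogEntry(level, category, step, message, additionalDetails = {}) {
  return {
    timestamp: new Date().toISOString(),
    level,
    category,
    step,
    message,
    environment: process.env.NODE_ENV || "development",
    additionalDetails,
    app: "my-service",
  };
}

function log(level, category, step, message, additionalDetails) {
  // One JSON object per line on stdout, so the log collector picks it up as-is.
  console.log(JSON.stringify(buildLogEntry(level, category, step, message, additionalDetails)));
}

log("info", "checkout", "payment", "Payment authorized", { orderId: "A123" });
```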

Our goal is for those fields (level, category, step, message, additionalDetails, environment) to appear as top-level, individually searchable and filterable fields in the OpenSearch document, not buried as a JSON string inside the message field. Is that possible? If so, does it happen automatically, or is there something we need to configure or request from the platform team?

@codesrcdoc That’s a great question. To achieve this you would need to use Data Prepper to parse the log lines as JSON; see the example pipeline.yaml below:

log-pipeline:
  source:
    http:
      ssl: false
      # Keep whatever auth/SSL settings your existing pipeline already has

  processor:
    - parse_json:
        # Most Kubernetes log collectors (Fluent Bit, Fluentd) put the raw
        # container log line in a field called "log". If yours uses a different
        # name, fetch one recent document (e.g. GET app_logs/_search?size=1
        # in Dev Tools) and check which _source key holds the raw line.
        source: log
        delete_source: true

  sink:
    - opensearch:
        hosts: ["https://<your-opensearch-host>:9200"]
        username: <username>
        password: <password>
        index: app_logs
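To illustrate what parse_json does conceptually, here is a plain-Node sketch (not Data Prepper code, and the sample event is hypothetical): the processor parses the JSON string in the source field, merges its keys into the event, and with delete_source: true drops the original field:

```javascript
// Conceptual sketch of the parse_json processor's effect on one event.
function parseJsonField(event, sourceField) {
  const { [sourceField]: raw, ...rest } = event;
  // Keys from the parsed JSON line become top-level fields of the event.
  return { ...rest, ...JSON.parse(raw) };
}

// Hypothetical event as a collector might forward it:
const event = {
  kubernetes: { container: { name: "api" } },
  log: '{"level":"info","category":"checkout","message":"Payment authorized"}',
};

console.log(parseJsonField(event, "log"));
// The "log" field is gone; level, category, and message are now top level.
```

Once those fields are top level in the document, OpenSearch's dynamic mapping will index each of them individually, so they show up as searchable, filterable fields in Dashboards.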

Hope this helps