Inbound security

Hi,

How can I secure the input of Data Prepper?
In other words, how can I make sure that only specific clients are allowed to send data to Data Prepper, while others (i.e., ones that aren't mine) are not?

With kind regards,
Matthijs ter Woord

Hi @mterwoord

By default, the HTTP source is unauthenticated and SSL is disabled. However, you can follow this link to secure the input with HTTP Basic authentication and enable SSL in the Data Prepper 1.2 release.
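
As a rough sketch of what such a configuration might look like, the fragment below enables both SSL and HTTP Basic authentication on the HTTP source. The pipeline name, file paths, and credentials are placeholders, and the option names (ssl, ssl_certificate_file, ssl_key_file) are assumed from the HTTP source README, so please check them against the version you run.

```yaml
# Sketch only: "log-pipeline", the paths, and the credentials are hypothetical.
log-pipeline:
  source:
    http:
      port: 2021
      # TLS settings (option names assumed from the HTTP source README)
      ssl: true
      ssl_certificate_file: "/path/to/certificate.crt"
      ssl_key_file: "/path/to/private.key"
      # Clients must send these credentials using HTTP Basic authentication
      authentication:
        http_basic:
          username: my-user
          password: my_s3cr3t
  sink:
    # Placeholder sink so the fragment is a complete pipeline
    - stdout:
```

Enabling SSL together with Basic authentication matters because the credentials are otherwise sent over the network in an easily decoded form.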

Thanks!

Hello Matthijs,

In the upcoming Data Prepper 1.2, you will be able to configure HTTP Basic authentication with a username and password. This release is not yet available, but I expect it next week. You can configure the OTel Trace Source in your pipeline configuration file with something like the following:

source:
  otel_trace_source:
    authentication:
      http_basic:
        username: my-user
        password: my_s3cr3t

The full documentation is in the OTel Source README.

You will also be able to secure the core API with a similar mechanism as documented here.
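
A minimal sketch of what this could look like in the Data Prepper server configuration file (data-prepper-config.yaml), assuming the core API accepts the same authentication block as the sources. The credentials are placeholders, and the exact keys should be confirmed in the core API documentation.

```yaml
# Sketch only: assumes the core API uses the same http_basic block as the sources.
authentication:
  http_basic:
    username: my-admin-user      # hypothetical value
    password: my_adm1n_s3cr3t    # hypothetical value
```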

These both use the plugin framework and you can add other authentication mechanisms through Java code.

In the current version (1.1), the primary means of security is through network configuration.

David

So the primary means currently is to use a reverse proxy and restrict access there.
I guess that should work. The only thing left would be to make sure that client X cannot send data that looks like data from customer Y. Any suggestions here, or is that not possible?

Matthijs,

So the primary means currently is to use a reverse proxy and restrict access there.
I guess that should work.

Yes, for Data Prepper 1.1, securing it with a reverse proxy is a solution. If HTTP Basic authentication is sufficient for your needs, it will be available in version 1.2 (scheduled for Dec 7th).

The only thing left would be to make sure that client X cannot send data that looks like data from customer Y. Any suggestions here, or is that not possible?

Are you looking for something like RBAC, where different clients are allowed access to different pipelines? You can do something similar by creating a separate pipeline per client and giving each pipeline's OTel Trace Source its own username and password. Presently, Data Prepper does not support multiple sources per pipeline, so the pipelines would have to be duplicated. GitHub issue #406 is a feature request for multiple sources. Please comment on or upvote it if this looks like it would be helpful for your use case.

If you are looking for something else, you can also create a Feature Request in GitHub. I’d like to discuss what your needs are and how Data Prepper can meet them.

We have several applications running in-house, at customers, and in the cloud. Currently we use a custom tool that transforms MS Application Insights requests into documents in Elasticsearch. This tool understands different keys and does some protection. What I'm hoping to be able to do is to create a set of credentials for each (set of) application(s), and have each one only be able to send logging as itself, and not as the others. Also, preferably, the OpenSearch indices would be split per application.

I believe the only solution right now is to create a different pipeline per application. I understand that this is not ideal because you'll be creating extra ports and threads. Would you like to create a feature request? That would help define the exact requirements for creating a solution.
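
To illustrate the pipeline-per-application approach, here is a hedged sketch with two duplicated pipelines, each listening on its own port with its own credentials. The pipeline names, ports, paths, and credentials are all made up for the example, and the stdout sinks are placeholders; in practice each pipeline would point at its own sink, for example a separate OpenSearch index per application.

```yaml
# Sketch only: names, ports, paths, and credentials are hypothetical.
app-a-pipeline:
  source:
    otel_trace_source:
      port: 21890
      ssl: true
      sslKeyCertChainFile: "/path/to/certificate.crt"
      sslKeyFile: "/path/to/private.key"
      authentication:
        http_basic:
          username: app-a
          password: app-a-s3cr3t
  sink:
    - stdout:   # placeholder; replace with the sink for application A

app-b-pipeline:
  source:
    otel_trace_source:
      port: 21891   # each pipeline needs its own port
      ssl: true
      sslKeyCertChainFile: "/path/to/certificate.crt"
      sslKeyFile: "/path/to/private.key"
      authentication:
        http_basic:
          username: app-b
          password: app-b-s3cr3t
  sink:
    - stdout:   # placeholder; replace with the sink for application B
```

With this layout, application A's credentials are only valid on port 21890, so its data can only enter app-a-pipeline and reach that pipeline's sink.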

See https://github.com/opensearch-project/data-prepper/issues/712
If anything is unclear, please do ask :)
