Data Prepper plugin

Hello,

I’m using OpenSearch 1.2.3 with OpenSearch Dashboards 1.2.0 and Data Prepper 1.2.1, all running in Docker containers via docker-compose.

Data Prepper works well with the stdout sink, but I cannot get the opensearch sink to work, with or without SSL.

Here is part of my docker-compose (not ssl):
version: '3'
services:
  opensearch:
    image: sfy-metriks-registry-prod.artifactory/opensearch:${OPENSEARCH_VERSION}
    #image: opensearchproject/opensearch:${OPENSEARCH_VERSION}
    container_name: opensearch
    environment:
      - cluster.name=opensearch
      - node.name=opensearch
      - discovery.seed_hosts=opensearch
      - cluster.initial_master_nodes=opensearch
      - bootstrap.memory_lock=true # along with the memlock settings below, disables swapping
      - "OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx512m" # minimum and maximum Java heap size, recommend setting both to 50% of system RAM
      - "DISABLE_SECURITY_PLUGIN=true"
      - "OPENSEARCH_DEMO=false"
    shm_size: '1Gb'
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536 # maximum number of open files for the OpenSearch user, set to at least 65536 on modern systems
        hard: 65536
    volumes:
      - opensearch-data:/usr/share/opensearch/data
      #- ./opensearch.yml:/usr/share/opensearch/config/opensearch.yml
    ports:
      - 9200:9200
      - 9600:9600 # required for Performance Analyzer
    networks:
      - opensearch-net

  opensearch-pipeline:
    image: opensearchproject/data-prepper:latest
    container_name: opensearch-pipeline
    #restart: unless-stopped
    depends_on:
      - opensearch
    ports:
      - 21890:21890
    volumes:
      - ./pipelines.yaml:/usr/share/data-prepper/pipelines.yaml
      - ./data-prepper-config.yaml:/usr/share/data-prepper/data-prepper-config.yaml
    networks:
      - opensearch-net

volumes:
  opensearch-data:

networks:
  opensearch-net:

And data-prepper-config.yaml:

ssl: false

pipelines.yaml:

simple-sample-pipeline:
  workers: 2
  delay: "5000"
  source:
    random:
  sink:
    - stdout:
    - opensearch:
        hosts: ["http://localhost:9200"]
        index: "logstash-test"

Result is:
opensearch-pipeline | 2022-01-17T09:00:26,255 [main] ERROR com.amazon.dataprepper.plugin.PluginCreator - Encountered exception while instantiating the plugin OpenSearchSink
opensearch-pipeline | java.lang.reflect.InvocationTargetException: null
opensearch-pipeline | at jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:?]
opensearch-pipeline | at jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:64) ~[?:?]
opensearch-pipeline | at jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[?:?]


opensearch-pipeline | at com.amazon.dataprepper.plugin.PluginCreator.newPluginInstance(PluginCreator.java:38) ~[data-prepper.jar:1.2.1]
opensearch-pipeline | … 13 more
opensearch-pipeline | Caused by: java.net.ConnectException: Connection refused
Do you have any idea what is wrong? My configuration seems correct.

Thank you

Hi @comijac,

Can you please make sure OpenSearch is running and try restarting it to check if it resolves the error?

FYI, we have seen this bug before with the OpenSearch 1.0.0 Docker container, and based on the error you posted it may persist in OpenSearch 1.2.3. We have opened an issue for it: "[BUG] Opensearch:1.0.0 docker container run with flaky failure in github CI" (Issue #1350, opensearch-project/OpenSearch).

Thanks

Yes, I’m sure OpenSearch is running.

Running java for data-prepper (as in data-prepper-tar-install.sh) outside a container works well against OpenSearch running in a container.

Creating a container for Data Prepper and running java for data-prepper (as in data-prepper-tar-install.sh) fails.

Creating the same container, but starting it with an infinite sleep loop and then launching java for data-prepper (as in data-prepper-tar-install.sh) manually (as root) inside the container, works well.

Thanks

@comijac

Could you also replace http with https and restart OpenSearch?

- opensearch:
    hosts: ["https://localhost:9200"]
    index: "logstash-test"

Just to clarify, did you mean creating a container for OpenSearch here?

Creating a container for Data Prepper and running java for data-prepper (as in data-prepper-tar-install.sh) fails.

Yes. Both containers are created by docker-compose up.
But when I launch Data Prepper manually inside its container, it works. It only fails when Data Prepper is started by the Docker entrypoint.

As you can see in my docker-compose.yml above, I have 2 containers, one for OpenSearch and one for Data Prepper. Switching to https doesn’t change the problem; I get the same error.
I have another docker-compose file using SSL, but it returns the same error.

Here is my data-prepper command:
set -- gosu $RUN_UID:$RUN_GID java $DATA_PREPPER_JAVA_OPTS -jar $EXECUTABLE_JAR $CONFIG_LOCATION$PIPELINES_FILE_NAME $CONFIG_LOCATION$CONFIG_FILE_NAME
and the result:

2022-01-21T09:13:26,443 [main] ERROR com.amazon.dataprepper.plugin.PluginCreator - Encountered exception while instantiating the plugin OpenSearchSink
opensearch-pipeline | java.lang.reflect.InvocationTargetException: null
opensearch-pipeline | at jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:?]
opensearch-pipeline | at jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source) ~[?:?]
opensearch-pipeline | at jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source) ~[?:?]
opensearch-pipeline | at java.lang.reflect.Constructor.newInstance(Unknown Source) ~[?:?]
opensearch-pipeline | at com.amazon.dataprepper.plugin.PluginCreator.newPluginInstance(PluginCreator.java:38) ~[data-prepper-core-1.2.1.jar:1.2.1]
opensearch-pipeline | at com.amazon.dataprepper.plugin.DefaultPluginFactory.loadPlugin(DefaultPluginFactory.java:66) ~[data-prepper-core-1.2.1.jar:1.2.1]
opensearch-pipeline | at com.amazon.dataprepper.parser.PipelineParser.buildSinkOrConnector(PipelineParser.java:163) ~[data-prepper-core-1.2.1.jar:1.2.1]
opensearch-pipeline | at java.util.stream.ReferencePipeline$3$1.accept(Unknown Source) [?:?]
opensearch-pipeline | at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(Unknown Source) [?:?]
opensearch-pipeline | at java.util.stream.AbstractPipeline.copyInto(Unknown Source) [?:?]
opensearch-pipeline | at java.util.stream.AbstractPipeline.wrapAndCopyInto(Unknown Source) [?:?]
opensearch-pipeline | at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(Unknown Source) [?:?]
opensearch-pipeline | at java.util.stream.AbstractPipeline.evaluate(Unknown Source) [?:?]
opensearch-pipeline | at java.util.stream.ReferencePipeline.collect(Unknown Source) [?:?]
opensearch-pipeline | at com.amazon.dataprepper.parser.PipelineParser.buildPipelineFromConfiguration(PipelineParser.java:105) [data-prepper-core-1.2.1.jar:1.2.1]
opensearch-pipeline | at com.amazon.dataprepper.parser.PipelineParser.parseConfiguration(PipelineParser.java:70) [data-prepper-core-1.2.1.jar:1.2.1]
opensearch-pipeline | at com.amazon.dataprepper.DataPrepper.execute(DataPrepper.java:129) [data-prepper-core-1.2.1.jar:1.2.1]
opensearch-pipeline | at com.amazon.dataprepper.DataPrepperExecute.main(DataPrepperExecute.java:33) [data-prepper-core-1.2.1.jar:1.2.1]
opensearch-pipeline | Caused by: java.lang.RuntimeException: Connection refused

@comijac,

You have your pipeline configured to connect to OpenSearch on localhost. This might not work depending on your Docker networking configuration.

One thing you can try is to see if you can just curl OpenSearch from a Docker container:

Run:

docker run opensearchproject/data-prepper curl https://localhost:9200/ -k -u admin:admin

Does it work?

Here is another possible setup to try:

Run OpenSearch with the container name opensearch like:

docker run --name opensearch -p 9200:9200 -p 9600:9600 -e "discovery.type=single-node" -d opensearchproject/opensearch:latest

Then, you should be able to access it via curl:

docker run --network=container:opensearch opensearchproject/data-prepper curl https://localhost:9200/ -k -u admin:admin

Without SSL (just http), I have 4 containers:

  • opensearch: works well,
  • opensearch-dashboards: works well, connects to opensearch, visualizes dashboards
  • opensearch-logstash: works well, connects to opensearch, creates indexes,
  • opensearch-pipeline (data-prepper): this one reads the content of a json file:
    - when the content goes to stdout: works well
    - when the content goes to an opensearch index: connection refused.

On my local machine (where the 4 containers run), curl http://localhost:9200 works well.
Inside the 4 containers, curl http://opensearch:9200 works well.

But when data-prepper (data-prepper-core-1.2.1.jar) is launched inside the data-prepper container, only stdout works:

sink:
  - stdout:

but the following fails with "connection refused":

sink:
  - opensearch:
      hosts: ["http://opensearch:9200"]
      index: "logstash-test"


I also added the environment setting "discovery.type=single-node" to opensearch.


------------------------------------------------------------------------------------------------------------------

Note that launching data-prepper (data-prepper-core-1.2.1.jar) manually inside the data-prepper container, rather than through the Docker entrypoint, works well (the connection to opensearch is accepted).

Sorry, this can be closed.
The cause: Data Prepper must wait for OpenSearch to start fully before connecting.
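
For anyone hitting the same error: depends_on in the compose file above only orders container start, it does not wait for OpenSearch to be ready. One possible workaround is to delay the Data Prepper start until OpenSearch answers on its HTTP port. Below is a minimal sketch of a retry helper in POSIX shell. The function name wait_for, the attempt count, and the delay are hypothetical choices of mine, not part of Data Prepper, and the usage line assumes curl is available in the Data Prepper container (the docker run ... curl command earlier in this thread suggests it is).

```shell
#!/bin/sh
# Hypothetical helper (not part of Data Prepper): retry a probe command
# until it succeeds or the attempt budget is exhausted.
# Usage: wait_for <max_attempts> <delay_seconds> <command> [args...]
wait_for() {
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  # Run the probe quietly; keep looping while it fails.
  while ! "$@" >/dev/null 2>&1; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "wait_for: gave up after $attempt attempts" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  return 0
}

# Assumed usage in a wrapper entrypoint, before the java command is started:
#   wait_for 30 2 curl -sf http://opensearch:9200/ || exit 1
```

An answer on the HTTP port is only an approximation of "fully started"; a stricter probe command could be passed to the same helper if that turns out not to be enough.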


Thank you for the update @comijac.

I’m sorry that you had some difficulty getting started with Data Prepper. I created GitHub issue #936 to consider a better approach such that Data Prepper doesn’t fail immediately.

I’d certainly appreciate it if you added any comments about your experience here or other feedback on the issue.