Data Prepper does not pick up changes to the file source at runtime

Data Prepper version: opensearch-data-prepper-jdk-2.8.0-linux-x64

Describe the issue: I have a pipeline configured with a file source and an opensearch sink. Data Prepper starts and reads the source file. If the file is changed later, for example by appending a new line of text, Data Prepper does not pick up the change and does not pass it to OpenSearch.

The new content is picked up only after Data Prepper is restarted.
Is this how it is supposed to work?

Restarting Data Prepper also duplicates the data, because it reads the whole file again.

Please let me know whether events are picked up at runtime, or whether a restart of Data Prepper is required for every change made to the source file.

Configuration:
pipelines.yaml is placed in <home_dir>/pipelines:

sample-pipeline-2:
  delay: "1"
  source:
    file:
      path: "/scratch/Data_prepper/source.txt"
      record_type: event
  sink:
    - opensearch:
        hosts: ["http://host:9200"]
        index: "sample-logs-2"
        bulk_size: 10
        max_retries: 5
        insecure: true

Hi, any updates on this?

Hi

From my experience with Data Prepper (ingestion pipeline in AWS), the moment Data Prepper is activated, it takes a snapshot of your source and migrates it over.

Any new or updated documents are not migrated. You need to start Data Prepper again so that it snapshots the latest state and migrates it.
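
To illustrate the difference between a one-time read and picking up appended lines, here is a minimal Python sketch of offset tracking. This is not how Data Prepper is implemented and not a Data Prepper feature; the function names (`read_new_lines`, `load_offset`, `save_offset`) and the checkpoint file are hypothetical, made up for the example. The idea is that a reader which remembers how far it got can return only the lines added since the last read, and persisting that position avoids re-sending the whole file after a restart.

```python
import os
import tempfile


def read_new_lines(path, offset):
    """Return (complete lines appended after byte `offset`, new offset)."""
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()
    # Consume only complete lines; a partial trailing line waits for the next call.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8").splitlines()
    return lines, offset + end


def load_offset(checkpoint_path):
    """Read the saved byte offset, or 0 if no checkpoint exists yet."""
    try:
        with open(checkpoint_path) as f:
            return int(f.read().strip() or 0)
    except FileNotFoundError:
        return 0


def save_offset(checkpoint_path, offset):
    """Persist the byte offset so a restarted reader can resume from it."""
    with open(checkpoint_path, "w") as f:
        f.write(str(offset))


if __name__ == "__main__":
    # Demo with a throwaway file (hypothetical paths, not the ones from the post).
    workdir = tempfile.mkdtemp()
    source = os.path.join(workdir, "source.txt")
    checkpoint = os.path.join(workdir, "source.offset")

    with open(source, "wb") as f:
        f.write(b"first\n")
    lines, offset = read_new_lines(source, load_offset(checkpoint))
    save_offset(checkpoint, offset)
    print(lines)

    # Append a line, then read again as if after a restart: only the new line comes back.
    with open(source, "ab") as f:
        f.write(b"second\n")
    lines, offset = read_new_lines(source, load_offset(checkpoint))
    save_offset(checkpoint, offset)
    print(lines)
```

A sketch like this ignores file rotation and truncation, so it only covers the simple append-only case described in the question.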
