Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):
OpenSearch version 2
Describe the issue:
Hello, I am building an AWS OpenSearch pipeline that reads CSV data from an S3 bucket (using SQS events) and stores it in an OpenSearch Serverless collection. I am using the configuration below, and it works as expected. My problem is that I want to create the index dynamically, based on the file name of the parsed file. I tried passing an additional field in the SQS message, but Data Prepper rejects that field.
My architecture is currently S3 → SQS → OpenSearch Serverless. In the OpenSearch pipeline, the source is S3 (SQS) with a CSV processor, and the OpenSearch collection is the sink.
I was able to use {key}, which is the S3 key, but my key is formatted as dt=2023-10/filename_202310.csv. I just want 202310 as the index. Is there a way to dynamically generate this?
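For illustration only, the mapping from S3 key to index name described above can be sketched in Python. This is not Data Prepper configuration and does not show how the pipeline would do it. The function name index_from_key is hypothetical, and the sketch assumes every key ends in an underscore, six digits, and .csv.

```python
import re


def index_from_key(key: str) -> str:
    """Return the six-digit suffix of the file name in an S3 key.

    Assumes keys look like 'dt=2023-10/filename_202310.csv'.
    """
    # Drop any prefix such as 'dt=2023-10/' and keep only the file name.
    filename = key.rsplit("/", 1)[-1]
    # Capture the six digits between the last underscore and '.csv'.
    match = re.search(r"_(\d{6})\.csv$", filename)
    if match is None:
        raise ValueError(f"no six-digit suffix in key: {key!r}")
    return match.group(1)


if __name__ == "__main__":
    print(index_from_key("dt=2023-10/filename_202310.csv"))  # prints 202310
```

A key that does not match the assumed pattern raises ValueError instead of silently producing a wrong index name.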
Configuration:
Configuration (using data-prepper 2)
version: "2"
log-pipeline:
  source:
    s3:
      codec:
        newline:
      compression: "none"
      aws:
        region: "my-region"
        sts_role_arn: "my-role"
      acknowledgments: true
      scan:
        buckets:
          - bucket:
              name: "my-bucket"
  processor:
    - csv:
        source: "message"
        delimiter: "\t"
        delete_header: false
  sink:
    - opensearch:
        hosts: [ "my-serverless-host" ]
        aws:
          sts_role_arn: "my-role"
          region: "my-region"
          serverless: true
          serverless_options:
            network_policy_name: "my-network-policy"
        index: "vector_index"  # <--- want to make this dynamic, not sure how.
Relevant Logs or Screenshots: