I'm trying to set up an OpenSearch sync pipeline on AWS. Right now, my pipeline config (which moves data from DynamoDB to OpenSearch) looks like this:
version: "2"
dynamodb-pipeline:
  source:
    dynamodb:
      acknowledgments: true
      tables:
        - table_arn: <>
          stream:
            start_position: LATEST
          export:
            s3_bucket: <>
            s3_region: us-east-1
            s3_prefix: <>
      aws:
        sts_role_arn: <>
        region: us-east-1
  sink:
    - opensearch:
        hosts:
          - https://dummy.amazonaws.com
        index: index
        template_content: |
          {"settings":{"analysis":{"analyzer":{"autocomplete_analyzer":{"tokenizer":"standard","type":"custom","filter":["lowercase","autocomplete_filter"]}},"filter":{"autocomplete_filter":{"type":"edge_ngram","min_gram":2,"max_gram":20}}}},"mappings":{"properties":{"name":{"type":"text","analyzer":"autocomplete_analyzer","search_analyzer":"standard"},"summary":{"type":"text","analyzer":"autocomplete_analyzer","search_analyzer":"standard"},"category":{"type":"keyword"}}}}
        index_type: custom
        document_id: ${getMetadata("primary_key")}
        action: ${getMetadata("opensearch_action")}
        document_version: ${getMetadata("document_version")}
        document_version_type: external
        aws:
          sts_role_arn: <role>
          region: us-east-1
I don't want everything from DynamoDB. Instead, I want only the items whose PK has the value GAMESEARCH to be indexed (the entire DynamoDB item for each matching PK), and those are the items I need to specify the OpenSearch mapping template for.
Right now the config above tries to copy everything from DynamoDB, which I don't want. Any thoughts?