Data streaming to OpenSearch from Postgres

How can I stream data from Postgres on RDS to OpenSearch?

Any change in RDS should be reflected in OpenSearch.

The data should be processed continuously as it is generated.

Many tables in RDS map to a single index in OpenSearch with nested fields.

It should be a direct sink between RDS and OpenSearch.

So I am not sure I have any examples I can show but here are some different strategies you could use.

  1. PGSync - This is a project I just came across recently. As I understand, it allows you to sync a Postgres database to OpenSearch. I believe this is something that would be scheduled to run on a regular interval rather than something that is streaming.

  2. Since you mentioned RDS and you want the data streamed, you could use CDC (change data capture) to capture the changes and then push them into OpenSearch. This would likely take a fair amount of engineering work, as you would need to create the mappings and transforms your application needs. These CDC events would probably need to be consumed by a Lambda function or something similar, which would format them and send them to OpenSearch. See: Using change data capture - Amazon Relational Database Service

  3. The last method, which I think I have seen recommended the most, is using a queue service like Kafka to capture the updates. This way Postgres and OpenSearch could subscribe to the events separately. This is better because if either service experiences downtime, ingestion is not blocked for the other.
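
To give a rough idea of the "mappings and transforms" work in option 2, here is a minimal sketch in Python of what the consumer (a Lambda function or similar) might do. Everything specific here is an assumption for illustration: the `orders` and `order_items` tables, the `orders` index name, and the event shape with `op`, `before` and `after` keys (modeled loosely on Debezium-style change events). It folds a parent row and its child rows into one document and emits the NDJSON lines that the OpenSearch `_bulk` API expects. The `items` field would also need to be mapped as `nested` in the index.

```python
import json

INDEX = "orders"  # hypothetical index name


def build_order_doc(order_row, item_rows):
    """Fold one parent row and its child rows into a single document."""
    doc = dict(order_row)
    doc["items"] = [
        # Drop the foreign key; the nesting already expresses the relation.
        {k: v for k, v in item.items() if k != "order_id"}
        for item in item_rows
        if item["order_id"] == order_row["id"]
    ]
    return doc


def bulk_lines(event, item_rows, index=INDEX):
    """Translate one change event into NDJSON lines for the _bulk API."""
    if event["op"] == "d":
        # Assumed envelope: deletes carry the old row under "before".
        action = {"delete": {"_index": index, "_id": str(event["before"]["id"])}}
        return [json.dumps(action)]
    # Inserts and updates both become a full re-index of the document.
    doc = build_order_doc(event["after"], item_rows)
    action = {"index": {"_index": index, "_id": str(doc["id"])}}
    return [json.dumps(action), json.dumps(doc)]


def bulk_body(events, item_rows, index=INDEX):
    """Join the lines for a batch of events; _bulk bodies end with a newline."""
    lines = []
    for event in events:
        lines.extend(bulk_lines(event, item_rows, index))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    events = [
        {"op": "u", "before": None, "after": {"id": 1, "customer": "ada"}},
        {"op": "d", "before": {"id": 2, "customer": "bob"}, "after": None},
    ]
    items = [{"order_id": 1, "sku": "A-1", "qty": 2}]
    print(bulk_body(events, items), end="")
```

In a real setup the child rows would probably be re-read from Postgres when a parent or child row changes, and the resulting body would be POSTed to the `_bulk` endpoint. Re-indexing the whole document on every change is the simple choice here, not necessarily the most efficient one.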

Do you know how to do it with Kafka? @dtaivpp

Yeah, I am not sure I can properly describe it in one go but here is a basic pattern you can use. Kafka receives updates from the client. It stores them in a topic that the Postgres sink connector and the OpenSearch sink connector read from. They in turn put the data into the respective databases.
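
To make the pattern a bit more concrete, here is a toy, in-memory sketch in Python. It does not use Kafka at all: the `Topic` class and `drain` function are hypothetical stand-ins I made up to show the idea that each sink keeps its own read position on the same log, so one sink falling behind does not hold up the other.

```python
class Topic:
    """Append-only log standing in for a Kafka topic (illustration only)."""

    def __init__(self):
        self._log = []
        self._offsets = {}  # one committed offset per consumer group

    def publish(self, record):
        self._log.append(record)

    def poll(self, group, max_records=100):
        # Return records this group has not committed yet.
        start = self._offsets.get(group, 0)
        return self._log[start:start + max_records]

    def commit(self, group, count):
        # Advance only this group's offset; other groups are unaffected.
        self._offsets[group] = self._offsets.get(group, 0) + count


def drain(topic, group, sink):
    """Copy pending records into a sink (a plain list here) and commit."""
    batch = topic.poll(group)
    sink.extend(batch)
    topic.commit(group, len(batch))
    return len(batch)


if __name__ == "__main__":
    topic = Topic()
    postgres_rows, opensearch_docs = [], []

    topic.publish({"id": 1, "name": "first"})
    topic.publish({"id": 2, "name": "second"})

    # The Postgres sink keeps up while the OpenSearch sink is "down".
    drain(topic, "postgres-sink", postgres_rows)

    topic.publish({"id": 3, "name": "third"})

    # When the OpenSearch sink comes back it catches up from its own offset.
    drain(topic, "postgres-sink", postgres_rows)
    drain(topic, "opensearch-sink", opensearch_docs)
    print(postgres_rows == opensearch_docs)
```

With real Kafka the two sink connectors play the role of the two groups, and Kafka Connect handles the polling and offset commits for you.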

Instaclustr has a good article on setting up the Postgres sink connector: Kafka Postgres Connector - Streaming JSON Data using Sink Connectors

Aiven’s OpenSearch sink connector repo talks a bit about setting the OpenSearch one up. GitHub - aiven/opensearch-connector-for-apache-kafka: Aiven's OpenSearch® Connector for Apache Kafka®

Kafka-Postgres-OpenSearch

Thank you for this…

If OpenSearch provided the best data streaming support, it would be a big advantage for OpenSearch, and the number of users would increase…

Because of the streaming gap and the lack of runtime fields in OpenSearch, many people are moving to Elasticsearch…

Can you make a video on this? It would be helpful to many…

Unfortunately, I don’t have time at the moment to make a video but if you were to create one I would happily share it with others.