Bulk operation using Python


I found that the documentation lacks an example of how to use the OpenSearch bulk API, so I followed examples for the Elasticsearch bulk API. I want to confirm that I am doing this correctly.

I am using these parameters for the request. As I understand it, this is the expected structure of logs.

logs = [{'index': {'_index': 'event-logs-2022-06-08'}}, 
{'message': 'Instances: [i-asdaad] have been launched.', 'severity': 'INFO', '@timestamp': '2022-06-08T14:47:15.000Z', 'cluster_name': 'abc-cluster', 'cluster_id': 'xjjsd23'}]

I couldn’t confirm from the library's source code whether we really need to pass index_name to helpers.bulk as a parameter. Doesn’t OpenSearch use the index name from the logs body structure {'index': {'_index': 'event-logs-2022-06-08'}}, as mentioned above?
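For reference, the alternating action/document layout of the logs list above matches the raw _bulk request body, which is newline-delimited JSON where each action line names its own target index. A minimal sketch of building that payload follows; to_ndjson is a hypothetical helper name, and the final call assumes an already configured client and the low-level client.bulk method rather than helpers.bulk.

```python
import json

# Same shape as in the question: an action line followed by its document.
logs = [
    {"index": {"_index": "event-logs-2022-06-08"}},
    {
        "message": "Instances: [i-asdaad] have been launched.",
        "severity": "INFO",
        "@timestamp": "2022-06-08T14:47:15.000Z",
        "cluster_name": "abc-cluster",
        "cluster_id": "xjjsd23",
    },
]


def to_ndjson(lines):
    """Serialize each dict on its own line, ending with a trailing newline."""
    return "".join(json.dumps(line) + "\n" for line in lines)


payload = to_ndjson(logs)

# Assumption: `client` is a configured OpenSearch instance.
# client.bulk(body=payload)
```

With this form the index comes from each action line, so no index argument is needed on the call.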

This is the snippet I am using to do the bulk operation. The issue is that I want index names to include the date, and dates differ between logs. If I pass a static index_name, logs from different dates all go to the same index.

from opensearchpy import OpenSearch, RequestsHttpConnection, helpers

# host, awsauth, logs and index_name are defined elsewhere
client = OpenSearch(
    hosts=[{'host': host, 'port': 443}],
    http_auth=awsauth,
    use_ssl=True,
    verify_certs=True,
    connection_class=RequestsHttpConnection,
)

resp = helpers.bulk(client, logs, index=index_name, max_retries=3)

Is there a better way to do this within the API? One option is to send a separate request for each day, but it looks like OpenSearch can take the index name from the body itself, although I couldn’t find a way to do that.

Hi, I have used the bulk operation in a demo, specifically here: https://github.com/laysauchoa/opensearch-python-dive-in/blob/main/index.py#L43.
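For the per-date index problem in the question, one option is to give every action its own _index key, which helpers.bulk reads per document, so no static index argument is needed. The sketch below is only an illustration under assumptions: index_for, to_actions and send are hypothetical names, the event-logs- prefix is taken from the question, and the timestamp format is assumed to match the sample record.

```python
from datetime import datetime


def index_for(record, prefix="event-logs-"):
    """Derive a daily index name from the record's @timestamp field."""
    ts = datetime.strptime(record["@timestamp"], "%Y-%m-%dT%H:%M:%S.%fZ")
    return prefix + ts.strftime("%Y-%m-%d")


def to_actions(records):
    """Yield one action per record, each carrying its own target index."""
    for record in records:
        yield {"_index": index_for(record), "_source": record}


records = [
    {"message": "Instances: [i-asdaad] have been launched.",
     "severity": "INFO", "@timestamp": "2022-06-08T14:47:15.000Z"},
    {"message": "Second event (made-up sample).",
     "severity": "INFO", "@timestamp": "2022-06-09T01:02:03.000Z"},
]
actions = list(to_actions(records))


def send(client, actions):
    """Send the actions; `client` is assumed to be a configured OpenSearch."""
    from opensearchpy import helpers  # imported here so the sketch runs without it
    return helpers.bulk(client, actions, max_retries=3)
```

Note that, as far as I understand, helpers.bulk expects one dict per document in this shape; the alternating action/document list from the question is the raw _bulk layout and would not be interpreted the same way.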

Also, if you are looking for ways to write search queries with the Python client, you can check this out: Write search queries with Python and OpenSearch® (DEV Community)