OpenSearch Client Java & Python APIs

asfoorial · April 16, 2021, 12:17am

Good day everyone,

Is there any plan to provide or support the Elasticsearch Java High Level REST Client and Python APIs?

Will currently written code that uses these APIs break now that the elasticsearch namespaces have changed to reflect OpenSearch?

Will there be such APIs for OpenSearch maintained in repositories such as Maven Central and PyPI?

Regards,
Hasan

aparo · April 16, 2021, 8:26am

@asfoorial I’m working on a client code generator in my spare time.
I started a fork of the Elasticsearch code generator here: GitHub - aparo/opensearch-client-generator: OpenDistro Client code generator to be used with Elasticsearch. I’m now working on providing an OpenAPI descriptor.
My task list is the following:

- OpenAPI descriptor
- Python (typed) client
- Java/Scala client

My idea is also to provide scripts (Python or similar) to automatically migrate code from Elasticsearch to OpenSearch.
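
For illustration only, here is a minimal sketch of what such a migration script could look like. It assumes the OpenSearch package would be called `opensearchpy` and the client class `OpenSearch`; both names are assumptions made for this example and are not confirmed anywhere in this thread.

```python
import re

# Assumed target names, for illustration only (not confirmed in this thread).
NEW_MODULE = "opensearchpy"
NEW_CLASS = "OpenSearch"

# Matches "import elasticsearch" / "from elasticsearch..." at the start of a line.
_IMPORT_RE = re.compile(r"^([ \t]*)(from|import)[ \t]+elasticsearch\b", re.MULTILINE)
# Matches the client class name as a whole word (case-sensitive).
_CLASS_RE = re.compile(r"\bElasticsearch\b")


def migrate_source(source: str) -> str:
    """Rewrite imports and the client class name in one Python source string."""
    source = _IMPORT_RE.sub(
        lambda m: f"{m.group(1)}{m.group(2)} {NEW_MODULE}", source
    )
    return _CLASS_RE.sub(NEW_CLASS, source)


example = "from elasticsearch import Elasticsearch\nes = Elasticsearch()\n"
print(migrate_source(example))
```

This is purely textual, so it would also rewrite the word "Elasticsearch" inside strings and comments; a real tool would probably want to work on the syntax tree instead.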

asfoorial · April 16, 2021, 10:05pm

@aparo This is good news. However, does it maintain the same Java/Python APIs and signatures? In other words, if we have existing code that uses current elasticsearch Java/Python APIs would it continue to work if we use these new OpenSearch Java/Python clients?

Thanks,
Hasan

aparo · April 17, 2021, 4:09pm

@asfoorial I will try to keep the API as it is, but the Python ones are missing types and, in some cases, parameters as well.

erickg · May 29, 2021, 7:59pm

@aparo I tried to generate a Python client from the OpenSearch repo. It was successful. I believe types can be added, but typing depends on which Python versions the package should support. I would recommend publishing compatibility requirements. IMO, I would drop Python 2 support completely and require 3.7+, since aiohttp supports 3.7+.

The documentation still links to elastic.co. If that is left unchanged, I believe it will be confusing for client users when OpenSearch is released.

ralph · May 31, 2021, 7:37am

Python 2 has been dead since last year: PEP 373 – Python 2.7 Release Schedule | peps.python.org (well, it’s been dead for much longer, really, but last year marked the official EOL for it)

searchymcsearchface · May 31, 2021, 2:01pm

Can I just say how happy I am that the Python 2/3 thing can be history now?

asfoorial · May 31, 2021, 3:34pm

@erickg

Excellent.

Could you please share the code of the generated Python API?

Thanks

erickg · May 31, 2021, 7:13pm

@ralph elasticsearch-py still declares support for Python 2, as well as 3.4, 3.5, and 3.6, in its manifest.

@searchymcsearchface Yep, me too.

@asfoorial The code comes from elasticsearch-py. I changed utils/generate-api.py to get the client code.

searchymcsearchface · May 31, 2021, 7:39pm

Niiiice. Looks like a lot of manual work still needs to be done, but it’s a start.

erickg · June 1, 2021, 5:30pm

Before discussing the work, I need some help with the license. The README.md states “Copyright 2021 Elasticsearch B.V. Licensed under the Apache License, Version 2.0.” The LICENSE file says Apache License. This is confusing to me.

searchymcsearchface · June 1, 2021, 6:05pm

Let me try and find someone to help you!

spotfoss · June 1, 2021, 6:18pm

Disclaimer: This is not legal advice, I am not a lawyer. I have worked in the open source legal space for 15+ years and worked closely with Red Hat Legal in my previous job.

@erickg, what you see in the README.md is what I would expect to see when you fork another project. They are attributing the copyright holder on the original work, and documenting the license that the work is under.

As you add/modify code to your fork, their copyright statement (and license for their copyrighted changes) still apply, so you need to be sure to retain that attribution (in README.md and wherever it appears in the code files, probably in the comment header). What you can do is append your own copyright statement, like this:

Copyright 2021 Elasticsearch B.V. 
Copyright 2021 ErickG

Licensed under the Apache License, Version 2.0

You can definitely do this in README.md without issue, and you can make this change to any source files you modify. If you create entirely new files (that do not copy content from existing files), you do not need to include the Elasticsearch copyright attribution statement.

I am assuming, for simplicity, that your fork intends to keep the Apache License, Version 2.0, that you inherited from the upstream fork. It is possible for your changes to be under a different license, but it complicates things (including my answer) quite a bit, so my advice to you would be to keep your fork Apache 2.0.

If you have additional Copyright or License questions, please feel free to ask me, and I will do my best to help you.

erickg · June 2, 2021, 6:05am

Thanks, that’s helpful. Of course, I would love to keep the Apache License.

I have a question about the license of the source repo, elasticsearch-py. It says:

Copyright 2021 Elasticsearch B.V. Licensed under the Apache License, Version 2.0.

This doesn’t seem to align with the APL declaration in elasticsearch-py/LICENSE at master · elastic/elasticsearch-py (github.com). Does that mean the code is not APL anymore?

searchymcsearchface · June 3, 2021, 2:20pm

I’m not sure I follow 100% - copyright and license are two separate things. The repo you linked looks like Apache to me.

@spotfoss Thoughts?

spotfoss · June 3, 2021, 2:47pm

The file that you linked to is a copy of the Apache License 2.0. I am also unsure where your confusion is coming from, as this matches the statement in README.md.

aparo · June 3, 2021, 4:19pm

IMHO, the current Python API sucks because it is designed in a very legacy 2.x style.
It doesn’t follow the current approach of using typed Python for methods and objects.
It’s OK to maintain it for existing code, but for new code it would be better to move to a more modern Python approach.
The same goes for the Java API: the High-Level client has a poor entity model design.

erickg · June 3, 2021, 5:18pm

Thanks. It makes sense to me now. So the library can be licensed under the Apache License 2.0, with an updated copyright holder as it moves on.

erickg · June 3, 2021, 5:21pm

I’d be more than happy to be able to use features from Python 3.6 onwards if possible. To start, I would aim for being able to make API requests to OpenSearch 1.0 without issues.

erickg · June 10, 2021, 10:33pm

I spent some time looking at the Python client codebase. There are various hacks in place to make it work as of today.
Since elasticsearch-py 7.* should be able to work with OpenSearch 1.0, I think it doesn’t matter if I break compatibility between the two client projects. I aim to build the client to talk to OpenSearch 1.x and to drop Elasticsearch compatibility where keeping it is not possible. Correct me if I misunderstood.

I see some areas to improve:

- Use AST to generate functions instead of using templates.
- See if I can compose API classes after the above point.
- Then the package can go for native type hints.
  - Start by supporting Python 3.6 at least.
- Change API function signatures to requests style. Basically, drop query_params support, which extracts kwargs into params. What confused me the most was why the client works differently from HTTP requests in the Kibana console.
  - Then I don’t need to rename type to doc_type for URL query parameters.
- Drop the X-Pack API.
- Annotate network modules.
- A lot of renaming in comments and documentation.
- Type hints.
- Changes with OpenSearch:
  - Bulk ingestion errors are very difficult to identify.
  - lz4 compression (this needs cluster support).
  - Probably better to have a pipeline with OpenSearch sooner rather than later. Some elasticsearch-py tests run in a Jenkins pipeline and require a cluster.

What do you think is important for you? Any other thoughts?
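
To illustrate the signature change proposed above, here is a simplified comparison of the two calling styles. The `query_params` decorator below is a stripped-down stand-in written for this post, not the real elasticsearch-py implementation, and the function names and URL path are hypothetical.

```python
from functools import wraps


def query_params(*allowed):
    """Simplified stand-in: move whitelisted kwargs into a `params` dict."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            params = dict(kwargs.pop("params", None) or {})
            for name in allowed:
                if name in kwargs:
                    params[name] = kwargs.pop(name)
            return func(*args, params=params, **kwargs)
        return wrapper
    return decorator


@query_params("doc_type", "refresh")
def index_kwargs_style(index, body, params=None):
    # `type` shadows a builtin, so callers spell it `doc_type`
    # and it has to be renamed back before it goes on the URL.
    if "doc_type" in params:
        params["type"] = params.pop("doc_type")
    return ("POST", f"/{index}/_doc", params, body)


def index_requests_style(index, body, params=None):
    """Requests style: query parameters are passed through as a plain dict."""
    return ("POST", f"/{index}/_doc", dict(params or {}), body)


# Both calls describe the same HTTP request.
print(index_kwargs_style("books", {"title": "x"}, refresh="true", doc_type="book"))
print(index_requests_style("books", {"title": "x"}, params={"refresh": "true", "type": "book"}))
```

In the second style the parameter names match what you would type in the Kibana console, so no renaming is needed.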
