Disclaimer: This is not legal advice, and I am not a lawyer. I have worked in the open source legal space for 15+ years and worked closely with Red Hat Legal in my previous job.
@erickg, what you see in the README.md is what I would expect to see when you fork another project. They are attributing the copyright holder on the original work, and documenting the license that the work is under.
As you add/modify code in your fork, their copyright statement (and the license for their copyrighted changes) still applies, so you need to be sure to retain that attribution (in README.md and wherever it appears in the code files, probably in the comment header). What you can do is append your own copyright statement, like this:
Copyright 2021 Elasticsearch B.V.
Copyright 2021 ErickG
Licensed under the Apache License, Version 2.0
You can definitely do this in README.md without issue, and you can make this change to any source files you modify. If you create entirely new files (that do not copy content from existing files), you do not need to include the Elasticsearch copyright attribution statement.
I am assuming, for simplicity, that your fork intends to keep the Apache License, Version 2.0, that you inherited from the upstream project. It is possible for your changes to be under a different license, but it complicates things (including my answer) quite a bit, so my advice to you would be to keep your fork Apache 2.0.
If you have additional Copyright or License questions, please feel free to ask me, and I will do my best to help you.
The file that you linked to is a copy of the Apache License 2.0. I am also unsure where your confusion is coming from, as this matches the statement in README.md.
IMHO, the current Python API is poor because its design is a legacy of Python 2.x.
It doesn't follow the current approach of using typed Python for methods and objects.
It's fine to maintain it for existing code, but new code would be better off moving to a more modern Python approach.
The same goes for the Java API: the High-Level client has a poor entity model design.
I’d be more than happy to be able to use features from Python 3.6 onwards if possible. To start, I would aim for being able to make API requests to OpenSearch 1.0 without issues.
I spent some time looking at the Python client codebase. It relies on various hacks to work as it does today.
Since elasticsearch-py 7.* should be able to work with OpenSearch 1.0, I don’t think it matters if I break compatibility between the two client projects. I aim to build a client that talks to OpenSearch 1.x and drops Elasticsearch compatibility where keeping it is not possible. Correct me if I have misunderstood.
I see some areas to improve:
Use AST to generate functions instead of using templates.
See if I can compose API classes after the above point.
Then the package can go for native typing hints.
Support Python 3.6 at minimum.
Change API function signatures to the requests style. Basically, drop query_params support, which extracts kwargs into params. That was what confused me most: the client works differently from the HTTP requests you write in the Kibana console.
Then I don’t need to rename type to doc_type for URL query parameters.
Drop XPack API
Annotate network modules
A lot of renaming in comments and documentation
Type hints.
Changes with OpenSearch
Bulk ingestion errors are very difficult to diagnose
lz4 compression (this needs cluster support)
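To make the signature item in the list above concrete, here is a minimal sketch contrasting the two calling styles. Both functions are hypothetical stand-ins, not the real elasticsearch-py API: `legacy_search` imitates how a `query_params`-style decorator lifts selected keyword arguments into URL parameters, while `search` takes `params` and `body` explicitly, the way `requests` does, and carries type hints.

```python
from typing import Any, Callable, Dict, Optional, Tuple

# (HTTP method, URL path, query parameters, JSON body)
Request = Tuple[str, str, Dict[str, Any], Optional[Dict[str, Any]]]


def query_params(*allowed: str) -> Callable:
    """Hypothetical imitation of a decorator that moves kwargs into params."""
    def decorator(func: Callable) -> Callable:
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            params = dict(kwargs.pop("params", None) or {})
            for name in allowed:
                if name in kwargs:
                    params[name] = kwargs.pop(name)
            return func(*args, params=params, **kwargs)
        return wrapper
    return decorator


@query_params("size", "timeout")
def legacy_search(index: str, body: Optional[Dict[str, Any]] = None,
                  params: Optional[Dict[str, Any]] = None) -> Request:
    # Callers write legacy_search("logs", size=10); the decorator hides
    # which keyword arguments end up in the URL query string.
    return ("POST", f"/{index}/_search", params or {}, body)


def search(index: str, *, params: Optional[Dict[str, Any]] = None,
           body: Optional[Dict[str, Any]] = None) -> Request:
    # requests style: URL parameters and the JSON body are passed explicitly.
    return ("POST", f"/{index}/_search", dict(params or {}), body)
```

With the explicit style, a call such as `search("logs", params={"size": 10}, body=query)` maps one-to-one onto the request you would type in a console, which is the point of the proposed change.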
It is probably better to have a pipeline with OpenSearch sooner rather than later. Some elasticsearch-py tests run as part of a Jenkins pipeline and require a cluster.
What do you think is important for you? Any other thoughts?
That looks like a great list. A few things to consider:
Keep an eye on the project roadmap. The project uses semantic versioning, so no breaking changes are expected in 1.x but at 2.0 you’ll start to see some breaking changes (as an example, master → main or whatever the new term ends up being).
The OpenSearch plugins could probably have a native API in the client
I’m not super familiar with this aspect of the python client, but having extensibility for additional plugins would be really helpful.
Hmm, maybe I can wait for a client build from you. I can check what features or transport layer are missing from the generated client.
@searchymcsearchface Most of the client code is generated from a set of spec files from the upstream build.
The OpenSearch plugins could probably have a native API in the client
I’m not super familiar with this aspect of the python client, but having extensibility for additional plugins would be really helpful.
As long as the spec is generated somewhere, a generator can produce client code from it. I am not sure how the spec is generated or written. It seems to belong in the OpenSearch repo.
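As a rough illustration of producing client code from a spec, the sketch below builds request functions from a small dictionary. The spec layout (`SPEC`, with `method` and `path` keys) and the helper `make_endpoint` are assumptions made up for this example; the real spec files use their own format, and a real generator would more likely emit source code than closures.

```python
from typing import Any, Callable, Dict, Optional, Tuple

# Hypothetical, simplified spec; not the format of the real spec files.
SPEC: Dict[str, Dict[str, str]] = {
    "indices_get": {"method": "GET", "path": "/{index}"},
    "search": {"method": "POST", "path": "/{index}/_search"},
}

# (HTTP method, URL path, query parameters, JSON body)
Request = Tuple[str, str, Dict[str, Any], Optional[Dict[str, Any]]]


def make_endpoint(name: str, spec: Dict[str, str]) -> Callable[..., Request]:
    """Build one request-building function from a spec entry."""
    def endpoint(*, params: Optional[Dict[str, Any]] = None,
                 body: Optional[Dict[str, Any]] = None,
                 **path_parts: str) -> Request:
        # Fill the path template with the keyword arguments that remain.
        url = spec["path"].format(**path_parts)
        return (spec["method"], url, dict(params or {}), body)
    endpoint.__name__ = name
    return endpoint


# One callable per spec entry, keyed by API name.
api = {name: make_endpoint(name, entry) for name, entry in SPEC.items()}
```

Under these assumptions, a plugin would only need to contribute its own spec entries to get client functions, for example `api["search"](index="logs", params={"size": 5})`.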
Keep an eye on the project roadmap. The project uses semantic versioning, so no breaking changes are expected in 1.x but at 2.0 you’ll start to see some breaking changes (as an example, master → main or whatever the new term ends up being).
I think @aparo and I are both trying not to break the ability to talk to an OpenSearch cluster. But we do want to ask users to change their code to use the new client library. As for how much change users have to make, that can be discussed. All in all, elasticsearch-py 7.x should be able to talk to OpenSearch from day 1. That sounds like a good enough compromise.
Is there anyone looking into also migrating the Elasticsearch DSL Python library yet?
The Elasticsearch Python client just got a product check built-in, that will even prevent v7.14+ from being used with e.g. the 7.10 oss-distribution, hence forcing people to use 7.11 and thereby Elastic License v2…
Yep. Work is happening, but it’s going to be a few weeks until the client libs are public (renaming and updating license headers requires both engineering and legal review). Yesterday the first non-AWS committer was invited to a client lib, which should help the velocity.
Until then, use the client version right before the check was implemented (usually the highest 7.13.x); see the Compatibility page in the OpenSearch documentation.
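If you want to guard against accidentally upgrading past that point, a small check like the following could work. It is only a sketch: `predates_product_check` is a hypothetical helper, and the 7.14 cutoff is taken from the discussion above rather than verified against release notes.

```python
from typing import Tuple

# Cutoff taken from this thread: the product check reportedly arrived in 7.14.
PRODUCT_CHECK_VERSION: Tuple[int, int] = (7, 14)


def predates_product_check(version: str) -> bool:
    """Return True if a client version string is older than the cutoff."""
    # Only the major and minor components matter for this comparison.
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) < PRODUCT_CHECK_VERSION
```

In practice the same constraint is usually expressed as a version pin in the project's dependency file; the function is just a way to assert it at startup or in a test.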