Dates for 3.0.0?

So one of the best things about doing things is you learn things, so the next time you do them you’re better at them.

We just released OpenSearch 2.0 and we learned a couple of things:

1/ Having a long gap between 2.0 release candidate and 2.0 release makes it weird to build features. We didn’t want to put features into 2.0 after the RC cut off, but then we also didn’t want to release things into 1.x that weren’t in 2.0.

2/ Having an RC is good in theory, but we didn’t get a lot of feedback on the RC itself.

3/ Even if we didn’t get user feedback on the RC, we definitely needed to have a hard integration point to add extra testing for breaking changes.

The plan of record right now for OpenSearch 3.0 is to put out an RC in September with a full release in January. Based on what we learned above, I’m leaning towards tightening the time between the RC and the full release to ~4 weeks, with only one minor version between them, and looking for better ways to get feedback on the RC/beta.

But wait, there’s more:

As I started to look ahead to 3.0, I realized that we might not need another major release this year. As a reminder, OpenSearch follows semver, which means we only release breaking changes in major versions. For our minor releases we follow a train model, where we release every 6 weeks or so. When I was planning the year out, I thought that some of the features we were planning for this year would require a major version because they would have breaking changes in them. But as we’ve built them out, we’ve realized that these features are actually additive and will not break any existing functionality. Since upgrading to major versions can be disruptive for people, if we have an opportunity to continue to build on the 2.x platform we should at least consider it.
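To make the semver and train-model rules above concrete, here is a minimal sketch in Python. The function names `next_version` and `train_dates` are made up for illustration and are not part of any OpenSearch tooling; the six-week cadence is taken from the "every 6 weeks or so" wording above, so real release dates will drift from it.

```python
from datetime import date, timedelta


def next_version(current: str, breaking: bool) -> str:
    """Hypothetical helper: pick the next version number under semver.

    A breaking change forces a major bump; anything additive
    rides the next minor release on the train.
    """
    major, minor, _patch = (int(part) for part in current.split("."))
    if breaking:
        return f"{major + 1}.0.0"
    return f"{major}.{minor + 1}.0"


def train_dates(first_release: date, count: int, cadence_weeks: int = 6) -> list[date]:
    """Hypothetical helper: release dates for a fixed-cadence train."""
    return [first_release + timedelta(weeks=cadence_weeks * i) for i in range(count)]


if __name__ == "__main__":
    # Additive features stay on 2.x; only a breaking change needs 3.0.0.
    print(next_version("2.1.0", breaking=False))
    print(next_version("2.1.0", breaking=True))
    # Three trains starting from a June 30, 2022 release.
    for d in train_dates(date(2022, 6, 30), 3):
        print(d.isoformat())
```

The point of the sketch is only that the major number moves when something breaks, not when the calendar says so. If the planned features are additive, the train keeps running on 2.x.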

Open Questions:

If we are going to do a second major release this year:
What do you think the right gap is between the Beta/RC and the final release?
What can we do to get better feedback on the RC/Beta?

If we’re not going to do a second major release this year:
Are there any breaking changes you were planning for 3.0? If so, what are they, and what would the impact be if we moved out our next major release into 2023 (with Lucene 10)?

Are there any knock on effects (like having to continue to support spoofing for older versions) that we haven’t considered that would recommend doing a major release?

Proposed Schedules
The current schedule plan lives here for reference.

If we left 3.0 in as a major release, but shortened the time between the RC and the full release, it might look like this:

| Release Number | Code Freeze Date | Release Date |
|---|---|---|
| 1.3.3 | June 03 2022 | June 09 2022 |
| 2.1 | June 23rd | June 30th |
| 1.3.4 | July 01 2022 | July 07 2022 |
| 2.2 | August 4th | Aug 11th |
| 1.3.5 | August 16th | August 23rd |
| 3.0.0 RC | OpenSearch Core/Dashboards: August 11th; Plugins and Clients: August 18th | September 14th |
| 2.3.1 | September 22nd | September 29th |
| 3.0.0 | September 30 2022 | October 06 2022 |
| 3.1.0 | November 04 2022 | November 10 2022 |
| 1.3.6 | December 1st | December 8th |
| 3.2 | January 10th, 2023 | January 12th 2023 |

If we moved 3.0 out into 2023, things might look like this:

| Release Number | Code Freeze Date | Release Date |
|---|---|---|
| 1.3.3 | June 03 2022 | June 09 2022 |
| 2.1 | June 23rd | June 30th |
| 1.3.4 | July 01 2022 | July 07 2022 |
| 2.2 | August 4th | Aug 11th |
| 1.3.5 | August 16th | August 23rd |
| 2.3 | September 7th | September 14th |
| 1.3.6 | September 30 2022 | October 06 2022 |
| 2.4 | November 04 2022 | November 10 2022 |
| 1.3.7 | December 1st | December 8th |
| 2.5 | January 10th, 2023 | January 12th 2023 |

What do you think?

We will also be discussing this proposal at the Tuesday, June 21st Community Meeting. Hope to see you there.

/C

5 Likes

I prefer the 2nd schedule, as it lets users stay on 2.x and keep getting new features for longer instead of having to upgrade to a new major version. I would also recommend finalizing and publishing the long term support (LTS) plan for both 1.x and 2.x before finalizing the date for 3.x.

3 Likes

@CEHENKLE -

I’m very much in favor of the second one due to the effort in making sure changes don’t break anything. In general this seems like it would always be the better course. I’m specifically considering impact on adoption. The more breaking changes we add, the higher the probability that users’ production instances have to be updated. Custom-made implementations of search will probably require some sweeping up as well, along with all kinds of other potential speed bumps depending on which changes happen to be breaking.

Sure, people don’t have to update if they don’t want to, but if you’re still on 1.x and not even on 2.x before 3.x releases, the slope only gets steeper. Even if it’s actually not exceedingly difficult, the perception of difficulty is there. This will be the proverbial stick in the bike wheel if we want people to be able to adopt OpenSearch seamlessly and make it ubiquitous.

Give users time to try it out in their integration environments and get the newer version deployed before we make their infrastructure incompatible with latest.


However -

If you end up deciding it’s better to choose situation one, I would humbly request that we bias for action on making sure we have documented upgrade procedures as well as clear documentation on making snapshots and protecting your data.

Either course can work, but I think if we were really working backwards from the customer, we’d want to ensure seamless operation for existing environments by breaking as little as possible.

One last Q: I can only think of advantages for the second situation and problems with the first. I’m probably being biased a bit - what do you think some of the advantages are for maintaining the tighter schedule and bumping to 3 earlier?

Thanks for all your work.

1 Like

I’ve come to expect that major version upgrades (at least with ElasticSearch in the past) require lockstep upgrades with client libraries, plugins, and external daemons. Our use case heavily depends on external daemons which often use client libraries that have historically included version checks. It was not clear at the time of the 2.0 RC release whether or not a 1.x client would even work at all. We were more confident that the 2.0 client libraries would work when they were released as well as likely supporting 1.x for a controlled and splayed upgrade. We ultimately upgraded most of the clients to 2.0 (expecting 1.x compatibility) when they became available and tested this configuration. In this testing, we also discovered that logstash-output-opensearch 1.x does not indicate support for 2.0, but it appears to work.

I’m optimistic that cross-version compatibility will be better for OpenSearch than it was for ElasticSearch. Upgrading multiple services simultaneously increases the risk of missing an integration point and at worst causing an outage and/or data loss. For me, indication that the 1.x client libraries were expected to work against the 2.0 RC would have led me to try the 2.0 RC rather than wait for the 2.0 client libraries.

We planned on the 3.0 release to remove version spoofing. Beyond that, my primary concern is to run supported versions that will get security updates.

Speaking of security updates, the release schedule seems to indicate that 1.x will be supported at least until the end of 2022. Does OpenSearch publish (major) version end-of-life dates?

1 Like

100% this right here.

In general I agree and personally prefer longer time between major releases for some of the reasons raised; although I think we should strive to reduce the FUD around major upgrades. The fact there is such a concern is a red flag we should be working to remove.

That being said for 2.0 → 3.0, I prefer the first schedule.

Here’s why: OpenSearch needs to be independent of anything Elastic and the sooner the better.

Elastic has made it clear there will be no support for anything outside Elasticsearch, and has explicit version checks to ensure Elastic-specific compatibility. To give users the easiest migration path possible, OpenSearch 1.x and 2.x carry a lot of backwards compatibility logic with Elasticsearch, plus spoof hacks for compatibility with Elasticsearch clients, and these continue to cause confusion and headaches for users. The sooner we can migrate users away from these hacky integrations, the better the OpenSearch experience will be; and this will be a breaking change that will require a major release.

In parallel we should be reducing the friction for OpenSearch upgrades so major version releases will not be as “disruptive”. Then this wouldn’t be as much of a concern. I think there’s a correlation between this concern and why OpenSearch doesn’t receive much RC feedback. I’m not sure changing the time between RC and GA will make much of a difference. FUD around GA upgrades usually translates to greater FUD around RC upgrades.

1 Like

I like the 2nd schedule better. There are lots of exciting features with breaking changes being baked now. Planning sufficient time will help get more features ready for 3.0.

I think we should change the thinking a bit here. As seen in the segment replication and remote storage development (even at the Lucene level directly), the sandbox module, feature flags, deprecations, legacy inheritance, settings, and reindex are just a few OpenSearch mechanisms that enable us to release “Breaking Change” features at any point without requiring major releases or “sufficient time to bake” features. Contributors shouldn’t feel major-version locked when unleashing a feature, only when promoting that feature to the “default behavior”.
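A rough sketch of the feature-flag idea, in Python for brevity. Everything here is hypothetical: `FeatureFlags`, `choose_replication`, and the flag name `segment_replication` are illustrative stand-ins, not the actual OpenSearch settings or classes. It only shows the shape of the argument: the new behavior ships in a minor release behind an opt-in flag, and the existing behavior stays the default until a major version promotes the new one.

```python
from dataclasses import dataclass, field


@dataclass
class FeatureFlags:
    """Hypothetical flag registry; not an OpenSearch API."""

    enabled: set[str] = field(default_factory=set)

    def is_enabled(self, name: str) -> bool:
        return name in self.enabled


def choose_replication(flags: FeatureFlags) -> str:
    """Return which replication strategy to use (illustrative only)."""
    # The new path is opt-in, so shipping it does not break anyone.
    if flags.is_enabled("segment_replication"):
        return "segment"
    # Existing behavior remains the default until a major release changes it.
    return "document"


if __name__ == "__main__":
    print(choose_replication(FeatureFlags()))
    print(choose_replication(FeatureFlags({"segment_replication"})))
```

Flipping the default inside `choose_replication` is the only step that would count as a breaking change under semver; adding the flagged path does not.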

1 Like

+1 to move 3.0 to 2023. Since Segment Replication and Remote Index can be delivered in minor versions, this also will give more time for developers to leverage the new inclusive usages before removing deprecated usages.

2 Likes

I can definitely see both perspectives, but I lean towards a longer time between majors (option 2). This will help users have more time to ramp onto 2.0 and not have to think through potential breaking changes across two major versions. And +1 to @nknize’s point on not feeling locked on delivering new features. If there are ways to make things additive, but not change the default, we can minimize breaking changes while still helping users adopt new capabilities.

One thing to consider is that OpenSearch Dashboards will need a 3.0 release before April 30th, 2023 in order to upgrade Node.js from v14 to a newer version (v16 or v18) before the end of LTS. Releases | Node.js

2 Likes

I bet we can decouple OpenSearch from OpenSearch Dashboards releases before April 2023. :slight_smile:

2 Likes

I think it would be a big miss to overlook the Elastic compatibility and client issues I linked to, and not consider a 3.0 release sooner rather than later. Maintainers will be the ones pulling the magic to keep the house of cards standing, and users will be the ones continuing to endure pain until they decide to either wait for 3.0 or bail on the project completely. I urge us not to ignore the implementation details here in favor of the simple idea of not having to release another major until next year. Some short term pain just might alleviate long term chronic pain.

Maybe we should put this discussion on GitHub so other maintainers don’t miss out on weighing in before a decision is made without their input?

Another thing to consider. If we release 3.0 in September with Lucene 9.x, the REST Version API, and Elastic 7.x version removal, users can perform a direct 1.x to 3.0 rolling upgrade without having to go to 2.0 first; and client compatibility (including beats and Logstash fixed version compatibility) will be fixed with full bwc support. This means a one time client upgrade, instead of upgrading to a wonky 2.0 first, and then upgrading to a 3.0.

If I’m an end user going through the client compatibility issues we’re seeing today I’d love the option of directly upgrading to 3.0 instead of being forced to upgrade my clients to 2.0 and then again to 3.0 because we waited until Lucene 10.

2 Likes

I would prefer we don’t bake too many exciting new features for too long, and thus would prefer an earlier schedule (version 1). My biggest concern is that by extending the schedule we are just deferring assembling 3.0 till later, instead of getting serious about it in parallel with 2.x. The time to care about 3.0 is now regardless of when it ships.

I am 100% with @nknize on needing to get to a state where we are not constantly introducing breaking changes. We were pretty good with bringing features into 1.x that didn’t break backwards compatibility, and we should continue going the extra mile while building features in a backward-compatible way. Our goal is to get to a state where we can go years without a major release, all while adding a ton of value continuously.

Personally, I was leaning towards the 2nd option before we brought back spoof hacks for compatibility with Elasticsearch clients. Essentially, we still cannot break ties with Elasticsearch, and as @nknize mentioned, the sooner we do that, the better it is going to be for both projects and communities. In this regard, the 1st option is looking like a better path forward now.

1 Like

+1 to all your comments, especially moving the Remote Index work to 2.x

I am concerned with option 1. It seems like we are doing another major version and introducing breaking changes earlier than we need to, for the sake of breaking. I agree with @dblock that we should be assembling 3.0 in parallel with 2.x, but I don’t think we need to push to launch 3.0 early if we have ways to continue to deliver new functionality in 2.x without introducing breaks. That’s why I’m still leaning towards option 2. I think we are asking a lot of end users to plan for multiple major version jumps in a ~1.5 year period if they want the latest features (e.g., 7.10 → OpenSearch 1.x → OpenSearch 2.x → OpenSearch 3.x).

2 Likes

If we release 3.0 in September with Lucene 9.x, the REST Version API, and Elastic 7.x version removal, users can perform a direct 1.x to 3.0 rolling upgrade without having to go to 2.0 first; and client compatibility (including beats and Logstash fixed version compatibility) will be fixed with full bwc support.

That is very useful context! Thanks @nknize! Do you know what the planned breaking changes are that would push the major to version 3?

The removal of backwards compatibility logic with Elasticsearch and spoof hacks for compatibility with Elasticsearch clients.

1 Like