Spring data integration

I use Spring Data Elasticsearch with an Elasticsearch instance. If I want to switch to OpenSearch, I don’t know whether I can continue to use Spring Data Elasticsearch. Is there another solution for using OpenSearch from Spring at a higher level?
Thank you.

Welcome @giu85 - I moved your post to the OpenSearch category.

You should be able to use Spring Data with OpenSearch. OpenSearch behaves like Elasticsearch 7.10.2, so if that version is supported by Spring Data (which I think it is), then you’re fine. Let us know how it goes or if you run into any issues!

see also:
https://github.com/opensearch-project/OpenSearch/issues/542

A note before my comment: I am the project lead and main maintainer of Spring Data Elasticsearch at the moment.

Spring Data Elasticsearch (SDE) uses the client libraries from Elasticsearch (ES) to connect to an ES (or other) cluster. The last SDE version that uses an ES client no newer than 7.10.2 is SDE 4.1.x, which uses ES library 7.9.3. SDE 4.2 is already on 7.12.1, and the current main branch, which will become SDE 4.3, uses ES 7.13.1.

It should be possible to use the SDE versions that use a 7.12 or 7.13 client with OpenSearch (OS), if OpenSearch is compatible with ES and as long as no functionality is used that was introduced after ES 7.10.

I ran some short tests. For example, OS does not support runtime fields in the index mapping; these were added in ES 7.11 and will be part of SDE 4.3. So although an application might run without problems at first, errors will come up as soon as it uses an SDE feature that OS does not support.

As for the compatibility: One thing that SDE does is retrieve the version of the cluster it is running against. To do this, we call org.elasticsearch.client.RestHighLevelClient.info(), which returns an org.elasticsearch.client.core.MainResponse; that’s basically what a GET / returns.

On ES this will return

{
    "cluster_name": "docker-cluster",
    "cluster_uuid": "HoISicdkQxyWbYBwAPtE8g",
    "name": "33463b03a667",
    "tagline": "You Know, for Search",
    "version": {
        "build_date": "2021-04-20T20:56:39.040728659Z",
        "build_flavor": "default",
        "build_hash": "3186837139b9c6b6d23c3200870651f10d3343b7",
        "build_snapshot": false,
        "build_type": "docker",
        "lucene_version": "8.8.0",
        "minimum_index_compatibility_version": "6.0.0-beta1",
        "minimum_wire_compatibility_version": "6.8.0",
        "number": "7.12.1"
    }
}

On OS we get

{
  "name" : "b156c817d389",
  "cluster_name" : "docker-cluster",
  "cluster_uuid" : "03q29RgHQwicLXU6pLeJvQ",
  "version" : {
    "distribution" : "opensearch",
    "number" : "1.0.0-rc1",
    "build_type" : "tar",
    "build_hash" : "26d579287f50bb33e17c8fe1f05ea208d5c64d1f",
    "build_date" : "2021-05-28T18:18:49.848386Z",
    "build_snapshot" : false,
    "lucene_version" : "8.8.2",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  }
}

Please note that OS does not send the tagline element. The tagline has always been a required field in the ES code that parses the answer into the MainResponse class. This parsing now fails, so that this call fails when accessing an OS instance.
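To illustrate why a missing field makes the whole call fail, here is a minimal sketch of a parser that treats tagline as required. It is not the Elasticsearch client code: the class MainInfo and its methods are made up for this example, and the response body is modeled as a plain Map.

```java
import java.util.Map;

// Hypothetical stand-in for a response parser with a required "tagline" field.
// This is NOT the Elasticsearch client's MainResponse implementation.
final class MainInfo {
    final String clusterName;
    final String versionNumber;
    final String tagline;

    private MainInfo(String clusterName, String versionNumber, String tagline) {
        this.clusterName = clusterName;
        this.versionNumber = versionNumber;
        this.tagline = tagline;
    }

    // Parses the body of a "GET /" response; every field read here is required.
    static MainInfo parse(Map<String, Object> body) {
        String clusterName = required(body, "cluster_name");
        String tagline = required(body, "tagline");
        Object version = body.get("version");
        if (!(version instanceof Map)) {
            throw new IllegalArgumentException("Required [version]");
        }
        String number = required((Map<?, ?>) version, "number");
        return new MainInfo(clusterName, number, tagline);
    }

    // Fails fast when a required field is absent.
    private static String required(Map<?, ?> map, String field) {
        Object value = map.get(field);
        if (value == null) {
            throw new IllegalArgumentException("Required [" + field + "]");
        }
        return value.toString();
    }

    public static void main(String[] args) {
        Map<String, Object> es = Map.of(
                "cluster_name", "docker-cluster",
                "tagline", "You Know, for Search",
                "version", Map.of("number", "7.12.1"));
        Map<String, Object> os = Map.of(
                "cluster_name", "docker-cluster",
                "version", Map.of("number", "1.0.0-rc1"));

        System.out.println(parse(es).versionNumber); // prints 7.12.1
        try {
            parse(os);
        } catch (IllegalArgumentException e) {
            // prints: parse failed: Required [tagline]
            System.out.println("parse failed: " + e.getMessage());
        }
    }
}
```

The point is only that the value of the field is never inspected; its absence alone is enough to reject the response.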

I did not take the time to look for more incompatibilities. I could try to set up our integration tests to run against an OS instance instead of an ES instance, but I can’t tell when I might find time for that.

So as we keep implementing ES features that are still missing from SDE, it may well be that some of these features are not available in OS.

As for future development, Elastic is currently working on a new client that will replace the RestHighLevelClient and that will be licensed with the Apache 2 license (GitHub - elastic/elasticsearch-java: Official Elasticsearch Java Client). When this is released, we will switch to use this client which is built on the Elasticsearch specification (GitHub - elastic/elasticsearch-specification: Elasticsearch full specification).

So as a summary: If OS is compatible with ES 7.10.2 it should not be a problem to use SDE (probably even with the next ES client) as long as no functions in SDE are used that use ES functionality added after 7.10.

I hope this clarifies the situation a bit.

@sothawo Wow - thanks for the very complete write up.

FYI - In 1.0 GA there has been a change merged regarding the version number (#708), so OpenSearch can report as 7.10.2.

With regards to the tagline field - it was removed as part of the debranding. Do you happen to know if the presence of the field is the only relevant bit or does the value also need to remain as it was?

Also - would you be open to contributions to help with OpenSearch compatibility?

As for what value is returned in the version: That is only used to create a warning log entry if the client library used and the cluster differ in at least the minor part, mostly to provide a hint if problems arise.
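As a rough sketch of such a check (my own illustration, not the SDE source; the class and method names are invented), comparing only the major and minor parts of two version strings could look like this:

```java
// Hypothetical helper: decides whether a version-mismatch warning should be logged.
final class VersionCheck {

    // True when the two versions differ in the major or minor part; the patch part is ignored.
    static boolean differsInMajorOrMinor(String clientVersion, String clusterVersion) {
        int[] client = majorMinor(clientVersion);
        int[] cluster = majorMinor(clusterVersion);
        return client[0] != cluster[0] || client[1] != cluster[1];
    }

    // Extracts the first two numeric components of a dotted version such as "7.12.1".
    private static int[] majorMinor(String version) {
        String[] parts = version.split("\\.");
        if (parts.length < 2) {
            throw new IllegalArgumentException("Not a dotted version: " + version);
        }
        return new int[] { Integer.parseInt(parts[0]), Integer.parseInt(parts[1]) };
    }

    public static void main(String[] args) {
        String client = "7.12.1";
        String cluster = "1.0.0-rc1";
        if (differsInMajorOrMinor(client, cluster)) {
            System.out.println("WARN: client " + client + " and cluster " + cluster
                    + " differ in major or minor version");
        }
    }
}
```

With a check of this shape, a cluster reporting 7.12.0 to a 7.12.1 client stays silent, while one reporting 7.10.2 or 1.0.0-rc1 produces the warning; nothing else depends on the value.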

The tagline: I can only go by the code. Both in the current client and in the one for the next client, I can only see that tagline is a required field; there is no reference to its value.

As for OpenSearch compatibility: We will have to change the way SDE accesses the cluster anyway for the next version of the ElasticsearchClient, since we cannot use the same classes to build requests and get responses as we do now. What comes to my mind would be to introduce something like a SearchEngineDriver interface, with implementations for Elasticsearch and OpenSearch. SDE would then use a provided driver (which might even be provided via SPI), and if some functionality is not supported by the driver, it could throw a corresponding exception. That’s just a first idea. Who would then write and contribute this? Well, SDE is a community-driven project, so any contributions are welcome. As for me contributing to this part: as almost all work on SDE is currently done by me in my spare time, I don’t know how much I could do on the programming side besides setting up the basic architecture and providing this interface. I’ll check with the Spring Data project lead what he thinks about such an approach.
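To make the idea a bit more concrete, here is a minimal sketch of what such a driver abstraction might look like. Everything in it is hypothetical: SearchEngineDriver is only the name floated above, and the method names, the two implementations and the use of UnsupportedOperationException are my assumptions, not an existing SDE API.

```java
import java.util.Map;

// Hypothetical driver abstraction; not part of Spring Data Elasticsearch.
interface SearchEngineDriver {

    String engineName();

    // Optional capability: engines that lack it inherit this default and reject the call.
    default void putRuntimeFields(String index, Map<String, String> fields) {
        throw new UnsupportedOperationException(
                engineName() + " does not support runtime fields");
    }
}

final class ElasticsearchDriver implements SearchEngineDriver {
    @Override
    public String engineName() {
        return "Elasticsearch";
    }

    @Override
    public void putRuntimeFields(String index, Map<String, String> fields) {
        // A real driver would send a mapping update to the cluster; the sketch only logs.
        System.out.println("would add runtime fields " + fields.keySet() + " to " + index);
    }
}

final class OpenSearchDriver implements SearchEngineDriver {
    @Override
    public String engineName() {
        return "OpenSearch";
    }
    // putRuntimeFields is not overridden, matching the test result reported above for 1.0.0-rc1.
}

final class DriverDemo {
    public static void main(String[] args) {
        SearchEngineDriver[] drivers = { new ElasticsearchDriver(), new OpenSearchDriver() };
        for (SearchEngineDriver driver : drivers) {
            try {
                driver.putRuntimeFields("my-index", Map.of("day_of_week", "keyword"));
            } catch (UnsupportedOperationException e) {
                System.out.println(e.getMessage());
            }
        }
    }
}
```

If the driver were supplied via SPI, the lookup could go through java.util.ServiceLoader.load(SearchEngineDriver.class) instead of the hard-coded array used here.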

Hi – we’ve opened [BUG] Elasticsearch RestHighLevelClient can not get response of Info API from OpenSearch server due to missing "tagline" field in MainResponse · Issue #901 · opensearch-project/OpenSearch · GitHub to track the issue.

Thanks!
/C

We’ve restored a tagline (Add 'tagline' back to MainResponse in server that was removed in PR #427 by tlfeng · Pull Request #913 · opensearch-project/OpenSearch · GitHub) so you should see it in the response for 1.0.0

Thanks,
/C

:tada: Glad to see this!

[Although it’s functional yet super un-fun - I need to open an issue to “Add whimsy”]

Yeah, I was definitely at war with myself on this one :wink: But hey, we can always change it later :slight_smile:

Hello,
Until support for two client flavors is added to spring-data-elasticsearch, what is the recommended solution for people who use it today and want to use OpenSearch with the latest Spring Boot?

With Spring Boot 2.5.x I have no issue, because it uses spring-data-elasticsearch 4.2.7 and elasticsearch-rest-high-level-client 7.12.

However, when I run my application with Spring Boot 2.6.2 (spring-data-elasticsearch:4.3.0, elasticsearch-rest-high-level-client:7.15.2) against OpenSearch 1.2.3 (the current latest), all queries fail with the following exception:

org.elasticsearch.ElasticsearchException: Elasticsearch version 6 or more is required
	at org.elasticsearch.client.RestHighLevelClient.performClientRequest(RestHighLevelClient.java:2084) ~[elasticsearch-rest-high-level-client-7.15.2.jar:7.15.2]
	at org.elasticsearch.client.RestHighLevelClient.internalPerformRequest(RestHighLevelClient.java:1732) ~[elasticsearch-rest-high-level-client-7.15.2.jar:7.15.2]
	at org.elasticsearch.client.RestHighLevelClient.performRequest(RestHighLevelClient.java:1702) ~[elasticsearch-rest-high-level-client-7.15.2.jar:7.15.2]
	at org.elasticsearch.client.RestHighLevelClient.performRequestAndParseEntity(RestHighLevelClient.java:1672) ~[elasticsearch-rest-high-level-client-7.15.2.jar:7.15.2]
	at org.elasticsearch.client.RestHighLevelClient.index(RestHighLevelClient.java:1029) ~[elasticsearch-rest-high-level-client-7.15.2.jar:7.15.2]

Is there any solution or workaround for this issue?

Hello @orid,

I don’t think this can be solved on the OpenSearch side: the Elasticsearch HLRC was recently hardened [1] to connect to Elasticsearch clusters only (basically a bunch of validations). Possible options, if they work for you, are to downgrade elasticsearch-rest-high-level-client to the 7.12.x line or to use the older 4.2.x release line of Spring Data Elasticsearch; according to the official documentation [2], its compatibility level is the same as that of 4.3.x.
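To give a feel for the kind of validation involved, here is a simplified model of such a product check. It is a sketch under assumptions, not the client's actual code: the class and method are invented, the first branch reuses the message from the stack trace above, and the second branch assumes a check of the X-Elastic-Product response header, going by the title of [1]. The real client has more detailed rules.

```java
import java.util.Optional;

// Simplified, hypothetical model of a client-side "is this really Elasticsearch?" check.
final class ProductCheck {

    // Returns an error message when the server is rejected, or empty when it is accepted.
    static Optional<String> validate(String reportedVersion, String productHeader) {
        int major = Integer.parseInt(reportedVersion.split("\\.")[0]);
        if (major < 6) {
            // Message taken from the stack trace in this thread.
            return Optional.of("Elasticsearch version 6 or more is required");
        }
        // Assumption: the product header must identify the server as Elasticsearch.
        if (!"Elasticsearch".equals(productHeader)) {
            return Optional.of("server did not identify itself as Elasticsearch");
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // An OpenSearch 1.2.3 node that reports its own version number.
        System.out.println(validate("1.2.3", null).orElse("accepted"));
        // An Elasticsearch 7.15.2 node that sends the product header.
        System.out.println(validate("7.15.2", "Elasticsearch").orElse("accepted"));
    }
}
```

Under this model a server reporting version 1.2.3 is turned away by the first branch, before any request-specific logic runs, which would explain why every query fails with the same exception.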

PS: The long-term solution would be official OpenSearch support in Spring Data, but we are not there yet.

Thank you.

[1] Verify that main info response returns correct product headers by swallez · Pull Request #73910 · elastic/elasticsearch · GitHub
[2] Spring Data Elasticsearch - Reference Documentation

Thanks @reta,
I did as you suggested and downgraded the client, and for now it works.
I just hope I won’t need the latest spring-data-elasticsearch (for a bugfix or new feature).

Here is the Gradle configuration I used, in case someone has the same issue:

buildscript {
    ext {
        set('elasticsearch.version', '7.12.1')
    }
}

dependencies {
    compileOnly(
        // downgrade to spring-data-elasticsearch:4.2.7 and avoid the OpenSearch error "Elasticsearch version 6 or more is required"
        "org.springframework.data:spring-data-elasticsearch:4.2.7",
    )
    ...

The same information is also in this GitHub issue:
https://github.com/opensearch-project/OpenSearch/issues/1926

with OpenSearch 2.0 now released and incompatible with the old elasticsearch clients, all spring-data-elasticsearch users are blocked from upgrading to OpenSearch 2.0, as there’s no proper OpenSearch support there yet.

as a re-post from above, this is the tracking ticket to add support there:

as you can see there, @sothawo, who’s the sole maintainer of this library, is hard at work to refactor the library to support the new elasticsearch-java library (as required to support Elasticsearch 8) and this refactoring can then be used as the basis for adding OpenSearch support through opensearch-java.
however, as pointed out in a separate comment on another PR the release where this can happen will only be in autumn:

[…] preparing for the big version jump in autumn With Spring 6, Spring Data 3, Spring Data Elasticsearch in 5 and all based on Java 17.

if the slides posted on this page are still accurate then Spring 6 will only come out in October, which is a very long way away: Spring 6 And Spring Boot 3 - Spring Cloud

this leaves everyone who needs this in a bit of a pickle… i could see the following options on how to progress faster here:

  • temporarily, as a quick-win, fork spring-data-elasticsearch as spring-data-opensearch, replace all usage of the elasticsearch RestHighLevelClient with the OpenSearch one (should mostly be a search & replace thing?) as a stepping stone until the new release of spring-data-elasticsearch is ready, by which time everyone can migrate over again (should then be the same effort as upgrading from spring-data-elasticsearch 4.x to 5.x as all APIs in the fork should stay the same compared to 4.x and no new features should be added)
    • advantages: should be fairly easy to do (not a lot of effort, fast time-to-market)
    • disadvantages
      • consumers have to migrate from spring-data-elasticsearch to spring-data-opensearch now and then back to spring-data-elasticsearch in autumn
      • “throw-away” work as the fork will be abandoned and the (minor) work done there can’t be ported to spring-data-elasticsearch (as there instead the proper work will be done with the new code structure, most likely also based on opensearch-java instead of the RHLC from OpenSearch)
  • add the OpenSearch support already in spring-data-elasticsearch 4.x
    • i’m not sure if this is feasible in any way (probably not, based on what i read there are too many ES dependencies in all the code and changing them is a breaking change)
    • one major issue is the effort here as nobody seems to be helping @sothawo (i shouldn’t be throwing stones while sitting in a glass house, but my excuse is that i don’t have the time right now…)
  • refactor consumers and throw out spring-data-elasticsearch, rely on opensearch-java, the opensearch RHLC client or pure REST calls directly instead
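For the last option, a rough sketch of what a "pure REST" search request could look like with nothing but the JDK's java.net.http API (Java 11+). The base URL http://localhost:9200, the index name my-index and the class name are placeholders i made up; authentication and TLS, which a secured cluster would need, are left out.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper that builds a raw _search request without any client library.
final class RawSearchRequest {

    // Builds a POST {baseUrl}/{index}/_search request with a match_all query body.
    static HttpRequest matchAll(String baseUrl, String index) {
        String body = "{\"query\":{\"match_all\":{}}}";
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index + "/_search"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = matchAll("http://localhost:9200", "my-index");
        // Only the request is built here; nothing is sent over the network.
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Sending it would be a matter of HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()) and parsing the JSON yourself, which is exactly the mapping and repository work that spring-data-elasticsearch otherwise takes off your hands.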

for consumers stuck on Elasticsearch 7.10.2 it’s starting to get more urgent to be able to move to OpenSearch, and some (e.g. we) need features from 2.x, preventing us from going to 1.x for the time being and setting the compatibility flag so that it claims to be ES 7.10.2 instead (to make the client libraries happy).

so now my questions:

  • what are others doing here?
  • @sothawo: do you have an opinion / recommendation?
  • if the first option (temporary throw-away fork) would be the way to go: could we temporarily host this under OpenSearch Project · GitHub so that it has a proper home and is maintained?
  • is there anyone who could help out in the refactoring of spring-data-elasticsearch so that at least it’ll be all ready in October?

Hey @ralph, this is indeed the problem. Let me try to share some thoughts on the matter:

  • temporarily, as a quick-win, fork spring-data-elasticsearch as spring-data-opensearch ,

Spring Data Elasticsearch is a pretty large project. Forking it is an option, but it will unavoidably lead to yet another painful migration at some point, once Spring Data Elasticsearch introduces OpenSearch support.

  • add the OpenSearch support already in spring-data-elasticsearch 4.x

That is possible, but if we do it in a non-breaking fashion, the API would be cumbersome in the sense that the OpenSearch integration would reuse some classes from Elasticsearch. The other way is to forget reuse and just copy and paste everything from Spring Data Elasticsearch to Spring Data OpenSearch (which is similar to forking in some sense).

  • is there anyone who could help out in the refactoring of spring-data-elasticsearch so that at least it’ll be all ready in October?

Yes, I personally proposed to help @sothawo on several occasions but no luck so far. He has a vision and executes on it (he is doing a great job here, I have no intention to mess it up in any way). Still open to help here, I think that would be the best option to move forward.

if the slides posted on this page are still accurate then Spring 6 will only come out in October, which is a very long way away: Spring 6 And Spring Boot 3 - Spring Cloud

This is interesting: Spring Boot 3 is JDK 17 and jakarta.* based. I think OpenSearch 2.x may need some work here (not 100% sure though, I never tried that).

An update for anyone looking here for a Spring Data client for OpenSearch, there is a working client here: GitHub - opensearch-project/spring-data-opensearch