OpenSearch Lucene Study Group Meeting - Monday, December 4th

Sign up to join the meeting at Meetup:

Link to previous meeting’s post: OpenSearch Lucene Study Group Meeting - Monday, November 27th

Welcome to the OpenSearch Lucene Study Group!

Apache Lucene is the open-source search library that powers OpenSearch and many other search applications, large and small.

Based on last week’s meeting, we’re moving the learning series part earlier in the agenda, since most participants said they were attending in order to learn more about Lucene.

In the second half of the meeting, we will review recent developments in Apache Lucene and discuss their potential impact on OpenSearch, with a particular focus on new and exciting Lucene features that we can (and should) expose through OpenSearch. Since some changes require a deep dive to fully understand, we will ask participants to volunteer for “homework” to dig deeper into changes and report back at the next meeting.

Standing Agenda:

  • Welcome / introduction (5 minutes)
  • Lucene learning series - someone will either present a Lucene-related talk or we will do Lucene Q&A (20 minutes, recorded)
  • Review assigned issues from last time (10 minutes)
  • Review new Lucene changes and assign homework (20 minutes)

By joining the OpenSearch Lucene Study Group Meeting, you grant OpenSearch and our affiliates the right to record, film, photograph, and capture your voice and image during the OpenSearch Community Meeting (the “Recordings”). You grant to us an irrevocable, nonexclusive, perpetual, worldwide, royalty-free right and license to use, reproduce, modify, distribute, and translate, for any purpose, all or any part of the Recordings and Your Materials. For example, we may distribute Recordings or snippets of Recordings via our social media outlets.

There have not been many changes in Lucene 9.9 since last week, as a release branch was cut. So, I’ve decided to include changes across all versions (including a change that was previously missing from CHANGES.txt for 9.8.0).

  • Lucene 10.0.0, API Changes: Deprecate IndexSearcher#search(Query, Collector) in favor of IndexSearcher#search(Query, CollectorManager) for TopFieldCollectorManager and TopScoreDocCollectorManager.
  • Lucene 10.0.0, API Changes: Adding -level param to CheckIndex, making the old -fast param the default behaviour.
  • Lucene 10.0.0, Other: Fixing bug in UnescapedCharSequence#toStringEscaped().
  • Lucene 9.10.0, Bug Fixes: Prevent extra similarity computation for single-level HNSW graphs.
  • Lucene 9.9.0, API Changes: GITHUB#12803: Remove FST constructor with DataInput for metadata. Please use the constructor with FSTMetadata instead.
  • Lucene 9.9.0, Improvements: NodeHash now stores the FST nodes data instead of just node addresses.
  • Lucene 9.9.0, Improvements: Test2BFST now reports the time it took to build the FST and the real FST size.
  • Lucene 9.8.0, New Features: `WordDelimiterGraphFilterFactory` now supports the `ignoreKeywords` flag.
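
For anyone who wants background on the first item before the meeting, here is a rough sketch of the idea behind the Collector to CollectorManager move. In Lucene, a CollectorManager hands out one collector per index slice via newCollector() and merges the partial results via reduce(), which is what lets IndexSearcher search slices concurrently. The classes below (TopKCollector, TopKManager, CollectorManagerSketch) are hypothetical stand-ins written against the JDK only. They are not Lucene's classes, and the slices are searched sequentially here to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class CollectorManagerSketch {

    /** Hypothetical per-slice collector: keeps the k highest scores it has seen. */
    static final class TopKCollector {
        private final int k;
        // Min-heap, so the smallest retained score is evicted first.
        private final PriorityQueue<Float> heap = new PriorityQueue<>();

        TopKCollector(int k) {
            this.k = k;
        }

        void collect(float score) {
            heap.offer(score);
            if (heap.size() > k) {
                heap.poll(); // drop the current minimum
            }
        }

        List<Float> scores() {
            return new ArrayList<>(heap);
        }
    }

    /** Hypothetical manager: creates one collector per slice and merges their results. */
    static final class TopKManager {
        private final int k;

        TopKManager(int k) {
            this.k = k;
        }

        TopKCollector newCollector() {
            return new TopKCollector(k);
        }

        List<Float> reduce(Collection<TopKCollector> collectors) {
            List<Float> all = new ArrayList<>();
            for (TopKCollector c : collectors) {
                all.addAll(c.scores());
            }
            all.sort(Comparator.reverseOrder());
            return new ArrayList<>(all.subList(0, Math.min(k, all.size())));
        }
    }

    /** Gives each slice its own collector, then merges the partial top-k lists. */
    static List<Float> search(List<float[]> slices, TopKManager manager) {
        List<TopKCollector> collectors = new ArrayList<>();
        for (float[] slice : slices) {
            TopKCollector c = manager.newCollector(); // no state shared across slices
            for (float score : slice) {
                c.collect(score);
            }
            collectors.add(c);
        }
        return manager.reduce(collectors);
    }

    public static void main(String[] args) {
        List<float[]> slices =
            List.of(new float[] {0.2f, 0.9f, 0.5f}, new float[] {0.7f, 0.1f});
        System.out.println(search(slices, new TopKManager(2))); // prints [0.9, 0.7]
    }
}
```

Because each collector only ever sees one slice, no collector needs to be thread-safe. Only the reduce step sees all of them. A single shared Collector cannot offer that, which is my understanding of the motivation for the deprecation.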