I’m using OpenSearch 2.4.1 with opensearch-java 2.1.0.
I want to use OpenSearch on a website to run a full-text search over the title and pageText fields, using wildcards so that parts of words also match (is there another way?), and to filter by the field type (which can have the values page, product, or download). The results should include an excerpt of the page where the search term was found, with highlighting.
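On the "is there another way?" part: one alternative that is often suggested instead of leading and trailing wildcards is to index partial words up front with OpenSearch's `edge_ngram` (prefixes only) or `ngram` (any substring) token filters, so that a plain match query can find them. The sketch below is not OpenSearch code. It only illustrates, in plain Java, which terms an edge n-gram filter with `min_gram` 3 and `max_gram` 5 would emit for a single token. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class EdgeNGramSketch {

    // Returns the prefixes of `token` whose length lies in [minGram, maxGram],
    // mimicking what an edge n-gram token filter would emit for one token.
    static List<String> edgeNGrams(String token, int minGram, int maxGram) {
        List<String> grams = new ArrayList<>();
        int upper = Math.min(maxGram, token.length());
        for (int len = minGram; len <= upper; len++) {
            grams.add(token.substring(0, len));
        }
        return grams;
    }

    public static void main(String[] args) {
        // prints [dow, down, downl]
        System.out.println(edgeNGrams("download", 3, 5));
    }
}
```

With prefixes indexed like this, a search for "down" could match "download" without a wildcard. Matching in the middle of a word (what `*term*` does) would need `ngram` rather than `edge_ngram`, at the cost of a larger index.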
Highlighting works, but only if I don’t also query the type field (with typeFilter) at the same time. I suspect that what I’m doing is not how a filter is meant to be used in OpenSearch. Is there a special mapping I should use for a keyword field that serves as a filter? I’m also not sure whether the wildcard query I use is the way to go. Later on I would also like the title to carry more weight in the results than the pageText.
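For orientation, here is a rough sketch of the kind of request body that combines these three wishes: a `bool` query with a wildcard `query_string` over both text fields in `must` (with `title^2` boosting the title), an exact `term` match on the keyword field `type` in `filter`, and highlighting on the two text fields. A wildcard query targets a single field, so repeated `.field(...)` calls on it replace each other, which would explain why adding `type` last leaves nothing to highlight in title or pageText. The code only builds the JSON with plain Java so the structure is visible. The class and method names are hypothetical, the boost value 2 is an arbitrary assumption, and the search term is escaped for JSON only, not for the reserved characters of the `query_string` syntax.

```java
class SearchBodySketch {

    // Escapes backslashes and double quotes so the value can sit inside a JSON string.
    // Deliberately minimal: control characters are not handled.
    static String jsonEscape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Builds a search body: the wildcard query_string runs over title (boosted) and
    // pageText in "must"; the keyword field "type" is matched exactly in "filter",
    // which restricts the hits without contributing to the score.
    static String buildSearchBody(String searchTerm, String typeFilter) {
        String term = jsonEscape(searchTerm);
        String type = jsonEscape(typeFilter);
        return "{"
            + "\"query\":{\"bool\":{"
            +   "\"must\":[{\"query_string\":{"
            +     "\"query\":\"*" + term + "*\","
            +     "\"fields\":[\"title^2\",\"pageText\"]}}],"
            +   "\"filter\":[{\"term\":{\"type\":\"" + type + "\"}}]"
            + "}},"
            + "\"highlight\":{\"fields\":{"
            +   "\"title\":{\"number_of_fragments\":0},"
            +   "\"pageText\":{\"number_of_fragments\":4,\"fragment_size\":100}"
            + "}}"
            + "}";
    }

    public static void main(String[] args) {
        System.out.println(buildSearchBody("search", "page"));
    }
}
```

The same structure should be expressible with the Java client's `bool` query builder (`must` plus `filter`). The existing `keyword` mapping for type looks like the right fit for a `term` filter.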
I’m not expecting a ready-made solution, but I’d be grateful for anything that points me in the right direction.
Here is what I have so far. This is how I create the index:
CreateIndexRequest createIndexRequest = new CreateIndexRequest.Builder().index(indexName)
    .settings(s -> s.analysis(i -> i.analyzer("default", a -> a.custom(c -> c.tokenizer("standard").charFilter("html_strip")))))
    // .settings(s -> s.analysis(i -> i.filter("stemmer", f -> f.
    // All properties go into one mappings(...) call; repeated mappings(...) calls replace each other.
    .mappings(m -> m
        .properties("title", p -> p.text(t -> t))
        .properties("type", p -> p.keyword(t -> t))
        .properties("template", p -> p.keyword(t -> t))
        .properties("author", p -> p.keyword(t -> t))
        .properties("publishingDate", p -> p.date(t -> t))
        .properties("lastActivated", p -> p.text(t -> t))
        .properties("lastModified", p -> p.text(t -> t))
        .properties("language", p -> p.keyword(t -> t))
        .properties("link", p -> p.keyword(t -> t))
        .properties("pageText", p -> p.text(t -> t.analyzer("default")))
        .properties("image", p -> p.keyword(t -> t)))
    .build();
CreateIndexResponse response = client.indices().create(createIndexRequest);
And this is my query:
Map<String, HighlightField> map = new HashMap<>();
map.put("title", HighlightField.of(hf -> hf.numberOfFragments(0)));
map.put("pageText", HighlightField.of(hf -> hf.numberOfFragments(4).fragmentSize(100)));
Highlight highlight = Highlight.of(
    h -> h.type(HighlighterType.of(ht -> ht.builtin(BuiltinHighlighterType.Unified)))
        .fields(map)
        .fragmentSize(50)
        .numberOfFragments(5)
);
SearchRequest.Builder requestBuilder = new SearchRequest.Builder()
    .index(indexName)
    .size(size)
    .from(from)
    .query(q -> q.wildcard(m ->
        m.field("title").wildcard("*" + searchTerm + "*") //TODO query does not work for search in title
            .field("pageText").wildcard("*" + searchTerm + "*")
            // .field("type").wildcard(typeFilter) //TODO highlighting not working when using this, need to use different mapping?
        )
    )
    // .sort(null)
    .highlight(highlight);
SearchRequest request = requestBuilder.build();
ResultList<SearchItem> returnList = null;
SearchResponse<SearchItem> searchResponse = client.search(request, SearchItem.class);
// && instead of ||: with || a null response would still be dereferenced
if (searchResponse != null && !CollectionUtils.isEmpty(searchResponse.hits().hits())) {
    List<SearchItem> results = searchResponse.hits().hits().stream().map(Hit<SearchItem>::source).collect(Collectors.toList());
    for (Hit<SearchItem> item : searchResponse.hits().hits()) {
        //TODO is there a more clever way to map the highlighting?
        String pageTextHighlighting = Objects.toString(item.highlight().get("pageText"));
        String pageTitleHighlighting = Objects.toString(item.highlight().get("title"));
        if (StringUtils.isNotBlank(pageTextHighlighting) && !StringUtils.equals(pageTextHighlighting, "null")) {
            item.source().setPageText(pageTextHighlighting);
        }
        if (StringUtils.isNotBlank(pageTitleHighlighting) && !StringUtils.equals(pageTitleHighlighting, "null")) {
            item.source().setTitle(pageTitleHighlighting);
        }
    }
    //TODO returnList is never filled from results, so this currently returns null
    return returnList;
}
return new ResultList<SearchItem>();
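Regarding the TODO about mapping the highlighting: the highlight entry for a field is a list of fragments, so `Objects.toString` produces something like `[frag1, frag2]`, or the string "null" when the field has no highlight. A small helper that joins the fragments and falls back to the original value avoids both the brackets and the "null" comparison. This is a minimal sketch in plain Java; the class name, the method name, and the " ... " separator are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class HighlightSketch {

    // Joins the highlight fragments of `field` with " ... ".
    // Returns `fallback` when there is no highlight map or no fragment for the field.
    static String highlighted(Map<String, List<String>> highlight, String field, String fallback) {
        if (highlight == null) {
            return fallback;
        }
        List<String> fragments = highlight.get(field);
        if (fragments == null || fragments.isEmpty()) {
            return fallback;
        }
        return String.join(" ... ", fragments);
    }

    public static void main(String[] args) {
        Map<String, List<String>> highlight = new HashMap<>();
        highlight.put("pageText", Arrays.asList("a <em>foo</em> b", "c <em>foo</em> d"));
        // prints: a <em>foo</em> b ... c <em>foo</em> d
        System.out.println(highlighted(highlight, "pageText", "plain page text"));
        // prints: Plain title
        System.out.println(highlighted(highlight, "title", "Plain title"));
    }
}
```

Inside the loop this could be used as `item.source().setPageText(highlighted(item.highlight(), "pageText", item.source().getPageText()))`, assuming SearchItem has a matching getter, which is not shown above.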