I would like to search and aggregate string fields containing URLs, based on the path for instance.
The problem is that they are ingested with their query parameters.
So we have URLs like: bla/bim/boom?param=1
and I would like to count URLs starting with bla/bim/boom, disregarding the params.
How can I do that from within the UI?
The other option that I found would be to split this field in two (request_path and request_query_params) directly in Fluent Bit.
But I would be a bit surprised if there were no option for this directly in OpenSearch.
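For illustration, here is a minimal Python sketch of what that split amounts to. It only shows the logic and is not Fluent Bit configuration; the function name split_request_url is made up for this example, while the two output keys are the field names mentioned above.

```python
def split_request_url(url: str) -> dict:
    """Split a URL string at the first '?' into path and query parameters."""
    # str.partition returns (before, separator, after); 'after' is an empty
    # string when the URL carries no query string.
    path, _, query = url.partition("?")
    return {"request_path": path, "request_query_params": query}


print(split_request_url("bla/bim/boom?param=1"))
```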
Are you looking to count all the docs containing bla/bim/boom only, or are you looking to aggregate and show counters for all paths?
If it’s the first one, a wildcard query on a keyword field should do the trick:
field_name:bla/bim/boom*
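As a rough sketch, the same wildcard can also be written as a query DSL body, for example to send to the _count API. The Python below only builds and prints the JSON and does not talk to a cluster; field_name is the placeholder from the query above, the helper name is hypothetical, and the field is assumed to be mapped as keyword.

```python
import json


def wildcard_count_body(field: str, path_prefix: str) -> dict:
    """Build a query body matching values that start with path_prefix."""
    # The trailing * matches whatever follows the path, including ?param=...
    return {"query": {"wildcard": {field: {"value": path_prefix + "*"}}}}


body = wildcard_count_body("field_name", "bla/bim/boom")
print(json.dumps(body, indent=2))
```

Note that this pattern also matches longer paths such as bla/bim/boom/bar, since the wildcard covers everything after the prefix.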
If it’s a text field, it’s problematic. You can query for the phrase "bla/bim/boom", but because the field is tokenized, the phrase can match anywhere in the value, so it will also count foo/bla/bim/boom, which is not what you want.
If you want to aggregate, you could potentially do this with a text field and proper analysis: run a terms aggregation and exclude the param.* values. But IMO it’s too complicated and, if you have a lot of data, risky: field data lives in the JVM heap, so with a lot of documents you can crash OpenSearch. Not really crash it, but destabilize it while the query fails…