Regex supported in Dashboards Query Language (DQL)?

Hi

(Reposting this question from slack as we didnt get any feedback there)

Describe the issue:
I’m working on a dashboard visualization, and I need to filter out certain values using a regular expression.
Specifically, I’m trying to match patterns like this:

TEST-[a-z0-9]+/[a-zA-Z0-9-]±ind-core-[a-z0-9]±[a-z][a-z0-9]+

The end goal is to visualize this filtered data as a line graph on the dashboard. However, I’m struggling with how to implement this in the visState JSON section of the dashboard definition.

I can see some examples of using wildcards in the DQL documentation, but in this case, we need to use regex expression (for more complex cases not achievable through wildcards).

Would anyone be able to share the correct syntax or provide any guidance on how I can apply this regex to display the data properly?

Is this not supported at all and should we raise a feature request for this?

Any help or suggestions would be greatly appreciated!

Versions (relevant - OpenSearch/Dashboard): 2.12.0

Thanks,
Shivani

Could you provide a specific use case? What is the structure of the documents/logs where you want to perform the filter?

I’ll give you an example that works for me with opensearch dashboards 2.15:
–I want to get all documents (logs) that contain a text with this pattern:
“12345678912_123456789123”, that is, two groups of numbers separated by an underscore.
It’s true that with DSL searches it’s not possible, but it is possible with DQL queries. You can add a filter in “opensearch Query DSL” like this:

{
  "query": {
    "regexp": {
      "message": "[0-9]{11}_[0-9]{12}"
    }
  }
}

Hi, apologies for the late reply!

The records are in json format, and there is a field in the record, distinguished_name. It contains a hash in the end of the name that either starts with a digit, or an alphabet character.

  • alpha case
    Regex: TEST-[a-z0-9]+/[a-zA-Z0-9-]+/ind-core-[a-z0-9]±[a-z][a-z0-9]+
    Sample string: TEST-vtas251/CON-1/ind-core-677458881-pvplz

  • digit case
    Regex: TEST-[a-z0-9]+/[a-zA-Z0-9-]+/ind-core-[a-z0-9]±[0-9][a-z0-9]+
    Sample string: TEST-vtas251/CON-1/ind-core-677458881-8j6sm

Use-case is to add a filter aggregation to distinguish between the two types.

{
“id”: “3”,
“enabled”: true,
“type”: “filters”,
“params”: {
“filters”: [{
“regexp”: {
“distinguished_name.keyword”: “TEST-[a-z0-9]+/[a-zA-Z0-9-]+/ind-core-[a-z0-9]±[a-z][a-z0-9]+”
},
“label”: “alpha”
}, {
“regexp”: {
“distinguished_name.keyword”: “TEST-[a-z0-9]+/[a-zA-Z0-9-]+/ind-core-[a-z0-9]±[0-9][a-z0-9]+”
},
“label”: “digit”
}
]
},
“schema”: “split”
}

Any thoughts on this?

Hi @mpicallo
Any suggestions on this?

Hi,
here are some points to consider and a possible solution:
Formatting Your JSON:

  • Ensure that the JSON you shared for the filter aggregation is valid and properly configured. For example:
{
  "id": "3",
  "enabled": true,
  "type": "filters",
  "params": {
    "filters": [
      {
        "query": {
          "regexp": {
            "distinguished_name.keyword": "TEST-[a-z0-9]+/[a-zA-Z0-9-]+/ind-core-[a-z0-9]±[a-z][a-z0-9]+"
          }
        },
        "label": "alpha"
      },
      {
        "query": {
          "regexp": {
            "distinguished_name.keyword": "TEST-[a-z0-9]+/[a-zA-Z0-9-]+/ind-core-[a-z0-9]±[0-9][a-z0-9]+"
          }
        },
        "label": "digit"
      }
    ]
  },
  "schema": "split"
}

Regular Expression Limitations:

  • Regular expressions in OpenSearch have some limitations and might not work in all cases, especially if they include non-standard characters like ±. Check if you can simplify the patterns or escape them correctly.
  • Test the patterns directly in Dev Tools using a regexp query to ensure they match the expected documents:
GET your-index-name/_search
{
  "query": {
    "regexp": {
      "distinguished_name.keyword": "TEST-[a-z0-9]+/[a-zA-Z0-9-]+/ind-core-[a-z0-9]±[a-z][a-z0-9]+"
    }
  }
}

Using DQL Instead of JSON:

  • If your goal is to create a filter in the dashboard, you can try using DQL (Direct Query Language) instead of visState JSON. For example:
distinguished_name.keyword:/TEST-[a-z0-9]+\/[a-zA-Z0-9-]+\/ind-core-[a-z0-9]±[a-z][a-z0-9]+/

This might be easier to implement and maintain.

Validation in Logs:

  • If you have access to the logs in JSON format, validate that the distinguished_name.keyword field contains values that actually match the described patterns.

Regards,

Hi @mpicallo,

Thank you for pointing us in the right direction! Your suggestion provided valuable insights that led us to a potential solution.

As you mentioned, we tried using a DQL query directly, and incorporating the Lucene regex query syntax from OpenSearch documentation proved to be effective.

The key adjustment was correctly escaping the hyphen (-) in the regular expression. Here’s the working regex we used:
distinguished_name.keyword: /TEST\-[a-z0-9]+\/[a-zA-Z0-9\-]+\/ind\-core\-[a-z0-9]+\-[a-z][a-z0-9]+/

This resolved the issue for our case. Thanks again for your guidance!

2 Likes