Regexp "@" and ".*&.*" match everything

OpenSearch 1.3

My query in the OpenSearch Dev Tools console is below; it matches everything:

{
  "query": {
    "bool": {
      "must": [
        {
          "regexp": {
            "search_field.keyword": "@"
          }
        }
      ]
    }
  }
}

Additionally, the regexp ".*&.*" also matches everything:

{
  "query": {
    "bool": {
      "must": [
        {
          "regexp": {
            "search_field.keyword": ".*&.*"
          }
        }
      ]
    }
  }
}

Could someone explain why it returns everything in both of these cases?
Any suggestions for improving the query, with an explanation, would be appreciated.
Has anyone else run into this kind of problem in OpenSearch?

Thanks

Hey @rahul7822

I used this, and it seemed to work.

GET /_search
{
  "query": {
    "bool": {
      "must": [
        {
          "regexp": {
            "message.keyword": "@"
          }
        }
      ]
    }
  }
}

Hi @Gsmitt,
Thanks for your response.
The behavior is strange in my case. The mapping for my index is below:

{
  "booktransactions": {
    "mappings": {
      "properties": {
        "author": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword"
            }
          }
        },
        "bookDesc": {
          "type": "text"
        },
        "bookId": {
          "type": "text"
        },
        "bookName": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword"
            }
          }
        },
        "gender": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword"
            }
          }
        },
        "price": {
          "type": "double"
        },
        "publishedDate": {
          "type": "date"
        }
      }
    }
  }
}

The query I execute is as below:

GET booktransactions/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "regexp": {
            "gender.keyword": "@"
          }
        }
      ]
    }
  }
}

Result: I get all the hits/documents from the index.

"took": 23,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 20,
      "relation": "eq"
    },
    "max_score": 1,

Hey @rahul7822

I did some testing in my lab. That search query did not work, but I tried this and got good results.

GET /winlogbeat-2023.06.02/_search
{
  "query": {
    "regexp": {
      "message": "off"
    }
  }
}

But "@" and "&" did not work so well. I'm not 100% sure what's going on; perhaps it's the symbols.

odd - @wbeckler @seanneumann - do you know of any gotchas with the "@" and "&" symbols?

I’m not sure what the regex rules are.

By default, all “Optional Operators” are enabled in regular expression queries executed on OpenSearch, and & is the “intersection” operator. So .*&.* means anything that matches .* and also matches .*. Since .* matches every value, you get all items returned.

As you have discovered, @ means “anystring”.
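
If you would rather leave the optional operators enabled, escaping the reserved character should also work: Lucene's regexp syntax treats a backslash-escaped character as a literal, and the backslash itself has to be doubled inside a JSON string. A minimal sketch of a request body, assuming the gender.keyword field from the mapping earlier in the thread as the target (any keyword field would do):

```json
{
  "query": {
    "regexp": {
      "gender.keyword": ".*\\&.*"
    }
  }
}
```

The same escaping applies to "@", written as "\\@" in the JSON string.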

To turn those optional operators off, you can use this:

   ...
        {
          "regexp": {
            "message.keyword": {
              "value": ".*&.*",
              "flags": "NONE"
            }
          }
        }
   ...
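
Applied to the index from earlier in the thread, a complete request body might look like the sketch below. The gender.keyword target is only an assumption for illustration. Because a regexp pattern has to match the entire term, a bare "@" with flags set to NONE would only match a value that is exactly "@"; wrapping it in .* finds values that contain the character.

```json
{
  "query": {
    "regexp": {
      "gender.keyword": {
        "value": ".*@.*",
        "flags": "NONE"
      }
    }
  }
}
```

As far as I know, flags also accepts a pipe-separated list such as "COMPLEMENT|INTERVAL" if you only want to switch off some of the operators rather than all of them.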

Awesome, Thanks @AMoo-Miki

thank you @AMoo-Miki!!