Dealing with @timestamp values in SQL

I’m just starting to experiment with the SQL plugin and quickly ran into a problem specifying a time range in my queries. I’m not sure how to format the date/time values in the WHERE clauses.

For example, this fails:

POST _opendistro/_sql?format=csv
{"query": "SELECT @timestamp, level, message, FROM kubernetes_cluster-galaxy-2020-04-27  WHERE logsource.keyword='cas' and @timestamp =>  "2020-04-28T21:36:56.335Z" LIMIT 100"
}

The error returned is:

{
  "error": {
    "reason": "Invalid SQL query",
    "details": "Failed to parse request payload",
    "type": "IllegalArgumentException"
  },
  "status": 400
}

How should I format the date time constant (i.e. “2020-04-28T21:36:56.335Z” )?

Also, the doc mentions there’s a date_format function but doesn’t provide any details. I seem to be able to extract the time part of @timestamp by using syntax like date_format(@timestamp,‘HH:mm:ss’), but if I use that in a WHERE condition, I can’t figure out what kind of constant value I can compare against. For example date_format(@timestamp,‘HH:mm:ss’) = “00:01:27” fails with the same error as above.

Is the SQL plugin leveraging some other open source SQL parser project which may have additional documentation? In addition to the specific questions above, I’m interested in understanding what other functions might be available.

I haven’t yet tried this w/ OpenDistro, but for the basic license version of Kibana, you can do this:

SELECT HISTOGRAM(CAST("@timestamp" AS DATE), interval 2 hours) as hist, …

obviously, you can leave off the HISTOGRAM, and/or CAST to DATE if you want.
It’s the quoting you are looking for, IIUC.

Thanks for responding @iamthealex-es .

I managed to get the syntax figured out for a date range:

SELECT * FROM myindex WHERE @timestamp between "2020-04-30T03:30:00.000Z" and "2020-04-30T03:40:00.000Z"

But I’m still struggling with sub-setting based on time (e.g. all messages between 2:00 and 2:30) and understanding how to work with the date_format function. For example, here’s a query that dumps a couple of records from an index and uses the date_format function to create a new column TIME.

POST _opendistro/_sql
{
  "query": """
           SELECT @timestamp, date_format(@timestamp,"HH:mm") as time
           FROM kubernetes_cluster-galaxy-2020-04-30
             WHERE @timestamp between "2020-04-30T03:30:00.000Z" and "2020-04-30T03:40:00.000Z"
              LIMIT 2
           """
}

And here are the results returned (slightly re-formatted for readability/space):

{"schema": [
    {"name": "@timestamp","type": "date"},
    {"name": "time","type": "text"}
  ],
  "total": 1188,
  "datarows": [
    ["2020-04-30 03:32:31.391","03:32"],
    ["2020-04-30 03:32:31.787","03:32"]
  ],
  "size": 2,
  "status": 200
}

Based on these results, it appears that the date_format function returns a text string. Therefore, I would expect this query to work and return two rows:

POST _opendistro/_sql
{
  "query": """
           SELECT @timestamp, date_format(@timestamp,"HH:mm") as time
           FROM kubernetes_cluster-galaxy-2020-04-30
             WHERE date_format(@timestamp,"HH:mm") = "03:32"
              LIMIT 2
           """
}

Unfortunately, while the syntax appears to be acceptable (i.e. no errors), the query returns no rows.

Anyone have an idea of what I’m doing wrong? On the surface, it seems like a bug since I’m using the exact same value returned from the first query in the second one.

One thing I noticed is the illegal => which is supposed to be >= in SQL standard. Tested the new query with OpenSearch 1.1 and found no issue:

POST kubernetes_cluster-galaxy-2020-04-27/_doc
{
  "@timestamp": "2020-04-29T00:00:00Z",
  "logsource": "cas"
}

POST kubernetes_cluster-galaxy-2020-04-27/_doc
{
  "@timestamp": "2020-04-30T00:00:00Z",
  "logsource": "cas"
}

POST _plugins/_sql
{
  "query": """
    SELECT @timestamp
    FROM kubernetes_cluster-galaxy-2020-04-27
    WHERE @timestamp >= '2020-04-29 21:36:56.335'
  """
}

{
  "schema": [
    {
      "name": "@timestamp",
      "type": "timestamp"
    }
  ],
  "datarows": [
    [
      "2020-04-30 00:00:00"
    ]
  ],
  "total": 1,
  "size": 1,
  "status": 200
}