Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):
Data Prepper 2.0.1
Fluent-Bit 2.0.9
OpenSearch 2.6.0
OpenSearch Dashboards 2.6.0
Docker-Compose 1.29.2
Describe the issue:
Hello everyone,
I want to set up log ingestion and have a question (probably several) about how to parse or convert a date coming from a log file so that all timestamps share the same format. The goal is to use the timestamp written by the application, rather than the one created by Fluent Bit or Data Prepper, as the time field for the time filter in OpenSearch.
Unfortunately, the logs use different date formats, so OpenSearch throws an error (see below).
Parsing works fine in general for all the other log files; only schedule.log, server.log and client.log fail because of the differences in the date values.
I am tailing 6 log files with Fluent Bit, each with a partly different layout and timestamp format:
schedule.log
INFO 04.02.2023 23:00:00.474 (com.namespace.test): starting task 'Clean up logs' - schedule entry 'Clean up logs' (id=5)
server.log:
INFO 17.02.2023 07:59:16.810 (com.namespace.test): Uploaded 0 media file(s).
client.log
ERROR 30.06.2022 10:56:45.859 {uID=54,pID=58} (com.namespace.test): Client error: ERROR 30.06.2022 10:56:45.666 (com.namespace.test): The test() method has thrown exception:
Username (Firstname Lastname), session: sessionID, project: 1, ip: 127.0.0.1
Version=13;JDK=17;OS=Windows 11 amd64;Date=30.06.2022 10:56:45 (I)
java.lang.IllegalStateException: JSObject is not valid or already disposed.
at com.namespace.test(SourceFile:196)
at com.namespace.test(ChromeEngine.java:488)
at com.namespace.test$e.b(SourceFile:3783)
at com.namespace.test$e.onMessageReceived(SourceFile:5755)
at com.namespace.test(SourceFile:1085)
at com.namespace.test(SourceFile:69)
at com.namespace.test(SourceFile:79)
at com.namespace.test(Executors.java:539)
at com.namespace.test(FutureTask.java:264)
at com.namespace.test(ThreadPoolExecutor.java:1136)
at com.namespace.test(ThreadPoolExecutor.java:635)
at com.namespace.test(Thread.java:833)
access.log (Tomcat)
2022-07-12T08:09:52.729+0200 45 192.168.1.0 Username 0 304 "GET /home/javascript.js HTTP/1.1" "127.0.0.1, 127.0.0.2"
catalina.log (Tomcat)
2022-12-07T16:23:35.132+0100 INFO com.namespace.testcleanupCaches: Caches stats: authReqs=0 sessions=1 introspection=0
access.log (Apache2)
2023-03-07T01:19:01.166+0100 xxx xxx 192.168.0.2 "192.168.0.3" - - 232 74 242 382 - 0 200 "GET /test.html HTTP/1.0" "-" "Referrer"
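To make the goal concrete: what I am after is roughly the following normalization, sketched here in Python rather than in Data Prepper configuration. The function name `normalize_timestamp` and the `assumed_tz` parameter are made up for illustration, and treating the zone-less application timestamps as UTC is only a placeholder assumption.

```python
from datetime import datetime, timezone

# The two timestamp layouts that appear in the sample lines above.
KNOWN_FORMATS = (
    "%Y-%m-%dT%H:%M:%S.%f%z",  # Tomcat/Apache: 2022-07-12T08:09:52.729+0200
    "%d.%m.%Y %H:%M:%S.%f",    # application logs: 04.02.2023 23:00:00.474
)


def normalize_timestamp(raw, assumed_tz=timezone.utc):
    """Return `raw` as an ISO 8601 string with millisecond precision.

    `assumed_tz` is a hypothetical parameter: it is only applied when the
    input carries no UTC offset (the dd.MM.yyyy layout).
    """
    for fmt in KNOWN_FORMATS:
        try:
            parsed = datetime.strptime(raw, fmt)
        except ValueError:
            continue  # layout did not match, try the next one
        if parsed.tzinfo is None:
            parsed = parsed.replace(tzinfo=assumed_tz)
        return parsed.isoformat(timespec="milliseconds")
    raise ValueError(f"unrecognized timestamp layout: {raw!r}")


for sample in ("2022-07-12T08:09:52.729+0200", "04.02.2023 23:00:00.474"):
    print(sample, "->", normalize_timestamp(sample))
```

Either layout should end up as one ISO 8601 string, which is the shape the `strict_date_optional_time` format named in the error further down expects.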
Configuration:
pipelines.yaml:
log-pipeline:
  workers: 2
  delay: "5000"
  source:
    http:
      ssl: false
      port: 2021
      health_check_service: true
      authentication:
        unauthenticated:
  processor:
    - add_entries:
        entries:
          - key: "environment"
            value: "dev"
          - key: "id"
            value: "c24"
            overwrite_if_key_exists: true
    - grok:
        patterns_directories: [ "/usr/share/data-prepper/patterns" ]
        match:
          log: [
              '%{DATESTAMP_EVENTLOG_ACCESS:time_config}%{SPACE}%{LOGLEVEL_OWN:log-level}(?<greedydata>(.|\r|\n)*)',
              '%{LOGLEVEL_OWN:log-level}%{SPACE}%{DATESTAMP_EVENTLOG_SERVER:time_config}(?<greedydata>(.|\r|\n)*)',
              '%{DATESTAMP_EVENTLOG_ACCESS:time_config} %{NUMBER:request.duration.ms:double} %{IP:remote.host.ip} %{USER:remote.authenticated.user} %{NUMBER:bytes.sent:int} %{NUMBER:http.status.code:int} \"(?:%{WORD:http.request.method} %{NOTSPACE:http.request.url}(?: HTTP/%{NUMBER:http.request.version})?|-)\"',
              '%{DATESTAMP_EVENTLOG_ACCESS:time_config} %{HOSTNAME:request.uri} %{URIHOST:request.uri} %{IP:lb.ip}(?<greedydata>(.|\r|\n)*)'
            ]
    - delete_entries:
        with_keys: ["timestamp"]
  sink:
    - stdout:
(usually OpenSearch, but commented out)
pattern.conf:
LOGLEVEL_OWN (DEBUG|INFO|WARN|ERROR)
MILLISECONDS (\d){3,7}
DATESTAMP_EVENTLOG_ACCESS %{YEAR}-%{MONTHNUM}-%{MONTHDAY}T%{HOUR}:%{MINUTE}:%{SECOND}.%{MILLISECONDS}%{ISO8601_TIMEZONE}
DATESTAMP_EVENTLOG_SERVER %{DATE_EU}%{SPACE}%{HOUR}:%{MINUTE}:%{SECOND}.%{MILLISECONDS}
USERNAME [a-zA-Z0-9._-]+
USER %{USERNAME}
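For checking the two date patterns outside of Data Prepper, they can be approximated with plain regular expressions. This is only a simplified sketch: the building blocks below are stand-ins, not the exact grok definitions, the dot before the milliseconds is escaped here, and the helper name `extract_time_config` is made up for illustration.

```python
import re

# Simplified stand-ins for the grok building blocks (assumption: close enough
# for the sample lines in this post, not the exact grok definitions).
TIME = r"\d{2}:\d{2}:\d{2}"
MILLISECONDS = r"\d{3,7}"
LOGLEVEL_OWN = r"(?:DEBUG|INFO|WARN|ERROR)"
DATESTAMP_EVENTLOG_ACCESS = (
    r"\d{4}-\d{2}-\d{2}T" + TIME + r"\." + MILLISECONDS + r"[+-]\d{2}:?\d{2}"
)
DATESTAMP_EVENTLOG_SERVER = (
    r"\d{2}\.\d{2}\.\d{4}\s+" + TIME + r"\." + MILLISECONDS
)

# Mirrors the first two grok match entries: timestamp first, or log level first.
LAYOUTS = (
    re.compile(r"^(?P<time_config>" + DATESTAMP_EVENTLOG_ACCESS + r")"),
    re.compile(
        r"^" + LOGLEVEL_OWN + r"\s*(?P<time_config>" + DATESTAMP_EVENTLOG_SERVER + r")"
    ),
)


def extract_time_config(line):
    """Return the raw timestamp captured as time_config, or None if no layout matches."""
    for layout in LAYOUTS:
        found = layout.match(line)
        if found:
            return found.group("time_config")
    return None


print(extract_time_config("INFO 17.02.2023 07:59:16.810 (com.namespace.test): Uploaded 0 media file(s)."))
print(extract_time_config("2022-12-07T16:23:35.132+0100 INFO com.namespace.testcleanupCaches: Caches stats"))
```

Both layouts are captured into the same `time_config` field, so the index receives two different date formats under one name.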
fluent-bit.conf:
(example with server.log)
[INPUT]
    Name              tail
    Refresh_Interval  60
    Path              /logs/server.log
    Path_Key          logfile-origin
    Ignore_Older      1m
    multiline.parser  java
    Read_from_Head    true
    Skip_Long_Lines   Off
    Mem_Buf_Limit     25MB
    Tag               server-log
Relevant Logs or Screenshots:
data-prepper | 2023-03-02T10:16:31,850 [log-pipeline-sink-worker-2-thread-2] WARN org.opensearch.dataprepper.plugins.sink.opensearch.OpenSearchSink - Document [org.opensearch.client.opensearch.core.bulk.BulkOperation@1352c987] has failure.
data-prepper | java.lang.RuntimeException: failed to parse field [time_config] of type [date] in document with id 'QBDSoYYB0DLbZ_tRy_Nd'. Preview of field's value: '02.03.2023 11:16:21.912' caused by failed to parse date field [02.03.2023 11:16:21.912] with format [strict_date_optional_time||epoch_millis] caused by Failed to parse with all enclosed parsers
data-prepper | at org.opensearch.dataprepper.plugins.sink.opensearch.BulkRetryStrategy.handleFailures(BulkRetryStrategy.java:163) ~[opensearch-2.0.1.jar:?]
data-prepper | at org.opensearch.dataprepper.plugins.sink.opensearch.BulkRetryStrategy.handleRetry(BulkRetryStrategy.java:118) ~[opensearch-2.0.1.jar:?]
data-prepper | at org.opensearch.dataprepper.plugins.sink.opensearch.BulkRetryStrategy.execute(BulkRetryStrategy.java:71) ~[opensearch-2.0.1.jar:?]
data-prepper | at org.opensearch.dataprepper.plugins.sink.opensearch.OpenSearchSink.lambda$flushBatch$2(OpenSearchSink.java:206) ~[opensearch-2.0.1.jar:?]
data-prepper | at io.micrometer.core.instrument.composite.CompositeTimer.record(CompositeTimer.java:89) ~[micrometer-core-1.9.4.jar:1.9.4]
data-prepper | at org.opensearch.dataprepper.plugins.sink.opensearch.OpenSearchSink.flushBatch(OpenSearchSink.java:203) ~[opensearch-2.0.1.jar:?]
data-prepper | at org.opensearch.dataprepper.plugins.sink.opensearch.OpenSearchSink.doOutput(OpenSearchSink.java:177) ~[opensearch-2.0.1.jar:?]
data-prepper | at org.opensearch.dataprepper.model.sink.AbstractSink.lambda$output$0(AbstractSink.java:38) ~[data-prepper-api-2.0.1.jar:?]
data-prepper | at io.micrometer.core.instrument.composite.CompositeTimer.record(CompositeTimer.java:89) ~[micrometer-core-1.9.4.jar:1.9.4]
data-prepper | at org.opensearch.dataprepper.model.sink.AbstractSink.output(AbstractSink.java:38) ~[data-prepper-api-2.0.1.jar:?]
data-prepper | at org.opensearch.dataprepper.pipeline.Pipeline.lambda$publishToSinks$3(Pipeline.java:247) ~[data-prepper-core-2.0.1.jar:?]
data-prepper | at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) ~[?:?]
data-prepper | at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
data-prepper | at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
data-prepper | at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
data-prepper | at java.lang.Thread.run(Thread.java:833) ~[?:?]
I would also appreciate any hints or recommendations about what I could change. I am pretty new to grok, Fluent Bit and OpenSearch itself, so I am thankful for any comment!