Versions (relevant - OpenSearch/Dashboard/Server OS/Browser): 2.17.0
Describe the issue: We are trying to use Fluent Bit 3.1 to send data from several JSON files to OpenSearch. Each file contains one JSON object spread over multiple lines. This is an example of the structure:
{
  "api_request_time": "2024-10-27T03:18:48.532427",
  "branch_count": 1,
  "branches": [
    "main"
  ],
  "code_repo_size_bytes": -1,
  "commit_count": -1,
  "forks_count": 0,
  "https_url": "https://xxxx.git",
  "last_read_activity_at": "2022-02-08T16:26:05.572Z",
  "lfs_size_bytes": -1,
  "lines_of_code": {
    "Markdown": {
      "blank": 2,
      "code": 5,
      "comment": 0,
      "nFiles": 1
    },
    "SUM": {
      "blank": 11,
      "code": 209,
      "comment": 4,
      "nFiles": 6
    },
    "YAML": {
      "blank": 9,
      "code": 204,
      "comment": 4,
      "nFiles": 5
    },
    "header": {
      "cloc_url": "github.com/AlDanial/cloc",
      "cloc_version": "1.96",
      "elapsed_seconds": 0.0288269519805908,
      "files_per_second": 208.138550480113,
      "lines_per_second": 7770.50588459089,
      "n_files": 6,
      "n_lines": 224
    }
  },
  "members": [
    {
      "id": 1956,
      "name": "xxx"
    },
    {
      "id": 961,
      "name": "xxx"
    },
    {
      "id": 4,
      "name": "xxxx"
    },
    {
      "id": 7345,
      "name": "xxx"
    },
    {
      "id": 8264,
      "name": "xxx"
    },
    {
      "id": 3651,
      "name": "xxx"
    }
  ],
  "merge_requests": [],
  "name": "xxx",
  "namespace": "xxx",
  "project_id": 10500,
  "releases_count": 0,
  "subgroup_id": 1053,
  "tags": [],
  "tags_count": 0,
  "total_project_size": -1,
  "web_url": "https://xxxxx"
}
We are having trouble getting this data into OpenSearch in a usable form, and I hope someone here has an idea. So far I have managed to send the data using Fluent Bit's multiline_parser option. This is my configuration:
[MULTILINE_PARSER]
    name          multiline-regex-test
    type          regex
    flush_timeout 1000
    #
    # Regex rules for multiline parsing
    # ---------------------------------
    #
    # configuration hints:
    #
    # - first state always has the name: start_state
    # - every field in the rule must be inside double quotes
    #
    # rules |   state name  | regex pattern       | next state
    # ------|---------------|--------------------------------------------
    rule      "start_state"   "^\s*{"               "cont"
    rule      "cont"          "^\s*[}\],]|\s*"      "cont"
Unfortunately, each file arrives in OpenSearch as a single document with one field, "log", that holds the whole JSON text as an unstructured string. How can I get the individual JSON fields indexed so that the data is structured and searchable?
I would appreciate any hints!
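One direction that may be worth trying, as an untested sketch rather than a confirmed fix: the multiline parser only concatenates lines into one string under "log", so a parser filter with a JSON parser could decode that string into fields before it reaches OpenSearch. The file path, tag, host, port, and index name below are placeholders, and the parser name json_record is made up for illustration; only multiline-regex-test comes from the configuration above.

```
# parsers.conf (assumed file name). The existing [MULTILINE_PARSER]
# multiline-regex-test section would stay in this file as well.
[PARSER]
    Name   json_record
    Format json

# fluent-bit.conf (assumed file name)
[SERVICE]
    Parsers_File parsers.conf

[INPUT]
    Name             tail
    # Placeholder path; point this at the real JSON files.
    Path             /path/to/files/*.json
    Tag              repo.stats
    Read_from_Head   On
    multiline.parser multiline-regex-test

[FILTER]
    # Decode the concatenated JSON text held in the "log" key.
    Name         parser
    Match        repo.stats
    Key_Name     log
    Parser       json_record
    Reserve_Data On

[OUTPUT]
    Name               opensearch
    Match              repo.stats
    # Placeholder host, port, and index.
    Host               opensearch.example.internal
    Port               9200
    Index              repo-stats
    Suppress_Type_Name On
```

If the record still arrives as a single "log" string with this in place, the multiline rules may be splitting or merging objects so that the text is no longer valid JSON; temporarily adding a stdout output would show exactly what each record contains.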
Relevant Logs or Screenshots: