Index creation dataset details.
Index creation performance is fine. I am including these details as background, since they may help in understanding the scenario.
Text file with 25 columns and 10 million records.
The index is created on 4 of the 25 columns.
The index is populated in batch mode with a batch size of 30000; the primary key is the column GNAF_PID.
Below is the request used to create the index.
String endpoint = "/" + indexName + "?include_type_name=false";
HttpEntity entity = new StringEntity(body, ContentType.APPLICATION_JSON);
Request request = new Request(RestCallConstants.PUT, endpoint);
request.setEntity(entity);
Response response = restClientFactory.getClient().performRequest(request);
where the value of body is
…
{"settings":{"index":{"number_of_shards":1,"number_of_replicas":0}},"mappings":{"properties":{"lat1":{"type":"text","index":false},"lat2":{"type":"text","index":false},
"lat3":{"type":"text","index":false},
"lat4":{"type":"text","index":false},
"Latitude":{"type":"text","index":true,"analyzer":"standard"},
"NewField":{"type":"text","index":false},
"AD10":{"type":"text","index":true,"analyzer":"standard"},
"Building_Name":{"type":"text","index":false},
"PC1":{"type":"text","index":false},
"BN1":{"type":"text","index":true,"analyzer":"standard"},
"BN3":{"type":"text","index":false},"BN2":{"type":"text","index":false},
"BN4":{"type":"text","index":false},"Longitude":{"type":"text","index":false},
"Postcode":{"type":"text","index":false},"AD2":{"type":"text","index":false},
"ADC1":{"type":"text","index":false},"LLC6":{"type":"text","index":false},
"AddressLine1":{"type":"text","index":false},
"GNAF_PID":{"type":"keyword","index":true},
"AD9":{"type":"text","index":false},
"PC11":{"type":"text","index":false},
"PC10":{"type":"text","index":false},
"NF1":{"type":"text","index":false},
"NF4":{"type":"text","index":false}}}}
…
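For anyone who wants to reproduce the mapping without pasting the whole JSON, here is a minimal sketch of how a body of this shape could be generated from a column list. The class name MappingBodyBuilder and its build method are hypothetical and not part of my actual code; it assumes column names need no JSON escaping and that every non-key column is of type text, as in the body above.

```java
import java.util.List;
import java.util.Set;

// Hypothetical helper: builds an index-creation body of the same shape as the one above.
class MappingBodyBuilder {

    // columns: all column names in order; analyzed: columns indexed with the
    // standard analyzer; keyColumn: the keyword-typed primary key column.
    static String build(List<String> columns, Set<String> analyzed, String keyColumn) {
        StringBuilder sb = new StringBuilder();
        sb.append("{\"settings\":{\"index\":{\"number_of_shards\":1,\"number_of_replicas\":0}},");
        sb.append("\"mappings\":{\"properties\":{");
        for (int i = 0; i < columns.size(); i++) {
            String column = columns.get(i);
            if (i > 0) {
                sb.append(',');
            }
            sb.append('"').append(column).append("\":");
            if (column.equals(keyColumn)) {
                sb.append("{\"type\":\"keyword\",\"index\":true}");
            } else if (analyzed.contains(column)) {
                sb.append("{\"type\":\"text\",\"index\":true,\"analyzer\":\"standard\"}");
            } else {
                // Stored in _source but not searchable.
                sb.append("{\"type\":\"text\",\"index\":false}");
            }
        }
        sb.append("}}}"); // closes properties, mappings and the root object
        return sb.toString();
    }

    public static void main(String[] args) {
        // Three of the 25 columns, just to show the output shape.
        System.out.println(build(List.of("lat1", "AD10", "GNAF_PID"), Set.of("AD10"), "GNAF_PID"));
    }
}
```

With the full list of 25 columns, Latitude, AD10 and BN1 as the analyzed set, and GNAF_PID as the key, this should give the 4 indexed columns described above.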
Now the details of the search query, which is where the performance issue is.
I am searching the index created above. The search file has 1 million records and the search uses 3 columns, in batch mode with a batch size of 10000 and a max result count of 50.
The query is as below.
………………………………
httpEntity = new StringEntity( requestBody , ContentType.APPLICATION_JSON);
Request request = new Request(RestCallConstants.GET, endPoint);
request.setEntity(httpEntity);
response = restClientFactory.getClient().performRequest(request);
where requestBody is
{"query":{"bool":{"must":[{"match_all":{}}],
"filter":{"bool":{"must":[{"match_phrase":{"AD10":{"query":"14 CHERMSIDE STREET1770","boost":1.0}}},
{"term":{"BN1":{"boost":1.0,"value":"null1770"}}},
{"term":{"GNAF_PID":{"boost":1.0,"value":"GAACT7148916451770"}}}]}}}},
"from":0,"size":50}
…………………………….
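To make it clearer how one request body is produced per search record, here is a minimal sketch that assembles a body of the same shape from the three column values. The class SearchBodyBuilder and its methods are hypothetical names used for illustration only, and the escaping is deliberately minimal (it assumes the values contain no control characters).

```java
// Hypothetical helper: builds one search body of the same shape as the query above.
class SearchBodyBuilder {

    // Minimal JSON string escaping: backslashes first, then double quotes.
    static String escape(String value) {
        return value.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // ad10, bn1 and gnafPid are the three searched column values; size is the max result count.
    static String build(String ad10, String bn1, String gnafPid, int size) {
        return "{\"query\":{\"bool\":{\"must\":[{\"match_all\":{}}],"
            + "\"filter\":{\"bool\":{\"must\":["
            + "{\"match_phrase\":{\"AD10\":{\"query\":\"" + escape(ad10) + "\",\"boost\":1.0}}},"
            + "{\"term\":{\"BN1\":{\"boost\":1.0,\"value\":\"" + escape(bn1) + "\"}}},"
            + "{\"term\":{\"GNAF_PID\":{\"boost\":1.0,\"value\":\"" + escape(gnafPid) + "\"}}}"
            + "]}}}},"
            + "\"from\":0,\"size\":" + size + "}";
    }

    public static void main(String[] args) {
        // Same values as in the sample query above.
        System.out.println(build("14 CHERMSIDE STREET1770", "null1770", "GAACT7148916451770", 50));
    }
}
```

The resulting string would then be passed as requestBody to the StringEntity shown above.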
Performance numbers for the search query:
Records processed: 1 million
Max result count: 50
Batch size: 10000
Infrastructure and memory are the same for both runs.
With opendistro security enabled | With opendistro security disabled
54.2 min | 38.5 min
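To put the two timings in perspective, the following sketch turns them into throughput and relative slowdown. It uses only the numbers reported above (1 million records, 54.2 min and 38.5 min); the class and method names are hypothetical.

```java
// Hypothetical helper: derives throughput and slowdown from the timings above.
class SecurityOverhead {

    // Average number of search records processed per second.
    static double recordsPerSecond(long records, double minutes) {
        return records / (minutes * 60.0);
    }

    // How much longer the run with security took, as a percentage of the run without it.
    static double slowdownPercent(double withSecurityMin, double withoutSecurityMin) {
        return (withSecurityMin - withoutSecurityMin) / withoutSecurityMin * 100.0;
    }

    public static void main(String[] args) {
        long records = 1_000_000L;
        double enabledMin = 54.2;
        double disabledMin = 38.5;
        System.out.printf("security enabled:  %.1f records/s%n", recordsPerSecond(records, enabledMin));
        System.out.printf("security disabled: %.1f records/s%n", recordsPerSecond(records, disabledMin));
        System.out.printf("slowdown:          %.1f %%%n", slowdownPercent(enabledMin, disabledMin));
    }
}
```

By this calculation the run with security enabled takes roughly 41% longer, which is about 307 records per second versus about 433 records per second.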