"search_as_you_type" "index_prefix" field does not return document with "match" query but does with "match_phrase" query

I have created a mapping in OpenSearch that looks as follows:

"mappings": {
    "properties": {
        "name": {
            "type": "text",
            "fields": {
                "prefixed_split": {
                    "type": "search_as_you_type",
                    "analyzer": "asciifold_separated"
                }
            }
        }
    }
}

The asciifold_separated analyzer looks as follows:

"asciifold_separated": {
    "char_filter": [],
    "tokenizer": "standard",
    "filter": [
      "lowercase",
      "asciifolding"
    ]
}
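
For anyone who wants to reproduce this, the two fragments above can be combined into one index-creation body. This is a minimal sketch, not my exact request: I am assuming the analyzer sits under settings.analysis.analyzer, that the index is called test_local (the name that appears in the responses in this post), and the explicit "type": "custom" is added here for clarity. The body would be sent with PUT /test_local.

```json
{
  "settings": {
    "analysis": {
      "analyzer": {
        "asciifold_separated": {
          "type": "custom",
          "char_filter": [],
          "tokenizer": "standard",
          "filter": ["lowercase", "asciifolding"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "fields": {
          "prefixed_split": {
            "type": "search_as_you_type",
            "analyzer": "asciifold_separated"
          }
        }
      }
    }
  }
}
```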

I used the _bulk endpoint to store some documents. One of them is as follows:

{ "index" : { "_id": 1 }}
{ "name":"David Silva" }

When I request the term vectors for this document, I get the following response:

{
    "_index": "test_local",
    "_id": "1",
    "_version": 2,
    "found": true,
    "took": 23,
    "term_vectors": {
        "name": {
            "terms": {
                "david": {
                    "term_freq": 1
                },
                "silva": {
                    "term_freq": 1
                }
            }
        },
        "name.prefixed_split._2gram": {
            "terms": {
                "david silva": {
                    "term_freq": 1
                }
            }
        },
        "name.prefixed_split._index_prefix": {
            "terms": {
                "d": {
                    "term_freq": 1
                },
                "da": {
                    "term_freq": 1
                },
                "dav": {
                    "term_freq": 1
                },
                "davi": {
                    "term_freq": 1
                },
                "david": {
                    "term_freq": 1
                },
                "david ": {
                    "term_freq": 1
                },
                "david s": {
                    "term_freq": 1
                },
                "david si": {
                    "term_freq": 1
                },
                "david sil": {
                    "term_freq": 1
                },
                "david silv": {
                    "term_freq": 1
                },
                "david silva": {
                    "term_freq": 1
                },
                "david silva ": {
                    "term_freq": 1
                },
                "s": {
                    "term_freq": 1
                },
                "si": {
                    "term_freq": 1
                },
                "sil": {
                    "term_freq": 1
                },
                "silv": {
                    "term_freq": 1
                },
                "silva": {
                    "term_freq": 1
                },
                "silva ": {
                    "term_freq": 1
                },
                "silva  ": {
                    "term_freq": 1
                }
            }
        }
    }
}
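
A response of this shape can be requested from the term vectors API. The body below is a sketch of such a request, sent with POST /test_local/_termvectors/1. The field list and the disabled options are assumptions reconstructed from what the response contains (three fields, term frequencies only), not a copy of the original call.

```json
{
  "fields": [
    "name",
    "name.prefixed_split._2gram",
    "name.prefixed_split._index_prefix"
  ],
  "positions": false,
  "offsets": false,
  "payloads": false,
  "term_statistics": false,
  "field_statistics": false
}
```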

As you can see, the name.prefixed_split._index_prefix subfield, which is created by the search_as_you_type field type, has an entry for sil. However, when I search for that term with a match query, no document is returned:

{
    "query": {
        "bool":{
            "should": [
                {
                    "match": {
                        "name.prefixed_split._index_prefix": "SIL"
                    }
                }
            ]
        }
    },
    "from": 0,
    "size": 10000,
    "sort": [],
    "aggs": {}
}

The result is:

{
    "took": 7,
    "timed_out": false,
    "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 0,
            "relation": "eq"
        },
        "max_score": null,
        "hits": []
    }
}

However, when I use the match_phrase query, I get the desired result:

{
    "query": {
        "bool":{
            "should": [
                {
                    "match_phrase": {
                        "name.prefixed_split._index_prefix": "SIL"
                    }
                }
            ]
        }
    },
    "from": 0,
    "size": 10000,
    "sort": [],
    "aggs": {}
}

Result:

{
    "took": 2,
    "timed_out": false,
    "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 1,
            "relation": "eq"
        },
        "max_score": 1.9891725,
        "hits": [
            {
                "_index": "test_local",
                "_id": "1",
                "_score": 1.9891725,
                "_source": {
                    "name": "David Silva"
                }
            }
        ]
    }
}

I have a couple of questions regarding these outputs:

  1. From my understanding, the only difference between the match and match_phrase queries is that in match_phrase the query is tokenised and the last token is searched as a prefix, while the rest of the tokens are searched in exact order. But for a single search term like “sil”, the tokenisation process should produce a single token, “sil”, and the term vectors already contain the term “sil” in the name.prefixed_split._index_prefix subfield.
    Why does the match query not match “sil” against the “sil” entry that is present in the term vectors? Why does it need to be searched as a prefix?

  2. If you look at the terms of the field name.prefixed_split._index_prefix, the last “silva” entry has two trailing whitespace characters. Where do they come from? The same question applies to the entry "david silva ", which has one trailing whitespace character. Where does it come from?

  3. I am trying to implement a search feature where the user can type the first few characters of an entry to get the desired document. For example, for an entry like “David”, they can type any of the prefixes “d”, “da”, “dav”, etc., and I have to supply them with the relevant document. Is this really the best way to achieve this, or are there more space-efficient ways to achieve the same result?
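
Regarding question 1, the assumption that “SIL” is reduced to the single token “sil” can be checked with the _analyze API. The body below is a minimal sketch, assuming the index is named test_local and the request is sent with POST /test_local/_analyze. It runs only the asciifold_separated analyzer from the mapping, which is not necessarily the analyzer the match query applies to the _index_prefix subfield at search time.

```json
{
  "analyzer": "asciifold_separated",
  "text": "SIL"
}
```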