Conversational commerce apps must answer a wide variety of query types, from very detailed to generic:
I search for
Could you suggest me a vanilla ice cream for my party?
Query elements might contain collations, word aggregations whose meaning differs from that of their single elements:
Please, give me a
With the present idea I wish to:
- Return search results that are pertinent to what the user is looking for
- Inform the user when the returned items do not exactly match the search query
- Warn the user when they ask for something that is off-scope
The data structure that might fulfill those requirements could be expressed in this form:
entity clusters:
  - term: vanilla ice cream
    catalog: false
    subterms:
      - term: ice cream
        catalog: true
  - term: party
    catalog: false
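To make the structure more concrete, here is a minimal Python sketch of it. The class name `Term`, the helper `best_available` and the nesting (with `party` read as a second cluster rather than a subterm) are my assumptions for illustration, not a fixed design.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Term:
    """One term detected in the query and whether the catalog sells it."""
    term: str
    catalog: bool
    subterms: List["Term"] = field(default_factory=list)


# The example query expressed as entity clusters (assumed nesting).
entity_clusters = [
    Term("vanilla ice cream", catalog=False,
         subterms=[Term("ice cream", catalog=True)]),
    Term("party", catalog=False),
]


def best_available(cluster: Term) -> Optional[Term]:
    """Return the cluster if it is for sale, else its first sellable subterm, else None."""
    if cluster.catalog:
        return cluster
    for sub in cluster.subterms:
        found = best_available(sub)
        if found is not None:
            return found
    return None
```

With this shape, a cluster that is not in the catalog can still lead to a more generic term that is, which is what partial matching needs.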
Where might the collations be obtained? Structured representations of human knowledge are becoming available at a faster pace than ever. Open databases such as DBpedia, Wiktionary and WordNet could be loosely described as generic sources of human knowledge. They are aggregated in ConceptNet.io, a semantic network. Collations are linked through a network of relationships such as “is part of”, “is capable of”, “is a type of” and “is related to”, and their meaning could be extracted from the relations filtered from such databases for a specific domain, in our example food.
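As a rough sketch of what filtering relations for a domain could look like, the snippet below walks “is a type of” links upward until it reaches the domain. The triples are toy data I wrote by hand, not real ConceptNet output, and the function names are hypothetical; only the relation label `IsA` mirrors ConceptNet's naming.

```python
# Toy (start, relation, end) triples; illustrative only, not real ConceptNet output.
EDGES = [
    ("vanilla ice cream", "IsA", "ice cream"),
    ("ice cream", "IsA", "dessert"),
    ("dessert", "IsA", "food"),
    ("party hat", "IsA", "hat"),
    ("hat", "IsA", "clothing"),
    ("party", "IsA", "social event"),
]


def in_domain(term, domain, edges=EDGES):
    """Follow IsA edges upward from term and report whether domain is reached."""
    seen = set()
    frontier = [term]
    while frontier:
        current = frontier.pop()
        if current == domain:
            return True
        if current in seen:
            continue
        seen.add(current)
        frontier.extend(end for start, rel, end in edges
                        if start == current and rel == "IsA")
    return False


def domain_collations(domain, edges=EDGES):
    """Collect the multi-word terms (collations) that belong to the domain."""
    starts = {start for start, _, _ in edges}
    return {t for t in starts if " " in t and in_domain(t, domain, edges)}
```

On the toy data, `domain_collations("food")` keeps `vanilla ice cream` and `ice cream` and drops `party hat`, which never reaches `food`.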
Query search data extraction
Once the relevant collations are acquired, it is time to describe how queries could be processed to extract the key terms we are interested in. I identified some desirable features the solution should provide:
- Query entities selection. When the query contains more than one entity cluster, the conversational agent will be able to detect it and ask the user to choose which entity to search for first. For example: give me a red bull and a
- Partial term matching. The user is informed when the exact criteria do not match and a lower-ranking result is provided instead. For example, in "give me vanilla ice cream" the specific "vanilla ice cream" is not available but a generic "ice cream" is.
- Terms off scope. Warn the user when the requested item is not for sale. For example: I’m looking for an
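The three features above can be sketched as one decision function. This is only an illustration under my own assumptions: clusters are plain dicts with `term`, `catalog` and `subterms` keys, the names `available_term` and `respond` are hypothetical, and I assume the agent only asks the user to choose when more than one cluster leads to something for sale.

```python
def available_term(cluster):
    """Return the sellable term for a cluster: itself, a subterm, or None."""
    if cluster["catalog"]:
        return cluster["term"]
    for sub in cluster.get("subterms", []):
        if sub["catalog"]:
            return sub["term"]
    return None


def respond(clusters):
    """Map detected entity clusters to a (reaction, message) pair."""
    sellable = [(c, available_term(c)) for c in clusters]
    sellable = [(c, t) for c, t in sellable if t is not None]

    if not sellable:
        # Terms off scope: nothing the user asked for is in the catalog.
        asked = ", ".join(c["term"] for c in clusters)
        return ("off_scope", f"Sorry, this is not for sale: {asked}.")

    if len(sellable) > 1:
        # Query entities selection: let the user pick what to search first.
        names = ", ".join(t for _, t in sellable)
        return ("choose", f"Which one should I search first: {names}?")

    cluster, term = sellable[0]
    if term == cluster["term"]:
        return ("exact", f"Here is what I found for {term}.")
    # Partial term matching: only a more generic term is available.
    return ("partial", f"{cluster['term']} is not available, but I found {term}.")
```

For the party query, `respond` returns the `partial` reaction, because only the generic `ice cream` is in the catalog and `party` is ignored as not for sale.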
Term extraction from the product catalog and from the user's text query share the same procedures, described below: lemmatization and n-gram factorization.
The lemmatization procedure returns the root form of an inflected word. For example, "runs" and "running" both point to the same root, "run".
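A toy illustration of the idea, using a tiny hand-written lookup table. The table and the function names are assumptions made for this sketch; a real system would more likely rely on a full dictionary-backed lemmatizer from an NLP library.

```python
# Tiny hand-written lemma table; a real system would rely on a full dictionary.
LEMMAS = {
    "runs": "run",
    "running": "run",
    "ran": "run",
    "creams": "cream",
    "parties": "party",
}


def lemmatize(word):
    """Return the root form of word, or the lowercased word if it is unknown."""
    word = word.lower()
    return LEMMAS.get(word, word)


def lemmatize_text(text):
    """Lemmatize every whitespace-separated token of text."""
    return [lemmatize(token) for token in text.split()]
```

Both the catalog text and the user query would go through the same function, so that inflected forms end up comparable.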
Indexing Sale Catalog
The product’s name and description are parsed, tokenized and finally stored in an in-memory Set.
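A minimal sketch of this indexing step could look as follows. The two-item catalog, the `name` and `description` field names and the regular-expression tokenizer are all assumptions chosen for illustration.

```python
import re


def tokenize(text):
    """Lowercase text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def index_catalog(products):
    """Build an in-memory set of tokens from product names and descriptions."""
    index = set()
    for product in products:
        index.update(tokenize(product["name"]))
        index.update(tokenize(product["description"]))
    return index


# Hypothetical two-item catalog, for illustration only.
catalog = [
    {"name": "Ice Cream", "description": "Classic dairy ice cream, 500 ml tub."},
    {"name": "Red Bull", "description": "Energy drink, 250 ml can."},
]
index = index_catalog(catalog)
```

A set gives constant-time membership checks, which is convenient when every n-gram of a query has to be tested against the catalog.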
N-Grams extraction from search query
An n-gram is a contiguous sequence of n words. In the example above, "could you suggest me vanilla ice cream for my party", the collation "vanilla ice cream" will be exploded as:
vanilla ice cream,
The system will weight the filtered items according to their length, longest first. The more consecutive collation terms are detected, the better the search output will be.
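The extraction and the length-based weighting can be sketched together. The function names, the `max_n` limit of three words and the `known_terms` set are assumptions for this example; the set stands in for whatever collations and catalog terms the system has collected.

```python
def ngrams(tokens, max_n=3):
    """Return all contiguous n-grams of tokens, for n from max_n down to 1."""
    result = []
    for n in range(min(max_n, len(tokens)), 0, -1):
        for i in range(len(tokens) - n + 1):
            result.append(" ".join(tokens[i:i + n]))
    return result


def rank_matches(query, known_terms, max_n=3):
    """Keep the query n-grams found in known_terms, longest (most words) first."""
    tokens = query.lower().split()
    matches = [g for g in ngrams(tokens, max_n) if g in known_terms]
    # sorted() is stable, so ties keep their order of appearance in the query.
    return sorted(matches, key=lambda g: len(g.split()), reverse=True)


known_terms = {"vanilla ice cream", "ice cream", "party"}
ranked = rank_matches(
    "could you suggest me vanilla ice cream for my party", known_terms
)
# ranked == ["vanilla ice cream", "ice cream", "party"]
```

Ranking by word count means the most specific collation is tried first, and the shorter ones remain available as fallbacks for partial matching.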