Initially, it’s most important to have test sets, so that you can properly assess the accuracy of your model. As you get additional data, you can also start adding it to your training data. Users often speak in fragments, that is, utterances that consist entirely or almost entirely of entities. For example, in the coffee ordering domain, some likely fragments might be “short latte”, “Italian soda”, or “hot chocolate with whipped cream”. Because fragments are so common, Mix has a predefined intent called NO_INTENT that is designed to capture them. NO_INTENT automatically includes all of the entities that have been defined in the model, so that any entity or sequence of entities spoken on its own doesn’t require its own training data.
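To make the idea of a fragment concrete, here is a minimal Python sketch that checks whether an utterance consists entirely or almost entirely of known entity literals. It is not how Mix detects fragments internally; the entity names, literals, filler-word list, and the functions `match_entities` and `is_fragment` are all hypothetical and chosen only for illustration.

```python
# Hypothetical entity literals for a coffee-ordering domain (illustrative only).
ENTITY_LITERALS = {
    "SIZE": ["short", "tall", "large"],
    "DRINK": ["latte", "italian soda", "hot chocolate"],
    "TOPPING": ["whipped cream"],
}
# Assumed connecting words that may appear in an "almost entirely entities" fragment.
FILLER_WORDS = {"with", "and", "a", "an"}


def match_entities(utterance):
    """Greedy longest-first match of known literals.

    Returns (entities, leftover_words), where entities is a list of
    (entity_name, literal) pairs in the order they were found.
    """
    words = utterance.lower().split()
    # Try multi-word literals first so "hot chocolate" is matched as one unit.
    literals = sorted(
        ((lit.split(), name) for name, lits in ENTITY_LITERALS.items() for lit in lits),
        key=lambda pair: len(pair[0]),
        reverse=True,
    )
    entities, leftover, i = [], [], 0
    while i < len(words):
        for lit_words, name in literals:
            if words[i:i + len(lit_words)] == lit_words:
                entities.append((name, " ".join(lit_words)))
                i += len(lit_words)
                break
        else:
            # No literal starts at this position: keep the word as leftover.
            leftover.append(words[i])
            i += 1
    return entities, leftover


def is_fragment(utterance):
    """True if the utterance has at least one entity and only filler words otherwise."""
    entities, leftover = match_entities(utterance)
    return bool(entities) and all(word in FILLER_WORDS for word in leftover)
```

Under these assumptions, `is_fragment("hot chocolate with whipped cream")` is true, while `is_fragment("I would like a latte")` is false because the carrier phrase contains words that are neither entities nor fillers.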
The Mix.nlu Discover tab allows you to see what users are saying to your deployed application, giving you the opportunity to refine your NLU models based on actual data. For now, the data is read-only; additional functionality will be added in future releases, such as the ability to export data, assign intents, annotate the data, and add selected samples to your training set. Note that if an entity has a known, finite list of values, you should create that entity in Mix.nlu as either a list entity or a dynamic list entity. A regular list entity is used when the list of options is stable and known ahead of time.
Training NLU Models
Once you have selected a set of samples, apply the bulk operation to the selected samples by clicking the appropriate icon in the row above the samples. When there are a lot of samples for an intent, you may want to filter the displayed samples by status. To do this, open the drop-down menu next to the status visibility toggle to choose the status to display.
For example, each user will have a different set of contacts on their phone. It is not practical (or even feasible) to add every possible set of contact names to your entity when you are building your model in Mix.nlu. If there are not enough annotated samples in your training set, you will be advised to add more.
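The difference between a list that is fixed when the model is built and one that is only known per user can be sketched in Python as follows. This is an illustration of the concept, not the Mix.nlu API: the function `find_values`, the drink list, and the contact identifiers are hypothetical placeholders.

```python
import re

# A stable list, known when the model is built (hypothetical values).
DRINKS = {"latte": "LATTE", "italian soda": "ITALIAN_SODA"}


def find_values(utterance, literals):
    """Return the values whose literal appears as whole words in the utterance.

    `literals` maps a spoken literal to a canonical value. Passing a
    different mapping on each request is what makes the list "dynamic"
    in this sketch.
    """
    text = utterance.lower()
    found = []
    for literal, value in literals.items():
        pattern = r"\b" + re.escape(literal.lower()) + r"\b"
        if re.search(pattern, text):
            found.append(value)
    return found


# Per-user contact lists, available only at runtime (made-up identifiers).
alice_contacts = {"bob": "contact_001", "mom": "contact_002"}
carol_contacts = {"dave": "contact_003"}
```

With these assumed inputs, `find_values("call Bob", alice_contacts)` finds `contact_001`, whereas the same utterance resolved against `carol_contacts` finds nothing, since Carol has no contact named Bob. The static `DRINKS` mapping, by contrast, is the same for every user.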
Why is natural language understanding important?
Before the entity type is created (or modified), Mix.nlu exports your existing NLU model to a ZIP file containing a TRSX file so that you have a backup. Creating (or modifying) a regex-based entity requires your NLU model to be re-tokenized, which may take some time and impact your existing annotations. In natural language understanding, an ontology is a formal definition of entities, ideas, events, and the relationships between them, for some knowledge area or domain. The existence of an ontology enables mapping natural language utterances to precise intended meanings within that domain.
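As a rough illustration of how an ontology constrains interpretations, the Python sketch below represents a tiny ontology as a dictionary that links intents to entities, with one list entity and one regex-based entity. The structure, the intent and entity names, the order-number pattern, and the `is_valid` helper are all assumptions made for this example; they do not reflect the TRSX format or any Mix.nlu interface.

```python
import re

# Hypothetical ontology: which entities each intent may carry,
# and how each entity's values are defined.
ONTOLOGY = {
    "intents": {
        "ORDER_DRINK": ["DRINK", "SIZE"],
        "CHECK_ORDER": ["ORDER_NUMBER"],
    },
    "entities": {
        "DRINK": {"type": "list", "values": ["latte", "italian soda"]},
        "SIZE": {"type": "list", "values": ["short", "tall"]},
        # Assumed format: two capital letters followed by four digits.
        "ORDER_NUMBER": {"type": "regex", "pattern": r"[A-Z]{2}\d{4}"},
    },
}


def is_valid(intent, entity_name, literal):
    """Check that the entity is linked to the intent and the literal fits its definition."""
    if entity_name not in ONTOLOGY["intents"].get(intent, []):
        return False
    entity = ONTOLOGY["entities"][entity_name]
    if entity["type"] == "regex":
        # The whole literal must match the pattern, not just a part of it.
        return re.fullmatch(entity["pattern"], literal) is not None
    return literal.lower() in entity["values"]
```

In this sketch, `is_valid("CHECK_ORDER", "ORDER_NUMBER", "AB1234")` holds, while the same entity under `ORDER_DRINK` is rejected because the ontology does not link that entity to that intent.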