Fine tuning

After the initial definition of a dictionary, there is usually an optimization phase in which you test that the entries you have added are detected and behave correctly in the API where they will be used.

These are some issues that may come up, and how to approach them:

  • An alias of an entry is not detected as such:

    The APIs apply a number of rules to generate aliases automatically from the morphological and semantic information of an entry. If a very basic alias is not detected, it's probably because some of that information is missing.

    A very common case is the generation of plurals for concepts. If the form added to your dictionary is a made-up word, the APIs will not know how to generate its plural, so you will need to include the plural as an alias.

  • One of the attributes I have defined is not shown correctly in the output:

    It's important to remember that an attribute you define will not be treated as an array unless the string _list is added to its name. A good way to see how the information you have defined will look in the response of an API such as Topics Extraction is the Semantic information tab in the entry view.

  • The entry I have defined is not detected with my analysis:

    The semantic analysis added through the user dictionary may not be the only analysis associated with a specific form. When this happens, the APIs carry out a disambiguation process that selects one analysis, using a number of techniques and depending on the operating parameters configured.

    This means that the analysis defined in your dictionary can lose in this disambiguation process. If this is your case, there's an option next to the Sense ID field of the entry that lets you indicate that you want only your sense/semantic analysis to be taken into account:

    Use only my senses check

    When this check is selected for an entry that has a sense in the user dictionary and several other senses in the basic resources provided, the senses from the basic resources are ignored and only the ones from the user dictionary are used.
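
The effect of the Use only my senses check described above can be pictured with a small sketch. This is only an illustration of the filtering idea, not the APIs' actual implementation: the Sense class, the "user"/"basic" source labels, the candidate_senses function and the example sense IDs are all hypothetical names introduced here, and the sketch assumes that the check has no effect when the form has no sense in the user dictionary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sense:
    """One possible analysis of a form (hypothetical structure)."""
    sense_id: str
    source: str  # "user" = user dictionary, "basic" = basic resources (assumed labels)


def candidate_senses(senses, use_only_my_senses):
    """Return the senses that would be passed on to disambiguation.

    With the check selected, and at least one user-dictionary sense present,
    the senses from the basic resources are dropped. Otherwise every sense
    stays in play and the disambiguation process may pick any of them.
    """
    user = [s for s in senses if s.source == "user"]
    if use_only_my_senses and user:
        return user
    return list(senses)


# A form with one sense in the user dictionary and two in the basic resources.
senses = [
    Sense("my_dict_jaguar_car", "user"),
    Sense("basic_jaguar_animal", "basic"),
    Sense("basic_jaguar_os", "basic"),
]

unchecked = candidate_senses(senses, use_only_my_senses=False)
checked = candidate_senses(senses, use_only_my_senses=True)

print([s.sense_id for s in unchecked])  # all three senses compete
print([s.sense_id for s in checked])    # only the user dictionary sense remains
```

In the unchecked case the user-defined sense is just one candidate among three, which is why it can lose in disambiguation. With the check selected it is the only candidate left.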