The import process lets you create dictionary entries from a file that defines them, or update entries that already exist in the dictionary. It can be carried out either when the dictionary is created or later on from the dictionary view.
The file does not need to have a specific extension; the only conditions it must fulfill concern its format:
Form
Form\tAliases\tSense ID\tSemantic information
ID\tForm\tAliases\tTag\tLemma\tExact form\tSense ID\tUse only my sense\tSemantic information
The symbol \t represents a tabulation. If the fields Aliases or Sense ID have more than one value associated, the values are separated by pipes (|).
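As an illustration of the field layout described above, here is a minimal Python sketch that splits one line of the extended format into its fields. The function names and dictionary keys are hypothetical, chosen only for this example. It also assumes that any extra tab after the eighth one belongs to the semantic information, which is an assumption and not something stated here.

```python
# Field names are hypothetical labels for the nine columns of the extended format.
EXTENDED_FIELDS = [
    "id", "form", "aliases", "tag", "lemma",
    "exact_form", "sense_id", "use_only_my_sense", "semantic_information",
]


def split_multi(value):
    """Split a pipe-separated field; an empty field yields an empty list."""
    return value.split("|") if value else []


def parse_extended_line(line):
    """Turn one tab-separated line into a dict keyed by field name."""
    # Limiting the number of splits keeps any further tabs inside the last
    # field (assumption: they are part of the semantic information).
    parts = line.rstrip("\r\n").split("\t", len(EXTENDED_FIELDS) - 1)
    if len(parts) != len(EXTENDED_FIELDS):
        raise ValueError(
            f"expected {len(EXTENDED_FIELDS)} fields, found {len(parts)}"
        )
    entry = dict(zip(EXTENDED_FIELDS, parts))
    # Aliases and Sense ID are the two fields that may hold several values.
    entry["aliases"] = split_multi(entry["aliases"])
    entry["sense_id"] = split_multi(entry["sense_id"])
    return entry
```

Empty fields are kept as empty strings (or empty lists for Aliases and Sense ID), so the position of every field stays the same whether or not it has a value.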
These two formats are complemented by two fields available in all the import dialogs: entry type and ontology type.
These fields define two of the most common aspects associated with a dictionary entry. Both are part of the semantic information associated with the entry (which is explained in detail in the semantic information definition section), but for ease of use they can also be set outside it.
Entry type defines whether the entry is considered an entity or a concept. In the import process there are three possible values:
Ontology type defines the value within the ontology associated with the entry. The menu offers a number of values from our ontology (the most common ones), but you can also write any value you want by selecting Write your own value, or select any of the values you have created for other entries in the dictionary.
In this case, all the imported entities would be of the type Characters, a node that does not exist in our ontology, placed under the node Person, which is part of our basic ontology.
If these fields are set and the same information they define is also in the semantic information in the file, this information will be overwritten.
These two formats and the two additional fields provided by the import interface cover the most common scenarios found when working with user dictionaries.
The first variant of the basic format, the one where only the form is defined, derives from the fact that the only thing you need to define an entry is its form. None of the other fields are mandatory, and the only field that will be assigned a value by default is entry type.
Form
This is an example of the first format:
Rachel Green
Joey Tribbiani
Ross Geller
Monica Geller
Phoebe Buffay
Chandler Bing
It's just a list of forms where the only thing you have to make sure of is that each entry is on a different line. This is extremely useful when you are getting started and come from a scenario where you only have lists of items to detect.
If you combine this with the entry type and the ontology type, setting them, for instance, to Entity and to Person>Characters (the user-defined value seen before), you can easily add a list of characters from a TV show.
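If your list of items already lives in code, a file in this first variant can be produced with a few lines of Python. This is only a sketch: the function name is hypothetical, and writing the file as UTF-8 is an assumption, since the required encoding is not stated in this section.

```python
from pathlib import Path


def write_form_list(forms, path):
    """Write one form per line and return how many forms were written."""
    cleaned = []
    for form in forms:
        form = form.strip()
        if not form:
            continue  # skip blank items so no empty lines reach the file
        if "\t" in form or "\n" in form:
            # A tab or line break inside a form would change how the line is read.
            raise ValueError(f"form contains a tab or line break: {form!r}")
        cleaned.append(form)
    # Assumption: UTF-8 is an acceptable encoding for the import file.
    Path(path).write_text("\n".join(cleaned) + "\n", encoding="utf-8")
    return len(cleaned)
```

For example, `write_form_list(["Rachel Green", "Joey Tribbiani"], "characters.txt")` would write a two-line file (the file name is just an example) that can then be imported with the entry type and ontology type set in the import dialog.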
The second variant of the basic format allows you to import entries with the basic default information they may have.
Form\tAliases\tSense ID\tSemantic information
This is an example of this format:
Rachel Green\t\tFR001|-\toccupation_list=waitress|personal shopper|executive at Ralph Lauren
Joey Tribbiani\tDr. Drake Ramoray\tFR002\toccupation_list=actor
Ross Geller\t\tFR003\toccupation_list=paleontologist|college professor
Monica Geller\t\tFR004\toccupation_list=chef
Phoebe Buffay\tRegina Phalange|Princess Consuela Bananahammock\tFR005\toccupation_list=Massage therapist|Musician/singer-songwriter
Chandler Bing\t\tFR006\toccupation_list=Statistical analysis and data reconfiguration|Junior advertising copywriter
There's a more detailed explanation of how to define this information in the semantic information definition section.
Note that when using any of the formats, if an entry does not have a value for one of the fields, the space for that field must still be included. See, for example, the aliases in the example for the second import format: only some of the entries have aliases, but all of them include the position where the aliases would go.
The sense ID associated with Rachel Green has two values: FR001 and, separated by |, a dash. This dash is how the use only my senses configuration option is enabled in an import process.
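The two points above (empty fields keep their position, and a dash among the sense IDs enables use only my senses) can be sketched in Python as follows. The function and parameter names are hypothetical, and placing the dash after the other sense IDs simply mirrors the Rachel Green example.

```python
def basic_line(form, aliases=(), sense_ids=(), semantic_info="",
               only_my_senses=False):
    """Build one line of the second variant of the basic format."""
    ids = list(sense_ids)
    if only_my_senses:
        # The dash is added as one more pipe-separated value of Sense ID.
        ids.append("-")
    # Joining all four fields keeps the tab for a field even when it is empty.
    return "\t".join([form, "|".join(aliases), "|".join(ids), semantic_info])
```

Calling `basic_line("Monica Geller", sense_ids=["FR004"], semantic_info="occupation_list=chef")` yields a line with two consecutive tabs after the form, because the empty Aliases field still occupies its position.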
The import process using any of the variants of the basic format always creates new entries. In other words, this format does not allow you to update entries that already exist in the dictionary.
This is the result of importing the file used as an example for the second import format:
The extended format allows you to import entries with all the information they may have, including the advanced settings and their ID.
ID\tForm\tAliases\tTag\tLemma\tExact form\tSense ID\tUse only my sense\tSemantic information
This is an example of this format:
\tGeller Cup\t\tNPUU\tGeller Cup\ty\tFR007\ty\tsementity/class=instance@type=Top>Object>Trophy
\tHuggsy\t\tNPUU\tHuggsy\ty\tFR008\ty\tsementity/class=instance@type=Top>Object>StuffedAnimal
\tCentral Perk\t\tNPUU\tCentral Perk\ty\tFR009\ty\tsementity/class=instance@type=Top>Location
5eb3f90244100\tJoey Tribbiani\tDr. Drake Ramoray|Hans Ramoray\tNPUU\tJoey Tribbiani\ty\t\tn\tsementity/class=instance@type=Top>Person>Characters\toccupation_list=actor
In this example, we can see two types of entries. The first three do not have an ID (which is why they start directly with a tab character), so they are new entries that will be added to the dictionary. The last one does have an ID, so it will update an existing entry in the dictionary (in this case, by adding a new alias).
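Before importing a file in the extended format, it can be handy to check which lines will create entries and which will update existing ones. The following Python sketch applies the rule described above (an empty first field means a new entry); the function name is hypothetical and the check is done locally, not by the import process itself.

```python
def split_new_and_updates(lines):
    """Return (forms of new entries, forms of entries to be updated)."""
    new_forms, updated_forms = [], []
    for line in lines:
        row = line.rstrip("\r\n")
        if not row:
            continue  # ignore blank lines
        # The ID is everything before the first tab; it is empty for new entries.
        entry_id, _, rest = row.partition("\t")
        form = rest.split("\t", 1)[0]
        if entry_id:
            updated_forms.append(form)
        else:
            new_forms.append(form)
    return new_forms, updated_forms
```

Applied to the four example lines, this would list Geller Cup, Huggsy and Central Perk as new entries and Joey Tribbiani as an update.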
There can be errors in the import process, mainly of two types:
When a line in the file gives a Format error, it will be ignored. The result of the import process will be specified by a message in the dictionary view, detailing the entries that have generated an error.
Usually, when you import a dictionary from a file and there is a format error, you may want to correct the affected rows in the import file (shown in the error message), copy them to a new file and try to import them again.
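Some format errors can be caught before importing. The sketch below flags rows that have fewer tab-separated fields than the chosen format expects (4 for the second variant of the basic format, 9 for the extended one). It assumes that a missing field position is one cause of a format error; the import process may apply other checks that are not covered here, and the function name is hypothetical.

```python
def find_short_rows(lines, expected_fields):
    """Return (line number, row) pairs for rows with too few fields."""
    suspects = []
    for number, line in enumerate(lines, start=1):
        row = line.rstrip("\r\n")
        if not row:
            continue  # blank lines are skipped, not reported
        if len(row.split("\t")) < expected_fields:
            suspects.append((number, row))
    return suspects
```

The returned rows can be corrected and saved to a new file, in the same way as the rows reported in the error message.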
If the error is related to the plan limit, the import won't be carried out and the following message will appear:
To export a dictionary's entries to a file, you only have to click on the sidebar's Export button.
This will open a modal dialog that will enable you to select the format you want to use in the export process: basic or extended.
It will create a file with the same format required to import a dictionary (the extended format, the most complete one), featuring each entry in a new line and all fields separated with tabulation characters.
It's important to know that this action doesn't create an exact backup of the dictionary. It only exports entries. Other fields of the dictionary like name, language and description are not saved in the export file.