Multiwords

Multiwords are combinations of words that are always grouped together when they appear in a specific order. They are defined in individual entries of the model, but they affect every text analyzed.

For that reason, it's useful to have an overview of all the multiwords defined in the model, so you can quickly identify the ones that may be causing a conflict or unexpected behavior in the analysis.

Multiwords list

The main table shows information about all the multiwords defined in the entries and subentries. You can browse them by name, as entered by the user, and go directly to the entry where each one is defined, whether in the entry definition or in one of its subentries. The table can be sorted alphabetically by any of its columns and also provides a dynamic text filter.

Why are multiwords important?

Multiwords define a more restrictive scenario than single words do, which means that when several scenarios are possible, the most restrictive one is chosen, as its occurrence is rarer.

In other words, when a multiword is defined in the model and it appears in a text, it will always be grouped for sentiment analysis purposes.

Let's see an example:

      Entries                                       Analysis
Case  Entry definition          Sentiment behavior  Text                          Result
1     global crisis before end  POSITIVE            The global crisis is ending   POSITIVE
      crisis                    NEGATIVE
2     global crisis before end  POSITIVE            There's a huge crisis         NEGATIVE
      crisis                    NEGATIVE
3     global crisis before end  POSITIVE            There's a huge global crisis  NONE
      crisis                    NEGATIVE

In the last case, the multiword defined in the first entry is detected but its full context is not, so the text is not assigned the sentiment behavior defined for that entry. Likewise, because the multiword is detected, the single word defined in the second entry is not, so the text does not get its polarity either. Note that when two patterns could match, the one that appears first in the text is chosen.
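
The three cases can be reproduced with a small Python sketch. It is only an illustration of the behavior described here, not the actual engine: the names MULTIWORDS, ENTRIES, group and analyze are hypothetical, the "before end" context is approximated by checking whether a later word starts with "end", and only two-word multiwords are handled.

```python
# Hypothetical stand-ins for the model; not the real engine's data structures.
MULTIWORDS = ["global crisis"]

# form -> (polarity, context word that must appear later, or None)
ENTRIES = {
    "global crisis": ("POSITIVE", "end"),
    "crisis": ("NEGATIVE", None),
}

def group(tokens):
    """Merge adjacent token pairs that form a defined two-word multiword."""
    out, i = [], 0
    while i < len(tokens):
        pair = " ".join(tokens[i:i + 2])
        if pair in MULTIWORDS:
            out.append(pair)
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out

def analyze(text):
    """Return the polarity of the first entry whose form and context match."""
    units = group(text.lower().split())
    for pos, unit in enumerate(units):
        if unit not in ENTRIES:
            continue
        polarity, context = ENTRIES[unit]
        # Assumption: crude stand-in for lemma matching ("ending" counts as "end").
        if context is None or any(u.startswith(context) for u in units[pos + 1:]):
            return polarity
    return "NONE"

print(analyze("The global crisis is ending"))   # POSITIVE
print(analyze("There's a huge crisis"))         # NEGATIVE
print(analyze("There's a huge global crisis"))  # NONE
```

In the third call, "crisis" never exists as a separate unit once "global crisis" has been grouped, which is why the NEGATIVE entry cannot apply.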

Important

When more than one grouping is possible, the system chooses the longest one. For example, in the text "the huge global crisis", with "global crisis" and "huge global crisis" defined as multiwords, the second one is chosen.
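
The longest-match rule can be sketched in Python as a greedy left-to-right scan that tries the longest candidate first. This is an assumed, simplified model of the grouping step; the function name group_multiwords and the whitespace tokenization are hypothetical and not part of the product.

```python
def group_multiwords(tokens, multiwords):
    """Greedy left-to-right grouping that prefers the longest multiword."""
    # Store each multiword as a tuple of lowercase tokens for comparison.
    patterns = {tuple(mw.lower().split()) for mw in multiwords}
    max_len = max((len(p) for p in patterns), default=1)
    grouped = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate starting at position i, then shorter ones.
        for size in range(min(max_len, len(tokens) - i), 1, -1):
            candidate = tuple(t.lower() for t in tokens[i:i + size])
            if candidate in patterns:
                grouped.append(" ".join(tokens[i:i + size]))
                i += size
                break
        else:
            # No multiword starts here, so keep the single word.
            grouped.append(tokens[i])
            i += 1
    return grouped

tokens = "the huge global crisis".split()
result = group_multiwords(tokens, ["global crisis", "huge global crisis"])
print(result)  # ['the', 'huge global crisis']
```

Because the scan starts at "huge" before it reaches "global", the three-word multiword is found first and "global crisis" is never considered on its own.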