What is Lemmatization, PoS and Parsing?

Lemmatization, PoS and Parsing is the name of MeaningCloud's API for its basic linguistic modules.

Although its name is simple, the parser offers many functionalities derived from the complete morphosyntactic and semantic analysis it carries out. Instead of providing a separate API for each feature of this analysis, the features are configured through parameters, so users can enable as many of them as they wish and combine them with other MeaningCloud features, such as Topics Extraction or Sentiment Analysis.

Through this API you can carry out some of the most common tasks in linguistic applications, each of them a different aspect of the morphosyntactic and semantic analysis:

  • Syntactic analysis: obtains a thorough syntactic analysis, giving a complete syntactic tree where the leaves represent the most basic elements and their morphological and semantic analyses.
  • Lemmatization: obtains the lemmas of the different words in a text.
  • PoS tagging: obtains not only the grammatical category of a word, but also all the possible grammatical categories in which a word of each specific PoS type can be classified (check the associated tagset). Where applicable, the morphological analysis is linked to a semantic analysis.
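
To illustrate how the three tasks relate, here is a minimal Python sketch that walks a syntactic tree and collects the form, lemma and PoS tag found at each leaf. The field names (`token_list`, `analysis_list`, `form`, `lemma`, `tag`), the sample tree and the tag values are assumptions made for this illustration; check the response reference and the tagset for the real structure and values.

```python
from typing import Any, Dict, Iterator, List, Tuple

# Assumed (hypothetical) field names: children live under "token_list",
# and each leaf carries its morphological analyses under "analysis_list".
Node = Dict[str, Any]


def iter_leaves(node: Node) -> Iterator[Node]:
    """Yield the leaves of the tree in left-to-right order."""
    children = node.get("token_list", [])
    if not children:
        yield node
        return
    for child in children:
        yield from iter_leaves(child)


def lemmas_and_tags(tree: Node) -> List[Tuple[str, str, str]]:
    """Return one (form, lemma, tag) triple per analysis of each leaf."""
    triples = []
    for leaf in iter_leaves(tree):
        for analysis in leaf.get("analysis_list", []):
            triples.append(
                (leaf.get("form", ""), analysis.get("lemma", ""), analysis.get("tag", ""))
            )
    return triples


# Hand-written sample, NOT real API output; the tags are placeholders.
SAMPLE_TREE: Node = {
    "form": "Dogs bark",
    "token_list": [
        {
            "form": "Dogs",
            "analysis_list": [{"lemma": "dog", "tag": "TAG_NOUN"}],
        },
        {
            "form": "bark",
            "analysis_list": [{"lemma": "bark", "tag": "TAG_VERB"}],
        },
    ],
}

for form, lemma, tag in lemmas_and_tags(SAMPLE_TREE):
    print(f"{form}\t{lemma}\t{tag}")
```

A leaf may carry more than one analysis when a word is ambiguous, which is why the sketch returns one triple per analysis rather than one per word.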

This API can be configured so that the same topics that are extracted by the Topics Extraction API are included in the corresponding node of the syntactic tree, allowing the user to combine this extraction with syntactic information to detect patterns in a text. Similarly, it's also possible to include the information detected by Sentiment Analysis, making this a very powerful tool that allows you to combine different types of analysis.
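
As a rough sketch of what "configured through parameters" could look like in practice, the Python snippet below builds the form-encoded body of a request that optionally enables topic extraction and sentiment analysis. The parameter names (`key`, `txt`, `lang`, `tt`, `sm`) and the example values are assumptions for illustration only; take the actual names, accepted values and endpoint from the API reference.

```python
from typing import Dict, Optional
from urllib.parse import urlencode


def build_parser_params(
    key: str,
    text: str,
    lang: str,
    topic_types: Optional[str] = None,
    sentiment_model: Optional[str] = None,
) -> Dict[str, str]:
    """Build request parameters; optional analyses are added only if requested."""
    # Parameter names below are assumed, not confirmed by this page.
    params = {"key": key, "txt": text, "lang": lang}
    if topic_types is not None:
        params["tt"] = topic_types  # assumed: which topic types to include
    if sentiment_model is not None:
        params["sm"] = sentiment_model  # assumed: which sentiment model to use
    return params


# Hypothetical values: "a" for topic types and "general" for the model.
params = build_parser_params(
    key="YOUR_LICENSE_KEY",
    text="Madrid is a great city.",
    lang="en",
    topic_types="a",
    sentiment_model="general",
)
body = urlencode(params)  # form-encoded body, ready to send with any HTTP client
print(body)
```

Leaving the optional arguments unset keeps the request to the basic morphosyntactic analysis, so each extra analysis is opt-in.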

The currently supported languages are Spanish, English, French, Italian, Portuguese and Catalan.