Request

Requests are made using GET or POST data submissions to the API entry point. POST is recommended, since it avoids the limit on parameter length associated with the GET method.

Endpoint

This is the endpoint to access the API.

Service: Topics Extraction
Method: POST
URL: https://api.meaningcloud.com/topics-2.0

If you are working with an on-premises installation, you will need to substitute api.meaningcloud.com by your own server address.
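
As a rough illustration of the endpoint described above, the sketch below builds (but does not send) a POST request with Python's standard library. The function name build_topics_request, the placeholder key and the sample text are hypothetical. The sketch also assumes that the endpoint accepts a form-encoded body and that an on-premises server is reached over HTTPS under the same path.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Public host; replace it with your own server address for on-premises installs.
API_HOST = "api.meaningcloud.com"


def build_topics_request(params, host=API_HOST):
    """Return an unsent POST request for the Topics Extraction endpoint."""
    # Assumption: the same scheme and path apply to on-premises servers.
    url = f"https://{host}/topics-2.0"
    # Parameters travel in the form-encoded body, not in the URL.
    body = urlencode(params).encode("utf-8")
    return Request(url, data=body, method="POST")


request = build_topics_request(
    {"key": "YOUR_KEY", "lang": "en", "tt": "a", "txt": "London is a city."}
)
print(request.get_method(), request.full_url)
# POST https://api.meaningcloud.com/topics-2.0
```

Passing the returned object to urllib.request.urlopen would send it; that step is left out here so the example runs without network access or a valid key.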

Parameters

These are the supported parameters.

Name Description Values Default
key

The access key is required to make requests to any of our web services. You can get a valid access key for free by creating an account at MeaningCloud.

Required
of

Output format.

xml
json
Optional. Default: of=json
lang

It specifies the language in which the text must be analyzed. See supported languages.

Required
ilang

It specifies the language in which the returned values will appear (where they are known). Check the response section to see which fields are affected.

en: English
es: Spanish
it: Italian
fr: French
pt: Portuguese
ca: Catalan
da: Danish
sv: Swedish
no: Norwegian
fi: Finnish
zh: Chinese
ru: Russian
Optional. Default: same as lang
txt

Input text to be analyzed.

UTF-8 encoded text (plain text, HTML or XML).
Optional. Default: txt=""
txtf

The text format parameter specifies whether the text included in the txt parameter uses markup language that needs to be interpreted (known HTML tags and HTML code will be interpreted, and unknown tags will be ignored).

plain
markup
Optional. Default: txtf=plain
url

URL of the content to analyze. Currently only non-authenticated HTTP and FTP are supported. The content types supported for URL contents can be found here.

Optional. Default: url=""
doc

Input file with the content to analyze. The supported formats for file contents can be found here.

Optional. Default: doc=""
tt

The list of topic types to extract is specified through a string made up of the letters assigned to each of the topic types to be extracted.

e: named entities
c: concepts
t: time expressions
m: money expressions
n: quantity expressions [beta]
o: other expressions
q: quotations
r: relations
a: all
Required
uw

Deal with unknown words. This feature adds a stage to the topic extraction in which the engine, much like a spellchecker, tries to find a suitable analysis for the unknown words resulting from the initial analysis assignment. It is especially useful for reducing the impact of typos on text analyses.

y: enabled
n: disabled
Optional. Default: uw=n
rt

Deal with relaxed typography. This parameter indicates how reliable the text to analyze is (as far as spelling, typography, etc. are concerned), and influences how strict the engine will be when taking these factors into account in the topic extraction.

y: enabled
u: enabled only for user dictionary
n: disabled
Optional. Default: rt=n
ud

The user dictionary lets you include user-defined entities and concepts in the topics extraction. It provides a mechanism to adapt the process to specific domains or to terms relevant to a user's interests, either to increase the precision in any of the domains already covered by our ontology, to include a new one, or just to add a new semantic meaning to known terms. Several dictionaries can be combined by separating them with |.

Name of your user dictionaries.
Optional. Default: ud=""
st

Show subtopics. This parameter indicates whether subtopics are shown. See subtopics for a more in-depth explanation.

y: enabled
n: disabled
Optional. Default: st=n
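
To make the general parameters above more concrete, here is a minimal sketch of a client-side check run before sending a request. The helper name check_params is hypothetical, and the API itself may validate requests differently. It only relies on what the table states: key, lang and tt are required, and tt is a string built from the topic type letters.

```python
# Topic type letters as listed for the tt parameter.
TOPIC_TYPES = {
    "e": "named entities",
    "c": "concepts",
    "t": "time expressions",
    "m": "money expressions",
    "n": "quantity expressions [beta]",
    "o": "other expressions",
    "q": "quotations",
    "r": "relations",
    "a": "all",
}
REQUIRED = ("key", "lang", "tt")


def check_params(params):
    """Return a list of problems found in a parameter dict (empty if none)."""
    problems = [
        f"missing required parameter: {name}"
        for name in REQUIRED
        if not params.get(name)
    ]
    # Every character of tt must be one of the documented letters.
    unknown = sorted(set(params.get("tt") or "") - set(TOPIC_TYPES))
    if unknown:
        problems.append("unknown topic type letters: " + "".join(unknown))
    return problems


print(check_params({"key": "YOUR_KEY", "lang": "en", "tt": "ec"}))
# []
print(check_params({"lang": "en", "tt": "ex"}))
# ['missing required parameter: key', 'unknown topic type letters: x']
```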

Important

The fields txt, doc and url are mutually exclusive. At least one of them must not be empty (a content parameter is required), and when more than one of them has a value assigned, only one is processed. The precedence order is txt, url and doc.
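
The precedence rule can be mirrored on the client side to predict which content parameter a request will use. The following is a small sketch of that rule only. The function name processed_content is hypothetical, and the URL and file name in the usage line are made up.

```python
def processed_content(txt="", url="", doc=""):
    """Return the name of the content parameter that would be processed.

    Follows the documented precedence: txt, then url, then doc.
    """
    for name, value in (("txt", txt), ("url", url), ("doc", doc)):
        if value:
            return name
    # None of the three has a value, so the request lacks a content parameter.
    raise ValueError("one of txt, url or doc must not be empty")


print(processed_content(url="http://example.com/article", doc="report.pdf"))
# url
```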

Besides these parameters, there are a number of additional parameters that are specific to the different topic types that can be extracted.

Entities parameters

Name Description Values Default
dm

Type of disambiguation applied. It is cumulative: the semantic disambiguation mode also includes morphosyntactic disambiguation.

n: no disambiguation
m: morphosyntactic disambiguation
s: semantic disambiguation
Optional. Default: dm=s
sdg

Semantic disambiguation grouping. This parameter only applies when semantic disambiguation is activated (dm=s). See disambiguation grouping for a more in-depth explanation.

n: none
g: global intersection
t: intersection by type
l: intersection by type - smallest location
Optional. Default: sdg=l
cont

Disambiguation context. It sets the context to prioritize in the semantic disambiguation of entities. See context disambiguation for a more in-depth explanation.

Optional. Default: cont=""
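
Since sdg only applies when dm=s, a client can leave it out in the other modes. The sketch below shows one way to do that. The helper name entity_params is hypothetical, and omitting sdg is a design choice of this example, not a requirement stated by the API.

```python
DM_VALUES = {"n", "m", "s"}
SDG_VALUES = {"n", "g", "t", "l"}


def entity_params(dm="s", sdg="l"):
    """Return the entity-specific parameters worth sending in a request."""
    if dm not in DM_VALUES:
        raise ValueError(f"dm must be one of {sorted(DM_VALUES)}")
    if sdg not in SDG_VALUES:
        raise ValueError(f"sdg must be one of {sorted(SDG_VALUES)}")
    params = {"dm": dm}
    if dm == "s":
        # sdg is documented as applying only with semantic disambiguation.
        params["sdg"] = sdg
    return params


print(entity_params())
# {'dm': 's', 'sdg': 'l'}
print(entity_params(dm="m", sdg="g"))
# {'dm': 'm'}
```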

Time expressions parameters

Name Description Values Default
timeref

This value sets a specific time reference, which is used to resolve the actual value of all the relative time expressions detected in the text.

YYYY-MM-DD hh:mm:ss GMT±HH:MM
Optional. Default: current time at the moment the request is made.
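
The timeref pattern does not map directly onto a single strftime directive, so the sketch below assembles it by hand from a timezone-aware datetime. The function name format_timeref is hypothetical. The sketch assumes that hh is a 24-hour value and that a zero offset is written as GMT+00:00.

```python
from datetime import datetime, timedelta, timezone


def format_timeref(moment):
    """Format an aware datetime as 'YYYY-MM-DD hh:mm:ss GMT±HH:MM'."""
    offset = moment.utcoffset()
    if offset is None:
        raise ValueError("moment must be timezone-aware")
    total_minutes = int(offset.total_seconds()) // 60
    sign = "-" if total_minutes < 0 else "+"
    hours, minutes = divmod(abs(total_minutes), 60)
    # Assumption: hh is the hour on a 24-hour clock.
    return f"{moment:%Y-%m-%d %H:%M:%S} GMT{sign}{hours:02d}:{minutes:02d}"


one_hour_ahead = timezone(timedelta(hours=1))
print(format_timeref(datetime(2024, 1, 15, 9, 30, 0, tzinfo=one_hour_ahead)))
# 2024-01-15 09:30:00 GMT+01:00
```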

Subtopics

Subtopics cover the cases where a structure detected as a topic contains another topic. Under normal circumstances, only the containing topic is returned, which can mean losing the semantic information associated with the inner one. In each element, the field is not called subtopics but sub[element name], and each element included in it has the same structure as the parent element.

The most common case involves entities: an element that carries its own semantic information ends up inside an entity of another type because of the grammatical structure in which it appears.

    For example: "The tickets for the Tower of London are very expensive."

Tower of London would be detected as an entity, and within its analysis, London would appear as a subentity.

Two points to take into account:

  • There is only one level of subtopics, and the elements included there do not go through the same disambiguation process as top-level topics.
  • Currently, only subentity_list is enabled.
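
As an illustration of how a client might read subentities, the sketch below walks a hand-written response fragment for the Tower of London example. Only subentity_list is named in this section. The entity_list and form fields, the overall shape of the fragment and the function name are assumptions made for the example, so check the response section for the real structure.

```python
# Hypothetical response fragment: entity_list and form are assumed field names.
response = {
    "entity_list": [
        {"form": "Tower of London", "subentity_list": [{"form": "London"}]},
    ]
}


def entities_with_subentities(response):
    """Yield (entity form, [subentity forms]) pairs, one level deep only."""
    for entity in response.get("entity_list", []):
        # There is a single level of subtopics, so no recursion is needed.
        subs = [sub["form"] for sub in entity.get("subentity_list", [])]
        yield entity["form"], subs


print(list(entities_with_subentities(response)))
# [('Tower of London', ['London'])]
```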

Disambiguation grouping

The examples below show how each mode of the disambiguation grouping parameter behaves. This parameter only takes effect when dm=s, which means that all modes include morphological and basic semantic disambiguation.

  • No grouping (sdg=n): no grouping is done; the analyses obtained after the morphological and basic semantic disambiguation are the ones shown.
      Example 1: Toledo is very beautiful.

        When no semantic disambiguation is applied, Toledo has the following senses: Last Name, City in Spain, City in Colombia, City in USA, Adm2 in Spain, and Spanish Sports Team. The disambiguation applied will result in two senses: City in Spain and Adm2 in Spain.

      Example 2: The Toledo always wins his matches in the Santiago Bernabéu stadium.

        In this case the disambiguation applied to Toledo will result in just one sense: Spanish Sports Team.

  • Global intersection (sdg=g): ambiguous entities are grouped at entity type level (that is, the global intersection of the analyses) and marked as uncertain, resulting in just one sense per entity.
      Example 1: Her favorite is London

        In this case, the basic disambiguation results in two senses: Last Name and City. This result is ambiguous (as is the sentence itself, which can refer to either the city or the author), so the output is a single sense with no type (the result of intersecting "Person>LastName" and "Location>GeoPoliticalEntity>City") and with uncertain as the confidence value.

  • Intersection by type (sdg=t): ambiguous entities are grouped at entity subtype level and marked as uncertain.
      Example 1: Toledo is very beautiful.

        Again, the basic disambiguation of Toledo leaves two senses: City in Spain and Adm2 in Spain. As both share the entity type up to GeoPoliticalEntity, the result for this grouping mode will be a unique analysis with the sementity type Location>GeoPoliticalEntity and uncertain as the confidence.

      Example 2: The Toledo always wins his matches in the Santiago Bernabéu stadium.

        In this case the result will be the same as for sdg=n, as it is not ambiguous: Spanish Sports Team.

  • Intersection by type - smallest location (sdg=l): similar to sdg=t except for locations; ambiguous locations are disambiguated in favor of the smaller location (lower in the hierarchy). This is the default mode.
      Example 1: Toledo is very beautiful.

        As seen in the previous examples, the basic disambiguation results in two senses, both of them locations: City in Spain and Adm2 in Spain. When there are several ambiguous locations, this mode keeps the smaller one, so in this case the resulting sense is City in Spain.

      Example 2: The Toledo always wins his matches in the Santiago Bernabéu stadium.

        Again, the result for this example will be the same as in sdg=n and sdg=t, as the result is neither ambiguous nor a location.

The main goal of this parameter is to provide different possibilities in the disambiguation process in order to be adaptable to different scenarios.
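
The type intersections mentioned in the examples can be pictured as the longest shared prefix of the type paths. The sketch below illustrates that idea only; it is not the engine's actual algorithm and does not reproduce the differences between the grouping modes. The function name is hypothetical, and the path Location>GeoPoliticalEntity>Adm2 is an assumed value used for illustration.

```python
def intersect_types(type_paths):
    """Return the longest shared prefix of '>'-separated type paths."""
    split_paths = [path.split(">") for path in type_paths]
    shared = []
    # Compare the paths level by level until they stop agreeing.
    for levels in zip(*split_paths):
        if len(set(levels)) != 1:
            break
        shared.append(levels[0])
    return ">".join(shared)


# Nothing in common, so the result is an empty type (an empty line is printed).
print(intersect_types(["Person>LastName", "Location>GeoPoliticalEntity>City"]))
# The second path is an assumed value for the Adm2 sense of Toledo.
print(intersect_types(
    ["Location>GeoPoliticalEntity>City", "Location>GeoPoliticalEntity>Adm2"]
))
# Location>GeoPoliticalEntity
```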

Context disambiguation

With the disambiguation context parameter you can give priority to an entity or to one or more themes when disambiguating a text, so that the analyses related to them are preferred. There are two different types of values for this parameter:

  • Entity IDs: id of an entity returned by this same API.
      Example: He always wanted to live in Toledo.

        When analyzed with cont=33fc13e6dd (the id of the entity America in our ontology) two variants of the entity Toledo are detected, a city in Antioquia, Colombia, and another city in Ohio, USA.

  • Ontology themes: name of a theme from our ontology.
      Example: Madrid is the best!

        When analyzed with cont=Football (name of the entry ODTHEME_FOOTBALL in our ontology), the entity Madrid is detected as a sport team (Real Madrid C. F.).

You can use the cont parameter with either identifiers or themes from our ontology; if you use more than one, they must always be separated by the | character.
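
A value for cont that mixes identifiers and themes can be assembled with a simple join. This is a minimal sketch with a hypothetical helper name, using the entity ID and the theme from the examples above.

```python
def build_cont(values):
    """Join entity IDs and/or theme names into a single cont value."""
    cleaned = [value.strip() for value in values if value.strip()]
    if any("|" in value for value in cleaned):
        raise ValueError("values must not contain the | separator")
    return "|".join(cleaned)


print(build_cont(["33fc13e6dd", "Football"]))
# 33fc13e6dd|Football
```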

Supported Languages

These are the languages currently supported and the corresponding value to use in the lang parameter.

  • en: English
  • es: Spanish
  • it: Italian
  • fr: French
  • pt: Portuguese
  • ca: Catalan
  • da: Danish
  • sv: Swedish
  • no: Norwegian
  • fi: Finnish
  • zh: Chinese
  • ru: Russian