Topics Extraction - Console

key

The access key is required for making requests to any of our web services.

of

Output format

lang

Language in which the text is going to be analyzed.

ilang

It specifies the language of the returned response.

Content

txt

Input text that's going to be analyzed.

txtf

The text format parameter specifies whether the text included in the txt parameter uses a markup language that needs to be interpreted.


doc

Input file with the content to analyze. The supported formats for file contents can be found here.


url

URL with the content to analyze. Currently only non-authenticated HTTP and FTP are supported. The content types supported for URL contents can be found here.

The fields txt, doc and url are mutually exclusive: exactly one of them must be non-empty (a content parameter is required).
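
The rule above can be checked on the client before a request is sent. The following is a minimal sketch, assuming the rule means that exactly one of txt, doc and url is supplied; the function name build_content_params is hypothetical and is not part of the service.

```python
def build_content_params(txt=None, doc=None, url=None):
    """Return a dict with the single content parameter that was supplied.

    Hypothetical client-side helper: it only enforces that exactly one of
    txt, doc and url is non-empty, and performs no request itself.
    """
    candidates = {"txt": txt, "doc": doc, "url": url}
    # Keep only the parameters that carry a non-empty value.
    provided = {name: value for name, value in candidates.items() if value}
    if len(provided) != 1:
        raise ValueError(
            "exactly one of txt, doc and url must be provided, got: "
            + (", ".join(sorted(provided)) or "none")
        )
    return provided
```

For example, build_content_params(txt="Madrid is the capital of Spain.") returns a dict with only the txt key, while passing both txt and url raises ValueError.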

Operating Parameters

tt

The list of topic types to extract, specified as a string made up of the letters assigned to each of the topic types to be extracted.

uw

Deal with unknown words. This feature adds a stage to the topic extraction in which the engine, much like a spellchecker, tries to find a suitable analysis for the unknown words left by the initial analysis. It is especially useful for reducing the impact of typos on text analyses.

rt

Deal with relaxed typography. This parameter indicates how reliable the text to analyze is with regard to spelling, typography, etc., and determines how strictly the engine takes these factors into account in the topic extraction.

st

Show subtopics. This parameter indicates whether subtopics are shown. Currently, only subentity_list is enabled.

(for entities)

dm

Disambiguation level applied. There are three possible values:

  • n: no disambiguation is applied
  • m: (morphosyntactic disambiguation mode) only morphosyntactic disambiguation is applied, so all possible senses are shown. The effect of this mode is seen mainly in the quality of the syntactic analysis.
  • s: (semantic disambiguation mode) morphosyntactic and semantic disambiguation are applied (default).

sdg

Semantic disambiguation grouping. This parameter will only apply when semantic disambiguation is activated (dm=s). There are four possible values:

  • n: none
  • t: intersection by type
  • l: intersection by type - smallest location (default)
  • g: global intersection
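
The relationship between dm and sdg described above can be captured in a small helper. This is only a sketch: the function name disambiguation_params is hypothetical, and because the text does not say whether the service rejects or ignores sdg when dm is not s, the sketch assumes it is safe to leave sdg out in that case.

```python
DM_VALUES = {"n", "m", "s"}        # values listed for dm
SDG_VALUES = {"n", "t", "l", "g"}  # values listed for sdg


def disambiguation_params(dm="s", sdg=None):
    """Build the dm/sdg part of a request as a dict (hypothetical helper)."""
    if dm not in DM_VALUES:
        raise ValueError(f"dm must be one of {sorted(DM_VALUES)}, got {dm!r}")
    params = {"dm": dm}
    if sdg is not None:
        if sdg not in SDG_VALUES:
            raise ValueError(f"sdg must be one of {sorted(SDG_VALUES)}, got {sdg!r}")
        # sdg only applies with semantic disambiguation (dm=s); assumption:
        # omitting it otherwise is harmless.
        if dm == "s":
            params["sdg"] = sdg
    return params
```

Called with no arguments, the helper returns only dm=s and leaves sdg to the service default (l, according to the list above).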

cont

Disambiguation context. Context prioritization for entity semantic disambiguation.

(for time expressions)

timeref

This value sets a specific time reference used to resolve the actual value of all the relative time expressions detected in the text. The time reference must use the following format:

    YYYY-MM-DD hh:mm:ss GMT±HH:MM

If no time reference is set, it defaults to the time at which the request is made.
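
A timeref string in the format shown above can be produced from a timezone-aware datetime. The sketch below is one way to do it with the Python standard library; the function name format_timeref is hypothetical, and it assumes that hh is a 24-hour value and that the offset after GMT is the datetime's UTC offset.

```python
from datetime import datetime, timedelta, timezone


def format_timeref(moment: datetime) -> str:
    """Format a timezone-aware datetime as 'YYYY-MM-DD hh:mm:ss GMT±HH:MM'."""
    offset = moment.utcoffset()
    if offset is None:
        raise ValueError("moment must be timezone-aware")
    # Split the UTC offset into a sign plus whole hours and minutes.
    total_minutes = int(offset.total_seconds() // 60)
    sign = "+" if total_minutes >= 0 else "-"
    hours, minutes = divmod(abs(total_minutes), 60)
    return f"{moment:%Y-%m-%d %H:%M:%S} GMT{sign}{hours:02d}:{minutes:02d}"


# Example: 5 March 2024, 14:30:00, one hour ahead of GMT.
example = datetime(2024, 3, 5, 14, 30, 0, tzinfo=timezone(timedelta(hours=1)))
print(format_timeref(example))  # prints "2024-03-05 14:30:00 GMT+01:00"
```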

Dictionaries

ud

The user dictionary allows you to include user-defined entities and concepts in the topics analysis. It provides a mechanism to adapt the process to specific domains or to terms relevant to a user's interests: to increase the precision in any of the domains already covered by our ontology, to include a new domain, or just to add a new semantic meaning to known terms.

To create your own dictionary, just go to our customization engine.
