Topics Extraction is MeaningCloud's solution for extracting elements of relevant information from unstructured text:

  • Named entities: people, organizations, places, etc.
  • Concepts: significant keywords
  • Time and money expressions
  • Quantity expressions [beta]
  • Quotes
  • Relations

Detection combines several natural language processing techniques that produce morphological, syntactic, and semantic analyses of a text, which are then used to identify the different types of significant elements. The currently supported languages are Spanish, English, French, Italian, Portuguese, and Catalan.
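As a rough illustration of how a request for these element types might be assembled, the sketch below builds the parameters for a hypothetical call. The languages and element types come from this page; the endpoint URL, the parameter names (key, lang, txt, tt), the use of two-letter language codes, and the one-letter topic codes are assumptions made for the example and should be checked against the Documentation before use.

```python
# Sketch only: the endpoint, parameter names, and codes are assumptions,
# not confirmed by this page. Check the API documentation before use.
from urllib.parse import urlencode

ASSUMED_ENDPOINT = "https://api.meaningcloud.com/topics-2.0"  # assumed URL

# Languages listed on this page, mapped to ISO 639-1 codes
# (that the API accepts these codes is an assumption).
SUPPORTED_LANGUAGES = {
    "Spanish": "es",
    "English": "en",
    "French": "fr",
    "Italian": "it",
    "Portuguese": "pt",
    "Catalan": "ca",
}

# Element types listed on this page, mapped to hypothetical one-letter codes.
ASSUMED_TOPIC_CODES = {
    "entities": "e",
    "concepts": "c",
    "time": "t",
    "money": "m",
    "quantity": "n",
    "quotes": "q",
    "relations": "r",
}


def build_request(api_key, text, language, topics):
    """Return (url, form-encoded body) for a hypothetical extraction request."""
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language: {language}")
    unknown = [t for t in topics if t not in ASSUMED_TOPIC_CODES]
    if unknown:
        raise ValueError(f"unknown topic types: {unknown}")
    params = {
        "key": api_key,
        "lang": SUPPORTED_LANGUAGES[language],
        "txt": text,
        # Concatenate the codes of the requested element types.
        "tt": "".join(ASSUMED_TOPIC_CODES[t] for t in topics),
    }
    return ASSUMED_ENDPOINT, urlencode(params)
```

The returned pair can be handed to any HTTP client as the target and body of a POST request; the sketch itself makes no network call.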

Differentiators:

  • Highly configurable: you can adapt the API's behavior to very diverse operating scenarios, both to obtain exactly the type of information relevant to the user and to cover different source formats, languages, and even language registers.
  • Recognizes names of people, organizations, and a hierarchy of 200 entity types.
  • Extracts multiword concepts (e.g. "financial crisis").
  • Disambiguates and detects co-occurrences in several languages.
  • Users can create their own dictionaries.

You can also use your own resources in the extraction process by creating a dictionary through our customization engine.

Documentation

Everything and anything you need to take advantage of this API's full potential.

Test Console

Choose an input and a configuration, and immediately check the results!

Developer Tools

Do you want to integrate this API into your environment? Check our Developer Tools!

Versions

Version Date Status
2.0 13/November/2017

2.0.12 (13/November/2017)

  • Minor bugs have been fixed, and resources have been updated.

2.0.11 (19/September/2017)

  • Heuristic detection of companies has been improved.
  • Detection of time and quantity expressions has been improved.
  • Several minor bugs have been fixed, and resources have been updated.

2.0.10 (26/June/2017)

  • Several minor bugs have been fixed, and resources have been updated.

2.0.9 (27/March/2017)

  • Several minor bugs have been fixed, and resources have been updated, especially keywords.
  • Heuristic detection of entities has been improved.
  • Variant generation for entity detection has become stricter in order to avoid false positives.

2.0.8 (27/October/2016)

  • A bug affecting certain texts with parentheses has been fixed.
  • Several minor bugs have been fixed, and resources have been updated.
  • Money expressions in Italian have been improved.
  • Heuristic detection in all languages has been improved.

2.0.7 (27/July/2016)

  • Several minor bugs have been fixed, and resources have been updated.

2.0.6 (13/June/2016)

  • Several minor bugs have been fixed, and resources have been updated.

2.0.5 (26/April/2016)

  • Several minor bugs have been fixed, and resources have been updated.

2.0.4 (07/April/2016)

  • Several minor bugs have been fixed, and resources have been updated.

2.0.3 (02/March/2016)

  • We have improved the precision on the entity type for heuristic detection.
  • Several minor bugs have been fixed, and resources have been updated.

2.0.2 (02/February/2016)

  • Several minor bugs have been fixed, and resources have been updated.

2.0.1 (22/December/2015)

  • Several minor bugs have been fixed, and resources have been updated.

2.0.0 (01/December/2015)

  • New element quantity_expression has been added.
  • uri_expression and phone_expressions have been integrated inside entity_list for greater coherence.
  • Traceability with user dictionaries has been improved.
  • The disambiguation parameters have been restructured to add clarity to what they do.
  • The possibility of specifying an interface language has been added, making it easier to work with multilingual sources.
  • The standard element has been homogenized in all its appearances.
  • Some fields in the output of money_expression and quotation have been changed to improve usability.
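Because URI and phone expressions now appear inside entity_list, a client can read a single list and separate the elements by type. The sketch below assumes a response already parsed from JSON in which each entity has a form field and a nested sementity object with a type field; those field names, the helper name, and the type labels in the sample are hypothetical and only illustrate the idea.

```python
from collections import defaultdict


def group_entities_by_type(response):
    """Group entity forms by their (assumed) semantic type label."""
    groups = defaultdict(list)
    # entity_list is named in the change log; a missing key is treated as empty.
    for entity in response.get("entity_list", []):
        # 'sementity', 'type' and 'form' are assumed field names.
        entity_type = entity.get("sementity", {}).get("type", "unknown")
        groups[entity_type].append(entity.get("form", ""))
    return dict(groups)


# Hand-written sample, not real API output; the type labels are made up.
sample = {
    "entity_list": [
        {"form": "Madrid", "sementity": {"type": "Location"}},
        {"form": "http://example.com", "sementity": {"type": "Uri"}},
        {"form": "+34 600 000 000", "sementity": {"type": "Phone"}},
        {"form": "Paris", "sementity": {"type": "Location"}},
    ]
}
grouped = group_entities_by_type(sample)
```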
1.2 02/March/2016

1.2.14 (02/March/2016)

  • Version retired.

1.2.13 (22/December/2015)

  • Resources have been updated.

1.2.12 (01/December/2015)

  • Several minor bugs have been fixed, and resources have been updated.

1.2.11 (06/October/2015)

  • Several minor bugs have been fixed, and resources have been updated.

1.2.10 (09/September/2015)

  • Significant improvements have been added to URL and HTML text processing.
  • Several minor bugs have been fixed, and resources have been updated.

1.2.9 (28/July/2015)

  • Resources have been updated.

1.2.8 (14/July/2015)

  • Several minor bugs have been fixed, and resources have been updated.

1.2.7 (02/June/2015)

  • Several minor bugs have been fixed, and resources have been updated.
  • Python client has been improved.
  • The relaxed typography parameter, rt, now has a new value related to ud.

1.2.6 (18/May/2015)

  • Several minor bugs have been fixed.
  • CASHTAG has been added as a new node of the ontology.
  • Resources have been updated (including cashtag elements).
  • A memory leak issue related to user dictionaries has been fixed.
  • Smart prefix detection has been improved.
  • For English the following points have been improved:
    • Disambiguation between common and proper nouns.
    • Use of stop words depending on the typography.

1.2.5 (06/April/2015)

  • Several minor bugs and minor concurrency problems have been fixed.
  • Resources have been updated.
  • Suggestions for unknown words have been improved, especially for short words, based on typing mistakes and letter repetition.
  • Smart typography detection added.

1.2.4 (24/June/2014)

  • Several minor bugs have been fixed, and resources have been updated.

1.2.3 (20/May/2014)

  • Several minor bugs have been fixed, and resources have been updated.

1.2.2 (17/March/2014)

  • Several bugs have been fixed in entity detection and resources have been updated.
  • Response time has been improved in the documentation pages.

1.2.1 (04/February/2014)

  • Several minor bugs have been fixed, and resources have been updated.
  • Heuristic rules for entity detection have been improved, increasing the quantity and the classification quality of the unknown entities detected.

1.2 (23/September/2013)

  • Attribute naming for semantic information has been standardized so that every element that can be an array has '_list' in its name. This allows flexibility when it comes to defining new attributes and ensures that the output will always be the same regardless of the number of values the specific case has.
  • The response headers have been updated so that the content type is correct for all output formats supported.
  • Resources have been upgraded.
  • Bugs reported through our feedback section have been fixed.
  • Error messages in all APIs have been unified.
  • Related Facebook and Twitter links have been added to the semantic linked data information (semld) of known entities and concepts.
  • The documentation has been improved, both in format and contents.
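The '_list' naming convention introduced in this version means a client can treat every such attribute as an array, with no special case for single values. A minimal sketch, assuming the response has been parsed into a Python dictionary; the helper name iter_list_items and the sample data are made up for illustration.

```python
def iter_list_items(response):
    """Yield (attribute_name, item) for every top-level '*_list' attribute."""
    for name, value in response.items():
        if name.endswith("_list") and isinstance(value, list):
            for item in value:
                yield name, item


# Hand-written sample: one list with a single value, one with two, one empty.
sample = {
    "status": {"code": "0"},
    "entity_list": [{"form": "London"}],
    "concept_list": [{"form": "financial crisis"}, {"form": "bank"}],
    "quotation_list": [],
}
items = list(iter_list_items(sample))
```

The same loop handles one value, several values, or none, which is the consistency the naming rule is meant to provide.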

Languages

  • English
  • Spanish
  • French
  • Italian
  • Portuguese
  • Catalan

Contact us

Do you have any questions? Have you detected a bug? Contact us through our feedback section or at support@meaningcloud.com