MeaningCloud Release: new Language Identification API and more

As we recently announced, we have spent the last few months working on new functionality, and we plan to release it over the coming months.

In the latest release of MeaningCloud we have included some of this functionality:

Language Identification

Even though language identification is in most cases an auxiliary function rather than an analysis in itself, we are aware of how important it is, especially in multilingual applications. The previous version of the API, Language Identification 1.1, had a number of limitations, so we’ve worked to fix them. These are some of the main changes you will see in Language Identification 2.0:

  • More languages are supported. We’ve gone from detecting 60 languages to almost 160.
  • The larger set of detected languages includes script languages, one of the weaknesses of the previous version.
  • The API provides more information about each detected language: instead of just its ISO 639 code, the output now includes its name, the two-character and three-character ISO 639 codes, and some additional information.
  • More control over the output: you can limit the languages in the output, either using a relevance value, or a white-list for the target languages.
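To make the last point concrete, here is a minimal client-side sketch of what limiting the output by relevance or by a whitelist amounts to. The API applies this through request parameters that are not named in this post, so the function `filter_languages`, the `relevance` field name, its 0 to 100 scale, and the sample entries are all assumptions made for illustration only.

```python
from typing import Any, Dict, Iterable, List, Optional


def filter_languages(
    detected: Iterable[Dict[str, Any]],
    min_relevance: float = 0.0,
    allowed: Optional[Iterable[str]] = None,
) -> List[Dict[str, Any]]:
    """Keep only entries that pass the relevance threshold and the whitelist.

    Hypothetical helper: the "relevance" key and its scale are assumed,
    not taken from the API documentation.
    """
    allowed_set = set(allowed) if allowed is not None else None
    kept = []
    for entry in detected:
        if entry["relevance"] < min_relevance:
            continue  # below the relevance threshold
        if allowed_set is not None and entry["language"] not in allowed_set:
            continue  # not in the whitelist of target languages
        kept.append(entry)
    return kept


# Invented sample data, shaped like a list of detected-language objects.
detected = [
    {"language": "en", "name": "English", "relevance": 100},
    {"language": "fr", "name": "French", "relevance": 45},
    {"language": "nl", "name": "Dutch", "relevance": 20},
]

by_relevance = filter_languages(detected, min_relevance=40)
by_whitelist = filter_languages(detected, allowed=["en", "nl"])
print([entry["language"] for entry in by_relevance])
print([entry["language"] for entry in by_whitelist])
```

Both filters can be combined in one call; an entry is kept only if it passes every filter that is set.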

Migrating from Language Identification 1.1

If you are currently using Language Identification 1.1, the migration process to the new version is very simple:

  • Request: you do not need to change anything in the request, although you may want to consider whether to use any of the new parameters available.
  • Response: this is where the main changes are. Instead of an array of strings with the possible languages, the new version of the API returns an array of objects. The language field within each object corresponds to the value returned in Language Identification 1.1.

Language Identification 1.1 will be retired on November 8th, 2017.
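The response change described above can be handled with a small adapter while both versions are still in use. This is a sketch under stated assumptions: the helper name `language_codes` is hypothetical, the sample lists are invented, and how the array is named inside the full response is not covered in this post, so the function takes the array itself. The only detail taken from the text is that each 2.0 object carries a language field holding the value that 1.1 returned as a plain string.

```python
from typing import Any, Dict, List, Union


def language_codes(languages: List[Union[str, Dict[str, Any]]]) -> List[str]:
    """Return the language codes from either response shape.

    Hypothetical helper: accepts a 1.1-style array of strings or a
    2.0-style array of objects with a "language" field.
    """
    codes = []
    for item in languages:
        if isinstance(item, str):
            codes.append(item)  # 1.1 style: the item is the code itself
        elif isinstance(item, dict):
            codes.append(item["language"])  # 2.0 style: read the "language" field
        else:
            raise TypeError(f"unexpected language entry: {item!r}")
    return codes


# Invented sample data for both shapes.
old_style = ["en", "fr"]
new_style = [
    {"language": "en", "name": "English"},
    {"language": "fr", "name": "French"},
]

print(language_codes(old_style))  # ['en', 'fr']
print(language_codes(new_style))  # ['en', 'fr']
```

Code that previously consumed the array of strings can call an adapter like this and stay unchanged until it needs the extra fields.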

Integrations

In this release, we’ve included updates to two of our integrations: our Excel add-in and the GATE plugin.

  • Excel Add-in (v 3.2.0.0)
    • The Excel add-in is now integrated with Language Identification API 2.0.
    • We’ve also done some general maintenance, and improved the integration with the user resources.
  • GATE Plugin (v 3.0)
    • The GATE plugin is now integrated with Language Identification API 2.0.
    • We’ve also done a lot of under-the-hood work, improving performance and the usability of all the processing resources.

New functionality

We are welcoming two new APIs to MeaningCloud: Summarization and Document Structure Analysis.

  • Summarization 1.0 (beta)
    • This API receives a text or a document and extracts a summary of it by selecting the most relevant sentences to sum up what it is about.
  • Document Structure Analysis 1.0 (beta)
    • This API extracts the different sections of a document with markup content (which includes formatted documents such as PDF or Microsoft Word files), such as the title, headings, abstract and the parts of an email.

If you have any questions or just want to talk to us, we are always available at support@meaningcloud.com!

