Author Archives: Blanca Galego

Text Classification 2.0: Migration Guide

We’ve recently published a new version of our Text Classification API, which comes hand in hand with a new version of the Classification Models Customization console.

In both these new versions, the main focus is on user models. We know how important it is to easily define the exact criteria you need, so the new classification API supports a new type of resource, the one generated by the Classification Models Customization console 2.0.

In this post, we will talk about how to migrate to these new versions if you are currently using the old ones. Text Classification 1.1 and Classification Models 1.0 will be retired on September 15, 2020. Continue reading


New Release: Text Classification 2.0

We’re happy to announce we have just published a new version of our Text Classification API, which comes hand in hand with a new version of the Classification Models Customization console.

In both these new versions, the main focus is on user-defined models. We know how important it is to easily define the exact criteria you need, so the new classification API supports a new type of resource, the one generated with the Classification Models Customization console 2.0.

With these new versions, we’ve aimed to:

  • Make criteria definition easier: more user-friendly operators to improve overall rule readability, and new operators to provide more flexibility.
  • Remove dependencies between categories in a model that made their maintenance and evolution cumbersome.
  • Give the user more control over where the relevance assigned to the categories comes from.

Let’s see with a little more detail what’s new. Continue reading


COVID-19 Crisis: doing our part

As our CEO, Jose Gonzalez, announced a couple of days ago, at MeaningCloud we believe that every little bit helps, so we are adopting some measures to help our users and clients in these trying times.

Starting now:

  • We will provide full access to anyone participating in any of the NLP-related tasks published by platforms such as Kaggle to help in the research to fight or analyze the impact of COVID-19. Just contact us at support.
  • The Start-Up plan is doubled in volume: all current subscriptions, as well as subscriptions created while the COVID-19 crisis persists, will allow up to 240k credits per month.
  • The Start-Up plan now comes with a discount of $25 off its usual price for the next 3 months:
    • The discount has already been applied to all our current Start-Up subscriptions.
    • Any new subscription may benefit from this discount by adding the coupon COVID-STARTUP in the upgrade process.
  • All our packs now cost $99 for the next 3 months. This discount has already been applied to current subscriptions to packs. New subscriptions should use the following coupons on checkout depending on the pack:

The upgrade process should include only the item the coupon applies to. These coupons are valid until May 31, 2020, or until the crisis ends.

Stay safe and take care of each other.

The MeaningCloud team


RapidMiner + Python + MeaningCloud = 🚀

Integrations with third-party software are extremely useful: they allow you to use technology outside the tool you are working with, giving you additional features beyond its core functionality or just providing auxiliary tools to make your day-to-day work easier.

One of the downsides is that you are limited by the functionality the integration provides. Usually, this is not much of a problem, as standard integrations tend to cover the most common use cases, but in the case of tools that can be used in many scenarios, these use cases may not be exactly what you need or want for your application.

MeaningCloud is not an exception to this. We provide many different APIs, each one of them with several types of analyses and with tons of possible applications. It’s not surprising that not all of them are included in MeaningCloud’s extension for RapidMiner.

If you want something like the global polarity Sentiment Analysis provides, then the extension for RapidMiner has you covered, but this may not be the case for other analyses. The gap can range from wanting to use a MeaningCloud API not included in the extension, such as the Summarization API, to something as small as needing the label of the resulting categories in an automatic classification process instead of the code the extension provides.

Last year, RapidMiner published a new Python scripting extension: Execute Python. This operator allows you to run a Python script in RapidMiner, so any processing you can code in Python can become part of your RapidMiner process.

Using this new functionality and MeaningCloud’s Python SDK, we can create a Python script to use any of MeaningCloud’s APIs directly from RapidMiner. The SDK enables us to work with the API output easily and to extract whatever information we want to add to our RapidMiner processes.
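To give a rough idea of what such a script could look like, here is a minimal sketch of the "label instead of code" case mentioned above. It is not the SDK's own API: it works on a classification response already parsed into a dict, and the field names (`status`, `code`, `msg`, `category_list`, `label`), the `classify` callable and the `text` column are all assumptions made for illustration.

```python
def category_labels(response):
    """Return the category labels found in a parsed classification response.

    `response` is the API's JSON output already parsed into a dict. The field
    names used here (status/code/msg, category_list, label) are assumptions
    about the response layout; adjust them to the output you actually get.
    """
    status = response.get("status", {})
    if str(status.get("code")) != "0":
        raise ValueError("Request failed: {}".format(status.get("msg", "unknown error")))
    return [category["label"] for category in response.get("category_list", [])]


def labels_for_texts(texts, classify):
    """Classify each text and return one '; '-joined label string per text.

    `classify` is a hypothetical callable (text -> parsed response dict); in
    practice it would wrap the SDK or an HTTP request with your license key.
    """
    return ["; ".join(category_labels(classify(text))) for text in texts]


# Inside RapidMiner's Execute Python operator, the entry point is rm_main,
# which receives the input data as a pandas DataFrame. Assuming a column
# named "text" and your own my_classify function, it could look like:
#
# def rm_main(data):
#     data["category_labels"] = labels_for_texts(data["text"], my_classify)
#     return data
```

Keeping the API call behind a plain callable means the label extraction can be tried out with canned responses before plugging it into a RapidMiner process.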

Let’s see how we can do this! Continue reading


الصيحة! Text Analytics in Arabic

At MeaningCloud we aim to provide the most advanced text analytics product with the broadest language coverage in the market. That’s why before we finished 2019 we worked on launching several new language packs to increase the coverage given by our standard pack — English, Spanish, French, Italian, Portuguese and Catalan — and our Nordic pack — Swedish, Danish, Norwegian and Finnish.

The third pack we launched is the Arabic pack. Arabic, the fifth most spoken language in the world, is the official language in twenty countries and co-official in six others. It is the first language of 280 million speakers, and the second language of another 250 million. Moreover, for religious reasons, several million Muslims living in other countries have knowledge of Arabic.

Its most peculiar characteristic is that it uses its own writing system, from right to left, joining the letters together. In this way, each letter can have up to four forms. It is also interesting that, despite the fact that they were introduced in the 1920s, there are no capital letters in Arabic. Since sometimes common names can be confused with proper names, the latter are usually enclosed in parentheses or quotes.

MeaningCloud now provides coverage for Arabic for the following functionality:

This coverage will be extended through the successive product releases depending on the market demand. Find detailed information on our new language coverage page.

So, what are these text analytics tasks and what are they used for?
Continue reading


Ура! Text Analytics in Russian

At MeaningCloud we aim to provide the most advanced text analytics product with the broadest language coverage in the market. That’s why before we finished 2019 we worked on launching several new language packs to increase the coverage given by our standard pack — English, Spanish, French, Italian, Portuguese and Catalan — and our Nordic pack — Swedish, Danish, Norwegian and Finnish.

The second pack we launched is the Russian pack. Russian is the official language of the Russian Federation, Belarus, Kazakhstan and Kyrgyzstan. It was the de facto language in the Soviet Union, so its use is also common in the Baltic States, the Caucasus and Central Asia. It’s the most common of the Slavic languages with almost 144 million speakers.

Russian is written using the Cyrillic alphabet, and although transliteration into the Latin alphabet has been common due to the technical restrictions and to the unavailability of Cyrillic keyboards abroad, it’s used less and less thanks to the Unicode extension that incorporates the Russian alphabet and the many free programs that leverage it.

MeaningCloud now provides coverage for Russian for the following functionality:

This coverage will be extended through the successive product releases depending on the market demand. Find detailed information on our new language coverage page.

So, what are these text analytics tasks and what are they used for?
Continue reading


好棒! Text Analytics in Chinese

At MeaningCloud we aim to provide the most advanced text analytics product with the broadest language coverage in the market. That’s why before we finish 2019 we have worked on launching several new language packs to increase the coverage given by our standard pack — English, Spanish, French, Italian, Portuguese and Catalan — and our Nordic pack — Swedish, Danish, Norwegian and Finnish.

The first pack we are launching is the Chinese pack. Chinese is the official language of the People’s Republic of China. It’s the language with the most native speakers, almost 16% of the global population.

Chinese (in all its varieties) is a group of languages based on ideograms, traditionally arranged in vertical columns, read from top to bottom down a column and right to left across columns. The variety covered by MeaningCloud is simplified Chinese.

MeaningCloud now provides coverage for Chinese for the following functionality:

This coverage will be extended through the successive product releases depending on the market demand. Find detailed information on our new language coverage page.

So, what are these text analytics tasks and what are they used for?
Continue reading


MeaningCloud achieves ISO/IEC 27001 certification

At MeaningCloud, we know how important it is to manage and ensure information security, even more so for a platform that processes all kinds of texts, including texts with sensitive information, to help you extract insightful information from them. For this reason, at the end of last year Sngular prioritized confirming and improving our good practices by obtaining the ISO 27001 certification, which we achieved on our first attempt in February, following an extensive audit process carried out by RINA.

For those unfamiliar with it, ISO/IEC 27001 is an information security standard that specifies a management system that is intended to bring information security under management control and gives specific requirements.

Organizations that meet the requirements may be certified by an accredited certification body following successful completion of an audit. The standard is published by the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC) under a joint subcommittee.

The certification obtained applies to MeaningCloud in both its SaaS and on-premises versions, and includes all its stages: development, maintenance and deployment.


New Release: Financial Industry Vertical Pack

Some text analytics scenarios need more than general purpose resources to get the results you need. If you are familiar with MeaningCloud, you’ll know that resource customization is one of our main features and great advantages. The parametrization available in the different analyses we offer enables you to adapt our tools to exactly the type of analysis you want. You can do this in two ways: using any of our predefined resources or creating your own with our customization consoles.

Along these lines, we are happy to announce that we have released a new vertical pack for the finance industry. This pack will allow you to analyze your financial contents and interpret them according to a standard vocabulary (FIBO™).

Continue reading


Tutorial: create your own deep categorization model

As you probably know by now if you follow us, we’ve recently released our new customization console for deep categorization models.

Deep Categorization models are the resource we use in our Deep Categorization API. This API combines the morphosyntactic and semantic information we obtain from our core engines (which includes sentiment analysis as well as resource customization) with a flexible rule language that’s both powerful and easy to understand. This enables us to carry out accurate categorization in scenarios where reaching a high level of linguistic precision is key to obtaining good results.

In this tutorial, we are going to show you how to create your own model using the customization console: we will define a model that suits our needs and we will see how to express the criteria we want through the available rule language.

The scenario we have selected is a very common one: support ticketing categorization. We have extracted (anonymized) tickets from our own support ticketing system and we are going to create a model to automatically categorize them. As we have done in other tutorials, we are going to use our Excel add-in to quickly analyze our texts. You can download the spreadsheet here if you want to follow along with the tutorial. If you don’t use Microsoft Excel, you can use the Google Sheets add-on.

The spreadsheet contains two sheets with two different data sets, the first one with 30 entries, the second one with 20. For each data set, we have included an ID, the subject and the description of the ticket, and then a manual tagging of the category it should be categorized into. We’ve also added an additional column that concatenates the subject and the description, as we will use both fields combined in the analysis.
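If you prefer to prepare the data outside the spreadsheet, the combined column can be reproduced with a few lines of Python. This is only a sketch: the separator and the handling of empty fields are assumptions, as the way the spreadsheet joins both fields is not specified here.

```python
def combine_fields(subject, description, separator=". "):
    """Join a ticket's subject and description into a single text.

    Empty or whitespace-only fields are skipped so the result never starts
    or ends with a dangling separator. The default separator is an
    assumption made for illustration.
    """
    parts = [part.strip() for part in (subject, description) if part and part.strip()]
    return separator.join(parts)


# Example with a made-up ticket
text = combine_fields("Login error", "I cannot access my account")
```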

To get started, you need to register at MeaningCloud (if you haven’t already), and download and install the Excel add-in on your computer. Here you can read a detailed step-by-step guide to the process. Let’s get started! Continue reading