Category Archives: Text Analytics

Posts that discuss text analytics technology.

It’s time to celebrate! We are rolling out an update to our APIs that allows analyzing texts in fifty-seven languages. Our dream of multilingual text analytics based on our deep semantic approach has come true.

Until now, to analyze texts in different languages, we needed to maintain a model per language. This gave good results, but it was hard and expensive. We can do better.

Zero to near sixty in no time

We have integrated our APIs with deep neural network technology to translate all these languages into English. Thanks to this, our users can analyze texts in many languages maintaining only one model. And no action is required. All the new languages will appear in your test console and are available in the APIs.
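The translate-then-analyze idea described above can be sketched in a few lines. This is only an illustration under stated assumptions: `analyze_multilingual`, `toy_translate`, and `toy_analyze_english` are hypothetical names, and the toy functions are stand-ins for a real translation service and a real English model, not MeaningCloud's actual implementation.

```python
from typing import Callable, Dict, List


def analyze_multilingual(
    text: str,
    lang: str,
    translate: Callable[[str, str], str],
    analyze_english: Callable[[str], List[str]],
) -> List[str]:
    """Route a text in any language through a single English model."""
    # English input skips translation; everything else is translated first.
    english = text if lang == "en" else translate(text, lang)
    return analyze_english(english)


# Hypothetical stand-ins so the sketch runs without any external service.
_TOY_TRANSLATIONS: Dict[str, str] = {"Me encanta Madrid": "I love Madrid"}


def toy_translate(text: str, source_lang: str) -> str:
    """Look the text up in a tiny table; unknown texts pass through unchanged."""
    return _TOY_TRANSLATIONS.get(text, text)


def toy_analyze_english(text: str) -> List[str]:
    """Naive 'entity' detector: capitalized words other than the first word."""
    words = text.split()
    return [w for w in words[1:] if w[0].isupper()]


entities = analyze_multilingual(
    "Me encanta Madrid", "es", toy_translate, toy_analyze_english
)
```

The point of the design is that only `analyze_english` needs to be maintained: adding a language means adding translation coverage, not a new model.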

The complete list of languages includes Chinese, Hindi, Arabic, Russian, Japanese, Turkish, German, and many others that our customers have requested frequently. This adds to our current offering for English, Spanish, French, Italian, Portuguese, and the languages we were covering partially (in some APIs).

Multilingual Text Analytics

Continue reading →

Machine Learning for NLP/Text Analytics, beyond Machine Learning

04/March/2021

Categorization Cognitive Computing Deep Semantic Analytics Language Technology Research Semantic Processing Text Analytics

deep learning explainability machine learning rule induction rule-based models semantic expansion word embeddings

In the field of text analytics, aside from the development of categorization models, the application of machine learning (and more specifically, deep learning) has proved to be very helpful for supporting our teams in the process of building/improving rule-based models.

This post analyzes some of the applications of machine/deep learning for NLP tasks, beyond machine/deep learning itself, that are used to approach different scenarios in projects for our customers.

Continue reading →

New Excel 365 add-in for Text Analytics!

14/December/2020

APIs Integrations Text Analytics

excel add-in

Our new Excel 365 add-in has finally arrived!

Excel is the preferred tool for many MeaningCloud users. They access MeaningCloud APIs directly from Excel with our add-in. In recent months, we have received a lot of inquiries about Mac support, so we partnered with Microsoft to build a new multiplatform version.

Installation

Installing it is a breeze on all platforms. The new add-in is available in Microsoft AppSource:

https://appsource.microsoft.com/en-us/product/office/WA200002421

Click on Get it Now and follow the instructions.

Configuration

You only need your API key to use MeaningCloud. Paste it in the License Key field and you’re ready to start analyzing.

Don’t have one? Create an account for free – no payment method required.

Configuring the MeaningCloud add-in

Usage

You can use the APIs directly from the ribbon:

MeaningCloud add-in ribbon

The user interface page describes the different buttons. Paste your texts in the spreadsheet, select the tool in the ribbon, review the parameters, and click Analyze.

Take a look at the documentation for more information about add-in usage.

But I don’t use Office 365!

No worries. If you use another Excel version, we still offer the previous add-in version. If you don’t use Microsoft Excel at all, you can use our Google Spreadsheets add-on.

Questions?

If you have any questions or issues, we will be glad to hear from you. Drop us a line at support@meaningcloud.com and tell us about your experience.

Communication during the Coronavirus (I): Thematic analysis in Spanish digital news media

17/April/2020

APIs Application Areas of Text Analytics Categorization Deep Semantic Analytics Marketing and Advertising Industry Publishing Industry Semantic Processing Semantic Publishing Sentiment Analysis Social Media Text Analytics

coronavirus covid-19 media social content Spain twitter

The priority during this pandemic is obviously to cure the sick, to prevent new cases, and to ensure that economic and social measures are in place to help the people and businesses most affected get through the current situation. Nevertheless, in the near future, the coronavirus-related content generated by the media and social network users will undoubtedly become an object of research for numerous disciplines such as sociology, philology, linguistics, audiovisual communication, and politics, to name a few.

At MeaningCloud we want to do our bit in this area, by applying our experience and our Text Analytics solutions to analyze the enormous volume of information in natural language, in Spanish and in other languages, in Spain and in other countries, given that, unfortunately, this is a global crisis.

This first article in the series centers on the thematic analysis of content that has been generated in Spanish by digital media platforms in Spain over the last month, how it has evolved during this period of time and the informative positioning of the main media platforms in Spain.

These other articles (only available, at the moment, in Spanish) analyse conversation topics on Twitter in Spain (both from the hashtags and general topics perspective and also applying a specific thematic categorization) and the linguistic analysis of presidential speeches related to this crisis.

Continue reading →

الصيحة! Text Analytics in Arabic

16/January/2020

Language Packs Text Analytics

supported languages

At MeaningCloud we aim to provide the most advanced text analytics product with the broadest language coverage in the market. That’s why before we finished 2019 we worked on launching several new language packs to increase the coverage given by our standard pack — English, Spanish, French, Italian, Portuguese and Catalan — and our Nordic pack — Swedish, Danish, Norwegian and Finnish.

The third pack we launched is the Arabic pack. Arabic, the fifth most spoken language in the world, is the official language in twenty countries and co-official in six others. It is the first language of 280 million speakers, and the second language of another 250 million. Moreover, for religious reasons, several million Muslims living in other countries have knowledge of Arabic.

Its most peculiar characteristic is that it uses its own writing system, written from right to left with the letters joined together. As a result, each letter can have up to four forms. It is also interesting that, despite an attempt to introduce them in the 1920s, there are no capital letters in Arabic. Since common nouns can sometimes be confused with proper nouns, the latter are usually enclosed in parentheses or quotes.

MeaningCloud now provides coverage for Arabic for the following functionality:

Topics Extraction: covers the detection of entities and, partially, expressions of time.
Text Clustering: full coverage.

This coverage will be extended in successive product releases, depending on market demand. Find detailed information on our new language coverage page.

So, what are these text analytics tasks and what are they used for?
Continue reading →

Ура! Text Analytics in Russian

15/January/2020

Language Packs Text Analytics

supported languages

The second pack we launched is the Russian pack. Russian is the official language of the Russian Federation, Belarus, Kazakhstan and Kyrgyzstan. It was the de facto language in the Soviet Union, so its use is also common in the Baltic States, the Caucasus and Central Asia. It’s the most common of the Slavic languages, with almost 144 million speakers.

Russian is written using the Cyrillic alphabet, and although transliteration into the Latin alphabet has been common due to the technical restrictions and to the unavailability of Cyrillic keyboards abroad, it’s used less and less thanks to the Unicode extension that incorporates the Russian alphabet and the many free programs that leverage it.

MeaningCloud now provides coverage for Russian for the following functionality:

Topics Extraction: covers the detection of entities and, partially, expressions of time.
Text Clustering: full coverage.

This coverage will be extended in successive product releases, depending on market demand. Find detailed information on our new language coverage page.

So, what are these text analytics tasks and what are they used for?
Continue reading →

好棒! Text Analytics in Chinese

14/January/2020

Language Packs Text Analytics

supported languages

At MeaningCloud we aim to provide the most advanced text analytics product with the broadest language coverage in the market. That’s why before we finish 2019 we have worked on launching several new language packs to increase the coverage given by our standard pack — English, Spanish, French, Italian, Portuguese and Catalan — and our Nordic pack — Swedish, Danish, Norwegian and Finnish.

The first pack we are launching is the Chinese pack. Chinese is the official language of the People’s Republic of China. It’s the language with the most native speakers, almost 16% of the global population.

Chinese (in all its varieties) is a group of languages based on ideograms, traditionally arranged in vertical columns, read from top to bottom down a column and right to left across columns. The variety covered by MeaningCloud is simplified Chinese.

MeaningCloud now provides coverage for Chinese for the following functionality:

Topics Extraction: covers the detection of entities and, partially, expressions of time.
Text Clustering: full coverage.

This coverage will be extended in successive product releases, depending on market demand. Find detailed information on our new language coverage page.

So, what are these text analytics tasks and what are they used for?
Continue reading →

Performance Metrics for Text Categorization

11/December/2019

Categorization Language Technology Research Text Analytics

metrics performance evaluation

One of the most common and extensively studied knowledge extraction tasks is text categorization. Customers frequently ask how we evaluate the quality of the output of our categorization models, especially in scenarios where each document may belong to several categories.

The idea is to keep track of changes in the continuous improvement cycle of models and to know whether those changes were for better or worse, so that we can commit or reject them.

This post answers that question by describing the metrics that we commonly adopt for model quality assessment, depending on the categorization scenario that we are facing.
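As a rough illustration of the multi-category scenario mentioned above, here is a minimal sketch of precision, recall, and F1 with micro and macro averaging. These are standard choices for multi-label evaluation, but the excerpt does not say which metrics the full post adopts, so treat the selection as an assumption. The function names (`prf`, `multilabel_scores`) and the sample data are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Scores = Tuple[float, float, float]  # (precision, recall, F1)


def prf(tp: int, fp: int, fn: int) -> Scores:
    """Precision, recall and F1 from raw counts (0.0 when undefined)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def multilabel_scores(
    gold: List[Set[str]], predicted: List[Set[str]]
) -> Dict[str, Scores]:
    """Micro- and macro-averaged scores for multi-label categorization."""
    counts: Dict[str, List[int]] = {}  # category -> [tp, fp, fn]
    for g, p in zip(gold, predicted):
        for cat in g | p:
            c = counts.setdefault(cat, [0, 0, 0])
            if cat in g and cat in p:
                c[0] += 1  # correctly assigned
            elif cat in p:
                c[1] += 1  # assigned but not in the gold standard
            else:
                c[2] += 1  # in the gold standard but missed
    if not counts:
        return {"micro": (0.0, 0.0, 0.0), "macro": (0.0, 0.0, 0.0)}
    # Micro: pool the counts of all categories, then compute the scores once.
    micro = prf(*(sum(c[i] for c in counts.values()) for i in range(3)))
    # Macro: compute the scores per category, then average them.
    per_cat = [prf(*c) for c in counts.values()]
    macro = tuple(sum(s[i] for s in per_cat) / len(per_cat) for i in range(3))
    return {"micro": micro, "macro": macro}


# Hypothetical gold standard and model output for three documents.
gold = [{"sports"}, {"politics", "economy"}, {"economy"}]
predicted = [{"sports"}, {"politics"}, {"economy", "sports"}]
scores = multilabel_scores(gold, predicted)
```

Comparing these numbers before and after a model change gives a simple basis for committing or rejecting it. Micro averaging weights every assignment equally, while macro averaging weights every category equally, so the two can diverge when categories are unbalanced.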

Continue reading →

NLP technologies: state of the art, trends and challenges

20/November/2019

Artificial Intelligence Cognitive Computing Deep Semantic Analytics Research Semantic Processing Text Analytics

state of the art vision of technology

This post presents MeaningCloud’s vision on the state of Natural Language Processing technology by the end of 2019, based on our work with customers and research projects.

NLP technology has practically achieved human quality (or even better) in many different tasks. This is mainly due to advances in machine learning/deep learning techniques, which make it possible to use large sets of training data to build language models, but also to improvements in core text processing engines and the availability of semantic knowledge databases.

Continue reading →

Case Study: Text Analytics against Fake News

16/October/2019

APIs Content Industry Research Semantic Publishing Text Analytics

citation extraction fake news text classification

Everybody has heard about fake news. Fake news is a neologism that can be formally defined as a type of yellow journalism or propaganda that consists of deliberate disinformation or hoaxes spread via traditional print and broadcast news media or online social media. It is also commonly used to refer to fabricated or junk news, with no basis in fact, but presented as being factually accurate.

The main reason for putting effort into creating fake news is to cause financial, political or reputational damage to people, companies or organizations, using sensationalist, dishonest, or outright fabricated headlines to increase readership and spread virally among readers. In addition, clickbait stories, a special type of fake news, earn direct advertising revenue from this activity.

Continue reading →
