Category Archives: Language Technology

Posts about language technology.

Language Technology Industry meets in Brussels, May 16-17, 2016

Language Technology Industry Summit


LT-Innovate, the Language Technology Industry Association, is organizing a new edition of its annual Summit, the yearly point of convergence between the Language Technology Industry, its clients, research partners and policy makers. According to its Memorandum of Association (London, 2012), LT-Innovate is a non-governmental organization consisting of all parties involved in the field of Language Technologies (LT) and services. Its main goals are to promote common interests in the successful development, production, delivery and use of language technologies and services, and to implement services that may help to promote the industry.

LTI Cloud

Besides traditional sections such as Solution Showcases, Technology Spotlights, and Project Results, the Language Technology Industry Summit 2016 will serve as the official launch of one of the most important endeavours undertaken by the Association since its inception: the LTI Cloud.

LTI Cloud is the one-stop-shop platform for making available, discovering, assembling, testing and prototyping language technology components. If you are a potential provider of LT APIs (researcher, developer, startup…), and you want to get exposure, testing, or simply customers, consider using LTI Cloud, as it is a ready-to-use platform.

LTI Cloud remains in a pilot phase until May 17th, so you can still be among its first adopters. Remember that it serves not only LT providers but also end users. Jochen Hummel, the leader of this initiative, will present it at the Conference. For now, you can take a look at this preview.

Coming back to the Summit, I would like to stress a traditional track: “Customers challenge the Industry”. This year’s challenge comes from Elsevier: “Dynamic Knowledge Stores and Machine Translation”. It will be presented by Michelle Gregory and Pascal Coupet.

MeaningCloud User Profiling API


As s|ngular is one of the founding companies of LT-Innovate (through its subsidiary Singular Meaning), we are proud to take an active role again in this year’s event. On Tuesday, May 17th, I will be presenting our recent work on “Automatic Extraction of Rich Customer Profiles from their Activity in Social Networks”. It is about our brand new MeaningCloud API for automatic profiling of Twitter users. The User Profiling API extracts important demographics for a given Twitter user across different aspects: which topics the user talks about, personal and professional information, hobbies and interests, etc. This information extraction is based on a mix of rule-based and machine learning approaches.

Conference Discount Code

Come and join us at the LT-Innovate Summit. And, before registration, do not forget to ask for a special discount code through our helpdesk (

Voice of the Customer, Voice of the Employee and NPS

More and more companies have come to understand that to grow profitably in competitive scenarios, satisfied customers are the key to success. And they know that employees have a fundamental role in achieving a better customer experience.

In this challenge to improve customer loyalty, companies must be able to listen to their customers and understand what they are saying. It is what we call the Voice of the Customer (VoC).

However, a mission — such as customer satisfaction — that lacks a precise measure of success (or failure) is just hot air. Quoting Lord Kelvin, “If you can not measure it, you can not improve it.”

The Net Promoter Score (NPS) has become, for a number of companies, the key metric for measuring customer satisfaction. By the same standard, the mission to get motivated and happy people in an organization also has its key metric: the eNPS (Employee NPS).
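For readers who have not computed the metric before, here is a minimal sketch in Python. It assumes the commonly used NPS definition, which this post does not spell out: respondents rate from 0 to 10, those answering 9 or 10 count as promoters, those answering 0 to 6 count as detractors, and the score is the percentage of promoters minus the percentage of detractors. The function name and the sample ratings are invented for illustration.

```python
def net_promoter_score(ratings):
    """Return the NPS (between -100 and 100) for a list of 0-10 ratings.

    Assumes the usual convention: 9-10 are promoters, 0-6 are detractors,
    and 7-8 (passives) only count towards the total.
    """
    if not ratings:
        raise ValueError("at least one rating is required")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100.0 * (promoters - detractors) / len(ratings)


# Invented sample: 5 promoters, 2 passives, 3 detractors out of 10 answers.
sample = [10, 9, 9, 8, 7, 6, 3, 10, 5, 9]
print(net_promoter_score(sample))  # prints 20.0
```

The same function works for the eNPS, since only the people being surveyed change, not the arithmetic.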

As discussed below, in order to improve customer and employee experience, both the NPS and the eNPS need to find the reason that justifies the score given.

When asked “What is the primary reason for your score?”, thousands of customers and employees give open answers that NPS and eNPS programs collect and analyze. This is where MeaningCloud’s language technology comes in.
Continue reading

Some conclusions from our Text Analytics survey

What does “text analytics” mean to you and your organization? How do you plan to use Text Analytics in 2016? For MeaningCloud, as a text analytics tool vendor, having some answers to these questions is key to understanding our market and defining our product strategy. This was the purpose of the survey we kept open for several weeks, starting at the beginning of last October.

Even though the number of respondents was quite low (60), it is still possible to identify some conclusions and trends, which we summarize in this post.

Applications: customer is first

What is your text analytics application scenario? No doubt this is the main question when one needs to analyze the uses of this technology. In our results, Understanding customer attitudes, behaviors, and needs was the most mentioned scenario (62%), followed by Research (48%) and Content Classification, recommendation, and personalization (43%), as can be seen in the figure. The next two categories were Customer service, improving customer experience (40%) and Brand/reputation management (38%). This means that everything related to customer understanding, improving customer experience, and managing the brand leads the text analytics application area, taking 3 of the top 5 positions.

Continue reading

#ILovePolitics: Political discourse analysis in social media

We continue with the #ILovePolitics series of tutorials! We will show how to use MeaningCloud to extract interesting insights and build your own Political Intel Reports and, at no extra cost, turn you into a Data Science giant in the field of Social Media Analytics.

Political issues

Politics and Social Media Analytics

Our research objective is to study and compare the discourse of different politicians during the electoral campaign, using their messages on Twitter. We are going to compare tweets by the four most popular (most mentioned) politicians from our previous tutorial: Barack Obama (@barackobama), Hillary Clinton (@HillaryClinton), Donald Trump (@realDonaldTrump) and Jeb Bush (@JebBush).

  • What are their key messages?
  • What do they focus on?
  • Are there really different ways of doing politics?

Before we start, three remarks: 1) we will focus on U.S. politics, in English, but the same analysis can be adapted for your own country or language as long as it is supported in MeaningCloud, 2) this is a technical tutorial: we will write some code, but in general, everyone can understand the purpose of this tutorial, and 3) although this tutorial will use PHP, any non-rookie programmer can translate the programs to any language.

Continue reading

How might your organization employ Text Analytics in 2016?

Help us design the best Text Analytics tool

If you are a MeaningCloud user or are otherwise involved in Content Analytics or Text Mining, we’d like to hear your opinion.

We want to know what “text analytics” means to you and your organization. We are researching current trends and issues in the market, both business- and solution-related, including adoption by industry and business function, successes and failures, and requirements for the software tools of the future.

Please take part in our survey. Respondents will receive a copy of the conclusions.

The survey is at

and it’s open till the end of November 18th.

Take the Survey

Thank you!

An Introduction to Sentiment Analysis (Opinion Mining)

In the last decade, sentiment analysis (SA), also known as opinion mining, has attracted increasing interest. It is a hard challenge for language technologies, and achieving good results is much more difficult than some people think. The task of automatically classifying a text written in a natural language as expressing a positive or negative feeling, opinion or subjectivity (Pang and Lee, 2008) is sometimes so complicated that even different human annotators disagree on the classification to be assigned to a given text. Each individual interprets a text differently, influenced by cultural factors and personal experience. And the shorter and more poorly written the text, the more difficult the task becomes, as in the case of messages on social networks like Twitter or Facebook.

Continue reading

What You Need To Know about Text Analytics

You have enough to worry about. You know your industry inside and out. You know your products and services and how they compare with the competition’s strengths and weaknesses. In business, you have to be an expert in a range of topics. What you don’t need to worry about is the ins and outs of every technology, algorithm and software program.

This is especially true of an inherently complex technology such as natural language processing. As a business owner you have enough to worry about. Do you really have time to understand morphological segmentation? Text analytics should just be another tool in your toolbox to achieve your business ends. The only thing you need to know is which of your problems can be solved by Natural Language Processing. Anaphoric referencing? Don’t worry about it. We have it covered, along with anything else you might need from language technology.

Text Analytics

What do you need to know about text analytics?

Text analytics goes by many names: natural language processing, NLP, text analysis, text mining, computational linguistics. There are shades of difference in these terms but let the boffins work that out. What you need to know is that these terms describe a variety of algorithms and technologies that are able to process raw text written in a human language (often referred to as a natural language) to provide enriched text. That enrichment could mean a number of things:

  • Categorization. The categorization of the text according to themes, categories or a taxonomy.
  • Topic Extraction. The identification of the key named entities and concepts being talked about in the text such as people, places, organizations and brands.
  • Sentiment Analysis. The analysis of whether the text is talking about those concepts in a positive or negative light.

Continue reading

#ILovePolitics: Popularity analysis in the news

If you love politics, regardless of your party or political orientation, you may know that election periods are exciting moments and having good information is a must to increase the fun. This is why you follow the news, watch or listen to political analysis programs on TV or radio, read surveys or compare different points of view from one or the other side.

American politics in a nutshell

American politics

Starting with this post, we are publishing a series of tutorials where we will show how to use MeaningCloud to extract interesting political insights and build your own political intel reports. MeaningCloud provides useful capabilities for extracting meaning from multilingual content in a simple and efficient way. Combining API calls with open source libraries in your favorite programming language is so easy and powerful at the same time that it will surely awaken the Political Data Scientist hidden inside of you. Be warned!

Our research objective is to analyze mentions of people, places, or entities in general in the Politics section of different news media. We will try to carry out an analysis that can answer the following questions:

  • Which are the most popular names?
  • Does their popularity depend on the political orientation of the newspaper?
  • Is it somehow correlated with popularity surveys or voting intention polls?
  • Do these trends change over time?
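The first two questions come down to counting mentions overall and per news source. The sketch below illustrates that counting step in Python rather than in the PHP used by the tutorial. It assumes the entity names have already been extracted from each article (for example, by a topic extraction service); the function name, the newspaper labels and the candidate names are all invented for illustration.

```python
from collections import Counter, defaultdict


def count_mentions(articles):
    """Count entity mentions overall and per news source.

    `articles` is an iterable of (source, entities) pairs, where `entities`
    is the list of names already extracted from one article.
    """
    overall = Counter()
    by_source = defaultdict(Counter)
    for source, entities in articles:
        overall.update(entities)
        by_source[source].update(entities)
    return overall, dict(by_source)


# Invented sample input: three articles from two hypothetical newspapers.
articles = [
    ("Paper A", ["Candidate X", "Candidate Y", "Candidate Y"]),
    ("Paper B", ["Candidate Y", "Candidate Z"]),
    ("Paper A", ["Candidate W", "Candidate X"]),
]
overall, by_source = count_mentions(articles)
print(overall.most_common(2))              # the two most mentioned names
print(by_source["Paper A"].most_common(1))
```

Keeping one counter per source makes it straightforward to compare newspapers side by side; adding a publication date to each pair would extend the same idea to trends over time.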

Before we begin

This is a technical tutorial in which we will develop some coding. However, we will try to guide you through the whole process, so everyone can follow the explanations and understand the purpose of the tutorial.

For the sake of generality and better understanding, we will focus on U.S. Politics in English, but obviously you can easily adapt the same analysis for your own country or (MeaningCloud supported) language.

And last but not least, this tutorial will use PHP as the programming language for the code examples. However, any non-rookie programmer should be able to translate the scripts into any language of their choice.

Continue reading

Is Cognitive Computing too Cool to Be True?

According to IBM, “Cognitive Computing systems learn and interact naturally with people to extend what either humans or machines could do on their own. They help human experts make better decisions by penetrating the complexity of Big Data.” Dharmendra Modha, Manager of Cognitive Computing at IBM Research, talks about cognitive computing as an algorithm being able to solve a vast array of problems.

With this definition in mind, it seems that this algorithm requires a way to interact with humans in order to learn and to think as they do. Nice, great words! Anyway, it is the same well-known goal of Artificial Intelligence (AI), a more common name that almost everybody has heard about. Why change it? Ok, when a company is investing at least $1 billion in something, it must be cool and fancy enough to draw people’s attention, and AI is quite old-fashioned. Nevertheless, machines still cannot think! And I believe it will take some time.

How does Cognitive Computing work? According to the given definition, to enable the human-machine interaction, some kind of voice and image processing solutions must be integrated. I am not an expert on image processing, but voice recognition systems, dialog management models and Natural Language Processing techniques have been studied for a while. Even Question Answering methods (i.e. the ability of a software system to return the exact answer to a question instead of a set of documents as traditional search engines do) have been deeply studied. We ourselves have been doing (and still do) research on this topic since 2007, which resulted in the development of virtual assistants, a combination of dialogue management and question answering techniques. Do you remember Anna, Ikea’s virtual assistant? In spite of the fame she gained at that time, she is not working anymore. Perhaps, for users, that kind of interaction through a website was not effective enough. On the other hand, virtual assistants like Siri, supported by an enormous company such as Apple, are gaining attention. There are other virtual assistants for environments other than iOS, but they are far less known, perhaps because the companies behind them are much smaller than Apple.

Several aspects of the thinking capabilities required by the mentioned algorithm have to do with the concept of Machine Learning. There are a lot of well-known algorithms which are able to generate models from a set of examples or even from raw data (in the case of unsupervised processes). This enables a machine to learn how to classify things or to group items together, like a baby piling up those coloured geometric pieces. So, by combining Machine Learning and NLP models, it is possible for a machine to understand a text. This process is what we call Structuring Unstructured Data (much less fancy than Cognitive Computing). That is, making your information actionable. We have been working on this for several years, but now it is called cognitive computing.

So, as you might imagine, Cognitive Computing techniques are not different from the ones we have already developed; a lot of researchers and companies have been combining them. And, if you think about it, does it really matter if a machine thinks or not? The relevant added value of this technology is helping humans to do their job with all the relevant information at hand, at the right moment, so they can make thoughtful and reasonable decisions. This is our goal at MeaningCloud.

Exploring Social Media for Healthcare Data

People enjoy sharing information through social media, including healthcare data. Yeah, it is true! And it constitutes the starting point of the research work titled ‘Exploring Spanish health social media for detecting drug effects’, which aims at following social media conversations to identify how people talk about their relation with drug consumption. This allows identifying possible, previously unknown adverse effects related to these drugs. Although there is a protocol for reporting an adverse drug effect to the authorities, only 5 to 20% of them are reported. Besides, conversations around drugs, symptoms, conditions and diseases can be analyzed to learn more about them. For example, it is possible to see how people search for specific drugs using social media, while others sell them, perhaps illegally. Many others talk about mixing alcohol with drugs or other illegal substances. Of course, one cannot believe everything that appears on the Internet (that is another issue), but it can highlight some hypotheses for further research.


Some researchers from the Advanced Databases Group at Carlos III University of Madrid have carried out the mentioned study, designing hybrid models to capture the knowledge needed to identify adverse effects. The Natural Language Processing platform which supports the implementation of the analysis process based on such models is MeaningCloud. The customization capabilities provided by the platform have been decisive for including specific vocabulary and medical domain knowledge. As we know, the names of drugs and symptoms might be complex and, in some cases, difficult to write properly. The algorithm’s results are promising, with a 10% increase in recall when compared to other known algorithms. You can find further details in the scientific paper published in the journal BMC Medical Informatics and Decision Making.

These developments have been part of the TrendMiner project, and are now available in the prototype website TrendMiner Health Analytics Dashboard, which shows people’s comments about antidepressants gathered from social media. The console displays the mentions of antidepressants and related symptoms and, by clicking on any of them, their evolution over time. Moreover, the source texts analyzed to compute those mentions are shown at the bottom, with labels highlighting the names of drugs, symptoms or diseases, and any relations among them. Such relations might say if a drug is indicated for a symptom or if a disease is an adverse effect of the mentioned drug. The prototype also allows searching by the ATC code (Anatomical Therapeutic Chemical Classification System) and the corresponding level according to this classification scheme. So, if you mark the ‘By Active Substance’ selector, you are searching for any drug containing the active substance of the product you entered in the search box. Furthermore, the predictive search functionality makes it easier to find the right expression for a drug or disease. Please have a look at the prototype and tell us what you think about it. If you find a chart useful, you can even tweet it from there! Any comment is more than welcome.