Web scraping and text analytics

Text analytics projects often depend on public Internet sources such as the World Wide Web. These projects usually begin by extracting data from a variety of websites. We call this process “web scraping” (or “web harvesting”). While web scraping can be done manually, the term usually refers to automated methods carried out with a web crawler.

Projects that can draw a wealth of valuable information from scraped data include customer experience analysis (and, likewise, patient or employee experience), dynamic pricing and revenue optimization, competitor monitoring, and compliance checking.


People Analytics: MeaningCloud book on Amazon!

People Analytics. Data and Text Analytics for Human Resources. This MeaningCloud book is available on Amazon.

In People Analytics, and in this book, we use the evidence that the data provides to answer several questions:

  • Which candidate will be high-performing, effective, loyal, and aligned with the corporate culture?
  • How can we measure the economic impact of a training program?
  • How can I segment the workforce to make their actions more effective?
  • Which people are considering leaving the organization?
  • What net benefit will employees contribute over time in a particular position?
  • How does employee commitment affect productivity and economic outcomes?
  • How can I design a study that is statistically and mathematically valid?


MeaningCloud Release: Sentiment + Nordic Pack

Not long ago we published the first of our Language Packs: the Nordic pack, which includes several text analytics tasks in Swedish, Danish, Norwegian and Finnish.

Among the supported text analytics tasks, there was one that many of you missed: the Sentiment Analysis API. Well, no more!

We are happy to announce that from now on you can also analyze sentiment in the four languages included in the Nordic pack. What’s more, if you are already subscribed to the pack, sentiment analysis has been included automatically, so you can start using it right away without any change in pricing.

If you are not subscribed to the Nordic pack, remember that you can test the full functionality of all our packs by requesting a 30-day trial. It’s super easy!


MeaningCloud participates in T3chFest 2019

This year MeaningCloud is participating in T3chFest, the technology fair held at Universidad Carlos III de Madrid.

T3chFest began as a showcase of the research carried out in the Department of Informatics. Today, the event has become a reference point in Spain’s technology scene. In the last edition, 1,600 people attended more than 80 talks.

This year we have submitted a talk titled “NLP for Small Data”, where we review the state of the art in Natural Language Processing. We will also discuss the advances in Deep Learning and the use of Linguistic Models.

The talk will be presented by two members of our Linguistics team: Concepción Polo, Director of Linguistics, and María José García, computational linguist. They are actively involved in every linguistic model in all our products, from the initial model sketch to its final fine-tuning.


Text analytics explained: MeaningCloud in Italian

In previous posts we spoke about text analysis performed in French and Portuguese. Today we’re wrapping up this linguistics series by discussing the analyses that can be done with Italian texts.

Italian is spoken in several European countries such as Italy, San Marino and Switzerland, totaling almost 70 million speakers. As Italians have migrated all over the world, their language is also present on the other side of the pond. In South America, for instance, it is the second most spoken language in Argentina. In the US, even though it is not an official language, many citizens are of Italian descent and thus speak the language at home. We wanted to include such a widespread language in our Standard Languages Pack.

As in our previous posts, we are going to explain, in a linguistically inclined way, what Text Analytics is and which functionalities MeaningCloud provides in Italian.


Are you listening to the Voice of the Customer?

“Your most unhappy customers are your greatest source of learning.” Bill Gates

In a widely digitalized market, open to all and undoubtedly faster-moving than just a decade ago, quickly identifying customer complaints and needs is key to preserving a company’s competitiveness within its industry. Technological democratization has provided users with skills and tools that turn not only the product but also many other aspects into an experience. If, after several years of investment and development, your product has come to position itself among the best in the market, does it make sense for a poorly designed purchasing process to threaten the conviction of potential customers that you are worth choosing?


TASS 2018: Fostering Research on Semantic Analysis in Spanish

In 2018, MeaningCloud and the University of Jaén once again organized TASS, the Workshop on Semantic Analysis in Spanish at SEPLN (International Conference of the Spanish Society for Natural Language Processing).

Over the years, the research has extended to other tasks related to processing the semantics of texts, with the aim of further improving natural language understanding systems. Apart from sentiment analysis, other tasks attracting the interest of the research community are stance classification, negation handling, rumor identification, fake news identification, open information extraction, argumentation mining, classification of semantic relations, and question answering of non-factoid questions, to name a few.

TASS 2018 was the 7th event of the series and was held in conjunction with the 34th International Conference of the Spanish Society for Natural Language Processing, in Seville (Spain), on September 18th, 2018. Four research tasks were proposed. MeaningCloud sponsored this edition with prizes for the best systems in each of the tasks. A comprehensive description paper is (to be) published in the journal Procesamiento del Lenguaje Natural, vol. 62: TASS 2018: The Strength of Deep Learning in Language Understanding Tasks.


MeaningCloud Release: VoC vertical pack upgrade

In the latest MeaningCloud update, we have published an upgrade for our Voice of the Customer vertical pack. This upgrade brings two significant changes:

  • We’ve added a new domain to the four we already supported: telecommunications. This domain is huge and has a vast amount of unstructured data available and ready to be analyzed. You can check out the categories for this new model in the documentation.
  • We’ve refactored the models we already provided. Most of this refactoring has been done under the hood, but some categories have changed names, either to give a more intuitive idea of what they refer to or to narrow down the criteria.


Pharmacovigilance: Monitoring the Voice of the Patient

For the pharmaceutical industry, it is essential to listen to and understand the feedback that current and potential patients communicate through all sorts of channels and touchpoints.

Although there is a protocol that requires any identified Adverse Drug Reactions (ADRs) to be disclosed to the authorities, only 5–20% of them are reported. Fortunately, discussions about drugs, symptoms, conditions, and diseases can be analyzed to learn more about these areas. Artificial Intelligence contributes significantly to monitoring adverse episodes and understanding their impact in every phase of development.

Patient narratives of medicines and their adverse effects on social media represent an extra data source for drug safety monitoring.

At MeaningCloud, we have developed a platform to automate the process of monitoring ADRs on social media.
