Category Archives: Categorization

Machine Learning for NLP/Text Analytics, beyond Machine Learning

In the field of text analytics, aside from the development of categorization models, the application of machine learning (and more specifically, deep learning) has proved to be very helpful for supporting our teams in the process of building/improving rule-based models.

This post analyzes some of the applications of machine/deep learning for NLP tasks, beyond machine/deep learning itself, that are used to approach different scenarios in projects for our customers.



IAB Taxonomy Level 3 now available in our Deep Categorization API

Digital marketing is rapidly becoming a fundamental pillar in the business plans of practically every business model. Methods are being refined, and the connection between brand and user is expected to become increasingly precise: a relevant advertisement is no longer sufficient; the advertisement must now appear at the right time and in the right place. This is where categorization proves to be an exceedingly useful tool.

That is why, at MeaningCloud, we have improved our English IAB categorization model, which is integrated in our Deep Categorization API, by:

  • Adding a third level of content taxonomy to the hierarchy of categories (IAB Taxonomy Level 3).
  • Improving the precision of pre-existing categories.
  • Including the unique identifiers defined by IAB itself for each of the categories.



Communication during the Coronavirus (I): Thematic analysis in Spanish digital news media

The priority during this pandemic is clearly to cure the sick, to prevent new cases, and to ensure that economic and social measures are in place to help the people and businesses most affected get through the current situation. In the near future, however, the coronavirus-related content generated by the media and social network users will undoubtedly be an object of research for numerous disciplines, such as sociology, philology, linguistics, audiovisual communication, and politics, to name a few.

At MeaningCloud we want to do our bit in this area, by applying our experience and our Text Analytics solutions to analyze the enormous volume of information in natural language, in Spanish and in other languages, in Spain and in other countries, given that, unfortunately, this is a global crisis.

This first article in the series centers on the thematic analysis of content that has been generated in Spanish by digital media platforms in Spain over the last month, how it has evolved during this period of time and the informative positioning of the main media platforms in Spain.

These other articles (only available, at the moment, in Spanish) analyse conversation topics on Twitter in Spain (both from the hashtags and general topics perspective and also applying a specific thematic categorization) and the linguistic analysis of presidential speeches related to this crisis.



Performance Metrics for Text Categorization

One of the most common and extensively studied knowledge extraction tasks is text categorization. Customers frequently ask how we evaluate the quality of the output of our categorization models, especially in scenarios where each document may belong to several categories.

The idea is to keep track of changes in the continuous improvement cycle of models and to know whether those changes made the model better or worse, so that we can commit or reject them.

This post answers this question by describing the metrics that we commonly adopt for model quality assessment, depending on the categorization scenario that we are facing.
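As an illustration of the multi-label scenario mentioned above, here is a minimal sketch of two widely used metrics, micro-averaged and macro-averaged F1. This excerpt does not list the metrics the full post covers, so the choice of metrics, the function name `micro_macro_f1`, and the sample labels are all assumptions made for illustration.

```python
from typing import Dict, List, Set


def micro_macro_f1(gold: List[Set[str]], pred: List[Set[str]]) -> Dict[str, float]:
    """Compute micro- and macro-averaged F1 for multi-label categorization.

    Hypothetical helper for illustration only: `gold` and `pred` hold one set
    of category labels per document, in the same document order.
    """
    labels = sorted(set().union(*gold, *pred))
    tp = {label: 0 for label in labels}  # predicted and correct
    fp = {label: 0 for label in labels}  # predicted but not in gold
    fn = {label: 0 for label in labels}  # in gold but not predicted

    for g, p in zip(gold, pred):
        for label in p & g:
            tp[label] += 1
        for label in p - g:
            fp[label] += 1
        for label in g - p:
            fn[label] += 1

    def f1(t: int, f_pos: int, f_neg: int) -> float:
        # F1 = 2TP / (2TP + FP + FN); defined as 0.0 when there is nothing to count.
        denom = 2 * t + f_pos + f_neg
        return 2 * t / denom if denom else 0.0

    # Micro: pool the counts over all categories, then compute a single F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: compute F1 per category, then take the unweighted mean.
    macro = sum(f1(tp[l], fp[l], fn[l]) for l in labels) / len(labels) if labels else 0.0
    return {"micro_f1": micro, "macro_f1": macro}


# Example with made-up labels: three documents, each with one or more categories.
gold_labels = [{"sports", "health"}, {"politics"}, {"sports"}]
predicted_labels = [{"sports"}, {"politics", "health"}, {"sports"}]
scores = micro_macro_f1(gold_labels, predicted_labels)
```

Micro-averaging weights every individual category assignment equally, so frequent categories dominate the score. Macro-averaging weights every category equally, which exposes poor performance on rare categories. Comparing both before and after a model change is one way to decide whether to commit or reject it.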



Recorded webinar: Solve the most wicked text categorization problems

Thank you all for your interest in our webinar “A new tool for solving wicked text categorization problems” that we delivered last June 19th, where we explained how to use our Deep Categorization customization tool to cope with text classification scenarios where traditional machine learning technologies present limitations.

During the session we covered these items:

  • Developing categorization models in the real world
  • Categorization based on pure machine learning
  • Deep Categorization API. Pre-defined models and vertical packs
  • The new Deep Categorization Customization Tool. Semantic rule language
  • Case Study: development of a categorization model
  • Deep Categorization – Text Classification. When to use one or the other
  • Agile model development process. Combination with machine learning

IMPORTANT: this article is a tutorial based on the demonstration that we delivered and that includes the data to analyze and the results of the analysis.

Interested? The presentation and the recording of the webinar are available here.

(We also presented this webinar in Spanish. The recording is available here.)