Author Archives: Jose C. Gonzalez

About Jose C. Gonzalez

CEO at MeaningCloud. PhD in Telecommunication Engineering. Professor at Technical University of Madrid (1985-2015).

MeaningCloud Sponsors the Real World Evidence Forum 2017

Real-World Evidence Forum Philadelphia, July 17-18

At MeaningCloud, we are proud to sponsor the Real World Evidence Forum. The RWE Forum, taking place on July 17-18, 2017 in Philadelphia, will bring together clinical health professionals to address:

  • How to operationalize the process of collecting real-world data.
  • How to utilize real-world evidence to demonstrate both the clinical effectiveness and cost-effectiveness of drugs.

Attendees will gain a better understanding of how electronic data sources are changing the way real-world data is being collected. This conference will offer attendees insight into how real-world evidence will help decrease costs, define innovative outcomes and minimize the number of patients exposed to potentially harmful medications.

Text Analytics and Real World Evidence

MeaningCloud, as a Text Analytics provider, has developed a highly specialized offering for the Health and Pharma industries. We count among our clients some of the largest companies in the Pharmaceutical industry.

Join us in Philadelphia. If you are interested in attending the Real World Evidence Forum on July 17-18, just drop us a line at info@meaningcloud.com. We have a surprise for you!

Stay tuned for our presentation at the conference, which we will publish on this blog. In the meantime, if you are curious about how our technology works in the health area, take a look at our Text Analytics Health Demo.

Looking forward to seeing you at the Real-World Evidence Forum!



Language Technology Industry meets in Brussels, May 16-17, 2016

Language Technology Industry Summit


LT-Innovate, the Language Technology Industry Association, organizes a new edition of its annual Summit, the yearly point of convergence between the language technology industry, its clients, research partners and policy makers. According to its Memorandum of Association (London, 2012), LT-Innovate is a non-governmental organization consisting of all parties involved in the field of Language Technologies (LT) and related services. Its main goals are promoting the common interests of its members in the successful development, production, delivery and use of language technologies and services, and implementing services that help promote the industry.

LTI Cloud

Besides traditional sections such as Solution Showcases, Technology Spotlights, and Project Results, the Language Technology Industry Summit 2016 will serve as the official launch of one of the most important endeavours undertaken by the Association since its inception: the LTI Cloud.

LTI Cloud is a one-stop platform for publishing, discovering, assembling, testing and prototyping language technology components. If you are a potential provider of LT APIs (a researcher, developer, startup…) and you want exposure, testing, or simply customers, consider using LTI Cloud: it is a ready-to-use platform.

The platform remains in a pilot phase until May 17th, so you can be among the first adopters of LTI Cloud. And remember that it serves not only LT providers but also end users. Jochen Hummel, the leader of this initiative, will present it at the conference. For now, you can take a look at this preview.

Coming back to the Summit, I would like to stress a traditional track: “Customers challenge the Industry”. This year’s challenge comes from Elsevier: “Dynamic Knowledge Stores and Machine Translation”. It will be presented by Michelle Gregory and Pascal Coupet.

MeaningCloud User Profiling API

As MeaningCloud is one of the founding companies of LT-Innovate, we are proud to take an active role again in this year's event. On Tuesday, May 17th, I will present our recent work on "Automatic Extraction of Rich Customer Profiles from their Activity in Social Networks". It is about our brand new MeaningCloud API for the automatic profiling of Twitter users. The User Profiling API extracts important demographics for a given Twitter user along different dimensions: the topics the user talks about, personal and professional information, hobbies and interests, etc. This information extraction is based on a mix of rule-based and machine learning approaches.
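To give a flavor of the rule-based half of such a pipeline, here is a minimal sketch of scoring a user's interests by matching their tweets against keyword gazetteers. The lexicon and categories below are illustrative assumptions, not the actual resources behind the User Profiling API, which combines far richer rules with machine-learned models.

```python
import re
from collections import Counter

# Illustrative keyword gazetteers per interest category (hypothetical,
# not the real User Profiling API resources).
INTEREST_LEXICON = {
    "sports": {"football", "match", "league", "training"},
    "technology": {"api", "python", "cloud", "software"},
    "music": {"concert", "album", "playlist", "guitar"},
}

def profile_interests(tweets, top_n=2):
    """Score interest categories by counting lexicon hits in a user's tweets."""
    counts = Counter()
    for tweet in tweets:
        tokens = set(re.findall(r"[a-z]+", tweet.lower()))
        for category, keywords in INTEREST_LEXICON.items():
            counts[category] += len(tokens & keywords)
    # Keep only categories with at least one hit, strongest first.
    return [cat for cat, n in counts.most_common(top_n) if n > 0]
```

In a real system these scores would be one feature among many; a machine-learned classifier would then weigh them against profile metadata and linguistic evidence.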

Conference Discount Code

Come and join us at the LT-Innovate Summit. And, before registration, do not forget to ask for a special discount code through our helpdesk (support@meaningcloud.com).


Daedalus is now Sngular Meaning

[See the UPDATE section at the end of this post to know about the relationship between Sngular and MeaningCloud, as of June 1st, 2017.]

I am thrilled to announce that Daedalus, the company that I founded in Spain in 1998, is now part of the Sngular group. This operation is part of a merger of five complementary IT companies to form a corporation based on talent and innovation, with the purpose of serving our customers better in times of accelerated change.

As a consequence of this M&A operation, Daedalus has been renamed Sngular Meaning, raising series C funding from Sngular, its parent company, to accelerate its international development.

What does this deal mean for MeaningCloud?

MeaningCloud LLC is the branch of Sngular Meaning in the United States, in charge of the development and marketing of our text analytics services. It is our strategic bid to consolidate our position as an international reference in the field of semantic technologies.

For MeaningCloud, this deal assures:

  • The financial resources for a faster expansion of our international business.
  • New marketing channels through cooperation with our Sngular sibling companies.
  • New opportunities to build specific solutions for vertical markets.

Some figures about MeaningCloud (that quickly become obsolete):

  • 5,000 registered users.
  • 1,000 active users in the last month.
  • 5 million API calls per day.

What is the structure of Sngular?

The companies that form the Sngular group are:

Together, we add up to some 300 people, with branches in the United States, Mexico and Spain. We define ourselves as a talent tech team, visible under the domain sngular.team. Our CEO is Jose Luis Vallejo.

Regarding Sngular Meaning, besides the incorporation of Jose Luis Vallejo into the Board of Directors, there are no other changes at the management level. On my side, I will continue as President of Sngular Meaning and CEO of MeaningCloud. We can assure the continuity of our strategy around our trademarks MeaningCloud and Stilus.

When is the kick-off?

On October 8th we will make a public presentation of the new Sngular group. This will be an event for employees and customers (by invitation only), but we are planning other, open events for later.

This is an exciting moment for us. We look at the future with confidence. I am sure that, as members of the Sngular family, we will continue enjoying the affection and support of all of you: customers, business partners and friends. Wish us luck and thank you for remaining at our side!

UPDATE as of June 1st, 2017

Almost two years later, anybody can see that the Sngular merger was a great success. What was founded as an umbrella corporation formed by five sister companies is now a strong IT company with a multinational presence, where four of the original founding companies are fully merged. In addition, other companies have joined Sngular through different mechanisms in recent months.

Regarding Sngular Meaning, we have jointly decided to integrate the service-oriented Data Science and Big Data activities in Sngular, while retaining all the activities and assets in the area of Text Analytics and Natural Language Processing in general. The Spain-based company Sngular Meaning (formerly Daedalus) has been renamed MeaningCloud Europe SL, owning 100% of the US-based company MeaningCloud LLC. Sngular maintains a non-controlling interest in our renewed and renamed company.

Long live Sngular! Long live MeaningCloud!

Jose C. Gonzalez


Emergency Management through Real-Time Analysis of Social Media

Serving citizens without paying attention to social media?


The traditional access channels to public emergency services (typically the phone number 112 in Europe) should be extended with the real-time analysis of social media (web 2.0 channels). This observation is the starting point of one of the lines of work that the Telefónica Group (a global reference provider of integrated emergency management systems) has been pursuing, with a view to its integration into its SENECA platform.

Social dashboard for emergency management

At Daedalus (now MeaningCloud) we have been working for Telefónica on the development of a social dashboard that analyzes and organizes the information shared in social networks (initially, Twitter) before, during and after an incident of interest to emergency services. From a functional point of view, this entails:

  • Collecting the interactions (tweets) related to incidents in a given geographical area
  • Classifying them according to the type of incident (gatherings, accidents, natural disasters…)
  • Identifying the phase in the life cycle of the incident (alert or pre-incident, incident or post-incident)
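The classification and phase-identification steps above can be sketched with simple keyword rules. This is only an illustrative toy (the keyword sets and category names are assumptions); the actual system relies on MeaningCloud's semantic classification services rather than flat keyword lookup.

```python
# Hypothetical keyword rules per incident type and life-cycle phase.
INCIDENT_RULES = {
    "gathering": {"demonstration", "crowd", "concert", "protest"},
    "accident": {"crash", "collision", "injured", "ambulance"},
    "natural_disaster": {"flood", "earthquake", "storm", "wildfire"},
}

PHASE_RULES = {
    "alert": {"expected", "warning", "forecast", "planned"},
    "post-incident": {"aftermath", "recovery", "reopened", "cleanup"},
}

def classify_tweet(text):
    """Return (incident_type, phase) for a tweet, defaulting to 'other'/'incident'."""
    tokens = set(text.lower().split())
    incident = next((k for k, kw in INCIDENT_RULES.items() if tokens & kw), "other")
    phase = next((k for k, kw in PHASE_RULES.items() if tokens & kw), "incident")
    return incident, phase
```

A production pipeline would replace the keyword sets with a trained classifier over a domain taxonomy, but the two-dimensional output (type plus phase) is the same.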

Benefits for organizations that manage emergencies

Love Parade Duisburg

Anticipate incidents

Anticipation of events which, due to their unpredictability or unknown magnitude, should be the object of special attention by the emergency services. Within this scenario fall events involving gatherings of people that are announced, spread or simply commented on through social networks (attendance at leisure or sports events, demonstrations, etc.). Predicting the dimensions and scope of these events is fundamental for planning the operations of the different authorities. We recall in this respect the disorders resulting from a birthday party announced on Facebook in the Dutch town of Haren in 2012, and the tragedy of the Love Parade in Duisburg.

Flood in Elizondo, Navarre, 2014

Enrich the available information

Social networks enable the instant sharing of images and videos that are often sources of information of the utmost importance for assessing the conditions of an emergency scenario before the arrival of the assistance services. User-generated content can be incorporated into an incident's record in real time, helping to clarify its magnitude, its exact location or a previously unknown perspective on the event.


Text Analytics technology


For the analysis of social content, MeaningCloud's semantic text analytics (text mining) technology is employed. Its cloud services are used to:

  • Identify the language of the message
  • Classify the message according to a taxonomy (ontology) developed for this scenario (accidents of various kinds, assaults, natural disasters, gatherings, etc.)
  • Extract the mentioned entities (names of people, organizations, places) and the message's relevant concepts
  • Identify the author or transmitter of each tweet
  • Extract the geographic location of the transmitter and the incident
  • Extract the time of the message and the incident
  • Classify the impact of the message
  • Extract audiovisual material (pictures and videos) and references (links to web pages, attached documents…) mentioned in the tweet to document the incident
  • Automatically group the messages relating to the same incident within an open record
  • Extract tag clouds related to incidents
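As a sketch of how one of these cloud services is invoked, the helper below builds the form body for a MeaningCloud-style topics-extraction request. The endpoint path and parameter names (`key`, `txt`, `lang`, `tt`) are assumptions based on the public API's general conventions, not details taken from this post; check the current API documentation before relying on them.

```python
from urllib.parse import urlencode

# Assumed MeaningCloud-style endpoint for entity/concept extraction.
MC_TOPICS_URL = "https://api.meaningcloud.com/topics-2.0"

def build_topics_request(api_key, text, lang="en"):
    """Return (url, form_body) for a topics-extraction POST request."""
    params = {
        "key": api_key,  # your MeaningCloud license key
        "txt": text,     # the tweet or message to analyze
        "lang": lang,    # language code, e.g. from a prior detection step
        "tt": "ec",      # assumed topic-type filter: entities and concepts
    }
    return MC_TOPICS_URL, urlencode(params)
```

The same pattern (one HTTPS POST per analysis service) covers language identification, classification and sentiment, which is what makes it practical to chain all the steps listed above in a single ingestion pipeline.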

Twalert Console

A multidimensional social perspective

Text analytics components are integrated into a web application that constitutes a complete social dashboard offering three perspectives:

  • Geographical perspective, with maps showing the location of the messages’ transmitters, with the possibility of zooming on specific areas.
  • Temporal perspective: a timeline with the evolution of the impact of an incident on social networks, incorporating sentiment analysis.
  • Record perspective: gathering all the information about an incident.



Telefónica and Daedalus (now MeaningCloud) at LT-Accelerate

Telefónica and Daedalus (now MeaningCloud) will jointly present these solutions at the LT-Accelerate conference (organized by LT-Innovate and Seth Grimes), to be held in Brussels on December 4-5, 2014. We invite you to join us and visit our stand; we are a sponsor of this event. We will tell you how we use language processing technologies for the benefit of our customers in this and other industries.

 

Register at LT-Accelerate: it is the ideal forum in Europe for users and customers (current or potential) of text analytics technologies.



Jose C. Gonzalez (@jc_gonzalez)

[Translation from Spanish by Luca de Filippis]


The Role of Text Mining in the Insurance Industry

What can insurance companies do to exploit all their unstructured information?

A typical big data scenario

Insurance companies collect huge volumes of text on a daily basis, through multiple channels (their agents, customer care centers, email, social networks, the web in general). The information collected includes policies, expert and health reports, claims and complaints, survey results, relevant interactions between customers and non-customers in social networks, etc. Handling, classifying and interpreting all that material, and extracting the essential information from it, is impossible to do manually.

The insurance industry is among those that can benefit most from the application of technologies for the intelligent analysis of free text (known as Text Analytics, Text Mining or Natural Language Processing).

Insurance companies also have to cope with the challenge of combining the results of the analysis of this textual content with structured data (stored in conventional databases) to improve decision-making. In this sense, industry analysts consider essential the combined use of technologies based on Artificial Intelligence (intelligent systems), Machine Learning (data mining) and Natural Language Processing (both statistical and symbolic or semantic).

Most promising areas of text analytics in the Insurance Sector

Fraud detection


According to a report released by Accenture in 2013, insurance companies in Europe are estimated to lose between 8 and 12 billion euros per year to fraudulent claims, with an increasing trend. Additionally, the industry estimates that between 5% and 10% of the compensation paid by companies in the previous year was fraudulent but could not be detected due to the lack of predictive analytics tools.

According to the specialized publication "Health Data Management", Medicare's fraud prevention system in the United States, based on predictive algorithms that analyze patterns in providers' billing, saved more than 200 million dollars in rejected payments in 2013.



The Analysis of Customer Experience, Touchstone in the Evolution of the Market of Language Technologies

The LT-Innovate 2014 Conference has just been held in Brussels. LT-Innovate is a forum and association of European companies in the language technology sector. To get an idea of the significance and importance of this market, suffice it to say that in Europe some 450 companies (mainly innovative SMEs) are part of it, accounting for 0.12% of European GDP. Daedalus is one of the fifteen European companies (and the only one from Spain) that have been formal members of LT-Innovate Ltd., headquartered in the United Kingdom, since its formation as an association in 2012.


LT-Innovate Innovation Manifesto 2014

In this 2014 edition, the document "LT-Innovate Innovation Manifesto: Unleashing the Promise of the Language Technology Industry for a Language-Neutral Digital Single Market" has been published. I had the honor of taking part in the round table that opened the conference. The main subject of my speech was the qualitative change recently experienced by the role of our technologies in the markets in which we operate. For years we incorporated our systems to solve the specific problems of our more or less visionary or innovative customers in very limited areas. This situation has now changed completely: language technologies play a central role in a growing number of businesses.

Language Technologies in the Media Sector

In a recent post, I referred to this same issue with regard to the media sector. Where before we would deliver a solution to automate the annotation of archive content, now we deploy solutions that affect most aspects of the publishing business: we semantically tag news items to improve the search experience on any channel (web, mobile, tablets), to recommend related or additional content according to the interest profile of a specific reader, to facilitate findability and indexing by search engines (SEO, Search Engine Optimization), to place advertising related to the news context or the reader's intention, to help monetize content in new ways, etc.



Semantic Publishing: a Case Study for the Media Industry

Semantic Publishing at Unidad Editorial: a Client Case Study in the Media Industry 

Last year, the Spanish media group Unidad Editorial deployed a new CMS, developed in-house, for its integrated newsroom. Unidad Editorial is a subsidiary of the Italian RCS MediaGroup, and publishes some of the newspapers and magazines with the highest circulation in Spain, besides owning nationwide radio stations and a DTT license encompassing four TV channels.

Newsroom El Mundo

Newsroom El Mundo

When a journalist adds a piece of news to the system, its content has to be tagged, one of the first steps in a workflow that ends with the delivery of this item in different formats, through different channels (print, web, tablet and mobile apps) and for different mastheads. After evaluating different providers' solutions in the previous months, the company decided that semantic tagging would be done with Daedalus' text analytics technology. Semantic publishing included, in this case, the identification (with disambiguation) of named entities (people, places, organizations, etc.), time and money expressions, and concepts, classification according to the IPTC scheme (an international standard for the media industry, with around 1,400 classes organized in three levels), sentiment analysis, etc.
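One practical detail of classifying against a three-level scheme like IPTC's is that tagging an item with a leaf class implies its ancestor classes, which matters for search and recommendation. A minimal sketch, using an illustrative slash-separated label rather than actual IPTC codes:

```python
def expand_levels(tag):
    """Expand a hierarchical tag into all its ancestor tags.

    'economy/markets/stocks' -> ['economy', 'economy/markets',
                                 'economy/markets/stocks']
    The label is a made-up example, not a real IPTC class.
    """
    parts = tag.split("/")
    return ["/".join(parts[: i + 1]) for i in range(len(parts))]
```

Indexing all three levels lets a query for the broad category retrieve items that were only tagged at the most specific one.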



Textalytics sponsors the Sentiment Analysis Symposium

Next March 5-6, New York will host a new edition of the Sentiment Analysis Symposium. This is the seventh event in a series organized by industry expert Seth Grimes since 2010 in San Francisco and New York City.

This is a unique conference in several respects. First, it is designed specifically to serve the community of professionals interested in human analytics and its business applications. Second, its audience comprises a mix of experts, strategists, practitioners, researchers, and solution providers, which makes for a perfect breeding ground for discussion and the exchange of points of view. Third, it is designed by just one person (not by a committee), a guarantee of consistency. An expert in the consultancy business, Seth Grimes achieves an excellent balance of presentations, covering everything from technology to business application. I attended the New York 2012 edition, where I gave a talk, and I can say that the experience was really enriching.

Sentiment Analysis Symposium 2014

Do not be misled by the title: do not interpret "Sentiment Analysis" in a narrow sense. The conference is about discovering business value in opinions, emotions, and attitudes in social media, news, and enterprise feedback. Moreover, the scope is not limited to text sources: speech and image are part of the equation too.



Recognizing entities in a text: not as easy as you might think!

Entity recognition: the engineering problem

As in every engineering endeavor, when you face the problem of automating the identification of entities (proper names: people, places, organizations, etc.) mentioned in a particular text, you should look for the right balance between quality (in terms of precision and recall) and cost, from the perspective of your goals. You may be tempted to compile a simple list of such entities and apply straightforward pattern matching techniques to identify a predefined set of entities appearing "literally" in a particular piece of news, in a tweet or in a (transcribed) phone call. If this solution is enough for your purposes (you can achieve high precision at the cost of low recall), it is clear that quality was not among your priorities. However… what if you could add a bit of excellence to your solution, without technological burden, for… free? If you are interested in this proposition, skip the following detailed technological discussion and go directly to the final section.
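The naive gazetteer approach described above can be sketched in a few lines. The entity list is an illustrative assumption; the point is that it finds only the names it knows, literally spelled (high precision), and misses every variant, alias or unknown name (low recall):

```python
import re

# Hypothetical gazetteer of known entity names.
GAZETTEER = ["New York", "Seth Grimes", "Telefónica"]

# Longest names first, so a multi-word entry wins over any shorter overlap.
_PATTERN = re.compile(
    "|".join(re.escape(name) for name in sorted(GAZETTEER, key=len, reverse=True))
)

def find_entities(text):
    """Return gazetteer entities literally present in the text, in order."""
    return [m.group(0) for m in _PATTERN.finditer(text)]
```

Anything outside the list ("Jose", "Madrid") is simply invisible to this matcher, which is exactly the recall limitation the rest of the post argues against.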
