Category Archives: Publishing Industry

Posts about the publishing industry.

Books Are a Service

Semantic Publishing and Voice of the Customer understanding for the media and content industry

The reason publishing is a key industry for text analytics is also the reason the industry finds it so hard to engage with the technology.

The reason? Text. And a lot of it. The publishing world has struggled to understand how data relates to text and what that data is worth. This is changing, too slowly for many, as publishers move from seeing themselves as ‘product’-based companies (i.e. making books, whether e-books or physical) to ‘service’-based companies. In other words, smart publishers are starting to see their service to customers as being the creator and curator of information. This content can be mixed and mashed up in dynamic ways across a number of formats. The service is not bound, saddle-stitched or otherwise, to a specific product. This 180-degree change in perspective requires publishers to think more directly about customer experience, in the same way that more traditional service-based industries like hospitality or even retail banking do.

Continue reading

Text Analytics for Publishing: there’s metadata and smarter metadata

Everyone agrees metadata is great. It helps simplify the management and packaging of content and data. It creates consistency and provenance for your content and data across an organization. Metadata gives you the 35,000-foot perspective that is needed to make strategic decisions. This is especially important for publishers whose stock in trade is human language, which is completely opaque to machines whose world consists of zeros and ones. Your customers aren’t calling or emailing you to find out what is in such-and-such database. No. They are contacting you because they want to know what monographs you have by such-and-such professor, or asking you for all the archival material on ‘cats’, ‘World War 2’ or ‘nanotubes’. As a human, you understand exactly what they are looking for. If your ICT has a smidgeon of metadata, you can dig around in that such-and-such database, deliver the content and have a happy customer.

Intelligent content for Semantic Publishing

Metadata makes your content more intelligent. That’s why everyone agrees metadata is great. Great, that is, until they have to either enter the metadata or maintain the vocabularies. Some organizations are lucky. They have ensured there is support within the workflow, and people with the expertise to do the hard work, so that when a customer searches on the website, they quickly find what they are looking for and go away happy. But even those lucky few do not live in isolation. There is no publisher of consequence who doesn’t have to deal with third-party content and data. A huge amount of additional effort is spent shoehorning third-party content into the metadata models of the organization. Every publisher has a workflow that includes completely throwing away existing metadata and then wasting additional time and effort adding metadata that their CMS can handle. Does that sound familiar? Does it feel better to know you aren’t the only one?

Continue reading

#ILovePolitics: Popularity analysis in the news

If you love politics, regardless of your party or political orientation, you know that election periods are exciting moments and that having good information is a must to increase the fun. This is why you follow the news, watch or listen to political analysis programs on TV or radio, read surveys, and compare points of view from one side or the other.

American politics in a nutshell

Starting with this post, we are publishing a series of tutorials showing how to use MeaningCloud to extract interesting political insights and build your own political intel reports. MeaningCloud provides useful capabilities for extracting meaning from multilingual content in a simple and efficient way. Combining API calls with open source libraries in your favorite programming language is so easy, and at the same time so powerful, that it will surely awaken the Political Data Scientist hidden inside you. Be warned!

Our research objective is to analyze mentions of people, places, or entities in general in the Politics section of different news media. We will try to carry out an analysis that can answer the following questions:

  • Which are the most popular names?
  • Does their popularity depend on the political orientation of the newspaper?
  • Is it correlated somehow with popularity surveys or voting-intention polls?
  • Do these trends change over time?

Before we begin

This is a technical tutorial in which we will write some code. However, we will try to guide you through the whole process, so everyone can follow the explanations and understand the purpose of the tutorial.

For the sake of generality and better understanding, we will focus on U.S. politics in English, but you can easily adapt the same analysis to your own country or (MeaningCloud-supported) language.

Last but not least, this tutorial will use PHP as the programming language for the code examples. However, any non-rookie programmer should be able to translate the scripts into the language of their choice.

Continue reading

The Analysis of Customer Experience, Touchstone in the Evolution of the Market of Language Technologies

The LT-Innovate 2014 Conference has just been held in Brussels. LT-Innovate is a forum and association of European companies in the language technology sector. To get an idea of the significance of this market, suffice it to say that in Europe some 450 companies (mainly innovative SMEs) are part of it, and that they are responsible for 0.12% of European GDP. Daedalus is one of the fifteen European companies (and the only one from Spain) that have been formal members of LT-Innovate Ltd. since its formation as an association, with headquarters in the United Kingdom, in 2012.


LT-Innovate Innovation Manifesto 2014

In this 2014 edition, the document “LT-Innovate Innovation Manifesto: Unleashing the Promise of the Language Technology Industry for a Language-Neutral Digital Single Market” has been published. I had the honor of being part of the round table that opened the conference. The main subject of my speech was the qualitative change that the role of our technologies has recently undergone in the markets in which we operate. For years we have been deploying our systems to solve the specific problems of our more or less visionary or innovative customers, in very limited areas. This situation has now changed completely: language technologies play a central role in a growing number of businesses.

Language Technologies in the Media Sector

In a recent post, I referred to this same issue with regard to the media sector. Where before we would incorporate a solution to automate the annotation of file contents, now we deploy solutions that affect most aspects of the publishing business: we semantically tag news items to improve the search experience on any channel (web, mobile, tablets), to recommend related or additional content according to the interest profile of a specific reader, to facilitate findability and indexing by search engines (SEO, Search Engine Optimization), to place advertising related to the news context or the reader’s intention, to help monetize content in new forms, etc.

Continue reading

Semantic Publishing: a Case Study for the Media Industry

Semantic Publishing at Unidad Editorial: a Client Case Study in the Media Industry 

Last year, the Spanish media group Unidad Editorial deployed a new CMS, developed in-house, for its integrated newsroom. Unidad Editorial is a subsidiary of the Italian RCS MediaGroup and publishes some of the newspapers and magazines with the highest circulation in Spain, besides owning nationwide radio stations and a DTTV license covering four TV channels.

Newsroom El Mundo

When a journalist adds a piece of news to the system, its content has to be tagged. This is one of the first steps in a workflow that ends with the delivery of the item in different formats, through different channels (print, web, tablet and mobile apps) and for different mastheads. After evaluating different providers’ solutions over the preceding months, the company decided that semantic tagging would be done with Daedalus’ text analytics technology. Semantic publishing included, in this case, the identification (with disambiguation) of named entities (people, places, organizations, etc.), time and money expressions, and concepts, as well as classification according to the IPTC scheme (an international standard for the media industry, with around 1400 classes organized in three levels), sentiment analysis, etc.

Continue reading