In the last decade, sentiment analysis (SA), also known as opinion mining, has attracted increasing interest. It is a hard challenge for language technologies, and achieving good results is much more difficult than many people think. The task of automatically classifying a text written in a natural language as expressing a positive or negative feeling, opinion or subjectivity (Pang and Lee, 2008) is sometimes so complicated that even human annotators disagree on the classification to be assigned to a given text. Each individual interprets a text differently, influenced by cultural factors and personal experience. The shorter and more poorly written the text, the more difficult the task becomes, as in the case of messages on social networks like Twitter or Facebook.
The problem has been tackled mainly from two different approaches (Liu, 2012): computational learning techniques (Pang, Lee, and Vaithyanathan, 2002) and semantic approaches (Turney, 2002).
Semantic approaches are characterized by the use of dictionaries of words (lexicons) annotated with a semantic orientation of polarity or opinion. Systems typically preprocess the text and divide it into words, removing stop words and applying linguistic normalization through stemming or lemmatization. They then check the presence or absence of each lexicon term and use the sum of the polarity values of the terms found to assign the global polarity value of the text. Typically, systems also include i) a more or less advanced treatment of modifier terms (such as very, too, little), which increase or decrease the polarity of the accompanying terms; and ii) inversion terms or negations (such as no, never), which reverse the polarity of the terms they affect.
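As a rough illustration of the lexicon-based procedure described above, here is a minimal Python sketch. The lexicon entries, their weights, the modifier factors and the function names are all illustrative assumptions, and stemming or lemmatization is omitted for brevity; a real system would use a much larger lexicon and a proper linguistic pipeline.

```python
import re

# Toy resources: every word and weight below is an illustrative assumption.
LEXICON = {"good": 1.0, "great": 2.0, "bad": -1.0, "terrible": -2.0}
MODIFIERS = {"very": 1.5, "too": 1.5, "little": 0.5}
NEGATIONS = {"no", "not", "never"}
STOP_WORDS = {"the", "a", "is", "was", "this", "it"}


def tokenize(text):
    """Lowercase the text, split it into words and drop stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    return [w for w in words if w not in STOP_WORDS]


def lexicon_polarity(text):
    """Sum the polarity of lexicon terms, applying pending modifiers and negations."""
    total = 0.0
    scale = 1.0      # pending intensification factor
    negate = False   # pending inversion
    for token in tokenize(text):
        if token in NEGATIONS:
            negate = True
        elif token in MODIFIERS:
            scale *= MODIFIERS[token]
        else:
            if token in LEXICON:
                value = LEXICON[token] * scale
                total += -value if negate else value
            # Any other word ends the scope of pending modifiers and negations.
            scale, negate = 1.0, False
    return total


print(lexicon_polarity("The film was very good"))      # positive score
print(lexicon_polarity("The film was not very good"))  # negative score
```

The scope rule used here (a modifier or negation only affects the next word) is a deliberate simplification; it is exactly the kind of shortcut that breaks down on longer, more complex sentences.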
Learning-based approaches, in turn, consist of training a classifier with a supervised learning algorithm on a collection of annotated texts, where each text is usually represented by a vector of words (bag of words), n-grams or skip-grams, in combination with other types of semantic features that attempt to model the syntactic structure of sentences, intensification, negation, subjectivity or irony. Systems use different techniques, but the most popular are classifiers based on SVM (Support Vector Machines), Naive Bayes and KNN (K-Nearest Neighbors). More advanced techniques, such as LSA (Latent Semantic Analysis) and Deep Learning, appear in the most recent research.
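To make the learning-based idea concrete, the following is a minimal sketch of a bag-of-words Naive Bayes classifier written in plain Python. The class name, the tiny training collection and the P/N labels are assumptions made for illustration; production systems would normally rely on an established machine learning library and far more data.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesClassifier:
    """Multinomial Naive Bayes over a bag-of-words representation."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.doc_counts = Counter()              # label -> number of documents
        self.vocabulary = set()

    def train(self, labeled_texts):
        for text, label in labeled_texts:
            tokens = text.lower().split()
            self.doc_counts[label] += 1
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)

    def predict(self, text):
        tokens = text.lower().split()
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            # Log prior plus log likelihoods with add-one (Laplace) smoothing.
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical annotated collection (training set).
TRAINING = [
    ("great phone and great battery", "P"),
    ("I love this camera", "P"),
    ("terrible screen and bad battery", "N"),
    ("I hate this slow phone", "N"),
]

classifier = NaiveBayesClassifier()
classifier.train(TRAINING)
print(classifier.predict("great camera"))
print(classifier.predict("bad slow screen"))
```

Note that nothing in this model knows about word order or syntax: the classifier only sees word frequencies, so adapting it to a new domain means collecting new labeled texts and retraining.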
Pros and Cons
The main advantage of semantic approaches is that errors are relatively easy to correct by adding as many words as necessary; theoretically, we could reach a precision as high as we like simply by investing more time in building the lexicon. Machine learning approaches, by contrast, are often a black box in which correcting errors or adding new knowledge is more complicated, and often only possible by expanding the collection of texts and retraining the model.
On the other hand, the advantage of learning-based approaches is that it is quite easy and fast to build a sentiment/opinion analysis engine trained on a collection of tagged texts. It is therefore relatively easy to build classifiers adapted to a particular domain. In contrast, the effort required to build a lexicon for a given domain from scratch is very high, because it relies on hard manual work, so these systems are generally less adaptable.
There are numerous national and international workshops for the evaluation of sentiment analysis. The most popular for English is SemEval. We ourselves organize the TASS workshop for sentiment analysis focused on Spanish, held since 2012 as a satellite event of the annual SEPLN Congress (Spanish Society for Natural Language Processing) (Villena-Román et al., 2015). These forums are designed as competitions: participants are provided with a collection of labeled texts (training set) in a given domain (more or less generalist), which they use to build their systems. The systems are then run on a different collection (test set) to obtain the results on which the evaluation is performed and the systems are ranked.
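The competition protocol described above can be sketched in a few lines of Python. Accuracy is assumed here as the ranking metric purely for illustration (each workshop defines its own official measures), and the two systems and the test collection are hypothetical.

```python
def accuracy(predict, test_set):
    """Fraction of test texts whose predicted label matches the gold label."""
    correct = sum(1 for text, gold in test_set if predict(text) == gold)
    return correct / len(test_set)


def rank_systems(systems, test_set):
    """Score every participating system on the test set and sort best first."""
    scores = {name: accuracy(predict, test_set) for name, predict in systems.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical labeled test collection, unseen during system development.
TEST_SET = [("good", "P"), ("bad", "N"), ("great", "P"), ("awful", "N")]

# Two hypothetical participants: a trivial baseline and a keyword matcher.
SYSTEMS = {
    "always_positive": lambda text: "P",
    "keyword": lambda text: "P" if text in {"good", "great"} else "N",
}

for name, score in rank_systems(SYSTEMS, TEST_SET):
    print(f"{name}: {score:.2f}")
```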
Since a labeled collection is provided, the prevailing trend is to save manual effort and use approaches based on machine learning in one way or another. And since these approaches adapt better to different domains, they often outperform semantic approaches, which are much more expensive to develop.
However, from our point of view, these evaluations are not totally realistic, because the task becomes building the classifier that best fits the given collection, which is completely different from performing sentiment analysis on an unknown domain.
Learning-based systems seem to perform better because they adapt better to the proposed domain, but in general a complete retraining is needed to port them to a different domain. Although they work well on “simple” cases, they do not perform any elaborate treatment of the complexity of natural language, so they cannot handle coordination and subordination, and they fail on sentences that combine polarities and several inversion terms, on comparisons, etc.
Our Sentiment Analysis API uses semantic approaches based on advanced natural language processing covering all aspects of morphology, syntax, semantics and pragmatics. First, our engine generates a syntactic-semantic tree of the text. Lexicon terms are then applied over this tree and their polarity values are propagated along it, combined according to the morphological category of each word and the syntactic relations that affect it.
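The general idea of propagating polarity along a tree can be illustrated with a simplified Python sketch. This is not the actual algorithm of our engine: the node structure, the three node kinds, the numeric values and the combination rule (sum the children, then apply intensifiers and negations attached to the node) are assumptions chosen only to show the principle.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    word: str
    kind: str = "term"    # "term", "intensifier" or "negation"
    value: float = 0.0    # lexicon polarity (term) or multiplier (intensifier)
    children: List["Node"] = field(default_factory=list)


def propagate(node):
    """Combine a node's own polarity with that of its subtree, bottom-up."""
    score = node.value if node.kind == "term" else 0.0
    factor = 1.0
    negated = False
    for child in node.children:
        if child.kind == "negation":
            negated = not negated          # inversion applies to the whole subtree
        elif child.kind == "intensifier":
            factor *= child.value          # scaling applies to the whole subtree
        else:
            score += propagate(child)
    score *= factor
    return -score if negated else score


# "greatly affected by the crisis" (hypothetical values)
affected = Node("affected", value=-1.0, children=[
    Node("greatly", "intensifier", 2.0),
    Node("crisis", value=-1.0),
])

# "not greatly affected by the crisis": same subtree plus a negation
not_affected = Node("affected", value=-1.0, children=[
    Node("not", "negation"),
    Node("greatly", "intensifier", 2.0),
    Node("crisis", value=-1.0),
])

print(propagate(affected))      # negative score
print(propagate(not_affected))  # positive score
```

Because modifiers are attached to tree nodes rather than to neighboring words, the negation flips the polarity of the whole phrase it governs, which a flat word-by-word approach cannot do reliably.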
In addition to the overall polarity of the text, the engine returns the polarity for word groups or segments of the text, in 6 possible levels: positive (P) and negative (N), very positive (P+) and very negative (N+), neutral (NEU) and none (NONE) in the event that no polarity is involved.
For example, given the text:
I do not like the astonishing rise of the stock this week
Our engine returns an overall N polarity, and also indicates that the segment “the astonishing rise of the stock this week” has P+ polarity. It also marks the phrase as subjective. This information allows a more detailed and accurate interpretation of the message contained in the text.
If instead of “I do not like”, the phrase was “I do not like at all”, the overall value would be N+ polarity (very negative), which is an important subtlety to analyze.
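One simple way to picture the six polarity levels is as a mapping from a numeric score to a label. The sketch below is only an illustration: the function name, the thresholds and the flag indicating whether any polar term was found are assumptions, not the engine's actual rules.

```python
def polarity_label(score, has_polar_terms=True, strong=1.5, weak=0.5):
    """Map a numeric polarity score to one of the six levels.

    The thresholds are illustrative assumptions.
    """
    if not has_polar_terms:
        return "NONE"   # no polarity involved at all
    if score >= strong:
        return "P+"
    if score >= weak:
        return "P"
    if score <= -strong:
        return "N+"
    if score <= -weak:
        return "N"
    return "NEU"        # polar terms present, but they balance out


for value in (2.0, 0.7, 0.0, -0.7, -3.0):
    print(value, polarity_label(value))
print(polarity_label(0.0, has_polar_terms=False))
```

The distinction between NEU and NONE matters: the first means positive and negative signals cancel out, while the second means the text expresses no polarity at all.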
Obviously, the effort to build a lexicon adapted to each of the domains in which we work is very high, although whenever possible we try to apply machine learning to support the generation process of these lexicons, at least partially.
However, the most important advantage is that this approach allows us to tackle the task as a particular case of “text understanding” so we can anticipate a response to any complex case that may occur in our projects with customers.
Aspects are the future
Furthermore, the trend is to move a step beyond the analysis of overall polarity at the document level. The market demands a detailed, fine-grained analysis of the messages expressed in a given text. Thus, the task is now evolving into aspect-based sentiment analysis (ABSA), whose objective is to extract and classify the feeling and opinion expressed about a specific aspect, which can be a particular entity, a concept, a topic label or, in general, any analysis dimension of interest.
General-purpose machine learning based systems cannot successfully address this detailed analysis as generally no knowledge of the syntactic-semantic structure of the text is involved in their logic.
This task has two parts: aspect extraction (identification in the text) and polarity analysis. Our linguistic approach allows us to address these two subtasks in combination and in the same process of analysis. Furthermore, as the system can be extended with user dictionaries, including definitions of entities, concepts and key aspects of each particular domain, we can meet any need for analysis that we may find.
For example, given the text:
Telefonica got disappointing results while Vodafone has not been greatly affected by the crisis
The overall polarity is NEU (neutral), because there are two segments with opposite polarities: “Telefonica got disappointing results” (N) and “Vodafone has not been greatly affected by the crisis” (P). The overall polarity of this second segment is P even though it contains the subsegment “greatly affected by the crisis”, which has N+ polarity. Moreover, both the entities “Telefonica” and “Vodafone” and the aspects “business results” and “economic crisis” are detected with the right polarity values (N and P, respectively, in both cases).
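The result of this example can be represented with a small data structure that keeps the polarity of each segment together with the entities and aspects it mentions. The following Python sketch is hypothetical: the Segment class, the numeric scale and the aggregation rule (sum the segment scores and look at the sign) are assumptions for illustration, not the output format or logic of our API.

```python
from dataclasses import dataclass
from typing import List

# Assumed numeric scale for the polarity labels.
SCORES = {"N+": -2, "N": -1, "NEU": 0, "P": 1, "P+": 2}


@dataclass
class Segment:
    text: str
    polarity: str
    aspects: List[str]   # entities and aspects detected in the segment


def overall_polarity(segments):
    """Aggregate segment polarities; opposite values cancel out to NEU."""
    if not segments:
        return "NONE"
    total = sum(SCORES[s.polarity] for s in segments)
    if total > 0:
        return "P"
    if total < 0:
        return "N"
    return "NEU"


def polarity_by_aspect(segments):
    """Attach to each entity or aspect the polarity of the segment it appears in."""
    return {aspect: s.polarity for s in segments for aspect in s.aspects}


segments = [
    Segment("Telefonica got disappointing results", "N",
            ["Telefonica", "business results"]),
    Segment("Vodafone has not been greatly affected by the crisis", "P",
            ["Vodafone", "economic crisis"]),
]

print(overall_polarity(segments))
print(polarity_by_aspect(segments))
```

Keeping the aspect-level values alongside the overall label is what makes the analysis useful here: the neutral document-level result hides one clearly negative and one clearly positive opinion.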
Our current technology easily achieves precision values of 70-75% in the general case with no or very little domain adaptation, as verified, for example, by results in TASS (Villena-Román et al., 2015) or SemEval. However, evaluations in different contexts and domains have verified that, with a limited investment of effort in adapting our ABSA engine, it is possible to reach baseline precision values of around 80-85%. Moreover, these values are even better in domains where phenomena such as humor, irony or third-person language are less common in the analyzed messages.
These technologies are directly applicable to the analysis of the Voice of the Customer (or Citizen, or Employee, or Supplier): unstructured information sources which, due to their immediacy and spontaneity, have proven to be the most revealing sources of the true emotions and opinions of our audience.
Automating sentiment/opinion analysis makes it possible to process data that, due to its volume, variety and velocity, would otherwise be unmanageable by human means alone. It would be impossible to extract full value from interactions in the contact center, conversations in social media, and product reviews on forums and other websites (numbering in the thousands, if not hundreds of thousands) through purely manual processing.
Our solutions provide the ability to process high volumes of data with minimal delay, high accuracy, consistency and low cost, which can complement the human analysis in many scenarios.
Liu, Bing. 2012. Sentiment analysis and opinion mining. Synthesis Lectures on Human Language Technologies, 5(1):1-167.
Pang, Bo and Lillian Lee. 2008. Opinion mining and sentiment analysis. Foundations and Trends in Information Retrieval, 2(1-2):1-135.
Pang, Bo, Lillian Lee, and Shivakumar Vaithyanathan. 2002. Thumbs up? Sentiment classification using machine learning techniques. In Proceedings of the ACL-02 conference on Empirical methods in natural language processing – Volume 10, EMNLP’02, pp. 79-86, Stroudsburg, PA, USA. Association for Computational Linguistics.
Turney, Peter D. 2002. Thumbs up or thumbs down? Semantic orientation applied to unsupervised classification of reviews. In Proceedings of the 40th Annual Meeting on Association for Computational Linguistics – ACL ’02, p. 417, Philadelphia, Pennsylvania.
Villena-Román, Julio, Janine García-Morera, Miguel A. García Cumbreras, Eugenio Martínez Cámara, M. Teresa Martín Valdivia, and L. Alfonso Ureña López, eds. 2015. Proceedings of TASS 2015: Workshop on Sentiment Analysis at SEPLN. CEUR WS Vol 1397, http://ceur-ws.org/Vol-1397/.