Have you ever tried to understand the buzz around your brand on social networks? Simple metrics such as the number of friends or followers may matter, but what are they actually saying? How do you extract insights from all those comments? At MeaningCloud, we are planning a series of tutorials to show you how you could use text analytics to monitor your brand’s health.
Today, we will talk about the fanciest feature: Sentiment Analysis. We will build a simple tool using Python to measure the sentiment about a brand on Twitter. The key ingredient is the MeaningCloud Media Analysis API, which will help detect the sentiment in a tweet. We will also use the Twitter Search API to retrieve tweets and the matplotlib library to chart the results.
Listening to what customers say on social networks about brands and competitors has become paramount for every kind of enterprise. Whether your purpose is marketing, product research or public relations, understanding the sentiment, the perception and the topics related to your brand will give you valuable insights. This is the purpose of the MeaningCloud Media Analysis API: to make it easier to extract these insights from the myriad of comments that may be talking about a brand. This tutorial will guide you through the process of building an application that listens to Twitter for your brand keywords and extracts the related sentiment.
Open your favorite editor and start coding…
The application may be divided into three steps:
- Search for tweets that mention a given brand using the Twitter Search API v1.1.
- Analyze the text of every tweet using MeaningCloud Media Analysis API v1.0 and retrieve the associated sentiment.
- Aggregate the counts for each of the sentiment values (positive, negative, etc.) and plot the values in a chart. We will use a simple pie chart built with matplotlib.
Create a Twitter application
First of all, you need to create a Twitter application as a developer. Twitter API v1.1 uses OAuth to authenticate applications, which makes the process a little more complicated, but once you know the basics, it is easy to continue.
- Visit the Twitter Developers site and sign in with your Twitter username and password.
- Under your avatar you will find a menu called “My applications” where you can “Create New App”. Choose a unique name and accept the conditions.
- Voila! Your Twitter app is created and under “API Keys” you will find your “API Key” and “API Secret” which are the consumer_key and consumer_secret in OAuth.
- You will also need an access token key and secret, which can be obtained by clicking on “Create my access token”.
Get a sample of tweets using the Twitter Search API and the TwitterAPI library
The next step is to search for the tweets that mention your brand. We use the TwitterAPI library to retrieve the most recent tweets that contain specific keywords. TwitterAPI is a convenient Python wrapper library around the Twitter API. If you have not used it before, you can install it using:
pip install TwitterAPI
Searching Twitter requires building a query for the search endpoint, setting your search keywords as the value of the parameter q in a JSON object. In addition, we filter by language and limit the number of results to 100.
The result is a list of tweets that match your keywords, returned as JSON objects. Although it may seem simple, a tweet contains a few dozen fields. In this tutorial, however, we are only interested in getting the text to analyze its sentiment.
The following script shows you how to search Twitter. Just copy and paste your own credentials.
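A minimal sketch of such a search, assuming the TwitterAPI library's TwitterAPI(...) constructor and its request('search/tweets', params) call. The helper names and the placeholder credential strings are hypothetical and only serve the illustration; replace the placeholders with the values from your own Twitter application.

```python
# Placeholders: paste the credentials of your own Twitter application here.
CONSUMER_KEY = "YOUR_CONSUMER_KEY"
CONSUMER_SECRET = "YOUR_CONSUMER_SECRET"
ACCESS_TOKEN_KEY = "YOUR_ACCESS_TOKEN_KEY"
ACCESS_TOKEN_SECRET = "YOUR_ACCESS_TOKEN_SECRET"


def build_search_params(keywords, lang="en", count=100):
    """Build the query: keywords in q, a language filter and a result limit."""
    return {"q": keywords, "lang": lang, "count": count}


def extract_texts(tweets):
    """Keep only the text of each tweet; items without a text field are skipped."""
    return [tweet["text"] for tweet in tweets if "text" in tweet]


def search_tweets(keywords):
    """Query the search endpoint (needs network access and valid credentials)."""
    # Imported here so the helpers above can be used without the library installed.
    from TwitterAPI import TwitterAPI

    api = TwitterAPI(CONSUMER_KEY, CONSUMER_SECRET,
                     ACCESS_TOKEN_KEY, ACCESS_TOKEN_SECRET)
    response = api.request("search/tweets", build_search_params(keywords))
    return extract_texts(response.get_iterator())
```

Calling search_tweets("Nokia") would then return a list with the text of up to 100 recent tweets, provided the credentials are valid.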
Get a license key for MeaningCloud Media Analysis
If you have not done so already, register at MeaningCloud and you will receive a confirmation email that redirects you to your Personal Area. If you already have a username and password, just “Sign in”. In the Personal Area, you can see the list of MeaningCloud APIs; find the Media Analysis API (the green one) and click on “Get License”.
Using MeaningCloud Media Analysis to carry out Sentiment Analysis
MeaningCloud Media Analysis API provides sentiment polarity at the document level. In other words, it reveals the prevailing sentiment of a document. This is usually enough if you are analyzing only tweets and you assume that just one opinion is expressed within the 140 characters. For longer documents, you may be interested in measuring entity-level or phrase-level sentiment, which can be done with the MeaningCloud Sentiment Analysis API. The four values for the sentiment field in the Media Analysis response are:
- P means that the document contains positive opinions.
- N is used for negative opinions.
- NEU is used if the document contains no marked opinions, or if positive and negative ones appear in equal measure.
- NONE is used if the document is entirely objective.
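The four values above can be mapped to readable labels before counting or charting them. The sketch below is only an illustration: the dictionary and the function name are hypothetical and not part of any MeaningCloud library.

```python
# The four sentiment values listed above, mapped to readable labels.
SENTIMENT_LABELS = {
    "P": "positive",
    "N": "negative",
    "NEU": "neutral",
    "NONE": "objective",
}


def describe_sentiment(code):
    """Return the readable label for a sentiment value, rejecting unknown ones."""
    if code not in SENTIMENT_LABELS:
        raise ValueError("unexpected sentiment value: %r" % (code,))
    return SENTIMENT_LABELS[code]
```

Rejecting unknown values explicitly makes it easier to notice if a response ever carries something other than the four documented codes.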
We provide libraries that wrap the HTTP request and provide domain objects for a document and its annotations. Download them from our SDK page and place them in your work folder. To call MeaningCloud Media Analysis, we build a document object with an id and the text. As we already know that we are going to analyze tweets, we also set the value of the source field in the document. That selects a processing pipeline tailored to deal with Twitter slang, incorrect spelling and other issues specific to the micropost sublanguage. We also set language, for a similar purpose: to select the appropriate language pipeline for processing our tweets.
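To make the four fields concrete, here is a sketch that represents the document as a plain dictionary. This is an assumption made for illustration only: the SDK's own document class would take the place of the dictionary, and both the function name and the "TWITTER" value for source are hypothetical.

```python
def build_document(tweet_id, text, language="en", source="TWITTER"):
    """Collect the fields described above: id, text, source and language.

    A plain dict stands in for the SDK's document object, and "TWITTER"
    is an assumed value for the source field.
    """
    if not text or not text.strip():
        raise ValueError("a document needs some text to analyze")
    return {
        "id": str(tweet_id),
        "text": text,
        "source": source,
        "language": language,
    }
```

For example, build_document(1, "I love my new phone") yields a dictionary with the id "1", the tweet text, and the default source and language.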
Visualize the results with matplotlib
Last step: put all the pieces together and visualize the results. We will use the matplotlib Python library to plot the percentage of tweets with each sentiment value. Again, if you have not used it before, install matplotlib using:
pip install matplotlib
The complete Python script will retrieve the most recent comments related to a brand using the Twitter API, extract their text and analyze it using MeaningCloud Media Analysis to detect its sentiment. Finally, we count positive, negative, neutral and objective tweets to produce a pie chart that visualizes the sentiment associated with our brand. As the pie chart uses percentages, we normalize the raw counts by the total number of retrieved tweets.
Below you can check the full code. You can also find it on GitHub.
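The counting and normalization step can be sketched as follows, assuming each analyzed tweet has already been reduced to one of the four sentiment values. The function names are hypothetical, and the plotting function uses matplotlib's standard pyplot.pie call; it is a sketch under these assumptions rather than the tutorial's own script.

```python
from collections import Counter

# Fixed order so the chart slices always appear in the same sequence.
SENTIMENT_ORDER = ["P", "N", "NEU", "NONE"]


def count_sentiments(codes):
    """Count how many tweets carry each of the four sentiment values."""
    counts = Counter(codes)
    return {code: counts.get(code, 0) for code in SENTIMENT_ORDER}


def to_percentages(counts):
    """Normalize raw counts by the total so that they add up to 100."""
    total = sum(counts.values())
    if total == 0:
        # No tweets retrieved: avoid dividing by zero.
        return {code: 0.0 for code in counts}
    return {code: 100.0 * n / total for code, n in counts.items()}


def plot_sentiment(percentages, brand):
    """Draw a pie chart of the percentages (opens a matplotlib window)."""
    # Imported here so the counting helpers work without matplotlib installed.
    import matplotlib.pyplot as plt

    # Empty slices are dropped so their labels do not overlap on the chart.
    labels = [code for code in percentages if percentages[code] > 0]
    sizes = [percentages[code] for code in labels]
    plt.pie(sizes, labels=labels, autopct="%1.1f%%")
    plt.axis("equal")  # keep the pie circular
    plt.title("Sentiment for " + brand)
    plt.show()
```

A typical use would be plot_sentiment(to_percentages(count_sentiments(codes)), "Nokia"), where codes is the list of sentiment values obtained for the retrieved tweets.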
Here is an example of the output of our simple brand monitoring tool for the brand Nokia. Modify it using different parameters and your own metrics! Though simple, this is a good start, isn’t it? Check out the rest of the MeaningCloud Media Analysis API features to help you improve on it. You can use Entity Extraction to discover which other topics co-occur with the mentions of your brand, or use Text Classification to filter out irrelevant tweets.