In the previous tutorial we published about Sentiment Analysis and MeaningCloud’s Excel add-in, we showed you step by step how to do a sentiment analysis using an example spreadsheet. Then we showed you a possible analysis you could obtain with its global polarity results.
In this tutorial we are going a bit further: instead of analyzing the global polarity obtained for different texts, we are going to focus on the analysis of different aspects that appear in them and how to use our dictionaries customization console to improve them and to extract easily the exact information you are interested in.
We are going to work withe same example as before: reviews for Japanese restaurants in London extracted from Yelp. If you don’t have it already from the previous tutorial, you can download the spreadsheet with the data here.
If you followed the previous tutorial, you will remember that when you run the sentiment analysis without changing its default settings, two new sheets appear: Global Sentiment Analysis and Topics Sentiment Analysis. Topics Sentiment Analysis shows you the concepts and entities detected in each one of the texts and the sentiment analysis associated to each one of them.
But what can we do when these are not the aspects of the text we are interested in analyzing? This is where our customization tools come in. Our dictionaries customization console allows you to create a dictionary with any of the concepts or entities you want to detect in your analysis, down the type you want them to have associated.
So how do we create this user dictionary?
Step 2: add entries
Once we have created the dictionary, we can start adding the entries we want to detect in the sentiment analysis. In this scenario we want to detect two different aspects: dishes the restaurants serve and qualities of restaurants people talk about in their reviews.
To create an entry we need to know three things: the entry we want to detect (its lexical form), if we want it to be an entity or a concept (its entry type), and the type we want to associate to it (its ontology type).
Usually the form and the ontology type are quite immediate; the entry type, not so much. A rule of thumb for this is to think of how it’s going to appear in a text and thus, how we want to be able to detect it.
For instance, if we want to add “pork belly bun” as a dish, we can guess that they will also be mentioned in plural, while when we talk about other dishes such as “ramen” or “sushi“, we are only ever going to find them like that (we don’t talk about “ramens” or “sushis“). Concepts automatically consider morphological variants as aliases, which means that if we want the plural (or the gender for languages other than English) to also be detected, we can define the entry as a concept, and we won’t have to add it as an alias.
In the following image you can see how we are going to create the entry for “ramen“. We are going to define it as an entry of the type Dish, a subtype in the type Restaurants:
As you can see, when we define our own ontology type, we use the character greater than, ‘>’, to indicate hierarchy. Much in the same way as this example, we are going to define restaurant qualities, which we will associate to the ontology type Quality (also be a subtype of Restaurants):
Step 3: configure the analysis to use the dictionary
Now that we have defined our dictionary, we can use it in the sentiment analysis we are doing using the Excel add-in.
After clicking on Analyze, the process will launch, creating two new sheets in your spreadsheet when it’s done: Global Sentiment Analysis, with the global sentiment results of the texts and Topics Sentiment Analysis, with aspect-based sentiment analysis. We talked about the first one in the previous tutorial, so now we are going to focus on the second one.
Step 4: analyze the results
With the new results we can see review by review the dishes that are talked about, and what clients think about the different qualities of a restaurant that we have defined in our dictionary. With this information and using Excel’s tools, we can obtain easily analyses as simple or as complex as we want.
These are two examples of pivot charts you can create (click on them to make them bigger):
We’ve obtained both charts by inserting a pivot chart from the filtered data. For the first one we’ve chosen a clustered column chart. To configure it, we’ve set the following:
- In the Axis fields area, the fields Form and ID, in that order.
- In the Report filter area, the field Type filtering by Top>Restaurants>Qualities.
- In the Legend fields area, the field Polarity.
- In the Values area, the field Polarity configured as “Count of Polarity”.
After that, and tweaking a bit the colors, we obtain the graphic on the right, where we can compare the polarities detected in the reviews for the same qualities for each one of the restaurants.
The second chart is pretty similar to the first one. This time we are using a clustered bar chart, and the configuration changes in two things:
- Instead of filtering by qualities, we are selecting the dishes, so the field Type filters now by the value Top>Restaurants>Dishes.
- The order of the fields in the Axis fields area changes: ID will appear first and then Form.
In this second chart we can see the different opinions the reviews offer on the same dishes, so for instance, we can see that the ramen in Shoryu Ramen has mixed the reviews, while the okonomiaki in Abeno seems to be a hit.
You can download the spreadsheet with the results and the analyses we have just described here.
As you can see, detecting the polarity for you specific domain is fairly easy, and once you’ve defined what you need, you can combine it with any kind of analysis you want to add to your usual workflow.
Stay tuned for our next tutorial, in which we will show you how to improve the sentiment analysis in your domain by using sentiment analysis. And of course, if you have any questions, we’ll be happy to answer them at firstname.lastname@example.org.