Over the past two weeks, the internet’s viral outrage has been targeting United Airlines, the brand that has been in crisis mode after a bloodied passenger was forcibly dragged off a plane. Millions of people witnessed videos of the incident being spread over social media.
In today’s interconnected world, a public relations crisis can start with one tweet. While United Airlines might be feeling great about the millions of mentions the brand received on Twitter during the last two weeks, the brand is in trouble, as these mentions are filled with complaints and sarcasm. Here’s a visualization of the growth in United Airlines’ brand mentions and negative tweets since the incident:
As you can see, references to the United Airlines brand have grown exponentially since April 10th, and the sentiment of the tweets has skewed heavily negative.
In this blog, I will walk you through how to conduct a step-by-step sentiment analysis using United Airlines’ tweets as an example. Before we start, let’s first introduce the topic of sentiment analysis and discuss the purpose behind a sentiment analysis.
What is Sentiment Analysis?
Sentiment Analysis is the process of determining whether a piece of writing (product/movie review, tweet, etc.) is positive, negative or neutral. It can be used to identify the customer or follower's attitude towards a brand through the use of variables such as context, tone, emotion, etc. Marketers can use sentiment analysis to research public opinion of their company and products, or to analyze customer satisfaction. Organizations can also use this analysis to gather critical feedback about problems in newly released products.
Sentiment analysis not only helps companies understand how they’re doing with their customers, it also gives them a better picture of how they stack up against their competitors. For example, if your company has 20% negative sentiment, is that bad? It depends. If your competitors have a roughly 50% positive and 10% negative sentiment, while yours is 20% negative, that merits more discovery to understand the drivers of these opinions. Knowing the sentiments associated with competitors helps companies evaluate their own performance and search for ways to improve.
How to Perform Sentiment Analysis?
There are many tools that provide automated sentiment analysis solutions. In this blog, I will illustrate how to perform sentiment analysis with MonkeyLearn and Python (for those who want to build the sentiment analyzer from scratch). MonkeyLearn is a highly scalable machine learning tool that automates text classification and sentiment analysis. With the built-in public modules in MonkeyLearn, we can get results quickly with no machine learning knowledge. Regardless of which tool you use for sentiment analysis, the first step is to crawl tweets on Twitter.
Step 1: Crawl Tweets Against Hash Tags
To get access to the Twitter API, you'll need to log in to the Twitter Developer website and create an application. Enter your desired Application Name, Description and your website address, making sure to enter the full address including the http://. You can leave the callback URL empty.
After registering, create an access token and grab your application’s Consumer Key, Consumer Secret, Access Token and Access Token Secret from the Keys and Access Tokens tab.
Then, put this information into the variables defined in the Python code attached here:
This Twitter crawler allows you to scrape tweets that match a hashtag and store them in a CSV file. For this tutorial, I scraped all the tweets containing #UnitedAirlines from April 3, 2017 to April 16, 2017.
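As a rough sketch of the filtering and storage half of such a crawler, the snippet below keeps only tweets that contain a hashtag within a date window and writes them to CSV. It does not call the Twitter API. The record layout (`created_at`, `text`), the helper names and the sample tweets are all hypothetical, chosen only for illustration.

```python
import csv
import io
from datetime import date, datetime


def matches(tweet, hashtag, start, end):
    """Return True if the tweet contains the hashtag (case-insensitive)
    and was created between start and end, inclusive."""
    day = datetime.strptime(tweet["created_at"], "%Y-%m-%d %H:%M:%S").date()
    return hashtag.lower() in tweet["text"].lower() and start <= day <= end


def write_csv(tweets, handle):
    """Write a header row followed by one row per tweet (time, then text)."""
    writer = csv.writer(handle)
    writer.writerow(["created_at", "text"])
    for tweet in tweets:
        writer.writerow([tweet["created_at"], tweet["text"]])


# Hypothetical records standing in for whatever the crawler returns.
sample = [
    {"created_at": "2017-04-10 18:05:00", "text": "Never flying them again #UnitedAirlines"},
    {"created_at": "2017-04-20 09:00:00", "text": "Late again #unitedairlines"},
    {"created_at": "2017-04-11 12:30:00", "text": "Great crew today #SouthwestAirlines"},
]

kept = [t for t in sample
        if matches(t, "#UnitedAirlines", date(2017, 4, 3), date(2017, 4, 16))]

buffer = io.StringIO()  # stands in for an open file
write_csv(kept, buffer)
csv_text = buffer.getvalue()
```

In practice you would replace `buffer` with a file opened via `open(path, "w", newline="")`.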
Step 2: Analyzing Tweets for Sentiment
MonkeyLearn has a built-in module “English tweets airlines sentiment analysis” that analyzes sentiments for tweets about airline reviews. This module can classify airline tweets into positive, negative and neutral with an accuracy of 81%.
When uploading the .csv file that contains the United Airlines tweets to MonkeyLearn, we need to discard the first row and ignore the time column.
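If you would rather prepare the file yourself before uploading, the same two adjustments can be made in a few lines of Python. This is only a sketch: it assumes the time is in the first column and the tweet text in the second, which may not match your CSV, and `load_tweet_texts` is a hypothetical helper name.

```python
import csv
import io


def load_tweet_texts(handle, text_column=1):
    """Skip the header row and return only the tweet text column."""
    reader = csv.reader(handle)
    next(reader, None)  # discard the first (header) row
    return [row[text_column] for row in reader if len(row) > text_column]


# A small in-memory file standing in for the crawler's CSV output.
sample_file = io.StringIO(
    'created_at,text\n'
    '2017-04-10 18:05:00,"Delayed again, thanks #UnitedAirlines"\n'
)
texts = load_tweet_texts(sample_file)
```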
In this built-in module, text data is automatically preprocessed and stop words are filtered out before applying the support vector machine algorithm.
With this text classifier, we can label each tweet as positive, negative or neutral in a few minutes. However, human language is complex. Teaching a machine to analyze the various grammatical nuances, cultural variations, slang and misspellings that occur in social media is a difficult process. Teaching a machine to understand how context can affect tone is even more difficult. Humans are fairly intuitive when it comes to interpreting the tone of a piece of writing. Consider the following example:
Most humans would be able to quickly interpret that the tweet was being ironic about United Airlines’ disastrous “passenger re-accommodation”. By applying this contextual understanding to the sentence, we can easily identify the sentiment as negative towards United Airlines. Without contextual understanding, a machine looking at the sentence above might see the word “genius” and categorize it as positive.
The above example shows how sentiment analysis has its limitations and is not to be used as a 100% accurate marker. As with any automated process, it is prone to error and often needs a human eye to watch over it.
Step 3: Visualizing the Results
After United Airlines forcibly removed a passenger from a plane, brand mentions of United Airlines exploded on Twitter overnight. At the same time, a fake slogan for Southwest Airlines, “we beat the competition, not you”, spread widely across social media (it was not produced by Southwest). The graph below visualizes how brand mentions of both United Airlines and Southwest Airlines fluctuated from April 3 to April 16 (one week before and after the incident).
As you can see in this graph, there have been many more negative tweets than positive tweets about United Airlines since April 10. It also shows how many brand mentions Southwest gained from United Airlines’ PR crisis. From this graph, we get an idea of how much attention (positive vs negative) this topic attracted during the past two weeks. The graph below specifically visualizes the negative sentiments for both United and Southwest Airlines.
This graph shows the ratio of the number of negative tweets to the total number of tweets mentioning United Airlines and Southwest Airlines. This shows how negative the discussion on Twitter is for each of the two major airlines. All in all, United Airlines has a much higher percentage of negative tweets compared to Southwest Airlines, even before the incident. Interestingly, the ratio of negative tweets mentioning United slightly decreased after the incident, which occurred on April 10th.
In contrast, the percentage of negative tweets discussing Southwest increased. These swings arise because, before United Airlines’ PR crisis, there were very few tweets mentioning United or Southwest Airlines, so a handful of tweets could move the ratio substantially. For example, if United Airlines only had three negative tweets and one positive tweet on a given day, the ratio of negative tweets would be 0.75. As the number of tweets increases, the ratio becomes more robust to small changes.
At first, it might be difficult to understand why the percentage of negative tweets for Southwest Airlines, which was supposed to be benefiting from this crisis, increased instead of decreased. On further investigation, we discovered that the machine was confused by negative tweets that mention both United Airlines and Southwest Airlines. For example, the tweet “Don't think I'll be flying #unitedAIRLINES ever again... #SouthwestAirlines” will confuse the sentiment analyzer and be classified as negative, even though the negativity is aimed at United rather than Southwest. This indicates another limitation of sentiment analysis: it’s hard for machines to distinguish sentiments for different subjects.
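The arithmetic behind this ratio, and its sensitivity on quiet days, can be sketched as follows. The `(day, sentiment)` pair layout, the function name and the tweet counts are assumptions for illustration, not figures from the actual dataset.

```python
from collections import Counter, defaultdict


def negative_ratio_by_day(labelled):
    """labelled: iterable of (day, sentiment) pairs.
    Returns {day: negative tweets / all tweets on that day}."""
    counts = defaultdict(Counter)
    for day, sentiment in labelled:
        counts[day][sentiment] += 1
    return {day: c["negative"] / sum(c.values()) for day, c in counts.items()}


# A quiet day: three negative tweets and one positive tweet.
quiet_day = [("2017-04-05", "negative")] * 3 + [("2017-04-05", "positive")]
quiet_ratio = negative_ratio_by_day(quiet_day)["2017-04-05"]  # 3 / 4

# One extra positive tweet moves the quiet-day ratio a lot (3 / 5).
quiet_plus_one = negative_ratio_by_day(
    quiet_day + [("2017-04-05", "positive")]
)["2017-04-05"]

# On a busy day the same extra tweet barely matters (3000 / 4001).
busy_day = [("2017-04-11", "negative")] * 3000 + [("2017-04-11", "positive")] * 1001
busy_ratio = negative_ratio_by_day(busy_day)["2017-04-11"]
```

With only four tweets, one additional tweet shifts the ratio from 0.75 to 0.6, while with thousands of tweets the shift is negligible.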
Sentiment analysis can also reveal the most frequently used words among positive, negative or neutral tweets. In the chart below, the larger the bubble, the more frequently the word appears in a set of tweets. From the bubble chart, we can see that the most frequent words in negative tweets are “united”, “airlines”, “passenger” and “overbook”. Words in positive tweets, such as “friend”, “opportunity”, “brilliant” and “welcome”, appear far less often than words in negative tweets, and some of them are actually sarcastic about this incident.
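Counts like the ones behind such a bubble chart can be produced with a simple word tally per sentiment. The sketch below assumes the tweets are already labelled; the sample tweets, the stop-word set and the `top_words` helper are hypothetical.

```python
import re
from collections import Counter, defaultdict


def top_words(labelled_tweets, n=3, stop_words=frozenset()):
    """labelled_tweets: iterable of (text, sentiment) pairs.
    Returns the n most frequent words for each sentiment."""
    counts = defaultdict(Counter)
    for text, sentiment in labelled_tweets:
        words = re.findall(r"[a-z']+", text.lower())
        counts[sentiment].update(w for w in words if w not in stop_words)
    return {s: c.most_common(n) for s, c in counts.items()}


# Hypothetical labelled tweets, for illustration only.
sample = [
    ("united overbook again", "negative"),
    ("united dragged a passenger", "negative"),
    ("welcome aboard friend", "positive"),
]
top = top_words(sample, n=1, stop_words={"a"})
```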
Build Your Own Sentiment Analyzer
For those who are interested in the methodology behind sentiment analysis, I will briefly explain the algorithm and introduce a way to build your own sentiment extractor in Python. The purpose of the sentiment extractor is to automatically classify a tweet as positive, negative or neutral, which can be framed as a classification task in machine learning. Classification is the task of choosing the correct class label for a given input. One example of a classification task is deciding what the topic of a news article is, from a fixed list of topic areas such as "sports," "technology," and "politics."
In this section, we will deal with the Naive Bayes Classifier and Support Vector Machines. Both are built on training corpora that contain the correct label for each input, so these methods fall under the category of supervised classification. You can find detailed code here.
Step 1: Training the Classifiers
To obtain training data for sentiment analysis, I downloaded the airline Twitter sentiment dataset from Figure Eight (previously CrowdFlower), which is also used in the “English tweets airlines sentiment analysis” module from MonkeyLearn.
Here are some sample tweets along with classified sentiments:
Step 2: Preprocess Tweets
Before we start building the analyzer, we first need to remove noise and preprocess tweets by using the following steps:
- Lower Case - Convert the tweets to lower case.
- URLs - Remove all URLs via regular expression matching, or replace them with the generic word URL.
- @username - Remove "@username" via regex matching, or replace it with the generic word AT_USER.
- #hashtag - Replace hashtags with the exact same word without the hash (hashtags may provide some useful information), e.g. #boycottUnitedAirlines is replaced with ' boycottUnitedAirlines '.
- Punctuation and additional white space - Remove punctuation at the start and end of the tweets, e.g. ' the day is beautiful! ' is replaced with 'the day is beautiful'. We also replace multiple whitespace characters with a single space.
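The steps above can be sketched with Python's `re` module as follows. The regular expressions and the set of punctuation characters stripped from the ends are my own assumptions and may differ from the linked code. Note that because lower-casing runs first, hashtag text ends up lower case too.

```python
import re


def preprocess_tweet(tweet):
    """Apply the cleaning steps from the list above, in order."""
    tweet = tweet.lower()                                       # lower case
    tweet = re.sub(r"(https?://\S+)|(www\.\S+)", "URL", tweet)  # URLs -> URL
    tweet = re.sub(r"@\w+", "AT_USER", tweet)                   # @username -> AT_USER
    tweet = re.sub(r"#(\w+)", r"\1", tweet)                     # #hashtag -> hashtag
    tweet = re.sub(r"\s+", " ", tweet)                          # collapse whitespace
    return tweet.strip(" '\"!?.,;:")                            # trim both ends


cleaned = preprocess_tweet(
    "  @united The day is BEAUTIFUL!  http://t.co/abc #boycottUnitedAirlines "
)
```

The placeholders `URL` and `AT_USER` stay upper case because they are inserted after the lower-casing step, which makes them easy to filter out later.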
Step 3: Extract Feature Vectors
One important step in building a classifier is deciding what features of the input are relevant, and how to encode those features. For example, to build a classifier that guesses gender from a first name, we can use the last letter of the name as a feature. Specifically, names ending in a, e and i are likely to be female, while names ending in k, o, r, s and t are likely to be male.
Similarly, we can use the presence or absence of words that appear in a tweet as features. In the training data, we can split each tweet into words and add each word to the feature vector. Some words do not indicate the sentiment of a tweet, and we can filter them out. We then merge the individual feature vectors into one large list that contains all the features, and remove duplicates from this list.
Adding individual words to the feature vector is referred to as the 'unigrams' approach. Here, for simplicity, we will only consider unigrams. Below are some examples of features extracted from tweets.
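A minimal sketch of unigram feature extraction might look like this. The stop-word list, the `contains(word)` naming of features and the sample tweets are assumptions made for illustration.

```python
# Hypothetical stop-word list; a real one would be much longer.
STOP_WORDS = {"the", "a", "is", "AT_USER", "URL"}


def tokenize(tweet):
    """Split a preprocessed tweet into words and drop uninformative ones."""
    return [w for w in tweet.split() if w not in STOP_WORDS]


def build_vocabulary(tweets):
    """Merge the words of all tweets into one de-duplicated, sorted list."""
    return sorted({w for tweet in tweets for w in tokenize(tweet)})


def extract_features(tweet, vocabulary):
    """Encode a tweet as presence/absence of every vocabulary word."""
    words = set(tokenize(tweet))
    return {f"contains({w})": (w in words) for w in vocabulary}


training_tweets = ["the flight is late", "a great flight"]
vocabulary = build_vocabulary(training_tweets)
features = extract_features("great crew", vocabulary)
```

Words that never appeared in the training tweets, such as "crew" here, simply do not show up in the feature vector.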
To explain how a Naive Bayes Classifier or Support Vector Machine works is beyond the scope of this post. In my earlier post, I have some examples explaining the algorithms of Naive Bayes and Support Vector Machines. In this post, I will gloss over the mathematical and statistical underpinnings of these techniques, focusing instead on how and when to use them.
Naive Bayes Classifier
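In Naive Bayes classifiers, every feature contributes to deciding which label should be assigned to a given input value. To choose a label for an input value, the naive Bayes classifier begins by calculating the prior probability of each label, which is determined by checking the frequency of each label in the training set. The contribution from each feature is then combined with this prior probability to arrive at a likelihood estimate for each label, and the label with the highest estimate is assigned to the input. At this point, we have a training set, so all we need to do is instantiate a classifier and classify test tweets.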
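To make the prior-plus-features idea concrete, here is a small from-scratch sketch of a multinomial Naive Bayes classifier with add-one smoothing. It is not the library classifier used in the linked code, and the training tweets are hypothetical.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes over word lists, with add-one smoothing."""

    def fit(self, documents, labels):
        self.label_counts = Counter(labels)    # used for the priors
        self.word_counts = defaultdict(Counter)
        for words, label in zip(documents, labels):
            self.word_counts[label].update(words)
        self.vocabulary = {w for c in self.word_counts.values() for w in c}
        return self

    def log_scores(self, words):
        total_docs = sum(self.label_counts.values())
        scores = {}
        for label, n_docs in self.label_counts.items():
            score = math.log(n_docs / total_docs)  # log prior of the label
            total_words = sum(self.word_counts[label].values())
            for w in words:
                if w in self.vocabulary:           # skip unseen words
                    count = self.word_counts[label][w]
                    score += math.log(
                        (count + 1) / (total_words + len(self.vocabulary))
                    )
            scores[label] = score
        return scores

    def predict(self, words):
        scores = self.log_scores(words)
        return max(scores, key=scores.get)


# Hypothetical, already-tokenized training tweets.
train_docs = [["terrible", "delay"], ["lost", "bag", "terrible"],
              ["great", "crew"], ["flight", "landed"]]
train_labels = ["negative", "negative", "positive", "neutral"]

model = NaiveBayes().fit(train_docs, train_labels)
prediction = model.predict(["terrible", "flight"])
```

Working in log space avoids multiplying many small probabilities together, which would otherwise underflow on longer inputs.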
Support Vector Machines
The Support Vector Machine (SVM) algorithm tries to find a hyperplane that separates the data into two classes as optimally as possible. However, for sentiment analysis we have three classes (positive, neutral, negative). How can we tackle datasets with more than two classes?
There are two approaches for decomposing a multiclass classification problem into binary classification problems: the one-vs-all and the one-vs-one approach. In the one-vs-all approach, one SVM classifier is built per class. This classifier takes that one class as the positive class and the rest of the classes as the negative class. In the one-vs-one approach, you build one SVM classifier per chosen pair of classes. To keep things simple, we chose the one-vs-all approach, decomposing our problem into training three binary classifiers.
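The one-vs-all decomposition can be sketched in plain Python as below. The binary learner is a deliberately simple linear SVM trained by full-batch subgradient descent on the regularised hinge loss, which is my own stand-in rather than the solver used in the linked code. The hyperparameters and the three-dimensional toy feature vectors are assumptions.

```python
def train_linear_svm(X, y, epochs=200, lr=0.1, reg=0.01):
    """Binary linear SVM; y holds +1 / -1 labels. Returns (weights, bias)."""
    dim, n = len(X[0]), len(X)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w = [reg * wj for wj in w]  # gradient of the L2 penalty
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:               # point violates the margin
                for j in range(dim):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def train_one_vs_all(X, labels):
    """Train one binary SVM per class: that class (+1) against the rest (-1)."""
    models = {}
    for target in sorted(set(labels)):
        binary = [1 if label == target else -1 for label in labels]
        models[target] = train_linear_svm(X, binary)
    return models


def predict(models, x):
    """Pick the class whose binary SVM gives the highest score."""
    def score(model):
        w, b = model
        return sum(wj * xj for wj, xj in zip(w, x)) + b
    return max(models, key=lambda label: score(models[label]))


# Toy vectors: counts of a (negative, positive, neutral) cue word per tweet.
X = [(2, 0, 0), (1, 0, 0), (0, 2, 0), (0, 1, 0), (0, 0, 2), (0, 0, 1)]
labels = ["negative", "negative", "positive", "positive", "neutral", "neutral"]

models = train_one_vs_all(X, labels)
predicted = predict(models, (3, 0, 0))
```

Three classes yield three binary classifiers here, whereas one-vs-one would train one classifier for each pair of classes.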
When building your own analyzer, there is no single best algorithm that fits all types of text data. According to research from Stanford University, Naive Bayes outperforms Support Vector Machines on short snippet sentiment classification tasks, while an SVM is a stronger performer when analyzing sentiments for longer reviews.
To summarize, this blog provides two methods to perform sentiment analysis, for both marketers and data scientists, with either MonkeyLearn or Python. MonkeyLearn is a quick and convenient tool for getting started with sentiment analysis. Once you are comfortable with sentiment analysis, you can start building and experimenting with your own sentiment analyzer. Here are some useful resources for building a sentiment analyzer: