Tutorial: Building a Twitter Sentiment Analysis Process

Tutorial on collecting and analyzing tweets using the “Text Analysis by AYLIEN” extension for RapidMiner.

This tutorial explains how to collect and analyze tweets using the “Text Analysis by AYLIEN” extension for RapidMiner. If you’re new to RapidMiner, or it’s your first time using the Text Analysis Extension, you should first read our Getting Started tutorial, which takes you through the installation process. Also, if you haven’t got an AYLIEN account, which you’ll need to use the Extension, you can grab one here. So, here’s what we’re going to do:
  1. Collect tweets using the Search Twitter Operator
  2. Analyze their Sentiment using the Analyze Sentiment Operator
  3. Assign the tweets to different categories using the Categorize Operator
  4. Visualize our results and make them more consumable and understandable
You can download the finished Process from the Sample Processes page.

Step 1. Gathering tweets

Create a new Process in RapidMiner and add a Search Twitter Operator. Build your search as you would with the Twitter search API. As the screenshot below shows, we’re searching for tweets containing the keyword “Samsung”. We’ve cleaned up our search a little by excluding retweets (-rt) and links (-http). We’ve also limited the number of tweets to collect to 20 and restricted the results to English tweets by entering “en” in the language parameter. Finally, we’ve used the Result type parameter to indicate that we only want recent or popular tweets returned.

[Screenshot: data collection]

Let’s have a look at what kind of results our search returns. Once you hit Run (don’t forget to connect your Operators), the results from the Twitter search are displayed in an ExampleSet tab, like the one below:

[Screenshot: tweet search results]

Step 2. Analyzing tweets for Sentiment

So now we have a collection of 20 tweets stored in an ExampleSet, ready to be analyzed further. The first thing we’re going to do is try to determine the Sentiment of each tweet, i.e. whether it is Positive, Negative or Neutral. We do this by adding the Analyze Sentiment Operator to our Process and selecting “text” as our “Input attribute” on the right-hand side, as shown in the screenshot below:

[Screenshot: analyze sentiment]

So now we have a relatively simple Twitter Sentiment Analysis Process that collects tweets about “Samsung” and analyzes them to determine the Polarity (i.e. positive, neutral or negative) and Subjectivity (i.e. subjective or objective) of each tweet. As the ExampleSet below shows, the results now contain not only the tweets that were pulled in but also their corresponding Polarity and Subjectivity, along with a confidence score for each:

[Screenshot: tweet ExampleSet]
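
To make the shape of these results more concrete, here is a minimal Python sketch of how rows like the ones in the ExampleSet could be tallied outside RapidMiner, for instance after exporting them. It is an illustration only: the column names (polarity, polarity_confidence), the sample tweets, and the scores are all assumptions made up for this example, not output from the Extension.

```python
from collections import Counter

# Hypothetical rows standing in for the ExampleSet.
# The texts, labels, scores, and column names are invented for illustration.
rows = [
    {"text": "Loving my new Samsung phone", "polarity": "positive", "polarity_confidence": 0.93},
    {"text": "Samsung battery died again", "polarity": "negative", "polarity_confidence": 0.88},
    {"text": "Samsung announces a press event", "polarity": "neutral", "polarity_confidence": 0.71},
    {"text": "Not sure about this Samsung update", "polarity": "negative", "polarity_confidence": 0.52},
]


def summarize_polarity(rows, min_confidence=0.0):
    """Count tweets per polarity label, skipping rows below min_confidence."""
    counts = Counter(
        row["polarity"]
        for row in rows
        if row["polarity_confidence"] >= min_confidence
    )
    return dict(counts)


# Tally every row, then only the rows the classifier was fairly sure about.
print(summarize_polarity(rows))
print(summarize_polarity(rows, min_confidence=0.6))
```

Filtering on the confidence score is one possible way to keep low-certainty classifications out of a summary; the 0.6 threshold here is arbitrary and would need tuning against real data.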