Music Sentiment Analysis With the Twitter API and 🤗 Transformers

Learn how to scrape data and perform sentiment analysis using the Twitter API and Hugging Face Transformers

Temidayo Omoniyi
Heartbeat


Image by author

Table of Contents:

  1. Introduction
  2. What is the Twitter API?
  3. Creating a Twitter Developer account
  4. What are Hugging Face Transformers?
  5. Installing all necessary libraries
  6. Configuration and authentication
  7. Introduction to the Tweepy library
  8. Data transformation and preprocessing
    8.1. Creating the sentiment analysis model
    8.2. Data insights and visualization
  9. Conclusion

1. Introduction

The unprecedented growth of social media platforms in recent years has caused thousands, perhaps even millions, of pieces of valuable data to be generated on a daily basis. In order to collect and access all of this meaningful data, however, the need for relevant APIs has also exploded. In today’s digital age, data professionals and enthusiasts alike must become familiar with APIs and how to interact with them for their data and project needs.

Twitter is fast becoming one of the largest microblogging websites, with over 100 million daily active users sending more than half a billion tweets every day. That’s a lot of data — most of which is open to the public! Utilizing this extensive data source has never been simpler, thanks to the Twitter API’s excellent documentation and user-friendly interface.

In this article, I will show you how to interact with one of the most popular social media APIs, the Twitter API. In order to do this, we will create a Twitter Developer account to collect Tweets, and then we’ll apply sentiment analysis to them using Hugging Face Transformers.

I find such a project interesting and I am sure you will too! Being an Afrobeat music lover myself, in this tutorial I will be utilizing Tweets about this genre of music, but you can go to my Github, fork the repository, and create a project with your own favorite artist!

2. What is the Twitter API?

Application Programming Interfaces, or APIs, as they are popularly known, are software intermediaries that enable two or more applications to communicate with one another.

How APIs work

The Twitter API is a set of programmatic endpoints that can be used to build a conversation on Twitter, or just to understand how Twitter works.

This API allows you to locate, retrieve, interact with, or create a number of resources, including bots and other Twitter-based apps. With the Twitter API, you can also access various resources such as Tweets, Users, Spaces, Direct Messages, Lists, Trends, Media, and Places.

The Developer Platform contains more in-depth documentation, including use cases and information on API offerings.

It is important to note that there are some restrictions associated with this service as well, which are detailed in the Twitter FAQ section.

3. Creating a Twitter Developer account

In order to gain access to the Twitter API, you must have a Twitter account. Through your account, you obtain the credentials needed to access the API service. If you don’t already have an account, you can create one for free by clicking on Sign up and following the necessary steps.

Next, log in to your new Developer Account with your Twitter credentials. You may be asked to provide additional information during this step.

Image by author

Once you have access to your new developer account, you’ll need to choose a name for your app. Make sure the name is unique and then click on Get Keys.

Image by author

Warning: Your API and access keys are important. We will be using them as credentials to scrape Twitter data, so keep them in a safe place that is accessible only to you. Never upload your API keys to a public forum.

Next, click on the Skip to Dashboard button at the lower right.

Image by author

In your dashboard, you will need to generate the Access Token and Access Token Secret. Next, click on the Generate button, found below.

Image by author

Remember to keep your keys in a safe place that is easily accessible by you only. After this, you are all set to use the Twitter API.

Image by author

4. What are Hugging Face Transformers?

Hugging Face Transformers offers APIs that make it simple to download and train state-of-the-art pre-trained models. These models reduce your computational cost and carbon footprint, and save you the time it would take to train a model from scratch.

These models can be used for numerous modality types, including:

  • Text: text classification, information extraction, question answering, summarization, and translation.
  • Image: image classification, object detection, and segmentation.
  • Audio: speech recognition and audio classification.
  • Multimodal: visual question answering, video classification, information extraction from scanned documents, and table question answering.


5. Installing all necessary libraries

Before starting the project, we need to install some specific libraries. Transformers integrates seamlessly with three of the most popular deep learning libraries: PyTorch, TensorFlow, and JAX.

Transformer requirements

Note: before you start installing the necessary libraries, you should create a virtual environment on your local machine.

Test for Installed libraries

As you can see from the GIF above, all necessary libraries for this project are properly installed.
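If you want to run the same check yourself, a short script like the one below does the job. It is a minimal sketch; the package list reflects the libraries this project uses, so adjust it to your own setup.

```python
# Verify that the project's libraries are installed, printing each version.
from importlib.metadata import version, PackageNotFoundError

REQUIRED = ["tweepy", "transformers", "pandas", "numpy", "matplotlib"]

def check_installed(packages):
    """Return a dict mapping package name -> installed version (or None if missing)."""
    report = {}
    for name in packages:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            report[name] = None
    return report

if __name__ == "__main__":
    for name, ver in check_installed(REQUIRED).items():
        print(f"{name}: {ver if ver else 'NOT INSTALLED'}")
```

Any package reported as NOT INSTALLED can be added with pip before continuing.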

6. Configuration and authentication

After importing the necessary modules, we need to configure the credentials we generated earlier so that we can connect to the Twitter API.

ConfigParser is a Python class that implements a simple configuration language for Python apps, with a structure similar to Microsoft Windows INI files. ConfigParser lets end users modify configuration without touching the application code, and it keeps your secret credentials out of your source files.

Steps in creating an ini file:

  • Create a new text file and rename it config.ini in the same directory folder as the Python file.
  • Add a [twitter] section header (ConfigParser requires one), then copy in the following keys, replacing the x’s with your own credentials, as shown in the image below:

[twitter]
api_key = xxxxxxxxxx
api_key_secret = xxxxxxxxxx
access_token = xxxxxxxxxx
access_token_secret = xxxxxxxxxx

This prevents others from accessing your secret keys when you share your file.

Image by author

Next, we need to read the Config file we just created.
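Reading the file can be done with the standard library’s configparser module. The sketch below assumes the four keys live under a [twitter] section, as in the file described above:

```python
# Read Twitter credentials from config.ini using the standard library.
import configparser

def load_credentials(path="config.ini"):
    """Parse the ini file and return the four Twitter credentials."""
    config = configparser.ConfigParser()
    config.read(path)
    section = config["twitter"]
    return (
        section["api_key"],
        section["api_key_secret"],
        section["access_token"],
        section["access_token_secret"],
    )
```

The returned values can then be passed straight to the authentication step in the next section.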

7. Introduction to the Tweepy library

Tweepy is an open-source Python package that gives you access to the Twitter API. We will use Tweepy to authenticate with our Twitter credentials.

Setting up Tweepy

For this project, we will be focused on Grammy Award-winner Burna Boy’s album “LOVEDAMINI.”

We start by setting the search keyword. Our search_query will be “#LOVEDAMINI”:

search_query = "#LOVEDAMINI"  # The search query used to scrape tweets tagged #LOVEDAMINI

Due to Twitter API limitations, we can only scrape 1000 tweets at a time. So, we’ll repeat this process multiple times over the course of an hour, and then save our data as CSV files that we will append into one file.

For more details about some Twitter API limitations see here.

Image by author

8. Data transformation and preprocessing

Now we’ll focus on cleaning up the data to make it more conducive for sentiment analysis.

# At this stage we will need to clean the data in different forms
Burna_Data_Damini = pd.read_csv("Append.csv")
Burna_Data_Damini.head()

Find and remove duplicates

Burna_Data_Damini.duplicated().sum()  # Count the number of duplicate rows
(~Burna_Data_Damini.duplicated()).sum()  # Count the number of unique rows
Burna_Data_Damini = Burna_Data_Damini.drop_duplicates()  # Remove the duplicates

Removal of hyperlinks and Unicode characters

Removing hyperlinks, emojis, Unicode symbols, and other unwanted text is also important for better analysis. The code below utilizes some common regex patterns to quickly and efficiently remove these characters from our data.

Image by author
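A cleaning function along those lines can be sketched as follows; the patterns are common ones for tweet text, so adapt them to your own data:

```python
# Remove hyperlinks, mentions, hash symbols, emojis/Unicode, and extra spaces.
import re

def clean_tweet(text):
    """Strip URLs, handles, and non-ASCII characters from a tweet."""
    text = re.sub(r"https?://\S+|www\.\S+", "", text)  # hyperlinks
    text = re.sub(r"@\w+", "", text)                   # @mentions
    text = re.sub(r"#", "", text)                      # keep the hashtag word, drop '#'
    text = text.encode("ascii", "ignore").decode()     # emojis / Unicode symbols
    text = re.sub(r"\s+", " ", text).strip()           # collapse whitespace
    return text
```

Applied column-wise, something like `Burna_Data_Damini["text"].apply(clean_tweet)` cleans the whole dataset in one pass.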

8.1 Creating the sentiment analysis model

For our sentiment analysis, we will use the Hugging Face Transformers listed below.

Additionally, we will need to download the twitter-roberta-base-sentiment model. This is a roBERTa-based model trained on ~58M tweets and fine-tuned for sentiment analysis with the TweetEval benchmark. Note that it is suitable for English analysis only.

This model uses three sentiment labels:

  • 0: Negative
  • 1: Neutral
  • 2: Positive

We can now run the model on our data using the code below:
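A scoring function along these lines captures the idea. This is a sketch: the model is loaded lazily inside the function (so the download only happens when you actually score a tweet), and the softmax is written out with the standard library to keep the label mapping easy to follow:

```python
# Score a tweet with cardiffnlp/twitter-roberta-base-sentiment.
import math

LABELS = {0: "Negative", 1: "Neutral", 2: "Positive"}

def softmax(logits):
    """Convert raw model logits to probabilities that sum to 1."""
    exps = [math.exp(x - max(logits)) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def score_tweet(text, model_name="cardiffnlp/twitter-roberta-base-sentiment"):
    """Return (label, confidence) for one tweet; downloads the model on first use."""
    from transformers import AutoModelForSequenceClassification, AutoTokenizer
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    encoded = tokenizer(text, return_tensors="pt")
    logits = model(**encoded).logits[0].tolist()
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]
```

For a whole DataFrame you would cache the tokenizer and model once rather than reloading them per tweet.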

Next, we need to add a new column and group the sentiment scores based on the labels provided by roBERTa:

Image by author
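In pandas, that grouping step might look like the following; the frame and column names are illustrative stand-ins for the scored dataset:

```python
# Map numeric sentiment labels to names and count tweets per sentiment.
import pandas as pd

LABELS = {0: "Negative", 1: "Neutral", 2: "Positive"}

# Hypothetical frame: one row per tweet with the model's numeric label
df = pd.DataFrame({
    "text": ["great album", "it is okay", "not for me"],
    "label": [2, 1, 0],
})

df["sentiment"] = df["label"].map(LABELS)        # new column with label names
counts = df.groupby("sentiment")["text"].count()  # tweets per sentiment group
print(counts)
```

The resulting counts feed directly into the visualizations in the next section.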

8.2 Data insights and visualization

We start by importing all relevant libraries, and then use the following code to visualize our data:
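A sketch of the sentiment chart using matplotlib, with illustrative counts in place of the real groupby output:

```python
# Bar chart of tweet counts per sentiment label.
import matplotlib
matplotlib.use("Agg")  # headless backend so the script also runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sentiment counts (replace with your own groupby output)
counts = pd.Series({"Negative": 120, "Neutral": 340, "Positive": 540})

ax = counts.plot(kind="bar", color=["crimson", "gray", "seagreen"])
ax.set_title("Sentiment Score")
ax.set_xlabel("Sentiment")
ax.set_ylabel("Number of tweets")
plt.tight_layout()
plt.savefig("sentiment_score.png")
```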

Sentiment Score

Let’s get the top fans tweeting about the album:
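With a user column in the frame, the top fans are a one-liner with value_counts; the handles below are made up for illustration:

```python
# Top users by number of tweets about the album.
import pandas as pd

# Hypothetical data: one row per tweet with the author's handle
df = pd.DataFrame({"user": ["fan_a", "fan_b", "fan_a", "fan_c", "fan_a", "fan_b"]})

top_fans = df["user"].value_counts().head(10)  # most frequent tweeters first
print(top_fans)
```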

Image by author
Image by author
WordCloud by author

Split Datetime

Aggregate Output
Tweet by Days
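The datetime steps behind those outputs can be sketched as follows; the timestamps and column names are assumptions standing in for the scraped data:

```python
# Split the tweet timestamp into date parts and count tweets per day.
import pandas as pd

# Hypothetical frame with a created_at timestamp per tweet
df = pd.DataFrame({
    "created_at": ["2022-07-08 10:15", "2022-07-08 22:40", "2022-07-09 09:05"],
})
df["created_at"] = pd.to_datetime(df["created_at"])

df["date"] = df["created_at"].dt.date            # calendar day
df["hour"] = df["created_at"].dt.hour            # hour of day
df["day_name"] = df["created_at"].dt.day_name()  # e.g. "Friday"

tweets_per_day = df.groupby("date").size()       # aggregate: tweets by day
print(tweets_per_day)
```

Plotting `tweets_per_day` the same way as the sentiment counts gives the tweets-by-day chart.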

9. Conclusion

Hugging Face Transformers are an amazing tool for performing sentiment analysis, as well as many other machine learning tasks. From here, you can further improve the performance of the model by performing some hyperparameter tuning, and you can track your different experiment runs with a model management service like Comet. Comet provides a free platform that integrates with your existing infrastructure and tools to manage, visualize, and optimize your models from prototype to production.

Thank you for reading and I hope you found this tutorial helpful in your own projects!


Editor’s Note: Heartbeat is a contributor-driven online publication and community dedicated to providing premier educational resources for data science, machine learning, and deep learning practitioners. We’re committed to supporting and inspiring developers and engineers from all walks of life.

Editorially independent, Heartbeat is sponsored and published by Comet, an MLOps platform that enables data scientists & ML teams to track, compare, explain, & optimize their experiments. We pay our contributors, and we don’t sell ads.

If you’d like to contribute, head on over to our call for contributors. You can also sign up to receive our weekly newsletters (Deep Learning Weekly and the Comet Newsletter), join us on Slack, and follow Comet on Twitter and LinkedIn for resources, events, and much more that will help you build better ML models, faster.

