Sentiment Analysis Using ELECTRA

Prakash Verma
Published in Heartbeat
May 16, 2023

Photo by ROMAN ODINTSOV: https://www.pexels.com/photo/diy-yellow-easter-eggs-6898858/

Introduction

Machine learning is one of the most popular approaches to sentiment analysis, a common natural language processing (NLP) task that involves determining the sentiment of a piece of text such as a tweet, product review, or customer feedback. With the growth of social media and online reviews, sentiment analysis has become an important way for companies and organizations to understand customer opinions.

Machine learning algorithms can be trained on large datasets of labeled examples to learn the relationships between textual features and sentiment labels. The trained models can then predict the sentiment of new, unseen text.

Image From: https://fiverr-res.cloudinary.com/images

In this article, we will explore how to perform sentiment analysis using the ELECTRA model.

What is ELECTRA?

ELECTRA (Efficiently Learning an Encoder that Classifies Token Replacements Accurately) is a pre-training technique for natural language processing (NLP) introduced in 2020 by researchers at Stanford University and Google Brain. It is built on the Transformer architecture and is intended to make pre-training more compute-efficient while producing models that perform well on a variety of NLP tasks, such as text classification, question answering, and sentiment analysis.

Because its pre-training objective learns from every input token rather than only a masked subset, ELECTRA can reach strong downstream accuracy with less pre-training compute than masked language models such as BERT. That makes it an attractive starting point for sentiment analysis models.

How does ELECTRA work?

At a high level, using ELECTRA for sentiment analysis involves the following steps:

  1. Data Preparation: Collect and prepare a labeled dataset for the sentiment analysis model. This dataset should include a range of text examples with corresponding sentiment labels (positive, negative, or neutral).
  2. Pre-training: ELECTRA is pre-trained on a large corpus of unlabeled text, where the generator and discriminator networks learn general language representations. No sentiment labels are involved at this stage, and in practice you usually load a published pre-trained checkpoint instead of pre-training from scratch.
  3. Fine-tuning: The pre-trained ELECTRA discriminator is then fine-tuned on the labeled sentiment dataset from step 1. This step adapts the general language representations to the sentiment analysis task.
  4. Evaluation: After fine-tuning, evaluate the model on a held-out test dataset by comparing the predicted sentiment labels with the actual labels.
  5. Deployment: Finally, the trained model can be deployed to perform sentiment analysis on new, unseen text. The model takes text as input and outputs a sentiment label (positive, negative, or neutral) based on the fine-tuned weights of the ELECTRA model.
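
To make the last step concrete, here is a minimal sketch of how a classifier's raw output scores (logits) might be turned into a sentiment label. The label order in `ID2LABEL` and the example scores are assumptions for illustration; the mapping has to match whatever integer ids were used during fine-tuning.

```python
import math

# Hypothetical label order; it must match the integer ids used during fine-tuning.
ID2LABEL = {0: "negative", 1: "neutral", 2: "positive"}

def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def logits_to_label(logits):
    """Return the most probable label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]

# Made-up logits for a single input text.
label, confidence = logits_to_label([-1.2, 0.3, 2.5])
print(label, round(confidence, 3))
```

The probability can serve as a rough confidence score, for example to route low-confidence predictions to manual review.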

The Architecture of ELECTRA

The basic idea behind ELECTRA is to train an encoder that can efficiently distinguish between the original text and a modified version of the text, where some of the tokens have been replaced with generated tokens. By training the encoder to perform this task, the model can learn more efficient and effective representations of language.

The ELECTRA architecture consists of two main components: a generator and a discriminator.

The generator is typically smaller than the discriminator and is trained as a masked language model: a random subset of the input tokens is masked out, and the generator predicts a token for each masked position. Tokens sampled from its predictions replace the originals, which produces replacements that are plausible but sometimes incorrect.

The discriminator is a larger model and is trained to accurately classify each token in the input text as original or generated. The discriminator takes the modified text as input and predicts whether each token in the text is original or generated.

During pre-training, the generator and discriminator are trained jointly. The generator is trained with the usual masked language modeling objective (it is not trained adversarially to fool the discriminator), while the discriminator is trained to predict whether each token in the corrupted text is original or replaced. After pre-training, the generator is discarded and only the discriminator is fine-tuned on downstream tasks such as sentiment analysis.
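
The replaced token detection idea can be illustrated with a toy sketch. It is not ELECTRA's actual implementation: the "generator" here is just a uniform random sampler over a tiny made-up vocabulary, and the names `corrupt` and `mask_prob` are hypothetical. What it does show is how the corrupted input and the per-token labels for the discriminator fit together.

```python
import random

def corrupt(tokens, vocab, mask_prob=0.15, seed=0):
    """Toy stand-in for the generator: pick random positions and sample a
    replacement token for each. A real generator is a small masked language
    model, not a uniform sampler."""
    rng = random.Random(seed)
    corrupted, labels = [], []
    for token in tokens:
        if rng.random() < mask_prob:
            sampled = rng.choice(vocab)
            corrupted.append(sampled)
            # If the sample happens to equal the original token, the
            # position still counts as "original" (label 0).
            labels.append(int(sampled != token))
        else:
            corrupted.append(token)
            labels.append(0)
    return corrupted, labels

tokens = "the chef cooked the meal".split()
vocab = ["the", "chef", "ate", "cooked", "meal", "a"]
corrupted, labels = corrupt(tokens, vocab, mask_prob=0.5)
for original, new, label in zip(tokens, corrupted, labels):
    print(f"{original:>7} -> {new:<7} replaced={label}")
```

The discriminator sees only the corrupted sequence and is trained to predict the `replaced` column for every position, which is why it receives a learning signal from all tokens rather than only the masked ones.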

Image from: https://media.licdn.com/dms/image/

Implementation

In this section, we will walk through the steps involved in implementing sentiment analysis using ELECTRA.

Step 1: Loading the Data

Load the data using a library such as Pandas. The data should have two columns: one for the text and one for the sentiment. The later steps assume these columns are named text and sentiment. Then split the data into training and testing sets.

import pandas as pd
from sklearn.model_selection import train_test_split

# Load data
data = pd.read_csv("dataSet.csv")

# Split data
train_data, test_data = train_test_split(data, test_size=0.2, random_state=42)
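
Fine-tuning a classifier expects integer class ids rather than label strings. If your sentiment column contains strings, a small mapping step like the sketch below may help. The label names in `LABEL2ID` and the helper `encode_labels` are assumptions for illustration, not part of the dataset described above.

```python
from collections import Counter

# Hypothetical label set; adjust to the values that occur in your own data.
LABEL2ID = {"negative": 0, "neutral": 1, "positive": 2}

def encode_labels(labels):
    """Map string labels to integer ids, failing loudly on unknown values."""
    unknown = sorted(set(labels) - set(LABEL2ID))
    if unknown:
        raise ValueError(f"Unexpected labels: {unknown}")
    return [LABEL2ID[label] for label in labels]

raw = ["positive", "negative", "neutral", "positive"]
encoded = encode_labels(raw)
print(encoded)           # [2, 0, 1, 2]
print(Counter(encoded))  # class counts, useful for spotting imbalance
```

With a DataFrame, the same function could be applied as `data["sentiment"] = encode_labels(data["sentiment"].tolist())` before splitting.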

Step 2: Preprocessing the Data

Preprocess the data by cleaning and tokenizing the text. You can use libraries such as NLTK.

import re

import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

# Download the tokenizer models and the stop word list
# (recent NLTK releases may also need nltk.download('punkt_tab'))
nltk.download('punkt')
nltk.download('stopwords')

# Define stopwords
stop_words = set(stopwords.words('english'))

# Clean and tokenize text
def clean_text(text):
    # Convert to lowercase and remove punctuation
    text = text.lower()
    text = re.sub(r'[^\w\s]', '', text)

    # Tokenize text and remove stop words
    tokens = word_tokenize(text)
    tokens = [token for token in tokens if token not in stop_words]

    # Join tokens back into a string
    return " ".join(tokens)

# Apply cleaning function to data
train_data['text'] = train_data['text'].apply(clean_text)
test_data['text'] = test_data['text'].apply(clean_text)
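
One caveat worth considering: NLTK's English stop word list includes negations such as "not", so removing stop words can change the sentiment of a sentence, and the ELECTRA tokenizer already handles casing and punctuation on its own. A lighter cleaning step may therefore be enough. The sketch below is one possible alternative, not a required part of the pipeline; `light_clean` is a hypothetical helper that only strips URLs and collapses whitespace.

```python
import re

URL_PATTERN = re.compile(r"https?://\S+")
WHITESPACE_PATTERN = re.compile(r"\s+")

def light_clean(text):
    """Strip URLs and collapse whitespace, leaving words and punctuation intact."""
    text = URL_PATTERN.sub(" ", text)
    return WHITESPACE_PATTERN.sub(" ", text).strip()

print(light_clean("Not  good at all!\nSee https://example.com/review "))
# Not good at all! See
```

Which variant works better depends on the data, so it is worth comparing both on a validation set.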

Step 3: Creating the Model

Create the sentiment analysis model by loading a pre-trained ELECTRA discriminator and adding a classification head on top of it.

import torch
import transformers

# Load the ELECTRA model
model_name = "google/electra-small-discriminator"

tokenizer = transformers.AutoTokenizer.from_pretrained(model_name)

# num_labels=2 assumes binary (negative/positive) labels; use 3 if a neutral class is included
model = transformers.AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Set the device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

Step 4: Training the Model

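Train the model using the training data. You can use the AdamW optimizer. When labels are passed to the model, it computes the cross-entropy loss internally, so no separate loss function is needed. The code assumes the sentiment column holds integer class ids.

import torch.optim as optim
from torch.utils.data import DataLoader

# Batch the training data (a batch size of 16 is an arbitrary choice)
train_loader = DataLoader(train_data.to_dict("records"), batch_size=16, shuffle=True)

# Define optimizer
optimizer = optim.AdamW(model.parameters(), lr=1e-5)

# Train model
model.train()
for epoch in range(12):
    running_loss = 0.0

    for batch in train_loader:
        # Pad each batch to its longest text so it can be stacked into one tensor
        inputs = tokenizer(batch['text'], padding=True, truncation=True, return_tensors="pt").to(device)
        labels = batch['sentiment'].to(device)

        optimizer.zero_grad()

        # Passing labels makes the model return the cross-entropy loss
        outputs = model(**inputs, labels=labels)
        loss = outputs.loss

        loss.backward()
        optimizer.step()

        running_loss += loss.item()

    print("Epoch {} Loss: {}".format(epoch + 1, running_loss / len(train_loader)))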
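
Batches usually contain texts of different lengths, and they can only be stacked into a single tensor once they are padded to a common length. With the Hugging Face tokenizer this is what `padding=True` requests. The pure-Python sketch below shows roughly what that padding and the accompanying attention mask look like; `pad_batch` and the token ids are made up for illustration.

```python
def pad_batch(sequences, pad_id=0):
    """Pad lists of token ids to the longest sequence in the batch and
    build the matching attention mask (1 = real token, 0 = padding)."""
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        input_ids.append(seq + [pad_id] * n_pad)
        attention_mask.append([1] * len(seq) + [0] * n_pad)
    return input_ids, attention_mask

# Made-up token ids for two sentences of different length.
ids, mask = pad_batch([[101, 2307, 3185, 102], [101, 102]])
print(ids)   # [[101, 2307, 3185, 102], [101, 102, 0, 0]]
print(mask)  # [[1, 1, 1, 1], [1, 1, 0, 0]]
```

The attention mask tells the model to ignore the padded positions, so padding does not change the prediction for the shorter text.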

Step 5: Evaluating the Model

Set the model to evaluation mode using model.eval(). Then loop through the test data and get the texts and labels for each batch. Use the tokenizer to convert the text to PyTorch tensors, run the model to get the logits, and take the argmax over the class dimension as the predicted label. Finally, count the correct predictions and divide by the total number of test examples to get the test accuracy.

from torch.utils.data import DataLoader

# Batch the test data (assumes integer class ids in the sentiment column)
test_loader = DataLoader(test_data.to_dict("records"), batch_size=16)

# Evaluate the model on the test dataset
model.eval()
test_corrects = 0

with torch.no_grad():
    for batch in test_loader:
        # Get the inputs and labels
        inputs = tokenizer(batch['text'], padding=True, truncation=True, return_tensors='pt').to(device)
        labels = batch['sentiment'].to(device)

        # Forward pass: logits have shape (batch_size, num_labels)
        logits = model(**inputs).logits

        # The predicted class is the index of the largest logit
        preds = logits.argmax(dim=1)
        test_corrects += (preds == labels).sum().item()

# Print the accuracy of the model
test_acc = test_corrects / len(test_data)
print(f'Test Acc: {test_acc:.4f}')
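
Accuracy alone can be misleading when one sentiment class dominates the test set, so it may be worth looking at per-class precision, recall, and F1 as well. The sketch below computes them in plain Python from two lists of integer labels; `per_class_metrics` and the example labels are made up for illustration.

```python
def per_class_metrics(y_true, y_pred):
    """Compute precision, recall and F1 for every label seen in the data."""
    metrics = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics

# Made-up true and predicted class ids.
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1]
for label, scores in per_class_metrics(y_true, y_pred).items():
    print(label, {name: round(value, 2) for name, value in scores.items()})
```

To use it with the evaluation loop, you would collect the predicted and true labels from each batch into two Python lists and pass them to the function.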

Conclusion

Sentiment analysis using ELECTRA has potential for a wide range of applications in business, marketing, and social media analysis. This article described how to fine-tune a pre-trained ELECTRA model for sentiment analysis. By leveraging pre-trained models and transfer learning, you can build a sentiment classifier with relatively little labeled data and compute, although the accuracy you reach will depend on your dataset and training setup.

With the continued development and advancements in NLP and machine learning, you can expect even more accurate and efficient sentiment analysis techniques to emerge in the future.


Technical writer and developer with 13 years of work experience. Primary skills include: Data Analyst, AI/ML, Deep Learning, Python, PySpark, AWS-Cloud,