How to Fine-Tune T5 for Question Answering Tasks with Hugging Face Transformers

Fine-tuning the T5 model for question answering tasks is simple with Hugging Face Transformers: provide the model with questions and context, and it will learn to generate the correct answers.




 

T5, short for "Text-to-Text Transfer Transformer," is a powerful model built to help computers understand and generate human language. It can handle many language tasks because it treats every task as a text-to-text problem: the input is text and the output is text. In this article, we will learn how to fine-tune T5 for question answering.

 

Install the Required Libraries

 
First, we must install the necessary libraries:

pip install transformers datasets torch sentencepiece

 

  • transformers: The Hugging Face library that provides the T5 model and other transformer architectures
  • datasets: A library for accessing and processing datasets
  • torch: A deep learning library that helps build and train neural networks
  • sentencepiece: The subword tokenization library that the T5 tokenizer relies on

 

Load the Dataset

 
For fine-tuning T5 for question answering, we will use the BoolQ dataset, which contains question-answer pairs where the answers are binary (yes/no). You can load the BoolQ dataset using Hugging Face’s datasets library.

from datasets import load_dataset

# Load the BoolQ dataset
dataset = load_dataset("boolq")

# Display the first few rows of the dataset
print(dataset['train'].to_pandas().head())

 

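Each BoolQ example has three fields: a question, a supporting passage, and a boolean answer. Here is a small optional sketch, just for inspection, that prints the dataset splits and one raw example:

# Inspect the dataset splits and a single raw example
print(dataset)                    # train and validation splits with their sizes
example = dataset["train"][0]
print(example["question"])        # the yes/no question
print(example["passage"][:200])   # the supporting passage (truncated for display)
print(example["answer"])          # True or False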
 

Preprocessing the Data

 
T5 requires its input in a specific format, so we need to transform the dataset so that both the questions and answers are plain text. Each input will take the form "Question: <question>  Passage: <passage>", and the target output will be the answer text ("true" or "false"). Next, we load the T5 model and its tokenizer; the tokenizer converts our text into token IDs that the model can understand. Finally, we tokenize the inputs and targets, which produces the input IDs, attention masks, and labels needed for training.

from transformers import T5Tokenizer, T5ForConditionalGeneration, Trainer, TrainingArguments

# Initialize the T5 tokenizer and model (T5-small in this case)
tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Preprocessing the dataset: Prepare input-output pairs for T5
def preprocess_function(examples):
    inputs = [f"Question: {question}  Passage: {passage}" for question, passage in zip(examples['question'], examples['passage'])]
    targets = ['true' if answer else 'false' for answer in examples['answer']]
    
    # Tokenize inputs and outputs
    model_inputs = tokenizer(inputs, max_length=512, truncation=True, padding='max_length')
    labels = tokenizer(targets, max_length=10, truncation=True, padding='max_length')
    model_inputs["labels"] = labels["input_ids"]
    
    return model_inputs

# Preprocess the dataset
tokenized_dataset = dataset.map(preprocess_function, batched=True)
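To confirm that preprocessing produced the format we expect, we can decode one tokenized example back into text. This is an optional check, shown here as a small sketch:

# Decode the first tokenized training example to verify the input and target format
sample = tokenized_dataset["train"][0]
print(tokenizer.decode(sample["input_ids"], skip_special_tokens=True))   # "Question: ...  Passage: ..."
print(tokenizer.decode(sample["labels"], skip_special_tokens=True))      # "true" or "false"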

 

Fine-Tuning T5

 
Now that our data is prepared, we can fine-tune the T5 model. Hugging Face’s Trainer API simplifies this process by handling the training loop, optimization, and evaluation.

from transformers import Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir="./results",
    evaluation_strategy="epoch",
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=3,
    weight_decay=0.01,
    logging_dir="./logs",
    logging_steps=10,
)

# Initialize the Trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_dataset["train"],
    eval_dataset=tokenized_dataset["validation"],
)

# Fine-tune the model
trainer.train()
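The Trainer only writes periodic checkpoints under output_dir, so to reload the final model from ./results later (as we do in the prediction section below), it is safest to save the fine-tuned weights and the tokenizer explicitly once training finishes:

# Save the fine-tuned model and tokenizer for later use
trainer.save_model("./results")
tokenizer.save_pretrained("./results")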

 

Evaluating the Model

 
After fine-tuning, it’s important to evaluate the model on the validation set to see how well it answers questions. You can use the evaluate method of the Trainer.

# Evaluate the model on the validation dataset
eval_results = trainer.evaluate()

# Print the evaluation results
print(f"Evaluation results: {eval_results}")

 

Evaluation results: {'eval_loss': 0.03487783297896385, 'eval_runtime': 37.2638, 'eval_samples_per_second': 87.753, 'eval_steps_per_second': 10.976, 'epoch': 3.0}
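The eval_loss above shows that the model fits the validation data, but it does not directly tell us how often the answers are correct. One way to estimate accuracy is to generate answers for a sample of validation examples and compare them with the gold labels; the sketch below uses an arbitrary sample of 100 examples:

import torch

# Estimate true/false accuracy on a small sample of the validation set
model.eval()
sample = dataset["validation"].select(range(100))

correct = 0
for example in sample:
    text = f"Question: {example['question']}  Passage: {example['passage']}"
    input_ids = tokenizer(text, return_tensors="pt", truncation=True, max_length=512).input_ids.to(model.device)
    with torch.no_grad():
        output_ids = model.generate(input_ids, max_new_tokens=5)
    prediction = tokenizer.decode(output_ids[0], skip_special_tokens=True).strip().lower()
    target = "true" if example["answer"] else "false"
    correct += int(prediction == target)

print(f"Accuracy on sample: {correct / len(sample):.2%}")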

 

Making Predictions

 
Once the T5 model is fine-tuned and evaluated, we can use it to answer new questions. To do this, we prepare a new input (question and passage), tokenize it, and generate the answer with the model.

from transformers import T5Tokenizer, T5ForConditionalGeneration

# Load the fine-tuned model and tokenizer (the model was saved to ./results after training)
model = T5ForConditionalGeneration.from_pretrained("./results")
tokenizer = T5Tokenizer.from_pretrained("t5-small")

# Prepare a new input
input_text = "Question: Is the sky blue?  Passage: The sky is blue on a clear day."

# Tokenize the input
input_ids = tokenizer(input_text, return_tensors="pt").input_ids

# Generate the answer using the model
output_ids = model.generate(input_ids)

# Decode the generated tokens to get the predicted answer
predicted_answer = tokenizer.decode(output_ids[0], skip_special_tokens=True)

# Print the predicted answer
print(f"Predicted answer: {predicted_answer}")  # Predicted answer: yes

 

Conclusion

 
In conclusion, fine-tuning makes T5 better at answering questions. We learned how to prepare the BoolQ data, train the model, evaluate it, and use it for predictions, and the Hugging Face Transformers library made each step straightforward. After training, T5 can read a question together with its passage and produce the correct answer, which is useful in applications such as chatbots and search engines.
 
 

Jayita Gulati is a machine learning enthusiast and technical writer driven by her passion for building machine learning models. She holds a Master's degree in Computer Science from the University of Liverpool.

