How To Generate Meaningful Sentences Using a T5 Transformer

Read this article to see how to develop a text generation API using the T5 transformer.



By Vatsal Saglani, Machine Learning Engineer at Quinnox



Photo by Tech Daily on Unsplash

 

In the blog Generating storylines using a T5 Transformer, we saw how to fine-tune a Sequence2Sequence (Text-To-Text) Transformer (T5) to generate storylines/plots from inputs like genre, director, cast, and ethnicity. In this blog, we will see how to use that trained T5 model for inference. Later, we will also see how we can deploy it using gunicorn and Flask.

 

How to do Model Inference?

 

  • Let’s set up the script with the imports
# only these are needed for inference
import torch
from transformers import T5Tokenizer, T5ForConditionalGeneration


  • Set the SEED value and load the model and tokenizer
torch.manual_seed(3007)

model = T5ForConditionalGeneration.from_pretrained('./outputs/model_files')
tokenizer = T5Tokenizer.from_pretrained('./outputs/model_files')
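If a GPU is available, you can move the model there before generating; a minimal sketch (the device variable is an addition, not part of the original script):

# optional: run inference on a GPU when one is available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)
model.eval()

If you do this, remember to also move input_ids to the same device (input_ids.to(device)) before calling model.generate.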


  • Use the model.generate function to generate sequences
text = "generate plot for genre: horror"
input_ids = tokenizer.encode(text, return_tensors="pt")
greedyOp = model.generate(input_ids, max_length=100)
tokenizer.decode(greedyOp[0], skip_special_tokens=True)


Note: Read this amazing Hugging Face blog on how to use different decoding strategies for text generation with Transformers
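For instance, the same model.generate call also supports beam search and top-k/top-p sampling. A quick sketch of both, reusing the input_ids from above (the parameter values here are illustrative, not tuned):

# beam search: keeps the num_beams most likely sequences at each step
beamOp = model.generate(input_ids, max_length=100, num_beams=5, early_stopping=True)
print(tokenizer.decode(beamOp[0], skip_special_tokens=True))

# top-k / top-p (nucleus) sampling: samples from a truncated distribution
sampleOp = model.generate(
    input_ids,
    max_length=100,
    do_sample=True,
    top_k=50,
    top_p=0.92,
)
print(tokenizer.decode(sampleOp[0], skip_special_tokens=True))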

  • Let’s put this in a function
def generateStoryLine(text, seq_len, seq_num):
    '''
    args:
        text: input text, e.g. generate plot for: {genre} or generate plot for: {director}
        seq_len: max sequence length for the generated text
        seq_num: number of sequences to generate
    '''
    outputDict = dict()
    outputDict["plots"] = {}
    input_ids = tokenizer.encode(text, return_tensors="pt")
    # sample multiple candidate plots with top-k / top-p sampling
    beamOp = model.generate(
        input_ids,
        max_length=seq_len,
        do_sample=True,
        top_k=100,
        top_p=0.95,
        num_return_sequences=seq_num
    )
    for ix, sample_op in enumerate(beamOp):
        outputDict["plots"][ix] = tokenizer.decode(sample_op, skip_special_tokens=True)

    return outputDict
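Calling it looks like this (the genre value is just an example):

plots = generateStoryLine("generate plot for genre: horror", seq_len=100, seq_num=3)
for ix, plot in plots["plots"].items():
    print(ix, plot)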


 

How to deploy this with Flask?

 
There are multiple combinations of inputs a user can provide, and the model needs to generate plots for each. The user can provide only the genre, the genre and cast, or even all four inputs, i.e. genre, director, cast, and ethnicity. For the purpose of this implementation, I have made the genre mandatory at a minimum.

You can check out the link below to see how the API will work.

Movie Plot Generator

 

Let’s develop the backend that serves the API call behind the link above

 

Install the requirements

 

pip install flask flask_cors tqdm rich gunicorn

(torch and transformers should already be installed from the model-training step.)


 

Create an app.py file

 

  • Imports
# app.py
from flask import Flask, request, jsonify
import json
from flask_cors import CORS
import uuid

from predict import PredictionModelObject

app = Flask(__name__)
CORS(app)

print("Loading Model Object")
predictionObject = PredictionModelObject()
print("Loaded Model Object")


  • Add an API route
@app.route('/api/generatePlot', methods=['POST'])
def gen_plot():
    req = request.get_json()
    genre = req['genre']
    director = req['director'] if 'director' in req else None
    cast = req['cast'] if 'cast' in req else None
    ethnicity = req['ethnicity'] if 'ethnicity' in req else None
    num_plots = req['num_plots'] if 'num_plots' in req else 1
    seq_len = req['seq_len'] if 'seq_len' in req else 200

    if not isinstance(num_plots, int) or not isinstance(seq_len, int):
        return jsonify({
            "message": "Number of words in plot and Number of plots must be integers",
            "status": "Fail"
        })

    try:
        plot, status = predictionObject.returnPlot(
            genre=genre,
            director=director,
            cast=cast,
            ethnicity=ethnicity,
            seq_len=seq_len,
            seq_num=num_plots
        )

        if status == 'Pass':
            plot["message"] = "Success!"
            plot["status"] = "Pass"
            return jsonify(plot)
        else:
            return jsonify({"message": plot, "status": status})

    except Exception:
        return jsonify({"message": "Error getting plot for the given input", "status": "Fail"})
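On success, the route returns the generated plots merged with the status fields; the response looks roughly like this (the plot text is a placeholder):

{
    "plots": {"0": "...generated plot text..."},
    "message": "Success!",
    "status": "Pass"
}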


  • The main block to run the flask app
if __name__ == "__main__":
    app.run(debug=True, port=5000)


This script won’t work yet. You will receive an ImportError if you execute it at this point, as we haven’t yet created the predict.py script with the PredictionModelObject.

 

Create the PredictionModelObject

 

  • Create a predict.py file and import the following
# predict.py
import os
import re
import random
import torch
import torch.nn as nn
from rich.console import Console
from transformers import T5Tokenizer, T5ForConditionalGeneration
from collections import defaultdict

console = Console(record=True)

torch.cuda.manual_seed(3007)
torch.manual_seed(3007)


  • Create the PredictionModelObject class
# predict.py
class PredictionModelObject(object):

    def __init__(self):
        console.log("Model Loading")
        self.model = T5ForConditionalGeneration.from_pretrained('./outputs/model_files')
        self.tokenizer = T5Tokenizer.from_pretrained('./outputs/model_files')
        console.log("Model Loaded")

    def beamSearch(self, text, seq_len, seq_num):
        outputDict = dict()
        outputDict["plots"] = {}
        input_ids = self.tokenizer.encode(text, return_tensors="pt")
        beamOp = self.model.generate(
            input_ids,
            max_length=seq_len,
            do_sample=True,
            top_k=100,
            top_p=0.95,
            num_return_sequences=seq_num
        )
        for ix, sample_op in enumerate(beamOp):
            outputDict["plots"][ix] = self.tokenizer.decode(sample_op, skip_special_tokens=True)

        return outputDict

    def genreToPlot(self, genre, seq_len, seq_num):
        text = f"generate plot for genre: {genre}"
        return self.beamSearch(text, seq_len, seq_num)

    def genreDirectorToPlot(self, genre, director, seq_len, seq_num):
        text = f"generate plot for genre: {genre} and director: {director}"
        return self.beamSearch(text, seq_len, seq_num)

    def genreDirectorCastToPlot(self, genre, director, cast, seq_len, seq_num):
        text = f"generate plot for genre: {genre} director: {director} cast: {cast}"
        return self.beamSearch(text, seq_len, seq_num)

    def genreDirectorCastEthnicityToPlot(self, genre, director, cast, ethnicity, seq_len, seq_num):
        text = f"generate plot for genre: {genre} director: {director} cast: {cast} and ethnicity: {ethnicity}"
        return self.beamSearch(text, seq_len, seq_num)

    def genreCastToPlot(self, genre, cast, seq_len, seq_num):
        text = f"generate plot for genre: {genre} and cast: {cast}"
        return self.beamSearch(text, seq_len, seq_num)

    def genreEthnicityToPlot(self, genre, ethnicity, seq_len, seq_num):
        text = f"generate plot for genre: {genre} and ethnicity: {ethnicity}"
        return self.beamSearch(text, seq_len, seq_num)

    def returnPlot(self, genre, director, cast, ethnicity, seq_len, seq_num):
        console.log('Got genre: ', genre, 'director: ', director, 'cast: ', cast, 'seq_len: ', seq_len, 'seq_num: ', seq_num, 'ethnicity: ', ethnicity)

        seq_len = 200 if not seq_len else int(seq_len)
        seq_num = 1 if not seq_num else int(seq_num)

        if not director and not cast and not ethnicity:
            return self.genreToPlot(genre, seq_len, seq_num), "Pass"
        elif genre and director and not cast and not ethnicity:
            return self.genreDirectorToPlot(genre, director, seq_len, seq_num), "Pass"
        elif genre and director and cast and not ethnicity:
            return self.genreDirectorCastToPlot(genre, director, cast, seq_len, seq_num), "Pass"
        elif genre and director and cast and ethnicity:
            return self.genreDirectorCastEthnicityToPlot(genre, director, cast, ethnicity, seq_len, seq_num), "Pass"
        elif genre and cast and not director and not ethnicity:
            return self.genreCastToPlot(genre, cast, seq_len, seq_num), "Pass"
        elif genre and ethnicity and not director and not cast:
            return self.genreEthnicityToPlot(genre, ethnicity, seq_len, seq_num), "Pass"
        else:
            return "Genre cannot be empty", "Fail"
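Before wiring the class into Flask, you can sanity-check it from a Python shell in the same folder (the genre value is just an example):

# quick local check of the prediction object
from predict import PredictionModelObject

predictionObject = PredictionModelObject()
plots, status = predictionObject.returnPlot(
    genre="horror", director=None, cast=None, ethnicity=None,
    seq_len=100, seq_num=1
)
print(status, plots)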


Save the predict.py file and then run the app.py file in debug mode using:

python app.py


 

Test your API

 

  • Create a test_api.py file and execute it
# test_api.py
import requests
url = "http://localhost:5000/api/generatePlot"

# named payload instead of json to avoid shadowing the json module
payload = {
    "genre": str(input("Genre: ")),
    "director": str(input("Director: ")),
    "cast": str(input("Cast: ")),
    "ethnicity": str(input("Ethnicity: ")),
    "num_plots": int(input("Num Plots: ")),
    "seq_len": int(input("Sequence Length: ")),
}

r = requests.post(url, json=payload)
print(r.json())
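Since only the genre is mandatory, a minimal payload also works; for example:

# genre-only request; the other fields default to None on the server
r = requests.post(url, json={"genre": "horror", "num_plots": 2})
print(r.json())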


 

How to run with gunicorn?

 
Using gunicorn with Flask is very easy. We already installed gunicorn with the requirements at the start; now navigate to the folder containing the app.py file in a terminal and run the following command

gunicorn -k gthread -w 2 -t 40000 --threads 3 -b :5000 app:app


The flags used above mean the following

  • -k: worker class (type of workers), e.g. gthread, gevent, etc.
  • -w: number of worker processes
  • -t: worker timeout in seconds
  • --threads: number of threads per worker
  • -b: the address/port to bind to

If your file is named server.py or flask_app.py, the app:app part changes to server:app or flask_app:app respectively.
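The same settings can also live in a gunicorn config file instead of on the command line; a minimal sketch (the gunicorn_conf.py filename is an assumption):

# gunicorn_conf.py -- equivalent to the command-line flags above
worker_class = "gthread"  # -k
workers = 2               # -w
timeout = 40000           # -t
threads = 3               # --threads
bind = ":5000"            # -b

You would then start the server with gunicorn -c gunicorn_conf.py app:app.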

 

In Summary

 
In this blog, we saw how to use our previously trained T5 transformer to generate storylines and deploy it as an API using Flask and gunicorn. This blog is meant to be easy to follow end to end, so you don't waste time hunting across different platforms for fixes to common issues. I hope you have fun reading and implementing this.

 
Bio: Vatsal Saglani (@saglanivatsal) is a Machine Learning Engineer at Quinnox.

Original. Reposted with permission.
