Progress Bars in Python with tqdm for Fun and Profit

Add progress bars to Python functions, Jupyter Notebook, and pandas dataframes.



Gif by Author

 

tqdm

 

In Arabic, taqadum (tqdm) means progress. The tqdm package creates a smart progress bar for loops: you just wrap any iterable with tqdm(iterable).

tqdm can help you create progress bars for data processing, training machine learning models, multi-loop Python functions, and downloading data from the internet.

Install the package using pip:

pip install tqdm


Copy and paste the code below, and run it on your machine to experience the tqdm magic firsthand.

tqdm shows the completion percentage, the progress bar, the number of iterations, the elapsed and estimated remaining time, and the rate in iterations per second.

from tqdm import tqdm

for i in tqdm(range(10000)):
    pass


100%|████████████████████████████████████████| 10000/10000 [00:00<00:00, 1764759.54it/s]


In this tutorial, we will learn to customize the bar and integrate it with the pandas dataframe. We will also learn about additional functionality, such as the tqdm.contrib.concurrent module for parallel processing.

 

Progress Bars in Python Tutorial

 

tqdm and Python function

 

In the example below, we have created a fun function that takes an integer x and waits x seconds before returning.

Then, we wrapped tqdm around the range function to run a loop for 10 iterations, from 0 to 9.

The first iteration takes zero seconds to run, the second takes 1 second, and so on. The loop took 45 seconds to complete (0 + 1 + ... + 9), and we got to experience an animated progress bar.

Awesome! 
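The original code embed is not reproduced here, so the following is a minimal sketch of what the description above suggests. The function name fun, the helper run_demo, and the unit parameter (which scales the delay so the demo can run quickly) are assumptions made for illustration.

```python
import time

from tqdm import tqdm


def fun(x, unit=1.0):
    """Wait x * unit seconds, then return x (unit is an illustrative knob)."""
    time.sleep(x * unit)
    return x


def run_demo(n=10, unit=1.0):
    """Call fun(0), fun(1), ..., fun(n - 1) behind a tqdm progress bar."""
    results = []
    for i in tqdm(range(n)):
        results.append(fun(i, unit))
    return results


# unit=0.01 keeps the demo under a second.
# unit=1.0 gives the 45-second run described above (0 + 1 + ... + 9).
results = run_demo(unit=0.01)
```

Because each iteration takes longer than the previous one, the rate shown by the bar drops as the loop advances.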



 

tqdm.notebook on the list

 

In this part, we will use the tqdm.notebook module to show progress bars in Jupyter Notebook using IPython widgets.

First, create a simple list of different colors. Then, use a loop to print the names one by one with a one-second delay.

We have wrapped tqdm around the list, and it displays a multicolor progress bar.

Amazing!
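Here is a rough sketch of the steps above. It imports from tqdm.auto rather than tqdm.notebook directly: tqdm.auto selects the notebook widget bar when it runs inside Jupyter with ipywidgets available, and falls back to the plain text bar elsewhere, so the same snippet also runs in a terminal. The color values, the show_colors helper, and the delay parameter are assumptions for illustration.

```python
import time

# tqdm.auto uses the tqdm.notebook widget inside Jupyter and the text bar otherwise.
from tqdm.auto import tqdm

# Sample values; the article's actual list is not shown.
colors = ["Blue", "Green", "Yellow", "White", "Gray", "Black"]


def show_colors(names, delay=1.0):
    """Print each name, pausing `delay` seconds between items."""
    shown = []
    for name in tqdm(names):
        print(name)
        time.sleep(delay)
        shown.append(name)
    return shown


# delay=0.1 keeps the demo short; delay=1.0 matches the one-second pause above.
shown = show_colors(colors, delay=0.1)
```

In a notebook, replacing the import with `from tqdm.notebook import tqdm` forces the widget bar explicitly.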

 


 

Multiple progress bars

 

Let’s create multi-loop progress bars to mimic machine learning model training.

  1. trange is a shortcut for wrapping tqdm around the range function: trange(n) is equivalent to tqdm(range(n)).
  2. The outer loop runs for 10 iterations with a 0.01-second delay.
  3. desc labels the progress bar. The label is displayed before the bar.
  4. The inner loop runs for 10,000 iterations with a 0.001-second delay.

As we can observe, the animation of multiple progress bars looks amazing. To understand it better, I would like you to copy the code, modify it, and run it on your machine to experience the magic. 
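The four points above can be sketched as follows. The train function, its parameters, the bar labels, and the use of leave=False on the inner bar are assumptions made for illustration, not the article's exact code.

```python
import time

from tqdm import trange


def train(epochs=10, steps=10_000, outer_delay=0.01, inner_delay=0.001):
    """Run a nested loop with one progress bar per level; return steps completed."""
    completed = 0
    for _ in trange(epochs, desc="1st loop"):
        time.sleep(outer_delay)
        # leave=False (an assumption) clears each finished inner bar,
        # so only the current inner bar is shown under the outer one.
        for _ in trange(steps, desc="2nd loop", leave=False):
            time.sleep(inner_delay)
            completed += 1
    return completed


# Small sizes keep the demo short; the defaults match the sizes listed above.
total = train(epochs=3, steps=100)
```

Calling train() with its defaults reproduces the 10 by 10,000 loop, which takes well over a minute because of the inner delay.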

 


 

tqdm for Data Science

 

In this part, we will integrate tqdm with the pandas dataframe and use progress_apply to apply functions to the dataframe with a progress bar.

 

Loading the Dataset

 

First, we will load the Hotel Booking dataset from Kaggle. Then, we will display the top five rows of the dataframe.  

The dataset contains 119,390 observations for city hotel and resort hotel bookings between the 1st of July 2015 and the 31st of August 2017, including bookings that effectively arrived and bookings that were canceled.

You can scroll to the right to see the values and column names.



 

Apply with tqdm

 

In this part, we will create a new column “user_name” using the customer's name.

  1. Call tqdm.pandas() to register progress bars for pandas dataframes. We also add the bar label “Processing the name column”.
  2. The user_name function lowercases the string and replaces each space with “-”.
  3. Apply the function to the dataframe using .progress_apply(). It works like the apply() function. For the map() function, you can use .progress_map().
  4. Display the top three rows.

If you scroll to the right, you will see a new column user_name with values.
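A minimal sketch of these four steps follows. Since the Kaggle file is not bundled here, it uses a tiny stand-in dataframe with made-up names, and the column name "name" is an assumption about the dataset.

```python
import pandas as pd
from tqdm import tqdm

# Stand-in data with made-up names; the article uses the Hotel Booking dataset.
df = pd.DataFrame({"name": ["Ada Lovelace", "Alan Turing", "Grace Hopper"]})

# 1. Register progress_apply / progress_map on pandas objects, with a bar label.
tqdm.pandas(desc="Processing the name column")


# 2. Lowercase the name and replace spaces with hyphens.
def user_name(name):
    return name.lower().replace(" ", "-")


# 3. Same call shape as Series.apply, but with a progress bar.
df["user_name"] = df["name"].progress_apply(user_name)

# 4. Display the top three rows.
print(df.head(3))
```

On three rows the bar finishes instantly; it becomes useful on a dataframe the size of the booking data.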



 

Parallel Processing Utility  

 

tqdm is not just a progress bar for loops. It also offers utilities such as tqdm.contrib.concurrent for parallel processing. 

In this section, we will extract the email provider from the email column. 

  1. Import process_map from tqdm.contrib.concurrent.
  2. The provider_extraction function splits the text on “@” and then on “.”
  3. Use process_map to map the function over df["email"]. We set max_workers to 8, based on the number of CPU cores, and chunksize to 64.
  4. Add the progress bar label and customize it to display a green progress bar instead of a black one.
  5. View the top 5 values of the "email_provider" column.

Success! We have extracted the email provider with a green progress bar. Isn't this awesome? 
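The steps above might look like the sketch below. To stay self-contained, it maps over a plain list of made-up addresses instead of df["email"], and the bar label is an assumption. The `if __name__ == "__main__":` guard matters on platforms that start worker processes by re-importing the script.

```python
from tqdm.contrib.concurrent import process_map


def provider_extraction(email):
    """Return the text between "@" and the first "." that follows it."""
    return email.split("@")[1].split(".")[0]


if __name__ == "__main__":
    # Made-up addresses standing in for the dataset's email column.
    emails = ["ada@gmail.com", "alan@yahoo.com", "grace@outlook.com"] * 100

    providers = process_map(
        provider_extraction,
        emails,
        max_workers=8,  # the article picks 8 based on its CPU count
        chunksize=64,  # items sent to each worker per batch
        desc="Extracting email provider",  # illustrative label
        colour="green",  # bar color
    )
    print(providers[:5])
```

process_map returns a list in input order, so with a dataframe the result can be assigned straight to a new column such as df["email_provider"].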



 

Conclusion

 

Playing with tqdm and sharing my experience with you was fun. Beyond the fun, tqdm provides functionality that is genuinely useful in software development.

I have used GitHub gist, Deepnote embeds, and Kaggle to add code blocks and GIFs. You can check out these tools and create your magic with tqdm.

In this blog, we have learned about tqdm on Python loops, lists, multi-level progress bars, pandas integration, and parallel processing with the tqdm.contrib.concurrent module.

Read the documentation to learn about additional functionalities:

  • Asynchronous
  • Callbacks (Dask, Keras)
  • Decorator for iterators (Tkinter, Matplotlib)
  • CLI (terminal, console)
  • Thin wrappers (concurrent, itertools)
  • Logging
  • Sending updates (Slack, Discord, Telegram)

 
 
Abid Ali Awan (@1abidaliawan) is a certified data scientist professional who loves building machine learning models. Currently, he is focusing on content creation and writing technical blogs on machine learning and data science technologies. Abid holds a Master's degree in Technology Management and a bachelor's degree in Telecommunication Engineering. His vision is to build an AI product using a graph neural network for students struggling with mental illness.