TabPy: Combining Python and Tableau

This article demonstrates how to get started using Python in Tableau.



By Bima Putra Pratama, Data Scientist

Figure: Photo by Paweł Czerwiński on Unsplash

 

Can we integrate the power of Python calculations with Tableau?

 

That question encouraged me to start exploring the possibility of using Python calculations in Tableau, and I ended up with TabPy.

So, what is TabPy? How can we use TabPy to integrate Python and Tableau?

In this article, I will introduce TabPy and go through an example of how we can use it.

 

TabPy Introduction

 
TabPy is an Analytics Extension from Tableau that lets users execute Python scripts and saved functions from within Tableau. With TabPy, Tableau can run Python scripts on the fly and display the results as a visualization. Users can control the data sent to TabPy by interacting with parameters in their Tableau worksheets, dashboards, or stories.
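To make "on the fly" a little more concrete, here is a minimal sketch of the kind of request Tableau sends to TabPy. It assumes TabPy's REST endpoint /evaluate, which takes a JSON body with a "script" string and a "data" object whose keys are _arg1, _arg2, and so on. The helper name build_evaluate_payload is hypothetical and exists only for this illustration.

```python
import json


def build_evaluate_payload(script, *columns):
    """Build a JSON body in the shape TabPy's /evaluate endpoint expects.

    Each positional column is exposed to the script as _arg1, _arg2, ...
    (build_evaluate_payload is a hypothetical helper, not part of TabPy.)
    """
    data = {f"_arg{i}": list(col) for i, col in enumerate(columns, start=1)}
    return json.dumps({"data": data, "script": script})


# The script refers to the first column as _arg1 and returns one value per row.
payload = build_evaluate_payload("return [x * 2 for x in _arg1]", [1, 2, 3])
print(payload)
```

In practice Tableau builds and sends this request for you; the sketch only shows why a script refers to its inputs as _arg1, _arg2, and so on.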

You can read more about TabPy in the official GitHub repository:

tableau/TabPy: Execute Python code on the fly and display results in Tableau visualizations.
 

 

Installing TabPy

 
I assume you already have Python installed on your system. If you don’t, download the installer from https://www.python.org/ and install it first.
Next, we can install TabPy as a Python package using pip:

pip install tabpy


 

Running TabPy

 
Once the installation succeeds, we can start the service with the following command:

tabpy


If all goes well, you should see this:

Figure: Running TabPy. Image by Author

 

By default, the service runs on localhost on port 9004. You can verify this by opening it in your web browser.
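The same check can be scripted. The sketch below assumes the default host and port mentioned above and TabPy's /info endpoint, which returns server details as JSON. The function names are hypothetical.

```python
import json
import urllib.error
import urllib.request


def tabpy_info_url(host="localhost", port=9004):
    """Build the URL of TabPy's server-info endpoint."""
    return f"http://{host}:{port}/info"


def fetch_tabpy_info(host="localhost", port=9004, timeout=2.0):
    """Return the parsed /info response, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(tabpy_info_url(host, port), timeout=timeout) as resp:
            return json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a response that is not valid JSON.
        return None


if __name__ == "__main__":
    info = fetch_tabpy_info()
    print("TabPy is running" if info else "TabPy is not reachable", info)
```

If the function returns None, check that the tabpy command is still running and that nothing else is using the port.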

Figure: TabPy Server Info. Image by Author

 

 

Enabling TabPy

 
Now, let’s go to Tableau and set up the service. I am using Tableau Desktop version 2020.3.0. The steps should be much the same in earlier versions, although the menu item may be named differently.

First, go to Help, then choose Settings and Performance and select Manage Analytics Extension Connection.

Figure: Analytics Extension Connection Location. Image by Author

 

Then set the Server and Port; with the defaults above, these are localhost and 9004. You can leave "Sign in with a username and password" unchecked, because we did not set up credentials for our TabPy service.

 

Once done, click Test Connection. If the connection succeeds, Tableau displays a confirmation message.
 

Congratulations! Tableau is now connected to TabPy and ready to use.

 

Using TabPy

 
There are two ways to run Python calculations:

  • Write the code directly in a Tableau calculated field. The code is then executed on the fly on the TabPy server.
  • Deploy a function to the TabPy server, where it becomes reachable as a REST API endpoint.
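For orientation only, the second option can be sketched with the client that ships with TabPy. This assumes TabPy is installed and running at the default address. The function double_values and the wrapper deploy_to_tabpy are hypothetical examples, not part of this article's walkthrough.

```python
def double_values(values):
    """Toy function to publish: doubles every number it receives."""
    return [v * 2 for v in values]


def deploy_to_tabpy(url="http://localhost:9004/"):
    """Publish double_values as a named endpoint on a running TabPy server."""
    # Imported lazily so this file still loads where TabPy is not installed.
    from tabpy.tabpy_tools.client import Client

    client = Client(url)
    # override=True replaces an endpoint of the same name if it already exists.
    client.deploy("double_values", double_values, "Doubles every input value", override=True)


# Call deploy_to_tabpy() once the TabPy service is up.
```

A calculated field could then call the endpoint by name from inside its script, for example with tabpy.query('double_values', _arg1)['response'].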

In this article, I will only show the first method, in which we write the code directly in a Tableau calculated field.

As an example, we will perform clustering on the Airbnb dataset that is publicly available for download through the Tableau site. We will cluster the zip codes based on their housing characteristics using several popular clustering algorithms.

 

Step 1 Importing Data

 
In the first step, let’s import our dataset into Tableau. This dataset has 13 columns.

As our primary goal is to see how to use TabPy, we will not focus on building the best possible model. Thus, we will use only the following variables from this dataset for clustering:

  • The median number of beds in each zip code
  • The average price in each zip code
  • The median number of ratings in each zip code
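Tableau computes these aggregates itself before anything reaches TabPy. The sketch below only illustrates the shape of the result: one value per zip code for each variable, as parallel lists. The rows, column names, and zip codes are made up for illustration and are not taken from the Airbnb dataset.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical listing rows; the real dataset has 13 columns.
listings = [
    {"zipcode": "10001", "beds": 1, "price": 120.0, "ratings": 10},
    {"zipcode": "10001", "beds": 3, "price": 180.0, "ratings": 30},
    {"zipcode": "10002", "beds": 2, "price": 90.0, "ratings": 5},
]


def aggregate_by_zipcode(rows):
    """Return parallel lists: zip codes, median beds, mean price, median ratings."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["zipcode"]].append(row)
    zipcodes = sorted(groups)
    beds = [median(r["beds"] for r in groups[z]) for z in zipcodes]
    price = [mean(r["price"] for r in groups[z]) for z in zipcodes]
    ratings = [median(r["ratings"] for r in groups[z]) for z in zipcodes]
    return zipcodes, beds, price, ratings


zipcodes, beds, price, ratings = aggregate_by_zipcode(listings)
```

Each of the three value lists has one entry per zip code, which is the form in which the clustering code receives its inputs.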

 

Step 2 Create Control Parameters

 
We need to create two parameters that will be used to select our clustering method and number of clusters, which are:

  • Cluster Numbers
  • Clustering Algorithm

Figure: Create a Parameter. Image by Author

 

Figure: Cluster Numbers Parameter. Image by Author

 

Figure: Clustering Algorithm Parameter. Image by Author

 

 

Step 3 Create a Script

 
We will create a Python script as a calculated field in Tableau.

Figure: Create a calculated field. Image by Author

 

You can then insert the clustering script into the calculated field.

The code is wrapped in Tableau’s SCRIPT_REAL() function and does the following:

  • Imports the required Python libraries.
  • Scales the features with a standard scaler.
  • Combines the scaled features and handles null values.
  • Checks which algorithm is selected and runs the corresponding clustering.
  • Returns the clustering results as a list.
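The script itself is not reproduced in this text, so the following is only a minimal sketch of the Python logic these steps describe, not the author's code. Several things are assumptions: the function names, the algorithm labels "KMeans" and "Agglomerative", and the choice of those two algorithms. The scaling is written out by hand in plain Python in place of scikit-learn's StandardScaler so that the null handling is visible. scikit-learn must be installed in the environment that runs TabPy.

```python
from statistics import mean, pstdev


def standard_scale(values):
    """Scale values to zero mean and unit variance, treating None as missing.

    Missing entries become 0.0, which is the mean after scaling.
    """
    present = [v for v in values if v is not None]
    if not present:
        return [0.0] * len(values)
    mu = mean(present)
    sigma = pstdev(present) or 1.0  # avoid dividing by zero for constant columns
    return [0.0 if v is None else (v - mu) / sigma for v in values]


def cluster_zipcodes(beds, price, ratings, n_clusters, algorithm):
    """Return one cluster label per zip code (hypothetical helper)."""
    # Imported here so the scaling helper works without scikit-learn.
    from sklearn.cluster import AgglomerativeClustering, KMeans

    # Combine the scaled columns into one feature row per zip code.
    features = list(zip(standard_scale(beds), standard_scale(price), standard_scale(ratings)))

    # "KMeans" and "Agglomerative" are assumed values of the algorithm parameter.
    if algorithm == "KMeans":
        model = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
    else:
        model = AgglomerativeClustering(n_clusters=n_clusters)
    return model.fit_predict(features).tolist()
```

Inside SCRIPT_REAL(), the three measures and the two parameters would arrive as _arg1 to _arg5. Tableau passes every argument as a list, so the parameter values would be read as _arg4[0] and _arg5[0].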

We then convert the results to the String data type so that Tableau treats them as categorical data.

One more thing to note: the table calculation must be computed along Zipcode, so we need to change the Default Table Calculation to Zipcode for this code to work.

Figure: Change Default Table Calculation. Image by Author.

 

 

Step 4 Visualize Results

 
Now, it’s time to visualize the results. I use Zipcode to create a map of the clustering results. We can use the Cluster Numbers parameter to change the number of clusters.

 

Wrap Up

 

Figure: Photo by Elisha Terada on Unsplash

 

Let’s celebrate reaching this point! If you followed the steps, you have successfully integrated Python and Tableau. This integration is a first step toward more advanced use cases that combine Tableau and Python.

I’m looking forward to seeing what you build with this integration!

 

About the Author

 
Bima Putra Pratama is a Data Scientist with Tableau Desktop Specialist Certification who is always eager to expand his knowledge and skills. He graduated as a Mining Engineer and began his Data Science journey through various online programs from HarvardX, IBM, Udacity, etc. Currently, he is making an impact together with DANA Indonesia in building a cashless society in Indonesia.

If you have any feedback or topics you would like to discuss, please reach out to Bima via LinkedIn. I’m happy to connect with you!

 


 

 
Original. Reposted with permission.
