10 Useful Python Data Visualization Libraries for Any Discipline

A great overview of 10 useful Python data visualization tools. It covers some of the big ones, like matplotlib and Seaborn, but also explores some more obscure libraries, like Gleam, Leather, and missingno.

By Melissa Bierly, Mode Analytics.

Scroll through the Python Package Index and you’ll find libraries for practically every data visualization need—from GazeParser for eye movement research to pastalog for realtime visualizations of neural network training. And while many of these libraries are intensely focused on accomplishing a specific task, some can be used no matter what your field.

Today, we’re giving an overview of 10 interdisciplinary Python data visualization libraries, from the well-known to the obscure. We’ve noted the ones you can take for a spin without the hassle of running Python locally, using Mode Python Notebooks.


Python data visualization - matplotlib

Two histograms (matplotlib)

matplotlib is the O.G. of Python data visualization libraries. Despite being over a decade old, it’s still the most widely used library for plotting in the Python community. It was designed to closely resemble MATLAB, a proprietary programming language developed in the 1980s.

Because matplotlib was the first Python data visualization library, many other libraries are built on top of it or designed to work in tandem with it during analysis. Some libraries like pandas and Seaborn are “wrappers” over matplotlib. They allow you to access a number of matplotlib’s methods with less code.
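To make the "wrapper" idea concrete, here is a minimal sketch that draws the same two histograms twice: once with matplotlib directly and once through pandas. The data is randomly generated for illustration, the column names `a` and `b` are placeholders of our own, and the `Agg` backend is an assumption chosen so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display available)
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Made-up sample data, purely for illustration.
rng = np.random.default_rng(0)
df = pd.DataFrame({"a": rng.normal(0, 1, 500), "b": rng.normal(2, 1, 500)})

# Plain matplotlib: one call per series, with the legend added by hand.
fig, ax = plt.subplots()
ax.hist(df["a"], bins=20, alpha=0.5, label="a")
ax.hist(df["b"], bins=20, alpha=0.5, label="b")
ax.legend()

# pandas wrapper: one call covers both columns and builds the legend.
wrapped_ax = df.plot.hist(bins=20, alpha=0.5)

# The wrapper hands back a regular matplotlib Axes, so matplotlib
# methods are still available for further tweaks.
wrapped_ax.set_xlabel("value")
```

Because the pandas call returns a matplotlib `Axes`, anything the wrapper doesn't expose can still be adjusted with matplotlib itself.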

While matplotlib is good for getting a sense of the data, it’s not very useful for creating publication-quality charts quickly and easily. As Chris Moffitt points out in his overview of Python visualization tools, matplotlib “is extremely powerful but with that power comes complexity.”

matplotlib has long been criticized for its default styles, which have a distinct 1990s feel. The upcoming release of matplotlib 2.0 promises many new style changes to address this problem.
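If the defaults bother you, one option is to switch to one of the style sheets that ship with matplotlib. The sketch below assumes a matplotlib version that includes the `matplotlib.style` module and its bundled `ggplot` style; the plotted numbers are placeholders.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display available)
import matplotlib.pyplot as plt

# Names of the style sheets bundled with this matplotlib install.
available_styles = plt.style.available

# Apply a style only inside this block; settings revert afterwards.
with plt.style.context("ggplot"):
    fig, ax = plt.subplots()
    ax.plot([0, 1, 2, 3], [0, 1, 4, 9], label="y = x^2")
    ax.legend()
    styled_grid = plt.rcParams["axes.grid"]  # the ggplot style turns the grid on
```

Using `plt.style.context` keeps the change local; `plt.style.use("ggplot")` would apply it for the rest of the session.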

Created by: John D. Hunter, available in Mode
Where to learn more: matplotlib.org

Try matplotlib in Mode.


Python data visualization - Seaborn

Violinplot (Michael Waskom)

Seaborn harnesses the power of matplotlib to create beautiful charts in a few lines of code. The key difference is Seaborn’s default styles and color palettes, which are designed to be more aesthetically pleasing and modern. Since Seaborn is built on top of matplotlib, you’ll need to know matplotlib to tweak Seaborn’s defaults.
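As a rough illustration, the sketch below draws a violin plot in a single Seaborn call and then falls back on matplotlib methods to adjust it. The data is randomly generated, and the column names `group` and `value` are placeholders of our own.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display available)
import numpy as np
import pandas as pd
import seaborn as sns

# Made-up data: three groups of 100 values with different means.
rng = np.random.default_rng(1)
df = pd.DataFrame({
    "group": np.repeat(["a", "b", "c"], 100),
    "value": np.concatenate([rng.normal(mean, 1, 100) for mean in (0, 1, 3)]),
})

sns.set_style("whitegrid")  # one of Seaborn's built-in looks

# One call produces the whole chart and returns a matplotlib Axes.
ax = sns.violinplot(data=df, x="group", y="value")

# Tweaks beyond Seaborn's defaults go through matplotlib's own methods.
ax.set_title("Value by group")
ax.set_ylabel("Measured value")
```

The last two lines are plain matplotlib, which is why some matplotlib knowledge pays off once you want to go beyond what Seaborn sets up for you.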

Created by: Michael Waskom, available in Mode
Where to learn more: http://web.stanford.edu/~mwaskom/software/seaborn/index.html

Try Seaborn in Mode.


Python data visualization - ggplot

Small multiples (ŷhat)

ggplot is based on ggplot2, an R plotting system, and concepts from The Grammar of Graphics. ggplot operates differently than matplotlib: it lets you layer components to create a complete plot. For instance, you can start with axes, then add points, then a line, a trendline, etc. Although The Grammar of Graphics has been praised as an “intuitive” method for plotting, seasoned matplotlib users might need time to adjust to this new mindset.
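In ggplot, layering shows up as chains such as `ggplot(...) + geom_point() + geom_line()`. The sketch below is a toy model of that idea in plain Python, not ggplot's implementation: the `Plot` and `Layer` classes and the `describe` method are hypothetical names invented for this illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Layer:
    """One visual component, e.g. points or a line (hypothetical class)."""
    kind: str


@dataclass(frozen=True)
class Plot:
    """A plot specification: axes plus an ordered stack of layers (hypothetical class)."""
    x: str
    y: str
    layers: Tuple[Layer, ...] = ()

    def __add__(self, layer: Layer) -> "Plot":
        # Adding a layer returns a new spec and leaves the original untouched.
        return Plot(self.x, self.y, self.layers + (layer,))

    def describe(self) -> str:
        parts = [f"axes({self.x}, {self.y})"] + [layer.kind for layer in self.layers]
        return " -> ".join(parts)


# Start with axes, then add points, a line, and a trendline.
base = Plot(x="date", y="price")
full = base + Layer("points") + Layer("line") + Layer("trendline")
print(full.describe())  # axes(date, price) -> points -> line -> trendline
```

The point of the pattern is that each `+` adds one component to a description of the chart, so a plot is built up piece by piece rather than through a sequence of drawing commands.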

According to the creator, ggplot isn’t designed for creating highly customized graphics. It trades fine-grained control for a simpler method of plotting.

ggplot is tightly integrated with pandas, so it’s best to store your data in a DataFrame when using ggplot.

Created by: ŷhat
Where to learn more: http://ggplot.yhathq.com/


Python data visualization - Bokeh

Interactive weather statistics for three cities (Continuum Analytics)

Like ggplot, Bokeh is based on The Grammar of Graphics, but unlike ggplot, it’s native to Python, not ported over from R. Its strength lies in the ability to create interactive, web-ready plots, which can be easily output as JSON objects, HTML documents, or interactive web applications. Bokeh also supports streaming and real-time data.

Bokeh provides three interfaces with varying levels of control to accommodate different user types. The highest level is for creating charts quickly. It includes methods for creating common charts such as bar plots, box plots, and histograms. The middle level has the same specificity as matplotlib and allows you to control the basic building blocks of each chart (the dots in a scatter plot, for example). The lowest level is geared toward developers and software engineers. It has no pre-set defaults and requires you to define every element of the chart.
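Here is a minimal sketch of what we take to be the middle level, the `bokeh.plotting` interface, where you add the building blocks of a chart yourself and then export the result as a standalone HTML document. The data points and the title are placeholders, and the snippet assumes a reasonably recent Bokeh release.

```python
from bokeh.embed import file_html
from bokeh.plotting import figure
from bokeh.resources import CDN

# Placeholder data for illustration.
x = [1, 2, 3, 4, 5]
y = [6, 7, 2, 4, 5]

# Create an empty figure, then add the dots of a scatter plot as one glyph.
p = figure(title="Scatter sketch")
renderer = p.scatter(x, y, size=10)

# Render the figure as a self-contained HTML page that loads BokehJS from a CDN.
html = file_html(p, resources=CDN, title="Scatter sketch")
```

The resulting `html` string can be written to a file and opened in a browser, where Bokeh's default pan and zoom tools are available on the chart.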

Created by: Continuum Analytics
Where to learn more: http://bokeh.pydata.org/en/latest/

Want to brush up on your Python skills? Check out our tutorial to learn how to analyze and visualize data using Python.