The Best Python Data Visualization Libraries [2018]

Enhold 2020. 2. 3. 13:51

The Python Package Index has libraries for practically every data visualization need—from Pastalog for real-time visualizations of neural network training to Gaze Parser for eye movement research. Some of these libraries can be used no matter the field of application, yet many of them are intensely focused on accomplishing a specific task.
What follows is an overview of 11 interdisciplinary Python data visualization libraries, ordered from most to least popular.

Matplotlib

Matplotlib is used to generate simple yet powerful visualizations. More than a decade old, it is the most widely used plotting library in the Python community. It can plot a wide range of graphs, from histograms to heat maps.

Matplotlib was the first Python data visualization library, so many other libraries are built on top of it and are designed to work alongside it during analysis. Libraries like pandas and Seaborn are “wrappers” over Matplotlib, giving access to a number of Matplotlib’s methods with less code.

Matplotlib is versatile enough to produce visualization types such as:

  • Scatter plots
  • Bar charts and Histograms
  • Line plots
  • Pie charts
  • Stem plots
  • Contour plots
  • Quiver plots
  • Spectrograms

You can add grids, labels, legends, and more with ease, since nearly every element of a figure is customizable.
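As a rough illustration of that customizability, here is a minimal sketch that draws a line plot and a bar chart and adds labels, a grid, and a legend. The data, the function name `build_demo_figure`, and the choice of the off-screen `Agg` backend are assumptions made for this example, not something prescribed by Matplotlib.

```python
import matplotlib

# Assumption: render off-screen so the sketch also runs without a display.
matplotlib.use("Agg")
import matplotlib.pyplot as plt


def build_demo_figure():
    """Draw a line plot and a bar chart side by side (illustrative data)."""
    x = [0, 1, 2, 3, 4]
    squares = [value ** 2 for value in x]

    fig, (line_ax, bar_ax) = plt.subplots(1, 2, figsize=(8, 3))

    # Line plot with markers, axis labels, a grid, and a legend.
    line_ax.plot(x, squares, marker="o", label="y = x^2")
    line_ax.set_xlabel("x")
    line_ax.set_ylabel("y")
    line_ax.set_title("Line plot")
    line_ax.grid(True)
    line_ax.legend()

    # Bar chart of the same values.
    bar_ax.bar(x, squares, color="tab:orange")
    bar_ax.set_title("Bar chart")

    fig.tight_layout()
    return fig, line_ax, bar_ax


fig, line_ax, bar_ax = build_demo_figure()
```

Calling `fig.savefig(...)` with a file name of your choice would then write the figure to disk.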

Seaborn

Seaborn is a popular data visualization library that is built on top of Matplotlib. Seaborn’s default styles and color palettes are much more sophisticated than Matplotlib’s. Seaborn puts visualization at the core of understanding any data. It is a higher-level library, so it’s easier to generate certain kinds of plots, including heat maps, time series, and violin plots.
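To give a feel for how little code the higher-level plots need, the sketch below draws a violin plot and an annotated heat map. The data is synthetic and the function name `build_seaborn_demo` is made up for this example; it assumes a Seaborn version recent enough to provide `set_theme`.

```python
import matplotlib

# Assumption: render off-screen so the sketch also runs without a display.
matplotlib.use("Agg")
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns


def build_seaborn_demo(seed=0):
    """Draw a violin plot and a heat map from synthetic data."""
    rng = np.random.default_rng(seed)
    sns.set_theme()  # apply Seaborn's default style and color palette

    # Synthetic long-form data: three groups of 50 values each.
    df = pd.DataFrame({
        "group": np.repeat(["a", "b", "c"], 50),
        "value": np.concatenate(
            [rng.normal(loc, 1.0, 50) for loc in (0.0, 1.0, 2.0)]
        ),
    })

    fig, (violin_ax, heat_ax) = plt.subplots(1, 2, figsize=(9, 3.5))

    # One call per plot; Seaborn handles grouping, colors, and labels.
    sns.violinplot(data=df, x="group", y="value", ax=violin_ax)
    sns.heatmap(rng.random((4, 4)), annot=True, fmt=".2f", ax=heat_ax)

    fig.tight_layout()
    return fig, violin_ax, heat_ax


fig, violin_ax, heat_ax = build_seaborn_demo()
```

Because Seaborn draws onto ordinary Matplotlib axes, the returned objects can still be adjusted with Matplotlib methods afterwards.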

ggplot

ggplot is a Python visualization library based on R’s ggplot2 and the Grammar of Graphics. You can construct plots using a high-level grammar without worrying about the implementation details. ggplot operates differently from Matplotlib: it lets users layer components to create a full plot. For example, the user can start with axes, then add points, then a line, a trend line, and so on. The Grammar of Graphics has been hailed as an “intuitive” method for plotting; however, seasoned Matplotlib users might need time to adjust to this new mindset.

Bokeh

Bokeh is native to Python and, like ggplot, is based on the Grammar of Graphics. It also supports streaming and real-time data. Its unique selling proposition is the ability to create interactive, web-ready plots, which can easily be output as JSON objects, HTML documents, or interactive web applications.
Bokeh has three interfaces with varying degrees of control to accommodate different types of users. The topmost level is for creating charts quickly. It includes methods for creating common charts such as bar plots, box plots, and histograms. The middle level allows the user to control the basic building blocks of each chart (for example, the dots in a scatter plot) and has the same specificity as Matplotlib. The bottom level is geared toward developers and software engineers. It has no pre-set defaults and requires the user to define every element of the chart.
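The following sketch shows one way to get the HTML and JSON outputs mentioned above. It uses the `bokeh.plotting` interface, which appears to correspond to the middle level of control. The data, titles, and the helper name `build_scatter` are assumptions made for illustration.

```python
from bokeh.embed import file_html, json_item
from bokeh.plotting import figure
from bokeh.resources import CDN


def build_scatter():
    """Create a small interactive scatter plot (illustrative data)."""
    x = [1, 2, 3, 4, 5]
    y = [6, 7, 2, 4, 5]
    plot = figure(
        title="Bokeh scatter sketch",
        x_axis_label="x",
        y_axis_label="y",
        tools="pan,wheel_zoom,reset",  # interactive tools shown next to the plot
    )
    plot.scatter(x, y, size=10)
    return plot


# A standalone HTML document that loads BokehJS from the CDN.
html_page = file_html(build_scatter(), CDN, "Bokeh scatter sketch")

# A JSON-serializable dict that a web page can embed with BokehJS.
json_payload = json_item(build_scatter())
```

A fresh plot is built for each output here simply to keep the two exports independent of each other.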

Plotly

While Plotly is widely known as an online platform for data visualization, fewer people know that it can also be accessed from a Python notebook. Like Bokeh, Plotly’s strength lies in making interactive plots, and it offers chart types, such as contour plots, that are missing from many libraries.

Pygal

Pygal, like Plotly and Bokeh, offers interactive plots that can be embedded in a web browser. The ability to output charts as SVGs is its prime differentiator. For work involving smaller datasets, SVGs will do just fine. However, for charts with hundreds of thousands of data points, they become sluggish and have trouble rendering.
It’s easy to create a nice-looking chart with just a few lines of code, since each chart type is packaged into its own class and the built-in styles are great.

Altair

Altair is a declarative statistical visualization library for Python, based on Vega-Lite. You only need to declare the links between data columns and encoding channels, such as the x-axis, y-axis, and color, and the rest of the plotting details are handled automatically. This makes Altair simple, friendly, and consistent. It is easy to design effective and beautiful visualizations with a minimal amount of code using Altair.

Geoplotlib

Geoplotlib is a toolbox for plotting geographical data and creating maps. It can be used to create a variety of map types, like choropleths, heatmaps, and dot density maps. Pyglet (an object-oriented programming interface) must be installed to use Geoplotlib.

Geoplotlib reduces the complexity of designing visualizations by providing a set of built-in tools for the most common tasks, such as density visualization, spatial graphs, and shapefiles.

Since most Python data visualization libraries don’t offer maps, it’s good to have a library dedicated to them.

Gleam

Gleam is inspired by R’s Shiny package. It allows the user to turn any analysis into interactive web apps using only Python scripts. Gleam users don’t need to know HTML, CSS, or JavaScript to do this. Gleam works with any Python data visualization library. Once users have created a plot, they can build fields on top of it to filter and sort data.

Missingno

Dealing with missing data is cumbersome. The completeness of a dataset can be gauged quickly with Missingno, rather than painstakingly searching through a table. The user can filter and sort data based on completion or spot correlations with a heat map or a dendrogram.

Leather

Leather is designed to work with all data types and produces charts as SVGs, so they can be scaled without losing image quality. Leather’s creator, Christopher Groskopf, puts it best: “Leather is the Python charting library for those who need charts now and don’t care if they’re perfect.” Since this library is relatively new, some of the documentation is still in progress. The charts that can be made are pretty basic, but that’s the intention.

Python offers a wide and diverse range of visualization tools, each suited to a different kind of task. This is reflected in the sheer number of libraries available. Users should bear in mind the differences between the approaches, and their implications, before settling on one.

Would you add any other Python data visualization libraries to this list? Please share your favorites in a comment below.

Guest Author – Quincy is part of the team at Springboard and is passionate about online learning and strong coffee.