7 Simple Data Visualizations You Should Know in R
This post presents a selection of 7 essential data visualizations, and how to recreate them using a mix of base R functions and a few common packages.
Data visualization is an innovative and exciting field. Although it involves long hours behind a computer screen and a knack for numbers, it's a highly rewarding profession that is very much in its early stages — and it's growing every day.
Few programs are dedicated solely to visualizing data, so many data scientists use the R programming language. R and its many available packages provide different forms of visualization for nearly every scenario imaginable.
Below is a selection of 7 essential data visualizations, and how to recreate them using a mix of base R functions and a few common packages. The examples all make use of datasets included in a default base R installation.
Editor's note: Code for the first 5 visualizations has been provided by Elisa Du.
1. Bar Chart
You're probably already familiar with the basic bar chart from elementary school, high school and college. The concept of the bar chart in R is the same as it was then: to compare a numeric value, such as a count or a total, across two or more categories. However, there are several different types of bar charts to know and understand.
Horizontal and vertical bar charts are already common and familiar; they are standard formats in most academic or professional presentations. R can also draw a stacked bar chart, which splits each bar into segments so you can introduce a second variable within each category.
Fig 1. Bar chart (courtesy of Elisa Du)
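The original code for this figure is not reproduced here. As a minimal sketch of a stacked bar chart in base R, assuming the built-in mtcars dataset (the choice of dataset, variables and colors is an illustration, not necessarily what the figure above used):

```r
# Count cars by number of cylinders (rows) and number of gears (columns)
counts <- table(mtcars$cyl, mtcars$gear)

# barplot() stacks the rows of a matrix by default:
# one bar per gear count, one segment per cylinder count
barplot(counts,
        main = "Cars by gears and cylinders",
        xlab = "Number of gears",
        ylab = "Number of cars",
        col = c("steelblue", "orange", "darkgreen"),
        legend.text = rownames(counts))
```

Passing beside = TRUE draws the segments side by side instead of stacked, and horiz = TRUE turns the bars horizontal.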
2. Histogram
Histograms are standard in some academic fields, but they're usually reserved for the senior-most levels. These charts work best with continuous numeric data: the values are grouped into bins, and the height of each bar shows how many observations fall in that bin.
A histogram ultimately provides an estimate of a variable's probability distribution, such as the period of time before a project's completion. R provides a simple function for this as well: hist().
Fig 2. Histogram (courtesy of Elisa Du)
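A histogram along these lines can be sketched with base R's hist(). The example below assumes the built-in airquality dataset, and the number of breaks is an arbitrary choice made for illustration:

```r
# Ozone readings from the built-in airquality data, with missing values dropped
ozone <- airquality$Ozone[!is.na(airquality$Ozone)]

# freq = FALSE plots densities rather than raw counts, so the bar areas sum to 1.
# breaks = 20 is only a suggestion; R picks "pretty" bin edges close to that number.
h <- hist(ozone,
          breaks = 20,
          freq = FALSE,
          main = "Distribution of ozone readings",
          xlab = "Ozone (ppb)",
          col = "lightgray")

# Overlay a smoothed kernel density estimate of the same variable
lines(density(ozone), lwd = 2)
```

The object returned by hist() holds the bin edges and counts, which is handy if you want the numbers behind the picture.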
3. Heat Map
A heat map represents the values in a matrix as color intensity. The result is an attractive 2D image that is easy to interpret. As a basic example, a heat map could highlight the popularity of competing items by ranking them according to their original market launch date, then break that down further by showing sales figures over the course of time.
Fig 3. Heat map (courtesy of Elisa Du)
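One way to draw a heat map in base R is the heatmap() function. This is a rough sketch assuming the built-in mtcars dataset; the palette and margins are illustrative choices:

```r
# heatmap() expects a numeric matrix, so convert the data frame first
m <- as.matrix(mtcars)

# scale = "column" standardizes each variable so that columns measured in
# different units share one color scale.
# Rowv = NA and Colv = NA switch off the clustering dendrograms and reordering.
heatmap(m,
        scale = "column",
        Rowv = NA,
        Colv = NA,
        col = heat.colors(256),
        margins = c(5, 8),
        main = "mtcars variables by car")
```

Leaving Rowv and Colv at their defaults clusters similar rows and columns together, which often makes patterns easier to spot.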
4. Scatter Plot
Plotting is a popular alternative to charting or graphing. It represents each observation as a dot. The most standard iteration, the scatter plot, shows the relationship between two continuous variables. A basic application of the scatter plot involves plotting the height of children against their weight as they grow over the years.
Because a scatter plot shows every observation, it is useful when trying to avoid misinformation in the visualization. Use one only if you're sure the audience is familiar with that type of chart, and always use it sparingly. When in doubt, go with one of your other options.
Fig 4. Scatter plot (courtesy of Elisa Du)
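A height-and-weight scatter plot can be sketched with base R's plot(). The example assumes the built-in women dataset (average heights and weights of adult women, standing in for the children example above), and the fitted trend line is an optional addition:

```r
# Each point is one row of the built-in women dataset
plot(women$height, women$weight,
     main = "Height vs. weight",
     xlab = "Height (in)",
     ylab = "Weight (lb)",
     pch = 19)

# Fit a simple linear model and draw it over the points
fit <- lm(weight ~ height, data = women)
abline(fit, col = "red", lwd = 2)
```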
5. Box Plot
The box plot resembles a bar chart in many respects. Instead of focusing on categorical data alone, a box plot combines categorical and continuous data: for each category, it summarizes the distribution of a continuous variable through its median, quartiles and outliers.
In the real world, box plots give detailed information on weather patterns and how they change over the course of time.
Fig 5. Box plot (courtesy of Elisa Du)
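The weather example can be sketched with base R's boxplot(). This assumes the built-in airquality dataset, whose Temp column holds daily temperatures and whose Month column runs from 5 (May) to 9 (September):

```r
# One box per month: the formula reads "Temp grouped by Month"
b <- boxplot(Temp ~ Month,
             data = airquality,
             main = "Daily temperature by month",
             xlab = "Month",
             ylab = "Temperature (degrees F)",
             col = "lightblue")
```

The returned object holds the five-number summary for each box in b$stats and the group sizes in b$n.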
6. Correlogram
Correlated data is best visualized through a correlogram, which the corrplot package can draw. The 2D format is similar to a heat map, but it highlights how strongly each pair of variables is related.
Most correlograms highlight the amount of correlation between datasets at various points in time. Comparing sales data between different months or years is a basic example.
Fig 6. Correlogram with circles (courtesy of Abdul Majed Raja)
Fig 7. Correlogram with numbers (courtesy of Abdul Majed Raja)
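Correlograms like the two above can be sketched with the corrplot package, which is not part of base R and has to be installed separately. The example assumes the built-in mtcars dataset and skips the plotting step if corrplot is not available:

```r
# Pairwise correlations between all numeric columns of mtcars
M <- cor(mtcars)

# corrplot is an add-on package; only draw the plots if it is installed
if (requireNamespace("corrplot", quietly = TRUE)) {
  corrplot::corrplot(M, method = "circle")  # circle size and color encode the correlation
  corrplot::corrplot(M, method = "number")  # print the coefficients instead
} else {
  message("Install the corrplot package to draw the correlogram.")
}
```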
7. Area Chart
Area charts express continuity between different variables or data sets. They're akin to the traditional line chart you know from grade school and are used in a similar fashion.
Most area charts highlight trends and their evolution over the course of time, making them highly effective when trying to expose underlying trends — whether they're positive or negative.
Fig 8. Area chart (courtesy of Abdul Majed Raja)
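The figure above may well have been drawn with a plotting package, but an area chart can also be sketched in base R by filling the region under a line with polygon(). The example assumes the built-in AirPassengers series of monthly airline passenger totals:

```r
# Convert the time series to plain numeric vectors of time and value
x <- as.numeric(time(AirPassengers))
y <- as.numeric(AirPassengers)

# Set up empty axes that start at zero, so the filled area is not truncated
plot(x, y, type = "n",
     ylim = c(0, max(y)),
     main = "Monthly airline passengers",
     xlab = "Year",
     ylab = "Passengers (thousands)")

# Trace the line, then close the shape along the baseline (last x back to first x)
polygon(c(x, rev(range(x))), c(y, 0, 0), col = "lightblue", border = NA)

# Redraw the line on top so the trend stays crisp
lines(x, y, lwd = 2)
```

If you already use ggplot2, its geom_area() layer is a common alternative.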
Data Visualization Is Entering the Mainstream in a Big Way
Studies show charts, graphs and other visualizations provide an easy way of remembering data when compared to monotonous spreadsheets and archaic reports.
Not only is this true in the professional world, but many academic institutions are embracing next-gen data visualizations in student essays, presentations and theses, too.
It seems there's hardly an area untouched by data visualization — and the field is still in its infancy.
Bio: Kayla Matthews discusses technology and big data on publications like The Week, The Data Center Journal and VentureBeat, and has been writing for more than five years. To read more posts from Kayla, subscribe to her blog Productivity Bytes.