OpenText Data Digest Nov 13: Making Relevant Data Easy to See

In data science, visualizations are often criticized for cramming too much information into an overly busy design. We here at Data Driven Digest could not agree more.

Making your data visually relevant lets you communicate clearly the story or feeling you wish to express. Like other forms of art, data visualization requires an easy-to-understand framework that catches your attention and inspires further thought.

“Music is powered by ideas. If you don’t have clarity of ideas, you’re just communicating sheer sound,” famed cellist Yo-Yo Ma once said.

For this week, we provide some examples of how complex data can be displayed in an easy-to-understand fashion.

[Figure: Pew Research Center graphic on mom-dad work patterns]

Layers of Working Parents: The share of U.S. homes in which both parents work full time has risen to 46 percent, up from 31 percent in 1970, according to the graphic above from a new Pew Research Center survey. The survey, conducted between September 15 and October 13 this year, illustrates some of the challenges in balancing work and family. Pew researchers interviewed more than 1,800 parents with children younger than 18 and compared the results with population surveys and other microdata.

What is significant about this report is that it compresses a lot of information about changing family structure into a simple, clear format: a layered bar graph whose labels sit directly on the chart instead of in a separate key with notes. The visualization is comprehensive, yet incomplete. The survey did not include data prior to 1970, because Pew only began asking about working couples once the women’s movement began taking hold. Additionally, same-sex parents were not included in the findings, as Pew researchers wanted to show a relevant comparison between couples then and now. Despite the very general, high-level overview, you can see significant dips in employment, such as the one that followed the 2008 subprime mortgage crisis.

[Figure: NBA general manager scatter plot]

Plotting NBA Team Success: Scatter plots are often used to show correlations between two variables that aren’t tied to a linear time sequence, and they are great for identifying trends. While business users might want a lot of points on the X-Y graph, too many can overwhelm the eye. Adding filters is one of the best ways to overcome this obstacle. Our friends over at DataBucket are fans of U.S. professional basketball, an area benefiting immensely from data analysis. They’ve created a scatter plot based on the premise that great team chemistry and player retention are sure-fire ways for general managers to establish winning seasons year after year, i.e., the dynasty approach.

[Table: DataBucket NBA team chemistry categories]

“Plotting retention against chemistry allows us to classify NBA franchises into four categories as shown in the table above,” DataBucket wrote. “Teams with low retention but above-average performance are indications of a newly formed core team, as newly acquired players have figured out how to work together in a short period of time. General managers with a high retention and above-average performance team have found a core group of players that work well together. In these two instances, GMs should not break up their rosters.”

Filtering for certain teams shows that the Los Angeles Lakers know how to build great teams with good chemistry in most years, while the New York Knicks do not. The scatter plot predicts that the Golden State Warriors, Atlanta Hawks and Memphis Grizzlies should perform exceedingly well this season.
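
For readers who want to try the four-way split on their own data, here is a minimal Python sketch of the idea. It is not DataBucket's code or method: the `TeamSeason` fields, the use of the sample mean as the dividing line on each axis, the labels for the two below-average quadrants, and all of the numbers are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class TeamSeason:
    team: str         # placeholder label, not a real franchise
    retention: float  # assumed: share of roster kept from the prior season (0 to 1)
    chemistry: float  # assumed: performance relative to expectation


def classify(row, retention_cut, chemistry_cut):
    """Return a quadrant label for one team-season."""
    high_retention = row.retention >= retention_cut
    above_average = row.chemistry >= chemistry_cut
    if above_average and high_retention:
        return "established core"
    if above_average:
        return "newly formed core"
    # The quoted passage does not name the two below-average quadrants;
    # these labels are hypothetical.
    if high_retention:
        return "high retention, below average"
    return "low retention, below average"


def quadrants(rows):
    """Split the plot at the sample mean of each axis (an assumed cutoff)."""
    retention_cut = mean(r.retention for r in rows)
    chemistry_cut = mean(r.chemistry for r in rows)
    return {r.team: classify(r, retention_cut, chemistry_cut) for r in rows}


# Made-up numbers, for illustration only.
SAMPLE = [
    TeamSeason("Team A", 0.80, 1.5),
    TeamSeason("Team B", 0.30, 1.0),
    TeamSeason("Team C", 0.75, -1.2),
    TeamSeason("Team D", 0.25, -0.9),
]

labels = quadrants(SAMPLE)

# Filter: keep only the two quadrants where the quote says
# GMs should not break up their rosters.
keep_together = [team for team, label in labels.items() if label.endswith("core")]
print(labels)
print(keep_together)
```

The last few lines act as the filter: instead of showing every point, the view is narrowed to the teams in the two above-average quadrants.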

[Figure: College rankings]

College Scorecard: Sometimes, the best visualizations come in the form of a list or scorecard. It’s as simple as that. The visualization presented above is derived from the U.S. Department of Education’s College Scorecard database and the Bureau of Economic Analysis, using a methodology developed by Jonathan Rothwell (@jtrothwell), a fellow at the Brookings Institution’s Metropolitan Policy Program.

Rankings for 3,173 colleges (1,507 two-year colleges and 1,666 four-year colleges) are scored against variables like curriculum value, STEM orientation, graduation rates, and faculty salaries.

As Jonathan notes: “Value-added measures attempt to isolate the contribution of the college to student outcomes, as distinct from what one might predict based on student characteristics or the level of degree offered. It is not a measure of return on investment, but rather a way to compare colleges on a more equal footing, by adjusting for the relative advantages or disadvantages faced by diverse students pursuing different levels of study across different local economies.”
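
One rough way to picture a value-added measure is as the gap between an actual outcome and the outcome predicted from student characteristics alone. The Python sketch below illustrates that general idea with a one-variable least-squares fit. It is not Rothwell's model, which is considerably more elaborate; the single predictor, the outcome values, and the function names are all hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x with a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def value_added(xs, ys):
    """Actual outcome minus the outcome predicted from the predictor."""
    intercept, slope = fit_line(xs, ys)
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


# Made-up data: one row per college.
student_index = [1, 2, 3, 4]   # hypothetical summary of student characteristics
outcome = [30, 45, 50, 75]     # hypothetical alumni outcome, e.g. earnings in $1,000s

residuals = value_added(student_index, outcome)
print(residuals)
```

A positive residual means a college's students did better than their characteristics alone would predict; a negative one means they did worse. This lets colleges with very different student bodies be compared on a more equal footing.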

The list’s simple design allows the user to filter and compile information so that parents and students can predict the long-term value of their college choices.


Bio: Michael Singer is a Senior Content Manager in Marketing at OpenText. Previously he was a technology journalist for publications such as the Economist Intelligence Unit, CNET, InformationWeek and ReadWrite.