OpenText Data Digest, Oct 9: Baseball Playoffs

Check out which visualizations made the top three bases in this week’s OpenText Data Digest.


Take me out to the data. Take me out to the stats. Design me some bar charts and Venn diagrams. I don’t care if I error correct. Cause it’s Hadoop, Hadoop for NoSQL. If you can’t graph it’s a shame. For it’s one, two, x and y-axis… at the old data viz game.

We love our data the way we love our baseball: lots of visuals and power hitters that crush it over the wall for the walk-off win. The 2015 Major League Baseball playoffs are now underway with the Division Series brackets set. Every game, every inning, every play is packed full of data that can be, and has been, visualized. In the spirit of the grand old game, we present three data visualizations that really swing for the bleachers. Enjoy!


[Image: real-time baseball visualization]

FIRST BASE: Visualizing Last Night’s Game. Wild Card hopefuls like the Astros and Cubs look to unseat the Royals and Cardinals, while the Mets and Dodgers, along with the Rangers and Blue Jays, suit up to settle their rivalries. For fans of baseball box scores, the team at Statlas (stats + atlas) has brought back its visual recaps of completed games. Each recap is laid out like a transit map: every row represents an at-bat, annotated with detailed notation similar to the symbols electrical engineers use. The “Win Expectancy” column on the right estimates the team’s likelihood of winning the game after each batter, so the impact of big plays shows up as large swings. Chris Ring (@cringthis) and Mike Deal (@dealville) currently run the site, which was founded in 2014 by Dan Chaparian, Mike Deal, and Geoff Beck. The site has even more interactive features if you run it on a tablet.
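For the curious, here is a minimal Python sketch of how a win-expectancy column can be read as a series of swings, one per at-bat. It is not Statlas’s code or data: the plays, the numbers, and the `win_expectancy_swings` helper are all made up for illustration, and we simply assume the game starts at an even 50 percent.

```python
# Hypothetical win-expectancy values for one team after each at-bat.
# Plays and numbers are invented for illustration; they are not Statlas data.
at_bats = [
    ("Batter A grounds out", 0.48),
    ("Batter B singles", 0.53),
    ("Batter C homers", 0.71),
    ("Batter D strikes out", 0.69),
]


def win_expectancy_swings(plays, start=0.50):
    """Return (description, change) pairs showing how far each at-bat moved win expectancy."""
    swings = []
    previous = start  # assumption: both teams start the game at 50 percent
    for description, expectancy in plays:
        swings.append((description, round(expectancy - previous, 2)))
        previous = expectancy
    return swings


swings = win_expectancy_swings(at_bats)

# The "big play" is the at-bat with the largest swing in either direction.
biggest = max(swings, key=lambda swing: abs(swing[1]))
print(biggest)
```

In this toy inning the home run is the big play, because it moves the needle more than the other three at-bats combined.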

[Image: 2014 Oakland Athletics visualization]

SECOND BASE: Rise and Fall of the 2014 Oakland Athletics. Sometimes it’s good to analyze the trends in a team’s season to determine where things improved and where things fell apart. Such was the case with the hapless Oakland A’s, whose 2014 season started off like a rocket and then imploded down the stretch after the trade of outfielder Yoenis Cespedes and injuries to starting pitchers Jarrod Parker and A.J. Griffin. Digital Splash Media owner Jeff Bennett (@DigitalSplash & @VizThinker) masterfully compiled several bar graphs and line graphs that illustrate the problems of the clubhouse that crafted the concept of Moneyball.

[Image: baseball cherry-picker visualization]

THIRD BASE: Who Is the Greatest Player? Inevitably, baseball conversations boil down to who was the greatest player in the game. It is a tricky question, to be certain, because the answer depends on whether the player is a pitcher or a position player. Does this player have the most hits or homers? How well did they capitalize on their innings played or plate appearances? Baseball statistics also offer WAR (Wins Above Replacement), which attempts to summarize a player’s total contributions to their team in one statistic. Still, the debate continues. To help categorize and visualize the oceans of data behind these determinations, Benjamin Schmidt, an assistant professor of history at Northeastern University, has created an extensive, interactive Baseline Cherrypicker that you can use to see how baseball’s statistical leaders shape up against the field.

“The x axis shows the starting year for any stat: the y axis shows the length of time being measured,” Schmidt writes in his explanation. “So, for example, if you go down 7 cells from ‘1940,’ it will look up the player who led the league in WAR for the 7 years following 1940, and show the sentence ‘Ted Williams led the majors with 48.28 WAR from 1940 to 1947.’”
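The lookup Schmidt describes can be sketched in a few lines of Python. This is only an illustration of the idea, not his implementation: the player names and WAR values are invented, the `leader` and `describe` helpers are hypothetical, and we assume the window includes both the starting year and the ending year.

```python
from collections import defaultdict

# Hypothetical per-season WAR records: (player, year, WAR).
# Names and values are invented for illustration only.
season_war = [
    ("Player A", 1940, 6.0),
    ("Player A", 1941, 10.0),
    ("Player B", 1940, 7.0),
    ("Player B", 1941, 5.0),
    ("Player B", 1942, 8.0),
]


def leader(records, start_year, span):
    """Return (player, total WAR) for the leader from start_year through start_year + span."""
    end_year = start_year + span  # assumption: both endpoints are included
    totals = defaultdict(float)
    for player, year, war in records:
        if start_year <= year <= end_year:
            totals[player] += war
    if not totals:
        return None
    return max(totals.items(), key=lambda item: item[1])


def describe(records, start_year, span):
    """Build the pop-up sentence for one cell of the chart."""
    result = leader(records, start_year, span)
    if result is None:
        return "No data for that stretch."
    player, total = result
    return f"{player} led the majors with {total:.2f} WAR from {start_year} to {start_year + span}."


print(describe(season_war, 1940, 1))
print(describe(season_war, 1940, 2))
```

Moving one cell down lengthens the window by a year, which is enough to change the leader in this toy data. That is the cherry-picking the chart is built to expose.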

The visualization lets you home in on patches of interest. If you just run your mouse over the chart and read the text that pops up, you’ll start to get the general idea, he added.

The interactive visualization includes individual franchise and league leaderboards, and would fill about 120,000 single-spaced pages if compiled in print. That’s a lot of baseball to debate.

Like what you see? Every Friday we share great data visualizations and embedded analytics. If you have a favorite or trending example, tell us: submit your ideas or add a comment below. Subscribe (at left) and we’ll email you when new entries are posted.