KDnuggets Home » News » 2011 » Dec » Publications » Data Mining Shakespeare  ( < Prev | 11:n31 | Next > )

Data Mining Shakespeare

Processing the Bard's plays through DocuScope led to some unusual discoveries, such as the finding that, linguistically, Othello is a comedy. Also, listen to the podcast on Data Mining Shakespeare.

The Data-Mining's The Thing: Shakespeare Takes Center Stage In The Digital Age

Fast Company, by Neal Ungerleider, Dec 14, 2011

Folger Shakespeare Library director Michael Witmore is using 21st-century tools to analyze the Bard's work. When data-mining techniques borrowed from the sciences and business research were applied to classic Shakespearean plays, surprising discoveries were made.

In a late October presentation at the Folger Shakespeare Library, Library director Michael Witmore described his use of innovative data-mining methods to analyze Shakespeare's First Folio.

Listen to this funny and interesting Data-Mining Shakespeare Podcast (with graphs)
[Figure: Shakespeare in PCA space]

The event was subsequently repackaged as a free podcast, and the ramifications are fascinating. By processing excerpts from the First Folio through word-analysis software, proof was found that Shakespeare's vocabulary and syntax varied wildly between his comedies, historical plays, and tragedies. More importantly, software analysis seems to prove that Othello--despite being a tragedy--was intentionally written with comedic stylistic cues that served to intensify the play's tragic aspects.

Witmore processed 767 different thousand-word excerpts of plays from the First Folio through a piece of software called DocuScope. DocuScope is a rhetorical analysis tool based on the Oxford English Dictionary that takes a database of 40 million English linguistic patterns and sorts them into more than 100 categories. These categories are rather dry--typical classifications include "Positive Emotion," "Directives," and "Narrative Verbs"--but they do the job.
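The article does not spell out the processing pipeline, so the following is only a rough sketch of the general idea, not DocuScope itself or Witmore's actual method. It assumes the text is cut into fixed-size excerpts, each excerpt is scored by the share of its words that fall into a handful of categories, and the resulting table is projected into PCA space. The word lists in TOY_CATEGORIES and all function names are made up for illustration; only the three category names come from the article.

```python
import re
from collections import Counter

import numpy as np

# Hypothetical toy lexicon: the category names appear in the article, but the
# word lists are invented here. DocuScope's real dictionary is far larger.
TOY_CATEGORIES = {
    "Positive Emotion": {"love", "joy", "sweet", "merry"},
    "Directives": {"go", "come", "let", "give"},
    "Narrative Verbs": {"said", "went", "came", "told"},
}


def split_into_excerpts(text, size=1000):
    """Split text into consecutive excerpts of `size` words, dropping a short tail."""
    words = re.findall(r"[a-z']+", text.lower())
    return [words[i:i + size] for i in range(0, len(words) - size + 1, size)]


def category_rates(excerpt, categories=TOY_CATEGORIES):
    """Return, per category, the share of the excerpt's words that belong to it."""
    counts = Counter(excerpt)
    total = len(excerpt)
    return [sum(counts[w] for w in words) / total for words in categories.values()]


def pca_project(matrix, n_components=2):
    """Project rows onto the top principal components via SVD of the centered data."""
    centered = matrix - matrix.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T
```

With real play texts, one would call split_into_excerpts on each play, stack the category_rates rows into a NumPy array, and pass that array to pca_project to get two coordinates per excerpt for plotting. Whether excerpts from comedies, histories, and tragedies then separate depends entirely on the lexicon used, which this toy version does not attempt to reproduce.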

... Data-mining and computer-led textual analysis uncovered patterns in Shakespeare's work that a human observer, trained in traditional academic reading methods, would never see. Such as the fact that--in purely linguistic terms--Othello is a comedy.

"Comedy" in Shakespearean terms is quite different from our own conception of the genre.

... As for Shakespeare's words and writing styles being distinct depending on whether he was writing comedies, tragedies, or historical epics? DocuScope analysis indicates the funniest thing Shakespeare ever wrote was a portion of The Merry Wives of Windsor, while a passage from Richard II was the most serious.

Read more.
