Most Popular SlideShare Presentations on Data Mining

SlideShare data mining presentations cover many topics, offering a unique way of consuming data mining content and exploring a variety of slideshows, both narrow and broad in scope.



By Grant Marshall, Nov 2014

SlideShare is a platform for uploading, annotating, sharing, and commenting on slide-based presentations. The platform has been around for some time and has accumulated a wealth of presentations on technical topics such as data mining.

Figure 1: Wordle of the tags associated with the presentations

Today, we will look at some of the top data mining presentations found on SlideShare. These presentations were retrieved with a Python script that calls the SlideShare search_slideshow API, and then hand-curated to select the best, most relevant presentations. The slideshows and their associated metrics are shown below:

| Title | Date | Views | Downloads | Favorites |
|---|---|---|---|---|
| Data Mining: Concepts and Techniques | 2010-05-10 | 60288 | 932 | 8 |
| Machine Learning and Data Mining: 11 Decision Trees | 2007-04-02 | 52487 | 0 | 71 |
| Big Data [sorry] & Data Science: What Does a Data Scientist Do? | 2013-01-26 | 51793 | 0 | 138 |
| Mining Social Data for Fun and Insight | 2007-11-05 | 40434 | 1903 | 91 |
| Building Tools to Data Mine Unstructured Text using a Machine Learning API | 2012-02-14 | 32812 | 98 | 16 |
| Machine Learning and Data Mining: 19 Mining Text And Web Data | 2007-06-03 | 33329 | 2003 | 43 |
| Introduction to Mahout and Machine Learning | 2013-07-27 | 29261 | 575 | 49 |
| Log Mining: Beyond Log Analysis | 2007-09-27 | 27497 | 0 | 28 |
| Social Data Mining | 2013-11-02 | 28944 | 0 | 4 |
| Machine Learning and Data Mining: 04 Association Rule Mining | 2007-03-18 | 26448 | 0 | 42 |
| Data mining: Classification and prediction | 2010-08-19 | 27776 | 0 | 6 |
| Chapter 11. Applications and Trends in Data Mining | 2010-05-10 | 26285 | 1024 | 10 |
| TextMining with R | 2012-02-23 | 25380 | 4 | 63 |
| Machine Learning and Data Mining: 01 Data Mining | 2007-03-14 | 25407 | 0 | 123 |
| Data Mining Concepts | 2007-05-18 | 26125 | 2166 | 24 |
| A Statistician’s View on Big Data and Data Science (Version 1) | 2013-11-25 | 22832 | 0 | 60 |
| Lecture 01 Data Mining | 2008-03-11 | 22405 | 1474 | 18 |
| Web Mining Tutorial | 2010-05-10 | 21408 | 1319 | 35 |
| Machine Learning and Data Mining: 12 Classification Rules | 2007-04-11 | 20386 | 0 | 23 |
| Analytics and Data Mining Industry Overview | 2011-11-18 | 20604 | 1007 | 29 |
| Data Mining With Excel 2007 And SQL Server 2008 | 2008-12-06 | 19718 | 824 | 9 |
| Application of Data Warehousing & Data Mining to Exploitation for Supporting the Planning of Higher Education System in Sri Lanka | 2009-07-25 | 20225 | 417 | 5 |
| Data Mining - Classification Of Breast Cancer Dataset using Decision Tree Induction - Sunil Nair Health Informatics Dalhousie University | 2008-12-04 | 19715 | 608 | 6 |
| Big Data v Data Mining | 2013-01-31 | 19266 | 559 | 20 |


Here are some quick stats about the 24 slideshows in this table: on average, each slideshow has approximately 29,000 views, 621 downloads, 5 comments, and 38 favorites. These aggregates can be deceptive, however.

Take the comments, for example: a large share of them came from just two presentations, Machine Learning and Data Mining: 11 Decision Trees and TextMining with R. In the first case, most of the comments were requests for the slides (the author chose to disable downloads); in the second, most were requests for the code excerpted in the presentation. Similarly, because some presentations had downloads disabled, the average download count is misleading. After discounting these request comments and excluding the presentations with downloads disabled, the averages are approximately 994 downloads and 3 comments per presentation.
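The averages quoted above can be reproduced from the table with a few lines of Python. This is only a sketch, not the script used for the article: it assumes that a download count of 0 means the author disabled downloads, and the names `ROWS`, `average`, and `enabled_downloads` are made up for illustration. Comment counts are not listed in the table, so they are left out.

```python
# (views, downloads, favorites) for the 24 slideshows, in table order.
ROWS = [
    (60288, 932, 8), (52487, 0, 71), (51793, 0, 138), (40434, 1903, 91),
    (32812, 98, 16), (33329, 2003, 43), (29261, 575, 49), (27497, 0, 28),
    (28944, 0, 4), (26448, 0, 42), (27776, 0, 6), (26285, 1024, 10),
    (25380, 4, 63), (25407, 0, 123), (26125, 2166, 24), (22832, 0, 60),
    (22405, 1474, 18), (21408, 1319, 35), (20386, 0, 23), (20604, 1007, 29),
    (19718, 824, 9), (20225, 417, 5), (19715, 608, 6), (19266, 559, 20),
]


def average(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    values = list(values)
    return sum(values) / len(values)


# Split the rows into one tuple per column.
views, downloads, favorites = zip(*ROWS)

avg_views = average(views)
avg_downloads = average(downloads)
avg_favorites = average(favorites)

# Assumption: a count of 0 means downloads were disabled, so those
# slideshows are excluded from the adjusted average.
enabled_downloads = [d for d in downloads if d > 0]
adjusted_avg_downloads = average(enabled_downloads)

print(f"views: {avg_views:.0f}")
print(f"downloads (all 24): {avg_downloads:.0f}")
print(f"favorites: {avg_favorites:.0f}")
print(f"downloads ({len(enabled_downloads)} with downloads enabled): "
      f"{adjusted_avg_downloads:.0f}")
```

Under that assumption, 15 of the 24 slideshows have downloads enabled, and averaging over only those gives the adjusted figure of roughly 994.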

Regardless, it appears that the social features are being put to use. With some presentations, Analytics and Data Mining Industry Overview for example, the author can be seen responding to the comments. This shows the format's potential for letting people interact with experts in data mining.

Figure 2: SlideShare Favorites vs. Counts

In this chart we see the diversity in the audiences that these different slideshows can draw. While there is a general upward trend in the number of favorites compared to the number of views, there are some exceptions. My hypothesis is that while the more generally applicable lectures, like Data Mining: Concepts and Techniques, might draw a larger general viewership, their viewers will be less likely to favorite the slides. On the other hand, a more specific slideshow like Mining Social Data for Fun and Insight might not draw as large a general crowd, but its viewers may be more likely to favorite it.
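One rough way to check this hypothesis against the table is to normalize favorites by views. The sketch below does that for the two slideshows named above. The helper `favorites_per_1000_views` is a hypothetical name, and the metric itself is an assumption made for illustration rather than something taken from the original analysis.

```python
def favorites_per_1000_views(favorites, views):
    """Favorites received per 1,000 views."""
    return 1000 * favorites / views


# Views and favorites taken from the table above.
general = favorites_per_1000_views(favorites=8, views=60288)    # Data Mining: Concepts and Techniques
specific = favorites_per_1000_views(favorites=91, views=40434)  # Mining Social Data for Fun and Insight

print(f"Concepts and Techniques: {general:.2f} favorites per 1,000 views")
print(f"Mining Social Data:      {specific:.2f} favorites per 1,000 views")
```

For these two slideshows, the more specific one is favorited far more often per view, which fits the hypothesis. Two data points do not establish it, though.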
