Where to start with Data Mining and Data Science

Gregory Piatetsky's answer:

You can best learn data mining and data science by doing, so start analyzing data as soon as you can! However, don't forget to learn the theory, since you need a good statistical and machine learning foundation to understand what you are doing and to find real nuggets of value in the noise of Big Data.

Here are 7 steps to learn data mining (you can do many of these steps in parallel):

  1. Learn R, Python, and SQL
  2. Read 1-2 introductory books
  3. Learn data mining software suites
  4. Take 1-2 introductory courses and watch some webinars
  5. Check available data resources and find something there to analyze
  6. Participate in data mining competitions
  7. Interact with other data scientists via social networks, groups, and meetings

Also, don't forget to subscribe to the KDnuggets News bi-weekly email and follow @kdnuggets - voted Top Big Data Twitter - for the latest news on Analytics, Big Data, Data Mining, and Data Science.

1. Learning Languages

Many languages are used for data mining, but the most popular are R, Python, and SQL.
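As a small taste of how two of these languages work together, here is a minimal sketch that runs a SQL aggregate from Python. It assumes only Python's built-in sqlite3 module. The table name, column names, and rows are invented for illustration and do not come from any real dataset.

```python
import sqlite3

# Invented toy rows for illustration only: (customer, amount).
ORDERS = [("ann", 20.0), ("bob", 5.0), ("ann", 30.0), ("cat", 10.0), ("bob", 7.5)]

# An in-memory database keeps the example self-contained (no files created).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", ORDERS)

# SQL does the grouping and summing; Python receives plain tuples.
totals = conn.execute(
    "SELECT customer, COUNT(*), SUM(amount) FROM orders "
    "GROUP BY customer ORDER BY SUM(amount) DESC"
).fetchall()
conn.close()

for customer, n_orders, total in totals:
    print(f"{customer}: {n_orders} orders, total {total:.2f}")
```

The same pattern (load data, aggregate with SQL, inspect the result in Python) carries over to larger databases, although a real project would likely use a dedicated database and an analysis library.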

There are many resources for each, for example

2. Textbooks

There are many data mining and data science textbooks available, but you can check these

3. Data Mining and Data Science Tools

There are many data mining tools for different tasks, but it is best to learn using a data mining suite that supports the entire process of data analysis. You can start with open-source (free) tools such as KNIME, RapidMiner, and Weka.

However, for many analytics jobs you need to know SAS, the leading commercial tool, which is widely used.

Other popular analytics and data mining software includes MATLAB, StatSoft Statistica, IBM SPSS, Microsoft SQL Server, Tableau, IBM SPSS Modeler, and Rattle.

4. Courses and Webinars

There are many online courses, short and long, many of them free - see the KDnuggets online education directory.

Check in particular these courses:

There are also many free webinars and webcasts on the latest topics in Analytics, Big Data, Data Mining, and Data Science.

5. Data

You will need data to analyze - see KDnuggets directory of Datasets for Data Mining, including

6. Competitions

Again, you will learn best by doing, so participate in Kaggle competitions - start with beginner competitions, such as Predicting Titanic Survival using Machine Learning.
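To give a feel for what a first competition entry involves, here is a minimal sketch of a baseline classifier scored on held-out rows. It is not the Kaggle starter code or any official method. The records are invented toy data (not the real competition dataset), and the function names fit_group_baseline, predict, and accuracy are hypothetical helpers written for this example.

```python
from collections import Counter, defaultdict

# Invented toy records, NOT the real Kaggle data: (sex, passenger_class, survived).
TRAIN = [
    ("female", 1, 1), ("female", 2, 1), ("female", 3, 0), ("female", 3, 1),
    ("male", 1, 0), ("male", 1, 1), ("male", 2, 0), ("male", 3, 0), ("male", 3, 0),
]
HOLDOUT = [("female", 1, 1), ("male", 3, 0), ("male", 2, 0), ("female", 3, 0)]


def fit_group_baseline(rows):
    """Learn the most common outcome for each value of the first feature."""
    counts = defaultdict(Counter)
    for sex, _pclass, survived in rows:
        counts[sex][survived] += 1
    # Fallback for groups never seen in training: the overall majority outcome.
    overall = Counter(survived for _, _, survived in rows).most_common(1)[0][0]
    rules = {group: c.most_common(1)[0][0] for group, c in counts.items()}
    return rules, overall


def predict(model, row):
    """Predict the outcome for one row using the learned per-group rule."""
    rules, fallback = model
    return rules.get(row[0], fallback)


def accuracy(model, rows):
    """Fraction of rows whose predicted outcome matches the recorded one."""
    hits = sum(predict(model, row) == row[2] for row in rows)
    return hits / len(rows)


model = fit_group_baseline(TRAIN)
holdout_accuracy = accuracy(model, HOLDOUT)
print(f"holdout accuracy: {holdout_accuracy:.2f}")
```

Scoring on rows the model never saw during fitting is the habit worth building early: a competition leaderboard measures exactly that, and a simple baseline like this gives you a number that any more elaborate model should beat.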

7. Interact: Meetings, Groups, and Social Networks

You can join many peer groups - see Top 30 LinkedIn Groups for Analytics, Big Data, Data Mining, and Data Science.

AnalyticBridge is a community for Analytics and Data Science.

You can attend many Meetings and Conferences on Analytics, Big Data, Data Mining, Data Science, & Knowledge Discovery.

Also, consider joining ACM SIGKDD, which organizes the annual KDD conference - the leading research conference in the field.
