Continually Updated Data Science IPython Notebooks

A continually updated collection of data science IPython notebooks covering Spark, Hadoop MapReduce, HDFS, AWS, Kaggle, scikit-learn, matplotlib, pandas, NumPy, SciPy, and various command lines.

By Gregory Piatetsky, @kdnuggets.

Here is a good Reddit find: a collection of continually updated IPython notebooks prepared and maintained by Donne Martin.

Data Science IPython Notebooks by Donne Martin

Topics include:
  • Big data processing with Spark, Hadoop MapReduce, HDFS, Amazon Web Services
  • Machine learning with scikit-learn, Kaggle
  • Statistical inference with SciPy and the Python data stack
  • Python data stack with pandas, NumPy, matplotlib
  • Python language essentials geared towards data processing
  • Business analyses (e.g., churn modeling)
  • Various command lines

