This list is my summary of the Quora question "What are the best Python 2.7 modules for data mining?"
Basics:
- numpy - numerical library, numpy.scipy.org/
- scipy - Advanced math, signal processing, optimization, statistics, www.scipy.org/
- matplotlib - Python plotting library, matplotlib.org
- MDP, a collection of supervised and unsupervised learning algorithms, pypi.python.org/pypi/MDP/2.4
- mlpy, Machine Learning Python, mlpy.sourceforge.net
- NetworkX, for graph analysis, networkx.lanl.gov/
- Orange, Data Mining Fruitful & Fun, biolab.si
- pandas, Python Data Analysis Library, pandas.pydata.org
- pybrain, pybrain.org
- scikit-learn - classic machine learning algorithms; provides simple and efficient solutions to learning problems, scikit-learn.org/stable/
- NLTK, Natural Language Toolkit, nltk.org
- Scrapy, an open source web scraping framework for Python, scrapy.org
- urllib/urllib2, standard-library modules for opening and fetching URLs