I recently came across a Quora question: What are the most interesting open-source projects in artificial intelligence and machine learning?
Besides well-known tools like R, Weka, RapidMiner, Knime, and Orange (see
Free Software Suites for Data Mining, Analytics, and Knowledge Discovery for the directory),
here are some of the more interesting and lesser-known software systems for machine learning, data mining, and data science:
- Apache Mahout (mahout.apache.org/): a machine learning library built on top of Hadoop for scalability, with lots of great examples and libraries for getting started.
- mloss (mloss.org/software/): a catalog of open-source machine learning software that gives a brief idea of the current state of the art in machine learning and its open-source tools.
- Natural Language Toolkit for Python (NLTK): nltk.org/
- GraphLab (A New Parallel Framework for Machine Learning): graphlab.org/
- scikits.learn (general-purpose machine learning in Python): scikit-learn.sourceforge.net/stable/
- Vowpal Wabbit: a fast online learning system that looks at one data sample at a time
Two important tools missing here are Mallet (mallet.cs.umass.edu/) and Factorie (factorie.cs.umass.edu/).
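
To make the "one data sample at a time" idea from the Vowpal Wabbit entry concrete, here is a minimal sketch of online learning in plain Python. It is not Vowpal Wabbit's code or API; the function names (`online_sgd`, `sample_stream`), the toy data, and the learning rate are all assumptions made for illustration. The model sees each example once as it streams past, updates its weights, and never holds the full data set in memory.

```python
from typing import Iterable, Iterator, List, Tuple

Example = Tuple[List[float], float]  # (feature vector, target value)


def online_sgd(stream: Iterable[Example], n_features: int,
               learning_rate: float = 0.1) -> Tuple[List[float], float]:
    """Fit a linear model with squared loss, one example at a time.

    Hypothetical helper for illustration; not part of any library.
    """
    weights = [0.0] * n_features
    bias = 0.0
    for features, target in stream:
        # Predict with the current model.
        prediction = bias + sum(w * x for w, x in zip(weights, features))
        error = prediction - target
        # Gradient step on this single example only.
        for i, x in enumerate(features):
            weights[i] -= learning_rate * error * x
        bias -= learning_rate * error
    return weights, bias


def sample_stream(n_passes: int = 200) -> Iterator[Example]:
    """Yield toy examples drawn from y = 2x + 1 (made-up data)."""
    data = [([0.0], 1.0), ([1.0], 3.0), ([2.0], 5.0), ([3.0], 7.0)]
    for _ in range(n_passes):
        yield from data


if __name__ == "__main__":
    weights, bias = online_sgd(sample_stream(), n_features=1)
    # The learned slope and intercept should approach 2 and 1.
    print(weights, bias)
```

Because the stream is a generator, the same loop would work on a file or socket too large to load at once, which is the main appeal of online learners.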