Top 10 Machine Learning Projects on GitHub
The top 10 machine learning projects on GitHub include a number of libraries, frameworks, and educational resources. Have a look at the tools others are using, and the resources they are learning from.
5. Pattern
Web mining module for Python, with tools for scraping, natural language processing, machine learning, network analysis and visualization.
★ 3799, 598
Pattern is a Python-based web mining toolkit from the Computational Linguistics & Psycholinguistics (CLiPS) research center at the University of Antwerp. It provides tools for scraping, natural language processing, machine learning, network analysis, and visualization, and it can mine data from several well-known web services. The project claims to be well documented and to include numerous examples and unit tests.
6. NuPIC (Numenta Platform for Intelligent Computing)
A brain-inspired machine intelligence platform, and biologically accurate neural network based on cortical learning algorithms.
★ 3647, 987
NuPIC implements the Hierarchical Temporal Memory (HTM) machine learning algorithms. HTM is an attempt to model the computation of the neocortex, and focuses on storing and recalling spatial and temporal patterns. NuPIC is ideally suited to pattern-related anomaly detection.
7. Vowpal Wabbit
Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning.
★ 2949, 827
Vowpal Wabbit aims for speedy modelling of massive datasets, and supports parallel learning. The project was started at Yahoo! and is currently developed at Microsoft Research. Vowpal Wabbit harnesses out-of-core learning, and has been used to learn a tera-feature dataset in an hour across 1000 compute nodes.
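To make two of the techniques named above more concrete, here is a minimal pure-Python sketch of online learning combined with feature hashing. It is an illustration under stated assumptions, not Vowpal Wabbit's code or API: the class name, the table size, the learning rate, and the toy data stream are all hypothetical. The point it shows is that the model sees one example at a time and stores weights in a fixed-size table, so memory use does not grow with the size of the dataset or the number of distinct feature names.

```python
import math
import zlib

NUM_BITS = 18                  # hypothetical table size: 2**18 weight slots
NUM_SLOTS = 1 << NUM_BITS


def hashed_index(feature_name):
    """Map a feature name to a weight slot with a stable hash (the hashing trick)."""
    return zlib.crc32(feature_name.encode("utf-8")) % NUM_SLOTS


class OnlineLogisticRegression:
    """Logistic regression trained by SGD, one example at a time (illustrative only)."""

    def __init__(self, learning_rate=0.5):
        self.learning_rate = learning_rate
        self.weights = [0.0] * NUM_SLOTS   # fixed memory, whatever the vocabulary size

    def predict_proba(self, features):
        score = sum(self.weights[hashed_index(name)] * value
                    for name, value in features.items())
        return 1.0 / (1.0 + math.exp(-score))

    def update(self, features, label):
        # Gradient of the log loss with respect to each weight is (p - y) * x.
        error = self.predict_proba(features) - label
        for name, value in features.items():
            self.weights[hashed_index(name)] -= self.learning_rate * error * value


def stream():
    """Stand-in for a dataset too large to hold in memory (made-up examples)."""
    examples = [
        ({"word=cheap": 1.0, "word=pills": 1.0}, 1),
        ({"word=meeting": 1.0, "word=agenda": 1.0}, 0),
    ]
    for _ in range(50):
        yield from examples


model = OnlineLogisticRegression()
for features, label in stream():
    model.update(features, label)   # each example is used once, then discarded
```

Because each example is consumed and then dropped, the same loop works whether the stream comes from a small list or a file far larger than RAM. The price of hashing is that two feature names can occasionally share a slot.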
8. aerosolve
A machine learning package built for humans.
★ 2538, 245
aerosolve attempts to be different from other libraries, focusing on human-friendly debugging facilities, Scala code for training, an image content analysis engine for easy image ranking, and a feature transformation language giving users flexibility and control over features. aerosolve implements a Thrift-based feature representation, wherein features are logically grouped so that transformations can be applied to, or interactions computed between, entire feature groups at once.
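The idea of operating on whole feature groups can be sketched in a few lines of Python. This is not aerosolve's API (the library itself is written in Scala and Java and uses Thrift structures). The nested-dictionary layout, the group names, and both helper functions below are assumptions made purely for illustration.

```python
import math
from itertools import product

# One example, stored as group name -> {feature name -> value}.
# The groups and values are made up for this sketch.
example = {
    "location": {"lat": 37.77, "lng": -122.42},
    "listing": {"bedrooms": 2.0, "price": 150.0},
    "amenity": {"wifi": 1.0, "parking": 1.0},
}


def transform_group(features, group, fn):
    """Return a copy of features with fn applied to every value in one group."""
    out = {name: dict(values) for name, values in features.items()}
    out[group] = {key: fn(value) for key, value in out[group].items()}
    return out


def cross_groups(features, left, right, output):
    """Return a copy with a new group holding one product feature per (left, right) pair."""
    out = {name: dict(values) for name, values in features.items()}
    out[output] = {
        f"{a}^{b}": features[left][a] * features[right][b]
        for a, b in product(features[left], features[right])
    }
    return out


# Apply one transformation to the whole "listing" group, then cross two groups.
logged = transform_group(example, "listing", math.log1p)
crossed = cross_groups(logged, "amenity", "listing", "amenity_x_listing")
```

Grouping means a single instruction such as "log-scale everything in listing" or "cross amenity with listing" covers every feature in the group, including features added to it later.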
9. GoLearn
Machine Learning for Go.
★ 2334, 215
GoLearn is an actively developed machine learning library for Go. Its goal is to provide a fully featured, simple-to-use, customizable package for Go developers. GoLearn implements the fit/predict interface familiar to many from scikit-learn, making it easy to swap out estimators, and implements "helper functions" like cross-validation and train/test splitting.
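For readers who have not met the fit/predict convention, the sketch below shows why it makes estimators interchangeable. It is written in plain Python rather than Go and uses neither GoLearn's nor scikit-learn's actual API. Both classifiers, the `train_test_split` and `accuracy` helpers, and the toy data are hypothetical stand-ins.

```python
import random


class MajorityClassifier:
    """Baseline: always predicts the most common training label."""

    def fit(self, X, y):
        self.label_ = max(set(y), key=y.count)

    def predict(self, X):
        return [self.label_ for _ in X]


class NearestNeighborClassifier:
    """Predicts the label of the closest training row (squared Euclidean distance)."""

    def fit(self, X, y):
        self.X_, self.y_ = list(X), list(y)

    def predict(self, X):
        def nearest_label(row):
            distances = [sum((a - b) ** 2 for a, b in zip(row, seen))
                         for seen in self.X_]
            return self.y_[distances.index(min(distances))]
        return [nearest_label(row) for row in X]


def train_test_split(X, y, test_fraction=0.25, seed=0):
    """Shuffle row indices reproducibly and split into train and test parts."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    cut = int(len(indices) * (1 - test_fraction))
    train, test = indices[:cut], indices[cut:]
    return ([X[i] for i in train], [y[i] for i in train],
            [X[i] for i in test], [y[i] for i in test])


def accuracy(estimator, X_train, y_train, X_test, y_test):
    """Works with any object that offers fit(X, y) and predict(X)."""
    estimator.fit(X_train, y_train)
    predictions = estimator.predict(X_test)
    return sum(p == t for p, t in zip(predictions, y_test)) / len(y_test)


# Toy data: two well-separated clusters.
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2],
     [5.0, 5.1], [5.2, 4.9], [4.8, 5.3], [5.1, 5.0]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

split = train_test_split(X, y)
for estimator in (MajorityClassifier(), NearestNeighborClassifier()):
    print(type(estimator).__name__, accuracy(estimator, *split))
```

The evaluation code never inspects which estimator it was handed. Swapping models is a one-line change, which is the property the shared interface is meant to provide.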
10. Code for Machine Learning for Hackers
Code accompanying the book "Machine Learning for Hackers."
★ 2003, 1446
This repo contains the code from the O'Reilly book Machine Learning for Hackers. All of the code is in R and relies on numerous R packages. Topics covered include the all-too-common tasks of classification, ranking, and regression, as well as statistical procedures such as principal component analysis and multidimensional scaling.
* Determined by the top returned results to the query "machine learning" on GitHub search, sorted by most stars, as of December 10, 2015, 1:00 PM EST.