Gregory Piatetsky, Feb 23, 2012.
The Skytree startup emerged from stealth mode today and announced its product, Skytree Server, a machine learning engine.
The secret sauce, according to co-founder and CEO Martin Hack, is that Skytree has figured out how to make key machine learning algorithms very fast at large scale, up to 10,000 times faster according to Skytree benchmarks (see below). Hack is the former head of Trusted Solaris, the secure operating system, at Sun Microsystems.
The development of fast machine learning algorithms was led by Alex Gray (with whom I worked on starting KDD conferences 20 years ago), a professor at Georgia Tech and Director of FASTlab, which focuses on scalable machine learning.
Skytree Server can also be called from inside R or another familiar front end.
Skytree algorithms are currently oriented towards five common applications:
- Recommender systems - providing profile-based targeted recommendations (e.g., products)
- Anomaly/outlier identification - finding unusual or 'special case' data records in big data sets
- Predictive analytics - making predictions based on similar historic data
- Clustering and market segmentation - finding natural groups within data
- Similarity search - finding the existing data records closest to a record of interest
The underlying algorithms include:
- Fast Nearest Neighbors
- Fast K-Means Clustering
- Fast Support Vector Machines Classification
- Fast Linear Regression
- Fast Kernel Density Estimation
- Fast Principal Component Analysis/Singular Value Decomposition
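To make the similarity search application concrete, here is a minimal sketch of nearest-neighbor search in plain Python. It is a brute-force baseline, not Skytree's method: it compares the query against every record, which is exactly the cost that "fast" nearest-neighbor algorithms aim to avoid. The function name `nearest_neighbors` and the toy records are hypothetical and chosen only for illustration.

```python
import math


def nearest_neighbors(records, query, k=1):
    """Return the k records closest to `query` by Euclidean distance.

    Brute-force baseline (an illustration, not Skytree's algorithm):
    every query computes a distance to every record.
    """
    ranked = sorted(records, key=lambda r: math.dist(r, query))
    return ranked[:k]


# Hypothetical toy data: each record is a (feature_1, feature_2) tuple.
records = [(1.0, 2.0), (4.0, 4.5), (0.5, 1.5), (9.0, 9.0)]
closest = nearest_neighbors(records, query=(1.0, 1.0), k=2)
print(closest)
```

With n records, each query here costs n distance computations plus a sort, so the work grows with the size of the data set. That growth is why large-scale similarity search generally depends on faster algorithms than this baseline.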
Skytree provides a free edition, limited to 100,000 rows, and an enterprise server version, starting at $2,999.
For a little more technical detail, see the white paper "Analyzing Massive Datasets" by Alexander Gray, Ph.D., CTO of Skytree.
Other coverage of Skytree: