Machine Learning Wars: Amazon vs Google vs BigML vs PredicSis

Comparing 4 Machine Learning APIs (Amazon Machine Learning, BigML, Google Prediction API and PredicSis) on real data from Kaggle, we find the most accurate, the fastest, the best tradeoff, and a surprise last place.





Approximate rank in the Kaggle competition
  • #60 for Amazon
  • #570 for PredicSis
  • #770 for BigML
  • #810 for Google
It’s important to note that, depending on your application, some of these 3 performance measures (accuracy, training time and prediction time) will be more critical than others. The leaderboard for this Kaggle challenge doesn’t take time into account, but for applications where you have to make predictions at high frequency (for instance, predicting whether each user arriving on a high-traffic website is going to click on an ad), prediction time will be super critical.
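If prediction time matters for your application, it is worth measuring it on your own inputs. Here is a minimal sketch of how that could look in Python. The names `time_predictions` and `fake_predict` are made up for illustration: `fake_predict` is only a stand-in for a real API call, and none of this is the code used for the comparison above.

```python
import statistics
import time


def time_predictions(predict, inputs):
    """Call predict(x) once per input and collect per-call latencies in seconds."""
    latencies = []
    for x in inputs:
        start = time.perf_counter()
        predict(x)
        latencies.append(time.perf_counter() - start)
    return {
        "mean_s": statistics.mean(latencies),
        "max_s": max(latencies),
        "total_s": sum(latencies),
    }


def fake_predict(x):
    """Hypothetical stand-in for a call to a prediction API."""
    return x * 2


timings = time_predictions(fake_predict, range(100))
print(timings)
```

With a real service, `predict` would wrap the HTTP request, so the numbers would also include network latency, which is usually what matters for a high-traffic website.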

Summary

DISCLAIMER: this comparison was performed with a real-world dataset, but you may get different results with another dataset. You should try these APIs with your own data to figure out which is the best for you!
  • PredicSis offered the best trade-off between accuracy and speed by being the second fastest and second most accurate
  • BigML was the fastest in both training and predictions, but less accurate
  • Amazon was the most accurate, but at the cost of being the slowest in training and also very slow in predictions
  • Google was last on accuracy and prediction time


Towards an actual benchmark

This was a very simple comparison but it’s still a bit far from an actual benchmark. One of the first things I’d like to do to improve on this would be to make it easy for others to reproduce (and verify) these results. I used the web interfaces of these services to get the AUC values and it would be better to have code that computes AUC locally. For now, you can check out this repo for evaluating ML/prediction APIs. Pull requests are welcome! (e.g. new APIs, new evaluation metrics, etc.)
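As a starting point for computing AUC locally, here is a minimal sketch in plain Python. It uses the rank-sum (Mann-Whitney) formulation of AUC and assumes labels are 0/1 and scores are the numeric predictions returned by an API. The function name `auc_score` is my own choice for illustration; this is not necessarily how any of the four services compute the values shown in their web interfaces.

```python
def auc_score(labels, scores):
    """Area under the ROC curve, computed from ranks (ties get average ranks)."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the end of the group of examples sharing the same score.
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied group
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1

    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC is undefined unless both classes are present")

    rank_sum_pos = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


print(auc_score([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

Feeding the same held-out labels and each API's predicted scores through one function like this would make the accuracy numbers directly comparable and easy for others to verify.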

In a future benchmark, it would be interesting to also try regression problems, and to try various types of datasets: small, big, unbalanced, etc.

Learn more
  • If you’d like to learn more about PredicSis and BigML, they will both be at PAPIs Connect on 21 May in Paris — come join us!
  • BigML will also be at APIdays Mediterranea on 7 May in Barcelona with an exciting talk by their CTO on the future of ML APIs.
  • I’m giving away free tickets to both conferences! Sign up here for PAPIs Connect and here for APIdays Mediterranea.
  • With these new ML/prediction APIs, I’m thinking of updating my book, Bootstrapping Machine Learning, in which I already covered Google Prediction and BigML… But until then, you might be interested in checking out an excerpt of the current edition in my Machine Learning Starter Kit!



Bio: Louis Dorard, @louisdorard is an independent consultant and General Chair of PAPIs.io, the International Conference on Predictive APIs and Apps. He is also the author of the book Bootstrapping Machine Learning. Louis holds a PhD in Machine Learning from University College London.
