PredictionIO (Open Source Version) vs Microsoft Azure Machine Learning

Azure Machine Learning and PredictionIO share a similar vision and similar features, but a closer look reveals key differences and distinct advantages to each.

Azure is lower level than PredictionIO in the sense that it has no way to structure data processing flows, and the size of those flows on the canvas can be overwhelming at times. In PredictionIO, you can build engines by recombining existing DASE components from different engines. In Azure, however, you are stuck with the canvas and have to “copy” everything (blocks and the connections between them) by hand.
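To make the idea of recombining components concrete, here is a minimal sketch in plain Python. It is not PredictionIO's actual API (PredictionIO engines are written against its own framework); the `Engine` class and the functions `ratings`, `mean_baseline`, `median_baseline` and `as_text` are hypothetical names invented for this illustration. It only models the data source, algorithm and serving roles, and leaves evaluation out for brevity.

```python
from dataclasses import dataclass, replace
from statistics import mean, median
from typing import Callable, List

# A trained model maps a query to a numeric prediction.
Model = Callable[[float], float]


@dataclass(frozen=True)
class Engine:
    """Hypothetical engine made of swappable parts (not PredictionIO's API)."""
    data_source: Callable[[], List[float]]      # D: reads training data
    algorithm: Callable[[List[float]], Model]   # A: trains a model on the data
    serving: Callable[[float], str]             # S: formats the prediction

    def predict(self, query: float) -> str:
        model = self.algorithm(self.data_source())
        return self.serving(model(query))


def ratings() -> List[float]:
    # Toy data standing in for a real event store.
    return [1.0, 2.0, 2.0, 5.0, 10.0]


def mean_baseline(data: List[float]) -> Model:
    avg = mean(data)
    return lambda query: avg


def median_baseline(data: List[float]) -> Model:
    mid = median(data)
    return lambda query: mid


def as_text(prediction: float) -> str:
    return f"predicted={prediction:.1f}"


# First engine: mean baseline.
engine_a = Engine(data_source=ratings, algorithm=mean_baseline, serving=as_text)

# Second engine: reuse the data source and serving parts, swap only the algorithm.
engine_b = replace(engine_a, algorithm=median_baseline)

print(engine_a.predict(0.0))
print(engine_b.predict(0.0))
```

Because each part is an ordinary piece of code, building the second engine is a one-line change that can be reviewed and versioned like any other source file, which is the contrast with copying blocks by hand on a canvas.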


Azure ML canvas. Blocks corresponding to PredictionIO’s DASE components are highlighted.

Experimenting with the Azure ML canvas can be fun, but it removes the ability to program data processing flows, which can limit the types of applications you can build with it. You also won’t be able to use your favorite version control system to track changes to your model or engine.

If you’re still unsure which tool is best for you, you should try them both! If you have already, please share your thoughts in the comments. Note that Microsoft customers who already have data on Azure can still try PredictionIO by installing it on an Azure VM and connecting it to their Hadoop cluster. Others can use Amazon or for a quick start. In an upcoming post, I’ll compare PredictionIO to Dato (formerly known as GraphLab), another open source alternative for creating predictive applications.

Bio: Louis Dorard is an independent consultant and General Chair of the International Conference on Predictive APIs and Apps. He is also the author of the book Bootstrapping Machine Learning. Louis holds a PhD in Machine Learning from University College London.