Standardizing the World of Machine Learning Web Service APIs

We introduce the Protocols and Structures for Inference (PSI) API specification, which enables the delivery of flexible machine learning services by specifying how datasets, learning algorithms and predictors can be presented as web resources.

By James Montgomery (ICT).

Since the Weka machine learning toolkit was first released 16 years ago, the range of software packages for ML has grown substantially and now includes Orange, Shogun and scikit-learn. Each toolkit is written in a different programming language and has its own API, its own strengths and weaknesses, and its own loyal following. But this variety works against more widespread adoption of ML techniques, since any potential new user (individual or organisation) must invest considerable time in whichever toolkit they choose. Switching to a different toolkit, or combining the best features of several, is non-trivial.

One way of improving the accessibility of ML toolkits is to provide an abstracted interface to them over the web. Software as a Service (SaaS) eliminates the need for users to install and manage these toolkits, or to learn their language-specific APIs. Over the last five or so years many ML web services have sprung up, including Google’s Prediction API, Microsoft’s Azure ML, Amazon ML, and a host of start-ups such as BigML, PredicSis and Dataiku. Like many other web services in the last decade, many of these ML services claim to have RESTful APIs, although these claims are often fairly weak and boil down to the fact that they communicate over HTTP and use the widely understood JSON data exchange format.

Despite this flourishing ecosystem of ML services, improvements in accessibility and interoperability have not necessarily followed. The language each service speaks (HTTP and JSON) may now be consistent, but each service presents its own distinct API while necessarily offering only a subset of inference tools. Composing these existing services to combine their features requires the client to understand each distinct API and to perform a considerable amount of data conversion on the client side.

At PAPIs ’15 this year we’ll introduce the Protocols and Structures for Inference (PSI) API specification for delivering flexible ML services. Rather than being a specific implementation, PSI specifies how datasets, learning algorithms and predictors can be presented as web resources. Each resource uses a schema to describe its characteristics: dataset attributes can describe the data format they produce, learners can describe the structure of the learning tasks they can process, and predictors can describe the types of data they accept and the form of the predictions they make. Because a client can compare these schemas before connecting one resource to another, PSI resources can be composed safely. And as an open specification, the PSI API can be implemented by anyone: a large company could provide a full-featured ML service, a research team could provide datasets, an individual could provide a single predictor, and all these resources can interoperate.
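To make the idea of schema-checked composition concrete, here is a minimal sketch in Python. It is not PSI's actual validation logic: the function name `is_compatible` and the example schemas are hypothetical, and the check covers only a small, JSON-Schema-like subset (`type`, `items`, `properties`, `required`). It illustrates how a client might decide, before sending any data, whether the values a dataset attribute produces are acceptable to a predictor.

```python
# Illustrative only: a simplified compatibility check over a small
# JSON-Schema-like subset. Names and schemas here are hypothetical
# and are not taken from the PSI specification.

def is_compatible(produced, accepted):
    """Return True if values described by `produced` satisfy `accepted`."""
    p_type = produced.get("type")
    a_type = accepted.get("type")

    if a_type is None:
        return True                      # consumer places no constraint
    if p_type == "integer" and a_type == "number":
        return True                      # every integer is also a number
    if p_type != a_type:
        return False                     # includes the "producer says nothing" case

    if a_type == "array":
        # Element schemas must be compatible too.
        return is_compatible(produced.get("items", {}),
                             accepted.get("items", {}))

    if a_type == "object":
        # Every property the consumer requires must be produced
        # with a compatible schema.
        p_props = produced.get("properties", {})
        a_props = accepted.get("properties", {})
        return all(
            name in p_props
            and is_compatible(p_props[name], a_props.get(name, {}))
            for name in accepted.get("required", [])
        )

    return True                          # matching primitive types


# Hypothetical schemas: a numeric-vector attribute, a text attribute,
# and a predictor that accepts numeric vectors.
measurements = {"type": "array", "items": {"type": "integer"}}
species = {"type": "string"}
predictor_accepts = {"type": "array", "items": {"type": "number"}}

print(is_compatible(measurements, predictor_accepts))
print(is_compatible(species, predictor_accepts))
```

The check is deliberately conservative: if the producer's schema says nothing about a type the consumer requires, the pairing is rejected rather than assumed safe.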

PSI is also mostly RESTful, certainly more so than the majority of ML web services. We don’t hold that RESTfulness is a goal in its own right. Rather, the API has been designed to enable three things: automated discovery of a PSI service’s data, learners and predictors; flexible customisation of client controls without prior knowledge of the learner or predictor being used; and extension of the API, by allowing services to provide links to related resources in a manner similar to the link element of HTML. We believe PSI is an excellent basis for new ML services, and that providing PSI interfaces to existing services will improve their utility and broaden their user bases.
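Link-driven discovery of this kind can be sketched in a few lines of Python. Everything in the example is an assumption made for illustration: the field names `kind`, `links`, `rel` and `href`, the URIs, and the in-memory `FAKE_SERVICE` standing in for real HTTP responses are all hypothetical, not PSI's actual representation format. The point is only that a client given a single entry URI can find a service's datasets, learners and predictors by following links, with no hard-coded knowledge of the URI layout.

```python
from collections import deque

# Hypothetical resource representations, keyed by URI. A real client
# would obtain each one with an HTTP GET and parse the JSON body.
FAKE_SERVICE = {
    "/": {"kind": "service", "links": [
        {"rel": "data", "href": "/data/iris"},
        {"rel": "learner", "href": "/learn/knn"},
    ]},
    "/data/iris": {"kind": "dataset", "links": []},
    "/learn/knn": {"kind": "learner", "links": [
        {"rel": "example-predictor", "href": "/infer/iris-knn"},
    ]},
    "/infer/iris-knn": {"kind": "predictor", "links": [
        {"rel": "trained-by", "href": "/learn/knn"},  # back-link forms a cycle
    ]},
}


def discover(fetch, root):
    """Breadth-first walk over links, grouping the URIs found by resource kind."""
    seen = set()
    queue = deque([root])
    found = {}
    while queue:
        uri = queue.popleft()
        if uri in seen:
            continue                     # already visited; guards against cycles
        seen.add(uri)
        resource = fetch(uri)
        found.setdefault(resource["kind"], []).append(uri)
        for link in resource.get("links", []):
            queue.append(link["href"])
    return found


catalogue = discover(FAKE_SERVICE.__getitem__, "/")
for kind, uris in catalogue.items():
    print(kind, uris)
```

Because `discover` takes the fetch function as an argument, the same traversal would work against a live service by passing a function that performs the HTTP request instead of a dictionary lookup.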

So why not come and see why we’re excited about the future of standardised predictive web APIs in Sydney this August? The conference is also the first event ever where the people behind Google Prediction API, Amazon ML, Microsoft Azure ML and BigML will all be on stage.

Learn more about PAPIs ’15, the 2nd International Conference on Predictive APIs and Apps, at

James Montgomery is a lecturer in ICT in the School of Engineering and ICT at the University of Tasmania. He worked on PSI’s development during a postdoc at the Australian National University, where he was based in the Machine Learning Research Group at National ICT Australia.