Report – MLconf: what industry leaders say about machine learning

MLconf is hosted in 4 different cities (NYC, Seattle, Atlanta and San Francisco), with speakers from big, established companies and from emerging startups, bringing more ideas and experience into the game.

By Nick Vasiloglou (MLconf Technical Chair).

MLconf is a community event for machine learning practitioners from industry, as well as for theoreticians working in industry.

Theoreticians? Yes, because as Kurt Lewin said, “There is nothing more practical than a good theory”! Speakers are free to share the progress and results of their own work in machine learning. Attending presentations over the past 3 years of MLconf events, I’ve enjoyed observing the challenges and innovation paths of ML. First of all, it is really encouraging that after so many decades of research in machine learning, and with so many algorithms in the literature, new ones are still being developed and used. MLconf speakers present their results and demonstrate the speed at which these methods get absorbed into production systems.

A great example is Steffen Rendle’s factorization machines, invented in 2012, which he presented last November in San Francisco. Many other presenters, such as Justin Basilico from Netflix, who presented last year in Atlanta, have already mentioned the success they had in production. An upcoming talk in this vein is Retail Demand Forecasting with Machine Learning, which will be presented by Ron Menich on March 27th in NYC. Another interesting theme has been the introduction of machine learning practices into business operations. Elizabeth Elhassani from LexisNexis and Dan Mallinger from Think Big Analytics presented their experiences with this last year in Atlanta. This year, Dan Mallinger will return to New York to present his views on that problem. As always, distinguished scientists from big companies, such as Corinna Cortes from Google, Ewa Dominowska from Facebook, and Edo Liberty from Yahoo, have focused on scaling simple algorithms like nearest neighbors and decision trees with smart algorithmic and engineering tricks.

It is a common belief in the machine learning community that adding more data improves accuracy, so it makes sense to focus on scaling. Last year, Xavier Amatriain from Netflix, now at Quora, gave a talk (10 lessons learned from building real life machine learning systems) that shook the audience and destroyed the myth that the power is only in the data, explaining when this is true and when the power is in the algorithms. The talk made a lot of other interesting points as well.

Of course, it has been fascinating to watch the evolution of machine learning platforms like H2O, Graphlab (now Dato), Oracle, Ayasdi, Intel, Systap, Spark, and Ufora (added this year), and how they position their value proposition. MLconf has provided an x-ray of their timelines and of how they are reaching maturity. Most of them started by developing fast and scalable algorithm implementations, and began to gain massive adoption once they properly addressed the necessary engineering requirements.

Deep learning has a strong presence too, with presentations from companies such as Pandora and Google. This year in New York, we are particularly interested in seeing the details of a large-scale implementation with GPUs at Facebook. We started with the pleasant surprise of new algorithms making it into production, and we want to close with the new paradigms that are emerging, such as probabilistic programming, which Lise Getoor presented last year in San Francisco.

This year we are very excited to have MLconf hosted in 4 different cities (NYC, Seattle, Atlanta and San Francisco), with speakers from big, established companies and from emerging startups, bringing more ideas and experience into the game. Mention “kdnuggets” when registering and save 15% on tickets!

Nick Vasiloglou is a Technical Chair of MLconf - The Machine Learning Conference. His PhD from Georgia Tech was focused on scalable machine learning over massive datasets. His work has resulted in patents and production systems.