H2O World 2015 – Day 2 Highlights

Highlights from talks delivered by machine learning experts from H2O.ai, Jawbone, Stanford, Quora and PayPal at H2O World, held in Mountain View.

The machine learning community gathered last week (Nov 9-11) at the Computer History Museum in Mountain View for a very successful conference, H2O World 2015. H2O is the leading open source machine learning platform for smarter applications. H2O.ai was selected as a Gartner Cool Vendor in Data Science for 2015.

Over the three days there were many great tutorials and talks, some of them from widely recognized machine learning experts.

Highlights from Day 1

Here are highlights from Day 2:

Sri Ambati (CEO, H2O.ai) and his team talked about some great features of H2O-3. Talking about algorithms, Arno Candel (Chief Architect, H2O.ai) mentioned that H2O-3 has GLRM (Generalized Low Rank Modeling), which unifies various data science methodologies such as PCA, K-means, matrix factorization and imputation. Coordinate descent and L-BFGS are new solvers for GLM. H2O-3 also provides N-fold cross-validation for all algorithms, multinomial GLM, and more. He gave a quick demo of these features.
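To make the "unifies" claim concrete, here is the low-rank objective as it is usually written in the GLRM literature. The notation is an assumption borrowed from that literature, not something shown in this recap: A is the m-by-n data table, Ω is the set of observed entries, x_i is row i of X, y_j is column j of Y, L_ij is a per-entry loss and r_i, r̃_j are regularizers.

```latex
\min_{X \in \mathbb{R}^{m \times k},\; Y \in \mathbb{R}^{k \times n}}
\sum_{(i,j) \in \Omega} L_{ij}\bigl(x_i y_j,\, A_{ij}\bigr)
+ \sum_{i=1}^{m} r_i(x_i)
+ \sum_{j=1}^{n} \tilde{r}_j(y_j)
```

With quadratic loss and no regularization this reduces to PCA. With quadratic loss and each x_i constrained to have a single entry equal to 1 and zeros elsewhere, it becomes K-means. Entries outside Ω can be filled in with x_i y_j, which is how imputation fits the same template.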

Cliff Click (CTO, H2O.ai) and Matt Dowle (main author of R's data.table and Hacker, H2O.ai) gave a tutorial on working with H2O using Python and R. Matt mentioned that data.table's radix join has been parallelized and distributed. A high-cardinality 1bn/1bn/1bn row join takes about 10 minutes with data.table, while it takes just 1.4 minutes on a 4-node, 128-core H2O setup. He also demonstrated this join.

Monica Rogati, VP of Data at Jawbone, gave a keynote on the future world of "data natives": people who expect technology to learn and adapt. She talked about what it takes to build data products, from analytics, exploration and technical challenges to the role of user feedback and machine learning. Monica said, "The world is eating the software and we are on the way to build that world". Data products should be smart and context-sensitive. Understanding the data correctly can help solve many problems around the world. She mentioned the following key ingredients for understanding data:
  • Good Instrumentation
  • Reliable Data Flow
  • Data Cleaning
  • Fast Iteration
  • Good User Interface
  • Context

She stated the following key ingredients for building successful data products:
  • Obsessive Instrumentation
  • Production-Grade Data Flow
  • Impeccable Data Cleaning
  • Faster Iteration
  • Machine Learning

Prof. Stephen Boyd of Stanford delivered a lecture-style talk on "Consensus Optimization and Machine Learning". After introducing convex optimization problems, he noted that convex optimization has many applications. Through various examples, he emphasized that model fitting via convex optimization works really well in practice but is not used as widely as it should be. After briefly describing how to convert a problem into consensus form, he showed that consensus ADMM can do machine learning across distributed data sources without moving the data, while obtaining the same model as if all the data had been collected in one place.
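The consensus idea can be illustrated with a small sketch. This is not code from the talk: the least-squares loss, the toy data, the penalty `rho`, and every function name below are assumptions chosen for illustration. Each shard keeps its own rows, only two-element parameter vectors are exchanged, and the result is compared with a fit on the pooled data.

```python
# Consensus ADMM for a 2-parameter least-squares fit (illustrative sketch only).

def solve_2x2(m, v):
    """Solve the 2x2 linear system m @ x = v by Cramer's rule."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [(v[0] * d - b * v[1]) / det, (a * v[1] - c * v[0]) / det]


def gram(rows, targets):
    """Return (A^T A, A^T b) for rows of the form [x, 1.0]."""
    ata = [[0.0, 0.0], [0.0, 0.0]]
    atb = [0.0, 0.0]
    for row, t in zip(rows, targets):
        for i in range(2):
            atb[i] += row[i] * t
            for j in range(2):
                ata[i][j] += row[i] * row[j]
    return ata, atb


def consensus_admm(shards, rho=1.0, iters=200):
    """Fit y ~ w*x + c while every shard keeps its own (rows, targets)."""
    stats = [gram(rows, targets) for rows, targets in shards]  # computed locally
    n = len(shards)
    z = [0.0, 0.0]                    # global consensus parameters
    u = [[0.0, 0.0] for _ in shards]  # scaled dual variables, one per shard
    for _ in range(iters):
        xs = []
        for (ata, atb), ui in zip(stats, u):
            # Local step: argmin 0.5*||A_i x - b_i||^2 + (rho/2)*||x - z + u_i||^2
            lhs = [[ata[0][0] + rho, ata[0][1]], [ata[1][0], ata[1][1] + rho]]
            rhs = [atb[k] + rho * (z[k] - ui[k]) for k in range(2)]
            xs.append(solve_2x2(lhs, rhs))
        # Averaging step: only parameter vectors leave the shards, never the data.
        z = [sum(x[k] + ui[k] for x, ui in zip(xs, u)) / n for k in range(2)]
        # Dual update: accumulate each shard's disagreement with the consensus.
        u = [[ui[k] + x[k] - z[k] for k in range(2)] for x, ui in zip(xs, u)]
    return z


def pooled_fit(shards):
    """Reference answer: ordinary least squares on all data gathered in one place."""
    rows = [r for shard_rows, _ in shards for r in shard_rows]
    targets = [t for _, shard_targets in shards for t in shard_targets]
    ata, atb = gram(rows, targets)
    return solve_2x2(ata, atb)


# Toy data (made up): three shards, each with a different intercept shift,
# so no single shard's local fit equals the pooled fit.
xs_per_shard = [[-1.0, -0.4, 0.3, 0.9], [-0.8, -0.1, 0.5, 1.0], [-0.9, -0.3, 0.2, 0.7]]
bias_per_shard = [0.3, -0.2, 0.1]
shards = [
    ([[x, 1.0] for x in xs], [2.0 * x + 1.0 + bias for x in xs])
    for xs, bias in zip(xs_per_shard, bias_per_shard)
]

consensus = consensus_admm(shards)
pooled = pooled_fit(shards)
print("consensus ADMM:", consensus)
print("pooled fit:    ", pooled)
```

On this toy problem the two printed parameter vectors should agree to within numerical tolerance. In a real deployment the local step would run on each data source and only `x`, `u` and `z` would cross the network.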