Exclusive: Interview with Jeremy Howard on Deep Learning, Kaggle, Data Science, and more

My exclusive interview with rock star Data Scientist Jeremy Howard, on his latest Deep Learning course, what is needed for success in Kaggle, how Enlitic is transforming medical diagnostics, and what Data Scientists should do to create value for their organization.

Jeremy Howard, @jeremyphoward, is a true rock star in the world of Data Science. He was a precocious child, receiving some of the highest scores on tests in Australia, but was bored in school. He began an entrepreneurial career at age 12, selling pirated computer games, and was hired by McKinsey at age 18, as a self-taught computer and data analysis wizard. After a few years there he started Optimal Decisions Group which used data analysis to help insurance firms increase profits. His second startup, FastMail, was a very popular email provider. After successfully selling both in late 2000, he briefly retired, and took up hobbies like learning Chinese and building amplifiers.

Looking for an intellectual challenge, he entered a competition in 2010 at Kaggle, and was surprised to win first place. He joined Kaggle as President and Chief Scientist and helped grow Kaggle to its dominant position. He left Kaggle in Dec 2013. In 2014 he started Enlitic with the mission of using Deep Learning to improve medical diagnostics and clinical decisions.

I first met Jeremy at the KDD-2011 (?) conference, where he gave an unforgettable talk about Deep Learning. He did not have any slides or projector, but took a marker and proceeded to write on a whiteboard (the only such speaker I can remember in the history of KDD), explaining his ideas with surprising clarity and brilliance.

Jeremy's latest startup is fast.ai - read the details below.

See also an in-depth profile of Jeremy and Enlitic in the Sydney Morning Herald (May 2016), and his TED talk, "The wonderful and terrifying implications of computers that can learn", which gathered almost 2 million views.

Gregory Piatetsky, Q1. Tell us about your latest start-up fast.ai - what is it planning to do? How is your course "Deep Learning for Coders" different from other Deep Learning courses?

Jeremy Howard: There are a number of deep learning courses available online, but none of them met what we felt were the most important needs. We wanted to show people how to select and use the most effective deep learning techniques for their problems. And we wanted to make it as accessible as possible, without dumbing it down.

Previous approaches were either highly mathematical (such as the Oxford course) or too high level to be of much use in solving anything but the most basic problems (such as the Udacity course).

We have seen again and again that deep learning can provide state-of-the-art results, but to get these results requires getting a lot of little details right. And these little details are not shared in papers or in books or in online courses. They're the kinds of things which get discussed directly among practitioners. Furthermore, we have seen very little discussion about important practical matters like: how to train your model in a reasonable amount of time, spending a reasonable amount of money.

We realised, based on our analysis of and solutions to a number of deep learning projects, that the most important thing for us to teach is transfer learning. That refers to using existing models, which have already been trained on large datasets, to provide a helpful starting point for your model. Using transfer learning can speed up training time by many orders of magnitude, provide much more accurate models, and require much less data.
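(Editor's illustration, not from the interview or the fast.ai course.) The idea of transfer learning can be sketched in a few lines of plain Python. The sketch assumes a tiny hypothetical "pretrained" feature extractor whose weights are frozen, and trains only a new logistic-regression head on a small toy dataset. The names `PRETRAINED_WEIGHTS`, `extract_features`, `train_head`, `predict`, and `TRAIN_DATA` are all made up for this example; a real project would load an actual pretrained network from a deep learning library instead.

```python
import math

# Hypothetical stand-in for a model already trained on a large dataset.
# These weights are frozen: the training loop below never updates them.
PRETRAINED_WEIGHTS = [[1.0, 1.0], [1.0, -1.0]]

# Small toy dataset for the new task: (input, label) pairs.
TRAIN_DATA = [
    ([0.1, 0.2], 0), ([0.3, 0.1], 0), ([0.2, 0.4], 0), ([0.4, 0.3], 0),
    ([1.5, 1.0], 1), ([1.2, 1.4], 1), ([1.8, 0.9], 1), ([1.0, 1.6], 1),
]


def extract_features(x):
    """Frozen layer: a linear map followed by ReLU."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)))
            for row in PRETRAINED_WEIGHTS]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_head(data, epochs=200, lr=0.1):
    """Train only a new logistic-regression head on the frozen features."""
    head_w = [0.0, 0.0]
    head_b = 0.0
    for _ in range(epochs):
        for x, y in data:
            f = extract_features(x)
            p = sigmoid(sum(w * fi for w, fi in zip(head_w, f)) + head_b)
            err = p - y  # gradient of the log loss w.r.t. the logit
            head_w = [w - lr * err * fi for w, fi in zip(head_w, f)]
            head_b -= lr * err
    return head_w, head_b


def predict(x, head_w, head_b):
    """Return the predicted class (0 or 1) for input x."""
    f = extract_features(x)
    z = sum(w * fi for w, fi in zip(head_w, f)) + head_b
    return 1 if sigmoid(z) >= 0.5 else 0


head_w, head_b = train_head(TRAIN_DATA)
correct = sum(predict(x, head_w, head_b) == y for x, y in TRAIN_DATA)
print(f"training accuracy: {correct}/{len(TRAIN_DATA)}")
```

Because only the small head is trained, the new task needs few examples and few updates; the frozen layer supplies the features.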

We also wanted to separate out the latest research fads from those things which really work. So we made sure that we only taught methods which we could actually show can provide state-of-the-art results on real-world problems. We've heard from a lot of people now that our deep learning MOOC has allowed them to dramatically improve the accuracy and speed of their model, so it seems to be working!

GP: Q2. Before fast.ai, in 2014 you founded Enlitic, whose goal is to use deep learning to make doctors faster and more accurate, initially in radiology. What progress did they make and how do they compare with trained radiologists?

JH: I don't know the latest, because I haven't been there for quite a few months. However, everything I saw in my time studying medical deep learning showed that the opportunities here are enormous. It is a huge field, with many specialties and sub-specialties, and everywhere we looked we saw opportunity. Most importantly, the opportunities have the potential to save lives, and greatly reduce healthcare costs, particularly in the developing world; this is where the needs are greatest.

(GP: The Sydney Morning Herald reports:
Enlitic pitted its algorithm against four top radiologists. The humans failed to spot 7% of the cancers. Enlitic identified them all. The humans incorrectly diagnosed cancer in 66% of cases, Enlitic in 47%.)

GP: Q3. What are the obstacles for adopting Enlitic and similar automated technology in healthcare?

JH: One of the greatest obstacles is the lack of integrated datasets - that is, datasets that show a history of medical tests, interventions, and outcomes, over a long period of time, linked together for each patient. It is only with such a dataset that you can build models which can provide diagnoses and treatment recommendations based on actual medical outcomes, rather than initial diagnostic guesses.

Another obstacle is the lack of data scientists working in this field. I am surprised at how many smart and capable people are deciding to spend their time on relatively low impact areas like advertising technology, product recommendations, and minor social network features. Also, a lot of deep learning researchers are focused on "building a brain", rather than on solving current problems of most significance to humanity.

A particular obstacle that surprised me is that medical experts are so specialised that it is hard to find anybody who can provide educated advice about more general medical problem solving. Deep learning has the ability to solve problems right across the medical spectrum, so the traditional specialised approach to medicine is quite a roadblock.

GP: Q4. You are probably most well-known for being a top-ranked Kaggle competitor and later Kaggle President. What were the highlights of your time at Kaggle? What advice do you have for competitors that want to improve their Kaggle Ranking?

JH: My time competing on Kaggle was the highlight for me - in fact, I learned more about machine learning during that time than in the two decades prior to it. Another highlight has been the great enjoyment I have taken in the last few months studying a number of Kaggle datasets in some depth, in preparation for our course; it's been fascinating to see how some recent advances in deep learning make it possible to get (what would have been) the highest rank in some competitions very quickly and easily.

For competitors who wish to improve their ranking, or indeed for any machine learning practitioners who wish to improve their skills, my advice is simple:
submit an entry to a competition every single day.

Ideally, try to spend at least 30 minutes creating that entry; but even just spending five minutes tweaking some parameters is better than nothing. If you submit something every day, then by the end of the competition you will have learned a great deal, and you will learn even more when the winners' blog posts come out. In your day-to-day work you will have very few, if any, opportunities to work on such rigorously defined datasets and metrics, and you certainly will not have a chance to benchmark yourself against some of the world's best data scientists.

GP: Q5. With Data Science also being increasingly subject to automation, what skills should Data Scientists focus on to avoid being replaced by an algorithm, at least in the next 5 years?

JH: I hope that in the coming years the role of "data scientist" will greatly lessen, and instead we will see data science being incorporated into other jobs, such as medical specialists, lawyers, logistics managers, and so forth. Therefore, I think that data scientists should develop an understanding of how organisations create value, and how different industries work, along with how organisations are structured. Most importantly, they should find ways to rigorously test their impact on the organisations they work for, and work with domain experts at those organisations to find out how to increase the impact.

I am not sure which, if any, of today's technical skills will still be important in five years' time. What matters is how well you can adapt and learn.

GP: Q6. What do you expect Deep Learning technology to be able to achieve in 5 years? Will DL be able to eventually match and exceed human performance in every field, or are there some areas where humans will always remain ahead?

JH: It is very hard to know what the limits of deep learning are, or how long they will take to achieve, since we seem so far away from finding those limits at the moment. Nearly every time I have seen somebody try to improve the solutions to their particular problems using deep learning, I have seen them be successful. For instance, a PhD candidate in medical informatics told me that five hours after training deep learning on his project he had greatly surpassed the results of his previous five years of research!

In areas of creativity and the display of skill, humans will always remain ahead, because humans are only interested in observing the performance of other humans. For example, in the area of creativity and art, see this fantastic post by Mike Loukides.

GP: Q7. You are (or were) the youngest member at Singularity University. What do you do there and what is your opinion on singularity - will it arrive and when? What will humans be doing afterwards?

JH: Actually, I don't think I'm the youngest any more! I teach data science there. One of the highlights each year for me is getting to teach at the Global Solutions Program. 80 of the smartest and most passionate people in the world come together each year to try to work on some of humanity's most pressing issues, and I am lucky enough to teach them how they can use data science to help with this.

Singularity University is not a university, and has nothing to do with the singularity. Perhaps one might say that it is poorly named... ;) I honestly don't know if there will be a technological singularity, and I don't really see how anybody can claim to know that this will happen, let alone when it will happen.

GP: Q8. I was not sure about asking why you left Kaggle and Enlitic, since such topics usually involve personal considerations, but if you want to comment on why you left those companies, that would be interesting to our readers.

JH: Leaving Kaggle was an easy decision. I never actually meant to be a full-time member of the company, but started out just helping as a volunteer. Much to my surprise, we raised a lot of money from venture capitalists, at which point I didn't have much choice but to join full-time! When Kaggle made their ill-advised decision to focus 100% on the oil and gas analytics business, there was no reason for me to stay - and I had been dying to spend more time researching into how deep learning can make a difference to society. I spent the next year studying this, which led me to move into medical informatics.

Leaving Enlitic was much harder. But I had already been away for a year, dealing with a family medical emergency. The company that I returned to was very different to the one that I had built. Before I started Enlitic I spent a lot of time trying to decide whether the best way to make an impact in medicine would be through going into academia or through starting a start-up. Based on this experience, it now seems to me that externally funded start-ups are not a good choice for solving problems that still need a lot of fundamental research to be done. There is too much pressure from investors and staff who wish to see their equity value rise as quickly and as much as possible.

Having said that, I'm not sure that academia is much better, which is why I've started a self-funded research institute, fast.ai, together with Rachel Thomas.

GP: Q9. What do you do for fun when away from a computer? What recent book have you read and liked?

JH: My most enjoyable activity is playing with my baby daughter. I love how interested and curious she is in absolutely everything! I spend as much time reading deep learning papers as possible, which means that I don't really read other books. I enjoy reading deep learning papers so much that it would be hard for me to find something I enjoy reading more! Having said that, at night time I do listen to audiobooks - at the moment I am enjoying PG Wodehouse.