Caltech Prof. Abu-Mostafa on his MOOC course “Learning from Data” and Machine Learning

KDnuggets talks with top Caltech professor Yaser Abu-Mostafa about his current MOOC "Learning from Data", Machine Learning, and Big Data.

Gregory Piatetsky, Apr 5, 2013.

MOOCs are very hot today, especially in computer science, AI, and Machine Learning. My email exchange with Prof. Abu-Mostafa began in connection with his very successful (and free) course "Learning from Data", and it led to this interview.

Dr. Yaser S. Abu-Mostafa is Professor of Electrical Engineering and Computer Science at Caltech, focusing on Machine Learning and Computational Finance. His PhD from Caltech received the Clauser Prize for the most original doctoral thesis. He also received the Feynman Prize for excellence in teaching. In 1987, he co-founded NIPS, the premier international conference on machine learning today. In 2005, the Hertz Foundation established the Abu-Mostafa Fellowship in his honor. He taught an online machine learning course, "Learning from Data", that attracted more than 200,000 participants worldwide, and co-authored a textbook on machine learning that became the #1 bestseller in Computer Science on Amazon. He has served as scientific advisor to several corporations and start-up companies in the US and abroad, including Citibank for 9 years.

Gregory Piatetsky: Congratulations on your very successful online course Learning from Data (available free online). Is there a danger that free or very low-cost MOOCs will lead to only a few top schools like Caltech surviving, and to the elimination of local and community schools, which would not be able to compete?

Yaser S. Abu-Mostafa: I believe that MOOCs complement rather than replace current curricula. This will happen in two modes, which I label self-service and full-service modes:

1. Self-service makes the MOOC available to *instructors* so that they can structure a more effective course and free up their class time for discussion and interactive learning. The MOOC lectures are used here as a high-quality "video textbook" from top experts. The instructor is still fully needed for class discussions, choosing and grading assignments, answering student questions, etc.

2. Full-service makes the MOOC available to *universities* in its entirety to use as a for-credit course. For the relatively few courses on important subjects that require expertise not available in many colleges, this mode makes sense and does not infringe on university instructors.

In both modes, I believe MOOCs wouldn't and shouldn't replace current instructors.

GP: What business model do you expect to emerge for MOOCs? Will they remain free, supported by the elite universities, or perhaps charge a fee for a completion certificate, or something else?

YA: My personal preference is the free model by elite universities. The reputation dividend should make this worthwhile. However, with current funding pressures, some may explore charging for certificate-related or credit-related services. The danger is to have financial incentives drive the game, which may lead to watering down of courses just to increase the numbers, as we have seen in some courses. As long as the university has its academic reputation to worry about, the quality of the courses it produces will be kept in check. The situation is much more troubling when it comes to for-profit commercial companies. The driving force for MOOCs should be academic standards, not maximizing stockholder profits. For-profit companies, by law, have a fiduciary duty to do the latter, so they cannot avoid it. There is a good reason why all worthwhile universities are non-profit.

GP: How do you see the differences between Machine Learning, Data Mining, Data Science, and Statistics?

YA: All of these fields in one form or another use a set of observations (data) to uncover an underlying process (target). Some of the differences between them are legacy differences, and some are differences of emphasis. As a branch of mathematics, Statistics emphasizes full rigorous solutions under concrete assumptions, where the process to be uncovered is a probability distribution. Data Mining is on the practical side, emphasizing algorithmic/computational issues where the observations reside in huge databases. Data Science and Big Data are more umbrella fields that encompass all aspects of handling and interpreting the data.

GP: Given the amazing progress of technology in the last 20 years (with Deep Blue, Google, IBM Watson, Siri, and the self-driving Google Car as examples of Machine Learning success), what do you think Machine Learning will be able to achieve in 20 years? Will there be a singularity, as Ray Kurzweil predicts?

YA: I believe Machine Learning will be part of almost every device and process, and will be capable of replicating many of the highly sophisticated cognitive functions. Think of it this way. Every development humans have made started with a person observing and learning. Now we have machines that can observe and learn. They are doing so in progressively more sophisticated ways, and they are getting faster and smarter. When we unleash this potential on all walks of life, we get remarkable systems. It's already happening.

GP: What do you think about the current hype and buzz about Big Data? Is Data Scientist indeed the sexiest job of the 21st century?

YA: There is a clear need for Big Data solutions, and there are enough tools around to make research in Big Data quite promising. It is not clear yet how much it will be based on domain-specific tools versus general Big Data algorithms. Some people feel there is a bit of hype, but I have come to observe that a healthy dose of hype may actually be constructive. In the 1980s, the hype in neural networks created excitement and funding, and attracted very smart people to the field. They may not have delivered an artificial brain, but the focus and momentum led to an established technology used by numerous companies, and opened the door for the huge revival of Machine Learning.

GP: Your advice to students or other professionals who consider a career in Machine Learning or Data Science?

Learning From Data
Yaser S. Abu-Mostafa, Malik Magdon-Ismail, Hsuan-Tien Lin

YA: Understand the fundamentals of the field inside out. Don't treat Machine Learning as a bunch of algorithms and techniques and trendy buzzwords. Data sciences in general are vulnerable to misuse and abuse, and as a professional in this field you should distinguish yourself by really understanding what is going on rather than just applying the latest fad.

GP: What is a recent book that you read and liked?

YA: Well, there is this nice book called Learning From Data :-)

I know it's a shameless plug, but the book does build a solid foundation of Machine Learning concisely and crisply. Just read the reviews.

See also the second part, an internal Caltech interview in which Prof. Abu-Mostafa answers questions about his MOOC.

Prof. Abu-Mostafa also shared his concern that profit-driven entities and those who lobby for legislation to require online credit are generating ill will and unnecessary hostility.