Are High Level APIs Dumbing Down Machine Learning?

Libraries like Keras simplify the construction of neural networks, but are they impeding practitioners' full understanding? Or are they simply useful (and inevitable) abstractions?

By Matthew Mayo, KDnuggets Managing Editor on April 16, 2018 in API, Deep Learning, Francois Chollet, Keras, Machine Learning, Neural Networks, TensorFlow


David Ha (@hardmaru), research scientist at Google, recently (February 9, 2018) tweeted the following:

Implementing fully connected nets, convnets, RNNs, backprop and SGD from scratch (using pure python, numpy, or even JS) and training these models on small datasets is a great way to learn how neural nets work. Invest time to gain valuable intuition before jumping onto frameworks. https://t.co/biP02iWsjd

— hardmaru (@hardmaru) February 9, 2018

This elicited a series of response tweets from François Chollet (@fchollet), creator of Keras, which, when considered collectively, present a different point of view. A couple of the standout responses are shown below:

Grad students knew how to implement neural nets in C in 2000. And they didn't have good intuition about them. A high school student playing with NN frameworks in 2018 can develop stronger understanding of NNs in a matter of days -- just thanks to a better application context

— François Chollet (@fchollet) March 7, 2018

It's a good thing that the next generation is moving up the abstraction stack, telling students to go back to the start is not a good learning strategy in my opinion

— François Chollet (@fchollet) March 7, 2018

A number of other well-respected machine learning researchers and engineers joined the conversation, including Jeremy Howard and Emmanuel Ameisen. (It should also be noted that Ha's original tweet was itself a response to something Denny Britz had tweeted.) Good points were made on both "sides" of the "argument."

High-level machine learning APIs -- in particular Keras, TensorFlow Layers, Scikit Flow, and TensorFlow Estimators, among others -- kick abstraction up a notch. Theano and plain vanilla TensorFlow start to seem considerably more low-level in comparison. We can widen our scope even further and consider more general libraries like Scikit-learn; not long ago, such a well-rounded toolkit, gathering a wide variety of helpful algorithm implementations under a single roof with a consistent API, would have been nearly unimaginable, to say nothing of its convenient level of abstraction.

Such libraries make implementing machine learning models much easier. But do the standard fit/predict API of Scikit-learn or the simplicity of stacking sequential layers with Keras go too far up the abstraction totem pole for newcomers to genuinely appreciate the underlying theory of the algorithms? Or do they free practitioners from the trouble of having to code functionality they may well (or may not) understand perfectly fine?
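To make the trade-off concrete, here is a minimal sketch of what sits underneath a fit/predict call, written in pure Python in the spirit of Ha's "from scratch" suggestion. It is an illustration only: the class name `TinyLogisticRegression`, its parameters, and the toy logical-OR dataset are all hypothetical, and the code merely mirrors the fit/predict naming convention rather than reproducing how Scikit-learn or Keras implement anything.

```python
import math
import random


class TinyLogisticRegression:
    """Hypothetical from-scratch binary classifier with a fit/predict-style interface."""

    def __init__(self, learning_rate=0.5, epochs=2000, seed=0):
        self.learning_rate = learning_rate
        self.epochs = epochs
        self.seed = seed
        self.weights = []
        self.bias = 0.0

    @staticmethod
    def _sigmoid(z):
        return 1.0 / (1.0 + math.exp(-z))

    def _probability(self, row):
        # Forward pass: weighted sum of the inputs, squashed into (0, 1).
        z = self.bias + sum(w * x for w, x in zip(self.weights, row))
        return self._sigmoid(z)

    def fit(self, X, y):
        rng = random.Random(self.seed)
        self.weights = [0.0] * len(X[0])
        self.bias = 0.0
        order = list(range(len(X)))
        for _ in range(self.epochs):
            rng.shuffle(order)  # stochastic gradient descent: one example at a time
            for i in order:
                # For log loss with a sigmoid output, the gradient with
                # respect to the pre-activation is (prediction - target).
                error = self._probability(X[i]) - y[i]
                self.weights = [
                    w - self.learning_rate * error * x
                    for w, x in zip(self.weights, X[i])
                ]
                self.bias -= self.learning_rate * error
        return self

    def predict(self, X):
        return [1 if self._probability(row) >= 0.5 else 0 for row in X]


# Toy dataset (an assumption for illustration): the logical OR function.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 1, 1, 1]

model = TinyLogisticRegression().fit(X, y)
predictions = model.predict(X)
```

The last two lines are all a library user would normally write; the forward pass, gradient, and update rule inside `fit` are exactly the parts a high-level API keeps out of sight.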

This technology abstraction conversation could just as easily be had about programming languages as machine learning, or even about the concept of abstraction more generally. It's true that an assembly language is not the usual programming weapon of choice these days, with coders generally arming themselves with something further up the abstraction stack, and that the idea of what constitutes high-level languages evolves over time.

I would argue, however, that it's also true that someone picking up JavaScript on their own and coding haphazardly without a computer science foundation is not the same as a computer science graduate student with a firm CS base (or someone without formal CS education but with a firm grasp of its underpinnings) venturing out to implement their ideas using the same language. I don't believe this is the same argument that Chollet is making (he never states that people without an understanding of neural networks would be able to effectively code them), but he is almost certainly promoting a steady ascent of the abstraction stack as the more valid learning experience.

While the "best" approaches to learning, and especially their nuances, will never fully generalize and will always depend in part on the learner, it seems reasonable that a progressive hybrid approach combining theory and practice deserves consideration as a go-to choice for learning neural networks and their implementation. Again, nuance matters, but some balance seems advantageous: keep theory from tripping up practice, and vice versa, and avoid a theory front-loaded program that keeps learners from being motivated by seeing (and doing) deep learning in action early on. This seems rather like the approach fast.ai has taken, an approach which could also confirm the validity of projects like Keras, both as a way to try things out quickly when starting out and as a way to get things done even after you have built a stronger understanding.

Just one man's opinion... what say you?
