Negative Results on Negative Images: Major Flaw in Deep Learning?
This is an overview of recent research outlining a limitation of image recognition with deep neural networks. But should this really be considered a "limitation"?
A recent paper by Hossein Hosseini and Radha Poovendran of the University of Washington has driven home a point that is often lost on casual deep learning observers, as it relates to image recognition and classification with deep neural networks (DNNs):
DNNs, which are merely trained on raw data, do not recognize the semantics of the objects, but rather memorize the inputs.
The short, 3-page paper, titled "Deep Neural Networks Do Not Recognize Negative Images," supports its title with experiments using a "state-of-the-art" deep (convolutional) neural network, trained separately on the MNIST and German Traffic Sign Recognition Benchmark (GTSRB) datasets. The "regular" results on MNIST and GTSRB recognition and classification exceed 99% and 98% accuracy on test data, respectively, while testing on negative versions of the same images yields accuracies of only 4% to 17% and 5% to 14%, respectively.
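Concretely, a "negative" here is just the per-pixel brightness complement of an image, as the excerpt below also explains. A minimal NumPy sketch of the transform (not taken from the paper):

```python
import numpy as np

def to_negative(image: np.ndarray) -> np.ndarray:
    """Return the brightness-inverted (negative) version of an image.

    Assumes 8-bit pixel values in [0, 255]; for images already scaled to
    [0.0, 1.0], the equivalent operation is 1.0 - image.
    """
    return 255 - image

# For an MNIST digit, a white digit on a black background becomes
# a black digit on a white background -- still trivially readable to a human.
```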
From the paper:
To assess the behavior of the image classification models, we evaluate the performance of DNNs on negative images of the training data. A negative is referred to an image with reversed brightness, i.e., the lightest parts appear the darkest and the darkest parts appear lightest. These complemented images are often easily recognizable by humans. We show when testing on negative images, the accuracy of DNNs drops to the level of the random classification, i.e., the network maps the inputs randomly to one of the output classes. This shows that the DNNs, which are merely trained on raw data, do not learn the structures or semantics of the objects and cannot generalize the concepts.
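For intuition, here is roughly how such an evaluation can be reproduced. This is a minimal sketch using tf.keras and a small convolutional network of my own choosing -- not the authors' code or architecture -- but it should be enough to observe the same qualitative effect: high accuracy on the regular MNIST test images and a sharp drop on their negatives.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

# Load MNIST; add a channel dimension and scale pixels to [0, 1].
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train[..., np.newaxis].astype("float32") / 255.0
x_test = x_test[..., np.newaxis].astype("float32") / 255.0

# A small convolutional classifier -- an illustrative stand-in,
# not the "state-of-the-art" architecture used in the paper.
model = models.Sequential([
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=5, batch_size=128, verbose=2)

# Negative test set: complement the normalized pixel values.
x_test_neg = 1.0 - x_test

_, acc_regular = model.evaluate(x_test, y_test, verbose=0)
_, acc_negative = model.evaluate(x_test_neg, y_test, verbose=0)
print(f"Accuracy on regular test images:  {acc_regular:.3f}")
print(f"Accuracy on negative test images: {acc_negative:.3f}")
```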
These findings certainly lend support to the paper's title. However, to those who understand convolutional neural networks, is this much of a surprise?
This is not to belittle the point, however. There is an awful lot of misinformation about and misunderstanding of neural networks out there, and not just at the "popular" level; everyone from the general public to machine learning researchers can understandably have a difficult time keeping up with, and properly understanding, continued advances in the field.
The authors go on to discuss the security implications of intentionally -- and maliciously -- "fooling" machine learning models with negative images, or "semantic adversarial examples," but this is really ancillary to the main point: that neural networks "learn" differently than some individuals believe they do.
So... major flaw, or expected behavior? Whatever your opinion, it is easy to agree that properly framing and conveying capabilities is one of the major overarching issues with buzzword-hyped technologies such as neural networks, as well as machine learning and artificial intelligence more generally.