There is No Such Thing as a Free Lunch
By Dr Vladimir Dobrynin, Dr Xiwu Han, Mr Alexey Mishenin, Dr David Patterson, Dr Niall Rooney, Mr Julian Serdyuk, Aiqudo
Almost every day we read about companies and their Artificial Intelligence (AI) strategies. Sometimes it feels like an arms race in which businesses fear being left behind if they cannot claim to have AI, and usually deep learning, embedded somewhere in their product. We have seen this before: it is reminiscent of the social media and big data hypes of years gone by. Companies used to trip over themselves to be seen as “big data”; now the focus is on “AI” as they try to make themselves appealing to customers and investors. One report estimates that as many as 40% of European startups classified as “AI” don’t actually use AI in any material way!
The hype around AI has largely been driven by substantial recent progress in the subfield of deep learning. Deep learning has made extraordinary advances in recent years (thanks to the availability of larger amounts of data, increases in computing power and advances in algorithms), but it is not a panacea for all problems, and yet many organisations have rushed to adopt it as part of their product offering. While it can solve some problems with remarkable human-like ability (sometimes even outperforming humans in tasks such as image recognition and game playing), there are many cases where other types of machine learning are just as effective or even more suitable, providing better decision making while often requiring less training data, less computing power and less “tweaking”.
You have heard the expression “there is no such thing as a free lunch”. Well, in machine learning the same principle holds. In fact, there is even a theorem with the same name. It states that any two optimization algorithms are equivalent when their performance is averaged across all possible problems. This doesn’t mean all algorithms are equivalent on each individual problem. It means that no algorithm beats another once performance is averaged over every possible problem, so an algorithm that does well on one class of problems must pay for it on others.
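The theorem can be checked by brute force on a toy problem. The sketch below is only an illustration under stated assumptions, not part of the theorem's formal statement: a five-point input space with binary labels, three points used for training and two held out, and two deliberately opposite learners with hypothetical names (`majority_learner` and `minority_learner`). On any single problem the two can differ sharply, but averaged over all 32 possible labelings their held-out accuracy is identical.

```python
from itertools import product

# Toy setup (all names and sizes are illustrative assumptions):
# five inputs, binary labels, three inputs seen in training, two held out.
TRAIN_X = [0, 1, 2]
TEST_X = [3, 4]


def majority_learner(train_labels):
    """Always predict the most common training label."""
    guess = 1 if 2 * sum(train_labels) > len(train_labels) else 0
    return lambda x: guess


def minority_learner(train_labels):
    """Always predict the least common training label (deliberately contrarian)."""
    guess = 0 if 2 * sum(train_labels) > len(train_labels) else 1
    return lambda x: guess


def held_out_accuracy(learner, target):
    """Train on TRAIN_X, then score on TEST_X, for one labeling `target`."""
    model = learner([target[x] for x in TRAIN_X])
    hits = sum(model(x) == target[x] for x in TEST_X)
    return hits / len(TEST_X)


def average_over_all_problems(learner):
    """Average held-out accuracy over every possible labeling of the inputs."""
    n_points = len(TRAIN_X) + len(TEST_X)
    targets = list(product([0, 1], repeat=n_points))  # 2**5 = 32 problems
    return sum(held_out_accuracy(learner, t) for t in targets) / len(targets)


if __name__ == "__main__":
    all_ones = (1, 1, 1, 1, 1)
    # On one specific problem the learners differ sharply ...
    print(held_out_accuracy(majority_learner, all_ones))  # 1.0
    print(held_out_accuracy(minority_learner, all_ones))  # 0.0
    # ... but averaged over every possible problem they tie.
    print(average_over_all_problems(majority_learner))  # 0.5
    print(average_over_all_problems(minority_learner))  # 0.5
```

Real problems are not drawn uniformly from "all possible labelings", which is exactly why matching the algorithm to the problem matters in practice.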
Despite this, many companies still feel the need to over-focus on deep learning and to brand themselves accordingly; in fact the term has almost become synonymous with “AI” in many circles. Our goal in this first article is to highlight some of the real (and less publicised) challenges deep learning presents. In the second, follow-on article we outline why, in our own work on personal digital assistants, we have developed our own intelligent algorithms to tackle the problems we need to solve. The third article in the series gives some insight into how our core algorithms work.
Focusing on deep learning when considering an AI strategy for a company may stem from a perception that deep learning solutions are relatively simple to implement: all you need is deep learning software and some input data to analyse. The reality is that deep learning applications are very complex and rarely produce useful results unless they are in the hands of experts. Data collection, for example, is a major problem when building a deep learning model. You need a lot of quality data to train a neural network, which is why big tech companies such as Facebook, Google and Amazon have been at the forefront of R&D in this area: they have an almost endless supply of high-quality data on which to train models. Even if you had access to all the data you needed, other challenges remain, including data preparation, data curation, data balancing, parameter tuning, network architecture selection and high computing costs due to the specialist hardware required for parallel processing. Constructing an effective deep learning model is an art, and successful deployment only comes with experience. (It is because this is so difficult that Google felt the need to develop AutoML, which aims to help developers overcome many of the issues discussed in this article.)
To bypass these challenges, companies sometimes choose to use models pre-trained by companies such as Google, Facebook or Amazon, who have made a number of models and services available, many of them for free. Examples include Universal Sentence Encoder (sentence similarity), BERT (question answering and other tasks), Amazon Rekognition (object, people, text, scene and activity identification) and FastText (text classification), to name but a few. While this may solve a lot of the technical challenges of building competent deep learning models from scratch in-house, it is a commercial risk to rely on these companies to always keep their models available and up to date. Relying on a third party for a critical technical component of your product or service is a high-risk strategy. We only need to look at Huawei, where there is doubt over its long-term rights to use Google services such as Maps and the Play Store on its handsets, or at Meerkat (a live video streaming app), which relied on Twitter’s social graph until Twitter bought Periscope (a Meerkat competitor) and cut off its access. Additionally, what happens if you need to tweak the model to suit your own specific needs? What if you are using a pre-trained deep learning model to identify pictures of dogs but your customers decide they also want to recognise cats? It is unlikely that the owner of the model will accommodate your request to update it, and you may not be able to tweak the model yourself. (As discussed later in relation to NVIDIA and gaming, deep learning models tend to be very specific to a defined problem, and if your scope changes you need new training data to build a new model.)
Limitations of deep learning
Irrespective of whether you choose to build your own models or rely on third parties, there are other limitations you need to be aware of before deciding to go down this route.
Explanation of reasoning - One limitation is the inability to explain the decision-making process. In many domains, such as medicine or defence, it is critical for an intelligent system to be able to explain how a decision was made. In a deep learning model this information is encoded deep within the neural network and is very difficult to extract and interpret. As a result, explanation is currently incredibly difficult with deep learning solutions and, in our opinion, unlocking this “black box” is one of the most commercially important areas of research in this field.
Sensitivity - Researchers have also shown that deep learning models can be very sensitive to small changes in the input data: a minor change can radically alter the model’s output. For example, experiments show that image recognition systems can be fooled by changing only a small part of an image. Those interested in reading further can look up the paper “Fooling automated surveillance cameras: adversarial patches to attack person detection”, in which a small printed patch is used to hide a person from a detector. This can have serious ramifications for many real-world applications. How useful is a security camera application for detecting intruders if it can be fooled so easily?
Specificity - Very narrow domain specificity is another challenge. One current example is an application developed by NVIDIA to improve the performance of games. The company added Tensor Cores to its Turing GPUs so that it could use deep learning to improve the gaming experience. The technology, called Deep Learning Super Sampling (DLSS), aims to provide high-resolution gaming at higher frames per second while also improving image quality. It is a new technology that requires a lot of computing power, and it has had mixed reviews from gamers, some of whom complain that the images are worse than those produced without DLSS. While the company is working to perfect the technology, one thing is clear: for best results the deep learning model must be trained on game-specific content. A model trained for one game does not produce quality output when applied to another game. For this reason, developers need to provide game-specific data to NVIDIA so that it can continuously train the DLSS model on its Saturn V supercomputing cluster.
Scaling costs - Running costs can also be daunting, especially if specialist hardware such as GPUs is required to build and maintain models and to serve customers.
Performance - This is something that many companies overlook. As mentioned at the start of the article, other machine learning approaches can very often perform as well as deep learning, at lower cost and with fewer technical challenges.
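To make the sensitivity point above concrete, here is a minimal sketch of the underlying effect. It is a toy under loud assumptions: a hypothetical 100-"pixel" linear classifier with made-up weights, not a deep network and not the patch attack from the paper mentioned above. It only shows how many tiny, deliberately chosen per-pixel changes can add up to a flipped prediction.

```python
def score(weights, x):
    """Linear score: positive means class 1, otherwise class 0."""
    return sum(w * xi for w, xi in zip(weights, x))


def predict(weights, x):
    """Classify an input by the sign of its linear score."""
    return 1 if score(weights, x) > 0 else 0


def nudge_against(weights, x, epsilon):
    """Shift every feature by at most epsilon in the direction that lowers the score."""
    def sign(w):
        return (w > 0) - (w < 0)
    return [xi - epsilon * sign(w) for w, xi in zip(weights, x)]


# Hypothetical model and input, chosen only for illustration.
WEIGHTS = [0.1 if i % 2 == 0 else -0.1 for i in range(100)]
IMAGE = [0.51 if i % 2 == 0 else 0.49 for i in range(100)]
EPSILON = 0.02  # maximum change allowed per "pixel"

if __name__ == "__main__":
    altered = nudge_against(WEIGHTS, IMAGE, EPSILON)
    largest_change = max(abs(a - b) for a, b in zip(altered, IMAGE))
    print(predict(WEIGHTS, IMAGE))            # 1
    print(predict(WEIGHTS, altered))          # 0
    print(largest_change <= EPSILON + 1e-12)  # True
```

No single pixel moves by more than 0.02, yet the predicted class changes, because every small change pushes the score in the same direction.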
So the message is: yes, deep learning, in the hands of experienced and knowledgeable experts, has shown enormous potential to solve many types of problems with an almost human-like ability, but that doesn’t make it right for every type of problem or company situation. Building competent deep learning models involves challenges that may be beyond the reach of many companies that want to deploy them. Companies should therefore think very carefully about which AI technologies are best suited to their needs, and consider their in-house expertise, before taking the plunge. In the next article we will delve into our own experience with AI when building a digital assistant for smart devices, and how we addressed the problem of choosing the best technology for our needs.