AI, Analytics, Machine Learning, Data Science, Deep Learning Research Main Developments in 2019 and Key Trends for 2020

As we say goodbye to one year and look forward to another, KDnuggets has once again solicited opinions from numerous research & technology experts as to the most important developments of 2019 and their 2020 key trend predictions.

It's year's end again, and that means it's time for KDnuggets' annual year-end expert analysis and predictions. This year we posed the question:

What were the main developments in AI, Data Science, Deep Learning, and Machine Learning in 2019, and what key trends do you expect in 2020?

As we look back at what our experts predicted one year ago, we see a mix of what could be considered natural technological progression peppered with some more ambitious forecasts. There were a few general themes, as well as a couple of singular prognoses of note.

In particular, continued fears of AI were mentioned more than once, and this prediction certainly seems to have panned out. Talk of advances in automated machine learning was prevalent, though opinions were split as to whether it would be useful or would falter. I think the jury is still out on this to some degree, but when expectations of the technology are tempered it becomes easier to see it as a useful addition as opposed to a looming replacement. Increased AI for good was also singled out, for good reason, and there are myriad examples pointing to the accuracy of this prediction. The idea that practical machine learning would have a reckoning was put out there, signalling that the fun and games are coming to an end and it's now time for machine learning to put up. This rings true, with anecdotal evidence mounting of practitioners seeking out these opportunities. Finally, mention of the increased concern surrounding dystopian AI developments regarding surveillance, fear, and manipulation can confidently be added to the successful predictions category by a simple spot-check of the past year's news.

There were also some predictions which have not yet panned out. This is unavoidable in such an exercise, however, and we will leave those for the interested reader to seek out on their own.

Our list of experts this year includes Imtiaz Adam, Xavier Amatriain, Anima Anandkumar, Andriy Burkov, Georgina Cosma, Pedro Domingos, Ajit Jaokar, Charles Martin, Ines Montani, Dipanjan Sarkar, Elena Sharova, Rosaria Silipo, and Daniel Tunkelang. We thank them all for taking time from their busy year-end schedules to provide us with their insights.

This is the first in a series of 3 such posts over the coming week. While they will be split up into research, deployment, and industry, there is considerable and understandable overlap between these disciplines, and as such we recommend you check out all 3 as they are published.

Without further delay, here are the 2019 key trends and 2020 predictions from this year's group of experts.

Imtiaz Adam (@DeepLearn007) is an Artificial Intelligence & Strategy Executive.

In 2019 organizations developed greater awareness of issues relating to ethics and diversity in Data Science.

The Lottery Ticket Hypothesis paper showed the potential to simplify the training of Deep Neural Networks with pruning. The Neuro-Symbolic Concept Learner paper showed the potential to combine logic and Deep Learning with enhanced data and memory efficiency.
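The full Lottery Ticket procedure involves iteratively pruning a trained network and rewinding the surviving weights to their initial values before retraining. As a minimal, hypothetical sketch of just the magnitude-pruning step it relies on (plain Python, toy weight values invented for illustration):

```python
def magnitude_prune(weights, fraction):
    """Zero out the smallest-magnitude fraction of the weights.

    Magnitude pruning is the criterion used in the Lottery Ticket
    Hypothesis paper; the full procedure also rewinds surviving
    weights to their initialization and retrains.
    """
    n_prune = int(len(weights) * fraction)
    # Rank indices by how close each weight is to zero
    ranked = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    mask = set(ranked[:n_prune])
    return [0.0 if i in mask else w for i, w in enumerate(weights)]

# Toy example: prune half of a six-weight layer
pruned = magnitude_prune([0.1, -0.5, 0.05, 2.0, -0.01, 0.3], fraction=0.5)
```

In the real setting the same masking is applied per layer to the weight tensors of a trained network, keeping only the "winning ticket" subnetwork.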

Research in GANs gained momentum, and Deep Reinforcement Learning in particular received a great deal of research attention, including areas such as Logic Reinforcement Learning and Genetic Algorithms for parameter optimization.

TensorFlow 2 arrived with Keras integrated and eager execution as the default mode.

In 2020 Data Science teams and commercial teams will be more integrated. 5G will act as a catalyst for the growth of an intelligent IoT with AI inferencing on the edge meaning that AI will increasingly enter the physical world. Deep Learning combined with Augmented Reality will transform the customer experience.

Xavier Amatriain (@xamat) is Cofounder/CTO at Curai.

I think it is hard to argue against the fact that this has been the year of Deep Learning and NLP. Or, more concretely, the year of language models. Or, even more concretely, the year of Transformers and GPT-2. Yes, it might be hard to believe, but it has been less than a year since OpenAI first talked about their GPT-2 language model. That blog post sparked a lot of discussion about AI safety, since OpenAI did not feel comfortable releasing the model. Since then, the model was publicly replicated, and finally released. However, this has not been the only advance in this space. We have seen Google publish ALBERT and XLNet, and also talk about how BERT has been the largest improvement to Google Search in years. Everyone from Amazon and Microsoft to Facebook seems to have really bought into the language model revolution, and I expect to see impressive advances in this space in 2020; it seems we are getting closer and closer to passing the Turing Test.

Anima Anandkumar (@AnimaAnandkumar) is Director of ML research at NVIDIA and Bren Professor at Caltech.

Researchers aimed to develop a better understanding of deep learning, its generalization properties, and its failure cases. Reducing dependence on labeled data was a key focus, and methods like self-training gained ground. Simulations became more relevant for AI training and more realistic in visual domains such as autonomous driving and robot learning, including on NVIDIA platforms such as DriveSIM and Isaac. Language models went big, e.g. NVIDIA's 8-billion-parameter Megatron model trained on 512 GPUs, and started producing coherent paragraphs. However, researchers showed spurious correlations and undesirable societal biases in these models. AI regulation went mainstream, with many prominent politicians voicing their support for a ban on face recognition by government agencies. AI conferences started enforcing a code of conduct and increased their efforts to improve diversity and inclusion, starting with the NeurIPS name change last year. In the coming year, I predict that there will be new algorithmic developments and not just superficial applications of deep learning. This will especially impact "AI for science" in many areas such as physics, chemistry, materials science and biology.

Andriy Burkov (@burkov) is Machine Learning Team Leader at Gartner, and author of the Hundred-Page Machine Learning Book.

The main development is, without doubt, BERT, the language modeling neural network model that increased the quality of NLP on virtually all tasks. Google even uses it as one of the major signals of relevance -- its most significant search update in many years.

The key trends, in my opinion, will be even wider adoption of PyTorch in industry, increased research on faster neural network training methods, and fast training of neural networks on commodity hardware.

Georgina Cosma (@gcosma1) is Senior Lecturer at Loughborough University.

In 2019, we have seen the impressive capabilities of Deep Learning models such as YOLOv3 for various complex computer vision tasks, particularly real-time object detection. We have also seen Generative Adversarial Networks continue to attract interest in the Deep Learning community for image synthesis, with the BigGAN model for ImageNet generation and StyleGAN for human image synthesis. This year we have also realised how easy it is to fool Deep Learning models; several studies have shown that deep neural networks are vulnerable to adversarial examples. In 2019 we have also seen biased AI decision-making models being deployed for facial recognition, hiring, and legal applications. In 2020, I expect to see development in multi-tasking AI models, which are designed to be generic and multi-purpose, and I also expect increased interest in developing ethical AI models, since AI is changing decision making in the health, financial services, automotive and many other sectors.
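To make the adversarial-example vulnerability concrete, here is a minimal, hypothetical sketch of the fast gradient sign method (FGSM) applied to a toy logistic-regression "model" in plain Python; the weights, input, and epsilon are invented for illustration:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Toy logistic "classifier": p(class 1 | x) = sigmoid(w . x)
w = [2.0, -3.0]

def predict(x):
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))

x = [1.0, 1.0]          # clean input, true label y = 0
y = 0.0
p = predict(x)          # below 0.5, so correctly classified as class 0

# FGSM: step each input dimension in the direction that increases the loss.
# For logistic loss, dL/dx = (p - y) * w; take the sign and scale by eps.
eps = 0.5
grad = [(p - y) * wi for wi in w]
x_adv = [xi + eps * (1 if g > 0 else -1) for xi, g in zip(x, grad)]

p_adv = predict(x_adv)  # above 0.5: the same model now predicts class 1
```

The per-dimension perturbation is bounded by eps, yet the prediction flips; real attacks do the same thing with gradients computed through a deep network.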

Pedro Domingos (@pmddomingos) is a Professor in the Dept. of Computer Science & Engineering, University of Washington.

Main developments in 2019:

  • The rapid spread of contextual embeddings. They're less than two years old, but they now dominate NLP, and Google has already deployed them in its search engine, reportedly improving 1 in 10 searches. From vision to language, pretraining a model on large data and then tuning it for specific tasks has become the standard.
  • The discovery of double descent. Our theoretical understanding of how overparameterized models can generalize well while perfectly fitting the training data has improved considerably, in particular with candidate explanations of the observation that - counter to what classical learning theory predicts - generalization error falls, rises and then falls again as model capacity increases.
  • The media and public's perception of AI advances has become more skeptical, with diminished expectations for self-driving cars and virtual assistants, and flashy demos no longer taken at face value.

Key trends for 2020:

  • The deep learning crowd's attempt to "climb the stack" from low-level perceptual tasks like vision and speech recognition to high-level cognitive ones like language understanding and commonsense reasoning will pick up speed.
  • The mode of research where better results are obtained by throwing more data and computing power at the problem will reach its limits, because it's on an exponential cost curve that's steeper than Moore's law and already straining at what even rich companies can afford.
  • With some luck we'll enter a Goldilocks era where there's neither excessive hype about AI nor another AI winter.

Ajit Jaokar (@AjitJaokar) is the course director of the "Artificial Intelligence: Cloud and Edge implementations" course at the University of Oxford.

In 2019, we rebranded our course at the University of Oxford to "Artificial Intelligence: Cloud and Edge Implementations". This also reflects my personal view, i.e. that 2019 was the year of cloud maturity. It was a year when the various technologies we speak of (Big Data, AI, IoT, etc.) came together within the framework of the cloud. This trend will continue – especially for the enterprise. Companies will undertake 'digital transformation' initiatives, where they will use the cloud as a unifying paradigm to transform processes driven by AI (kind of like re-engineering the corporation 2.0).

In 2020, I also see NLP maturing (BERT, Megatron). 5G will continue to be deployed. We will see wider applications of IoT when 5G is fully deployed (e.g. self-driving cars) beyond 2020. Finally, on the IoT front, I follow a technology called MCUs (microcontroller units) – specifically the deployment of machine learning models on MCUs.

I believe that AI is a game-changer, and every day we see fascinating examples of AI deployments. Much of what Alvin Toffler predicted in Future Shock is already with us today – how exactly AI will amplify that remains to be seen! Sadly, the rate of change of AI will leave many people behind.

Charles Martin is an AI Scientist and Consultant, and Founder at Calculation Consulting.

BERT, ELMo, GPT-2, and all that! AI in 2019 saw huge advances in NLP. OpenAI released their big GPT-2 model -- i.e. DeepFakes for text. Google announced using BERT for Search -- the biggest change since Panda. Even my collaborators at UC Berkeley released (quantized) Q-BERT for low-footprint hardware. Everyone is making their own document embeddings now.

What does this mean for 2020? According to search experts, 2020 will be the year of Relevance (uh, what have they been doing?). Expect to see vector space search finally gaining traction, with BERT-style fine-tuned embeddings.

Under the hood, in 2019, PyTorch overtook TensorFlow as the choice for AI research. With the release of TensorFlow 2.x (and TPU support for PyTorch), AI coding in 2020 will be all about eager execution.

Are big companies making progress with AI? Reports indicate a 1-in-10 success rate. Not great. So AutoML will be in demand in 2020, although I personally think that, like making great search results, successful AI requires custom solutions specific to the business.


Ines Montani (@_inesmontani) is a software developer working on Artificial Intelligence and Natural Language Processing technologies, and the co-founder of Explosion.

Everyone is opting for "DIY AI" instead of cloud solutions. One factor driving this trend is the success of transfer learning, which has made it easier for anyone to train their own models with good accuracy, fine-tuned to their very specific use case. With one user per model, there's no real economy of scale for a service provider to exploit. Another advantage of transfer learning is that datasets don't need to be as large anymore, so annotation is moving in-house as well. The in-housing trend is a positive development: commercial AI is a lot less centralized than many people thought it would be. A few years ago, people worried that everyone would get "their AI" from just one provider. Instead people aren't getting their AI from any provider – they're doing it themselves.

Dipanjan Sarkar is Data Science Lead at Applied Materials, a Google Developer Expert - Machine Learning, an author, consultant, and trainer.

The major advancements in the world of Artificial Intelligence in 2019 have been in the areas of Auto-ML, Explainable AI and Deep Learning. Democratization of Data Science has remained a key aspect over the last couple of years, and various tools and frameworks pertaining to Auto-ML are trying to make this easier. The caveat remains that we need to be careful when using these tools to make sure we don't end up with biased or overfit models. Fairness, accountability and transparency still remain key factors for customers, businesses and enterprises to accept decisions made by AI. Hence Explainable AI is no longer a topic restricted to research papers. A lot of excellent tools and techniques have started making machine learning model decisions more interpretable. Last but not least, we have seen a lot of progress in the world of Deep Learning and Transfer Learning, especially for Natural Language Processing. I expect to see more research and models in 2020 in the areas of Deep Transfer Learning for NLP and Computer Vision, and hopefully something which takes the best of Deep Learning and Neuroscience and leads us towards true AGI.

Elena Sharova is Senior Data Scientist at ITV.

By far the most important ML developments of 2019 were made with deep Reinforcement Learning in game playing, with DeepMind's DQN and AlphaGo leading to the retirement of the Go champion Lee Sedol. Another important advance was in natural language processing, with BERT (a deeply bidirectional language representation) being open-sourced by Google, and Microsoft leading the GLUE benchmark with the development and open-sourcing of the MT-DNN ensemble for pronoun resolution tasks.

It is important to highlight the European Commission’s publication of Ethics Guidelines for Trustworthy AI – the first official publication setting out sensible guidelines for lawful, ethical and robust AI.

Finally, I am sharing with KDnuggets readers that all the keynote speakers at PyData London 2019 were women – a welcome development!

I expect that the main ML development trends of 2020 will continue within NLP and computer vision. Industries adopting ML and DS have realised that they are overdue in defining shared standards for best practices in hiring and retaining data scientists, managing the complexity of projects that involve DS and ML, and ensuring the community remains open and collaborative. Thus we should see more focus placed on such standards in the near future.

Rosaria Silipo (@DMR_Rosaria) is Principal Data Scientist at KNIME.

The most promising achievement in 2019 has been the adoption of active learning, reinforcement learning, and other semi-supervised learning procedures. Semi-supervised learning might offer hope for taking a stab at all the unlabelled data currently populating our databases.
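As an illustration of the semi-supervised idea, here is a minimal, hypothetical sketch of self-training (pseudo-labelling) with a toy one-dimensional nearest-centroid classifier in plain Python; the data points, margin threshold, and class names are invented:

```python
def centroids(labeled):
    """Mean feature value per class (toy 1-D features)."""
    sums, counts = {}, {}
    for x, y in labeled:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def self_train(labeled, unlabeled, margin=1.0, rounds=5):
    """Pseudo-label points whose two centroid distances differ by > margin."""
    labeled, unlabeled = list(labeled), list(unlabeled)
    for _ in range(rounds):
        cents = centroids(labeled)
        still_unlabeled = []
        for x in unlabeled:
            dists = sorted((abs(x - c), y) for y, c in cents.items())
            # Confident only if clearly closer to one centroid than the other
            if dists[1][0] - dists[0][0] > margin:
                labeled.append((x, dists[0][1]))
            else:
                still_unlabeled.append(x)
        unlabeled = still_unlabeled
    return labeled, unlabeled

labeled = [(0.0, "A"), (1.0, "A"), (9.0, "B"), (10.0, "B")]
final, leftover = self_train(labeled, [0.5, 9.5, 5.2])
# 0.5 and 9.5 get confident pseudo-labels; the ambiguous 5.2 stays unlabeled
```

The confidence gate is the essential ingredient: ambiguous points are left alone rather than fed back in, which is also why expert-set thresholds matter in practice.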

Another great advancement has been the replacement of the word "auto" with "guided" within the autoML concept. Expert intervention seems to be indispensable for more complex Data Science problems.

In 2020, data scientists will require a rapid solution for easy model deployment, constant model monitoring, and flexible model management. Real business value will derive from these three final parts of the Data Science life-cycle.

I also believe that a more extensive usage of deep learning black-boxes will raise the problem of Machine Learning Interpretability (MLI). We will see at the end of 2020 whether MLI algorithms are up to the challenge of explaining exhaustively what is going on behind closed doors of a deep learning model.

Daniel Tunkelang (@dtunkelang) is an independent consultant specializing in search, discovery, and ML/AI.

The cutting edge of AI continues to be focused on language understanding and generation.

OpenAI announced GPT-2 to predict and generate text. OpenAI did not release the trained model at the time, out of concern for malicious applications, but they eventually changed their mind.

Google released an on-device speech recognizer that fits in 80MB, making it possible to perform speech recognition on mobile devices without sending data to the cloud.

Meanwhile, we're seeing a crescendo of concern around AI and privacy. This year, all of the major digital assistant companies faced backlash around employees or contractors listening to users' conversations.

What does 2020 have in store for AI? We'll see further advances in conversational AI, as well as better generation of images and video. Those advances will raise even greater concerns around malicious applications, and we'll probably see a scandal or two, especially in an election year. The tension between good and evil AI isn't going away, and we'll have to learn better ways to deal with it.