ODSC India Highlights: Deep Learning Revolution in Speech, AI Engineer vs Data Scientist, and Reinforcement Learning for Enterprise

Key takeaways and highlights from the ODSC India 2018 conference on the latest trends, breakthroughs and revolutions in the field of Data Science and Artificial Intelligence

The Open Data Science Conference (ODSC India) 2018 was held from August 31 to September 1 in Bangalore, India. It was the first time that ODSC was hosted in India, and it attracted 544 participants from 270 companies and 10 countries. 65 speakers from 9 countries delivered talks, case studies, experience reports and demonstrations on a range of relevant and interesting data science and artificial intelligence (AI) topics.

This blog summarizes a few key takeaways that highlight the latest trends and revolutions in the field of Data Science and AI covered at the conference.

Ananth Sankar, Principal Researcher at LinkedIn, delivered the opening keynote on Deep Learning Revolution in Automatic Speech Recognition. He talked about the evolution of speech recognition technology and the major paradigm shift that now powers products such as voice search and voice assistants like Google Home and Alexa. He described the main components of a speech recognition system as the language model, lexicon, triphones, acoustic model and composed search graph.

He mentioned that one recent landmark in this field was a 32% reduction in word error rate (WER, a metric for speech recognition models where lower is better) obtained by replacing Gaussian mixture models (GMMs) with deep neural networks (DNNs). Further, in the past few years, a 15% improvement in WER was achieved by replacing DNNs with LSTMs, which are now commonly used for speech recognition. More recently, sequence-to-sequence recurrent neural network models have been found to have simpler implementations and accuracy comparable to the traditional methods.
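To make the metric concrete, here is a minimal sketch of how WER is commonly computed: the word-level edit distance (substitutions, deletions and insertions) between the reference transcript and the recognizer output, divided by the number of reference words. The function names and the sample sentences are illustrative only and do not come from the keynote. The `relative_reduction` helper assumes the percentages quoted in the talk are relative reductions, which this summary does not state.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(old_wer: float, new_wer: float) -> float:
    """Fraction by which new_wer improves on old_wer (assumed interpretation)."""
    return (old_wer - new_wer) / old_wer


# Illustrative sentences: one deleted word ("the") and one substitution.
print(word_error_rate("turn on the kitchen lights", "turn on kitchen light"))
```

In this example two of the five reference words are wrong, so the WER is 2/5. With hypothetical numbers, a system going from a WER of 0.25 to 0.17 would show a relative reduction of 32%.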

He concluded his talk by pointing to some key challenges and new areas where this technology can be applied, such as accurate combined speaker segmentation and speech recognition, handling people talking over each other in conversations, and sequence models like Listen, Attend and Spell (LAS).

Sheamus McGovern, Founder of Open Data Science, gave the welcome address in which he highlighted the vision behind ODSC and its growth over the years. He opened his talk by referring to the growing scope of AI and how it is being perceived by the most influential tech giants in the world.

He mentioned the vast range of open source platforms available for Data Science and ML and how contributions from the open source community have led to tremendous transformations in the field of AI. He elaborated on the differences in the skill sets required for the roles of Data Scientist and AI Engineer, and said that in the future there would be a bigger demand for AI Engineers than Data Scientists.

He concluded his talk by reflecting on the idea that India would be instrumental in building the future of AI, as it has the largest global pool of developers (5.2 million), a strong education system, strong language and team skills, and highly motivated engineers eager to up-skill in AI.

The conference had multiple sessions running in parallel, and attendees could choose among them based on their interests. Below is a summary of a few interesting sessions on Reinforcement Learning, Sarcasm Detection and Conversational Agents.

Samiran Roy, Data Scientist at Envestnet Yodlee, conducted a session on Reinforcement Learning (RL): Demystifying the hype to successful enterprise applications, which focused on giving a practical introduction to the RL setting and problem formulation and on listing some successful industry use cases. He also mentioned a few challenges associated with RL.

He then gave an introduction to the components of an RL setting, how to formulate a problem and how RL differs from supervised and unsupervised learning. Roy said, “Supervised ML cannot learn a game better than humans as it’s trained by human player’s data”. He also talked about some recent applications of RL.

In concluding his talk, he mentioned that there is a huge engineering aspect to RL, and that deep RL is hard to train and suffers from the same problem as deep learning, namely interpretability. He also noted that reward shaping and simulation environments are important. After his talk, Saurabh Deshpande conducted a workshop on the basic concepts of RL, which included a hands-on coding session in Python using the OpenAI Gym, Keras and PyTorch packages.
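For readers new to the RL setting described in this session, the sketch below shows its usual components (states, actions, rewards, and an agent that learns from interaction rather than from labelled examples) using tabular Q-learning. It is not code from the talk or the workshop: the five-cell corridor environment, the reward values and all hyperparameters are hypothetical choices made for illustration, and it uses only the Python standard library instead of OpenAI Gym.

```python
import random

N_STATES = 5          # corridor cells 0..4; cell 4 is the goal (hypothetical toy task)
ACTIONS = (-1, +1)    # action index 0 moves left, index 1 moves right


def step(state, action_index):
    """Environment: return (next_state, reward, done) for one move."""
    next_state = min(max(state + ACTIONS[action_index], 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else -0.01   # small step cost favours short paths
    return next_state, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, max_steps=100, seed=0):
    """Agent: learn a table of action values with epsilon-greedy Q-learning."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            if rng.random() < epsilon:
                action = rng.randrange(len(ACTIONS))                  # explore
            else:
                action = max(range(len(ACTIONS)), key=lambda a: q[state][a])  # exploit
            next_state, reward, done = step(state, action)
            # The goal row of q is never updated, so its value stays 0.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Best learned action for every non-goal cell."""
    return ["left" if row[0] > row[1] else "right" for row in q[:-1]]


print(greedy_policy(train()))
```

The agent is never told that moving right is correct; it discovers this from the reward signal alone, which is the key difference from supervised learning. Even in this toy task the choice of rewards and the availability of a cheap simulated environment determine how quickly learning succeeds.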

Anuj Gupta, an independent AI researcher, gave a talk titled Sarcasm Detection: Achilles Heel of Sentiment Analysis. Most sentiment analysis models fail miserably in handling sarcasm and infer the sentiment incorrectly, so sarcasm detection is a challenging problem that the NLP community is trying to crack. He described sarcasm as a negative sentiment that is difficult to detect because it is subtle, involves a play of language and is hard even for humans to interpret. He then talked about his own approach to classifying tweets as sarcastic or not. His solution was based on three typical clues, namely sentiment, emotion and personality, and he used different classifiers to build his models, including CNN, logistic regression and SVM. The sentiment + emotion features with an SVM classifier achieved the best results, with 88.43% accuracy.

He concluded his talk by discussing a few ideas for possible future work, such as training your own word embeddings, character n-gram embeddings and attention networks.

Another interesting talk, Conversation Agents at Scale: Retrieval and Generative Approaches, was given by Swapan Rajdev, CTO of Haptik. He described the retrieval approach, which is based on a Dialogue Management System.

From a technology point of view, these systems incorporate word embeddings for numerical representation of the text, along with intent detection and named entity recognition algorithms. He further discussed the advantages and disadvantages of using this approach.

Rajdev also talked about the generative approach which is based on language models and learns from historical conversations to generate new replies for the user without any guidance from the management system.

He mentioned that this approach uses sequence-to-sequence models, which are based on recurrent neural networks (RNNs) such as LSTMs or GRUs, and discussed the pros and cons of this approach as well.

He suggested that a hybrid architecture combining both approaches can be used to build a chatbot that initially uses the retrieval approach and, once a lot of data is available, also uses the generative approach to create replies.
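One way to picture such a hybrid is as a simple routing rule, sketched below under several assumptions that are not from the talk: retrieval is approximated by cosine similarity over word counts (a stand-in for the word embeddings, intent detection and entity recognition a real system would use), the generative model is a placeholder callable named `generate`, and the `threshold` and `min_logged` values are arbitrary.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Word-count vector; a stand-in for learned word embeddings."""
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(count * b[word] for word, count in a.items())
    norm = math.sqrt(sum(c * c for c in a.values())) * math.sqrt(sum(c * c for c in b.values()))
    return dot / norm if norm else 0.0


class HybridBot:
    """Hypothetical router: retrieval first, generation only once enough data exists."""

    def __init__(self, canned_replies, generate, min_logged=1000, threshold=0.5,
                 default_reply="Sorry, I did not understand that."):
        self.examples = [(bag_of_words(q), r) for q, r in canned_replies.items()]
        self.generate = generate          # placeholder for a trained seq2seq model
        self.min_logged = min_logged      # conversations needed before generating
        self.threshold = threshold        # minimum similarity to trust retrieval
        self.default_reply = default_reply

    def reply(self, message, logged_conversations):
        query = bag_of_words(message)
        score, answer = max(
            ((cosine(query, vec), rep) for vec, rep in self.examples),
            key=lambda pair: pair[0],
            default=(0.0, None),
        )
        if score >= self.threshold:
            return "retrieval", answer
        if logged_conversations >= self.min_logged:
            return "generative", self.generate(message)
        return "fallback", self.default_reply


bot = HybridBot(
    {"what are your opening hours": "We are open 9 to 5.",
     "how do i reset my password": "Use the reset link."},
    generate=lambda message: "GENERATED",   # dummy generator for the demo
)
print(bot.reply("what are your hours", logged_conversations=0))
print(bot.reply("tell me a joke", logged_conversations=5000))
```

The first message closely matches a stored question and is answered by retrieval; the second matches nothing and is passed to the generator only because the bot is assumed to have logged enough conversations by then.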

He concluded his talk by giving a few guidelines to keep in mind while designing and building a chatbot for a production environment: balancing accuracy and performance, deciding whether it runs in real time or offline, and determining how often the model needs to be retrained.