Interview: Brian Kursar, Toyota on What You Need to be Truly Data-Driven

We discuss Toyota’s Customer 360 Advanced Analytics and Insights platform, Product Quality Analytics system, Predictive Analytics use cases & performance assessment, and challenges in analyzing data from social media.

Twitter Handle: @hey_anmol

Brian Kursar is Senior Data Scientist and Technology Director - Research and Development for Toyota Motor Sales, USA. In this role, Kursar has end-to-end responsibility for predictive analytics on Big Data across the enterprise. He is the chief architect on the Customer 360 Advanced Analytics and Insights platform at Toyota which powers the Toyota Social Media Intelligence Center as well as the recently launched Customer Return on Experience Analytics Application.

In 2011, his work was recognized in the Gartner BI Excellence awards as a Semi Finalist for architecting Toyota’s first Big Data Analytics application. Kursar has been with Toyota for 12 years. Prior to Toyota, he spent four years developing global supply chain solutions and consulting for the U.S. government in housing and urban - multimedia solutions.

Here is my interview with him:

Anmol Rajpurohit: Q1. What are the main responsibilities of the Customer 360 Advanced Analytics and Insights platform team at Toyota? How does it interact with the Toyota Social Media Intelligence Center?

Brian Kursar: The Customer 360 Advanced Analytics and Insights Team is made up of Data Scientists and Data Engineers who collaborate with our Business Groups to build predictive models that support key Business Insight initiatives.

The Toyota Social Media Intelligence Center leverages models created by our team to classify Social Media comments as they are posted in real time.

AR: Q2. What were the key aspirations from the Product Quality Analytics system established at Toyota in 2010? How has this system evolved since then? What are the key components of this system?

BK: The Product Quality Analytics system was my first exposure to Big Data and NoSQL databases. We had a big problem on our hands during the 2009 Toyota product recalls. The creation of this system focused on bringing several large disparate databases together under a single unified and scalable platform.

Since its inception, we have added additional data sources and increased the depth of historical data. It’s really amazing when you think about it. Using older technologies, it could take up to 45 minutes to run certain historical queries. Moving out of our traditional RDBMS and into an In Memory, NoSQL based solution, the Product Quality Engineers are able to render 16 of the same queries in less than 10 seconds. This has really moved Toyota to become a data driven organization.

Currently, we have implemented dashboards that go across both internal and external data sources to provide industry standard PP100 (Problems per 100) measurements as well as threshold based alerts, text analytics, facet based navigation, geospatial heat maps, and interactive drill down capabilities for data discovery.
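As a rough illustration of the PP100 measurement and threshold based alerts mentioned above, here is a minimal Python sketch. It is not Toyota's implementation: the function names, the model names, the counts, and the threshold are all hypothetical and invented for the example. PP100 is simply the number of reported problems scaled to a fleet of 100 vehicles.

```python
def pp100(problem_count, vehicle_count):
    """Problems per 100 vehicles: reported problems scaled to 100 units."""
    if vehicle_count <= 0:
        raise ValueError("vehicle_count must be positive")
    return 100.0 * problem_count / vehicle_count


def models_over_threshold(pp100_by_model, threshold):
    """Return the model names whose PP100 exceeds the alert threshold."""
    return sorted(name for name, value in pp100_by_model.items() if value > threshold)


# Hypothetical counts, invented purely for illustration: (problems, vehicles).
reports = {
    "model_a": (30, 200),
    "model_b": (12, 400),
}
scores = {name: pp100(problems, vehicles) for name, (problems, vehicles) in reports.items()}
alerts = models_over_threshold(scores, threshold=10.0)
print(scores, alerts)
```

Normalizing to 100 vehicles makes models with very different sales volumes comparable, which is why a single alert threshold can be applied across them.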

AR: Q3. You mentioned recently that roughly 85% of the underlying data is unstructured. What are the top sources of unstructured data?

BK: Understanding customer concerns is a top priority at Toyota. We look at everything from call center transcript logs, survey responses, technician notes, field reports, and emails to social media.

AR: Q4. What are currently the most prominent use cases of Predictive Analytics at Toyota? Based on what metrics do you assess the performance of your Predictive Analytics models?

BK: Our Team works on models for several Business Groups at Toyota. The most prominent are the ones around Product Quality, Customer Experience, and Customer Loyalty.

It all depends on the intent of the model. For the majority of what we do, we generally leverage confusion matrices or ROC curves. In some cases we have done customized scoring of models based on ROI.
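For readers unfamiliar with the two evaluation tools named above, the following minimal Python sketch shows what each one measures for a binary classifier. It is only an illustration under stated assumptions: the labels, scores, 0.5 cutoff, and function names are invented here and do not come from Toyota's models.

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels encoded as 0/1."""
    counts = Counter(zip(y_true, y_pred))
    return counts[(1, 1)], counts[(0, 1)], counts[(1, 0)], counts[(0, 0)]


def roc_auc(y_true, scores):
    """Area under the ROC curve, computed as the probability that a random
    positive example scores higher than a random negative one (ties count half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented example data: 1 = "customer concern", 0 = "no concern".
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.2, 0.6]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # hypothetical 0.5 cutoff

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
print(tp, fp, fn, tn, precision, recall, roc_auc(y_true, scores))
```

The confusion matrix describes performance at one chosen cutoff, while the ROC curve summarizes ranking quality across all cutoffs. That difference is one reason the choice between them depends on the intent of the model.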

AR: Q5. What have been the top challenges in applying Predictive Analytics on Social data?

BK: One of the big lessons learned in successfully applying Predictive Analytics on Social data has been gaining a good understanding of the golden rule that not all Social Data is equal. What I mean is that there are many factors to consider if you want to have a highly accurate and highly domain specific classification model for Social Data.
For instance, when you are looking to classify Social data, you need to ask yourself, “What is the source of this record?” Specifically, where did it come from? Is it a tweet in the middle of the twitterverse, or is it a post on a Brand page? Is it a response to a previous post? Perhaps it is a review on a review site answering a pointed question such as “What do you think of this movie?”

Context is king. Context, and more importantly, the understanding of how you can evaluate the context of the conversation and leverage it to help you get significantly more accurate results in your predictive models.

AR: Q6. How have these challenges changed in the past 5 years?

BK: I feel that the mentality 5 years ago was all about classifying Social data into buckets of Positive, Negative, and Neutral. Yet, if you are in a Call Center, monitoring hundreds of thousands of social interactions a day, even if you accurately classified 80% of the documents, you are still stuck with the problem of understanding why particular posts have been classified as negative and, more importantly, which ones to prioritize and take action on.

Over the last two years, Toyota has moved away from leveraging basic sentiment measurement and has moved toward leveraging machine learning to classify automotive, domain specific customer concerns as a way to proactively prioritize putting the Toyota Customer First.

Technology has played a big part in this. The faster processing times and greater number of Open Source tools being made available have allowed Toyota to bring kaizen (continuous improvement) to our Data Science practice.

Second part of the interview

Anmol Rajpurohit is a software development intern at Salesforce. He is a former MDP Fellow and a graduate mentor for IoT-SURF at UCI-Calit2. He has presented his research work at various conferences including IEEE Big Data 2013. He is currently a graduate student (MS, Computer Science) at UC, Irvine.