Interview: Ali Vanderveld, Groupon on Vital Ingredients of Analytics-powered Sales Force

We discuss the role of Analytics at Groupon, deciding factors for merchant priority, limitations of historical data, optimizing the efforts of sales force, data characteristics and dealing with Data Sparsity.

Ali Vanderveld is an astrophysicist turned data scientist. After receiving her PhD from Cornell University, she worked as a postdoctoral scholar at Caltech and the NASA Jet Propulsion Laboratory, and then as a research fellow at the University of Chicago. During this time she was a member of the development teams for several space telescope missions, including the SuperNova Acceleration Probe, the High Altitude Lensing Observatory, and Euclid. Ali joined Groupon in 2013 and she currently works on the Data Science team, where she uses historical data and machine learning techniques to study issues pertaining to sales and marketing.

Here is my interview with her:

Anmol Rajpurohit: Q1. What are the common use cases of Analytics at Groupon? Which of them are the most challenging?

Ali Vanderveld: At Groupon, we use analytics to provide data-driven answers to business questions, to develop new products and to optimize various processes. We use analytics in every stage of our business, from finding the best merchants for each market, to pairing customers with the most relevant deals and optimizing the layout of each page of our website. Some of the most challenging use cases deal with user-level modeling, for instance in determining relevance, due in part to such a large yet sparse data set.

AR: Q2. What are the important factors in deciding the priority order of merchants to be approached by the Groupon sales force?

AV: The most important factors in determining the ranking include the demand for each service in each market, the quality and riskiness of each merchant and the probability of closing each lead. We use different types of predictive models for each, tailored to the needs of each category.

AR: Q3. What are the limitations of predicting demand based only on historical sales data?

AV: Predicting demand from historical sales data is particularly difficult in cases of very rare or new services, where we may not have sufficient data for a time series model. In addition, sales-velocity-based demand can lead to “self-fulfilling prophecies.” For example, if Groupon only offered pizza on our site, then we would only ever forecast demand for pizza, and we would forecast zero demand for everything else.

AR: Q4. Can you explain the term "pull-based demand forecasting"? How does it help detect inventory gaps?

AV: Due to the limitations of demand forecasting from only historical sales data, we also supplement it with demand forecasting from search query data. The Groupon website and app feature prominent search bars, and we track the query string and location for every search event on each platform. We then match these events to our deal taxonomy to find instances where users are searching for something but then not making a purchase. As a result, we use searches with low conversion rates as an inventory gap detector.
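
The idea of using low-conversion searches as a gap detector can be illustrated with a minimal Python sketch. This is not Groupon's implementation: the event fields (`category`, `location`, `purchased`), the function name `find_inventory_gaps` and both thresholds are assumptions made for illustration, and the sketch presumes each search has already been matched to a taxonomy category.

```python
from collections import defaultdict


def find_inventory_gaps(search_events, min_searches=3, max_conversion=0.1):
    """Flag (category, location) pairs that are searched often but rarely bought.

    Both thresholds are hypothetical and would need tuning on real data.
    """
    searches = defaultdict(int)
    purchases = defaultdict(int)
    for event in search_events:
        key = (event["category"], event["location"])
        searches[key] += 1
        if event["purchased"]:
            purchases[key] += 1

    gaps = []
    for key, n_searches in searches.items():
        conversion = purchases[key] / n_searches
        if n_searches >= min_searches and conversion <= max_conversion:
            gaps.append((key, n_searches, conversion))

    # Most-searched gaps first, since they represent the most unmet demand.
    gaps.sort(key=lambda gap: gap[1], reverse=True)
    return gaps


# Toy data: sushi is searched but never bought; pizza converts well.
events = (
    [{"category": "sushi", "location": "Chicago", "purchased": False}] * 4
    + [{"category": "pizza", "location": "Chicago", "purchased": True}] * 2
    + [{"category": "pizza", "location": "Chicago", "purchased": False}]
)
gaps = find_inventory_gaps(events)
```

With this toy data, only sushi in Chicago is flagged, because it has enough searches and no purchases, while pizza converts too often to count as a gap.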

AR: Q5. What are the key capabilities of Quantum Lead? How does it optimize the efforts of sales representatives?

AV: Quantum Lead uses these models to prioritize merchants, and then it matches those leads to the best-suited members of our sales force. Then when each sales rep comes into work in the morning, he/she has a ranked “to do” list of merchants to reach out to.

AR: Q6. What has been the feedback from Groupon sales force? What do they feel about being driven by Analytics (and less so by their gut instinct)?

AV: It is of course difficult to earn trust in an algorithm, especially when the user has domain expertise that they already trust. This is why we’ve had focus groups with the sales team to put some faces behind the data science team and to make ourselves available to answer any questions. We also include “sales value reasons” for each lead in the UI, to let the rep know why it is prioritized the way it is. These reasons could include things like good customer reviews, prior deal performance or high refund risk. Reps also have a mechanism to “flag” anything that they find suspicious, and this feedback is worked back into the system.

AR: Q7. What are the unique characteristics of the datasets that you deal with? How do you address the problem of Data Sparsity?

AV: Historical sales data can be sparse and search query data can be messy. We deal with the sparsity problem by employing fallbacks, using our deal taxonomy. Deals are categorized in terms of service and merchant type in a tree structure with different levels of granularity. If we lack sufficient data at the smallest granularity level, then we fall back and use results from the next level up. We also employ geography-based fallbacks, using data from larger geographic regions if we lack sufficient data on smaller scales.
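
A taxonomy fallback of this kind might look roughly like the following Python sketch. The names `estimate_with_fallback` and `stats`, the example categories and the `min_samples` threshold are all hypothetical; the interview does not describe how Groupon stores its taxonomy or decides that data is sufficient.

```python
def estimate_with_fallback(stats, category_path, min_samples=30):
    """Return (node, estimate) from the most specific node with enough data.

    `category_path` runs from the root of the taxonomy to the leaf, and
    `stats` maps each path prefix (as a tuple) to a sample count and a mean.
    """
    for depth in range(len(category_path), 0, -1):
        node = tuple(category_path[:depth])
        entry = stats.get(node)
        if entry is not None and entry["n"] >= min_samples:
            return node, entry["mean"]
    # No level of the tree has enough data.
    return None, None


# Made-up numbers purely for illustration.
stats = {
    ("Food",): {"n": 500, "mean": 12.0},
    ("Food", "Pizza"): {"n": 80, "mean": 15.0},
    ("Food", "Pizza", "Deep Dish"): {"n": 4, "mean": 40.0},
}
node, estimate = estimate_with_fallback(stats, ["Food", "Pizza", "Deep Dish"])
```

Here the leaf has only 4 samples, so the lookup falls back one level to `("Food", "Pizza")`. The same loop could serve a geographic hierarchy (for example neighborhood, city, region) by passing a location path instead of a category path.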

The second part of the interview will be published soon.

Anmol Rajpurohit is a software development intern at Salesforce. He is an MDP Fellow and graduate mentor at UCI-Calit2. He has presented his research work at various conferences including IEEE Big Data 2013. He is currently a graduate student (MS, Computer Science) at UC Irvine.