Predictive Analytics Innovation Summit, Chicago – Day 1 Highlights

Highlights from the presentations by Predictive Analytics leaders from Netflix, LinkedIn and Mashable on day 1 of Predictive Analytics Innovation Summit 2014 in Chicago.

The Predictive Analytics Innovation Summit was held by Innovation Enterprise in Chicago on Nov 12-13, 2014. It provided a platform for leading executives to share insights into the innovations driving success in the world's most successful organizations. Data scientists and decision makers from a number of companies came together to learn practical predictive analytics from top companies such as Amazon, Twitter, Verizon and Microsoft. Industry-leading experts shared case studies and examples to illustrate how they are using analytics to innovate in their organizations.

Here are highlights from Day 1 (Wednesday, Nov 12):

Kelly Uphoff, Director, Experimentation & Algorithms for Growth & Targeting, Netflix, talked about quasi-experimentation at Netflix. The company's three main focus areas are content, product and marketing. She leads a team of data scientists who build algorithms and design and analyze experiments in the areas of original content promotion, signup flow optimization, messaging, fraud, customer service and marketing.

Experimentation at Netflix rests on a strong product innovation methodology, with the goal of maximizing revenue. The majority of product experiments are run as A/B tests. A fundamental principle of A/B testing is user-level random assignment into distinct test and control groups. But what if one wants to measure the impact of outdoor media versus other types of local spend, such as radio or print? Such questions cannot be answered within the construct of an A/B test. Instead, they are answered via quasi-experimentation, in which different time periods, different regions, or a combination thereof serve as "quasi" controls.
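As a minimal illustration of user-level random assignment, the sketch below hashes a user ID together with an experiment name to place each user in a test or control group. This is a common general-purpose technique, not a description of how Netflix assigns users; the function name `assign_group`, the experiment label and the 50/50 split are all assumptions made for the example.

```python
import hashlib


def assign_group(user_id: str, experiment: str, test_share: float = 0.5) -> str:
    """Deterministically map a user to 'test' or 'control' for one experiment.

    Hypothetical helper for illustration only. Hashing the experiment name
    together with the user ID keeps assignments independent across experiments.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    # Interpret the first 8 hex digits as a number in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return "test" if bucket < test_share else "control"


# The same user always lands in the same group for a given experiment.
group = assign_group("user-12345", "signup-flow-v2")
```

Assignment of this kind works only when the treatment can be switched on per user. A billboard or radio campaign reaches everyone in a region, which is why such questions fall outside the A/B framework.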

Quasi-experiments need a clear learning objective, and picking the right test and control regions is key. They also require monitoring the experiment throughout and adjusting raw results for historical differences. She then discussed an example model, its components and its validity. She concluded the talk by discussing areas still to be addressed: model improvement, inclusion of external data (weather, economic trends, etc.), creation of an R package, and comparison with Google's recently released approach to quasi-experimentation analysis.
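One simple way to adjust raw results for historical differences between regions is a difference-in-differences estimate. The sketch below is an assumption about the general idea, not the model presented in the talk; the function name `diff_in_diff` and all signup figures are invented for illustration.

```python
from statistics import mean


def diff_in_diff(test_pre, test_post, control_pre, control_post):
    """Change in the test region minus change in the control region.

    Subtracting the control region's change removes trends that affected
    both regions, leaving an estimate of the campaign's own effect.
    """
    test_change = mean(test_post) - mean(test_pre)
    control_change = mean(control_post) - mean(control_pre)
    return test_change - control_change


# Hypothetical weekly signups before and after a local campaign.
test_pre = [100, 104, 98, 102]      # test region, before (mean 101)
test_post = [118, 122, 120, 124]    # test region, after (mean 121)
control_pre = [80, 82, 78, 80]      # control region, before (mean 80)
control_post = [86, 88, 84, 86]     # control region, after (mean 86)

# Raw change in the test region is 20; the control region grew by 6 anyway,
# so the adjusted estimate is 14 signups per week.
lift = diff_in_diff(test_pre, test_post, control_pre, control_post)
```

The estimate is only as good as the assumption that the two regions would have moved in parallel without the campaign, which is why the choice of control regions matters so much.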

Haile Owusu, Chief Data Scientist at Mashable, talked about how Mashable is developing proprietary technology and using data, from both an editorial and a sales perspective, to help deliver the best content to its engaged and growing audience. He started with a brief history of the organization. At present, Mashable has about 40 million monthly unique visitors and 19 million social followers. Its proprietary platform, Velocity, predicts what will go viral next. Before Velocity, the editorial staff spent an enormous amount of time deciding where to place stories in the layout. Velocity made this task easier.

Discussing how Mashable Velocity works, he mentioned three components: crawling algorithms, machine learning and data science. The crawler processes more than a million links a day and is highly scalable and flexible. The list of viable domains for content is constantly augmented.

While crawling, natural language processing is used to understand what is at the top of the content. Machine learning is used to decide how the layout should be designed and which stories should be placed where. Data science is used to model how content diffuses over the web. He gave some interesting use cases of Velocity and concluded by sharing some details about contracts with MEC and 360i.

Ray Leihe, Director, Business Analytics, LinkedIn, delivered a talk on "The Predictive-First Revolution at LinkedIn". He mentioned that the LinkedIn platform has about 332 million members, over 3.5 million companies and over 3 billion endorsements. Using its extensive professional graph, rich content and engagement data, LinkedIn is quietly revolutionizing the sales and marketing world by creating a new generation of social selling and content marketing platform powered by machine learning and predictive modeling.

LinkedIn initially focused on its mobile app, putting user experience first and foremost. Now, however, the company thinks "predictive first" when designing user experiences and building apps. Examples include "People You May Know" and "Jobs You May Be Interested In". He briefly explained the three principles of "predictive first", giving a few examples of each: (1) be relevant, (2) be helpful, and (3) be everywhere.

He also talked about Pinot, a distributed analytics infrastructure that serves interactive analytics products at LinkedIn. It uses compressed columnar indexes, Apache Helix for cluster management, and Apache Kafka and Hadoop for ingestion. He concluded by saying, "We are investing in a new generation of social selling and content marketing platform to make sales and marketing more effective."

Highlights from Day 2 will be published soon.