Interview: Hobson Lane, SHARP Labs on the Beauty of Simplicity in Analytics
Tags: Hobson Lane, Interview, Natural Language Processing, Predictive Analytics, Project Fail, SHARP, Tools
We discuss Predictive Analytics projects at Sharp Labs of America, common myths, value of simplicity, tools and technologies, and notorious data quality issues.

Here is my interview with him:
Anmol Rajpurohit: Q1. Can you share some of the prominent use cases of Predictive Analytics at Sharp Laboratories of America? What are some projects that you are currently working on or have recently completed?
Hobson Lane:

For a second project, SHARP needed predictions of commercial building daily power consumption profiles and peaks. We delivered a neural net and quantile filter with predictions that will enable fully autonomous operation of a system that dramatically reduces the energy bill for commercial buildings where it is deployed.
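The combination described above can be sketched in a few lines. This is a hypothetical illustration, not SHARP's actual model: the synthetic load data, the 95th-percentile level, and the peak-flagging logic are all assumptions, standing in for the neural net and quantile filter mentioned.

```python
import numpy as np

# Hypothetical sketch: estimate a daily peak-demand envelope from
# historical 24-hour load profiles using an upper quantile per hour.
# Data, quantile level, and peak shape are illustrative assumptions.

rng = np.random.default_rng(0)
hours = np.arange(24)
# 90 days of synthetic hourly consumption (kW) with a mid-afternoon peak
profiles = 50 + 30 * np.exp(-((hours - 14) ** 2) / 8) + rng.normal(0, 5, (90, 24))

# Quantile "filter": the per-hour 95th percentile bounds typical
# demand, so a controller can anticipate when peaks will occur.
envelope = np.quantile(profiles, 0.95, axis=0)
peak_hour = int(np.argmax(envelope))
print(peak_hour)  # hour of day with the highest demand envelope
```

An autonomous controller could use such an envelope to pre-cool or shed load before the flagged peak hours, which is the kind of decision that drives a commercial building's demand charges.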
AR: Q2. Based on your extensive experience, what do you observe as the most common myths (or errors) prevalent in Predictive Analytics?
HL:
One persistent myth is that complex, inscrutable models are required to deliver valuable predictions. We data scientists are often responsible for propagating that myth, due to an obvious conflict of interest.
For example, on that project at SHARP that I mentioned, the non-technical sales team came up with a database query and statistical measure that was sufficiently accurate to monitor the effect of process improvements and forecast return rates well into the future. And it was in place, integrated into their process long before my slightly more accurate, precise, and complicated model was ready. And we continued developing and implementing "value-add" features such as natural language processing and interactive visualizations long after the lion's share of the value had been extracted from the data.
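The "simple query plus statistic" idea is worth making concrete. The sketch below is a guess at the shape of such a solution, not the sales team's actual query: the column names, figures, and trailing-mean forecast are all illustrative assumptions.

```python
import pandas as pd

# Hypothetical sketch: compute a monthly return rate from shipment
# records and forecast next month as a trailing 3-month mean.
# All data and column names here are made up for illustration.

returns = pd.DataFrame({
    "month": pd.period_range("2014-01", periods=6, freq="M"),
    "units_shipped": [1000, 1100, 1050, 1200, 1150, 1250],
    "units_returned": [50, 52, 48, 54, 51, 55],
})
returns["return_rate"] = returns["units_returned"] / returns["units_shipped"]

# Crude forecast: the trailing mean -- often accurate enough to
# monitor whether process improvements are moving the needle.
forecast = returns["return_rate"].tail(3).mean()
print(round(forecast, 4))
```

A model this simple is transparent to the whole team, which is exactly why it was integrated into the process long before a more sophisticated model was ready.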
AR: Q3. In general, what approach do you follow for high-impact Predictive Analytics projects? How do you measure the success of your projects?

HL: We brainstorm while visualizing and slicing data from various angles until a trend emerges. Only then do the executives have strong supporting evidence and team buy-in to support them as they begin the challenging task of redirecting a large, complex organization.
AR: Q4. What tools and technologies are used most often by your team?
HL: I default to open source. Fortunately SHARP's Big Data

AR: Q5. What have been the most notorious data quality issues you have come across? How did you deal with them?

HL: Fortunately Django, Postgres, pandas, and Python itself have features that make it straightforward to identify outliers and impute or delete troublesome records.
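As a minimal sketch of that kind of cleanup in pandas, the snippet below flags outliers with a robust z-score (median and MAD rather than mean and standard deviation) and imputes the flagged records by interpolation. The threshold and data are illustrative assumptions, not taken from any SHARP dataset.

```python
import pandas as pd

# Sketch: identify outliers with a robust z-score, then impute them.
# The sensor-like values and the 3.5 cutoff are assumptions.
s = pd.Series([10.2, 10.4, 9.9, 10.1, 250.0, 10.3, 10.0, -99.0, 10.2])

median = s.median()
mad = (s - median).abs().median()           # median absolute deviation
robust_z = 0.6745 * (s - median) / mad      # 0.6745 rescales MAD to sigma
outliers = robust_z.abs() > 3.5

# Mask outliers to NaN, then impute via linear interpolation;
# s.drop(s.index[outliers]) would delete them instead.
cleaned = s.mask(outliers).interpolate()
print(int(outliers.sum()), cleaned.round(1).tolist())
```

The robust z-score matters here: a plain mean/std z-score would be dragged toward the very outliers it is trying to detect, while the median and MAD are barely affected by them.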
Second part of the interview