
“Citizen Data Scientist” Revolution


The naysayers are on the wrong side of the "citizen" Data Scientist debate. Business users already have self-service BI capabilities and make decisions whether they are statistically sound or not. We can’t stop them from making decisions but should make statistically sound decisions easier. This new approach is called Smart Data Discovery.

By Arijit Sengupta, CEO BeyondCore.

Businesses have been talking about the ‘data driven enterprise’ for a while and now they are excited about the Citizen Data Scientist. The idea that business users without statistical training will conduct data science is scary to any formally trained statistician. A recent KDnuggets editorial entitled “The Mirage of a Citizen Data Scientist” compared Citizen Data Scientists to a Google Car without a steering wheel. This is understandable because every expert analyst has a horror story in which a business user incorrectly interpreted their data and made the wrong business decision as a result.

However, the naysayers are on the wrong side of this debate. Business users already have self-service BI capabilities that allow them to draw pretty graphs based on their data and make decisions whether they are statistically sound or not. We can’t stop them from doing so because they are ultimately the business decision-makers. What we can do is make statistically sound decisions easier to reach than bad ones. This new approach is called Smart Data Discovery. In its most recent BI and Analytics Magic Quadrant report, Gartner highlighted a ‘Visionary’ technology that “ensures users are warned about potential hidden factors that might better explain a visually exciting pattern and protects users from taking statistically unsound decisions. This functionality addresses a key skills shortage highlighted by Gartner that most business users do not have the training necessary to accurately conduct or interpret analysis.”

As with the PC revolution, we cannot expect only a small group of experts to have access to the power of analytics. Business users will visualize data, and rather than discourage it, we need to do our best to ensure they analyze it accurately and efficiently. That means they need to understand and adopt data science principles such as recognizing the statistical soundness of patterns and controlling for confounding factors. Gartner is highlighting this market need and encouraging software vendors to leverage this opportunity. Whether or not we agree with the terms Citizen Data Scientist and Smart Data Discovery, we can all agree on a real need for software for the business user that helps them accurately analyze data to find actionable insights.

What is the real world difference between self-service visualization and Smart Data Discovery? Imagine a marketing executive who wants to evaluate which marketing promotion is most effective. In today’s world of visualizations, she would just draw a graph correlating revenue to each promotion using her favorite visualization tool. If a specific promotion was associated with higher average revenue, she would decide to spend more money on that promotion and convince herself that she made a ‘data-driven decision.’

The same marketer using a Smart Data Discovery solution would have a very different experience. If a specific marketing promotion was effective in increasing revenue, the software would itself recommend that insight to the user. If the business user asked for the graph of revenue by promotion, the software would automatically conduct appropriate tests to confirm whether the pattern was a statistical trend or caused by outliers. An important distinction is that it would then automatically check for confounding effects. For example, perhaps the promotion was mainly run in California or during the month of November, and as such it looks more effective only because it was associated with markets and months with higher revenue. The software would then adjust for such confounding patterns and show the net effect of the promotion alone. Such software is available today: software that makes self-service analytics less error-prone, not more.
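To make the California example concrete, here is a minimal Python sketch of one simple way to adjust for a single confounder: compare promotion and non-promotion revenue within each region, then average those within-region gaps. This is an illustration only, not a description of how any particular Smart Data Discovery product works. The data, the field names (`region`, `promo`, `revenue`), and the function names are all made up for this example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical data: the promotion ran mostly in California, where baseline
# revenue is higher. The true promotion effect built into the data is +10.
ROWS = (
    [{"region": "CA", "promo": True, "revenue": 210.0}] * 8
    + [{"region": "CA", "promo": False, "revenue": 200.0}] * 2
    + [{"region": "Other", "promo": True, "revenue": 110.0}] * 2
    + [{"region": "Other", "promo": False, "revenue": 100.0}] * 8
)


def naive_effect(rows):
    """Difference in mean revenue, ignoring where the promotion ran."""
    promo = [r["revenue"] for r in rows if r["promo"]]
    control = [r["revenue"] for r in rows if not r["promo"]]
    return mean(promo) - mean(control)


def stratified_effect(rows, key):
    """Average the promo-vs-control gap within each stratum of `key`,
    weighting each stratum by its number of rows."""
    strata = defaultdict(list)
    for r in rows:
        strata[r[key]].append(r)

    weighted_sum = 0.0
    total_rows = 0
    for group in strata.values():
        promo = [r["revenue"] for r in group if r["promo"]]
        control = [r["revenue"] for r in group if not r["promo"]]
        if not promo or not control:
            continue  # no comparison is possible inside this stratum
        weighted_sum += len(group) * (mean(promo) - mean(control))
        total_rows += len(group)

    if total_rows == 0:
        raise ValueError("no stratum contains both promo and control rows")
    return weighted_sum / total_rows


if __name__ == "__main__":
    print("naive effect:   ", naive_effect(ROWS))
    print("adjusted effect:", stratified_effect(ROWS, "region"))
```

On this made-up data the naive comparison shows a gap of 70, while the within-region comparison shows a gap of 10. Most of the apparent lift comes from where the promotion ran rather than from the promotion itself. Real tools would need to handle several confounders at once (for example region and month) and test whether the remaining gap is statistically meaningful, which this sketch does not attempt.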

What role should expert analysts play in this ‘Citizen Data Scientist’ revolution? First, purely as a matter of self-defense, we should encourage business users to adopt analytic techniques as opposed to simplistic visualizations. Gartner wrote, “less than 10 percent of self-service BI initiatives will be governed sufficiently to prevent inconsistencies that adversely affect the business.” Who will have to clean up after such adverse events? The expert analysts and data scientists. But, if the experts spend most of their time cleaning up after business users, they won’t have much time to conduct analyses of their own. Therefore, we must encourage business users to use these smart data discovery tools to reduce that burden.

Second, as business users start adopting true analytical techniques, they will become more informed consumers of the insights expert analysts deliver. Analysis is not useful unless it is acted upon. The next stage of the ‘data-driven enterprise’ is the analytical enterprise that knows what has happened and why, what has changed recently and why, what will happen, and what can be done to improve business outcomes.

As you can see, the demand for analytics and expert analysts will only increase. As a parallel example, when the heart stent was first introduced, cardiologists were concerned they would lose money because the demand for open heart surgeries would decrease. Instead, stents reduced the risk and cost of surgeries, and cardiologists ended up making more money because they could do many more stent surgeries and there was always the need for really complex heart surgeries that they could now focus on.

We started by comparing Citizen Data Scientists with the Google Car. In their March 7, 2016 cover article, Time magazine stated: “Because the gulf between human and machine is so vast – and growing – the next step after making driverless cars legal will be making them mandatory.” The same is true of analytics.

Bio: Arijit Sengupta is the CEO and Founder of BeyondCore. Before founding BeyondCore, Arijit held a variety of technical and management positions at Oracle and Microsoft. He has been granted 15 patents in advanced analytics, business process as a service, operational risk, privacy and information security. Arijit holds an MBA with distinction from the Harvard Business School and bachelor's degrees with distinction in Computer Science and Economics from Stanford University.