KDnuggets Top Blogger: An Interview with Bill Schmarzo, the Dean of Big Data

Read an interview with the Dean of Big Data Bill Schmarzo, one of KDnuggets' Top Bloggers for September, and gain some insight on the topics data science, IoT, Big Data... and jeans!

KDnuggets has begun a new initiative of connecting our readers with their favorite blog authors.

We have been recognizing top contributors over the past few months with weekly and monthly Top Blogger Badges, awarded to the top 2 most viewed and shared articles of the week/month, and are building upon this by allowing some of these authors to speak directly to our readers.

One of these recognized blogs was Big Data Dilemma: Save Me Money Versus Make Me Money, one of the most shared for the week of Sep 5-11, 2016. The piece was written by William (Bill) Schmarzo, a leading and widely respected Big Data authority. In fact, he's known in some circles as the Dean of Big Data! Bill is the Chief Technology Officer at Dell EMC, and blogs prolifically on a whole host of Data Science and Big Data topics, a number of which are carried on KDnuggets.

Does your organization see Big Data as an opportunity to “Save Me More Money”, or does your organization see Big Data as an opportunity to “Make Me More Money”?

You can find Bill on LinkedIn, on Twitter, and can read his Dell EMC blog here.

Bill has agreed to answer a few questions in order to allow readers to get to know him a bit better. Take a couple of minutes to find out more about the Dean of Big Data below.

Matthew Mayo: Hi Bill! Thanks for agreeing to speak with our readers.

You have been described by folks as the Dean of Big Data, which even you note may have been applied in a light-hearted spirit. However, having a lengthy list of Big Data-related accomplishments and accolades, such a moniker comes with a responsibility that I'm sure you don't take lightly. What does being the Dean of Big Data mean to you?

Bill Schmarzo: I got the nickname from theCUBE/SiliconAngle folks at one of the first Strata conferences. I was asked to do a non-technology / non-engineering session about the value of big data to the business stakeholders. So I created a class that leveraged some of Michael Porter’s materials and some of Peter Drucker’s materials and called it the “Big Data MBA.” So theCUBE folks said, “Hey, you’re like the Dean of Big Data” and the term stuck.

So what does that nickname mean to me? It means that I am responsible for helping to lead a transformation within the business community. I use my position to be provocative; to challenge business leadership to address the following simple question: How effective is your organization at leveraging data and analytics to power your business models?

BTW, I also teach at the University of San Francisco School of Management a class called the “Big Data MBA” (and that’s the name of my second book, which is the textbook for my class). The underlying foundation for our class is that we want to teach tomorrow’s business leaders that they will need to embrace analytics as a business discipline, not just something that they flip over to IT.

An increasing number of your recent posts are at the intersection of Data Science and the Internet of Things. Could you give us some background on why your interests have gone in this particular direction, and what you see as the future of DS/IoT?

My original (naïve) thinking was that the IoT was just another data source, like social media, clickstreams, mobile devices and wearables. But I’ve quickly been educated about the challenges with edge analytics, and how strong a role big data (and the data lake and data science) play in enabling edge analytics. We understand the real-time use cases that require driving real-time actions to the devices at the edge of IoT, such as the optimal angle and yaw of the wind turbine blades to generate the appropriate amount of electricity given the atmospheric conditions, environmental factors and the market spot price for electricity. However, there is still a need to move much of that sensor data to the data lake to support more strategic use cases such as predictive maintenance (which devices or machines need what sort of maintenance, when should the maintenance be done, what parts are we going to need, and who is best qualified to perform that maintenance), disaster planning and recovery (where should we pre-position our maintenance trucks prior to the incoming hurricane in order to minimize power disruptions), and network capacity planning (where and when do we need to add more power generation, what sort of power generation should we build, when do we start the project, and how do we minimize future inventory and logistical costs by where we position the power generation capabilities vis-a-vis like power generation).


What advice would you give a newcomer to the realm of IoT Big Data?

Start with a thorough understanding of the use cases: what is the business value of that use case, what decisions are you trying to optimize, who are the key business stakeholders, and what data sources might you need in order to optimize those decisions. If one thoroughly understands the use case's data and analytic requirements, one is then better positioned to determine which data science approaches, techniques, tools and algorithms are most important. If you want your work to be relevant, then make sure that your work is addressing relevant, high-value business needs.

Lightning round! Just a few quick words or a sentence for these...

Favorite Big Data book?

Moneyball (because it teaches anyone the basic objectives of data science)

Python or R? :)

R, just because I’m too lazy to learn Python

Favorite music?

Funk (Parliament, Ohio Players) or Big Band Jazz (Maynard Ferguson)

Jeans or khakis?

Jeans, and the more worn, the better!

OK, before we let you go, is there anything else you would like our readers to know about Bill Schmarzo, Dell EMC, or Big Data (or, really, anything else at all!)?

I believe that the number one skill for becoming a great data scientist is humility. Our best data scientists are humble, which puts business stakeholders at ease. The best data scientists know that they can learn from anyone because anyone might have a better idea as to which variables and metrics might be better predictors of business performance.

If readers are interested in reading more of my blogs, they should check out infocus.dellemc.com/author/william_schmarzo. My job allows me to spend lots of time working directly with customers, so I try to capture and share the topics that are top of mind with my clients.

Thanks for your time, Bill. I'm sure I speak for our readers when I say that we look forward to your future content.

Bill Schmarzo's recent KDnuggets posts include: