KDnuggets Interview: Paul Zikopoulos, IBM on Big Data Opportunities and Challenges

We discuss the value of Big Data for SMBs, how Cognitive will impact Big Data, IBM’s distinction from competition, significant trends and more.

Paul C. Zikopoulos, B.A., M.B.A., is the Vice President of Technical Sales for IBM’s Information Management division and additionally leads its World Wide Competitive Database and Big Data teams. Paul is an award-winning writer and speaker with more than 20 years of experience in Information Management and is seen as a global expert in Big Data and Analytic technologies. Independent groups often recognize Paul as a thought leader with nominations to SAP’s “Top 50 Big Data Twitter Influencers”, Big Data Republic’s “Most Influential”, Onalytica’s “Top 100”, and Analytics Week “Thought Leader in Big Data and Analytics” lists. Big Data Made Simple noted him as one of its “Top 200 Big Data Thought Leaders on Twitter” and Technopedia listed him as one of its “Big Data Experts to Follow”.

Paul has written more than 350 magazine articles and 19 books, including “Big Data Beyond the Hype”, “Hadoop for Dummies”, and “Harness the Power of Big Data”.

First part of interview.

Here is the second and last part of my interview with him:

Anmol Rajpurohit: Q5. Is Big Data relevant only to big corporations or is it also relevant to small and medium-sized businesses?

Paul Zikopoulos: Big Data is relevant to everyone because of what I said earlier, it's a little more data than before. But it's a discipline, it's a mindset of enquiry. Personally, because SMBs tend to be more nimble, I think they can outdo their larger peer groups and take larger market share by finding efficiencies faster. It's a great time to be an SMB.

AR: Q6. Why do you believe that "Cognitive" would be a game changer for the field of Big Data?

PZ: For this simple reason. In the next few years, you are going to see a shift from the billions of people generating PBs of data today to hundreds of billions of devices generating ZBs of data - this is a massive change in scale, and computers have to learn with us to take advantage of it.

AR: Q7. What are the major dangers of Big Data that people must be aware of? How can we stay away from them?

PZ: I am just going to give you one (other than the obvious, which is governance). Don't start with a science project. Use Big Data to solve business problems. And Big Data can't solve everything - for example, I don't expect the Toronto Maple Leafs to win the Stanley Cup anytime soon because they are getting into the data science game.

AR: Q8. How do you distinguish IBM from its competition in the Big Data vendor landscape?

PZ: Well, there are multiple competitors out there. I think, however, the biggest things that IBM does to distinguish itself from others are the following. First, we don't just talk about analytics for data at rest (be it in HDFS or RDBMS, or both) but we talk about taking the harvested insights there and bringing the focus to analytics on data in motion. Second, cognitive. Third, governance. Fourth, consumability. Put that all together, and the stuff I didn't mention, and the point I'm making is we offer a platform. That platform accounts for things like data integration, governance, reporting, search, data science, and more. I've not seen one vendor bring to market this kind of platform and capability.

AR: Q9. What key trends will drive the growth of Big Data industry for the next 2-3 years and what factors will play a critical role in the success of Big Data projects?

PZ: I will say this - the landscape is going to change like crazy. I alluded to this with Spark. I think HDFS is going to change - I really feel that file system has a lot of shortcomings. I think we will see more high level languages, more Apache projects, but I also think we will see a lot of consolidation. It's the wild wild west in Apache land. I mean look at Sentry and Falcon for security for example. Look at Hive on Tez and so on. I hope we will see even more zaniness in some of the names of these projects too.

AR: Q10. What is the best advice you have got in your career?

PZ: Someone once told me "Don't let good enough...be good enough...". And I never have...

AR: Q11. What key qualities do you look for when interviewing for Data Science related positions on your team?

PZ: It's funny you asked me that, because I have a good track record here and people have been asking the same question. I don't spend a lot of time on qualifications. I mean, really, you are going to give me a CV and references - are they going to make you look bad? No. I didn't tell my wife my bad points on our first dates (I never told her, she had to find out the hard way...). Anyway, so I meet them. I get a vibe. Can you communicate? I'm not looking for some stats person that can't speak. I'm looking for someone that is well dressed, but has a hard time figuring out what to do on a Friday night: see a movie, play Wizard of WarCraft, or download the latest R module on CRAN. Then I get them to do something ... I put them out of their comfort zone. That's when I see the kind of character you have.

AR: Q12. What was the last book that you read and liked? What do you like to do when you are not working?

PZ: The last book I read and liked was my own: Big Data Beyond the Hype. I get that sounds bad...but it's true, and I really do like it :). When I'm not working you are going to find me in the gym or doing hot yoga, unless I'm with my kid, who affectionately calls me "Big DaDa".