Exclusive Interview: Michael O’Connell, Chief Data Scientist, TIBCO on How to Lead in Big Data
We discuss Big Data vs. Fast Data, data visualization trends, the Jaspersoft acquisition, the factors that will differentiate future leaders in Big Data, and more.

Here is my interview with him:
Anmol Rajpurohit: Q1. How would you define "Big Data"? When enterprises think of Big Data, what are the most important questions they should be thinking about?

MOC: Some important big data questions in the enterprise include:
- Which specific business problems can be addressed with big data analytics, how, and what the value thesis/potential is.
- What data management and analytic methods are appropriate, e.g. in-database or in-appliance analytics, Hadoop MapReduce jobs and data-on-demand; and, most importantly, how to structure the ETL, EDA, feature identification and modeling to get to the kernel of the business problem. It is far more important to construct a relevant dataset and features, and to apply appropriate algorithms, than to run inappropriate algorithms on a big/raw dataset. For example, a (generalized) linear model run on millions/billions of rows is often not as informative as a model that better identifies the relevant patterns/structures, fitted on a smaller, well-constructed dataset.
- How to deploy insights obtained with big data-at-rest to fast data-in-motion and enable real-time data analysis.
I like to fit generalized additive models to identify (non-linear) relationships, gradient boosting machines to obtain good predictions, PCA with robust partitioning to identify segments, and association rules for affinity analysis. In addition to using a variety of supervised and unsupervised learning methods, we like to use location analytics and optimization methods; and we do a ton of time series analyses for forecasting demand and future results. We always explore the data and models with interactive EDA and visual analytics – this is a must for identifying the business-relevant structures.
In my experience one can generate extreme value by applying analytics at the confluence of (big) data at rest and fast data in motion to solve engineering, manufacturing, R&D and sales & marketing problems.
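As an editorial illustration of the affinity analysis mentioned in the answer above, here is a minimal sketch of pairwise association rules in plain Python. It is not TIBCO's or the interviewee's implementation: the baskets, the `pair_rules` helper and the `min_support` threshold are all assumptions made up for this example, and real workloads would normally use a dedicated library and rules over larger itemsets.

```python
from collections import Counter
from itertools import combinations

# Hypothetical market baskets; the items and data are invented for illustration.
baskets = [
    {"bread", "butter", "jam"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"butter", "jam"},
    {"bread", "butter", "milk"},
]


def pair_rules(baskets, min_support=0.4):
    """Return (antecedent, consequent, support, confidence, lift) for item pairs."""
    n = len(baskets)
    # How many baskets contain each single item.
    item_counts = Counter(item for b in baskets for item in b)
    # How many baskets contain each unordered pair of items.
    pair_counts = Counter(
        pair for b in baskets for pair in combinations(sorted(b), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # skip pairs that co-occur too rarely
        # Each frequent pair yields two directed rules: a -> b and b -> a.
        for x, y in ((a, b), (b, a)):
            confidence = count / item_counts[x]
            lift = confidence / (item_counts[y] / n)
            rules.append((x, y, support, confidence, lift))
    # Strongest associations (highest lift) first.
    return sorted(rules, key=lambda r: r[4], reverse=True)


rules = pair_rules(baskets)
for x, y, s, c, l in rules:
    print(f"{x} -> {y}: support={s:.2f} confidence={c:.2f} lift={l:.2f}")
```

A lift above 1 suggests the two items appear together more often than independence would predict; a lift below 1 suggests the opposite. On a well-constructed dataset, a small table like this is often easier to act on than a model fitted to raw rows.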
AR: Q2. Your book "A Picture is Worth a Thousand Tables" was a very insightful and encouraging description of graphics in life sciences. What would be your top recommendations to get the most value out of graphics? What key trends do you currently observe in the data visualization arena?

MOC: My top recommendation is to enable exploratory graphics for creating data discovery sequences - Guided Analytics - that are immediately intuitive to a casual business user. These are Guided Analytics that disaggregate data and provide transparency on the business to enable impactful action, and graphics sequences that force the business user to refresh and explore data and "spot the fires" in their business.
One interesting trend is the emergence of JavaScript graphics, for example the D3 library. It's exciting to add such beautiful graphics to our Spotfire data discovery environment and to see these graphics respond to filtering, marking, brushing, coloring and layout.
AR: Q3. Where does the recent acquisition of Jaspersoft fit in TIBCO's overall strategy and product portfolio?

AR: Q4. You come from a very strong statistical background. What challenges have you experienced while working with team members who have a limited understanding of statistics? How much knowledge and experience of statistics do you consider vital for data scientists?
MOC: Pretty much everyone at TIBCO thinks like a statistician. Statistics is the math of life, a framework for understanding events. TIBCO is all about understanding and anticipating events, and enabling action. The Spotfire tag line is "first to insight, first to action". We are a fast company. We make things happen around the world every second of every day.
I think of data scientists as knowing more about statistics than computer scientists and more about computer science than statisticians.
My staff develops simple software solutions to complex problems. We enable our customers to create extreme business value with our visual analytics software.
AR: Q5. In the current competitive landscape of Big Data, what factors do you think will help differentiate the future leaders?

MOC: Future leaders:
- Have the analytic capabilities to understand big data at rest: connect and mash-up data; derive features; provide guided, self-service dashboards; develop in-line and predictive analytics that get to the heart of the business problem at hand.
- Use in-line analytics to interpret fast data in motion; to understand what is happening at the moment of truth.
- Have the software muscle to take corrective action; to sense and respond to issues and opportunities; to make the most of perishable inventory in the moment.
- Understand and effectively monetize their social networks of customers, channels, suppliers and agents.
- Inspire their staff with collaboration, crowd-sourcing and organizational intelligence.
Analytics software plays a huge role in all of this. We will continue to see leading companies outsmart the rest with 2-speed information architectures that create rapid value while operationalizing efficient business processes.
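To make the idea of combining data at rest with fast data in motion concrete, here is a minimal editorial sketch, not a description of any TIBCO product. It assumes a baseline (mean and standard deviation) learned from historical readings and then scores incoming events one at a time. The readings, the `fit_baseline` and `make_scorer` helpers, and the 3-sigma threshold are all hypothetical choices for illustration.

```python
from statistics import mean, stdev

# Hypothetical historical readings ("data at rest"); values are invented.
history = [100.2, 99.8, 100.5, 99.9, 100.1, 100.3, 99.7, 100.0]


def fit_baseline(values):
    """Summarize historical data as a mean and sample standard deviation."""
    return mean(values), stdev(values)


def make_scorer(baseline, threshold=3.0):
    """Build a function that flags events far from the historical baseline."""
    mu, sigma = baseline

    def score(value):
        # z-score of the new event relative to the historical baseline.
        z = (value - mu) / sigma
        return z, abs(z) > threshold

    return score


# The slow step: learn the baseline once from data at rest.
score = make_scorer(fit_baseline(history))

# The fast step: score hypothetical incoming events ("data in motion") one by one.
stream = [100.1, 99.9, 103.0, 100.2]
alerts = [v for v in stream if score(v)[1]]
print(alerts)
```

The design choice here is the split itself: the expensive analysis runs offline on historical data, while the per-event check is cheap enough to run as each event arrives.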
AR: Q6. Is "talent crunch" a real problem in Data Science? What has been your personal experience around it?
MOC: I don't think it’s quite as big a problem as it's made out to be. I have a ton of great talent interested in joining our team. But you do need to recognize and foster great talent; and create working environments and internal communities for synergies and knowledge sharing.
AR: Q7. What advice would you give to Data Science students and researchers who are just starting their career?

AR: Q8. On a personal note, we are curious to know what keeps you busy when you are away from work?
MOC: I'm a music and art lover and I enjoy going to art exhibits and concerts. I like musicians such as My Bloody Valentine, Caribou, Brian Eno, Wire, Ride, Pavement, Sigur Ros, Radiohead, Spiritualized, Nick Cave, Lou Reed, David Bowie, Ryan Adams, Amy Winehouse; directors like Wes Anderson, Jim Jarmusch, David Lynch, Matthew Weiner; and visual artists like Jenny Holzer, Damien Hirst, Paul Klee, Ai Weiwei, Gerhard Richter, Rebecca Horn and many more.