Big Data for Executives 2014: Day 1 Highlights
Highlights from the presentations by Big Data experts from Sears Holdings, PWC, Oracle, Altamira and Tesora on Day 1 of Big Data for Executives 2014.
To help its readers succeed in their Analytics pursuits, KDnuggets provides concise summaries from selected talks at the event. These takeaway-oriented summaries are designed both for people who attended the event but would like to revisit the key talks for a deeper understanding and for people who could not attend. As you go through them, check KDnuggets for any session that you find interesting, as we will soon publish exclusive interviews with some of these speakers.
Here are highlights from selected talks on Day 1 (Thursday, May 5):
Ankur Gupta, IT Director – Big Data, Sears Holdings delivered an interesting talk on "Hadoop-Enabled Business Intelligence Use Cases".
He started by citing the following findings from a recent Wikibon survey:
- 46% of Big Data practitioners report that they have only realized partial value from their Big Data deployments
- 2% declared their Big Data deployments total failures, with no value achieved
According to Wikibon, there are three compelling reasons for this struggle to achieve maximum business value from big data:
- A lack of skilled Big Data practitioners
- “Raw” and relatively immature technology
- A lack of a compelling business use case
In order to solve this crucial problem, Ankur suggested the following strategy:
- Bring IT and Business together
- Understand how Hadoop will fit into your environment
- See the end results first before you start your journey
- Define realistic success criteria and discover your big data use case
Introducing Sears as a cutting-edge integrated retailer, he discussed various use cases such as product perception, brand sentiment analysis, behavioral and predictive analytics on network data, and real-time inventory management. He concluded the talk by stressing the critical need for data governance and a data hub across the enterprise.
- Platform: Public Cloud, Private Cloud, Dedicated Hardware, etc.
- Database Management System: Relational, Non-Relational, SQL, NoSQL, etc.
- Application: Vertical Specific, Custom, In-house, etc.
- End-User Devices: PC, Tablet, Phone, Hardcopy, etc.
Discussing the platform in detail, he said that storage, memory, networking, computing and cloud technology are improving rapidly every day. He described the end user's dilemma: users expect more than a specialized solution can provide. He discussed the following business problems:
- Provisioning of databases takes too long
- Data is not actionable or monetizable
- Database servers are not scalable
- Administration is too cumbersome
Presenting DBaaS (database-as-a-service) as a solution, he emphasized that it improves the usability of databases and handles “administrivia” so that application users can focus on innovation. Talking about DaaS (Data-as-a-Service), he pointed out that such simplification (through self-service provisioning) is desperately needed and that DBaaS helps simplify the workflow.
- Statistical Analysis: R
- Data Mining: Pandas, Impala, Mahout
- Machine Learning: Scikit-learn
- Machine Learning + NLP: Mallet
- Natural Language Processing: NLTK, Stanford CoreNLP
- NLP + Geospatial Analysis: CLAVIN
- Social Network Analysis: NetworkX, Gephi
- Data Visualization: D3.js
- Fusion, Analysis, Visualization: Lumify
He concluded the talk by emphasizing that organizations should save their dollars for people (salaries, training, etc.), resources (hardware, AWS, etc.) and proprietary software where no viable OSS alternative exists.
Highlights from day 2 will be made available soon.