Interview: Satyam Priyadarshy, Halliburton on Unlocking Success for Big Data Projects
We discuss Predictive Analytics in the Oil & Gas industry, Big Data analytics, key drivers of success, common reasons for failure, trends, advice, and more.
on Mar 31, 2015 in Halliburton, Interview, Predictive Analytics, Project Fail, Satyam Priyadarshy, Success, Trends
How Big Data Can Improve the Lives of the Poor
The role of Big Data in allowing greater financial inclusion for the poor is also a trending Internet topic. But it’s mostly creating optimism and interest, rather than controversy and dissent.
on Mar 31, 2015 in Big Data, Social Good
Interview: Satyam Priyadarshy, Halliburton on Big Data Challenges in Oil & Gas Industry
We discuss Analytics at Halliburton, Big Data challenges unique to the Oil & Gas industry, and the 7 V’s of Big Data.
on Mar 30, 2015 in 3Vs of Big Data, Challenges, Halliburton, Interview, IoT, Oil & Gas, Satyam Priyadarshy
Data Science as a profession – time is now
Now is the time to begin thinking of Data Science as a profession not a job, as a corporate culture not a corporate agenda, as a strategy not a stratagem, as a core competency not a course, and as a way of doing things not a thing to do.
on Mar 30, 2015 in Data Science Skills, Kirk D. Borne
Text Analytics 2015 – Technology and Market Overview
A leading analyst and expert on text analytics gives an overview of the past year and looks ahead on text analytics technology and market developments.
on Mar 30, 2015 in Deep Learning, IBM, Ontotext, Sentiment Analysis, Text Analytics
Interview: Bill Moreau, USOC on the Pursuit of a Career in Sports Analytics
We discuss challenges in applying Data Analytics to sports, advice to beginners in the field of Sports Analytics, and more.
on Mar 28, 2015 in Advice, Analytics, Bill Moreau, Career, Challenges, Sports, Sports Medicine, USOC
The Grammar of Data Science: Python vs R
In this post, I will elaborate on my experience switching teams by comparing and contrasting R and Python solutions to some simple data exploration exercises.
on Mar 28, 2015 in Data Science, Data Visualization, Python, Python vs R, R
Interview: Bill Moreau, USOC on Evidence-based Medicine to Reduce Sports Injuries
We discuss the success of Analytics in predicting sports injuries, recent progress in concussion management and the trends in data-driven evidence-based sports medicine.
on Mar 27, 2015 in Bill Moreau, Data Analytics, Predictive Analytics, Sports, Sports Medicine, Trends, USOC
PredictionIO (Open Source Version) vs Microsoft Azure Machine Learning
Azure Machine Learning and PredictionIO are tools that both have similar visions and similar features, but when digging deeper you’ll notice key differences and key advantages to each.
on Mar 26, 2015 in Azure ML, Louis Dorard, Machine Learning, Marketplace, Microsoft Azure, PredictionIO
Interview: Bill Moreau, USOC on Empowering World’s Best Athletes through Analytics
We discuss how the United States Olympic Committee uses Big Data, how athletes respond to analytical insights, and the integration of sports medicine into sports performance and sports injury.
on Mar 26, 2015 in Analytics, Bill Moreau, Coaching, Correlation, Healthcare, Olympic, Performance, Sports, Sports Medicine, USOC
Talking Machine – 3 Deep Learning Gurus Talk about History and Future, part 2
Key ideas from a podcast with Deep Learning gurus Geoff Hinton, Yoshua Bengio, and Yann LeCun, where they explain the power of distributed representation and also propose a new open paper review process.
on Mar 26, 2015 in Deep Learning, Distributed Representation, Geoff Hinton, Ran Bi, Yann LeCun, Yoshua Bengio
Talking Machine – 3 Deep Learning Gurus Talk about History and Future of Machine Learning, part 1
A recent interview from the Talking Machine podcast with three deep learning experts. They talk about the neural network winter and its renewal.
on Mar 25, 2015 in convnet, Deep Learning, Geoff Hinton, Neural Networks, Ran Bi, Yann LeCun, Yoshua Bengio
Interview: Beena Ammanath, GE on Data Science – It’s Not Just Science!
We discuss benefits and challenges of Data Lake, trends, life lessons, motivation, desired skills, and more.
on Mar 24, 2015 in Beena Ammanath, Challenges, Data Analytics, Data Lakes, Data Science, GE, Interview, Trends
Data science done well looks easy, which is a big problem
Data Science done well looks too easy and that poses a major public relations problem for serious data scientists. The really tricky twist is that bad data science looks easy too.
on Mar 24, 2015 in Data Preprocessing, Data Science
Interview: Beena Ammanath, GE on the Industrial Internet for Data-driven Innovation
We discuss the role of Analytics at GE, Industrial Internet and how it is different from consumer internet, and the key capabilities of Predix.
on Mar 23, 2015 in Analytics, Beena Ammanath, Data Science, GE, Industrial Internet, Innovation, Interview, Predix, Software
Do We Need More Training Data or More Complex Models?
Do we need more training data? Which models will suffer from performance saturation as data grows large? Do we need larger models or more complicated models, and what is the difference?
on Mar 23, 2015 in Big Data, convnet, Generalized Linear Models, K-nearest neighbors, Training Data, Zachary Lipton
Top 10 UK Big Data Professionals
The top 10 Big Data Professionals in the UK include CEOs, journalists, an Information Commissioner, and Analytics leaders from leading companies and organizations.
on Mar 23, 2015 in Big Data Influencers, Computerworld, Top 10, UK
Interview: Brad Klingenberg, StitchFix on Decoding Fashion through Analytics and ML
We discuss the challenges in making personal styling recommendations, unexpected insights, interesting trends, motivation, advice, desired qualities in data scientists and more.
on Mar 21, 2015 in Advice, Analytics, Brad Klingenberg, Data Science, Fashion, Interview, Machine Learning, Stitch Fix, Trends
Interview: Brad Klingenberg, StitchFix on Building Analytics-powered Personal Stylist
We discuss StitchFix, how it leverages Analytics, understanding customer preferences, and pros-and-cons of involving human judgement in the recommendation process.
on Mar 20, 2015 in Analytics, Brad Klingenberg, Customer Experience, Recommendations, Stitch Fix
Small Data requires Specialized Deep Learning and Yann LeCun response
For industries that have relatively small data sets (less than a petabyte), a Specialized Deep Learning approach based on unsupervised learning and domain knowledge is needed.
on Mar 19, 2015 in Big Data, Deep Learning, Small Data, Yann LeCun
Interview: Vince Darley, King.com on What do you need to become Top Grossing Game
We discuss common characteristics of games that achieved top ranking, career advice, trends, desired qualities in data scientists and more.
on Mar 19, 2015 in Advice, Career, Data Scientist, Games, Interview, King.com, Trends, Vince Darley
Interview: Vince Darley, King.com on the Serious Analytics behind Casual Gaming
We discuss key characteristics of social gaming data, ML use cases at King, infrastructure challenges, major problems with A/B testing and recommendations to resolve them.
on Mar 18, 2015 in A/B Testing, Analytics, Gaming, Infrastructure, King.com, Machine Learning, Predictive Analysis, Vince Darley
Interview: Dave McCrory, Basho on Why Data Gravity Cannot be Ignored in Architecture Design
We discuss data gravity and its implications, Riak Enterprise 2.0, Riak CS 1.5, competitive landscape, challenges and more.
on Mar 17, 2015 in Basho, Challenges, Competition, Dave McCrory, Distributed Systems, Interview
Interview: Dave McCrory, Basho on Distributed Database Needs of a Future Enterprise
We discuss the future of distributed storage for enterprise, Scale-up vs. Scale-out, software design patterns in Cloud era, microservices model and the place for legacy database in modern enterprise IT.
on Mar 16, 2015 in Basho, Cloud Computing, Databases, Dave McCrory, Distributed Systems, Integration, Interview, SQL
Interview: Kenneth Viciana, Equifax on Data Governance – Red Tape or Catalyst?
We discuss recommendations for Data Governance policies, advice, Big Data trends, qualities sought in Data Scientists, and more.
on Mar 14, 2015 in Advice, Career, Data Governance, Data Science, Equifax, Interview, Kenneth Viciana, Trends
Report – MLconf: what industry leaders say about machine learning
MLconf was hosted in 4 different cities (NYC, Seattle, Atlanta and San Francisco), with speakers from big, established companies and from emerging startups, bringing more ideas and experience into the game.
on Mar 14, 2015 in CA, Deep Learning, Facebook, Machine Learning, MLconf, Netflix, New York City, NY, San Francisco
Interview: Kenneth Viciana, Equifax on Data Lake & Other Strategies for Insights Culture
We discuss the responsibilities of Enterprise Data Strategy team at Equifax, why Data Lake, Equifax Decision360, how to set up Insights Culture and bottlenecks for value delivery from Big Data.
on Mar 13, 2015 in Analytics, Business Strategy, Culture, Data Lakes, Equifax, Innovation, Insights, Kenneth Viciana
Deep Learning for Text Understanding from Scratch
Forget about the meaning of words, forget about grammar, forget about syntax, forget even the very concept of a word. Now let the machine learn everything by itself.
on Mar 13, 2015 in convnet, Deep Learning, Francois Petitjean, Text Classification, Torch, Yann LeCun
Interview: Josh Hemann, Activision on Why the Tolerance for Ambiguity is Vital
We discuss handling bias in data, other data quality concerns, advice, desired qualities, and more.
on Mar 12, 2015 in Activision, Advice, Bias, Career, Data Quality, Data Science, Data Visualization, Graphics, Interview, Josh Hemann, Junk Charts
Deep Learning, The Curse of Dimensionality, and Autoencoders
Autoencoders are an extremely exciting new approach to unsupervised learning and for many machine learning tasks they have already surpassed the decades of progress made by researchers handpicking features.
on Mar 12, 2015 in Autoencoder, Deep Learning, Face Recognition, Geoff Hinton, Image Recognition, Nikhil Buduma
SQL-like Query Language for Real-time Streaming Analytics
We need a SQL-like query language for real-time streaming analytics that is expressive, short, and fast, defines core operations that cover 90% of problems, and is easy to follow and learn.
on Mar 12, 2015 in Real-time, Realtime Analytics, SQL, Stream Mining, Streaming Analytics
Interview: Josh Hemann, Activision on Taming the Beast of Gaming Big Data
We discuss Analytics challenges at Activision, event data from games such as Call of Duty, balancing aesthetics and inference in visualization, problem with stacked charts and more.
on Mar 11, 2015 in Activision, Call of Duty, Data Visualization, Decision Making, Design, Gaming, Josh Hemann, Video Games
10 Steps to Success in Kaggle Data Science Competitions
The author, ranked in the top 10 in five Kaggle competitions, shares his 10 steps for success. These also apply to any well-defined predictive analytics or modeling problem with a closed dataset.
on Mar 11, 2015 in Competition, Hackathon, Kaggle, Overfitting, Yanir Seroussi
Interview: Slava Akmaev, Berg on Challenges in Transitioning Analytics to Clinical Utility
We discuss Analytics use cases, challenges in relating molecular/clinical data to real-life outcomes, Healthcare Analytics trends and more.
on Mar 10, 2015 in Advice, Analytics, Berg, Challenges, Healthcare, Interview, Slava Akmaev, Trends, Use Cases
Strata + Hadoop World 2015 San Jose – Day 2 Highlights
Strata + Hadoop World 2015 was a great conference, and here are key insights from some of the best sessions on day 2.
on Mar 10, 2015 in Anomaly Detection, Apache Spark, Cloudera, Databricks, Intel, Microsoft, Netflix, Strata, Trifacta
Interview: Slava Akmaev, Berg on Healthcare Transparency & Effectiveness using Big Data
We discuss Big Data Analytics at Berg, making Healthcare effective through Big Data, impact of falling cost of DNA sequencing, Berg AI-Analytics Suite and more.
on Mar 9, 2015 in Analytics, Big Data Strategy, Biology, DNA, Healthcare, Transparency
Juergen Schmidhuber AMA: The Principles of Intelligence and Machine Learning
Jürgen Schmidhuber, pioneer in innovating Deep Neural Networks, answers questions on open code, general problem solvers, quantum computing, PhD students, online courses, and the neural network research community in this Reddit AMA.
on Mar 9, 2015 in AI, Deep Learning, Deep Neural Network, Human Intelligence, Jurgen Schmidhuber, PhD, Python, Quantum Computing, Reddit
7 common mistakes when doing Machine Learning
In statistical modeling, there are various algorithms to build a classifier, and each algorithm makes a different set of assumptions about the data. For Big Data, it pays off to analyze the data upfront and then design the modeling pipeline accordingly.
on Mar 7, 2015 in Machine Learning, Mistakes, Overfitting, Regression, SVM
Interview: Lei Shi, ChinaHR.com on Unraveling Insights from Unstructured Data
We discuss challenges in leveraging Big Data, important attributes while profiling employers and job seekers, competitive landscape, desired skills in data scientists and more.
on Mar 7, 2015 in Competition, Decision Making, Hiring, Interview, Jobs, Machine Learning, Trends
Interview: Lei Shi, ChinaHR.com on Analytics behind the Perfect Match
We discuss analytics at ChinaHR, matching job seekers and employers, traditional job fairs vs online recruitment, key metrics and analytical insights.
on Mar 6, 2015 in Analytics, ChinaHR, Hiring, Jobs, Lei Shi, Optimization, Recruitment
Interview: Kaiser Fung, NYU on Why Statistical Reasoning is more important than Number Crunching
We discuss why every individual should care about statistics, inspiration behind the book Numbersense, teaching statistics as liberal arts, Junk Charts blog, advice and more.
on Mar 5, 2015 in Advice, Data Science Skills, Education, Kaiser Fung, Numbersense, NYU, Statistical Learning
Interview: Kaiser Fung, NYU on Why Ignoring Data Integrity is a Recipe for Disaster
We discuss different levels of Data Integrity, logical fallacies in Analytics, measures to boost accountability, role for human intelligence in Analytics and relevance of OCCAM framework.
on Mar 4, 2015 in Data Integrity, Fallacies, Human Intelligence, Junk Charts, Kaiser Fung, NYU, OCCAM
Failing Optimally – Data Science’s Measurement Problem
Data science has a measurement problem. Simple metrics may not address complex situations. But complex metrics present myriad problems.
on Mar 4, 2015 in Accuracy, Competition, Model Performance, Zachary Lipton
Interview: Ted Dunning, MapR on Apache Mahout & Technology Landscape in ML
We discuss Apache Mahout, its comparison with Spark and H2O, trends, advice, desired qualities in data scientists and more.
on Mar 3, 2015 in Advice, Apache Mahout, Apache Spark, H2O, Interview, Machine Learning, MapR, Ted Dunning
All Machine Learning Models Have Flaws
This classic post examines what is right and wrong with different models of machine learning, including Bayesian learning, Graphical Models, Convex Loss Optimization, Statistical Learning, and more.
on Mar 3, 2015 in Bayesian, Decision Trees, Gradient Descent, John Langford, Machine Learning, Statistical Learning
Interview: Ted Dunning, MapR on The Real Meaning of Real-Time in Big Data
We discuss major Big Data developments in 2014, real-time processing, interactive queries, streaming systems, batch systems, MapR partnerships and challenges in scaling recommendation engines.
on Mar 2, 2015 in Big Data, MapR, Real-time, Recommendation, Stream Mining, Ted Dunning
Strata + Hadoop World 2015 San Jose – Day 1 Highlights
Here are the quick takeaways and valuable insights from selected talks at one of the most reputed conferences in Big Data – Strata + Hadoop World 2015, San Jose.
on Mar 2, 2015 in CA, Cloudera, Hadoop, HCatalog, Highlights, Hortonworks, IBM, MapR, MemSQL, San Jose, Strata, Twitter, Yahoo