Interview: Satyam Priyadarshy, Halliburton on Unlocking Success for Big Data Projects
We discuss Predictive Analytics in the Oil & Gas industry, Big Data analytics, key drivers of success, common reasons for failure, trends, advice, and more.
on Mar 31, 2015 in Halliburton, Interview, Predictive Analytics, Project Fail, Satyam Priyadarshy, Success, Trends
How Big Data Can Improve the Lives of the Poor
The role of Big Data in allowing greater financial inclusion for the poor also is a trending Internet topic. But it’s mostly creating optimism and interest, rather than controversy and dissent.
on Mar 31, 2015 in Big Data, Social Good
Upcoming Webcasts on Analytics, Big Data, Data Science – Mar 31 and beyond
Predictive Analytics for the Mainstream, Data Mining - Failure to Launch, Great Data Lakes, Disrupting Traditional Analyst Workflows, Tamr and Tableau, and more.
on Mar 30, 2015 in Failure to Launch, Hadoop, SAP, Tableau, Tamr, Teradata
Interview: Satyam Priyadarshy, Halliburton on Big Data Challenges in Oil & Gas Industry
We discuss Analytics at Halliburton, Big Data challenges unique to the Oil & Gas industry, and the 7 V’s of Big Data.
on Mar 30, 2015 in 3Vs of Big Data, Challenges, Halliburton, Interview, IoT, Oil & Gas, Satyam Priyadarshy
Data Science as a profession – time is now
Now is the time to begin thinking of Data Science as a profession not a job, as a corporate culture not a corporate agenda, as a strategy not a stratagem, as a core competency not a course, and as a way of doing things not a thing to do.
on Mar 30, 2015 in Data Science Skills, Kirk D. Borne
Top KDnuggets tweets, Mar 26-29: The Basic Recipe for #MachineLearning in one slide; The Grammar of Data Science – comparing Python and R
The Basic Recipe for Machine Learning in one slide; The Grammar of Data Science - comparing Python and R; Uber Data Science team reveals why taxis may never be able to compete; Comparing @PredictionIO (Open Source Version) vs Microsoft Azure Machine Learning.
on Mar 30, 2015 in Azure ML, Machine Learning, PredictionIO, Python, Python vs R, R, Uber
Top /r/MachineLearning Posts, Mar 22-28: Deep Learning flaws & Security, DeepMind Publications, and Keras
Computer Vision security issues, DeepMind, statistics with Python, hacking on neural networks, and Keras (a neural network library) are all top topics on /r/MachineLearning this week.
on Mar 30, 2015 in Deep Learning, DeepMind, Neural Networks, Python, Reddit, Security, Statistics
Text Analytics 2015 – Technology and Market Overview
A leading analyst and expert on text analytics gives an overview of the past year and looks ahead on text analytics technology and market developments.
on Mar 30, 2015 in Deep Learning, IBM, Ontotext, Sentiment Analysis, Text Analytics
Top stories for Mar 22-28: More Free Data Mining, Data Science Books; PredictionIO vs Microsoft Azure Machine Learning
More Free Data Mining, Data Science Books and Resources; PredictionIO (Open Source Version) vs Microsoft Azure Machine Learning; Do We Need More Training Data or More Complex Models? Data science done well looks easy, which is a big problem.
on Mar 29, 2015 in Top stories
Interview: Bill Moreau, USOC on the Pursuit of a Career in Sports Analytics
We discuss challenges in applying Data Analytics to sports, advice to beginners in the field of Sports Analytics, and more.
on Mar 28, 2015 in Advice, Analytics, Bill Moreau, Career, Challenges, Sports, Sports Medicine, USOC
The Grammar of Data Science: Python vs R
In this post, I will elaborate on my experience switching teams by comparing and contrasting R and Python solutions to some simple data exploration exercises.
on Mar 28, 2015 in Data Science, Data Visualization, Python, Python vs R, R
Interview: Bill Moreau, USOC on Evidence-based Medicine to Reduce Sports Injuries
We discuss the success of Analytics in predicting sports injuries, recent progress in concussion management and the trends in data-driven evidence-based sports medicine.
on Mar 27, 2015 in Bill Moreau, Data Analytics, Predictive Analytics, Sports, Sports Medicine, Trends, USOC
PredictionIO (Open Source Version) vs Microsoft Azure Machine Learning
Azure Machine Learning and PredictionIO are tools that both have similar visions and similar features, but when digging deeper you’ll notice key differences and key advantages to each.
on Mar 26, 2015 in Azure ML, Louis Dorard, Machine Learning, Marketplace, Microsoft Azure, PredictionIO
Interview: Bill Moreau, USOC on Empowering World’s Best Athletes through Analytics
We discuss how the United States Olympic Committee uses Big Data, how athletes respond to analytical insights, and the integration of sports medicine into sports performance and sports injury.
on Mar 26, 2015 in Analytics, Bill Moreau, Coaching, Correlation, Healthcare, Olympic, Performance, Sports, Sports Medicine, USOC
Top KDnuggets tweets, Mar 23-25: 24 free resources on Data Mining, Data Science; More Training Data or More Complex Models?
24 free resources and online books on #DataMining, #DataScience, #MachineLearning; New R Online Tool for Seasonal Adjustment of time series; Key #DataScience question: More Training Data or More Complex Models?; Twitter #DataMining finds origins of ISIS support.
on Mar 26, 2015 in Data Mining, Free ebook, ISIS, Time Series, Twitter
Talking Machine – 3 Deep Learning Gurus Talk about History and Future, part 2
Key ideas from a podcast with Deep Learning gurus Geoff Hinton, Yoshua Bengio, and Yann LeCun, where they explain the power of distributed representation and also propose a new open paper review process.
on Mar 26, 2015 in Deep Learning, Distributed Representation, Geoff Hinton, Ran Bi, Yann LeCun, Yoshua Bengio
Database Pioneer Michael Stonebraker Wins ACM Turing Award, Computing “Nobel Prize”
Michael Stonebraker, a database pioneer and a serial entrepreneur, won the 2014 ACM Turing Award (which carries a $1 million prize) for fundamental contributions to the concepts and practices in modern database systems.
on Mar 25, 2015 in ACM, Michael Stonebraker, MIT, SciDB, Tamr
More Free Data Mining, Data Science Books and Resources
More free resources and online books by leading authors about data mining, data science, machine learning, predictive analytics and statistics.
on Mar 25, 2015 in Book, Data Mining, Data Science, Free ebook, Machine Learning
Talking Machine – 3 Deep Learning Gurus Talk about History and Future of Machine Learning, part 1
A recent interview from the Talking Machine podcast with three deep learning experts. They talked about the neural network winter and its renewal.
on Mar 25, 2015 in convnet, Deep Learning, Geoff Hinton, Neural Networks, Ran Bi, Yann LeCun, Yoshua Bengio
TMA Predictive Analytics Data Mining Training [Wash. DC, May | Toronto, Aug]
Successful analytics in the big data era does not start with data and software, but with hands-on, immersive training and goal-driven strategy - get it from The Modeling Agency in Washington DC (May), Toronto (Aug)
on Mar 24, 2015 in Canada, Data Mining Training, DC, The Modeling Agency, TMA, Toronto, Washington
Interview: Beena Ammanath, GE on Data Science – It’s Not Just Science!
We discuss benefits and challenges of Data Lake, trends, life lessons, motivation, desired skills, and more.
on Mar 24, 2015 in Beena Ammanath, Challenges, Data Analytics, Data Lakes, Data Science, GE, Interview, Trends
PASS Business Analytics Conference, Santa Clara, April 20-22
Aimed at business and data analytics professionals, it brings a lineup of world-class analytics speakers, fresh insights, compelling content and powerful networking. KDnuggets discount.
on Mar 24, 2015 in Business Analytics, CA, PASS, Santa Clara
Data science done well looks easy, which is a big problem
Data Science done well looks too easy and that poses a major public relations problem for serious data scientists. The really tricky twist is that bad data science looks easy too.
on Mar 24, 2015 in Data Preprocessing, Data Science
Upcoming Webcasts on Analytics, Big Data, Data Science – Mar 24 and beyond
Addressing the Challenges of Data Variety, Semantic Publishing, How to scale faster with NoSQL, Data Mining - Failure to Launch, Disrupting Traditional Analyst Workflows, and more.
on Mar 24, 2015 in JMP, Lavastorm, NoSQL, Ontotext, Talend
Interview: Beena Ammanath, GE on the Industrial Internet for Data-driven Innovation
We discuss the role of Analytics at GE, the Industrial Internet and how it differs from the consumer internet, and the key capabilities of Predix.
on Mar 23, 2015 in Analytics, Beena Ammanath, Data Science, GE, Industrial Internet, Innovation, Interview, Predix, Software
Catch the Wave at Analytics 2015, the Leading Analytics, Big Data Conference, Huntington Beach, April 12-14
Analytics 2015 will go beyond the typical “buzz” about Big Data and the cloud, providing unique opportunities to learn about potential analytics applications to the Internet of Things, as well as practical implementations of cognitive computing, unstructured data analytics, and real-time decisions based on streaming data.
on Mar 23, 2015 in
Top KDnuggets tweets, Mar 19-22: Tensor methods for Machine Learning; Tibco survey: Big Data top use cases
Tensor methods for #MachineLearning: fast, accurate, scalable, need open-source libs; #DataScience and Reproducibility: Explaining when the experiment does not work; Google #DeepLearning FaceNet is the best ever for recognizing faces; Tibco survey #BigData top use cases: Customer & Experience Analytics, Risk/Threat.
on Mar 23, 2015 in Big Data, Deep Learning, Face Recognition, Google, Open Source, Reproducibility, Tensor, Use Cases
GBDC: Real-Time Big Data Developer (focus on Spark, Storm, Flink, Kafka), Santa Clara, Apr 23-24
A fast paced, vendor agnostic, technical overview of the Apache Spark landscape, with technical sessions, use cases and hands-on sessions. Get KDnuggets discount.
on Mar 23, 2015 in Apache Spark, Apache Storm, CA, Global Big Data Conference, Kafka, Real-time, Santa Clara
Do We Need More Training Data or More Complex Models?
Do we need more training data? Which models will suffer from performance saturation as data grows large? Do we need larger models or more complicated models, and what is the difference?
on Mar 23, 2015 in Big Data, convnet, Generalized Linear Models, K-nearest neighbors, Training Data, Zachary Lipton
PAW: Learn the ways predictive analytics bolsters insurance
Insurance relies greatly on predictive analytics - learn about advances and best practices at several PAW Business insurance-related sessions in San Francisco and Chicago.
on Mar 23, 2015 in CA, Chicago, IL, Insurance, PAW, Predictive Analytics World, San Francisco
Top 10 UK Big Data Professionals
The top 10 Big Data Professionals in the UK include CEOs, journalists, an Information Commissioner, and Analytics leaders from leading companies and organizations.
on Mar 23, 2015 in Big Data Influencers, Computerworld, Top 10, UK
Top stories for Mar 15-21: Deep Learning for Text Understanding from Scratch; White House on Big Data and Differential Pricing
Deep Learning for Text Understanding from Scratch; 7 common mistakes when doing Machine Learning; White House report on Big Data and Differential Pricing; Why Data Gravity Cannot be Ignored.
on Mar 22, 2015 in Top stories
Automatic Statistician is here: Dr. Mo
Dr. Mo, the Automatic Statistician, is here! Using Artificial Intelligence, a self-learning algorithm, and multimodel technology, Dr. Mo achieves Super Accuracy and Speed. Simple use and simple output for non-statisticians.
on Mar 21, 2015 in Artificial Intelligence, Automating, Predictive Modeling, Statistics
Interview: Brad Klingenberg, StitchFix on Decoding Fashion through Analytics and ML
We discuss the challenges in making personal styling recommendations, unexpected insights, interesting trends, motivation, advice, desired qualities in data scientists and more.
on Mar 21, 2015 in Advice, Analytics, Brad Klingenberg, Data Science, Fashion, Interview, Machine Learning, Stitch Fix, Trends
2015 SIGKDD Data Science/Data Mining PhD Dissertation Award – Nominations due Apr 30
This annual award by ACM SIGKDD seeks to recognize outstanding research by doctoral candidates in the field of data mining, data science, and knowledge discovery. Nominations due Apr 30.
on Mar 21, 2015 in ACM, Awards, Dissertation, KDD-2015, PhD, SIGKDD
Interview: Brad Klingenberg, StitchFix on Building Analytics-powered Personal Stylist
We discuss StitchFix, how it leverages Analytics, understanding customer preferences, and the pros and cons of involving human judgement in the recommendation process.
on Mar 20, 2015 in Analytics, Brad Klingenberg, Customer Experience, Recommendations, Stitch Fix
useR 2015 – R User conference, Aalborg, Denmark, June 30 – July 3
The open source R language is a leading tool for data scientists. Attend the useR! conference, the main annual event of the R community, June 30 - July 3, in Aalborg, Denmark.
on Mar 20, 2015 in Aalborg, Denmark, R
Top KDnuggets tweets, Mar 16-18: 87 studies have shown that accurate numbers aren’t more useful than the ones you make up (Dilbert)
Also Sirius - a free, open-source version of Siri; #PI art: the first 13,689 digits of pi; Great tutorial + #Python code: 1-Layer Neural Networks.
on Mar 19, 2015 in Cartoon, Data Preparation, Deep Learning, Dilbert, Excel, Neural Networks, pi, Python, Siri
Small Data requires Specialized Deep Learning and Yann LeCun response
For industries that have relatively small data sets (less than a petabyte), a Specialized Deep Learning approach based on unsupervised learning and domain knowledge is needed.
on Mar 19, 2015 in Big Data, Deep Learning, Small Data, Yann LeCun
Top /r/MachineLearning Posts, Mar 8-14: Word vectors, Hardware for Deep Learning, and Neural Graphics Engines
Word vectors in NLP, Machine Learning's place in programming, hardware for deep learning, Machine Learning interviews, and neural graphics engines are all topics covered this week on /r/MachineLearning.
on Mar 19, 2015 in Deep Learning, GPU, Graphics, Interview Questions, NLP, Reddit
Interview: Vince Darley, King.com on What do you need to become Top Grossing Game
We discuss common characteristics of games that achieved top ranking, career advice, trends, desired qualities in data scientists and more.
on Mar 19, 2015 in Advice, Career, Data Scientist, Games, Interview, King.com, Trends, Vince Darley
Interview: Vince Darley, King.com on the Serious Analytics behind Casual Gaming
We discuss key characteristics of social gaming data, ML use cases at King, infrastructure challenges, major problems with A/B testing and recommendations to resolve them.
on Mar 18, 2015 in A/B Testing, Analytics, Gaming, Infrastructure, King.com, Machine Learning, Predictive Analysis, Vince Darley
PACE Data Mining Bootcamps, San Diego, April
Every class is taught by experienced SDSC data scientists, delivering practical, hands-on training in an intimate classroom setting limited to 25 participants. Early bird till Mar 31.
on Mar 17, 2015 in Bootcamp, CA, Data Mining, PACE, San Diego, SDSC
NYC Data Science Courses, Bootcamps, Meetups
NYC Data Science Academy spring schedule includes 3 classes, 3 Meetups, 7 bootcamp events on Data Science, R, Python, Machine Learning, scikit-learn, and related topics.
on Mar 17, 2015 in Bootcamp, Knewton, Machine Learning, Meetup, New York City, NY, NYC Data Science Academy, Python, R, scikit-learn
Interview: Dave McCrory, Basho on Why Data Gravity Cannot be Ignored in Architecture Design
We discuss data gravity and its implications, Riak Enterprise 2.0, Riak CS 1.5, competitive landscape, challenges and more.
on Mar 17, 2015 in Basho, Challenges, Competition, Dave McCrory, Distributed Systems, Interview
Ontotext Introduces the S4 Developer Challenge
The challenge will award a cash prize to developers who write the most interesting demo, application or showcase utilizing the S4 capabilities for text analytics, linked data and knowledge graphs. Submissions due Mar 31.
on Mar 17, 2015 in Challenge, Developers, Ontotext, RDF, Triplestore
Ontotext Webinar: Semantic Publishing, Enhancing Content and Engagement, Mar 26
Ontotext will show how news and media publishers can use semantic publishing technology to more efficiently generate content while increasing audience engagement through personalization and recommendations.
on Mar 17, 2015 in Ontotext, Personalization, Recommendations, Semantic Analysis
Upcoming Webcasts on Analytics, Big Data, Data Science – Mar 17 and beyond
Predictive Analytics - Rethinking Big Data, Addressing the Challenges of Data Variety, Semantic Publishing for News & Media, Data Mining: Failure to Launch.
on Mar 16, 2015 in Angoss, Data Mining, Failure to Launch, Hadoop, JMP, Security
Top KDnuggets tweets, Mar 12-15: Cartoon: the most difficult challenge facing the 1st US Chief Data Scientist @dpatil
Cartoon: top challenge for US Chief Data Scientist DJ Patil; In-depth intro to #MachineLearning, #Statistics, R: 15 hours of videos; Amazing! Forget coding word meaning, grammar, syntax - now #DeepLearning can learn everything.
on Mar 16, 2015 in Cartoon, Data Science Education, Deep Learning, DJ Patil, Robert Tibshirani, World Cup
Interview: Dave McCrory, Basho on Distributed Database Needs of a Future Enterprise
We discuss the future of distributed storage for enterprise, Scale-up vs. Scale-out, software design patterns in Cloud era, microservices model and the place for legacy database in modern enterprise IT.
on Mar 16, 2015 in Basho, Cloud Computing, Databases, Dave McCrory, Distributed Systems, Integration, Interview, SQL
IEEE ICDM 2015 Call for Papers, Workshops, Contest proposals, demos, and tutorials
ICDM '15: The 15th IEEE International Conference on Data Mining, a leading research conference in the field, calls for workshop proposals, contest proposals, papers, demo proposals, and tutorial proposals. Conference dates: Nov 14-17, Atlantic City, NJ, USA.
on Mar 16, 2015 in Atlantic City, Charu Aggarwal, Data Mining, ICDM, IEEE, NJ
KDnuggets Free Pass to Big Data TechCon How-To Conference, Apr 26-28, Boston
Win a free KDnuggets Pass for Big Data TechCon in Boston - the conference to learn HOW-TO master and analyze Big Data. Learn Hadoop, Spark, Yarn, HBase, R, and Hive from the smartest, hardest-working faculty.
on Mar 15, 2015 in Apache Hive, Apache Spark, Big Data, Boston, Hadoop, HBase, MA, R, Techcon, YARN
Top stories for Mar 8-14: 7 common Machine Learning mistakes; Deep Learning for Text Understanding from Scratch
7 common mistakes when doing Machine Learning; Deep Learning for Text Understanding from Scratch; SQL-like Query Language for Real-time Streaming Analytics; 10 Steps to Success in Kaggle Data Science Competitions.
on Mar 15, 2015 in Top stories
White House report on Big Data and Differential Pricing
White House report examines how companies are using big data and analytics to charge different prices to different customers (price discrimination), looks at both benefits and risks, and concludes that many concerns can be addressed by existing anti-discrimination and consumer protection laws.
on Mar 14, 2015 in Big Data, Discrimination, Pricing, White House
Interview: Kenneth Viciana, Equifax on Data Governance – Red Tape or Catalyst?
We discuss recommendations for Data Governance policies, advice, Big Data trends, qualities sought in Data Scientists, and more.
on Mar 14, 2015 in Advice, Career, Data Governance, Data Science, Equifax, Interview, Kenneth Viciana, Trends
New Poll: Computing platform for your analytics, data mining, data science work or research
New KDnuggets Poll is asking: What computing platform do you use for analytics, data mining, data science work or research? Please vote.
on Mar 14, 2015 in Cloud Computing, Data Science Platform, In-Memory Computing, Poll
Report – MLconf: what industry leaders say about machine learning
MLconf was hosted in 4 different cities (NYC, Seattle, Atlanta and San Francisco), with speakers from big, established companies and from emerging startups, bringing more ideas and experience into the game.
on Mar 14, 2015 in CA, Deep Learning, Facebook, Machine Learning, MLconf, Netflix, New York City, NY, San Francisco
Participate in the Rexer Analytics 2015 Data Miner Survey
Data Analysts, Predictive Modelers, Data Scientists, Data Miners, and all other types of analytic professionals, students, and academics - please participate in the Rexer Analytics 2015 Data Miner Survey.
on Mar 14, 2015 in Data Miner, Rexer Analytics, Survey
Coursera: Process Mining: Data science in Action, April 2015
Due to the big success of the first run, this 6-week online course is being repeated on Coursera, starting April 1. This free course provides data science knowledge that can be applied directly to analyze and improve processes in a variety of domains.
on Mar 14, 2015 in Coursera, MOOC, Process Mining
Strata + Hadoop World London, 5-7 May 2015
Strata + Hadoop World has been called "mind-blowing", "an amazing event", "the most interesting and informative conference". See for yourself in London and get a special KDnuggets discount.
on Mar 13, 2015 in Apache Spark, Hadoop, London, Strata, UK
Interview: Kenneth Viciana, Equifax on Data Lake & Other Strategies for Insights Culture
We discuss the responsibilities of the Enterprise Data Strategy team at Equifax, why Data Lake, Equifax Decision360, how to set up Insights Culture and bottlenecks for value delivery from Big Data.
on Mar 13, 2015 in Analytics, Business Strategy, Culture, Data Lakes, Equifax, Innovation, Insights, Kenneth Viciana
Cartoon: US Chief Data Scientist Most Difficult Challenge
New KDnuggets cartoon looks at the most difficult challenge facing the first US Chief Data Scientist DJ Patil @dpatil.
on Mar 13, 2015 in Cartoon, Data Scientist, DJ Patil, Government
Top Big Data influencers of 2014, according to HadoopSphere
Top big data influencers of 2014 include analysts Mike Gualtieri and Curt Monash, IBM and TDWI media, Spark and Scala products, Ben Lorica @bigdata and Gregory Piatetsky @kdnuggets on social media, Data Collective and AngelList co-founder.
on Mar 13, 2015 in About Gregory Piatetsky, Apache Spark, Big Data Influencers, Hadoop, IBM, Influencers, Kafka, Kirk D. Borne, Mike Gualtieri, Scala, TDWI
Deep Learning for Text Understanding from Scratch
Forget about the meaning of words, forget about grammar, forget about syntax, forget even the very concept of a word. Now let the machine learn everything by itself.
on Mar 13, 2015 in convnet, Deep Learning, Francois Petitjean, Text Classification, Torch, Yann LeCun
Top KDnuggets tweets, Mar 09-11: Learning path from noob to Kaggler in Python; 10 steps for success in Kaggle competitions
Comprehensive learning path from noob to Kaggler in Python; 10 steps for success in Kaggle competitions; Machine learning packages #Python #Java #BigData #Lua #Clojure #Scala, R; Very useful LeaRning Path on R - Step by Step Guide.
on Mar 12, 2015 in Barcelona, Kaggle, Learning Path, Machine Learning, Online Education, Prismatic, Python, R
Interview: Josh Hemann, Activision on Why the Tolerance for Ambiguity is Vital
We discuss handling bias in data, other data quality concerns, advice, desired qualities, and more.
on Mar 12, 2015 in Activision, Advice, Bias, Career, Data Quality, Data Science, Data Visualization, Graphics, Interview, Josh Hemann, Junk Charts
Simplilearn Big Data and Analytics Courses – CAREER30
Get Big Data and Analytics certification - a big plus for your career - with Simplilearn courses on Analytics, Big Data, Hadoop, SAS, R, Cloud Computing, and more, now at 30% discounted prices until Mar 30.
on Mar 12, 2015 in Big Data, Big Data Analytics, Certification, Cloud Computing, Hadoop, R, SAS, Simplilearn
Feb 2015 Analytics, Big Data, Data Mining Acquisitions and Startups Activity
Feb 2015 acquisitions, startups, and company activity in Analytics, Big Data, Data Mining, and Data Science: @Kaggle cuts 1/3 of staff, Infosys buys Panaya, RapidMiner gets $15M, Palantir buys Fancy That, Hitachi buys Pentaho, and more.
on Mar 12, 2015 in Hitachi, Infosys, Kaggle, Palantir, Pentaho, RapidMiner, Startups
Deep Learning, The Curse of Dimensionality, and Autoencoders
Autoencoders are an extremely exciting new approach to unsupervised learning and for many machine learning tasks they have already surpassed the decades of progress made by researchers handpicking features.
on Mar 12, 2015 in Autoencoder, Deep Learning, Face Recognition, Geoff Hinton, Image Recognition, Nikhil Buduma
SQL-like Query Language for Real-time Streaming Analytics
We need a SQL-like query language for Realtime Streaming Analytics that is expressive, short, and fast, defines core operations that cover 90% of problems, and is easy to follow and learn.
on Mar 12, 2015 in Real-time, Realtime Analytics, SQL, Stream Mining, Streaming Analytics
Wharton Successful Applications of Customer Analytics Conference, Apr 30, Philadelphia
Wharton Customer Analytics Initiative (WCAI) helps define Customer Analytics, with a conference dedicated to real-world applications that balance high-level rigor and business know-how. Case studies include Nielsen, Google, Cablevision, and MLB.
on Mar 11, 2015 in Customer Analytics, PA, Philadelphia, WCAI, Wharton
Interview: Josh Hemann, Activision on Taming the Beast of Gaming Big Data
We discuss Analytics challenges at Activision, event data from games such as Call of Duty, balancing aesthetics and inference in visualization, problem with stacked charts and more.
on Mar 11, 2015 in Activision, Call of Duty, Data Visualization, Decision Making, Design, Gaming, Josh Hemann, Video Games
10 Steps to Success in Kaggle Data Science Competitions
The author, ranked in top 10 in five Kaggle competitions, shares his 10 steps for success. These also apply to any well-defined predictive analytics or modeling problem with a closed dataset.
on Mar 11, 2015 in Competition, Hackathon, Kaggle, Overfitting, Yanir Seroussi
Machine Learning Table of Elements Decoded
Machine learning packages for Python, Java, Big Data, Lua/JS/Clojure, Scala, C/C++, CV/NLP, and R/Julia are represented using a cute but ill-fitting metaphor of a periodic table. We extract the useful links.
on Mar 11, 2015 in Big Data Software, Java, Julia, Machine Learning, NLP, Python, R, Scala, scikit-learn, Weka
KDD-2017, top conference on Data Mining, Data Science Research coming to Halifax, Canada
The 2017 edition of KDD, the leading conference on Knowledge Discovery, Data Mining, and Data Science Research will be held in Halifax, Nova Scotia, Canada.
on Mar 10, 2015 in Canada, Data Mining, Halifax, KDD, KDD-2014, Nova Scotia, Research, SIGKDD
Interview: Slava Akmaev, Berg on Challenges in Transitioning Analytics to Clinical Utility
We discuss Analytics use cases, challenges in relating molecular/clinical data to real-life outcomes, Healthcare Analytics trends and more.
on Mar 10, 2015 in Advice, Analytics, Berg, Challenges, Healthcare, Interview, Slava Akmaev, Trends, Use Cases
Strata + Hadoop World 2015 San Jose – Day 2 Highlights
Strata + Hadoop World 2015 was a great conference, and here are key insights from some of the best sessions on day 2.
on Mar 10, 2015 in Anomaly Detection, Apache Spark, Cloudera, Databricks, Intel, Microsoft, Netflix, Strata, Trifacta
PAW San Francisco: The Pentagon of Data Analytics Events, Mar 29 – Apr 2
In San Francisco this month, you'll find five data analytics events - join your peers for case studies and profound keynotes, and build strong connections with data analytics professionals. Special KDnuggets discount.
on Mar 10, 2015 in Business Analytics, CA, PAW, Predictive Analytics World, San Francisco, Text Analytics, Workforce Analytics
Northwestern Online MS in Predictive Analytics
Prepare for leadership-level career opportunities, learn from distinguished Northwestern faculty and industry experts, build statistical and analytic expertise. Application deadline Apr 15.
on Mar 10, 2015 in Master of Science, MS in Analytics, Northwestern, Online Education, Predictive Analytics
OpenDataSciCon – Data Science for All: What Do You Need?
“All you need is knowledge.” Open source is what paves the road to that knowledge and the Open Data Science Conference, Boston, May 30-31, 2015.
on Mar 10, 2015 in Boston, Data Science, Kaggle, MA, Open Data, USA
NoSQL matters Paris, a conference for developers, architects and geeks, Mar 26-27
An opportunity to network with leading NoSQL experts from all over the world, enjoy mind-blowing talks or simply have loads of fun hacking away with other participants in Paris, 26-27 Mar 2015.
on Mar 10, 2015 in France, NoSQL, Paris, Ted Dunning
Upcoming Webcasts on Analytics, Big Data, Data Science – Mar 10 and beyond
Data Wrangling and the Art of Big Data Discovery, Data Mining: Failure to Launch, The State of Hadoop Adoption, Addressing the Challenges of Data Variety, and more.
on Mar 9, 2015 in Data Visualization, Data Wrangling, Hadoop, Kafka, Security, SQL
Webinar: Data Mining: Failure to Launch [Mar 11]
Learn how to get started with predictive modeling and overcome strategic and tactical limitations that cause data mining projects to fall short of their potential. Next webinar is Mar 11.
on Mar 9, 2015 in Data Mining, Failure to Launch, TMA
Interview: Slava Akmaev, Berg on Healthcare Transparency & Effectiveness using Big Data
We discuss Big Data Analytics at Berg, making Healthcare effective through Big Data, impact of falling cost of DNA sequencing, Berg AI-Analytics Suite and more.
on Mar 9, 2015 in Analytics, Big Data Strategy, Biology, DNA, Healthcare, Transparency
Top KDnuggets tweets, Mar 2-8: 6 categories in the Hadoop Ecosystem; How PayPal uses Deep Learning to fight fraud
How #PayPal uses #DeepLearning and detective work to fight #fraud; Beginning #deeplearning with 500 lines of Julia; Processing frameworks for Hadoop and 6 categories in the #Hadoop Ecosystem; KDnuggets Poll results: #Analytics, #DataMining, #DataScience salary income by region.
on Mar 9, 2015 in Anomaly Detection, Deep Learning, Hadoop, Julia, Netflix, PayPal, Poll, Salary
Chapter Download from “Data Mining Techniques”
Download this chapter from "Data Mining Techniques" (3rd Edition), by Gordon Linoff and Michael Berry, and learn how to create derived variables, which allow the statistical modeling process to incorporate human insights.
on Mar 9, 2015 in Data Mining, Derived Variables, Gordon Linoff, JMP, Michael Berry
Top stories for Mar 1-7: All Machine Learning Models Have Flaws; Analytics, Data Mining, Data Science professionals salary
All Machine Learning Models Have Flaws; Analytics, Data Mining, Data Science professionals well compensated; 7 common mistakes when doing Machine Learning; Interviews with Ted Dunning (MapR) and Kaiser Fung (junkcharts).
on Mar 9, 2015 in Top stories
Top /r/MachineLearning Posts, Mar 1-7: Stanford Deep Learning for NLP, Machine Learning with Scikit-learn
This week on /r/MachineLearning, we have a new NLP-focused deep learning course from Stanford, an introduction to scikit-learn, visualization of music collections, an implementation of DeepMind, and NLP using deep learning and Torch.
on Mar 9, 2015 in Deep Learning, DeepMind, Facebook, GPU, Python, Reddit, scikit-learn, Torch
Juergen Schmidhuber AMA: The Principles of Intelligence and Machine Learning
Jürgen Schmidhuber, pioneer in innovating Deep Neural Networks, answers questions on open code, general problem solvers, quantum computing, PhD students, online courses, and the neural network research community in this Reddit AMA.
on Mar 9, 2015 in AI, Deep Learning, Deep Neural Network, Human Intelligence, Jurgen Schmidhuber, PhD, Python, Quantum Computing, Reddit
7 common mistakes when doing Machine Learning
In statistical modeling, there are various algorithms to build a classifier, and each algorithm makes a different set of assumptions about the data. For Big Data, it pays off to analyze the data upfront and then design the modeling pipeline accordingly.
on Mar 7, 2015 in Machine Learning, Mistakes, Overfitting, Regression, SVM
Interview: Lei Shi, ChinaHR.com on Unraveling Insights from Unstructured Data
We discuss challenges in leveraging Big Data, important attributes while profiling employers and job seekers, competitive landscape, desired skills in data scientists and more.
on Mar 7, 2015 in Competition, Decision Making, Hiring, Interview, Jobs, Machine Learning, Trends
Interview: Lei Shi, ChinaHR.com on Analytics behind the Perfect Match
We discuss analytics at ChinaHR, matching job seekers and employers, traditional job fairs vs online recruitment, key metrics and analytical insights.
on Mar 6, 2015 in Analytics, ChinaHR, Hiring, Jobs, Lei Shi, Optimization, Recruitment
GISCUP: GIS-focused algorithm competition at ACM SIGSPATIAL GIS 2015
Real-time route planning is one of the most popular web GIS services in use today, and the 2015 contest is to find the shortest path under polygonal obstacles.
on Mar 6, 2015 in ACM, Competition, GIS, SIGSPATIAL
Wrangling Public Bike Share Data with The Free Trial of Trifacta
A free trial of Trifacta is a good opportunity for data analysts to start wrangling data sets of different shapes and sizes. We give an example of wrangling Bay Area Bike Share data to better understand biking around San Francisco.
on Mar 6, 2015 in Data Analytics, Data Processing, Data Science Platform, Obama for America, Trifacta
Interview: Kaiser Fung, NYU on Why Statistical Reasoning is more important than Number Crunching
We discuss why every individual should care about statistics, inspiration behind the book Numbersense, teaching statistics as liberal arts, Junk Charts blog, advice and more.
on Mar 5, 2015 in Advice, Data Science Skills, Education, Kaiser Fung, Numbersense, NYU, Statistical Learning
10 Predictive Analytics Influencers You Need to Know
A list of Predictive Analytics Influencers based on Twitter activity around “#PredictiveAnalytics” and “Predictive Analytics”: Gregory Piatetsky, Vineet Vashishta, Aki Kakko and more.
on Mar 5, 2015 in About Gregory Piatetsky, About KDnuggets, Big Data Influencers, Dataconomy, Influencers, Mike Gualtieri, Predictive Analytics
Online Graduate Certificate in Business Analytics from Penn State
This new online Graduate Certificate covers the entire life cycle of data and analytics-supported decision-making and is built around the framework from the Institute for Operations Research and the Management Sciences (INFORMS).
on Mar 5, 2015 in Business Analytics, Certificate, INFORMS, Online Education, Penn State
Hurwitz: Cognitive Computing and Big Data Analytics
New book from Judith Hurwitz and associates on Cognitive Computing is a comprehensive guide to the subject, providing both the theoretical and practical guidance technologists need.
on Mar 5, 2015 in Big Data Analytics, Book, Cognitive Computing, Hurwitz, IBM Watson
Top /r/MachineLearning Posts, February: Automating Tinder, Jurgen Schmidhuber, and Shazam
Automating Tinder with Eigenfaces, the elephant in the room of Machine Learning, the Jürgen Schmidhuber AMA, and Shazam's music recognition algorithm make up the top posts in the last month on /r/MachineLearning.
on Mar 5, 2015 in Deep Learning, Eigenface, Jurgen Schmidhuber, Machine Learning, Reddit, Tinder
Top stories in February: 10 things statistics taught about big data; Gartner Analytics MQ: gainers and losers
10 things statistics taught us about big data analysis; Data Science's Most Used, Confused, and Abused Jargon; Gartner Analytics MQ: gainers and losers; My Brief Guide to Big Data and Predictive Analytics for non-experts.
on Mar 5, 2015 in Top stories
Careers in Data Science and Business Analytics – NYU Seminar, Mar 28
In an upcoming one-day seminar by NYU School of Professional Studies, Kaiser Fung will explain how to shape a career in data science and business analytics. Date: March 28, 2015. Registration: Open.
on Mar 4, 2015 in Business Analytics, Career, Data Science, Kaiser Fung, New York City, NY, NYU
Interview: Kaiser Fung, NYU on Why Ignoring Data Integrity is a Recipe for Disaster
We discuss different levels of Data Integrity, logical fallacies in Analytics, measures to boost accountability, role for human intelligence in Analytics and relevance of OCCAM framework.
on Mar 4, 2015 in Data Integrity, Fallacies, Human Intelligence, Junk Charts, Kaiser Fung, NYU, OCCAM
Upcoming March – August 2015 Meetings in Analytics, Big Data, Mining, Data Science
Coming soon: Big Data Paris, TDWI Solution Summit, GigaOM Structure Data, Chief Data Strategy Forum, PAW San Francisco, Text Analytics World SF, Gartner BI and Analytics Summit, and 80+ more meetings/events.
on Mar 4, 2015 in Belgium, Boston, Brussels, CA, Chicago, IL, London, MA, New York City, NY, San Diego, San Francisco, UK, USA
Top /r/MachineLearning Posts, Feb 22-28: Jurgen Schmidhuber AMA and Machine Learning Done Wrong
The Jürgen Schmidhuber AMA begins taking questions, machine learning done wrong, GPUs for deep learning, Google opens its native MapReduce capabilities, and Google publishes its DeepMind paper this week on /r/MachineLearning.
on Mar 4, 2015 in Deep Learning, DeepMind, GPU, Jurgen Schmidhuber, Machine Learning, Reddit
The Elements of Data Analytic Style – checklist
Jeff Leek's book "The Elements of Data Analytic Style" had a rocket launch, thanks to the author's course on Coursera. The book includes a useful checklist that can guide beginning data analysts or serve as a basis for evaluating data analyses.
on Mar 4, 2015 in Book, Checklist, Data Analytics, Jeff Leek, Leanpub, Reproducibility
Failing Optimally – Data Science’s Measurement Problem
Data science has a measurement problem. Simple metrics may not address complex situations. But complex metrics present myriad problems.
on Mar 4, 2015 in Accuracy, Competition, Model Performance, Zachary Lipton
Interview: Ted Dunning, MapR on Apache Mahout & Technology Landscape in ML
We discuss Apache Mahout, its comparison with Spark and H2O, trends, advice, desired qualities in data scientists and more.
on Mar 3, 2015 in Advice, Apache Mahout, Apache Spark, H2O, Interview, Machine Learning, MapR, Ted Dunning
Analytics, Data Mining, Data Science professionals well compensated
US, Canadian, and Australian analytics professionals are paid the most, with US/Canada Industry Data Science Managers earning on average $177K, Industry Data Scientists $126K, Academic Researchers $119K, and Data Analysts $86K.
on Mar 3, 2015 in Asia, Australia, Canada, Data Scientist, Europe, Manager, Poll, Salary, USA
Chief Data Officer Summit, San Jose, Apr 28-29, 2015
The CDO is vital in bridging the gap between the C-suite and the data team. The Chief Data Officer Summit in San Jose on April 28-29 will bring together top data leaders to discuss the growing responsibilities of the CDO. Special KDnuggets discount.
on Mar 3, 2015 in CA, Chief Data Officer, IE Group, Los Angeles, San Jose, Summit
All Machine Learning Models Have Flaws
This classic post examines what is right and wrong with different models of machine learning, including Bayesian learning, Graphical Models, Convex Loss Optimization, Statistical Learning, and more.
on Mar 3, 2015 in Bayesian, Decision Trees, Gradient Descent, John Langford, Machine Learning, Statistical Learning
Upcoming Webcasts on Analytics, Big Data, Data Science – Mar 3 and beyond
Data Wrangling and the Art of Big Data Discovery, Hadoop - A Solution for Big Data, Fast Data Meets Open Source, Real-Time Data on Hadoop with Apache Kafka, and more.
on Mar 2, 2015 in Cloudera, Data Wrangling, Hadoop, Kafka, Trifacta
Top KDnuggets tweets, Feb 26 – Mar 1: Bayes Theorem explained with Lego; 10 Cool #BigData Cartoons
Cute and Educational: Bayes Theorem explained with Lego; 10 Cool #BigData Cartoons #TGIF; #DataMining Indian Recipes finds spices make negative food pairing more powerful; Key Take-Aways from Gartner 2015 MQ for #BI & Analytics Platforms.
on Mar 2, 2015 in Bayes Theorem, Cartoon, Gartner, Indian Food, Lego, Trevor Hastie
Interview: Ted Dunning, MapR on The Real Meaning of Real-Time in Big Data
We discuss major Big Data developments in 2014, real-time processing, interactive queries, streaming systems, batch systems, MapR partnerships and challenges in scaling recommendation engines.
on Mar 2, 2015 in Big Data, MapR, Real-time, Recommendation, Stream Mining, Ted Dunning
Strata + Hadoop World 2015 San Jose – Day 1 Highlights
Here are the quick takeaways and valuable insights from selected talks at one of the most respected conferences in Big Data – Strata + Hadoop World 2015, San Jose.
on Mar 2, 2015 in CA, Cloudera, Hadoop, HCatalog, Highlights, Hortonworks, IBM, MapR, MemSQL, San Jose, Strata, Twitter, Yahoo
Top stories for Feb 22-28: Gartner 2015 MQ for Advanced Analytics: gainers and losers; History of Data Science Infographic
Gartner 2015 Magic Quadrant for Advanced Analytics Platforms: who gained and who lost; History of Data Science Infographic in 5 strands; Interview: David Kasik, Boeing on Data Analysis vs Data Analytics.
on Mar 1, 2015 in Top stories
Additions to KDnuggets Directory in February
Big Data Paris, Wharton Conf: Successful Applications of Customer Analytics, analytics consulting firms, Georgetown MS in Analytics, MSc in Data Science in France, and more meetings, companies, education, and solutions.
on Mar 1, 2015 in Belgium, Brandeis, Brussels, DC, France, Paris, Washington, Wharton