- Cartoon: Machine Learning – What They Think I Do - Apr 29, 2017.
Different views of Machine Learning: What society, my friends, my parents, other programmers think I do, and what I really do.
Cartoon, Machine Learning
- Keep it simple! How to understand Gradient Descent algorithm - Apr 28, 2017.
In Data Science, Gradient Descent is one of the most important and most difficult concepts. Here we explain it with an example, in a very simple way. Check this out.
Algorithms, Gradient Descent
- One Deep Learning Virtual Machine to Rule Them All - Apr 28, 2017.
The frontend code of programming languages only needs to parse and translate source code to an intermediate representation (IR). Deep Learning frameworks will eventually need their own “IR.”
Deep Learning, Neural Networks
- Models: From the Lab to the Factory - Apr 27, 2017.
In this post, we’ll go over techniques to avoid these scenarios through the process of model management and deployment.
Data Science, Modeling, SVDS
- Dask and Pandas and XGBoost: Playing nicely between distributed systems - Apr 27, 2017.
This blog post gives a quick example using Dask.dataframe to do distributed Pandas data wrangling, then using a new dask-xgboost package to set up an XGBoost cluster inside the Dask cluster and perform the handoff.
Dask, Distributed Systems, Pandas, Python, XGBoost
- What Data You Analyzed – KDnuggets Poll Results and Trends - Apr 26, 2017.
Image/video data analysis is surging, JSON is replacing XML, anonymized data usage is growing in the US and Europe (but not in Asia), and itemsets and Twitter analysis are declining - some of the highlights of the KDnuggets Poll on data types used.
Anonymized, Asia, Data types, Europe, Image Recognition, Poll, Text Analysis, Time Series, USA
- The Analytics of Emotion and Depression - Apr 26, 2017.
Analytics can support the treatment of depression. We look at how companies like Microsoft and Facebook are adopting analytics to detect and help people vulnerable to depression.
Analytics, Depression, India, Instagram, Sentiment Analysis, Social Media Analytics, Text Analysis
- How to Build a Recurrent Neural Network in TensorFlow - Apr 26, 2017.
This is a no-nonsense overview of implementing a recurrent neural network (RNN) in TensorFlow. Both theory and practice are covered concisely, and the end result is running TensorFlow RNN code.
Deep Learning, Neural Networks, Recurrent Neural Networks, TensorFlow
- The Data Science of Steel, or Data Factory to Help Steel Factory - Apr 25, 2017.
Applying Machine Learning to steel production is really hard! Here are some lessons from Yandex researchers on how to balance the need for findings to be accurate, useful, and understandable at the same time.
Applications, Recommendation Engine, Regression, Russia, Steel, Yandex
- AI & Machine Learning Black Boxes: The Need for Transparency and Accountability - Apr 25, 2017.
When something goes wrong, as it inevitably does, it can be a daunting task discovering the behavior that caused an event that is locked away inside a black box where discoverability is virtually impossible.
AI, Machine Learning, Transparency
- Must-Know: When can parallelism make your algorithms run faster? When could it make your algorithms run slower? - Apr 25, 2017.
Efficient implementation is key to achieving the benefits of parallelization, even though parallelism is a good idea when the task can be divided into sub-tasks that can be executed independent of each other without communication or shared resources.
Interview Questions, Parallelism
- Cartoon: the distance between Espresso and Cappuccino - Apr 22, 2017.
This cartoon takes a vector space approach to your favorite drinks and examines the distance between Espresso and Cappuccino. Warning: this is only funny to Data Scientists and mathematicians.
Cartoon, Coffee, Humor, word2vec
- Difference Between Big Data and Internet of Things - Apr 21, 2017.
If you cannot manage real-time streaming data and make real-time analytics and real-time decisions at the edge, then you are not doing IOT or IOT analytics, in my humble opinion. So what is required to support these IOT data management and analytic requirements?
Big Data, Internet of Things, IoT
- Awesome Deep Learning: Most Cited Deep Learning Papers - Apr 21, 2017.
This post introduces a curated list of the most cited deep learning papers (since 2012), provides the inclusion criteria, shares a few entry examples, and points to the full listing for those interested in investigating further.
Deep Learning, Neural Networks, Research
- Dataiku: The Complete Data Sheet - Apr 20, 2017.
Whether your everyday tool is Scala, Python, R, or Excel, you can now use one tool - Dataiku - to transform raw data to predictions without the hassle. Discover the platform!
Automated Data Science, Data Science Platform, Data Workflow, Dataiku
- The Value of Exploratory Data Analysis - Apr 20, 2017.
In this post, we will give a high level overview of what exploratory data analysis (EDA) typically entails and then describe three of the major ways EDA is critical to successfully model and interpret its results.
Data Analysis, Data Exploration, Data Visualization, SVDS
- How to Lie with Data - Apr 20, 2017.
We expect data scientists to be objective, but intentionally or not, they can produce results that mislead. We examine three common types of “lies” that Data Scientists should be aware of.
Confirmation Bias, Data Visualization, Mistakes, Overfitting
- Data Science for the Layman (No Math Added) - Apr 20, 2017.
Written for the layman, this book is a practical yet gentle introduction to data science. Discover key concepts behind more than 10 classic algorithms, explained with real-world examples and intuitive visuals.
Book, Data Science, Machine Learning, Tutorial
- How Big Data Helps Today’s Airlines Operate - Apr 19, 2017.
Companies all over the world have placed a lot of value on getting more insights from big data analytics. That’s not without good reason.
Airlines, Big Data
- E-learning courses on Advanced Analytics, Credit Risk Modeling, and Fraud Analytics - Apr 18, 2017.
These online courses, developed by Prof. Bart Baesens and SAS, include videos, case studies, and quizzes, and focus on the concepts and modeling methodologies rather than on specific software.
Advanced Analytics, Bart Baesens, Credit Risk, Fraud analytics, Online Education, SAS
- The dynamics between AI and IoT - Apr 18, 2017.
We see the need for a new type of Engineer who will combine knowledge from Electronics & IoT with Machine learning, AI, Robotics, Cloud and Data management (devops).
AI, Cloud Computing, Data Management, DevOps, Engineer, IoT, Robots
- Time Series Analysis with Generalized Additive Models - Apr 18, 2017.
In this tutorial, we will see an example of how a Generalized Additive Model (GAM) is used, learn how functions in a GAM are identified through backfitting, and learn how to validate a time series model.
Temporal Data, Time Series
- Must-Know: What is the curse of dimensionality? - Apr 18, 2017.
What is the curse of dimensionality? This post gives a no-nonsense overview of the concept, plain and simple.
Dimensionality Reduction, High-dimensional, Interview Questions
- More Deep Learning “Magic”: Paintings to photos, horses to zebras, and more amazing image-to-image translation - Apr 17, 2017.
This is an introduction to recent research which presents an approach for learning to translate an image from a source domain X to a target domain Y in the absence of paired examples.
Deep Learning, Generative Adversarial Network, Generative Models, Torch
- Cartoon: Taxes, Artificial Intelligence, and Humans - Apr 15, 2017.
In honor of Tax Day, new KDnuggets Cartoon looks at an unexpected white-collar job that may resist automation and Machine Learning.
AI, Artificial Intelligence, Cartoon, Fraud Detection, Humans, Taxes
- What Makes a Good Analyst? - Apr 14, 2017.
Without doubt, critical thinking is necessary in order to be a good analyst but particular skills and experience are also required. What are some of these skills?
Analyst, Science
- Is Blockchain the Ultimate Enabler of Data Monetization? - Apr 14, 2017.
Is blockchain the ultimate enabler of data and analytics monetization, creating marketplaces where companies, individuals and even smart entities (cars, trucks, buildings, airports, malls) can share/sell/trade/barter their data and analytic insights directly with others?
Blockchain, Data Monetization, Monetizing
- Forrester vs Gartner on Data Science Platforms and Machine Learning Solutions - Apr 14, 2017.
Who leads in Data Science, Machine Learning, and Predictive Analytics? We compare the latest Forrester and Gartner reports for this industry for 2017 Q1, identify gainers and losers, and strong leaders vs contenders.
Data Science Platform, Forrester, Gartner, IBM, Knime, Machine Learning, Mike Gualtieri, Predictive Analytics, RapidMiner, SAS
- Top mistakes data scientists make when dealing with business people - Apr 13, 2017.
There are no cover articles praising the fails of the many data scientists that don’t live up to the hype. Here we examine 3 typical mistakes and how to avoid them.
Business, Data Scientist, Mistakes, Skills
- 5 Machine Learning Projects You Can No Longer Overlook, April - Apr 13, 2017.
It's about that time again... 5 more machine learning or machine learning-related projects you may not yet have heard of, but may want to consider checking out. Find tools for data exploration, topic modeling, high-level APIs, and feature selection herein.
Data Exploration, Deep Learning, Java, Machine Learning, Neural Networks, Overlook, Python, Scala, scikit-learn, Topic Modeling
- Machine Learning Finds “Fake News” with 88% Accuracy - Apr 12, 2017.
In this post, the author assembles a dataset of fake and real news and employs a Naive Bayes classifier in order to create a model to classify an article as fake or real based on its words and phrases.
Data Science, Fake News, Machine Learning, Naive Bayes, Politics, Text Analytics
- Anonymization and the Future of Data Science - Apr 11, 2017.
This post walks the reader through a real-world example of a "linkage" attack to demonstrate the limits of data anonymization. New privacy regulations, most notably the GDPR, are making it increasingly difficult to maintain a balance between privacy and utility.
Big Data Privacy, Data Science, Law, Privacy
- The Evolution of a Productive Data Team - Apr 11, 2017.
Successful data teams at companies of any size are able to produce results because they develop gradually through a series of stages and acquire skills along the way that help them stay efficient and effective.
Data Science Team, Dataiku
- Must-Know: How to evaluate a binary classifier - Apr 11, 2017.
Binary classification is a basic concept which involves classifying the data into two groups. Read on for some additional insight and approaches.
Classifier, Interview Questions, Machine Learning
- New Poll: What data types you analyzed? - Apr 11, 2017.
The new KDnuggets Poll is asking: What data types have you analyzed in the past 12 months? Please vote.
Data types, Poll
- 10 Free Must-Read Books for Machine Learning and Data Science - Apr 10, 2017.
Spring. Rejuvenation. Rebirth. Everything’s blooming. And, of course, people want free ebooks. With that in mind, here's a list of 10 free machine learning and data science titles to get your spring reading started right.
Books, Data Science, ebook, Free ebook, Machine Learning
- The 42 V’s of Big Data and Data Science - Apr 7, 2017.
It's 2017, and we now operate in an ever more sophisticated world of analytics. To keep up with the times, we present our updated 2017 list: The 42 V's of Big Data and Data Science.
3Vs of Big Data, Humor
- A Brief History of Artificial Intelligence - Apr 7, 2017.
This post is a brief outline of what happened in artificial intelligence in the last 60 years. A great place to start or brush up on your history.
AI, Artificial Intelligence, History, ImageNet
- Stuff Happens: A Statistical Guide to the “Impossible” - Apr 6, 2017.
Why are some people struck by lightning multiple times or, more encouragingly, how could anyone possibly win the lottery more than once? The odds against these sorts of things are enormous.
Probability, Statistics
- How to stay out of analytic rabbit holes: avoiding investigation loops and their traps - Apr 6, 2017.
Data scientists tend to think that their main job is to answer complex questions and gain in-depth insights, but in reality it is all about solving problems – and the only way to solve a problem is to act on it.
Data Science, Methodology, Skills
- Top 20 Recent Research Papers on Machine Learning and Deep Learning - Apr 6, 2017.
Machine learning and Deep Learning research advances are transforming our technology. Here are the 20 most important (most-cited) scientific papers that have been published since 2014, starting with "Dropout: a simple way to prevent neural networks from overfitting".
Deep Learning, Machine Learning, Research, Top list, Yoshua Bengio
- Putting Together A Full-Blooded AI Maturity Model - Apr 5, 2017.
Here is a proposed “7A” model that is useful enough to capture the core of what AI offers without falsely implying there is a static body of best practices in this area.
AI, Bernard Marr, Maturity Model, Methodology, Mike Gualtieri
- Top /r/MachineLearning Posts, March: A Super Harsh Guide to Machine Learning; Is it Gaggle or Koogle?!? - Apr 4, 2017.
A Super Harsh Guide to Machine Learning; Google is acquiring data science community Kaggle; Suggestion by Salesforce chief data scientist; Andrew Ng resigning from Baidu; Distill: An Interactive, Visual Journal for Machine Learning Research
Advice, Andrew Ng, Distill, Google, Kaggle, Machine Learning, Reddit, Salesforce
- Must-Know: Why it may be better to have fewer predictors in Machine Learning models? - Apr 4, 2017.
There are a few reasons why it might be a better idea to have fewer predictor variables rather than having many of them. Read on to find out more.
Feature Selection, Interview Questions, Machine Learning, Modeling
- Introduction to Anomaly Detection - Apr 3, 2017.
This overview will cover several methods of detecting anomalies, as well as how to build a detector in Python using a simple moving average (SMA) or a low-pass filter.
Anomaly Detection, Datascience.com, Python, Time Series
- What is AI? Ingredients for Intelligence - Apr 3, 2017.
This introductory overview of artificial intelligence acts as a layman's guide to what AI is and what it is made up of.
AI, GRAKN.AI, Machine Intelligence, Turing Test