This post covers the emergence of game-theoretic concepts in the design of newer deep learning architectures. Deep learning systems need to adapt to imperfect knowledge and to coordinate with other systems, two areas where game theory can help.
Admittedly, there is a lot more to building a successful data team, and we would be lying if we pretended we have it all figured out. But hopefully focusing on the elements in this post is a good start.
Deep learning writer Carlos Perez gives his own classification for deep learning-based AI, which is aimed at practitioners. This classification gives us a sense of where we currently are and where we might be heading.
Climate change is one of the most pressing challenges for human society in the 21st century. We review the Big Data ecosystem for studying climate change.
Data Privacy, Security and Ethics are hot yet complex topics in the business and data science world. This important article discusses and provides guidelines for privacy, security and ethics, specifically in the context of Process Mining.
This post presents some common scenarios where a seemingly good machine learning model may still be wrong, along with a discussion of how to evaluate these issues by assessing metrics of bias vs. variance and precision vs. recall.
Power laws and other relationships between observable phenomena may not seem relevant to data science, at least not to newcomers to the field, but this post provides an overview and suggests why they are.
One of the biggest obstacles to successful projects has been getting access to interesting data. Here are some more cool public data sources you can use for your next project.
As 2016 comes to a close and we prepare for a new year, check out the final instalment in our "Main Developments in 2016 and Key Trends in 2017" series, where experts weigh in with their opinions.
This post walks through the logic behind three recent deep learning architectures: ResNet, HighwayNet, and DenseNet. Each makes it possible to successfully train deeper networks by overcoming the limitations of traditional network design.
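The idea shared by these architectures is a shortcut that lets the input bypass a stack of layers, so gradients can flow through deep networks. Below is a minimal NumPy sketch of ResNet's identity shortcut, y = relu(x + F(x)); the layer widths and weight scales are illustrative assumptions, not taken from the post.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(0.0, x)

def residual_block(x, w1, w2):
    """A minimal residual block: the input skips past two weighted
    layers and is added back to their output (identity shortcut)."""
    f = relu(x @ w1) @ w2   # the residual function F(x)
    return relu(x + f)      # y = relu(x + F(x))

x = rng.standard_normal((4, 8))          # batch of 4, width 8
w1 = rng.standard_normal((8, 8)) * 0.1
w2 = rng.standard_normal((8, 8)) * 0.1
y = residual_block(x, w1, w2)
print(y.shape)  # (4, 8)
```

Note that if F(x) is zero (here, zero weights), the block reduces to the identity, which is what makes very deep stacks of such blocks easy to optimize.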
Bayesian inference is a powerful toolbox for modeling uncertainty, combining researcher understanding of a problem with data, and providing a quantitative measure of how plausible various facts are. This overview from Datascience.com introduces Bayesian probability and inference in an intuitive way, and provides examples in Python to help get you started.
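As a flavor of the kind of Python example the overview provides, here is a hypothetical sketch (not taken from the Datascience.com article) of the simplest Bayesian update: a beta prior over a coin's bias combined with binomial data, where the posterior mean quantifies how plausible "the coin favors heads" is.

```python
# Conjugate beta-binomial update: with prior Beta(a, b) and
# k successes observed in n trials, the posterior is
# Beta(a + k, b + n - k); its mean is our updated belief.
def posterior_mean(a, b, k, n):
    a_post, b_post = a + k, b + (n - k)
    return a_post / (a_post + b_post)

# Uniform prior Beta(1, 1); observe 7 heads in 10 coin flips.
print(posterior_mean(1, 1, 7, 10))  # 8/12 = 0.666...
```

The prior (a, b) encodes what the researcher believes before seeing data, and the data pull the estimate away from it, which is exactly the "understanding of a problem combined with data" the blurb describes.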
Also 20 Questions to Detect Fake Data Scientists; Software used for Analytics, Data Science, Machine Learning projects; Top Algorithms and Methods Used by Data Scientists
Read this engaging overview of a report from the Stanford University 100 year study of Artificial Intelligence, “a long-term investigation of the field of Artificial Intelligence (AI) and its influences on people, their communities, and society.”
Get up to speed and keep concepts and commands handy in Data Science, Data Mining, and Machine Learning with these cheat sheets covering R, Python, Django, MySQL, SQL, Hadoop, Apache Spark, Matlab, and Java.
The importance of correct classification and the hazards of misclassification are subjective, varying from case to case. Let's see how the cost of misclassification can be measured from a monetary perspective.
Why do we mine data? This post is an overview of the types of patterns that can be gleaned from data mining, and some real-world examples of said patterns.
Key themes included the polling failures in 2016 US Elections, Deep Learning, IoT, greater focus on value and ROI, and increasing adoption of predictive analytics by the "masses" of industry.
We review how key data science algorithms, such as regression, feature selection, and Monte Carlo, are used in financial instrument pricing and risk management.
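Monte Carlo is perhaps the most hands-on of these: simulate many possible futures for an asset and average the discounted payoffs. The sketch below prices a European call under geometric Brownian motion; it is a generic textbook illustration, not code from the article, and the parameter values are assumptions.

```python
import numpy as np

def mc_call_price(s0, k, r, sigma, t, n_paths=200_000, seed=0):
    """Monte Carlo price of a European call: simulate terminal prices
    S_T = S0 * exp((r - sigma^2/2) * t + sigma * sqrt(t) * Z),
    then discount the average payoff max(S_T - K, 0)."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n_paths)
    s_t = s0 * np.exp((r - 0.5 * sigma**2) * t + sigma * np.sqrt(t) * z)
    payoff = np.maximum(s_t - k, 0.0)
    return np.exp(-r * t) * payoff.mean()

# At-the-money call: spot 100, strike 100, 5% rate, 20% vol, 1 year.
print(round(mc_call_price(100, 100, 0.05, 0.2, 1.0), 2))
```

The same machinery extends to path-dependent payoffs and risk measures such as VaR, where closed-form prices are unavailable.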
This recent paper approaches explaining deep learning from a different perspective, that of physics, and discusses the role of "cheap learning" (parameter reduction) and how it relates back to this innovative perspective.
Cognitive biases are inherently problematic in a variety of fields, including data science. Is this something that can be mitigated? A solid understanding of cognitive biases is the best weapon, which this overview hopes to help provide.
Matching the performance of a human brain is a difficult feat, but techniques have been developed to improve the performance of neural network algorithms, three of which are discussed in this post: distortion, mini-batch gradient descent, and dropout.
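Dropout is the easiest of the three to show in a few lines. The sketch below is a generic "inverted dropout" implementation in NumPy (an illustration of the standard technique, not code from the post): during training, each activation is zeroed with probability p and the survivors are rescaled so the expected activation is unchanged.

```python
import numpy as np

def dropout(x, p_drop, rng, train=True):
    """Inverted dropout: at training time, zero each activation with
    probability p_drop and rescale the rest by 1 / (1 - p_drop); at
    test time, pass x through unchanged."""
    if not train:
        return x
    mask = rng.random(x.shape) >= p_drop   # keep mask
    return x * mask / (1.0 - p_drop)

rng = np.random.default_rng(42)
a = np.ones((2, 5))
out = dropout(a, 0.5, rng)
print(out)  # entries are either 0.0 (dropped) or 2.0 (kept, rescaled)
```

Because each forward pass samples a different mask, the network cannot rely on any single unit, which acts as a regularizer against overfitting.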
This intro to ANNs will look at how we can train an algorithm to recognize images of handwritten digits. We will be using the images from the famous MNIST (Mixed National Institute of Standards and Technology) database.
Measuring the accuracy of a model for a classification problem (categorical output) is more complex and time-consuming than for regression problems (continuous output). Let's understand the key testing metrics for a classification problem with an example.
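The core metrics all derive from the four cells of the confusion matrix. As a hedged sketch of the standard definitions (an illustration, not the article's own example data), the pure-Python helper below computes accuracy, precision, recall, and F1 from predicted and true labels:

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 from the confusion matrix."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp)   # of predicted positives, how many were right
    recall = tp / (tp + fn)      # of actual positives, how many were found
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
print(classification_metrics(y_true, y_pred))  # (0.75, 0.75, 0.75, 0.75)
```

Which metric matters depends on the cost of each error type, which is why a single "accuracy" number is rarely enough for categorical outputs.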
Two free ebooks: "Building Machine Learning Systems with Python" and "Practical Data Analysis" will give your skills a boost and make a great start in the New Year.
There is a lot of confusion these days about Artificial Intelligence (AI), Machine Learning (ML) and Deep Learning (DL), yet the distinction is very clear to practitioners in these fields. Are you able to articulate the difference?
Data processing and analytical modelling are major bottlenecks in today's big data world, due to the need for human intelligence to determine relationships in the data, the required data engineering tasks, and the analytical models and their parameters. This article discusses how a Smart Data Platform can help solve such problems.
Random forest is a highly versatile machine learning method with numerous applications ranging from marketing to healthcare and insurance. This is a post about random forests using Python.
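In Python, the go-to random forest implementation is scikit-learn's. The snippet below is a minimal sketch, assuming scikit-learn is installed and using a synthetic dataset in place of the marketing or healthcare data the post has in mind:

```python
# Minimal random forest workflow with scikit-learn on synthetic data.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=10,
                           n_informative=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# 100 decision trees, each fit on a bootstrap sample of the rows
# and a random subset of features at each split.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_tr, y_tr)
print(round(clf.score(X_te, y_te), 2))  # held-out accuracy
```

Averaging many decorrelated trees is what gives the forest its robustness across such varied domains; `clf.feature_importances_` also gives a quick read on which inputs drive predictions.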
It’s easy to optimize simple neural networks, such as a single-layer perceptron, but as networks become deeper, optimization becomes a crucial problem. This article discusses such optimization problems in deep neural networks.
We examine the main reasons for failure in Big Data, Data Science, and Analytics projects, including lack of a clear mandate, resistance to change, and not asking the right questions, and discuss what can be done to address these problems.