2016 Aug Tutorials, Overviews
- Learning from Imbalanced Classes - Aug 31, 2016.
Imbalanced classes can cause trouble for classification. Not all hope is lost, however. Check out this article for methods of dealing with such a situation.
- How Convolutional Neural Networks Work - Aug 31, 2016.
Get an overview of what is going on inside convolutional neural networks, and what it is that makes them so effective.
- What is the Role of the Activation Function in a Neural Network? - Aug 30, 2016.
Confused as to exactly what the activation function in a neural network does? Read this overview, and check out the handy cheat sheet at the end.
- Data Mining Tip: How to Use High-cardinality Attributes in a Predictive Model - Aug 29, 2016.
High-cardinality nominal attributes can be difficult to include in predictive models. There are a few ways to do so, however, which are put forward here.
- How Can Data Scientists Mitigate Sensitive Data Exposure Vulnerability? - Aug 26, 2016.
What is sensitive data? How does it affect data science, and what can be done to mitigate data exposure vulnerability? Read on to find out.
- A Tutorial on the Expectation Maximization (EM) Algorithm - Aug 25, 2016.
This is a short tutorial on the Expectation Maximization algorithm and how it can be used to estimate parameters for multivariate data.
- Introduction to Local Interpretable Model-Agnostic Explanations (LIME) - Aug 25, 2016.
Learn about LIME, a technique to explain the predictions of any machine learning classifier.
- A Gentle Introduction to Bloom Filter - Aug 24, 2016.
The Bloom Filter is a probabilistic data structure which can make a tradeoff between space and false positive rate. Read more, and see an implementation from scratch, in this post.
- A Primer on Logistic Regression – Part I - Aug 24, 2016.
Gain an understanding of logistic regression - what it is, and when and how to use it - in this post.
- A simple approach to anomaly detection in periodic big data streams - Aug 24, 2016.
We describe a simple and scalable algorithm that can detect rare and potentially irregular behavior in a time series with periodic patterns. It performs similarly to Twitter's more complex approach.
- How to Become a (Type A) Data Scientist - Aug 23, 2016.
This post outlines the difference between a Type A and a Type B data scientist, and prescribes a learning path for becoming a Type A.
- A Neat Trick to Increase Robustness of Regression Models - Aug 22, 2016.
Read this take on the validity of choosing a different approach to regression modeling. Why isn't the L1 norm used more often?
- Misinformation Key Terms, Explained - Aug 20, 2016.
Misinformation has emerged as a key issue for social media platforms. This post introduces the concept of misinformation and 8 key terms that provide insight into mining misinformation in social media.
- The Gentlest Introduction to Tensorflow – Part 2 - Aug 19, 2016.
Check out the second and final part of this introductory tutorial to TensorFlow.
- The 10 Algorithms Machine Learning Engineers Need to Know - Aug 18, 2016.
Read this introductory list of contemporary machine learning algorithms of importance that every engineer should understand.
- Approaching (Almost) Any Machine Learning Problem - Aug 18, 2016.
If you're looking for an overview of how to approach (almost) any machine learning problem, this is a good place to start. Read on as a Kaggle competition veteran shares his pipelines and approach to problem-solving.
- The Gentlest Introduction to Tensorflow – Part 1 - Aug 17, 2016.
In this series of articles, we present the gentlest introduction to TensorFlow, which starts off by showing how to do linear regression for a single-feature problem and expands from there.
- Central Limit Theorem for Data Science - Aug 12, 2016.
This post is an introductory explanation of the Central Limit Theorem, and why it is (or should be) of importance to data scientists.
- Understanding the Empirical Law of Large Numbers and the Gambler’s Fallacy - Aug 12, 2016.
The law of large numbers is an important concept for practising data scientists. In this post, the empirical law of large numbers is demonstrated via a simple simulation approach using the Bernoulli process.
- Making Data Science Accessible – Neural Networks - Aug 11, 2016.
This post attempts to make the underlying concepts of neural networks more accessible to everyone. Gain a high-level view of how they work here.
- A Beginner’s Guide to Neural Networks with R! - Aug 11, 2016.
In this article we will learn how Neural Networks work and how to implement them with the R programming language! We will see how we can easily create Neural Networks with R and even visualize them. A basic understanding of R is necessary to follow this article.
- Big Data Key Terms, Explained - Aug 11, 2016.
Just getting started with Big Data, or looking to iron out the wrinkles in your current understanding? Check out these 20 Big Data-related terms and their concise definitions.
- Exploring Social Media Diversity with Natural Language Processing - Aug 10, 2016.
This post uses natural language processing on Twitter data to determine the diversity of Twitter accounts the author is following. An innovative take on social media analytics.
- 7 Steps to Understanding Computer Vision - Aug 9, 2016.
A starting point for computer vision and how to go deeper. Dive into this post for an overview of the right resources and a little bit of advice.
- Brain Monitoring with Kafka, OpenTSDB, and Grafana - Aug 5, 2016.
Interested in using open source software to monitor brain activity, and control your devices? Sure you are! Read this fantastic post for some insight and direction.
- Contest Winner: Winning the AutoML Challenge with Auto-sklearn - Aug 5, 2016.
This post is the first-place prize recipient in the recent KDnuggets blog contest. Auto-sklearn is an open-source Python tool that automatically determines effective machine learning pipelines for classification and regression datasets. It is built around the successful scikit-learn library and won the recent AutoML challenge.
- Reinforcement Learning and the Internet of Things - Aug 5, 2016.
Gain an understanding of how reinforcement learning can be employed in the Internet of Things world.
- Contest 2nd Place: Automated Data Science and Machine Learning in Digital Advertising - Aug 4, 2016.
This post is an overview of an automated machine learning system in the digital advertising realm. It is an entrant and second-place recipient in the recent KDnuggets blog contest.
- Getting Started with Data Science – R - Aug 3, 2016.
A great introductory post from DataRobot on getting started with data science in R, including cleaning data and performing predictive modeling.
- Data Science for Beginners: Fantastic Introductory Video Series from Microsoft - Aug 3, 2016.
The remaining videos in Microsoft's Data Science for Beginners video series are available now. Have a look at what they offer.
- Doing Statistics with SQL - Aug 2, 2016.
This post covers how to perform some basic in-database statistical analysis using SQL.
- And the Winner is… Stepwise Regression - Aug 1, 2016.
This post evaluates several methods for automating the feature selection process in large-scale linear regression models and shows that for marketing applications the winner is stepwise regression.