- Evolutionary Algorithms for Feature Selection - Nov 29, 2017.
Feature selection is a very important technique in machine learning. In this post we discuss one of the most common optimization algorithms for multi-modal fitness landscapes: evolutionary algorithms.
Evolutionary Algorithm, Feature Selection, RapidMiner
- Understanding Deep Convolutional Neural Networks with a practical use-case in Tensorflow and Keras - Nov 29, 2017.
We show how to build a deep neural network that classifies images into many categories with an accuracy of 90%. This was a very hard problem before the rise of deep networks, and especially Convolutional Neural Networks.
Convolutional Neural Networks, Deep Learning, Keras, TensorFlow
- Why You Should Forget ‘for-loop’ for Data Science Code and Embrace Vectorization - Nov 29, 2017.
Data science needs fast computation and transformation of data. NumPy objects in Python provide that advantage over regular programming constructs such as the for-loop. How can we demonstrate it in a few easy lines of code?
numpy, Python, Scientific Computing
- Natural Language Processing Library for Apache Spark – free to use - Nov 28, 2017.
Introducing the Natural Language Processing Library for Apache Spark - and yes, you can actually use it for free! This post will give you a great overview of the John Snow Labs NLP Library for Apache Spark.
Apache Spark, API, GitHub, John Snow Labs, Machine Learning, NLP
- How To Unit Test Machine Learning Code - Nov 28, 2017.
One of the main principles I learned during my time at Google Brain was that unit tests can make or break your algorithm and can save you weeks of debugging and training time.
Machine Learning, Neural Networks, Python, Software Engineering, TensorFlow
- Survival Analysis for Business Analytics - Nov 27, 2017.
We compare survival analysis to other predictive techniques and provide examples of how it can produce business value, with a focus on the Kaplan-Meier and Cox Regression methods, which have been underutilized in business analytics.
Business Analytics, Survival Analysis, Time Series
- How (and Why) to Create a Good Validation Set - Nov 24, 2017.
The definitions of training, validation, and test sets can be fairly nuanced, and the terms are sometimes inconsistently used. In the deep learning community, “test-time inference” is often used to refer to evaluating on data in production, which is not the technical definition of a test set.
Cross-validation, Datasets, Rachel Thomas, Training Data, Validation
- Understanding Objective Functions in Neural Networks - Nov 23, 2017.
This blog post is targeted towards people who have experience with machine learning and want to get a better intuition for the different objective functions used to train neural networks.
Cost Function, Deep Learning, Gradient Descent, Neural Networks, Optimization
- Building a Wikipedia Text Corpus for Natural Language Processing - Nov 23, 2017.
Wikipedia is a rich source of well-organized textual data, and a vast collection of knowledge. What we will do here is build a corpus from the set of English Wikipedia articles, which is freely and conveniently available online.
Datasets, Natural Language Processing, NLP, Text Mining, Wikidata, Wikipedia
- Taming the Python Visualization Jungle, Nov 29 Webinar - Nov 22, 2017.
Python has a ton of plotting libraries—but which ones should you use? And how should you go about choosing them? This webinar shows you key starting points and demonstrates how to solve a range of common problems.
Anaconda, Data Visualization, Python
- A Framework for Approaching Textual Data Science Tasks - Nov 22, 2017.
Although NLP and text mining are not the same thing, they are closely related, deal with the same raw data type, and have some crossover in their uses. Let's discuss the steps in approaching these types of tasks.
Modeling, Natural Language Processing, NLP, Text Analytics, Text Mining
- Best Masters in Data Science and Analytics in US/Canada - Nov 21, 2017.
Second comprehensive list of master's degrees in the US and Canada with tuition information and duration.
Canada, Master of Science, MS in Analytics, MS in Business Analytics, MS in Data Science, USA
- Using TensorFlow for Predictive Analytics with Linear Regression - Nov 21, 2017.
This post presents a powerful and simple example of how to use TensorFlow to perform a Linear Regression. Check out the code for your own experiments!
Linear Regression, TensorFlow
- Estimating an Optimal Learning Rate For a Deep Neural Network - Nov 21, 2017.
This post describes a simple and powerful way to find a reasonable learning rate for your neural network.
Deep Learning, Hyperparameter, Neural Networks
- New Poll: Which Data Science / Machine Learning methods and tools you used? - Nov 20, 2017.
Please vote in the new KDnuggets poll, which examines the methods and tools used for a real-world application or project.
Algorithms, Data Science Tools, Machine Learning, Poll
- Automated Feature Engineering for Time Series Data - Nov 20, 2017.
We introduce a general framework for developing time series models, generating features, and preprocessing the data, and explore the potential to automate this process in order to apply advanced machine learning algorithms to almost any time series problem.
Automated Machine Learning, Data Preparation, Feature Engineering, Feature Selection, Time Series
- Top 10 Videos on Deep Learning in Python - Nov 17, 2017.
Playlists, individual tutorials (not part of a playlist) and online courses on Deep Learning (DL) in Python using the Keras, Theano, TensorFlow and PyTorch libraries. Assumes no prior knowledge. These videos cover all skill levels and time constraints!
Deep Learning, Keras, Python, PyTorch, TensorFlow, Theano, Top 10, Tutorials, Videolectures, Youtube
- 8 Ways to Improve Your Data Science Skills in 2 Years - Nov 17, 2017.
Two years. Two years is the maximum amount of time you should spend focused on your learning, education and training. That’s exactly why this guide is focused on honing the most beneficial skills in two years.
Data Science, Data Science Skills, Skills, Training
- The Python Graph Gallery - Nov 16, 2017.
Welcome to the Python Graph Gallery, a website that displays hundreds of Python charts with their reproducible code snippets.
Data Visualization, Matplotlib, Python, Seaborn
- PySpark SQL Cheat Sheet: Big Data in Python - Nov 16, 2017.
PySpark is a Spark Python API that exposes the Spark programming model to Python. With it, you can speed up analytic applications. With Spark, you can get started with big data processing, as it has built-in modules for streaming, SQL, machine learning and graph processing.
Apache Spark, Big Data, DataCamp, Python, SQL
- You have created your first Linear Regression Model. Have you validated the assumptions? - Nov 15, 2017.
Linear Regression is an excellent starting point for Machine Learning, but it is a common mistake to focus just on the p-values and R-Squared values when determining the validity of a model. Here we examine the underlying assumptions of a Linear Regression, which need to be validated before applying the model.
Data Science, Linear Regression, Machine Learning, Multicollinearity, Statistics
- The 10 Statistical Techniques Data Scientists Need to Master - Nov 15, 2017.
The author presents 10 statistical techniques which a data scientist needs to master. Build up your toolbox of data science tools by having a look at this great overview post.
Algorithms, Data Science, Data Scientist, Machine Learning, Statistical Learning, Statistics
- Best Online Masters in Data Science and Analytics – a comprehensive, unbiased survey - Nov 14, 2017.
The first comprehensive and objective survey of online Masters in Analytics / Data Science, including rankings, tuition, and duration of the education program.
Master of Science, MS in Analytics, MS in Business Analytics, MS in Data Science, Online Education
- Extracting Tweets With R - Nov 14, 2017.
This article will give you a great, brief overview of extracting Tweets using R.
R, Twitter
- Machine Learning Algorithms: Which One to Choose for Your Problem - Nov 14, 2017.
This article will try to explain basic concepts and give some intuition for using different kinds of machine learning algorithms in different tasks. At the end of the article, you’ll find a structured overview of the main features of the described algorithms.
Algorithms, Machine Learning, Reinforcement Learning, Statsbot, Supervised Learning, Unsupervised Learning
- Your guide to predictive analytics in media and entertainment - Nov 13, 2017.
Download your free guide to predictive analytics in media and entertainment for a look at the landscape and use cases, from Dataiku.
Dataiku, Entertainment, Free ebook, Media
- A Day in the Life of a Data Scientist - Nov 13, 2017.
Are you interested in what a data scientist does on a typical day of work? Each data science role may be different, but these five individuals provide insight to help those interested in figuring out what a day in the life of a data scientist actually looks like.
Advice, Career, Data Science, Data Scientist
- Overview of GANs (Generative Adversarial Networks) – Part I - Nov 10, 2017.
A great introductory and high-level summary of Generative Adversarial Networks.
Deep Learning, GANs, Generative Adversarial Network, Neural Networks
- How Bayesian Networks Are Superior in Understanding Effects of Variables - Nov 9, 2017.
Bayes Nets have remarkable properties that make them better than many traditional methods in determining variables’ effects. This article explains the principal advantages.
Bayesian, Bayesian Networks, Predictive Models, Probability, Regression, Statistics
- The Qualitative Side of Quantitative Research - Nov 9, 2017.
Kevin and Koen may buy the same brand for the same reasons. On the other hand, they may buy the same brand for different reasons, or buy different brands for the same reasons, or even different brands for different reasons. The brands they purchase and the reasons why may vary by occasion, too.
Qualitative Analytics, Qualitative Research, Quantitative Analytics, Research
- TensorFlow: What Parameters to Optimize? - Nov 9, 2017.
Learning the TensorFlow Core API, which is the lowest-level API in TensorFlow, is a very good first step in learning TensorFlow because it lets you understand the kernel of the library. Here is a very simple example of the TensorFlow Core API in which we create and train a linear regression model.
Neural Networks, Optimization, Python, TensorFlow
- Tips for Getting Started with Text Mining in R and Python - Nov 8, 2017.
This article opens up the world of text mining in a simple and intuitive way and provides great tips to get started with text mining.
Python, R, Text Mining
- When Will Demand for Data Scientists/Machine Learning Experts Peak? - Nov 7, 2017.
We analyze the results of the Data Science / Machine Learning peak demand poll, examine the split between optimists and pessimists, and try to explain why predictions look so similar regardless of experience, affiliation, and region.
Data Scientist, Hiring, Machine Learning, Poll, Trends
- Interpreting Machine Learning Models: An Overview - Nov 7, 2017.
This post summarizes the contents of a recent O'Reilly article outlining a number of methods for interpreting machine learning models, beyond the usual go-to measures.
Interpretability, Machine Learning, Modeling, O'Reilly
- What is the difference between Bagging and Boosting? - Nov 6, 2017.
Bagging and Boosting are both ensemble methods in Machine Learning, but what’s the key behind them? Here we explain in detail.
Bagging, Boosting, Ensemble Methods, Machine Learning
- Blockchain Key Terms, Explained - Nov 3, 2017.
Need a quick glance over some important definitions associated with the Blockchain? Then consider this article your Blockchain Definitions 101!
Bitcoin, Blockchain, Cryptocurrency, Explained, Hashing, Key Terms
- More than the Hype: Beyond Gartner’s Hype Cycle - Nov 3, 2017.
Gartner publishes hype cycles across different technologies and sectors. Here we conduct a detailed analysis of Gartner’s Hype Cycles.
Gartner, Hype, Stocks
- Want to know how Deep Learning works? Here’s a quick guide for everyone - Nov 3, 2017.
Once you’ve read this article, you will understand the basics of AI and ML. More importantly, you will understand how Deep Learning, the most popular type of ML, works.
Deep Learning, Neural Networks
- Cybersecurity: Managing Risk in the Information age, Harvard online short course - Nov 2, 2017.
Learn how to identify and manage operational risk, litigation risk and reputational risk. This course is brought to you by HarvardX in collaboration with GetSmarter, experts in online education for working professionals.
Cybersecurity, Harvard, Online Education, Risk Assessment
- Process Mining with R: Introduction - Nov 2, 2017.
In the past years, several niche tools have appeared to mine organizational business processes. In this article, we’ll show you that it is possible to get started with “process mining” using well-known data science programming languages as well.
Data Mining, Data Science, Process Mining, R
- 3 different types of machine learning - Nov 1, 2017.
In this extract from “Python Machine Learning”, top data scientist Sebastian Raschka explains the 3 main types of machine learning: Supervised, Unsupervised and Reinforcement Learning. Use code PML250KDN to save 50% on the book.
Classification, Clustering, Machine Learning, Regression, Reinforcement Learning, Supervised Learning
- Conjoint Analysis: A Primer - Nov 1, 2017.
Conjoint is another of those things everyone talks about but many are confused about…
Statistical Analysis, Statistics
- Getting Started with Machine Learning in One Hour! - Nov 1, 2017.
Here is a getting-started guide to machine learning, which grew out of the author's notes for a one-hour talk on the subject. Hopefully you find the path helpful.
Beginners, Machine Learning