Models: From the Lab to the Factory - Apr 27, 2017.
In this post, we’ll go over techniques to avoid these scenarios through the process of model management and deployment.
Dask and Pandas and XGBoost: Playing nicely between distributed systems - Apr 27, 2017.
This blog post gives a quick example using Dask.dataframe to do distributed Pandas data wrangling, then using a new dask-xgboost package to set up an XGBoost cluster inside the Dask cluster and perform the handoff.
How to Build a Recurrent Neural Network in TensorFlow - Apr 26, 2017.
This is a no-nonsense overview of implementing a recurrent neural network (RNN) in TensorFlow. Both theory and practice are covered concisely, and the end result is running TensorFlow RNN code.
Must-Know: When can parallelism make your algorithms run faster? When could it make your algorithms run slower? - Apr 25, 2017.
Parallelism is a good idea when a task can be divided into sub-tasks that execute independently of each other, without communication or shared resources. Even then, an efficient implementation is key to achieving the benefits of parallelization.
Data Science Dividends – A Gentle Introduction to Financial Data Analysis - Apr 24, 2017.
This post outlines some very basic methods for performing financial data analysis using Python, Pandas, and Matplotlib, focusing mainly on stock price data. A good place for beginners to start.
Difference Between Big Data and Internet of Things - Apr 21, 2017.
If you cannot manage real-time streaming data and make real-time analytics and real-time decisions at the edge, then you are not doing IOT or IOT analytics, in my humble opinion. So what is required to support these IOT data management and analytic requirements?
Awesome Deep Learning: Most Cited Deep Learning Papers - Apr 21, 2017.
This post introduces a curated list of the most cited deep learning papers (since 2012), provides the inclusion criteria, shares a few entry examples, and points to the full listing for those interested in investigating further.
The Value of Exploratory Data Analysis - Apr 20, 2017.
In this post, we will give a high-level overview of what exploratory data analysis (EDA) typically entails and then describe three of the major ways EDA is critical to successfully building a model and interpreting its results.
Negative Results on Negative Images: Major Flaw in Deep Learning? - Apr 20, 2017.
This is an overview of recent research outlining the limitations of the capabilities of image recognition using deep neural networks. But should this really be considered a "limitation?"
Time Series Analysis with Generalized Additive Models - Apr 18, 2017.
In this tutorial, we will see an example of how a Generalized Additive Model (GAM) is used, learn how functions in a GAM are identified through backfitting, and learn how to validate a time series model.
Must-Know: What is the curse of dimensionality? - Apr 18, 2017.
What is the curse of dimensionality? This post gives a no-nonsense overview of the concept, plain and simple.
Predictive Maintenance: A Primer - Apr 17, 2017.
Companies can no longer afford to have product rollbacks or have wastage because of replacement parts. This is where the need for “Predictive Maintenance” comes into play.
New Online Data Science Tracks for 2017 - Apr 17, 2017.
In 2017 there are many new and revamped data science tracks that are much more comprehensive for beginners than ever before. The tracks are designed to give you the skills you need to grab a job in data science, and some even have a job guarantee.
Is Blockchain the Ultimate Enabler of Data Monetization? - Apr 14, 2017.
Is blockchain the ultimate enabler of data and analytics monetization, creating marketplaces where companies, individuals and even smart entities (cars, trucks, buildings, airports, malls) can share/sell/trade/barter their data and analytic insights directly with others?
Medical Image Analysis with Deep Learning, Part 2 - Apr 13, 2017.
In this article we will talk about the basics of deep learning through the lens of Convolutional Neural Nets. We plan to use this knowledge to build CNNs in the next post and use Keras to develop a model to predict lung cancer.
5 Machine Learning Projects You Can No Longer Overlook, April - Apr 13, 2017.
It's about that time again... 5 more machine learning or machine learning-related projects you may not yet have heard of, but may want to consider checking out. Find tools for data exploration, topic modeling, high-level APIs, and feature selection herein.
Machine Learning Finds “Fake News” with 88% Accuracy - Apr 12, 2017.
In this post, the author assembles a dataset of fake and real news and employs a Naive Bayes classifier in order to create a model to classify an article as fake or real based on its words and phrases.
Anonymization and the Future of Data Science - Apr 11, 2017.
This post walks the reader through a real-world example of a "linkage" attack to demonstrate the limits of data anonymization. New privacy regulations, most notably the GDPR, are making it increasingly difficult to maintain a balance between privacy and utility.
Must-Know: How to evaluate a binary classifier - Apr 11, 2017.
Binary classification is a basic concept which involves classifying the data into two groups. Read on for some additional insight and approaches.
The 42 V’s of Big Data and Data Science - Apr 7, 2017.
It's 2017 now, and we now operate in an ever more sophisticated world of analytics. To keep up with the times, we present our updated 2017 list: The 42 V's of Big Data and Data Science.
A Brief History of Artificial Intelligence - Apr 7, 2017.
This post is a brief outline of what happened in artificial intelligence in the last 60 years. A great place to start or brush up on your history.
Top 20 Recent Research Papers on Machine Learning and Deep Learning - Apr 6, 2017.
Machine learning and Deep Learning research advances are transforming our technology. Here are the 20 most important (most-cited) scientific papers that have been published since 2014, starting with "Dropout: a simple way to prevent neural networks from overfitting".
Finding “Gems” in Big Data - Apr 4, 2017.
Detecting anomalous cases in large datasets is critical in conducting surveillance, countering credit-card fraud, protecting against network hacking, combating insurance fraud, and many more applications in government, business and healthcare. Learn how to do it online in the "Anomaly Detection" course at Statistics.com.
Must-Know: Why it may be better to have fewer predictors in Machine Learning models? - Apr 4, 2017.
There are a few reasons why it might be a better idea to have fewer predictor variables rather than having many of them. Read on to find out more.
Introduction to Anomaly Detection - Apr 3, 2017.
This overview will cover several methods of detecting anomalies, as well as how to build a detector in Python using a simple moving average (SMA) or a low-pass filter.
What is AI? Ingredients for Intelligence - Apr 3, 2017.
This introductory overview of artificial intelligence acts as a layman's guide to what AI is and what it is made up of.
- Medical Image Analysis with Deep Learning
- A Short Guide to Navigating the Jupyter Ecosystem
The Best R Packages for Machine Learning
There is no doubt R is the language of choice for the majority of data scientists who want to understand data, especially those looking to leverage its great machine learning packages.
- A Beginner’s Guide to Tweet Analytics with Pandas
- Deep Learning, Generative Adversarial Networks & Boxing – Toward a Fundamental Understanding
- The Next Challenges for Reinforcement Learning
What is Structural Equation Modeling?
Structural Equation Modeling (SEM) is an extremely broad and flexible framework for data analysis, perhaps better thought of as a family of related methods rather than as a single technique. What is its relevance to Marketing Research?
- Getting Started with Deep Learning
- Unsupervised Investments: A Comprehensive Guide to AI Investors
What Is Data Science, and What Does a Data Scientist Do?
This article is intended to help define the data scientist role, including typical skills, qualifications, education, experience, and responsibilities. This definition is somewhat loose, given that the ideal experience and skill set is relatively rare to find in one individual.
- Statistical Modeling: A Primer
Getting Up Close and Personal with Algorithms
We've put together a brief summary of the top algorithms used in predictive analysis, which you can see just below. Read to learn more about Linear Regression, Logistic Regression, Decision Trees, Random Forests, Gradient Boosting, and more.
- Analytics 101: Comparing KPIs
- The Most Underutilized Function in SQL
- Email Spam Filtering: An Implementation with Python and Scikit-learn
- Applying Machine Learning To March Madness
50 Companies Leading The AI Revolution, Detailed
We detail 50 companies leading the Artificial Intelligence revolution in AD Sales, CRM, Autotech, Business Intelligence and analytics, Commerce, Conversational AI/Bots, Core AI, Cyber-Security, Fintech, Healthcare, IoT, Vision, and other areas.
17 More Must-Know Data Science Interview Questions and Answers, Part 3
The third and final part of 17 new must-know Data Science interview questions and answers covers A/B testing, data visualization, Twitter influence evaluation, and Big Data quality.
- Homebrewed Deep Learning and Do-It-Yourself Robotics
- Open Source Toolkits for Speech Recognition
- Toward Increased k-means Clustering Efficiency with the Naive Sharding Centroid Initialization Method
- Working With Numpy Matrices: A Handy First Reference
- Visualizing Time-Series Change
- Beginner’s Guide to Customer Segmentation
What makes a good data visualization – a Data Scientist perspective
We examine principles of good data visualization, including some great and terrible examples, guidelines for human perception, focus on key variables, changes and trends, avoiding chart junk, and more.
- The Challenges of Building a Predictive Churn Model
- Building Regression Models in R using Support Vector Regression
- Neuroscience for Data Scientists: Understanding Human Behaviour
- K-Means & Other Clustering Algorithms: A Quick Intro with Python
- A Simple XGBoost Tutorial Using the Iris Dataset
- Bokeh Cheat Sheet: Data Visualization in Python
Every Intro to Data Science Course on the Internet, Ranked
For this guide, I spent 10+ hours trying to identify every online intro to data science course offered as of January 2017, extracting key bits of information from their syllabi and reviews, and compiling their ratings.
- Building a Bot to Answer FAQs: Predicting Text Similarity
- What is Customer Churn Modeling? Why is it valuable?
7 More Steps to Mastering Machine Learning With Python
This post is a follow-up to last year's introductory Python machine learning post, which includes a series of tutorials for extending your knowledge beyond the original.
- 50+ Useful Machine Learning & Prediction APIs, updated