2016 Dec Tutorials, Overviews
All (100) | Courses, Education (9) | Meetings (11) | News, Features (26) | Opinions, Interviews (31) | Software (3) | Tutorials, Overviews (18) | Webcasts & Webinars (2)
- The big data ecosystem for science: Climate Science and Climate Change - Dec 22, 2016.
Climate change is one of the most pressing challenges for human society in the 21st century. We review the Big Data ecosystem for studying the climate change.
- 4 Reasons Your Machine Learning Model is Wrong (and How to Fix It) - Dec 21, 2016.
This post presents some common scenarios where a seemingly good machine learning model may still be wrong, along with a discussion of how how to evaluate these issues by assessing metrics of bias vs. variance and precision vs. recall.
- Data Science Basics: Power Laws and Distributions - Dec 21, 2016.
Power laws and other relationships between observable phenomena may not seem like they are of any interest to data science, at least not to newcomers to the field, but this post provides an overview and suggests how they may be.
- ResNets, HighwayNets, and DenseNets, Oh My! - Dec 19, 2016.
This post walks through the logic behind three recent deep learning architectures: ResNet, HighwayNet, and DenseNet. Each make it more possible to successfully trainable deep networks by overcoming the limitations of traditional network design.
- Introduction to Bayesian Inference - Dec 16, 2016.
Bayesian inference is a powerful toolbox for modeling uncertainty, combining researcher understanding of a problem with data, and providing a quantitative measure of how plausible various facts are. This overview from Datascience.com introduces Bayesian probability and inference in an intuitive way, and provides examples in Python to help get you started.
- The big data ecosystem for science: Genomics - Dec 15, 2016.
The field of genomics has undergone a revolution over the past decade as the cost of sequencing has rapidly declined and the practice of sequencing has been commoditized. We review the Big Data ecosystem in genomics.
- 50+ Data Science, Machine Learning Cheat Sheets, updated - Dec 14, 2016.
Gear up to speed and have concepts and commands handy in Data Science, Data Mining, and Machine learning algorithms with these cheat sheets covering R, Python, Django, MySQL, SQL, Hadoop, Apache Spark, Matlab, and Java.
- The Costs of Misclassifications - Dec 14, 2016.
Importance of correct classification and hazards of misclassification are subjective or we can say varies on case-to-case. Lets see how cost of misclassification is measured from monetary perspective.
- Data Science Basics: What Types of Patterns Can Be Mined From Data? - Dec 14, 2016.
Why do we mine data? This post is an overview of the types of patterns that can be gleaned from data mining, and some real world examples of said patterns.
- Data Analytics Models in Quantitative Finance and Risk Management - Dec 13, 2016.
We review how key data science algorithms, such as regression, feature selection, and Monte Carlo, are used in financial instrument pricing and risk management.
- Data Science and Big Data: Definitions and Common Myths - Dec 12, 2016.
A well-set data strategy is becoming fundamental to every business, regardless the actual size of the datasets used. However, in order to establish a data framework that works, there are a few misconceptions that need to be clarified.
- Introduction to K-means Clustering: A Tutorial - Dec 9, 2016.
A beginner introduction to the widely-used K-means clustering algorithm, using a delivery fleet data example in Python.
- Artificial Neural Networks (ANN) Introduction, Part 2 - Dec 9, 2016.
Matching the performance of a human brain is a difficult feat, but techniques have been developed to improve the performance of neural network algorithms, 3 of which are discussed in this post: Distortion, mini-batch gradient descent, and dropout.
- Artificial Neural Networks (ANN) Introduction, Part 1 - Dec 8, 2016.
This intro to ANNs will look at how we can train an algorithm to recognize images of handwritten digits. We will be using the images from the famous MNIST (Mixed National Institute of Standards and Technology) database.
- The Best Metric to Measure Accuracy of Classification Models - Dec 7, 2016.
Measuring accuracy of model for a classification problem (categorical output) is complex and time consuming compared to regression problems (continuous output). Let’s understand key testing metrics with example, for a classification problem.
- Internet of Things Tutorial: Introduction - Dec 6, 2016.
In the first of a series of posts, a set of Internet of Things technologies and applications are presented. Following posts will expand on these topics in tutorial form. Get an introduction to IoT here.
- Random Forests® in Python - Dec 2, 2016.
Random forest is a highly versatile machine learning method with numerous applications ranging from marketing to healthcare and insurance. This is a post about random forests using Python.
- Data Science Deployments With Docker - Dec 1, 2016.
With the recent release of NVIDIA’s nvidia-docker tool, accessing GPUs from within Docker is a breeze. In this tutorial we’ll walk you through setting up nvidia-docker so you too can deploy machine learning models with ease.