-
Why the Data Scientist and Data Engineer Need to Understand Virtualization in the Cloud
This article covers why data scientists and data engineers benefit from understanding virtualization constructs as they deploy their analyses onto all kinds of cloud platforms. Virtualization is a key enabling layer of software, and knowing how it works helps these data workers achieve optimal results.
-
Tidying Data in Python
This post summarizes some of the tidying examples Hadley Wickham used in his 2014 paper on Tidy Data in R, and demonstrates how to do the same using the Python pandas library.
-
Ten Myths About Machine Learning, by Pedro Domingos
Myths about artificial intelligence and machine learning abound. Noted expert Pedro Domingos identifies and refutes a number of these myths, of both the pessimistic and optimistic variety.
-
Laying the Foundation for a Data Team
Admittedly, there is a lot more to building a successful data team, and we would be lying if we pretended we have it all figured out. But hopefully focusing on the elements in this post is a good start.
-
4 Reasons Your Machine Learning Model is Wrong (and How to Fix It)
This post presents some common scenarios where a seemingly good machine learning model may still be wrong, along with a discussion of how to evaluate these issues by assessing metrics of bias vs. variance and precision vs. recall.
-
ResNets, HighwayNets, and DenseNets, Oh My!
This post walks through the logic behind three recent deep learning architectures: ResNet, HighwayNet, and DenseNet. Each makes it possible to successfully train deeper networks by overcoming the limitations of traditional network design.
-
The 5 Basic Types of Data Science Interview Questions
Data science interviews are notoriously complex, but most of what they throw at you will fall into one of these categories.
-
Artificial Neural Networks (ANN) Introduction, Part 2
Matching the performance of a human brain is a difficult feat, but techniques have been developed to improve the performance of neural network algorithms. Three of them are discussed in this post: distortion, mini-batch gradient descent, and dropout.
-
Artificial Neural Networks (ANN) Introduction, Part 1
This intro to ANNs looks at how we can train an algorithm to recognize images of handwritten digits. We will be using the images from the famous MNIST (Modified National Institute of Standards and Technology) database.
-
Random Forests® in Python
Random forest is a highly versatile machine learning method with numerous applications ranging from marketing to healthcare and insurance. This is a post about random forests using Python.