Matt Mayo

Why the Data Scientist and Data Engineer Need to Understand Virtualization in the Cloud

By Matt Mayo on January 25, 2017 in Cloud, Data Engineer, Data Engineering, Data Science, Data Scientist, Virtualization
This article covers why data scientists and data engineers benefit from understanding virtualization constructs as they deploy their analyses onto all kinds of cloud platforms. Virtualization is a key enabling layer of software that these data workers need to understand in order to achieve optimal results.
Tidying Data in Python

By Matt Mayo on January 4, 2017 in Data Cleaning, Data Preparation, Pandas, Python
This post summarizes some of the tidying examples Hadley Wickham used in his 2014 paper on Tidy Data in R, and demonstrates how to reproduce them using the Python pandas library.
Ten Myths About Machine Learning, by Pedro Domingos

By Matt Mayo on January 3, 2017 in Machine Learning, Myths, Pedro Domingos
Myths on artificial intelligence and machine learning abound. Noted expert Pedro Domingos identifies and refutes a number of these myths, of both the pessimistic and optimistic variety.
Laying the Foundation for a Data Team

By Matt Mayo on December 28, 2016 in Analytics Team, Data Science Team, Team
Admittedly, there is a lot more to building a successful data team, and we would be lying if we pretended we have it all figured out. But hopefully focusing on the elements in this post is a good start.
4 Reasons Your Machine Learning Model is Wrong (and How to Fix It)

By Matt Mayo on December 21, 2016 in Bias, Overfitting, Variance
This post presents some common scenarios where a seemingly good machine learning model may still be wrong, along with a discussion of how to evaluate these issues by assessing metrics of bias vs. variance and precision vs. recall.
ResNets, HighwayNets, and DenseNets, Oh My!

By Matt Mayo on December 19, 2016 in Convolutional Neural Networks, Deep Learning, Neural Networks
This post walks through the logic behind three recent deep learning architectures: ResNet, HighwayNet, and DenseNet. Each makes it possible to successfully train deeper networks by overcoming the limitations of traditional network design.
The 5 Basic Types of Data Science Interview Questions

By Matt Mayo on December 16, 2016 in Data Science, Interview Questions, Springboard
Data science interviews are notoriously complex, but most of what they throw at you will fall into one of these categories.
Artificial Neural Networks (ANN) Introduction, Part 2

By Matt Mayo on December 9, 2016 in Algobeans, Deep Learning, Neural Networks
Matching the performance of a human brain is a difficult feat, but techniques have been developed to improve the performance of neural network algorithms, three of which are discussed in this post: distortion, mini-batch gradient descent, and dropout.
Artificial Neural Networks (ANN) Introduction, Part 1

By Matt Mayo on December 8, 2016 in Algobeans, Image Recognition, MNIST, Neural Networks
This intro to ANNs will look at how we can train an algorithm to recognize images of handwritten digits. We will be using the images from the famous MNIST (Modified National Institute of Standards and Technology) database.
Random Forests® in Python

By Matt Mayo on December 2, 2016 in Algorithms, Classification, Ensemble Methods, Python, random forests algorithm, Yhat
Random forest is a highly versatile machine learning method with numerous applications ranging from marketing to healthcare and insurance. This is a post about random forests using Python.
