- Web Scraping for Dataset Curation, Part 2: Tidying Craft Beer Data - Feb 14, 2017.
This is the second part in a two-part series on curating data from the web. The first part focused on web scraping, while this post details the process of tidying scraped data after the fact.
- Web Scraping for Dataset Curation, Part 1: Collecting Craft Beer Data - Feb 13, 2017.
This post is the first in a two-part series on scraping and cleaning data from the web using Python. This first part is concerned with the scraping aspect, while the second part will focus on the cleaning. A concrete example is presented.
- Clean Data Science: Evaluating The Cleanliness of NYC Craft Beer Bar Kitchens - Jan 13, 2017.
An analysis of NYC Open Data health inspections showing that craft beer bar kitchens in Manhattan are cleaner than the average establishment by a statistically significant margin. An encouraging finding for Dry January.
- Neighbors Know Best: (Re) Classifying an Underappreciated Beer - Nov 24, 2016.
A look at beer features to determine whether a specific brew might be better served (pun intended) by being classified under a different style. kNN analysis supported with in-post plots and a linked IPython notebook.
- Central Limit Theorem for Data Science – Part 2 - Aug 16, 2016.
This post continues an explanation of the Central Limit Theorem started in a previous post, with additional details... and beer.
- Barley, Hops, and Bayes: Predicting The World Beer Cup - Jul 26, 2016.
This post covers predicting award counts by the United States in an international beer competition. Exploratory data analysis and Bayes methods are also covered.
- Cartoon: When Automation Goes Too Far - Apr 30, 2016.
KDnuggets Cartoon looks into the future of Automated Data Science and Marketing - when will automation go too far?
- Deep Learning Transcends the Bag of Words - Dec 7, 2015.
Generative RNNs are now widely popular, many modeling text at the character level and typically using an unsupervised approach. Here we show how to generate contextually relevant sentences and explain recent work that does it successfully.