Clean Data Science: Evaluating The Cleanliness of NYC Craft Beer Bar Kitchens

An analysis of NYC Open Data health inspections showing that craft beer bar kitchens in Manhattan are cleaner than the average establishment by a statistically significant margin. An encouraging finding for Dry January.

By Reginald Eps, EndlessPint.

One of the most natural things to offer alongside craft beer is food. On an Epicurean level, beer pairs well with just about any dish out there, and there are few things more adept at creating a convivial atmosphere than drinking and eating.

The food offered at craft bars is ever more spectacular and would be unrecognizable as pub fare if it weren’t for the dozen or so taps on the wall. It’s not just Buffalo wings and onion rings that can be appreciated alongside a pint nowadays.

Naturally, with food preparation of any kind, you want some reassurance about the quality of what ends up on your plate. That story begins long before the dish is put in front of you, just as it does for your pint.

We will not be going as far back as the farm in this piece, but we will take it back a couple of steps to the kitchen itself to see what we’re up against when we belly up to the bar of our choice for a pint and a bite. I’ll be focusing specifically on cleanliness and hygiene.

Like any self-respecting city of the 20th century, New York City conducts health inspections of, among other things, restaurants. Like any self-respecting 21st century city, NYC makes this data openly available. Inspections are unannounced yearly occurrences in which an establishment receives a certain number of points for each violation, critical or otherwise. The points are tallied per inspection and a final grade (A, B, or C) is based on the resulting score: the fewer points, the better.
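To make the scoring concrete, here is a minimal sketch of tallying violation points per inspection and turning the total into a letter grade. The cutoffs used (0 to 13 points for an A, 14 to 27 for a B, 28 or more for a C) follow the city's published grading scheme as I understand it and should be checked against the current rules. The record layout, the bar names, and the function names are hypothetical, and the sketch ignores re-inspections and pending grades.

```python
from collections import defaultdict


def grade_for_score(score):
    """Map an inspection's total violation points to a letter grade.

    Assumed cutoffs: 0-13 is an A, 14-27 is a B, 28 or more is a C.
    """
    if score < 0:
        raise ValueError("score cannot be negative")
    if score <= 13:
        return "A"
    if score <= 27:
        return "B"
    return "C"


def tally_inspections(violations):
    """Sum violation points per (establishment, inspection date)."""
    totals = defaultdict(int)
    for name, inspection_date, points in violations:
        totals[(name, inspection_date)] += points
    return dict(totals)


# Hypothetical records: (establishment, inspection date, points for one violation)
sample = [
    ("Bar One", "2016-03-01", 5),
    ("Bar One", "2016-03-01", 7),
    ("Bar Two", "2016-05-12", 10),
    ("Bar Two", "2016-05-12", 8),
]

totals = tally_inspections(sample)
grades = {key: grade_for_score(score) for key, score in totals.items()}
```

With these made-up records, the first bar totals 12 points and the second 18, so they land on opposite sides of the A/B cutoff.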

Craft or Bust?

Taking the top 30 craft beer bars of Manhattan as identified on FourSquare (Accessed: Dec 2016) I set about gathering their respective scores, code violations, and grades[1]. I did the same for all other inspected establishments in Manhattan, bar or otherwise.

The craft beer bar inspection average of 12.89 just qualifies for an average grade of A. The same average for all other establishments is 14.68. This difference of 1.79 pushes the other establishments over the points threshold into a Grade B average.

Data Source: NYC Open Data

On top of being a difference of more than 10%, it is a statistically significant one as well. This was established via a ho-hum t-test and by running a shuffling and sampling of the two populations. Not only are you much more likely to find better beer at these establishments than at a random place, but you’re also more likely to find a cleaner kitchen.
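For readers who want to try this kind of check themselves, here is a minimal sketch of both approaches using only the Python standard library. It assumes Welch's variant of the t statistic and reads the "shuffling and sampling" as a permutation test on the difference in means; the article does not say which variants were actually used. The scores below are invented for illustration and are not the inspection data.

```python
import random
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two samples with possibly unequal variances."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


def permutation_p_value(a, b, n_shuffles=2000, seed=0):
    """One-sided permutation p-value for mean(b) - mean(a).

    Pools both samples, reshuffles the labels, and counts how often the
    shuffled difference is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = mean(b) - mean(a)
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        diff = mean(pooled[len(a):]) - mean(pooled[:len(a)])
        if diff >= observed:
            hits += 1
    # Add one to numerator and denominator so the estimate is never exactly zero.
    return (hits + 1) / (n_shuffles + 1)


# Invented scores, NOT the real data: lower means cleaner.
craft_scores = [10, 11, 12, 9, 13, 10, 11, 12]
other_scores = [15, 16, 14, 17, 15, 18, 16, 14, 15, 17]

t_stat = welch_t(craft_scores, other_scores)
p_value = permutation_p_value(craft_scores, other_scores)
```

A negative t statistic here means the first group scored lower (cleaner) on average, and a small p-value means random relabeling rarely reproduces a gap that large.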

Below is a timeline of the 30 craft beer bars considered. Each horizontal line and its attending dots represent one establishment. The dots appear at specific inspection times (x-axis), may be multiple for a single inspection, and are colored according to severity (red for critical, yellow for not critical, and gray for not applicable). The lines themselves stretch from the earliest inspection date recorded to the most recent. In the couple of instances where there has been only one inspection to date, no line is present.

Data Source: NYC Open Data

The lines serve several purposes. First, they make the reading and grouping of the dots per establishment clearer. Second, they provide a convenient means of comparing length of inspection ranges between bars. Lastly, the lines themselves are color coded to convey information as to the grade carried by a bar. These colors match the NYC grade color scheme where meaningful. Thus the grades A, B, & C are rendered as blue, green, and orange, respectively (no “C’s”, hooray).

Inspection dates need not always result in a new or retained grade. I use white to indicate when no grade is given out for a specific inspection date. This color is extended until the next grade. As best as I can ascertain, grades are retained unless specifically changed. Even so, I prefer this color scheme because it at least differentiates the awarding of grades from the retaining of them. The color gray is also used, though difficult to see, for grades of “P”, passing. One particular feature of the open data that is not visualized is the violation points incurred per inspection[2].
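The coloring rule for the lines can be sketched as follows. This is an illustration of the logic described above, not the code behind the chart: the function name, the data layout, and the sample bar are hypothetical, and the drawing itself is left to whatever plotting library you prefer.

```python
from datetime import date

# Colors as described in the text; None stands for an inspection with no grade.
GRADE_COLORS = {"A": "blue", "B": "green", "C": "orange", "P": "gray", None: "white"}


def line_segments(inspections):
    """Turn one bar's inspections into (start, end, color) line segments.

    `inspections` is a list of (inspection_date, grade_or_None) pairs.
    Each segment runs from one inspection to the next and takes the color
    of the inspection that starts it, so white carries on until a grade
    is given. A bar with a single inspection yields no segment at all.
    """
    ordered = sorted(inspections, key=lambda item: item[0])
    segments = []
    for (start, grade), (end, _next_grade) in zip(ordered, ordered[1:]):
        segments.append((start, end, GRADE_COLORS[grade]))
    return segments


# Hypothetical bar: graded A, then an ungraded inspection, then graded A again.
example_bar = [
    (date(2015, 1, 10), "A"),
    (date(2016, 2, 3), None),
    (date(2016, 3, 1), "A"),
]
example_segments = line_segments(example_bar)
```

For the hypothetical bar above this gives a blue stretch from the first inspection to the second, followed by a white stretch up to the third.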

Further with Data

I leave the analysis of the cuisine itself for another time. A quick look around made me reconsider expanding the scope of this piece. Zagat might have offered some quick numbers to glean insight from, but in many instances that fine New York establishment does not even offer food ratings on these locales. So be it. Having now established some small appreciation for the quality of the preparation and handling of the food[3], it would be a natural extension to see about the quality of the chow itself.


[1] Actually, only 28 of the top 30 FourSquare locations were used. The other two either have no kitchen (Carmine Street Beers) or have never been inspected/cited (Milk & Hops). Either way, I replaced them with two substitute craft beer bars, Amity Hall & Malt House, to keep the number at 30.
[2] The inclusion of linewidth variation to portray violation scores proved, for the time being anyway, a bit too much to include.

[3] Tabular data per craft bar available in PDF document.

Original post. Reposted with permission.

Bio: Reginald Eps provides data analysis consulting for companies on both sides of the Atlantic and is currently based in Europe working in supply-side forecasting. He is the creator of EndlessPint.