Trick-or-Treat a Data Scientist

How would one infuse Data Science skills in children to optimize candy collection on Halloween? Zeeshan-ul-hassan Usmani (Founder & Chief Data Scientist, PredictifyMe) explains.

By Zeeshan-ul-hassan Usmani

“Dad, can you help me get more candy this Halloween?” Abdul Rehman, my 12-year-old, asked me the night before last Halloween. His simple question got me thinking about how I, as a data scientist, could optimize his path for collecting candy.

I had to tell my son to rely on his vampire costume this year, but if he could get me some Data Candy, I might be able to assist him next year. Let’s just define Data Candy as small pieces of information that could be readily collected by kids while trick-or-treating their way through the neighborhood. My son got a task.

As he went from door to door, he jotted down the number and brand of candy that he got from each house. That’s pretty quantitative, right? But wait: based on his brief presence at each doorstep, he also had to figure out whether it was a house with one or more daughters. Don’t worry, it gets creepier.

After a three-hour trick-or-treat spree, joined by my wife and his two siblings, Abdul Rehman came back with buckets full of candy from more than a hundred houses. I got the candy I was interested in – little handwritten post-its, all squeezed up against their rather dashingly sugary counterparts.

America spends $6.9 billion on Halloween each year, including $1.96 billion on decorations, $2.6 billion on costumes, and $330 million on pet costumes. Americans also spend $12.6 billion per year on chocolate alone.

As the kids went to bed, I started sorting both types of candy.

As I started populating a spreadsheet, I looked up related information in publicly available datasets and used whatever I found helpful in my analysis. For example, I got the exact house addresses from Google Maps; the value of each house and its monthly rent estimate from Zillow; and the names, genders, ages, and political affiliations of the people living in the houses my family visited from the public voter database. Birth states were mapped using the SSA’s publicly available database of popular names by state and decade, and candy brands were mapped to retail stores by checking the stores’ online inventories, keeping in view their proximity to our neighborhood.
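
The enrichment step described above is essentially a join on house address. The snippet below is only a minimal sketch of that idea, not the spreadsheet I actually used: the addresses, field names, and values are hypothetical placeholders, and `enrich` is an illustrative helper.

```python
# Hypothetical door-to-door notes; addresses, fields, and values are illustrative only.
notes = [
    {"address": "101 Elm St", "brand": "Twix", "pieces": 2},
    {"address": "103 Elm St", "brand": "KitKat", "pieces": 1},
    {"address": "105 Elm St", "brand": None, "pieces": 0},  # nobody answered
]

# Hypothetical lookup table built from a public source, keyed by address.
public_records = {
    "101 Elm St": {"est_value": 250000},
    "103 Elm St": {"est_value": 310000},
}


def enrich(notes, records):
    """Left-join each note with the public record for the same address."""
    rows = []
    for note in notes:
        extra = records.get(note["address"], {})  # empty dict if no match
        rows.append({**note, **extra})
    return rows


rows = enrich(notes, public_records)
```

Houses with no matching record simply keep their original fields, so no observation is lost when a public source has a gap.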

Here is what I found:

Out of 117 houses, 32 houses (27%) did not participate in Halloween – I have crossed them off this year’s optimized trick-or-treat path for my children.
The top choice for Halloween treats in our neighborhood turned out to be lollipops (making up 12.3% of all the treats), followed by Twizzlers, Snickers, M&M’s, Twix, Milky Way, Dove, KitKat, Whoppers, and Tootsie Roll (with only 3.4%).
The give-away treats that came to my house were from a total of 33 distinct brands. The chart suggests that little consideration was paid to health and calories (sugar kills anyway, right?): Butterfinger (at an ideal 45 calories) is not the star of the list, Starburst is about 160, Reese’s peanut butter comes to 180, while Twizzlers (at 150 calories per pack) ranks second among the most-given treats.
The households that my son identified as having one or more daughters showed a very clear difference in the treats and brands they gave away. Intuitively, brands like KitKat, Snickers, M&M’s, and Smarties shout out ‘boys’, while Hershey’s, Butterfinger, Twix, and Dove are the preferred brands of households with daughters. Since we don’t have a daughter, I could customize a route depending on whether my sons preferred to collect more KitKat than Twix, or the other way around.
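
To make the first finding concrete, here is a small sketch that recomputes the non-participation share and then builds a route that skips those houses. The house counts come from the findings above; the house IDs, coordinates, and the `greedy_route` helper are hypothetical, and the nearest-neighbor heuristic is just one simple way to order the stops, not necessarily the method behind my own route.

```python
import math

TOTAL_HOUSES = 117       # from the findings above
NON_PARTICIPATING = 32   # from the findings above

share = NON_PARTICIPATING / TOTAL_HOUSES
print(f"{share:.0%} of houses did not participate")

# Hypothetical houses on a made-up grid; only the idea matters here.
houses = [
    {"id": "A", "xy": (0, 1), "gave_candy": True},
    {"id": "B", "xy": (0, 2), "gave_candy": False},
    {"id": "C", "xy": (3, 1), "gave_candy": True},
    {"id": "D", "xy": (1, 1), "gave_candy": True},
]


def greedy_route(houses, start=(0, 0)):
    """Visit participating houses, always walking to the nearest one next."""
    remaining = [h for h in houses if h["gave_candy"]]
    route, here = [], start
    while remaining:
        nxt = min(remaining, key=lambda h: math.dist(here, h["xy"]))
        remaining.remove(nxt)
        route.append(nxt["id"])
        here = nxt["xy"]
    return route


route = greedy_route(houses)
print(route)
```

A greedy walk like this is not guaranteed to be the shortest possible path, but it is easy to explain to a 12-year-old, and the same filter could be swapped for a brand preference (say, only houses that gave KitKat).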