How Data Science saves lives and helps combat obesity

We look at projects from the University of Chicago's Data Science for Social Good (DSSG) program, which help make the world a better place, and in particular at a measure that helps predict obesity.

From the personal growth curves above it can be seen that the three individuals vary greatly. Patient 1 generally follows the 85th percentile of the population, patient 2 starts at the 85th percentile but drops dramatically to the 50th percentile, and patient 3 experiences a significant increase in BMI, well beyond the population average. If the children who were obese at a young age simply remained obese, predicting who would be obese later would be easy, but as the graph shows, they found this not to be the case.

To predict future obesity more accurately, they searched for other measures that could be added to their analyses. The measure they encountered, adiposity rebound, describes how a child's BMI dips and then returns to its original course around the ages of 5 or 6. They wanted to test whether this measure was present in their dataset, so they calculated and plotted the BMI over all the years of follow-up for each of the 4,248 patients for whom they had height and weight measurements at age 5. The point at which the slope of a child's BMI curve turned from negative to positive was taken as the moment the child experienced adiposity rebound, and they were able to find adiposity rebound events for 1,035 of the patients.
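The slope rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project team's code: the function name `adiposity_rebound_age` is hypothetical, measurements are assumed to be sorted by strictly increasing age, and the rebound is taken to be the first measurement at which the slope between consecutive readings flips from negative to positive.

```python
from typing import Optional, Sequence


def adiposity_rebound_age(
    ages: Sequence[float], bmis: Sequence[float]
) -> Optional[float]:
    """Return the age at the first negative-to-positive slope change in BMI.

    Hypothetical helper: assumes `ages` is strictly increasing and that
    `ages` and `bmis` are the same length. Returns None if no rebound
    is found (for example, BMI only rises or only falls).
    """
    if len(ages) != len(bmis):
        raise ValueError("ages and bmis must have the same length")

    # Slope of BMI between each pair of consecutive measurements.
    slopes = [
        (bmis[i + 1] - bmis[i]) / (ages[i + 1] - ages[i])
        for i in range(len(ages) - 1)
    ]

    # The rebound is the measurement sitting between a falling segment
    # and a rising segment.
    for i in range(len(slopes) - 1):
        if slopes[i] < 0 and slopes[i + 1] > 0:
            return ages[i + 1]
    return None
```

A child whose BMI never dips returns `None`, which is one plausible reason a rebound could be found for only part of a cohort. Real measurements are noisy, so a more robust version would likely smooth the curve before looking for the sign change.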


They plotted each child's age at adiposity rebound on a histogram and found that it did usually occur around age 5 or 6, but was often much earlier, at 2 or 3, and sometimes as late as 8 or 9.


Then, with the adiposity rebound calculated where possible, they correlated the patients’ ages at adiposity rebound with their BMI at the end of their particular growth curves. The scatterplot showed that the younger a child is at adiposity rebound, the more likely they are to be obese in later childhood. To further validate the measure they ran regression models predicting BMI in later childhood: one with only initial BMI as the predictor, and one that included both initial BMI and age at adiposity rebound. Using only initial BMI, the model had an R squared of 0.37, and using only the age of adiposity rebound, the model had an R squared of 0.32. When both adiposity rebound and initial BMI were modelled, the R squared jumped to 0.65, meaning that adding this measure raised the explained variance from 37% to 65%, an increase of roughly three quarters.
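To make the R squared comparison concrete, here is a minimal sketch in plain Python. It is not the team's analysis. It relies on two standard identities for least-squares regression with an intercept: with one predictor, R squared equals the squared Pearson correlation, and with two predictors it can be computed from the three pairwise correlations. The function names and the toy numbers are made up for illustration and are not taken from the study.

```python
from math import sqrt
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def r_squared_one(x: Sequence[float], y: Sequence[float]) -> float:
    """R squared of a one-predictor linear regression of y on x."""
    return pearson(x, y) ** 2


def r_squared_two(
    x1: Sequence[float], x2: Sequence[float], y: Sequence[float]
) -> float:
    """R squared of a linear regression of y on x1 and x2 (with intercept).

    Assumes x1 and x2 are not perfectly correlated with each other.
    """
    r1, r2, r12 = pearson(x1, y), pearson(x2, y), pearson(x1, x2)
    return (r1**2 + r2**2 - 2 * r1 * r2 * r12) / (1 - r12**2)


# Made-up toy values, NOT data from the study.
initial_bmi = [15.2, 16.8, 14.9, 17.5, 16.1, 15.7]
rebound_age = [6.0, 3.5, 6.5, 3.0, 5.0, 4.0]
later_bmi = [17.0, 22.5, 16.4, 24.1, 19.0, 20.3]

print("initial BMI only:", round(r_squared_one(initial_bmi, later_bmi), 2))
print("rebound age only:", round(r_squared_one(rebound_age, later_bmi), 2))
print("both predictors: ", round(r_squared_two(initial_bmi, rebound_age, later_bmi), 2))
```

The combined R squared can never be lower than either single-predictor value. How much it rises depends on how much information the second predictor carries that the first does not, which is the question the team's comparison addresses.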


The project team aims to make its method of detecting adiposity rebound more robust and to further improve its models before they can be used in the field. The results, though, are promising, offering a possible way to tackle one of the greatest health challenges of our time, obesity, at the point when it can be most easily addressed: early childhood. Letting parents know that their child is at risk could significantly improve the chances of changing the entire family’s habits, tackling the problem before it begins through healthy eating and exercise.