Lionsolver Machine Learning Approach to Smartphone Data Wins Parkinson Data Challenge
Researchers from Lionsolver, Inc. won first prize in The Michael J. Fox Foundation's $10K Parkinson Data Challenge, which used smartphone data to monitor disease progression. In spite of the very sparse data, the winning entry could predict incidence and monitor disease progression with 100% accuracy on the competition data.
By Gregory Piatetsky-Shapiro, Apr 25, 2013.
The contest received a big response, and data was downloaded over 600 times by teams from 21 countries.
The LIONsolver team's winning entry provided proof of concept for a "machine learning approach" that could unveil clues to PD onset and progression embedded in data collected on smartphones. LIONsolver's project proved the feasibility and value of gathering mobile data for monitoring PD, while laying the groundwork for further analysis of larger, and potentially more powerful, datasets using LIONsolver's machine learning platform.
Although 100% accuracy is usually suspect in Data Science contests, it is not unreasonable in this case, since Parkinson's disease significantly affects patients' movement and makes them very different and recognizable. I know - I watched my father suffer from Parkinson's for 30 years.
LIONsolver founder Roberto Battiti explained in an email to me:
There was no overfitting; we used leave-one-out cross-validation. The classification performance was always evaluated on examples not considered during training.
Yes, there are indeed significant differences between healthy and sick subjects. This is validated also for more traditional statistical techniques, although with a lower performance.
Of course 100% is on a limited number of patients. In a more detailed paper we also estimate the error bars of the estimate.
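To make the two ideas in the email concrete, here is a minimal Python sketch of leave-one-out cross-validation followed by an error bar on the resulting accuracy. It is an illustration only, not the LIONsolver method: the feature values are made up, the nearest-centroid classifier is a simple stand-in, and the Wilson score interval is just one common way to put error bars on a proportion. The 16-subject count is taken from the comment quoted further down.

```python
import math

def nearest_centroid_predict(train_x, train_y, x):
    """Assign x to the class whose training mean is closest (stand-in classifier)."""
    centroids = {}
    for label in set(train_y):
        members = [v for v, y in zip(train_x, train_y) if y == label]
        centroids[label] = sum(members) / len(members)
    return min(centroids, key=lambda label: abs(x - centroids[label]))

def leave_one_out_accuracy(xs, ys):
    """Hold out each subject once, train on the rest, and score the held-out one."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if nearest_centroid_predict(train_x, train_y, xs[i]) == ys[i]:
            correct += 1
    return correct / len(xs)

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives about 95% coverage)."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half

# Hypothetical one-dimensional "tremor" feature for 16 subjects (invented values).
healthy = [0.8, 1.1, 0.9, 1.3, 1.0, 0.7, 1.2, 0.95]
parkinson = [2.9, 3.4, 3.1, 2.7, 3.6, 3.0, 3.3, 2.8]
xs = healthy + parkinson
ys = [0] * len(healthy) + [1] * len(parkinson)

accuracy = leave_one_out_accuracy(xs, ys)
low, high = wilson_interval(round(accuracy * len(xs)), len(xs))
print(f"LOO accuracy: {accuracy:.2f}, 95% interval: [{low:.2f}, {high:.2f}]")
```

On this toy data every held-out subject is classified correctly, yet the interval's lower end sits near 0.81. That is the point of the caveat: a perfect score on 16 subjects is compatible with a noticeably lower true accuracy.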
Here is the winning Lionsolver submission by Mauro Brunato, Roberto Battiti, Drake Pruitt, Enrico Sartori:
We demonstrate that a Machine Learning approach is superior to conventional statistical methods for the detection, monitoring and management of Parkinson's disease. In spite of the very sparse data of this specific Parkinson's diagnosis problem, we can predict incidence and monitor progression of the disease with 100% accuracy on the competition data. In addition to producing accurate detection, a machine learning approach paves the way for disruptive innovation in the monitoring and management of the disease.
Comments from the Web:
From Data Science Central
A press release states that a contest was won by the winner. Two others received honorable mention.
While the details provide an illuminating example of what can be done with common sensors and statistical methods, there's no big advance.
The winner's report is at goo.gl/aqJIE
In brief, smartphones monitored tremors in 16 subjects, half of whom had Parkinson's Disease. The winning contestant used their machine learning software on this data and just barely achieved separation in their statistical space between those with and without Parkinson's Disease. Other contest entries achieved similar results. Most used Support Vector Machines as their statistical tool. Nothing here is surprising.
The full list of entries is at goo.gl/m0zGy (note that the right-arrow icon gets you from individual entry "cover pages" to their reports and data).