
KDnuggets Home » News » 2015 » Oct » Opinions, Interviews, Reports » Five Principles for Applying Data Science for Social Good ( 15:n33 )

Five Principles for Applying Data Science for Social Good

Well-meaning data scientists often fail to reach their full potential when working for social good. The following 5 principles can help improve this situation.

3. Communication is more important than technology

We must foster environments in which people can speak openly, honestly, and without judgment. We must be constantly curious about each other.

At the conclusion of one of our recent DataKind events, the leaders of one of our partner nonprofit organizations lined up to hear the results from their volunteer team of data scientists. Everyone was all smiles - the nonprofit leaders had loved the project experience, and the data scientists were excited about their results. The presentations began. "We used Amazon Redshift to store the data, which allowed us to quickly build a multinomial regression. The p-value of 0.002 shows..." Eyes glazed over. The nonprofit leaders furrowed their brows in telegraphed concentration. The jargon was standing in the way of understanding the true utility of the project's findings. It was clear that, like so many other well-intentioned efforts, the project was at risk of gathering dust on a shelf if the team of volunteers couldn't help the organization understand what they had learned and how it could be integrated into the organization's ongoing work.

In many of our projects, we've seen telltale signs that people are talking past each other. Social change representatives may be afraid to speak up if they don't understand something, either because they feel intimidated by the volunteers or because they don't feel comfortable asking things of volunteers who are so generously donating their time. Similarly, we often find volunteers who are excited to try out the most cutting-edge algorithms they can on these new datasets, either because they've fallen in love with a certain Recurrent Neural Net model or because they want a dataset to learn it on. This excitement can cloud their efforts, and the work can get lost in translation. It may be that a simple bar chart is all that is needed to spur action.

Lastly, some volunteers assume nonprofits have the resources to operate like the for-profit sector. Nonprofits are, more often than not, resource-constrained, understaffed, underappreciated, and trying to tackle the world’s problems on a shoestring budget. Moreover, "free" technology and "pro bono" services often require an immense time investment on the nonprofit professionals' part to manage and be responsive to these projects. They may not have a monetary cost, but they are hardly free.

Socially-minded data science competitions and fellowship models will continue to thrive, but we must build empathy - strong communication through which diverse parties gain a greater understanding of and respect for each other - into those frameworks. Otherwise we'll forever be "hacking" social change problems, creating tools that are "fun," but not "functional."

4. We need diverse viewpoints

To tackle sector-wide challenges, we need a range of voices involved.

One of the most challenging aspects of making change at the sector level is the range of diverse viewpoints necessary to understand a problem in its entirety. In the business world, profit, revenue, or output can be valid metrics of success. Rarely, if ever, are metrics for social change so cleanly defined.

Moreover, any substantial social, political, or environmental problem quickly expands beyond its bounds. Take, for example, a seemingly innocuous challenge like "providing healthier school lunches." What initially appears to be a straightforward opportunity to improve the nutritional offerings available to schools quickly involves the complex educational budgeting system, which in turn is determined through even more politically fraught processes. As with most major humanitarian challenges, the central issue is like a string in a hairball wound around a nest of other related problems, and no single strand can be removed without tightening the whole mess. Oh, and halfway through you find out that the strings are actually snakes.

Challenging this paradigm requires diverse, or "collective impact," approaches to problem solving. The idea has been around for a while (h/t Chris Diehl), but has not yet been widely implemented because successful collective impact is hard to achieve. Moreover, while there are many diverse collectives committed to social change, few have the voice of expert data scientists involved. DataKind is piloting a collective impact model called DataKind Labs, which seeks to bring together diverse problem holders, data holders, and data science experts to co-create solutions that can be applied across an entire sector-wide challenge. We just launched our first project with Microsoft to increase traffic safety and are hopeful that this effort will demonstrate how vital a role data science can play in a collective impact approach.

5. We must design for people

Data is not truth, and tech is not an answer in-and-of-itself. Without designing for the humans on the other end, our work is in vain.

So many of the data projects making headlines - a new app for finding public services, a new probabilistic model for predicting weather patterns for subsistence farmers, a visualization of government spending - are great and interesting accomplishments, but don't seem to have an end user in mind. The current approach appears to be "get the tech geeks to hack on this problem, and we'll have cool new solutions!" I've opined that, though there are many benefits to hackathons, you can't just hack your way to social change.

A big part of that argument centers on the fact that the "data for good" solutions we build must be co-created with the people at the other end. We need to embrace human-centered design and begin with the questions, not the data. We have to build with the end in mind. When we tap into the social issue expertise that already exists in many mission-driven organizations, there is a powerful opportunity to create solutions that make real change. However, we must make sure those solutions are sustainable given the resource and data literacy constraints that social sector organizations face.

That means that we must design with people, accounting for their habits, their data literacy level, and, most importantly, for what drives them. At DataKind, we start with the questions before we ever touch the data and strive to use human-centered design to create solutions that we feel confident our partners are going to use before we even begin. In addition, we build all of our projects off of deep collaboration that takes the organization's needs into account, first and foremost.

These problems are daunting, but not insurmountable. Data science is new, exciting, and largely misunderstood, but we have an opportunity to align our efforts and proceed forward together. If we incorporate these five principles into our efforts, I believe data science will truly play a key role in making the world a better place for all of humanity.

What's next

Almost three years ago DataKind launched on the stage of Strata + Hadoop World NYC as Data Without Borders. True to their motto to "work on stuff that matters," O'Reilly has not only been a huge supporter of our work, but arguably one of the main reasons that our organization can carry on its mission today.

That's why we could think of no place more fitting to make our announcement that DataKind and O'Reilly are formally partnering to expand the ways we use data science in the service of humanity. Under this media partnership, we will be regularly contributing our findings to O'Reilly, bringing new and inspirational examples of data science across the social sector to our community, and giving you new opportunities to get involved with the cause, from volunteering on world-changing projects to simply lending your voice. We couldn't be more excited to be sharing this partnership with an organization that so closely embodies our values of community, social change, and ethical uses of technology.

We'll see you on the front lines!

Bio: Jake Porway is a machine learning and technology enthusiast who loves nothing more than seeing good values in data. He is the founder and executive director of DataKind, an organization that brings together leading data scientists with high impact social organizations to better collect, analyze, and visualize data in the service of humanity.


