Data Literacy: Using the Socratic Method

How can organizations and individuals promote Data Literacy? Data literacy is all about critical thinking, so the time-tested method of Socratic questioning can stimulate high-level engagement with data.

By Aarzoo Sidhu, The Data Thinker

The growing value of data as a business asset is undeniable. It has been described as “the new gold” and “the new oil.” Organizations around the world, regardless of the industry and size, are investing heavily in data infrastructure to get their hands on this commodity.

Unlike gold and oil, which are precious because they are rare, data is ubiquitous. So the data itself is not what is valuable, since it is everywhere. The value lies in an organization’s ability to engage with it meaningfully and extract business insights from it. And therein lies the gap.

Most people who are not experts in data science (or in one of the related fields) do not know how to think critically in the face of data.

There is terror in numbers. Humpty Dumpty’s confidence in telling Alice that he was master of the words he used would not be extended by many people to numbers.
- Darrell Huff, How to Lie with Statistics

Gartner supports this sentiment and paints a rather bleak picture of the future:

By 2020, 50% of organizations will lack sufficient AI and data literacy skills to achieve business value.

If it is not providing business value, then data is useless. It goes from being the new gold to the new dirt (very costly dirt). So, in addition to investing in data infrastructure, organizations need to prioritize Data Literacy.


Data Literacy

In the literature, there are many different views on what constitutes data literacy and how it differs from information literacy. To keep things simple for our current discussion, I like MIT's definition:

Data literacy includes the ability to read, work with, analyze and argue with data.
- R. Bhargava and C. D’Ignazio, Designing Tools and Activities for Data Literacy Learners

Instead of a curriculum or topic-based approach to define data literacy, this skill-based definition is more adaptable and scalable.

Now that we have a definition, let's talk about the HOW.

How can organizations promote data literacy? What is the best way to develop these skills in employees from a variety of educational backgrounds and experiences? How can data literacy be incorporated in professional development at work, outside of a formal education system?

I cannot teach anybody anything. I can only make them think.
- Socrates


Socratic Questioning

The key is to recognize that at the core of it, data literacy skills are critical thinking skills. And, good questions are the key to developing critical thinking. No one understood this better than Socrates. He believed that:

The disciplined practice of thoughtful questioning enables the scholar/student to examine ideas and be able to determine the validity of those ideas.
- What is Socratic Questioning

This method of stimulating higher-level thinking is known as Socratic Questioning (or a Socratic Seminar).

The keywords “disciplined” and “thoughtful” distinguish Socratic questioning from the general act of asking questions. Organized and deliberate questions not only help examine the information in front of you, but also help reflect on your own thinking about that information (metacognition). This kind of reflective thinking helps trace the path taken from information to conclusion and exposes any assumptions made along the way.

In their paper “Socratic Questioning,” Paul and Binker draw the link between Socratic questioning and critical thinking as follows:

Use of Socratic questioning presupposes the following points: All thinking has assumptions; makes claims or creates meaning; has implications and consequences; focuses on some things and throws others into the background; uses some concepts or ideas and not others; is defined by purposes, issues, or problems; uses or explains some facts and not others; is relatively clear or unclear; is relatively deep or superficial; is relatively monological or multi-logical. Critical thinking is thinking done with an effective, self monitoring awareness of these points.
- Richard Paul & A. J. A. Binker, The Critical Thinking Handbook Series

Further, Linda Elder and Richard Paul of the Foundation for Critical Thinking identify the following six types of Socratic questions that stimulate high-level thinking:

  1. Questions that clarify.
  2. Questions that challenge assumptions.
  3. Questions that examine evidence or reasons.
  4. Questions about viewpoints and perspectives.
  5. Questions that explore implications and consequences.
  6. Questions about the question.


Socratic Questioning in Data Literacy

This method is used in law schools around the country to teach how to expose any logical fallacies in arguments. The beauty of this framework is that it can be adapted to any topic of interest. In our case, Data!

When used by individuals to probe a piece of information on their own, it strengthens their understanding of the data, exposes logical pitfalls, and drives insight.

An even more effective application of Socratic questioning is in stimulating a guided discussion among the stakeholders of a data project. By examining data together and reasoning through it together, the group can impart more context to it and construct a stronger statistical narrative, all the while developing their data literacy skills. Since the goal of a Socratic seminar is to think better, who better to lead the discussion than a Data thinker?

A Socratic Seminar can also help identify gaps in individual knowledge, promote curiosity, and instill intellectual humility. The gaps in knowledge and topics of data literacy that are uncovered in this discussion can then be built upon using more traditional lectures and tutorials.


Sample Questions

Below are example questions for each category. Not every example question listed here will be applicable to every situation and some questions may fall in more than one category. The main goal should be to ask questions from all six categories.

1. Questions that clarify.

  • What is the question I want to answer with this statistic?
  • What does the statistic mean?
  • What are the units?
  • What is the underlying data for this statistic?
  • What do typical data points look like? What do extreme values look like?
  • What is the time range?
  • What does the graph show? How are the x-axis and y-axis labeled? Is the title/legend appropriate?
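
Several of the clarifying questions above can be answered with a quick summary of the underlying data. The sketch below is one illustrative way to do that in Python using only the standard library. The function name `describe`, the unit label, and the order values are all made up for this example and do not come from any real dataset.

```python
import statistics


def describe(values, unit):
    """Summarize typical and extreme values for a list of numbers.

    `unit` is only a label, kept so the summary never loses its units.
    """
    ordered = sorted(values)
    # Quartile cut points; the spread between them describes "typical" values.
    q1, _, q3 = statistics.quantiles(ordered, n=4)
    return {
        "unit": unit,
        "count": len(ordered),
        "min": ordered[0],
        "median": statistics.median(ordered),
        "mean": statistics.fmean(ordered),
        "max": ordered[-1],
        "iqr": q3 - q1,
    }


# Hypothetical daily order values, with one unusually large order.
order_values = [20, 22, 21, 23, 19, 24, 250]
summary = describe(order_values, unit="USD")

for name, value in summary.items():
    print(f"{name}: {value}")
```

In this made-up data the mean sits far above the median, which is a prompt to ask what the extreme value is and whether the "average order" is a fair description of a typical one.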

2. Questions that challenge assumptions.

  • Is a higher or lower value better?
  • Is it in line with my expectations?
  • What assumptions am I making?
  • What assumptions did the analyst make? Were the assumptions tested?
  • Is there any sampling bias? Desirability bias? Selection bias? Survivorship bias? Confirmation bias? Predictive bias? Anecdotal bias?

3. Questions that examine evidence or reasons.

  • How strong is the evidence?
  • Is the result statistically significant? Is it relevant to the business?
  • What is missing? Any missing data? Missing measures of variation? Missing measures of uncertainty?
  • How strong is the correlation? How can causality be explored?
  • What would make me feel more/less confident about this analysis?
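
To make the difference between statistical significance and business relevance concrete, here is a minimal sketch in Python. It assumes a simple comparison of two conversion rates using a pooled two-proportion z-test, which is only one of several reasonable tests. The function name `two_proportion_z_test` and all of the counts are hypothetical.

```python
import math


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Return (difference in rates, two-sided p-value) for a pooled z-test."""
    rate_a = successes_a / n_a
    rate_b = successes_b / n_b
    # Pooled rate under the assumption that both groups share one true rate.
    pooled = (successes_a + successes_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_b - rate_a, p_value


# Hypothetical experiment: one million visitors per variant.
difference, p_value = two_proportion_z_test(
    successes_a=100_000, n_a=1_000_000,
    successes_b=101_000, n_b=1_000_000,
)

print(f"Difference in conversion rate: {difference:.4f}")
print(f"Two-sided p-value: {p_value:.4f}")
```

With samples this large, a lift of a tenth of a percentage point comes out below the conventional 0.05 threshold. Whether such a small lift justifies any action is a business question that the p-value cannot answer.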

4. Questions about viewpoints and perspectives.

  • What additional statistics would be helpful?
  • What would the statistic look like if it supported the opposing view?
  • Which of the conflicting views has more evidence?
  • If none of this data were available, what decision would I make?
  • What does this statistic look like for my competitor?
  • What do I want this statistic to be?

5. Questions that explore implications and consequences.

  • What are the implications of this statistical finding? What is the business impact?
  • Do other statistics support this finding? What would a contradicting statistic look like?
  • Can I act upon this finding?
  • Does it impact any actions currently in place?
  • Does it impact any decisions made in the past?
  • If there is uncertainty, how should I act under that uncertainty?
  • Does this support/refute what I already know about this issue?
  • How does it tie in with related findings from other data projects?

6. Questions about the question.

  • What was the most difficult question to answer?
  • What made it a difficult question to answer?
  • Is there a question that came up multiple times?
  • How can the dialog be summarized?

Bio: With formal education in Biostatistics and 8 years of experience in managing and analyzing data, Aarzoo Sidhu, author of The Data Thinker, is now on a journey to go beyond data science, with the goal of exploring the multiple paths that can be taken to go from data to insight.