When good data analyses fail to deliver the results you expect
To all those Data Scientists out there who thrive on discovering actionable insights from your data (all of you, right?), heed this cautionary tale of a data analysis, a dashboard, and a huge waste of resources.
By Brittany Davis, Data at Narrator.ai.
Any good data analyst will spend their days asking, “What action (X) impacts my bottom line (Y)?” We explore the data and rigorously test each hypothesis. When we finally discover that X influences Y, we run to our team and convince them to focus on X.
“Get more customers to join our loyalty program to improve LTV by 30%!” we say.
Finding this insight and convincing the team to act upon it is a feat in itself — so we go home feeling proud of a hard day’s work. Maybe we even set up a nice dashboard to help them track their progress against this new target. Each week they’ll track the % of customers in the loyalty program and devise new tactics to improve that number. So data-driven!
This is the dream. We found an insight, and we made a meaningful impact… didn’t we? Yet months later, LTV remains relatively unchanged. How is that possible? What happened?
The analysis was solid, and our KPI was a good metric to track. But this is where things fell apart: we tracked our progress against the KPI while failing to track the assumptions that led us to focus on that KPI in the first place. There are a few ways to solve this problem, which we’ll talk about later on.
Time to First Response: A Cautionary Tale
In my last role, I spent a lot of time analyzing support tickets. Our support team was laser-focused on “time to first response.” It was so ingrained in the operational model that we even had leaderboards to celebrate the reps with the fastest response times.
I’m not kidding — some people would take their laptops into the bathroom so they could respond right away if a request came in.
This goal was born out of great intentions. We looked at the data and saw that tickets with shorter response times had higher satisfaction scores.
So… “Answer tickets faster to improve customer satisfaction rates,” we said! And they did.
But satisfaction rates didn’t improve — they actually declined.
“Time to first response” used to be a good indicator for customer satisfaction, but once it became a target, the relationship between “time to respond” and “customer satisfaction” fell apart.
This happens all the time! The popular book The Tyranny of Metrics is full of stories just like mine.
We do an analysis to understand how a customer behaves. A metric becomes a KPI, it goes into a dashboard, and the assumptions that got us there are forgotten. We put on our blinders and focus on optimizing that metric at any cost.
When we change our behavior, we also change our customer’s behavior, so the data powering the entire analysis starts shifting. The assumptions that were once true are now wrong. And as a result, we don’t see the outcomes we want.
There’s actually a term for this phenomenon: Goodhart’s Law:
“When a measure becomes a target, it ceases to be a good measure.”
Meaning that when you start to optimize for a metric, you run the risk of fundamentally changing the value it represents.
Sadly, this means that most teams will continue their days blissfully unaware that their KPI (which was at one point meaningful) has now become a vanity metric.
Don’t fall into this trap.
What can we do about it?
Track the context that led to each decision!
This means monitoring the KPI and the context of how you came to that conclusion.
With each analysis, there is a set of criteria that needs to be met before you can claim that the findings are reliable:
- Does a change in X relate to a change in Y? Yes (perhaps verified with a hypothesis test for statistical significance).
- Has that relationship been consistent over time? Yes.
Ok great! It’s reasonable to assume that if we change X, then it will result in a change in Y.
But if any of the answers to these questions change, then you need to update your recommendation.
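The checks above can be sketched in code. Here’s a minimal, hypothetical example of re-validating the X/Y relationship on fresh data each week — the data, the 0.5 correlation threshold, and the function names are illustrative assumptions, not part of the original analysis:

```python
from math import sqrt

def pearson_r(xs, ys):
    """Plain Pearson correlation between two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def assumption_holds(xs, ys, threshold=0.5):
    """Does X still relate to Y strongly enough to keep recommending it?"""
    return abs(pearson_r(xs, ys)) >= threshold
```

Run something like this against the most recent window of data on a schedule; the moment `assumption_holds` flips to False, the recommendation should be revisited rather than left sitting in a dashboard.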
How do we do this?
Create live analyses (not dashboards) to capture the context
As data folks, our go-to option for monitoring data over time is a dashboard. Create the relevant plots and put them in a dashboard. Set it and forget it.
To capture context, we need to create more than just a collection of dashboards. No one knows what to do with a pile of uninterpreted plots.
What happens six months from now, when you’re focused on a totally different project, the underlying assumptions have changed, and your team has forgotten how each plot influences the recommendation?
We love dashboards because they’re live.
But we need analyses because the plots are already interpreted, and the takeaways are actionable.
So what if we could have the benefits of both? We would have live analyses.
Live analyses preserve the context of our decision-making criteria while keeping the insights fresh and timely, so our team can reliably and confidently reach the same conclusions as if we were there interpreting the plots for them.
Time to First Response (as a live analysis)
A live analysis would have kept us from falling into the trap of optimizing for a vanity metric like Time to First Response. As soon as the underlying assumptions changed, we would have the context alongside each plot to know exactly what happened and why we shouldn’t focus on time to first response anymore.
The Customer Support analysis would tell us:
- How did we arrive at “Time to First Response” as our KPI?
- What assumptions did we check?
- Are those assumptions still true?
And when those assumptions failed, the recommendation would have updated to “Don’t focus on decreasing first response times, as it is not impactful because XYZ is no longer true.”
We’d be able to see this change and respond quickly! No more “secret” vanity metrics.
At Narrator, we use “Narratives” to share live analyses in a story-like format.
At Narrator, every analysis we create is a live analysis. Each one captures our recommendations and the steps we took to get there. Alongside each plot, we include context: its interpretation and why it matters. We’ve even figured out a way to condition the interpretations on the data in the plot, using variables and if-statements under the hood.
So if the data changes, the recommendations will update, making it impossible for a metric to become a vanity metric without us realizing it.
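As a rough sketch of how such a conditional interpretation might work under the hood (the function, thresholds, and wording here are hypothetical illustrations, not Narrator’s actual implementation):

```python
def interpret_plot(r, p_value, strength=0.5, alpha=0.05):
    """Choose the narrative text shown next to a plot based on its current data.

    r: correlation between time-to-first-response and satisfaction
    p_value: significance of that correlation
    """
    if p_value < alpha and r <= -strength:
        # Faster responses (lower times) still go with higher satisfaction.
        return "Shorter first-response times still track with higher satisfaction; keep the KPI."
    return "The response-time/satisfaction relationship no longer holds; revisit this KPI."
```

Because the text is chosen from the live data each time the analysis runs, the takeaway updates itself instead of quietly going stale.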
Since we adopted this approach of creating live analyses in a story-like format, we’ve never been blindsided by hidden shifts in the data. We’ve created analyses that deliver value year after year and will tell us a new story as our customers’ behaviors change.
Teams that are serious about making impactful decisions with data need live insights to ensure they’re always optimizing for the right thing. Don’t repeat the mistakes I made.
Monitor the data, provide the context, and validate each assumption over time to ensure your team is always focused on something impactful.
Original. Reposted with permission.