Why A/B Testers Have The Best Jobs In Tech


By Dan Woods, Forbes Contributor.

While preparing for the panel I am moderating, “A/B Testing Secrets Revealed: Uber, Etsy, & Intuit,” this Sunday 3/12 at SXSW, I realized that the panelists from Intuit, Uber, and Etsy have jobs that are central to the success of their companies. Are these the best jobs in tech? Hell, yes!

Learning about what these people do made it clear that when you are deeply involved in A/B testing at scale, there is a tremendous rush from doing so many different things that matter. Please join me and the panelists Gabrielle Gianelli of Etsy, Mita Mahadevan of Intuit, and Akash Parikh of Uber to learn more.

[Image: a statue of the goddess of Justice balancing the scales at the Rennes courthouse, May 19, 2015. Caption: Does your company have scalable scales to compare alternatives? Photo: Damien Meyer/AFP/Getty Images]

But to understand why, we need to take a step back and separate what an A/B testing program looks like at scale from the common understanding of basic A/B testing you might get from tools such as Optimizely, VWO, Google Experiments, Adobe Target, Qubit, and many other A/B testing offerings.

Basic A/B testing is usually focused on the UX. You are changing something to see if it helps users of your site or application better achieve their goals. If you are good at this, you pay attention to the statistics of what you are doing, but many people don’t, as Qubit pointed out in this fascinating paper in 2014: “Most Winning A/B Test Results are Illusory.”


A/B Testing at Scale Means Building a Platform

But A/B testing at scale is a whole different animal and usually is supported by custom technology.
First of all, it’s about making testing easy. There is no way that Etsy, Intuit, or Uber could run as many tests as they do if it were difficult. But to pull this off, you have to build a platform that can be integrated into the development and deployment of your technology. For the biggest players, commercial technology cannot meet their needs. They must build a platform to make it easy to design tests, to run them, to harvest the data, and to make initial sense of the results.

Building such a platform means by definition you are learning about the most crucial, high value parts of the user experience. Why? Because A/B testing is focused on what’s important, on what matters most to creating value.

To make A/B testing at scale easy, you must put the needed mechanisms in as part of the architecture and make it part of the development process. The Software Development Lifecycle must change for web sites, mobile apps, and any other way to deliver UX.

For example, Intuit built an experimentation platform and then had to integrate it across products in its entire ecosystem, from small business to personal finance, all while keeping sensitive data in-house. The company’s A/B testing platform now scales to over 120 apps and handles billions of API calls for tax filing alone. The project, called Wasabi, is now available as an open source A/B testing framework for other companies.
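One mechanism any such platform needs is a stable way to decide which variant a given user sees. Here is a minimal sketch, assuming hash-based bucketing; the function name `assign_variant` and its parameters are hypothetical and are not taken from Wasabi or any of the platforms mentioned here.

```python
import hashlib


def assign_variant(user_id, experiment, variants=("control", "treatment")):
    """Deterministically map a user to one variant of an experiment.

    Hypothetical helper for illustration only.
    """
    # Salting the hash with the experiment name means the same user can
    # land in different buckets in different experiments.
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    # Interpret the digest as an integer and fold it onto the variants.
    bucket = int(digest, 16) % len(variants)
    return variants[bucket]


if __name__ == "__main__":
    # The same user always gets the same answer for the same experiment.
    print(assign_variant("user-42", "new-checkout-button"))
```

Because the assignment is computed rather than stored, any service can reproduce it without a lookup, which is one reason this approach is common when tests have to be wired into many apps.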

So A/B testers have great jobs because they are platform builders.


A/B Testing At Scale Requires Statistical Smarts

Second, there is the challenge of statistical maturity. As the Qubit paper pointed out, it’s easy to fool yourself about whether your experiments rest on sound statistical footing. Tests must be well constructed. Control groups must be in place to determine when impact decays. This all must be built into the platform.
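To make “sound statistical footing” a little more concrete, here is a minimal sketch of one standard check, a two-sided two-proportion z-test on conversion counts. It assumes a fixed sample size decided in advance and a single comparison; the function name and the numbers are illustrative, and none of the companies on the panel are known to use exactly this test.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates.

    conv_a / n_a: conversions and visitors in the control group.
    conv_b / n_b: conversions and visitors in the treatment group.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both groups convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Made-up counts: 100 of 1,000 control users convert, 150 of 1,000 treated.
    z, p = two_proportion_z_test(100, 1000, 150, 1000)
    print(f"z = {z:.2f}, p = {p:.4f}")
```

A test like this is only valid if you do not keep checking it and stop the moment it looks good, which is exactly the kind of rule a platform should enforce so that individual testers do not have to remember it.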

Don’t feel too bad if you are not a stats whiz; even the experts struggle. At one point Optimizely published a paper explaining why it needed to update its approach to statistics and improve its platform to avoid common errors. See its blog post from January 2015 and the deeper explanation in the accompanying white paper for the details.

But it gets much more fun than this. When you are running dozens of tests on a large and complex site, it is quite possible that tests will interfere with each other. This means the platform mentioned above has to become even more sophisticated to make it clear how tests are interacting.
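One simple way to see whether two tests interfere is to compare the effect of the first test among users who are in the second test with its effect among users who are not. The sketch below assumes you already have a conversion rate for each of the four combinations; the function name and the rates are made up for illustration, and a real platform would also need to ask whether the difference is statistically meaningful.

```python
def interaction_effect(rates):
    """Difference in test A's lift with and without test B.

    `rates` maps (in_test_a, in_test_b) to a conversion rate.
    A result near zero suggests the two tests are not interfering.
    """
    effect_a_without_b = rates[(True, False)] - rates[(False, False)]
    effect_a_with_b = rates[(True, True)] - rates[(False, True)]
    return effect_a_with_b - effect_a_without_b


if __name__ == "__main__":
    # Made-up rates: test A adds two points on its own, but loses
    # a point when combined with test B.
    rates = {
        (False, False): 0.10,
        (True, False): 0.12,
        (False, True): 0.11,
        (True, True): 0.10,
    }
    print(f"interaction: {interaction_effect(rates):+.3f}")
```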

A/B testers must be sophisticated about statistics.


A/B Testing at Scale Means Deeply Understanding User Segments

A/B testing often focuses on specific segments of users, which has all sorts of complexities. First of all, agreeing on definitions of core segments can be a major battle. Marketing, product, customer service and so on will all have deeply held opinions about what segments matter and how to define them.

Second, the platform should allow users to develop their own segments based on both off-line data and online behavior.

The platform has to be aware of these segments and make it easy for tests to be applied to any combination of segments, or to exclude certain segments.
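As a rough illustration of what “applied to any combination of segments, or to exclude certain segments” might look like, here is a minimal sketch. It assumes segments are plain string labels attached to each user, that a user qualifies by belonging to any one of the included segments, and that exclusions always win; the function name `is_eligible` and the segment names are hypothetical.

```python
def is_eligible(user_segments, include=None, exclude=()):
    """Decide whether a user may enter a test, based on segment labels.

    include: segments the test targets (None means everyone).
    exclude: segments that are never allowed in, even if also included.
    """
    segments = set(user_segments)
    # Exclusions take priority over inclusions.
    if segments & set(exclude):
        return False
    if include is None:
        return True
    # The user qualifies by belonging to at least one targeted segment.
    return bool(segments & set(include))


if __name__ == "__main__":
    user = {"mobile", "new_user"}
    print(is_eligible(user, include={"mobile"}, exclude={"employee"}))
```

The hard part, as noted above, is not this check but getting the company to agree on what the labels mean.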

A/B testers must deeply understand the customer.


A/B Testing at Scale Means Testing both UX and Business Policies for Key Processes

A/B testing is focused on what matters most. But how do you figure out what matters most? And how long will anything matter most in our rapidly changing world? 

The issues faced at Etsy, Intuit, and Uber five years ago are far different from what matters now. A/B testers are at the coal face of what matters, not just for UX but also for business policies.

In prep for the panel, it became clear that testing business policies was a crucial and complex area. Often the alternatives cannot be tested simultaneously. For some policies, you cannot change the policy just for one segment without causing problems. Business policies are also influenced by regulation and local custom.

A/B testers become experts in UX and business policies that matter.


A/B Testing at Scale Means Training the Rest of the Company

The biggest victories come when A/B testing is not just done by a center of excellence, but is available to the entire company. But, as we have seen, A/B testing must be done correctly to be effective. The platform must have good practices built in, and those using the platform must be trained to use the platform to get good results.

In this way A/B testers must be part of a large-scale program of organizational change. The company must first be sold on the value of A/B testing, then the platform must productize testing practices, and a large number of people must understand how to use the platform correctly.

The introduction of A/B testing often disrupts current power structures, and may be resisted by those who now have to pay attention to data rather than live by their wits. As a friend of mine who worked at Google once told me: “Google doesn’t have a large middle management layer. Because the company has so much data and knows how to use it, data plays the key decision making roles of middle management.” Few companies introduce data as a decision making force without causing disruption.

A/B testers must become experts in large-scale organizational transformation, and all that that implies.

So there you have it. Teams of A/B testers running programs at scale are doing everything mentioned above. If there is a more fascinating job in tech, I don’t know what it is. See you at the panel.

Bio: Dan Woods finds technology that matters for early adopters. Follow him on Twitter (@danwoodsearly) and LinkedIn, and read his blog here.

Original. Reposted with permission.