Developing a solid intuition about the statistical concepts that drive the interpretation of a basic A/B test is a powerful thing. Explore an A/B test simulator to lock in that intuition!
One of the challenges of successfully running an experimentation program is that experimentation relies on statistics, and statistics can be confusing! Marketers are motivated to make data-informed decisions, and the idea of an A/B test (split the universe in two, deliver experience A to one half and experience B to the other, and see which performs better) is an excellent means of doing just that!
Except…
Except…
Except…we can’t actually split the universe. We split “the people who came to our site (or used our mobile app, or to whom we sent an email)” into two (or more) groups. Then, we use the data we collect from that split to make inferences (#fancystatisticsword) about the performance and impact of the different experiences we tested.
Some of the misperceptions that we run into with our clients (and even with some of our industry colleagues!) include:
- “The confidence level is how sure we are that the results we observed are accurate.” Well…not really.
- “The run-time for the test is designed to collect enough data to quantify how much of a ‘lift’ one variation delivers over another.” Mmmm… rarely!
- “A well-designed test will definitively show which variation performs better.” Not…quite.
- “With a well-designed test, it’s both straightforward and safe to estimate an annualized revenue impact.” Oooh… danger! Sharpen up your favorite Caveat Pencil if you’re going to do that!
At the root of many of these misperceptions is a misunderstanding of the fundamental role of statistics. At its core, statistics is (emphasis added):
"the practice or science of collecting and analyzing numerical data in large quantities, especially for the purpose of inferring proportions in a whole from those in a representative sample." [Source: Oxford Languages]
It turns out that the “whole” is a bit of a sticky wicket. What is “the population” when we’re conducting an A/B test? Arguably, it is “all users from now until some point in the medium-to-distant future.” Pinning down the “true” conversion rate of such a vaguely defined population gets tricky!
In a Simulated World, We Can Know the Truth
One way to develop some intuition about statistics is to explore them in a world where the population is known, even if that world is one entirely of our own making!
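For instance (and this is just an illustrative Python sketch, not the simulator itself, with a made-up conversion rate and sample size), we can decree the true conversion rate of a tiny make-believe world and then watch what samples drawn from it look like:

```python
import random

random.seed(42)

TRUE_CONVERSION_RATE = 0.03   # we get to decree the "truth" in our made-up world
VISITORS_PER_SAMPLE = 5_000   # size of each simulated batch of traffic

# Draw several samples from the same population and watch the observed
# conversion rate wobble around the true rate we chose.
for sample in range(1, 6):
    conversions = sum(
        random.random() < TRUE_CONVERSION_RATE
        for _ in range(VISITORS_PER_SAMPLE)
    )
    observed_rate = conversions / VISITORS_PER_SAMPLE
    print(f"Sample {sample}: observed conversion rate = {observed_rate:.4f}")
```

Every sample comes from the same population with the same true rate, yet each one reports a slightly different observed rate. In the real world, we only ever get to see one such sample, and we never get to peek at the truth.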
That idea was the inspiration behind Search Discovery’s A/B test simulator.
The simulator is purely an educational tool. We use it as we’re establishing optimization programs with our clients, but our hope is that analysts can use it to improve their own intuition and, perhaps, to improve the statistical intuition of their own stakeholders.
The tool has three educational goals (illustrated with a code sketch after this list):
- Help the user develop an understanding of distributions of sample means.
- Help the user develop intuition around statistical significance when it comes to detecting an effect.
- Help the user develop intuition around the challenges of estimating the size of an effect.
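To make those goals concrete, here is a minimal Python sketch (again, not the simulator itself; the true rates, traffic volumes, number of replays, and the choice of a simple two-proportion z-test are all our own illustrative assumptions) that replays the same A/B test many times in a world where variation B genuinely is better:

```python
import random
import statistics

random.seed(2024)

# Made-up "truth" for our simulated world: B really is better than A.
RATE_A, RATE_B = 0.030, 0.033      # true conversion rates (a 10% relative lift)
VISITORS_PER_ARM = 10_000          # traffic each variation receives per test
SIMULATED_TESTS = 1_000            # how many times we replay the whole experiment


def run_one_test():
    """Simulate one A/B test and return the observed conversion rate for each arm."""
    conv_a = sum(random.random() < RATE_A for _ in range(VISITORS_PER_ARM))
    conv_b = sum(random.random() < RATE_B for _ in range(VISITORS_PER_ARM))
    return conv_a / VISITORS_PER_ARM, conv_b / VISITORS_PER_ARM


def is_significant(p_a, p_b, n, alpha=0.05):
    """Two-sided, two-proportion z-test using only the standard library."""
    pooled = (p_a + p_b) / 2                      # arms receive equal traffic
    se = (2 * pooled * (1 - pooled) / n) ** 0.5   # standard error of the difference
    z = (p_b - p_a) / se
    p_value = 2 * (1 - statistics.NormalDist().cdf(abs(z)))
    return p_value < alpha


significant_lifts = []
for _ in range(SIMULATED_TESTS):
    p_a, p_b = run_one_test()
    if is_significant(p_a, p_b, VISITORS_PER_ARM):
        significant_lifts.append((p_b - p_a) / p_a)   # observed relative lift

true_lift = (RATE_B - RATE_A) / RATE_A
print(f"True relative lift:               {true_lift:.1%}")
print(f"Tests reaching significance:      {len(significant_lifts) / SIMULATED_TESTS:.1%}")
if significant_lifts:
    print(f"Average observed lift (winners):  {statistics.mean(significant_lifts):.1%}")
```

Even though B truly outperforms A in this made-up world, only a fraction of the simulated tests reach significance, and the tests that do reach it tend to report a lift larger than the true one. That is precisely the kind of intuition the simulator is built to develop.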
The simulator has documentation and explanations built into it, but we have also put together the following video that explains and demos the tool: