A/B Testing

An A/B test is a controlled experiment, run on a website for example, that measures the impact of a given change.

What is A/B Testing?

A/B testing is a controlled experiment that you usually run on a website. Imagine you own a shop and wonder if changing the color of the "Buy Now" button from orange to blue will get more people to click on it. A/B testing can help you find out.

In this test, you have two groups:

  • Control Group: Sees the old (orange button) website.
  • Test Group: Sees the new change (blue button).

You then measure how each group behaves and use that data to decide whether the change was beneficial. Is the Test Group more likely to click the "Buy Now" button? Your decision about the button color is based on this real-world data.
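As a rough sketch of how that split can work in practice, one common approach is to assign each visitor to a group deterministically by hashing their user ID, so a returning visitor always sees the same variant. The user IDs and the 50/50 split below are illustrative assumptions, not part of any particular tool.

```python
import hashlib

def assign_variant(user_id: str, experiment: str = "buy-button-color") -> str:
    """Deterministically assign a user to 'control' or 'test'.

    Hashing the experiment name together with the user ID means the
    same visitor always lands in the same group across sessions.
    """
    digest = hashlib.md5(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100                 # bucket in 0..99
    return "test" if bucket < 50 else "control"    # 50/50 split

# Example: orange button (control) vs. blue button (test)
for uid in ["alice", "bob", "carol"]:
    print(uid, "->", assign_variant(uid))
```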

You're not just limited to button colors. You can test:

  • Design layout: The placement of a button, the overall layout of the page, and so on.
  • UI flow: The steps of the sales funnel.
  • Different algorithms: Different recommender systems, for example. Instead of relying only on offline error metrics, you can compare them using real views and purchases.
  • Pricing strategies: Be careful here. If customers catch wind that other people are getting better prices than they are for no good reason, they're not going to be happy, so pricing experiments can provoke a backlash.

Important Metrics

Before starting a test, decide what you want to achieve. This is known as your Conversion Metric. It could be anything from increased revenue to more ad clicks. Sometimes it's useful to track multiple metrics: for instance, both revenue and customer clicks. With loss-leader products, more goes into your pricing strategy than just top-line revenue. Maybe you mainly care about driving ad clicks on your website, or about order quantities, which tend to have less variance than revenue. You should also consider how much data you have for rarer events, because sparse data is where variance problems show up.

You can measure more than one thing at once, too; you don't have to pick just one. You can instead report on the effect on many different things, like revenue, profit, clicks, and ad views.

If all of these are moving in the right direction together, that's a very strong sign that the change had a positive impact in more ways than one. So why limit yourself to one metric? Just make sure you decide ahead of time which metric matters most and will be your criterion for success in this experiment.
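As a minimal sketch of reporting several metrics side by side, the snippet below summarizes hypothetical per-user results with pandas; the column names (revenue, clicked, ad_views) and the numbers are made-up assumptions for illustration.

```python
import pandas as pd

# Hypothetical per-user results; in practice these come from your logs.
df = pd.DataFrame({
    "group":    ["control", "control", "test", "test", "test"],
    "revenue":  [12.0, 0.0, 15.5, 9.0, 0.0],
    "clicked":  [1, 0, 1, 1, 0],        # clicked the "Buy Now" button
    "ad_views": [3, 5, 4, 2, 6],
})

# Report several metrics per group, side by side.
summary = df.groupby("group").agg(
    avg_revenue=("revenue", "mean"),
    click_rate=("clicked", "mean"),
    avg_ad_views=("ad_views", "mean"),
    users=("group", "size"),
)
print(summary)
```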

Conversion Attribution

What if a user clicks the new blue button, browses some more, and then makes a purchase? Who gets the credit? Is it the blue button? This is known as Conversion Attribution, and it's tricky.
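One common (though by no means the only) way to handle this is last-touch attribution within a fixed window: a purchase counts for the experiment only if it happens within some period after the user was exposed to the variant. The 7-day window and the timestamps below are assumptions chosen just for illustration.

```python
from datetime import datetime, timedelta

ATTRIBUTION_WINDOW = timedelta(days=7)   # assumed window; tune for your business

def attribute_purchase(exposure_time: datetime, purchase_time: datetime) -> bool:
    """Credit a purchase to the experiment only if it happened within
    the attribution window after the user saw the variant."""
    elapsed = purchase_time - exposure_time
    return timedelta(0) <= elapsed <= ATTRIBUTION_WINDOW

# Example: the user saw the blue button on May 1st and bought on May 3rd.
saw_blue_button = datetime(2024, 5, 1, 10, 0)
made_purchase   = datetime(2024, 5, 3, 18, 30)
print(attribute_purchase(saw_blue_button, made_purchase))  # True -> blue button gets credit
```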

Beware of Variance

Say, after one week, you find that users with the blue button spent $1 more on average. Should you change all buttons to blue? Not so fast! This could just be random variance in purchase amounts, and it's crucial to account for that possibility.

When dealing with money, the amounts can vary widely. Therefore, you must run the test long enough, with a large enough sample of people, to make sure the results are reliable. Sometimes you need to choose a conversion metric that has less variance: given the traffic your website actually gets, you might have to run the experiment for years to get a significant result on something like revenue or amount spent.
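One standard way to check whether that $1 difference could just be noise is a two-sample t-test on per-user spend. The sketch below uses fabricated, high-variance spend data (and assumes scipy is installed) purely to show the mechanics; with variance this large, a roughly $1 gap is often not significant.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Fabricated per-user spend: lots of variance relative to a ~$1 difference.
control = rng.exponential(scale=20.0, size=1000)   # orange button
test    = rng.exponential(scale=21.0, size=1000)   # blue button

# Welch's t-test: does not assume equal variance between the groups.
t_stat, p_value = stats.ttest_ind(test, control, equal_var=False)

print(f"observed difference in means: ${test.mean() - control.mean():.2f}")
print(f"p-value: {p_value:.3f}")
# A large p-value means the observed gap is quite plausible under pure
# chance, so you would keep collecting data before switching buttons.
```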

Conclusion

A/B testing is a robust tool for making data-backed decisions about your website. But it's not as straightforward as it seems. You must choose the right metrics, understand conversion attribution, and be aware of variance to make accurate decisions. The only thing that statistics and data size can tell you is the probability that an effect is real. It's up to you to decide whether or not it's real at the end of the day.