Putting A/B Testing to the Test

October 24, 2016   |   Margaret Chialastri

Testing is a crucial component of any successful marketing campaign or strategy. Be it websites, landing pages, or the customer journey, A/B testing (also known as split testing) is commonly employed. The process seems simple enough: develop a variation of your control, split the traffic between the control and the variation, and find out which version outperforms the other. Unfortunately, marketers often make mistakes with A/B testing that can cost time, money, and sanity.

Statistical Significance – When Good Is Not Good Enough

Picture the scenario: The test is running along nicely and, as expected, the variation is outperforming the control. At this point your testing tool reports 60% statistical significance for the variation. With a handful of other tests to run and no time to waste, you end the test and move on to the next variation, right? Wrong! 60% may seem good, but it is not high enough to call the variation the winner with confidence; at that level there is still a large chance the difference is just random noise. Too many marketers end tests before reaching 95% statistical significance and base new tests on false or misleading data. Imagine ending one test early, running four subsequent tests, and finding out six months down the road that none of the changes you made produced a true increase in metrics.

A winner cannot be declared until the test has reached 95% statistical significance or higher.
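To make the 95% threshold concrete, here is a minimal sketch of how a significance figure can be computed from raw counts. It assumes a two-sided, pooled two-proportion z-test, which is one common approach and not necessarily the one your testing tool uses. The function names and the visitor and conversion counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def ab_confidence(control_conv, control_n, variant_conv, variant_n):
    """Return 1 - p-value for a two-sided, pooled two-proportion z-test.

    This is the kind of figure many tools report as "statistical significance".
    """
    p_control = control_conv / control_n
    p_variant = variant_conv / variant_n
    # Pooled rate under the assumption that both versions convert equally.
    pooled = (control_conv + variant_conv) / (control_n + variant_n)
    se = sqrt(pooled * (1 - pooled) * (1 / control_n + 1 / variant_n))
    if se == 0:
        return 0.0  # no conversions (or all conversions): nothing to compare
    z = (p_variant - p_control) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return 1 - p_value


def has_winner(control_conv, control_n, variant_conv, variant_n, threshold=0.95):
    """True only when the reported significance meets the threshold."""
    return ab_confidence(control_conv, control_n, variant_conv, variant_n) >= threshold


# Hypothetical counts: 100 vs. 110 conversions out of 1,000 visitors each.
print(f"{ab_confidence(100, 1000, 110, 1000):.0%}")  # well below 95%
print(has_winner(100, 1000, 110, 1000))
```

With these made-up counts the variation looks 10% better on paper, yet the test is nowhere near the 95% bar, which is exactly the situation in which ending early is tempting.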

Sample Size Counts

Making decisions on a small amount of data in any area of digital marketing is asking for trouble. This is especially true when it comes to testing. There isn’t a one-size-fits-all baseline to shoot for. Rather, take advantage of the many online calculators that exist to help steer you in the right direction. Websites like A/B Test Significance will help you determine the minimum sample size based on current conversion rates and desired confidence level (read: 95% or more). Depending on the website’s traffic, it may take weeks for enough visitors to pass through the test to produce accurate results.

Just because statistical significance has been reached doesn’t mean that the test is over. Ensure you have also reached the needed sample size.
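For readers curious about what such calculators do, the sketch below uses the standard normal-approximation formula for comparing two proportions. It is an illustration, not the method of any particular website. The function name, the 80% statistical power default, and the example rates are assumptions that do not come from this article; calculators that use other defaults will return different numbers.

```python
from math import ceil, sqrt
from statistics import NormalDist


def min_sample_size(baseline_rate, relative_lift, confidence=0.95, power=0.80):
    """Approximate visitors needed PER VARIANT to detect a relative lift.

    baseline_rate: current conversion rate, e.g. 0.05 for 5%
    relative_lift: smallest lift worth detecting, e.g. 0.20 for +20%
    power:         assumed chance of detecting a real lift (not from the article)
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p2 - p1) ** 2)


# Hypothetical site: 5% baseline conversion rate, hoping to detect a 20% lift.
print(min_sample_size(0.05, 0.20))
```

Note how quickly the requirement grows as the lift you want to detect shrinks: halving the lift roughly quadruples the visitors needed, which is why low-traffic sites can need weeks per test.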

There’s Strength in Weekly Tests

Every test should run for a minimum of one week, even if statistical significance has reached 95% and the needed sample size has been exceeded. Why? Traffic metrics, engagement rates, and conversion rates vary from day to day and hour to hour, so to understand the true performance of a variation, it needs to be tested on every day of the week.

In the example below, conversion rates are 8% higher on weekends than on weekdays. If the test ran only from Tuesday through Friday, we would be making decisions based on lower-converting traffic, and the results may be inaccurate.

[Figure: Data by Day of Week]

Start and end your tests on the same day of the week to ensure accurate results.
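One way to combine the one-week minimum with the sample size requirement is to round the planned duration up to whole weeks. The sketch below is only an illustration; the function names, the traffic figures, and the start date are hypothetical.

```python
from datetime import date, timedelta
from math import ceil


def planned_duration_days(required_per_variant, daily_per_variant):
    """Days to run: enough for the sample size, rounded up to full weeks."""
    days_for_sample = ceil(required_per_variant / daily_per_variant)
    full_weeks = max(1, ceil(days_for_sample / 7))  # never less than one week
    return full_weeks * 7


def planned_end_date(start, required_per_variant, daily_per_variant):
    """Stop date that falls on the same weekday as the start date."""
    days = planned_duration_days(required_per_variant, daily_per_variant)
    return start + timedelta(days=days)


# Hypothetical plan: 8,158 visitors needed per variant, 600 per variant per day.
start = date(2016, 10, 24)  # a Monday
end = planned_end_date(start, 8158, 600)
print(planned_duration_days(8158, 600), end, end.strftime("%A"))
```

Because the duration is always a multiple of seven days, each day of the week is represented the same number of times in the results.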

Focusing on the Wrong KPI

While most tests are run to increase the conversion rate of an action, we have to remember that conversion rate isn’t always the best success metric to focus on. In the following example you will see that while the conversion rate is higher for the test variation, the revenue is lower, so the variation may not be in this company’s best interest.

[Figure: Data Variation in A/B Testing]

This could happen for a number of reasons. For the website above (think e-commerce), the variation produced more purchases (a higher conversion rate) but at a lower price point than the control. The increase in conversion rate was not enough to offset the lower sales price, so total revenue decreased.

Test and optimize with the overall company goals in mind.
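A quick way to catch this situation is to report revenue per visitor next to conversion rate for each version. The sketch below does that with made-up figures; the numbers are not the data from the example above, and the function and field names are hypothetical.

```python
def summarize(visitors, orders, revenue):
    """Return the three metrics worth comparing side by side."""
    return {
        "conversion_rate": orders / visitors,
        "avg_order_value": revenue / orders,
        "revenue_per_visitor": revenue / visitors,
    }


# Hypothetical results with equal traffic to each version.
control = summarize(visitors=10_000, orders=200, revenue=20_000.00)
variation = summarize(visitors=10_000, orders=240, revenue=18_000.00)

for metric in control:
    better = "variation" if variation[metric] > control[metric] else "control"
    print(f"{metric}: control={control[metric]:.4f} "
          f"variation={variation[metric]:.4f} -> {better}")
```

In this made-up case the variation wins on conversion rate but loses on average order value and revenue per visitor, so which version "wins" depends entirely on the KPI you chose before the test started.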

In closing, testing is crucial to digital marketing success but it’s imperative to run tests accurately. Keep these top A/B testing mistakes in mind as you continue through your optimization efforts and you will receive more accurate data in return.

