A/B testing can be valuable because different audiences behave, well, differently. Something that works for one company may not necessarily work for another. In fact, conversion rate optimization (CRO) experts hate the term “best practices” because it may not actually be the best practice for you.
But A/B tests can also be complex. If you’re not careful, you could make incorrect assumptions about what people like and what makes them click — decisions that could easily misinform other parts of your strategy.
Keep reading to learn how to do A/B testing before, during, and after data collection so you can make the best decisions from your results.
To run an A/B test, you need to create two different versions of one piece of content, with changes to a single variable. Then, you’ll show these two versions to two similarly sized audiences and analyze which one performed better over a specific period of time (long enough to make accurate conclusions about your results).
A/B testing helps marketers observe how one version of a piece of marketing content performs alongside another. Here are two types of A/B tests you might conduct in an effort to increase your website’s conversion rate:
Perhaps you want to see if moving a certain call-to-action (CTA) button to the top of your homepage instead of keeping it in the sidebar will improve its clickthrough rate.
To A/B test this theory, you’d create another, alternative web page that reflected that CTA placement change. The existing design — or the “control” — is Version A. Version B is the “challenger.” Then, you’d test these two versions by showing each of them to a predetermined percentage of site visitors. Ideally, the percentage of visitors seeing either version is the same.
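One common way to show each version to a fixed share of visitors is to hash a visitor identifier into a bucket. The sketch below is an illustration only, not how any particular A/B testing tool works; the function name, the experiment label, and the assumption that you have a stable visitor ID (such as a cookie value) are all hypothetical.

```python
import hashlib


def assign_variant(visitor_id: str, experiment: str = "cta-placement") -> str:
    """Deterministically bucket a visitor into 'A' (control) or 'B' (challenger).

    Hashing the experiment name together with the visitor ID means the same
    visitor always sees the same version, and different experiments split
    visitors independently of each other.
    """
    digest = hashlib.sha256(f"{experiment}:{visitor_id}".encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # integer in the range 0..99
    return "A" if bucket < 50 else "B"  # 50/50 split
```

Because the assignment depends only on the ID, a returning visitor keeps seeing the same version for the whole test.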
Learn how to easily A/B test a component of your website with HubSpot’s Marketing Hub.
Perhaps you want to find out if changing the color of your call-to-action (CTA) button can increase its clickthrough rate.
To A/B test this theory, you’d design an alternative CTA button with a different button color that leads to the same landing page as the control. If you usually use a red call-to-action button in your marketing content, and the green variation receives more clicks after your A/B test, this could merit changing the default color of your call-to-action buttons to green from now on.
A/B testing has a multitude of benefits to a marketing team, depending on what it is you decide to test. Above all, though, these tests are valuable to a business because they’re low in cost but high in reward.
Let’s say you employ a content creator with a salary of $50,000/year. This content creator publishes five articles per week for the company blog, totaling 260 articles per year. If the average post on the company’s blog generates 10 leads, you could say it costs just over $192 to generate 10 leads for the business ($50,000 salary ÷ 260 articles = $192 per article). That’s a solid chunk of change.
Now, if you ask this content creator to spend two days developing an A/B test on one article instead of writing two articles in that time, you might burn $192 because you’re publishing one fewer article. But if that A/B test finds a way to increase each article’s yield from 10 to 20 leads, you just spent $192 to potentially double the number of leads your business gets from your blog.
If the test fails, of course, you lost $192, but now you can make your next A/B test even more educated. If that second test succeeds in doubling your blog’s conversion rate, you ultimately spent $384 (two tests at $192 each) to potentially double the leads your blog generates. No matter how many times your A/B test fails, its eventual success will almost always outweigh the cost to conduct it.
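The arithmetic behind this example can be checked with a few lines of Python. This is only a sketch of the hypothetical scenario above; the constant and function names are made up for illustration, and it assumes each A/B test costs exactly one forgone article.

```python
ANNUAL_SALARY = 50_000          # content creator's yearly salary, in dollars
ARTICLES_PER_YEAR = 5 * 52      # five articles per week -> 260 per year

# Cost of producing one article (about $192).
cost_per_article = ANNUAL_SALARY / ARTICLES_PER_YEAR


def cost_of_tests(num_tests: int, articles_forgone_per_test: int = 1) -> float:
    """Opportunity cost of running A/B tests instead of writing articles."""
    return num_tests * articles_forgone_per_test * cost_per_article
```

Calling `cost_of_tests(n)` gives the total cost of `n` tests under that one-article-per-test assumption, so you can weigh it against the extra leads a winning test might bring in.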
There are many types of split tests you can run to make the experiment worth it in the end. Here are some common goals marketers have for their business when A/B testing:
Now, let’s walk through the checklist for setting up, running, and measuring an A/B test.
Follow along with our free A/B testing kit, which has everything you need to run A/B tests, including a test tracking template, a how-to guide for instruction and inspiration, and a statistical significance calculator to see if your tests were wins, losses, or inconclusive.
As you optimize your web pages and emails, you might find there are a number of variables you want to test. But to evaluate how effective a change is, you’ll want to isolate one “independent variable” and measure its performance — otherwise, you can’t be sure which one was responsible for changes in performance.
You can test more than one variable for a single web page or email; just be sure you’re testing them one at a time.
Look at the various elements in your marketing resources and their possible alternatives for design, wording, and layout. Other things you might test include email subject lines, sender names, and different ways to personalize your emails.
Keep in mind that even simple changes, like changing the image in your email or the words on your call-to-action button, can drive big improvements. In fact, these sorts of changes are usually easier to measure than the bigger ones.
Note: There are some times when it makes more sense to test multiple variables rather than a single variable. This is a process called multivariate testing. If you’re wondering whether you should run an A/B test versus a multivariate test, here’s a helpful article from Optimizely that compares the two.
Although you’ll measure a number of metrics for any one test, choose a primary metric to focus on before you run the test. In fact, do it before you even set up the second variation. This is your “dependent variable.”
Think about where you want this variable to be at the end of the split test. You might state an official hypothesis and examine your results based on this prediction.
If you wait until afterward to think about which metrics are important to you, what your goals are, and how the changes you’re proposing might affect user behavior, then you might not set up the test in the most effective way.
You now have your independent variable, your dependent variable, and your desired outcome. Use this information to set up the unaltered version of whatever you’re testing as your “control.” If you’re testing a web page, this is the unaltered web page as it exists already. If you’re testing a landing page, this would be the landing page design and copy you would normally use.
From there, build a variation, or a “challenger” — the website, landing page, or email you’ll test against your control. For example, if you’re wondering whether including a testimonial on a landing page would make a difference, set up your control page with no testimonials. Then, create your variation with a testimonial.
For tests where you have more control over the audience — like with emails — you need to test with two or more audiences that are equal in order to have conclusive results.
How you do this will vary depending on the A/B testing tool you use. If you’re a HubSpot Enterprise customer conducting an A/B test on an email, for example, HubSpot will automatically split traffic to your variations so that each variation gets a random sampling of visitors.
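If your tool doesn’t split the audience for you, a random, equal split is easy to do by hand. The sketch below is a generic illustration and makes no claim about how HubSpot performs its split; the function name and the optional `seed` parameter are assumptions added for the example.

```python
import random


def split_audience(recipients, seed=None):
    """Randomly split recipients into two groups of (nearly) equal size.

    Shuffling first avoids bias from list order, for example a list sorted
    by sign-up date. Passing a seed makes the split reproducible.
    """
    rng = random.Random(seed)
    shuffled = list(recipients)
    rng.shuffle(shuffled)
    midpoint = len(shuffled) // 2
    return shuffled[:midpoint], shuffled[midpoint:]
```

Each recipient lands in exactly one group, so you can send Version A to the first group and Version B to the second.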
How you determine your sample size will also vary depending on your A/B testing tool, as well as the type of A/B test you’re running.
If you’re A/B testing an email, you’ll probably want to send an A/B test to a smaller portion of your list to get statistically significant results. Eventually, you’ll pick a winner and send the winning variation on to the rest of the list. (See “The Science of Split Testing” ebook at the end of this article for more on calculating your sample size.)
If you’re a HubSpot Enterprise customer, you’ll have some help determining the size of your sample group using a slider. It’ll let you do a 50/50 A/B test of any sample size — although all other sample splits require a list of at least 1,000 recipients.
If you’re testing something that doesn’t have a finite audience, like a web page, then how long you keep your test running will directly affect your sample size.
You’ll need to let your test run long enough to obtain a substantial number of views. Otherwise, it’ll be hard to tell whether there was a statistically significant difference between the two variations.
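To get a rough feel for what “substantial” means, you can estimate the sample size with the standard normal-approximation formula for comparing two conversion rates. This is a sketch, not the method any specific tool uses; the function name is hypothetical, and the 80% statistical power default is an assumption that does not come from this article.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_variation(baseline_rate, expected_rate,
                              confidence=0.95, power=0.80):
    """Approximate number of visitors or recipients needed per variation.

    baseline_rate: current conversion rate, e.g. 0.10 for 10%
    expected_rate: conversion rate you hope the challenger reaches
    """
    if baseline_rate == expected_rate:
        raise ValueError("The two rates must differ to size a test.")
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    variance = (baseline_rate * (1 - baseline_rate)
                + expected_rate * (1 - expected_rate))
    effect = expected_rate - baseline_rate
    return ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)
```

The smaller the difference you want to detect, the larger the sample each variation needs, which is why subtle changes take longer to test.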
Once you’ve picked your goal metric, think about how significant your results need to be to justify choosing one variation over another. Statistical significance is a super important part of the A/B testing process that’s often misunderstood. If you need a refresher on statistical significance from a marketing standpoint, I recommend reading this blog post.
The higher your confidence level, the more sure you can be about your results. In most cases, you’ll want a confidence level of at least 95% (preferably even 98%), especially if the experiment was time-intensive to set up. However, sometimes it makes sense to use a lower confidence level if you don’t need the test to be as stringent.
Matt Rheault, a senior software engineer at HubSpot, likes to think of statistical significance like placing a bet. What odds are you comfortable placing a bet on? Saying “I’m 80% sure this is the right design and I’m willing to bet everything on it” is similar to running an A/B test to 80% significance and then declaring a winner.
Rheault also says you’ll likely want a higher confidence threshold when testing for something that only slightly improves conversion rate. Why? Because random variance is more likely to play a bigger role.
“An example where we could feel safer lowering our confidence threshold is an experiment that will likely improve conversion rate by 10% or more, such as a redesigned hero section,” he explained.
“The takeaway here is that the more radical the change, the less scientific we need to be process-wise. The more specific the change (button color, micro copy, etc.), the more scientific we should be because the change is less likely to have a large and noticeable impact on conversion rate.”
Testing more than one thing for a single campaign — even if it’s not on the same exact asset — can complicate your results. For example, if you A/B test an email campaign that directs to a landing page at the same time that you’re A/B testing that landing page … how can you know which change caused the increase in leads?
To do an A/B test on your website or in an email, you’ll need to use an A/B testing tool. If you’re a HubSpot Enterprise customer, the HubSpot software has features that let you A/B test emails (learn how here), calls-to-action (learn how here), and landing pages (learn how here).
For non-HubSpot Enterprise customers, other options include Google Analytics’ Experiments, which lets you A/B test up to 10 full versions of a single web page and compare their performance using a random sample of users.
Timing plays a significant role in your marketing campaign’s results, whether it’s time of day, day of the week, or month of the year. If you were to run Version A during one month and Version B a month later, how would you know whether the performance change was caused by the different design or the different month?
When you run A/B tests, you’ll need to run the two variations at the same time. Otherwise, you may be left second-guessing your results.
The only exception here is if you’re testing timing itself, like finding the optimal times for sending out emails. This is a great thing to test because depending on what your business offers and who your subscribers are, the optimal time for subscriber engagement can vary significantly by industry and target market.
Again, you’ll want to make sure that you let your test run long enough in order to obtain a substantial sample size. Otherwise, it’ll be hard to tell whether there was a statistically significant difference between the two variations.
How long is long enough? Depending on your company and how you execute the A/B test, getting statistically significant results could happen in hours … or days … or weeks. A big part of how long it takes to get statistically significant results is how much traffic you get — so if your business doesn’t get a lot of traffic to your website, it’ll take much longer for you to run an A/B test.
In theory, you shouldn’t restrict the time in which you’re gathering results. (Read this blog post to learn more about sample size and timing.)
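Because traffic drives how long a test takes, you can make a back-of-the-envelope estimate from your daily visitors and the sample size you need. The sketch below assumes traffic is steady and split evenly across variations; the function and parameter names are hypothetical.

```python
from math import ceil


def estimated_test_days(required_per_variation, daily_visitors, num_variations=2):
    """Rough number of days needed to reach the target sample size."""
    if daily_visitors <= 0:
        raise ValueError("daily_visitors must be positive.")
    total_needed = required_per_variation * num_variations
    return ceil(total_needed / daily_visitors)
```

For instance, a page that needs 4,000 visitors per variation and gets 500 visitors a day would take about 16 days, while a page with 8,000 daily visitors could get there in one.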
A/B testing has a lot to do with quantitative data … but that won’t necessarily help you understand why people take certain actions over others. While you’re running your A/B test, why not collect qualitative feedback from real users?
One of the best ways to ask people for their opinions is through a survey or poll. You might add an exit survey on your site that asks visitors why they didn’t click on a certain CTA, or one on your thank-you pages that asks visitors why they clicked a button or filled out a form.
You might find, for example, that a lot of people clicked on a call-to-action leading them to an ebook, but once they saw the price, they didn’t convert. That kind of information will give you a lot of insight into why your users are behaving in certain ways.
Again, although you’ll be measuring multiple metrics, keep your focus on that primary goal metric when you do your analysis.
For example, if you tested two variations of an email and chose leads as your primary metric, don’t get caught up on open rate or clickthrough rate. You might see a high clickthrough rate and poor conversion rates, in which case you might end up choosing the variation that had a lower clickthrough rate in the end.
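Here is a small sketch of that situation. The numbers are invented purely for illustration, and the class and function names are hypothetical; the point is only that the winner depends on which metric you committed to up front.

```python
from dataclasses import dataclass


@dataclass
class VariationResult:
    name: str
    sent: int
    clicks: int
    leads: int

    @property
    def clickthrough_rate(self) -> float:
        return self.clicks / self.sent

    @property
    def lead_rate(self) -> float:
        return self.leads / self.sent


def pick_by_primary_metric(results, metric="lead_rate"):
    """Return the variation with the highest value of the chosen metric."""
    return max(results, key=lambda result: getattr(result, metric))


# Made-up results: A gets more clicks, but B generates more leads.
variation_a = VariationResult("A", sent=5000, clicks=600, leads=90)
variation_b = VariationResult("B", sent=5000, clicks=450, leads=120)
```

With leads as the primary metric, `pick_by_primary_metric([variation_a, variation_b])` returns Variation B even though Variation A has the higher clickthrough rate.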
Now that you’ve determined which variation performs the best, it’s time to determine whether or not your results are statistically significant. In other words, are they enough to justify a change?
To find out, you’ll need to conduct a test of statistical significance. You could do that manually … or you could just plug in the results from your experiment to our free A/B testing calculator.
For each variation you tested, you’ll be prompted to input the total number of tries, like emails sent or impressions seen. Then, enter the number of goals it completed — generally you’ll look at clicks, but this could also be other types of conversions.
The calculator will spit out the confidence level your data produces for the winning variation. Then, measure that number against the value you chose to determine statistical significance.
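If you want to see roughly what such a calculation involves, the sketch below runs a two-sided, two-proportion z-test on the same inputs (tries and goals per variation). It is one standard approach and an assumption on our part, not necessarily the method the calculator uses; the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def confidence_level(tries_a, goals_a, tries_b, goals_b):
    """Confidence (1 - p-value) that the two conversion rates really differ."""
    rate_a = goals_a / tries_a
    rate_b = goals_b / tries_b
    # Pooled rate under the assumption that both variations perform the same.
    pooled = (goals_a + goals_b) / (tries_a + tries_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / tries_a + 1 / tries_b))
    if std_err == 0:
        return 0.0  # no conversions (or all conversions) in both groups
    z_score = abs(rate_b - rate_a) / std_err
    p_value = 2 * (1 - NormalDist().cdf(z_score))
    return 1 - p_value
```

Compare the returned value with the threshold you chose before the test, such as 0.95, to decide whether to declare a winner.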
If one variation is statistically better than the other, you have a winner. Complete your test by disabling the losing variation in your A/B testing tool.
If neither variation is statistically better, the test didn’t detect a meaningful effect from the variable you changed, and you’ll have to mark the test as inconclusive. In this case, stick with the original variation or run another test. You can use the data from the inconclusive test to help you plan the next iteration.
While A/B tests help you impact results on a case-by-case basis, you can also take the lessons you learn from each test and apply them to future efforts.
For example, if you’ve conducted A/B tests in your email marketing and have repeatedly found that using numbers in email subject lines generates better clickthrough rates, you might want to consider using that tactic in more of your emails.
The A/B test you just finished may have helped you discover a new way to make your marketing content more effective — but don’t stop there. There’s always room for more optimization.
You can even try conducting an A/B test on another feature of the same web page or email you just did a test on. For example, if you just tested a headline on a landing page, why not do a new test on body copy? Or color scheme? Or images? Always keep an eye out for opportunities to increase conversion rates and leads.
Learn how to best carry out some of the steps above using the free ebook below.
Originally published Mar 18, 2020 10:15:00 AM, updated March 18 2020