
This is a guest post by Nick Disabato.

Having run more than 500 tests for over three dozen businesses in the past six years, I know a thing or two about CRO – and about where things can go wrong.

First, though, here are some places where things rarely go wrong:

  • Getting enough traffic for statistically significant results. For online stores, you can at least run a new round of tests every month on the 2-3 pages where you get the most traffic, as well as on your cart and checkout. Other lines of business are not so lucky!
  • The design & development process for a prototype. If you’re ready to start A/B testing, you already know how to actually build software, and you probably have design & development resources retained.
  • Focusing on the right metrics. You already know that AOV, CLTV, conversion rate, reorder rate, upsell take rate, and ARPU matter. Nobody is going around maximizing subscriptions to their mailing list, or views of their blog posts. Store owners care about getting a positive ROI out of their CRO activities, and that means revenue generation.

Which is great! But this isn’t enough. Most of the things that trip up store owners come down to mindset, and to following best practices that aren’t really so great after all.

The main reason anyone tests is to move the needle for their store. So, how do you increase the likelihood that you’ll build tests that win?

Here are the 5 biggest blind spots that hold store owners back from an outsize win rate and significant gains in profit, and what you can do to avoid falling prey to each of them:

Blind spot 1: Acting on instinct

Tests are most likely to win if they respond to customer needs and fix specific, observable issues. Put another way, you should have a process that researches how customers are acting on your store, figures out what motivates them to buy, and then creates tests that respond to those behaviors and needs.

Evidence is a great way to find new test ideas. Instinct isn’t. “What if we tested this now?” is a great way to run tests that lose. Yet in the absence of a concrete plan, most stores do exactly that.

Let’s say you have an add-to-cart button. You want to run a test that changes its text or appearance in some capacity. What do you change it to? Why? What do you think will happen if you make that change?

Good tests are intentional and deliberate, and every design decision that you make should have a clear reason. By researching any potential changes, prioritizing them to be built out, and creating hypotheses that connect directly to new experiments, you’re more likely to run tests that win.

Blind spot 2: Running “popular” A/B tests

Everyone has heard of the button color test that made a zillion dollars. Same with testing headlines. In fact, button colors & headlines are the two most popular kinds of tests – not only because they are well-known examples of A/B testing, but because they’re extremely easy to put together.

Popular A/B tests are generally popular because they’re perceived as low-risk and high-reward. And to be sure, Draft has run button color tests that have won. But those button color tests were also researched! For example, when running tests at the Wirecutter, we ran a test that changed every product’s buy button to the color of the store that it pointed to: Amazon orange, Walmart blue, Apple gray, etc.

We did this because customers often didn’t notice that they were going to other stores, and were thrown off by the disconnect between the Wirecutter’s experience and (for example) Amazon’s. Clickthrough rate increased as a result, because people knew where they were going and what they could do once there.

Rather than making your buttons brighter, change your buttons for a reason. Look into what motivates customers, and then change your headline to something that clearly meets their needs.

Blind spot 3: Not following statistics

Let’s say you run a test and find that your ARPU went up by 83¢ at 85% confidence. Does that mean you should roll the decision out to everyone, and expect that your ARPU will increase by 83¢ into perpetuity?

Not necessarily. Think of that 83¢ figure as the center of a bell curve with, in this case, a rather shallow slope. That bell curve represents the full range of long-term outcomes you might expect from rolling out your variant to all customers; in practice, you might do better or worse than 83¢. And 85% confidence is not what most optimizers use to determine success – in practice, most winners are rolled out only when they’re called at 95% confidence or above.
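
To make that bell curve concrete, here’s a minimal sketch in Python. The visitor counts, means, and standard deviations below are made up for illustration – they’re not data from the test described above – and it uses a plain normal approximation to put a 95% interval around an observed ARPU lift:

```python
import math

# Hypothetical per-visitor revenue summaries – not data from a real test.
control = {"n": 750, "mean": 4.10, "std": 11.0}   # ARPU in dollars
variant = {"n": 750, "mean": 4.93, "std": 11.4}

# Observed lift and its standard error (normal approximation).
lift = variant["mean"] - control["mean"]
se = math.sqrt(control["std"] ** 2 / control["n"] +
               variant["std"] ** 2 / variant["n"])

# A 95% interval: the "bell curve" of plausible long-term outcomes.
z = 1.96
low, high = lift - z * se, lift + z * se
print(f"Observed lift: {lift:.2f} dollars per visitor")
print(f"95% interval:  {low:.2f} to {high:.2f} dollars per visitor")
```

With numbers like these, the plausible range stretches well below 83¢ – and even below zero – which is exactly why calling a winner at 85% confidence is risky.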

It’s also dangerous to peek at an experiment’s data while the test is still active. You should calculate your expected sample size and experiment duration ahead of time, check in only to fix bugs with your goal reporting, and only take action on the results after the experiment has run its course.

The best explanation of the math behind this is “How Not to Run an A/B Test” by Evan Miller. Here at Draft, we use his sample size calculator every day.
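
If you’re curious what those calculators are doing under the hood, here’s a minimal sketch of the standard two-proportion sample-size formula. The 3% baseline conversion rate, 0.5-point detectable lift, and 1,000 daily visitors per variant are assumptions for illustration only, not Draft’s numbers:

```python
import math
from statistics import NormalDist

def sample_size_per_variant(baseline, lift, alpha=0.05, power=0.80):
    """Visitors needed per variant to detect an absolute lift in
    conversion rate with a two-sided two-proportion z-test."""
    p1, p2 = baseline, baseline + lift
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    pooled = (p1 + p2) / 2
    n = ((z_alpha * math.sqrt(2 * pooled * (1 - pooled)) +
          z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
         / (p2 - p1) ** 2)
    return math.ceil(n)

# Example: 3% baseline conversion rate, hoping to detect a 0.5-point lift.
n = sample_size_per_variant(baseline=0.03, lift=0.005)
daily = 1_000  # hypothetical visitors per variant per day
print(f"{n:,} visitors per variant – about {math.ceil(n / daily)} days at {daily:,}/day")
```

The point of running this before the test starts is that you commit to a sample size and duration up front, which removes the temptation to peek and call the result early.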

Blind spot 4: The HiPPO

The HiPPO, or “highest paid person’s opinion,” can severely harm the progress of your testing program.

Test ideas should never be prioritized by the rank of who suggested them. Good ideas can come from anywhere in the organization. In fact, the best ideas often come from people who are in the trenches every day, understanding the real problems that customers are facing. At Draft, we’ve come to trust customer support staff more than the CEO when it comes to figuring out the best things to test. They suggest ideas that win more often!

We solve this problem by giving the entire team access to our Trello board for project management, and by pointing everyone to a specific column where they can suggest new ideas. New ideas are then discussed, researched, prioritized, built, and tested. This way, the CEO is given precisely as much power as the newest contractor.

Blind spot 5: Following the leader

Ecommerce has a habit of copying. Someone does something that works well, and others follow the same playbook.

This makes sense on the face of things, because nobody has a clear sense of what works at any given time. But if you don’t actually lead on your own store’s user experience, you’re leaving your store’s fate in the hands of people who are playing at a higher level of the game.

Following the leader is a bad idea because of, you guessed it, a lack of research. Instead of copying what “works” for someone else, investigate why, what that means for you, and how – and if – you should respond to it.

It’s always good to pay attention to what other stores are doing. But it’s dangerous to implement others’ ideas wholesale without determining how they fit into your own big picture. Trust, but verify.

The goal is test ideas that win

The industry-wide A/B test success rate is about 12.5%, or 1 in every 8 tests. Do you want a failure rate of 87.5%? No, of course you don’t. You’ll waste time and money on a bunch of tests that fail to move the needle for your business.

At Draft, our success rate hovers around 65% as of press time. How? Because we research our ideas, carefully prioritize new tests, and listen to the whole team.

Fundamentally, value-based design is a matter of abandoning your ego. You may know the kind of product that your customers need, but you won’t know how your customers behave – and why they buy – until you investigate them. By avoiding these blind spots, you should be able to improve your business’s optimization maturity, and create an ROI-positive optimization program in the process.

About the author
Nick Disabato is a designer & writer from Chicago. He runs Draft, an optimization consultancy for online stores. His most recent book is Value-Based Design, the definitive how-to guide for getting a positive ROI out of any design work. He's spoken at SxSW, O'Reilly Web 2.0, and eCommerce All-Stars, among a bunch of other places, and he thinks your dog is very good.