Relevance is one of the most important attributes of good site search, but it isn’t something you can turn on with the flip of a switch. Honing search results for relevance is a process of trial and error: you adjust your back-end search parameters until they meet your users’ needs and optimize conversions.
Most of us are familiar with A/B testing a website to measure how much a particular variable affects audience reaction and conversion metrics. In a previous blog post, we wrote about how to A/B test your site search for relevance.
Today, we’ll dive into how to apply Algolia’s advanced A/B testing feature to your site search.
Why use A/B testing to improve site search
As you already know, A/B testing removes the guesswork from iterations on your relevance by showing you exactly how a particular setting or variable affects your predetermined conversion goals.
For example, you might want to test the way you rank the search results you return to users. Previously, you added the business metric “publication date” to your ranking strategy, but you’re wondering whether the business metric “sales ranking” might be more effective. You think it might drive more conversions, but you want more data before you commit to the change site-wide.
In an A/B test, half of the users on your website are exposed to the original (or control) search experience, the one where your site ranks results by publication date. The other half are exposed to the sales ranking system. You can then collect data about the behavior of both sets of users and get a data-backed answer on which setup drives more conversions.
One advantage of Algolia A/B testing is that anyone on your team can create A/B tests entirely from the dashboard, on any device, without writing a single line of code. (A/B testing is also available through the API, if you prefer.)
4 great A/B testing example scenarios
What can you test with site search A/B testing? Here are some ideas:
- Tweak your custom rankings. Let’s say you have an e-commerce website where your items are currently sorted by popularity, but you are not actually certain that this is the best way to show results. Test the effectiveness of sorting them by publication date. A/B testing will let you test these two different custom ranking options against each other to see which one drives more sales.
- Test the impact of personalization. Once you start using personalization, you may want to assess the impact it has on your conversion and click-through rates. A/B testing is a great way to do this.
- A/B test the impact of synonyms. Test the impact of two different sets of synonyms on the same set of records and see which one actually performs better in terms of relevance.
- A/B test two sets of Query Rules. Test the impact of two different sets of Query Rules — two merchandising strategies, for example — to see which one actually performs better.
How does A/B testing actually work?
Let’s look at a practical application of A/B testing with Algolia: an easy, six-step process. (Note: you’ll need to set up Click Analytics first in order to capture click and conversion events.)
1. Create a replica
Let’s go to your search dashboard. You can see that an index has already been created, with a custom ranking defined on the number of sales per item (aka sales ranking).
But you are not sure this is the best ranking strategy, so you’d like to test it against ranking by publication date.
When you A/B test, you’re testing two search indices against each other, so the first thing you need to do is create a replica — aka a copy of your search index. To create a replica, go to the Indices section of the left-hand toolbar and click the “Replicas” option. From there, you can select the option to “Create Replica Index.”
The replica will be synchronized with your main index, so any changes to the main index will be forwarded to your replica. They are identical apart from the changes you manually make to the replica.
2. Configure your replica
Once you’ve created your replica, it’s time to configure it with the factor you are testing. In this case, we’ve defined a different custom ranking based on publication date instead of sales ranking.
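If you prefer to script this step, the two configurations can be written down as plain settings payloads. The sketch below is only an illustration: the index names (`products`, `products_by_publication_date`) and the record attributes (`sales_count`, `publication_timestamp`) are hypothetical, and the code builds the payloads without calling any API. `replicas` and `customRanking` are Algolia index settings, but check the current settings reference before sending them.

```python
# Hypothetical index and attribute names, used for illustration only.
MAIN_INDEX = "products"
REPLICA_INDEX = "products_by_publication_date"


def control_settings() -> dict:
    """Settings for the main (control) index: rank by sales, declare the replica."""
    return {
        "customRanking": ["desc(sales_count)"],
        "replicas": [REPLICA_INDEX],
    }


def variant_settings() -> dict:
    """Settings for the replica: same records, ranked by publication date instead."""
    return {"customRanking": ["desc(publication_timestamp)"]}


def differing_settings(control: dict, variant: dict) -> list:
    """List the settings that differ, ignoring the replica declaration itself."""
    keys = (set(control) | set(variant)) - {"replicas"}
    return sorted(k for k in keys if control.get(k) != variant.get(k))


# Sanity check: the two indices should differ only in the factor under test.
print(differing_settings(control_settings(), variant_settings()))
```

A check like `differing_settings` is a cheap way to confirm that the replica differs from the main index in exactly one setting before you start the test.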
3. Create a new A/B test
Once you’ve configured your replica, go to the A/B testing tab on the left-hand toolbar. Click on “New test” to create your new A/B test.
4. Pick your variants
Once you create a new test, you’ll have the option to pick the two variants you plan to test. The first variant should be your main index, which serves as your control. The second will be the replica you just created. It’s a good idea to provide a description for each variant for later reference, such as “custom ranking based on sales rank” and “custom ranking based on publication date.”
5. Define your traffic split
Underneath the variants, define the percentage traffic split between the two variants. An ideal split is 50/50, but if you have a high-traffic website (more than 100,000 page views a month), you can define an unequal traffic split and still see statistically significant results.
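To build intuition for what a traffic split means, here is a minimal sketch of one common way to split users: hash each user token into a bucket from 0 to 99 and compare it with the percentage assigned to variant A. Algolia handles the split for you, and this is not a description of its internal mechanism; the function name and the hashing scheme are assumptions made for illustration.

```python
import hashlib
from collections import Counter


def assign_variant(user_token: str, percent_a: int = 50) -> str:
    """Deterministically map a user token to variant 'A' or 'B'."""
    if not 0 <= percent_a <= 100:
        raise ValueError("percent_a must be between 0 and 100")
    digest = hashlib.sha256(user_token.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # bucket in 0..99
    return "A" if bucket < percent_a else "B"


# Simulate 10,000 synthetic users with a 50/50 split.
counts = Counter(assign_variant(f"user-{i}", percent_a=50) for i in range(10_000))
print(counts)
```

Because the assignment depends only on the token, the same user always lands in the same variant, which is what lets later clicks and conversions be attributed to the right side of the test.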
6. Define the test duration
Ideally, an A/B test should run for what we call two business cycles, which accounts for short-term seasonality effects. For an e-commerce website, a business cycle is one week, so in this case the test should run for 14 days.
Once you’ve defined these parameters, simply press “Create.”
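Steps 3 through 6 can also be captured as a single test definition if you create tests through the API rather than the dashboard. The sketch below only builds the request body and does not send it. The field names (`name`, `variants`, `index`, `trafficPercentage`, `description`, `endAt`) are intended to mirror Algolia’s A/B testing API, but treat them as assumptions and check them against the current API reference; the index names and the helper `build_ab_test` are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def build_ab_test(name: str, control_index: str, variant_index: str,
                  control_pct: int, start: datetime,
                  cycle_days: int = 7, cycles: int = 2) -> dict:
    """Build an A/B test definition that ends after `cycles` business cycles."""
    if not 0 < control_pct < 100:
        raise ValueError("control_pct must be between 1 and 99")
    if start.tzinfo is None:
        raise ValueError("start must be timezone-aware")
    end_at = (start + timedelta(days=cycle_days * cycles)).astimezone(timezone.utc)
    return {
        "name": name,
        "variants": [
            {"index": control_index, "trafficPercentage": control_pct,
             "description": "custom ranking based on sales rank"},
            {"index": variant_index, "trafficPercentage": 100 - control_pct,
             "description": "custom ranking based on publication date"},
        ],
        "endAt": end_at.strftime("%Y-%m-%dT%H:%M:%SZ"),
    }


# Hypothetical index names; a 50/50 split over two one-week cycles.
test = build_ab_test("Sales rank vs. publication date", "products",
                     "products_by_publication_date", 50,
                     start=datetime(2024, 1, 1, tzinfo=timezone.utc))
print(test["endAt"])
```

Deriving the end date from the business cycle length, rather than typing it in by hand, makes it harder to end a test partway through a cycle by accident.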
How to interpret the results
The most important part of A/B testing is analyzing the results to determine which variant was most successful. Algolia’s results panel is structured to help you make that decision.
In the results panel, you’ll see identifying information about the test: the name, the scenarios (the two indices you tested), and the status. The status will be “Running,” “Stopped,” “Finished,” or “Failed.” You can click on the analytics bar next to each scenario to see the individual metrics for each.
Here are the other important components of the test panel:
- Traffic split. This reflects the volume of users assigned to one index or the other.
- Number of users. This tells you how many users were exposed to each variant. The more users tested, the more likely the results are to be statistically significant.
- CTR. This is the click-through rate: the number of users who clicked on results divided by the total number of searches, times a hundred. A click suggests the user found what they were looking for.
- Conversion. This is the conversion rate — the number of conversion events (like sales or downloads) over the total number of searches, times a hundred.
- Confidence score. This tells you how confident Algolia is that the results are a consequence of the configuration change and not random chance. The confidence score should be at least 95% before you analyze the results.
Once your test reaches a 95% or greater confidence score and your two-cycle time period has passed, you can analyze the results to see which index generated more clicks and conversions.
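The CTR and conversion definitions above boil down to simple arithmetic. The sketch below computes both rates and the relative uplift of one variant over the other. The counts are made-up illustrative values, not results from any real test, and it assumes “uplift” means the relative change against the control; the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class VariantStats:
    searches: int     # total searches served by this variant
    clicks: int       # click events attributed to those searches
    conversions: int  # conversion events attributed to those searches


def rate(events: int, searches: int) -> float:
    """Events over searches, times a hundred. Returns 0.0 when there are no searches."""
    return 100.0 * events / searches if searches else 0.0


def relative_uplift(control_rate: float, variant_rate: float) -> float:
    """Relative change of the variant against the control, in percent."""
    if control_rate == 0:
        raise ValueError("control rate is zero, uplift is undefined")
    return 100.0 * (variant_rate - control_rate) / control_rate


# Illustrative numbers only.
a = VariantStats(searches=10_000, clicks=2_000, conversions=500)
b = VariantStats(searches=10_000, clicks=2_200, conversions=550)

ctr_a, ctr_b = rate(a.clicks, a.searches), rate(b.clicks, b.searches)
cvr_a, cvr_b = rate(a.conversions, a.searches), rate(b.conversions, b.searches)
print(f"CTR uplift: {relative_uplift(ctr_a, ctr_b):.1f}%")
print(f"Conversion uplift: {relative_uplift(cvr_a, cvr_b):.1f}%")
```

Note that an uplift figure on its own says nothing about significance, which is why it should be read together with the confidence score.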
In the A/B testing example above, you can see that the sales rank custom setup (Scenario B) performed better with a 5.2% uplift in CTR and a 4% increase in conversion rate. The confidence score for both is excellent, meaning there were enough searches and events to draw the conclusion that B yields better results. Once the test is over, you can apply the winning configuration to the main index.
In this test, the numbers demonstrate that ranking search results by sales rank leads to more conversions than ranking by publication date.
Best practices for A/B testing site search
A/B testing is a scientific process, so it’s important that you consistently follow a few guidelines to ensure the most accurate results.
Set a proper user token
User tokens are the user identifiers that connect search events with eventual conversions. IP addresses alone are not sufficient for accurate tracking and can lead to incorrect results. We highly recommend that you explicitly use a user token for the most accurate results. If you aren’t already using them, you can generate them with Algolia’s insights cookie.
Use the same user token for searches and for click events
It is very important to use the same user token for searches and for click events because that’s how the A/B test will connect the users with the events that those users performed. If you’re using two different user tokens, then the A/B test won’t work.
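Here is a minimal sketch of what “the same user token” looks like in practice: one token is generated once and then reused in both the search parameters and the click event. Nothing is sent over the network. The field names (`userToken`, `clickAnalytics`, `eventType`, `eventName`, `queryID`, `objectIDs`, `positions`) are intended to match Algolia’s search and Insights APIs, but verify them against the current reference; the helper names, the token scheme, and the sample IDs are hypothetical.

```python
import uuid


def new_user_token() -> str:
    """Generate an anonymous token (hypothetical scheme: a random UUID in hex)."""
    return uuid.uuid4().hex


def search_params(query: str, user_token: str) -> dict:
    """Search parameters that identify the user and request a queryID."""
    return {"query": query, "userToken": user_token, "clickAnalytics": True}


def click_event(index: str, user_token: str, query_id: str,
                object_id: str, position: int) -> dict:
    """A click event tied to the search that produced the clicked result."""
    return {
        "eventType": "click",
        "eventName": "Result Clicked",
        "index": index,
        "userToken": user_token,   # must match the token sent with the search
        "queryID": query_id,       # returned in the search response
        "objectIDs": [object_id],
        "positions": [position],
    }


token = new_user_token()  # create once per user, then reuse everywhere
params = search_params("running shoes", token)
event = click_event("products", token, "example-query-id", "sku-123", 1)
print(params["userToken"] == event["userToken"])
```

Storing the token in one place (a cookie or the user session, for instance) and reading it from there for every call is the simplest way to keep searches and events aligned.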
Disable A/B testing (and analytics) for automated search agents or bots
If your website uses search automators (for example, if you have an option where users can subscribe to alerts when their search returns new results), then exclude those search agents from your A/B test. If you don’t, you’ll end up with a single “user” who is performing a vast number of searches but never clicks or converts. This can severely bias one side of your test results. Make sure automations are excluded from both your testing and your analytics setups.
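One way to apply this is to switch off both features per query whenever the request comes from an automated agent. The sketch below assumes you can recognize your own automations by their user agent string; the marker list and helper names are hypothetical. `analytics` and `enableABTest` are Algolia search parameters, but confirm their behavior in the current documentation.

```python
# Hypothetical markers for your own automations and for common crawlers.
AUTOMATION_MARKERS = ("bot", "crawler", "spider", "saved-search-alerts")


def is_automated(user_agent: str) -> bool:
    """Return True when the user agent looks like an automated search agent."""
    ua = user_agent.lower()
    return any(marker in ua for marker in AUTOMATION_MARKERS)


def with_tracking_flags(params: dict, user_agent: str) -> dict:
    """Copy the search parameters, excluding automated traffic from tests and analytics."""
    out = dict(params)
    if is_automated(user_agent):
        out["analytics"] = False      # keep the query out of analytics
        out["enableABTest"] = False   # keep the query out of running A/B tests
    return out


print(with_tracking_flags({"query": "red shoes"}, "saved-search-alerts/1.0"))
print(with_tracking_flags({"query": "red shoes"}, "Mozilla/5.0"))
```

Requests from real users pass through unchanged, so the exclusion does not affect how the remaining traffic is split.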
Wait until the end of your test before relying on the confidence score
The confidence score is calculated based on the number of searches conducted — so you may actually reach a high confidence level within just a few hours of starting your test. That doesn’t mean these results are correct, though. It’s important to leave your test running for two full business cycles to account for all the seasonality effects that might change your results. For example, people search differently on the weekends versus weekdays.
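This rule can be written as a simple guard: only read the results when both conditions hold, a high enough confidence score and two full business cycles elapsed. The function below is a hypothetical helper, not part of any Algolia client, and the 7-day cycle and 95% threshold are the values used in this post.

```python
from datetime import datetime, timedelta, timezone


def ready_to_analyze(confidence_pct: float, started_at: datetime, now: datetime,
                     cycle_days: int = 7, cycles: int = 2,
                     threshold: float = 95.0) -> bool:
    """True only when the test ran for the full duration AND confidence is high enough."""
    ran_long_enough = now - started_at >= timedelta(days=cycle_days * cycles)
    return ran_long_enough and confidence_pct >= threshold


start = datetime(2024, 1, 1, tzinfo=timezone.utc)

# High confidence after three hours is not enough.
print(ready_to_analyze(99.0, start, start + timedelta(hours=3)))
# Two full weekly cycles with sufficient confidence.
print(ready_to_analyze(96.0, start, start + timedelta(days=14)))
```

Making the duration check explicit helps resist the temptation to stop a test early just because the confidence score looks good on day one.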
Test a single hypothesis at a time
Only change one search element at a time so that you can clearly test the effects of that change on your users’ behavior. Trying to run more than one A/B test at a time can lead to confusing and unclear results. How do you know which change led to which behavior?
Don’t make changes once the test is live
Once the test is running, don’t make any changes to either of the test indices, the traffic allocations, or the test goals. Changing variables mid-experiment will make the results irrelevant, as it will be impossible to tell which modification impacted user behavior.
Make search decisions with confidence
Improving relevance is the shortcut to improving your site search functionality. You can experiment to hone those results, but you don’t want to tank your conversions or customer satisfaction in the process.
Just like you A/B test other components of your website, you should A/B test your site search to eliminate the guesswork and zero in on the variables that make your results useful and effective. Test only a subset of the population to avoid costly mistakes — this will also give you confidence when you go to make major changes.