A/B Testing

Algolia has in some ways become its own client. It has leveraged two signature features - Relevance Tuning and Analytics - to create a new tool: A/B Testing. Let’s see what this means.

  • Relevance Tuning enables you to give your users the best search results. Algolia offers numerous settings and methods to manage relevance.

  • Analytics makes relevance tuning data-driven, ensuring that your configuration choices are sound and effective.

Relevance tuning, however, can be tricky. The choices are not always obvious: it's sometimes hard to know which settings to focus on, what values to set them to, and whether the changes you've made actually helped. What you need is input from your users - a way to test your changes live.

This is what A/B Testing does. It allows you to create two alternative indices, A and B, each with its own settings, put them both live, and see which one performs better.

A/B Testing defined

Create two alternative indices that differ only slightly in their relevance settings, and call them A and B. Put them both live on your website, presenting A to some users and B to the rest. Make sure A-users always see A and B-users always see B, by assigning each user a unique, stable user identifier (see the sketch below). With Algolia's analytics, capture the same user events for both A and B, then measure the captured events against each other to produce scores. Use these scores to determine whether A or B provides the better user experience, adjust your main index accordingly, and start a new test.
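
Here is a minimal sketch of how that unique identifier travels with each query, assuming the v4 JavaScript API client; the credentials, index name, query, and token are placeholders:

```ts
import algoliasearch from 'algoliasearch';

// Placeholder credentials and index name: substitute your own.
const client = algoliasearch('YourApplicationID', 'YourSearchOnlyAPIKey');
const index = client.initIndex('products');

// A stable userToken pins this user to the same variant (A or B) for the
// duration of the test, and ties their analytics events to that variant.
// clickAnalytics: true makes the response include a queryID, which is
// needed later to report clicks and conversions.
const { hits, queryID } = await index.search('jazz vinyl', {
  userToken: 'user-1234', // any unique, persistent identifier per user
  clickAnalytics: true,
});
```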

A/B comparative testing is widely used in the industry to measure the usability and effectiveness of a website. Algolia's focus is on measuring search, and more specifically relevance: Are your users getting the best search results? Is your search effective in engaging and retaining your customers? Is it leading to more clicks, more sales, more activity for your business?

With this feature, you can run alternative indices in parallel, capturing click and conversion analytics to compare effectiveness. You make small incremental changes to your main index and have those changes tested - live and transparently by your customers - before making them official. A/B Testing goes directly to an essential source of information, your users, by including them in the decision-making process, in the most reliable and least burdensome way.
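
Capturing those clicks and conversions is the one technical piece you own. A hedged sketch using Algolia's search-insights library; the index, event names, object IDs, and positions are illustrative:

```ts
import aa from 'search-insights';

// Placeholder credentials; use a search-only API key here.
aa('init', { appId: 'YourApplicationID', apiKey: 'YourSearchOnlyAPIKey' });
aa('setUserToken', 'user-1234'); // the same stable token sent at query time

// Report a click on a search result, attributed to the originating query.
// queryID comes from the search response (requires clickAnalytics: true).
aa('clickedObjectIDsAfterSearch', {
  index: 'products',
  eventName: 'Product Clicked',
  queryID: 'queryID-from-the-search-response',
  objectIDs: ['product-42'],
  positions: [1],
});

// Report a conversion (e.g. a purchase) attributed to the same query.
aa('convertedObjectIDsAfterSearch', {
  index: 'products',
  eventName: 'Product Purchased',
  queryID: 'queryID-from-the-search-response',
  objectIDs: ['product-42'],
});
```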

A/B Testing - Implementation

We have designed A/B Testing with simplicity in mind, to encourage you to run tests regularly.

For starters, A/B Testing does not require any coding: it can be managed from start to finish by people with no technical background.

As a prerequisite, you'll need some one-time technical preparation in your environment (such as sending the user tokens and events shown above); once that's done, anyone can set up tests using only the Dashboard.
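
Although the Dashboard is the intended workflow, tests can also be created programmatically. Here is a minimal sketch, assuming the v4 JavaScript API client, whose analytics client exposes addABTest; the credentials, test name, index names, traffic split, and duration are placeholders:

```ts
import algoliasearch from 'algoliasearch';

// Placeholder credentials: creating tests requires an admin API key.
const client = algoliasearch('YourApplicationID', 'YourAdminAPIKey');
const analytics = client.initAnalytics();

// A 30-day test sending 90% of traffic to variant A and 10% to variant B.
await analytics.addABTest({
  name: 'Test new ranking with number_of_likes',
  variants: [
    { index: 'products', trafficPercentage: 90 },          // A: the current index
    { index: 'products_by_likes', trafficPercentage: 10 }, // B: the candidate replica
  ],
  endAt: new Date(Date.now() + 30 * 24 * 3600 * 1000).toISOString(),
});
```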

Here is a checklist of the steps you need to take. We explain each one in the next sections of this page.

Examples - What kind of tests can you make?

As already mentioned, there are two kinds of A/B tests you can run:

  • Changing your index settings
  • Reformatting your data

Example 1: Changing your index settings

Add a new custom ranking on the attribute number_of_likes

  • You’ve recently offered your users the ability to like your items, which include music, films, and blog posts.
  • Now you have a large number of likes and dislikes, and you'd like to use this information to sort your search results.
  • So you add a number_of_likes attribute to A, create B (a replica of A), and then adjust A and B's settings accordingly (sketched in code after this list):
    • A does not sort by number_of_likes (so it behaves exactly as before)
    • B sorts by number_of_likes
  • You name your test “Test new ranking with number_of_likes”.
  • You want this test to run for 30 days, to be sure to get enough data and a good variety of searches.
  • You set B at only 10% usage, because of the risk of introducing a new sort order. You don't want to change the user experience for too many users until you're absolutely sure the change is desirable.
  • When your test reaches 95% confidence or greater, you will be able to see whether there was any improvement, and whether the improvement is large enough to justify the cost of implementing it. In most cases, a settings change costs nothing; it's just a simple configuration change on the Dashboard.
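
For illustration, here is roughly what this settings change could look like with the JavaScript API client; the credentials and index names are placeholders:

```ts
import algoliasearch from 'algoliasearch';

// Placeholder credentials and index names.
const client = algoliasearch('YourApplicationID', 'YourAdminAPIKey');
const indexA = client.initIndex('products');

// Declare B as a replica of A: it shares A's data, and starts from A's
// settings, which can then diverge...
await indexA.setSettings({ replicas: ['products_by_likes'] });

// ...so give B its single deliberate difference: ranking by likes.
await client.initIndex('products_by_likes').setSettings({
  customRanking: ['desc(number_of_likes)'],
});
```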

Example 2: Reformatting your data

Add a new search attribute: short_description

  • Your company has added a new short description for every item. You want to see if this short description will help return more relevant results.
  • Add the new attribute short_description to Index A.
  • Create replica B, which will have all of the same attributes and settings as A.
  • For B only, add the new attribute to its searchableAttributes setting (see the sketch after this list).
  • You create a test called “Testing the new short description”.
  • You have enough traffic to know that 7 days is sufficient.
  • As in example 1, you give B only 30% usage (a 70/30 split) because of the risk. You estimate that after 7 days there will be enough data for both A and B to make a decision, and you'd rather not risk degrading an already good search with an untested attribute.
  • Once you reach 95% confidence, you can judge the improvement and the cost of implementation to see whether this change is beneficial.
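
As in example 1, the settings change itself is small. A sketch with the JavaScript API client; the credentials, index name, and surrounding attribute list are assumptions for illustration:

```ts
import algoliasearch from 'algoliasearch';

// Placeholder credentials, index name, and attribute list.
const client = algoliasearch('YourApplicationID', 'YourAdminAPIKey');

// B is a replica of A; the only difference is that B also searches the
// new short_description attribute.
await client.initIndex('products_b').setSettings({
  searchableAttributes: [
    'name',              // assumed existing attributes, for illustration
    'short_description', // the new attribute under test (B only)
    'description',
  ],
});
```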
