Ranking Formula

Introduction

Algolia redesigned, from the ground up, the way search engines tackle relevance, and made several big improvements along the way, including:

  • Merging the textual and the business relevance is much easier
  • You can make sure that the business relevance won’t override the textual relevance
  • The ranking formula is easy to maintain, more stable, and is relevant for the whole catalog, not only for the most popular queries

We’re also an open book: we want our users to be able to see, understand and tweak everything that our search engine does when it comes to ranking. There are 3 settings that you need to understand to get the best relevance: Searchable Attributes (Attributes To Index), Custom Ranking and Ranking Formula.

By the end of this guide, you’ll be able to refine the relevance of our engine for your specific use-case, and help your users find what they’re looking for.

Searchable Attributes

You can find this setting under the Ranking tab of your index page, in the dashboard. It has 3 purposes:

  1. Declare the attributes of your records that you want to make searchable
  2. Order these attributes by importance
  3. Declare if the order of the words inside the attribute matters or not

Define the attributes you want to be searchable

We’ll take the example of an index with the following records:

  {
    "name": "John Doe",
    "company": "Acme",
    "url": "http://www.acme.com"
  },
  {
    "name": "Jane Dawson",
    "company": "John & Bill",
    "url": "http://www.johnandbill.com"
  }

Let’s say that you want to search in the name and company attributes, but not in url. We add both attributes to the Attributes To Index setting to make them searchable.

[Figure: the Searchable Attributes (Attributes To Index) setting]

Searching for John Doe will return the first record, and searching for Jane Dawson will return the second. But typing http won’t return any results, which is what we wanted.
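
To make this concrete, here is a minimal sketch in Python that mimics the behavior described above on the two example records. It is a toy model, not how the engine is implemented: the matches and search helpers are hypothetical, and the simple prefix matching is an assumption made for illustration.

```python
# Toy model of searchable attributes: only the listed attributes are searched.
records = [
    {"name": "John Doe", "company": "Acme", "url": "http://www.acme.com"},
    {"name": "Jane Dawson", "company": "John & Bill", "url": "http://www.johnandbill.com"},
]

# "url" is deliberately left out, so its content can never match a query.
searchable_attributes = ["name", "company"]


def matches(record, query, attributes):
    """True if every query word is a prefix of a word in one of the given attributes."""
    words = []
    for attribute in attributes:
        words.extend(record[attribute].lower().split())
    return all(any(w.startswith(q) for w in words) for q in query.lower().split())


def search(query):
    """Return the names of the records matching the query (hypothetical helper)."""
    return [r["name"] for r in records if matches(r, query, searchable_attributes)]
```

With this sketch, search("John Doe") returns only the first record, and search("http") returns nothing because url is not searchable.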

Order the searchable attributes by order of importance

What will happen if we type John? This word is present in both objects: the first record’s name is “John Doe” and the second record’s company is “John & Bill”. Which result do you want to appear first in the results?

You can decide that by ordering your attributes in the same Attributes to Index setting: the higher an attribute is in the list, the more important it will be.

You can also give the same importance to two attributes by putting them on the same line (separated by a comma).
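
The effect of the attribute order can be sketched as follows. This is a simplified illustration under our own assumptions: the helper names are hypothetical, matching is a naive prefix match, and the real engine combines this with the other ranking criteria.

```python
records = [
    {"name": "John Doe", "company": "Acme"},
    {"name": "Jane Dawson", "company": "John & Bill"},
]


def attribute_priorities(searchable_attributes):
    """Map each attribute to its priority (0 = most important).

    Attributes written on the same comma-separated line share a priority.
    """
    priorities = {}
    for priority, line in enumerate(searchable_attributes):
        for attribute in line.split(","):
            priorities[attribute.strip()] = priority
    return priorities


def best_priority(record, query_word, priorities):
    """Priority of the most important attribute that matches, or None if none does."""
    matched = [
        priority
        for attribute, priority in priorities.items()
        if any(w.startswith(query_word.lower()) for w in record[attribute].lower().split())
    ]
    return min(matched, default=None)


def rank(query_word, searchable_attributes):
    """Rank matching records by the priority of their best matching attribute."""
    priorities = attribute_priorities(searchable_attributes)
    scored = [(best_priority(r, query_word, priorities), r) for r in records]
    hits = [(p, r) for p, r in scored if p is not None]
    hits.sort(key=lambda pair: pair[0])  # stable: tied records keep their order
    return [r["name"] for _, r in hits]
```

Here rank("john", ["name", "company"]) puts John Doe first, rank("john", ["company", "name"]) puts Jane Dawson first, and rank("john", ["name,company"]) treats both matches as equally important.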

Ordered/Unordered

Each attribute also has an additional setting: Ordered or Unordered. With Ordered, words matched at the beginning of an attribute are considered more important than words matched further into it.

For instance, for the query iphone, the object iPhone 5 will be ranked higher than Case for iPhone, because the word is in the first position of the attribute instead of the third.
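
As a rough illustration of Ordered, the sketch below ranks the two example objects by the position of the first matching word. The first_match_position helper is hypothetical and only models the idea; it assumes every ranked value actually contains a match.

```python
def first_match_position(attribute_value, query_word):
    """Zero-based position of the first word that starts with query_word, or None."""
    for position, word in enumerate(attribute_value.lower().split()):
        if word.startswith(query_word.lower()):
            return position
    return None


names = ["Case for iPhone", "iPhone 5"]

# With Ordered, an earlier match position ranks higher.
ranked = sorted(names, key=lambda name: first_match_position(name, "iphone"))
```

The word iphone is at position 0 in iPhone 5 and at position 2 in Case for iPhone, so iPhone 5 comes first.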

Custom Ranking

To return the best results, Algolia uses all the information available: the information sent by the users but also the information that you can give us about your records. This way, we can combine the textual relevance (based on what is given by the end-user) and the business relevance (based on the metrics you provide).

To communicate your business metrics to the engine, you can set them in the Custom Ranking. You can put any type of numerical or boolean value that represents the popularity/importance of your records.

Note: Check that the attributes used in Custom Ranking are stored as numbers or booleans, not as strings: string values would rank the objects alphabetically.

A Custom Ranking value can be a raw value like the number of sales, views or likes. It can also be a computed value such as a popularity score that you calculated on your side.

Let’s take an example:

  {
    "name": "iPhone 4",
    "units_sold": 20
  },
  {
    "name": "iPhone 5",
    "units_sold": 10
  },
  {
    "name": "iPhone 6",
    "units_sold": 200
  }

If we use the units_sold attribute in our Custom Ranking with a descending sort, and type the query “iPhone”, we’ll get the following results: iPhone 6 will be first, followed by iPhone 4 and iPhone 5.

You can decide whether you want the sort to be descending (bigger values appear first in the results) or ascending (smaller values appear first in the results).
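
The example above can be reproduced with a plain sort. This is only a sketch of the idea; apply_custom_ranking is a hypothetical helper, and it assumes all three records are otherwise tied for the query.

```python
products = [
    {"name": "iPhone 4", "units_sold": 20},
    {"name": "iPhone 5", "units_sold": 10},
    {"name": "iPhone 6", "units_sold": 200},
]


def apply_custom_ranking(records, attribute, descending=True):
    """Sort records on one numeric attribute, descending by default."""
    return sorted(records, key=lambda r: r[attribute], reverse=descending)


descending_names = [p["name"] for p in apply_custom_ranking(products, "units_sold")]
ascending_names = [
    p["name"] for p in apply_custom_ranking(products, "units_sold", descending=False)
]
```

The descending sort gives iPhone 6, iPhone 4, iPhone 5; the ascending sort gives the reverse.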

Note: You can set multiple attributes in the Custom Ranking. To see how they combine, we first need to understand how the tie-breaking algorithm works. That’s the focus of the next section.

Ranking formula: A tie-breaking algorithm

Most search engines use a coefficient-based approach and rank results based on a unique float value that is hard, if not impossible, to decipher. To improve the relevance of the engine, we built a tie-breaking algorithm. Here’s how it works:

  1. All the matching records are sorted according to the first criterion.
  2. If any records are tied, those records are then sorted according to the second criterion.
  3. If there are still records that are tied, those are then sorted according to the third criterion.
  4. And so on, until each record in the search results has a distinct position.
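
The steps above can be sketched with a sort on a tuple of criteria, since tuples compare element by element and only look at the next element on a tie. The record fields and the tie_break_sort helper below are hypothetical and chosen purely for illustration.

```python
def tie_break_sort(records, criteria):
    """Sort by the first criterion, breaking ties with the following ones in order.

    Each criterion is a function returning a value where lower means better.
    """
    return sorted(records, key=lambda record: tuple(c(record) for c in criteria))


hits = [
    {"id": "a", "typos": 1, "proximity": 1, "likes": 500},
    {"id": "b", "typos": 0, "proximity": 2, "likes": 10},
    {"id": "c", "typos": 0, "proximity": 1, "likes": 3},
    {"id": "d", "typos": 0, "proximity": 1, "likes": 40},
]

criteria = [
    lambda h: h["typos"],      # fewer typos first
    lambda h: h["proximity"],  # then closer words first
    lambda h: -h["likes"],     # then more likes first
]

ranked_ids = [h["id"] for h in tie_break_sort(hits, criteria)]
```

Record a has by far the most likes but still ranks last, because likes are only consulted between records tied on typos and proximity. That is the property that keeps business relevance from overriding textual relevance.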

You can modify the Ranking Formula, but we recommend keeping the default: from what we’ve seen, it works well for the majority of use cases.

Our default ranking formula uses 7 ranking criteria. The first 6 criteria are related to textual relevance: the number of typos, the number of matching query words, the proximity between words, and so on.

If two objects are still tied after going through all the text-based criteria, then we use the last criterion, Custom (reflecting the customRanking that you have defined), to rank the results.

Here’s how our 7 criteria work:

Typo

Algolia can retrieve the words searched by the user even if a typing mistake was made. By default, we’ll match words that have 0, 1 or 2 typos. The Typo criterion in the ranking formula makes sure that a match without typos is ranked higher than a match with 1 typo, which in turn is ranked higher than a match with 2 typos.

Geo (if applicable)

If you’re using our geo-search feature, this criterion ranks the results by distance, from the closest to the furthest. The precision of this ranking is set by the aroundPrecision parameter. For example, with aroundPrecision=100, two results up to 100 meters apart will be considered equal.
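
One simple way to picture this is to group distances into fixed ranges and treat every result in the same range as tied. Note that the fixed-range grouping is an assumption of this sketch, not something the text above confirms, and geo_bucket is a hypothetical helper.

```python
def geo_bucket(distance_m, around_precision_m):
    """Range index for a distance; results in the same range are considered tied."""
    return distance_m // around_precision_m


hits = [
    {"name": "far", "distance_m": 120},
    {"name": "near_b", "distance_m": 80},
    {"name": "near_a", "distance_m": 30},
]

# Sort on the range only: within a range, the original order is kept
# so that the next ranking criteria can decide.
ranked_names = [
    h["name"] for h in sorted(hits, key=lambda h: geo_bucket(h["distance_m"], 100))
]
```

With a precision of 100, the results at 30 and 80 meters fall in the same range and stay tied, while the result at 120 meters ranks after them.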

Words (if applicable)

By default, Algolia discards all results that don’t contain all the words of the query. But if you declared some words as optional, this criterion ranks the results by the number of query words that matched.

This criterion does not count the number of times a word appears in the record. It only counts the number of words typed by the user that matched: if the user typed 2 words, the maximal score for this criterion is 2, even if a record contains one of these words 10 times.
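
A minimal sketch of that counting rule, assuming exact word matches only (no typos or prefixes); words_score is a hypothetical helper.

```python
def words_score(query, record_text):
    """Number of distinct query words found in the record, ignoring repetitions."""
    query_words = set(query.lower().split())
    record_words = set(record_text.lower().split())
    return len(query_words & record_words)
```

For the query "red shoes", a record containing "red red red red shoes" scores 2, the same as a record containing each word once, while "red hat" scores 1.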

Proximity

For a query that contains two or more words, how physically near are those words in the matching record? This criterion will rank higher the objects that have the words closer to each other (George Clooney is better than George word Clooney).
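
For a two-word query, the idea can be sketched as the smallest gap, in word positions, between the two words. This is an illustration only: the proximity helper is hypothetical, it matches whole words exactly, and it assumes both words occur in the text.

```python
def proximity(first, second, text):
    """Smallest distance, in word positions, between two query words in text.

    Raises ValueError if either word is missing from the text.
    """
    words = text.lower().split()
    first_positions = [i for i, w in enumerate(words) if w == first.lower()]
    second_positions = [i for i, w in enumerate(words) if w == second.lower()]
    return min(abs(i - j) for i in first_positions for j in second_positions)
```

Here "George Clooney" has a proximity of 1 and "George word Clooney" a proximity of 2, so the first ranks higher.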

Attribute

This criterion is the one taking into account the settings that you have selected in Attributes to Index:

  • The order of the attributes is used to rank higher the objects that matched in an attribute placed higher in the list of indexed attributes
  • If you have selected Ordered, we’ll rank higher the objects whose matching words are at the beginning of a given attribute.

Exact

Records with words (not just prefixes) that exactly match the query terms are ranked higher.

By default, with single word queries, the query word needs to match the entire attribute in order to have the exact ranking criterion set to 1. This can be changed by using the exactOnSingleWordQuery parameter.

Custom

This criterion is the one taking into account the settings that you have selected in Custom Ranking. If you have multiple attributes in your Custom Ranking, the behavior will be the same as for the rest of the Ranking Formula: we’ll only look at a criterion to refine the ranking when there is a tie on all the previous criteria.

For example, say your Custom Ranking lists featured first and number_of_likes second, both in descending order, with featured being either true or false and number_of_likes being a numerical value.

The objects that have the same ranking after the first 6 criteria will then be ranked among themselves this way:

  1. Featured objects, ranked from the most to the least liked
  2. Not featured objects, ranked from the most to the least liked
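
The same ordering can be sketched with a two-part sort key. The records below are made up for illustration and are assumed to be tied on the first 6 criteria.

```python
records = [
    {"name": "A", "featured": False, "number_of_likes": 900},
    {"name": "B", "featured": True, "number_of_likes": 5},
    {"name": "C", "featured": True, "number_of_likes": 70},
    {"name": "D", "featured": False, "number_of_likes": 12},
]

# Featured first (False sorts before True, hence the "not"),
# then most liked first within each group.
ranked = sorted(records, key=lambda r: (not r["featured"], -r["number_of_likes"]))
ranked_names = [r["name"] for r in ranked]
```

Featured objects C and B come first, ordered by likes, followed by A and D. Record A has the most likes overall but cannot pass a featured object, because number_of_likes is only used to break ties on featured.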

Combination of all criteria

When we apply the tie-breaking algorithm on the 7 criteria, the matching results will be sorted in the following order:

- 0 typos
   - 0 typos && MAX(words)
     - 0 typos && MAX(words) && MIN(proximity)
       - 0 typos && MAX(words) && MIN(proximity) && MAX(attribute)
         - 0 typos && MAX(words) && MIN(proximity) && MAX(attribute) && MAX(exact)
           - 0 typos && MAX(words) && MIN(proximity) && MAX(attribute) && MAX(exact) && MAX(custom)
           - 0 typos && MAX(words) && MIN(proximity) && MAX(attribute) && MAX(exact) && MAX(custom)-1
           - 0 typos && MAX(words) && MIN(proximity) && MAX(attribute) && MAX(exact) && MAX(custom)-2
           - ...
         - 0 typos && MAX(words) && MIN(proximity) && MAX(attribute) && MAX(exact)-1
         - 0 typos && MAX(words) && MIN(proximity) && MAX(attribute) && MAX(exact)-2
         - ...
       - 0 typos && MAX(words) && MIN(proximity) && MAX(attribute)-1
       - 0 typos && MAX(words) && MIN(proximity) && MAX(attribute)-2
       - ...
     - 0 typos && MAX(words) && MIN(proximity)+1
     - 0 typos && MAX(words) && MIN(proximity)+2
     - ...
   - 0 typos && MAX(words)-1 matching words
   - 0 typos && MAX(words)-2 matching words
   - ...
- 1 typo
   - ...
- 2 typos
   - ...

Troubleshoot

Algolia provides a way to understand why an object is ranked the way it is, to help you troubleshoot the relevance settings. This can be done directly in the Dashboard or via the API.

Via the dashboard

If you go to your dashboard and run a search, you’ll see a “Ranking Info” section that details how Algolia ranked the object.

If you look at the second hit, you’ll see the difference between this object and the one above it.

Via the API

The ranking information can be retrieved via the API. For that, you need to set the parameter getRankingInfo=1.
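
As a rough sketch, the request could be prepared like this using only the Python standard library. Everything here other than getRankingInfo is an assumption of the example: the application ID, API key and index name are placeholders, and the host name, endpoint path and header names should be checked against the API reference before use. The request is built but not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

APP_ID = "YourApplicationID"      # placeholder
API_KEY = "YourSearchOnlyAPIKey"  # placeholder
INDEX_NAME = "contacts"           # hypothetical index name


def build_search_request(query):
    """Build (without sending) a search request that asks for ranking information."""
    params = urlencode({"query": query, "getRankingInfo": 1})
    # Assumed endpoint layout; verify against the API reference.
    url = f"https://{APP_ID}-dsn.algolia.net/1/indexes/{INDEX_NAME}?{params}"
    headers = {
        "X-Algolia-Application-Id": APP_ID,
        "X-Algolia-API-Key": API_KEY,
    }
    return Request(url, headers=headers)


request = build_search_request("john")
```

Sending the request, for example with urllib.request.urlopen, requires real credentials, which is why the sketch stops at building it.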
