Algolia’s ranking strategy leverages our tie-breaking algorithm, which handles challenging search problems such as typos, geo-located results, exact matches and more, in combination with both the textual and the business-relevant attributes of your data.
Using our proprietary algorithm in combination with your data attributes, Algolia creates a unique relevancy and ranking that returns relevant results from the first keystroke of a user’s query.
Most search engines use a coefficient-based approach and rank results based on a unique float value that is hard, if not impossible, to decipher.
To improve relevance, Algolia built a tie-breaking algorithm. Here’s how it works:
- All the matching records are sorted according to the first criterion.
- If any records are tied, those records are then sorted according to the second criterion.
- If there are still records that are tied, those are then sorted according to the third criterion and so on, until each record in the search results has a distinct position.
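The steps above amount to a lexicographic sort. Here is a minimal sketch, assuming each criterion can be expressed as a function that returns a sortable value where lower is better. The record fields and the `tie_break_sort` helper are hypothetical and only illustrate the idea; this is not Algolia's implementation.

```python
def tie_break_sort(records, criteria):
    """Sort by the first criterion, breaking ties with each following one."""
    # Tuples compare element by element, so a later criterion is only
    # consulted when every earlier criterion is tied.
    return sorted(records, key=lambda r: tuple(c(r) for c in criteria))


# Hypothetical records: a typo count and a distance in meters.
records = [
    {"name": "A", "typos": 1, "distance": 10},
    {"name": "B", "typos": 0, "distance": 50},
    {"name": "C", "typos": 0, "distance": 20},
]
criteria = [lambda r: r["typos"], lambda r: r["distance"]]

ranked = [r["name"] for r in tie_break_sort(records, criteria)]
print(ranked)  # ['C', 'B', 'A']
```

B and C tie on the first criterion, so only the second criterion decides their order. A never reaches the second criterion, even though it has the smallest distance.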
The default order of criteria
Algolia pre-defines the order of its criteria. For example, Typo comes first, Geolocation is next, and Exact word-matching comes later. There are 8 in all.
We recommend using this out-of-the-box ranking order as it works well for the vast majority of use cases. You can, however, modify the order of the criteria if necessary.
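As a rough sketch of what reordering could look like, the snippet below builds a settings payload with a modified order. It assumes the order is controlled by a `ranking` index setting whose values are the lowercase criterion names; `move_criterion` is a hypothetical helper, and sending the payload to the API is left out.

```python
# Assumed default order of the 8 criteria, written with the identifiers
# of the `ranking` index setting.
DEFAULT_RANKING = [
    "typo", "geo", "words", "filters",
    "proximity", "attribute", "exact", "custom",
]


def move_criterion(ranking, name, new_index):
    """Return a copy of `ranking` with `name` moved to `new_index`."""
    reordered = [c for c in ranking if c != name]
    reordered.insert(new_index, name)
    return reordered


# Hypothetical change: place attribute just ahead of proximity.
custom_order = move_criterion(DEFAULT_RANKING, "attribute", 4)
settings = {"ranking": custom_order}
print(settings["ranking"])
```

The default list is left untouched, so it is easy to revert to the recommended order.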
These 8 ranking criteria of our tie-breaking algorithm help us define what is both textually and business relevant.
Typo
Algolia can retrieve the records the user is searching for even if a typing mistake was made. By default, we’ll match words that have 0, 1 or 2 typos per word. This is called typo-tolerance.
The Typo criterion in the ranking formula makes sure that a record without typos is ranked higher than one with 1 typo, which in turn is ranked higher than one with 2 typos.
Geo (if applicable)
For example, with aroundPrecision=100, two results up to 100 meters apart will be considered equal.
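One way to picture this is to group distances into fixed-size ranges, so that records falling in the same range tie on the Geo criterion. The sketch below assumes that reading; the `geo_bucket` helper is hypothetical and is not a statement of how Algolia computes the criterion internally.

```python
def geo_bucket(distance_m, around_precision=1):
    """Map a distance in meters to the index of its precision range."""
    # Assumption: ranges are [0, p), [p, 2p), ... for precision p.
    return distance_m // around_precision


# With a precision of 100, these two distances fall in the same range,
# so the ranking would move on to the next criterion to separate them.
print(geo_bucket(120, 100) == geo_bucket(180, 100))  # True
# A record in a nearer range still ranks ahead.
print(geo_bucket(40, 100) < geo_bucket(120, 100))  # True
```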
Words (if applicable)
This criterion is only applicable if you are using the optionalWords setting.
By default, Algolia discards all results that don’t contain all the words of the query. But with optionalWords, where you declare some words as optional, the Words criterion will rank records by the number of words typed by the user that matched. Keep in mind that this does not count the number of times a word appears in the record, but rather the number of words typed by the user that matched.
For example, if the user typed 2 words, the maximal score for this criterion is 2 - even if a record contains one of these words 10 times.
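The counting rule can be sketched as follows. This is a simplified illustration that assumes whole-word, case-insensitive matching and ignores typos and prefixes; `words_score` is a hypothetical helper.

```python
def words_score(query, record_text):
    """Count how many distinct query words appear in the record."""
    query_words = set(query.lower().split())
    record_words = set(record_text.lower().split())
    # Set intersection ignores repetitions: a word counts at most once.
    return len(query_words & record_words)


print(words_score("george clooney", "George George George"))  # 1
print(words_score("george clooney", "George Clooney and George"))  # 2
```

Repeating "George" in the record does not raise the score; only the second query word does.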
Filters
If a query uses filters or optional filters, the Filters criterion will rank records according to a filtering score. All filters start out with a score of 1 - so records with one match will score higher than records with no match (1 > 0). Equally, records with more matches will score higher than records with fewer matches, because Algolia counts each match.
For purposes of tie-breaking, all records with the same score are ranked the same, and so the ranking formula will drop to the next criterion to break the tie.
You can adjust the scoring in 2 significant ways:
- With filter scoring, you can use variable scores, scoring some filters higher than 1. By giving a filter a score of 2 or 3 (score=2, score=3), you can favor that filter over others.
- With sumOrFiltersScores, you can accumulate the scores of disjunctive (OR) matches into a total score, so that records with a higher total rank above records with a lower total.
The Filter criterion can be quite powerful in defining relevance, as seen in the personalization example.
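The two adjustments can be sketched together. The snippet assumes that, without sumOrFiltersScores, only the highest score among the matched filters of an OR group is kept, and that scores from separate groups add up. The data layout and the `filters_score` helper are hypothetical.

```python
def filters_score(record_values, or_groups, sum_or_filters_scores=False):
    """Compute a filtering score for one record.

    or_groups: list of OR groups, each a list of (value, score) pairs.
    """
    total = 0
    for group in or_groups:
        matched = [score for value, score in group if value in record_values]
        if matched:
            # Assumption: keep the best match unless summing is enabled.
            total += sum(matched) if sum_or_filters_scores else max(matched)
    return total


# One OR group: "book" is favored with a score of 2, "ebook" keeps 1.
groups = [[("book", 2), ("ebook", 1)]]
record = {"book", "ebook"}

print(filters_score(record, groups))  # 2
print(filters_score(record, groups, sum_or_filters_scores=True))  # 3
```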
Proximity
For a query that contains two or more words, Proximity calculates how physically near those words are to each other in the matching record. Objects whose matched words are closer to each other rank higher.
For example, George Clooney is a better proximity match than George word Clooney.
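A simplified way to measure this for a two-word query is the smallest gap, in word positions, between the two words. The `proximity` helper below is a hypothetical sketch that assumes whole-word, case-insensitive matching. It does not reproduce Algolia's actual scoring.

```python
def proximity(first, second, record_text):
    """Smallest distance in word positions between two query words.

    Lower is better. Returns None if either word is missing.
    """
    words = record_text.lower().split()
    first_pos = [i for i, w in enumerate(words) if w == first.lower()]
    second_pos = [i for i, w in enumerate(words) if w == second.lower()]
    if not first_pos or not second_pos:
        return None
    return min(abs(i - j) for i in first_pos for j in second_pos)


print(proximity("george", "clooney", "George Clooney"))  # 1
print(proximity("george", "clooney", "George word Clooney"))  # 2
```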
Attribute
The Attribute criterion only looks at attributes you have placed in searchableAttributes (also referred to as attributesToIndex). Additionally, matches in attributes at the top of the searchableAttributes list will rank higher than matches in attributes further down.
The position of the matches within the attribute itself also matters. If you have selected ordered, we’ll rank higher those objects whose matched words are closer to the beginning of a given attribute. For example, words in position 2 of an attribute are ranked higher than words in position 5. If the attribute is not ordered, the position of the word is not taken into account.
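The two levels described above (which attribute matched, then where in the attribute) can be pictured as a two-part sort key. This is a sketch under simplifying assumptions: a single query word, whole-word matching, and a hypothetical `attribute_key` helper where lower keys rank higher.

```python
def attribute_key(record, query_word, searchable_attributes, ordered=True):
    """Return (attribute index, word position) for the first matching attribute."""
    for attr_index, attr in enumerate(searchable_attributes):
        words = record.get(attr, "").lower().split()
        if query_word in words:
            # When not ordered, position within the attribute is ignored.
            position = words.index(query_word) if ordered else 0
            return (attr_index, position)
    # No match in any searchable attribute: rank after everything else.
    return (len(searchable_attributes), 0)


attrs = ["title", "description"]
early = {"title": "gatsby returns"}
late = {"title": "the great gatsby"}

print(attribute_key(early, "gatsby", attrs))  # (0, 0)
print(attribute_key(late, "gatsby", attrs))  # (0, 2)
print(attribute_key(late, "gatsby", attrs, ordered=False))  # (0, 0)
```

With ordered, the record whose match is nearer the start of the title ranks first. Without it, the two records tie and the next criterion decides.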
Computing the “best-matched attribute”
For tie-breaking purposes, the ranking formula looks for the best-matched attribute.
As seen in the order of the 8 criteria, the default ranking formula puts proximity before attribute, which has a subtle but important effect on computing the best-matched attribute: attributes whose matched terms are closest in proximity to each other are ranked highest.
On the other hand, if you put proximity after attribute, or remove proximity altogether, the best-matched attribute will be the one with the most matched words.
For example, consider what happens when you put attribute above proximity. If you have two searchable attributes - profession and full-name - and the query is “jerry singer”, the best match will be determined by the order of the searchableAttributes (profession first, then full-name). The query “jerry singer” will therefore be ranked by profession (“singer”), not full-name (“jerry singer”).
But as mentioned, the default for Algolia is to rank proximity first, before attribute, which changes the result: the attribute that contains the 2 words “jerry” and “singer” in closest proximity will be ranked higher. So here, full-name will be ranked higher than profession.
Subtle. We recommend keeping the proximity criterion before the attribute criterion. Proximity usually leads to a better identification of the best-matched attribute.
Exact
Records with words (not just prefixes) that exactly match the query terms are ranked higher.
Custom
This criterion takes into account the settings that you have selected in Custom Ranking.
If you have multiple attributes in your Custom Ranking, the behavior will be the same as for the rest of the Ranking Formula: we’ll only look at a criterion to refine the ranking when there is a tie on all the previous criteria.
For example, if your Custom Ranking sorts first by featured and then by number_of_likes, with featured indicating whether an object is featured and number_of_likes being a numerical value, then the tie-breaker for objects with the same ranking after the preceding criteria will be as follows:
- Featured objects, ranked from the most to the least liked
- Not featured objects, ranked from the most to the least liked
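The resulting order can be reproduced with a two-key sort. The sketch below assumes featured is a boolean and that both attributes are sorted in descending order; the sample records are made up for illustration.

```python
# Hypothetical records that are tied on all the earlier criteria.
records = [
    {"name": "A", "featured": False, "number_of_likes": 500},
    {"name": "B", "featured": True, "number_of_likes": 10},
    {"name": "C", "featured": True, "number_of_likes": 80},
    {"name": "D", "featured": False, "number_of_likes": 5},
]

# Descending on both keys: featured first (True sorts above False),
# then most liked first within each group.
ranked = sorted(
    records,
    key=lambda r: (r["featured"], r["number_of_likes"]),
    reverse=True,
)
print([r["name"] for r in ranked])  # ['C', 'B', 'A', 'D']
```

Record A has by far the most likes, but it still ranks below both featured records because number_of_likes is only consulted when featured is tied.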