
Most search engines boil down to counting how many times a query word appears in a document. The more sophisticated ones compute a single, loosely defined floating-point score for each document. That opacity makes it hard to improve the relevance of search results or to work with any other data format.

Algolia works differently. We designed it mainly for database search, where the word-counting approach no longer works. Instead, our ranking algorithm rates each matching record on several criteria (like the typo count or geo-distance), each of which gets its own integer score. You can even apply your own criteria to model your business logic directly inside the search engine. You pick the order of the criteria; the engine then goes down the list and, at each step, uses the current criterion only to sort the results that are still tied on all previous ones.

A record’s score on each criterion is explicitly listed in the search results (see _rankingInfo below for the query “the rains”), so you can understand why one record ranks higher than another. We will explain each of these criteria in this article.

{
  "hits": [
    {
      "name": "The Rains Came",
      "url": "/title/tt0031835/",
      "rating": 6.8,
      "year": "(1939)",
      "nb_voters": 881,
      "rank": 16232,
      "objectID": "24324",
      "_highlightResult": {
        "name": {
          "value": "The Rains Came",
          "matchLevel": "full"
        },
        "year": {
          "value": "(1939)",
          "matchLevel": "none"
        }
      },
      "_rankingInfo": {
        "nbTypos": 0,
        "firstMatchedWord": 0,
        "proximityDistance": 1,
        "userScore": 2379657,
        "geoDistance": 0,
        "geoPrecision": 1,
        "nbExactWords": 2
      }
    },
    ...
  ]
}

Search-as-you-type

Before diving in, you first need to understand that Algolia searches for matching prefixes, not only matching whole words. For example, if you are searching for “Joe B”, we would consider all the following records as matches:

  • Joe Black
  • Joe Benson
  • Joe Bolick

Prefix matching is what enables us to return relevant results even when a user has only typed a single letter. When Google introduced instant search, they claimed that showing results before you finish typing can save 2-5 seconds per search.

Note: By default, when the query contains multiple terms, Algolia only uses the last term as a prefix. This is because when searching, say, for a person by name, it’s quite normal to type their entire first name but not their last (like George Cloo). Not so for the reverse (like Geo Clooney). You can override this behavior by setting queryType=prefixAll.
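To make the prefix rule concrete, here is a minimal Python sketch. It is an illustration only, not Algolia's implementation: the tokenize and matches helpers are hypothetical, typo tolerance is left out, and the mode name "prefixLast" is simply the label used here for the default behavior described above.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"\w+", text.lower())

def matches(query, record_text, query_type="prefixLast"):
    """Return True if every query term matches some word of the record.

    prefixLast: only the last query term may match as a prefix.
    prefixAll:  every query term may match as a prefix.
    Typo tolerance is deliberately left out of this sketch.
    """
    terms = tokenize(query)
    words = tokenize(record_text)
    for i, term in enumerate(terms):
        as_prefix = query_type == "prefixAll" or i == len(terms) - 1
        if as_prefix:
            found = any(word.startswith(term) for word in words)
        else:
            found = term in words
        if not found:
            return False
    return True

names = ["Joe Black", "Joe Benson", "Joe Bolick"]
print([n for n in names if matches("Joe B", n)])            # all three names
print(matches("George Cloo", "George Clooney"))              # True
print(matches("Geo Clooney", "George Clooney"))              # False
print(matches("Geo Clooney", "George Clooney", "prefixAll")) # True
```

The last two calls show the difference the note describes: “Geo” only matches “George” once every term is allowed to act as a prefix.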

Ranking algorithm criteria

By default, Algolia ranks every matching record by using the following criteria, in the order listed below. The higher a criterion is on the list, the more weight it carries in the ranking. You can easily change this order if you want, but we have found that this default order is the best one in 90% of use cases.

  1. Typos
  2. Geo-location (if applicable)
  3. Proximity
  4. Attributes
  5. Exact
  6. Custom

Let’s understand each one of these criteria by applying them to an example:

[
  {
    "objectID": 1,
    "name": "Jo Blak",
    "company": "Utility Trailer Sales",
    "nbCalls": 4
  },
  {
    "objectID": 2,
    "name": "Jo T. Black",
    "company": "Steritek Inc",
    "nbCalls": 45
  },
  {
    "objectID": 3,
    "name": "Joe Black",
    "company": "Pip Printing",
    "nbCalls": 9
  },
  {
    "objectID": 4,
    "name": "Joe Thompson",
    "company": "Black Birds inc",
    "nbCalls": 9
  },
  {
    "objectID": 5,
    "name": "Deanna Gerbi",
    "company": "Thompson, Joey & Blackburn ltd",
    "nbCalls": 7
  }
]

1. Typos

Are there words that start (that is, are prefixed) with a term typed by the user? And if so, do they match the query exactly?

  • 0 points means there are prefixes that exactly match all the terms in the query.
  • 1 point means there is a 1-character discrepancy between the matching prefixes and the query terms.
  • 2 points means there is a 2-character discrepancy, and so on.

Example: for the query “joe black”, here is how each result would rank for typos only (“joey” counts as a typo because only the last word of the query is matched as a prefix):

Rank  Record    Score  Why
1     record 3  0      joe black
1     record 4  0      joe thompson black birds inc
2     record 5  1      thompson, joe_ & blackburn ltd
2     record 2  1      jo_ t. black
3     record 1  2      jo_ bla_k

 

Note: By default, Algolia accepts 1 typo for words of at least 3 characters and 2 typos for words of at least 7 characters (you can configure this with the minWordSizefor1Typo and minWordSizefor2Typos query parameters). This means that the query “ab” only matches words starting with “ab”, while the query “abc” matches words starting with “abc” but also “aba”, “abb”, “aac”, etc. A typo is an insertion, deletion, or substitution of a single character, or a transposition of two adjacent characters (Damerau–Levenshtein distance). Because it is extremely unusual to mistype the first character of a word, a typo on the first character counts for 2 points instead of 1.
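The typo definition above can be sketched in Python as follows. This is a simplified illustration, not the engine's code: typo_distance implements the restricted Damerau–Levenshtein variant (optimal string alignment) on whole words, allowed_typos applies the word-length thresholds quoted in the note, and neither the prefix handling nor the extra point for a first-character typo is modeled. Both function names are hypothetical.

```python
def typo_distance(a, b):
    """Edits (insert, delete, substitute, adjacent swap) needed to turn a into b."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all characters of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all characters of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Transposition of two adjacent characters.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]

def allowed_typos(word, min_for_1=3, min_for_2=7):
    """Typos tolerated for a query word, using the thresholds stated in the note."""
    if len(word) >= min_for_2:
        return 2
    if len(word) >= min_for_1:
        return 1
    return 0

print(typo_distance("joe", "jo"))      # 1 (one missing character)
print(typo_distance("black", "blak"))  # 1
print(typo_distance("ab", "ba"))       # 1 (adjacent transposition)
print(allowed_typos("ab"), allowed_typos("abc"), allowed_typos("abcdefg"))  # 0 1 2
```

For record 1 (“Jo Blak”), the two word-level distances add up to the score of 2 shown in the table.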

2. Geo-location (if applicable)

How far is the record’s physical location from a central, predefined point? This criterion lets us rank by geoDistance score, which is that distance in meters.

We can even use the aroundPrecision query parameter to consider similar results as equal (for example, we can set this parameter to 10 so that a result 100 meters from our central point and a result 109 meters from our central point will be considered equally relevant).
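One way to picture this grouping is as bucketing distances by the precision value. The Python sketch below assumes integer division is how distances are grouped, which is an assumption made for illustration; the precision_bucket helper is hypothetical.

```python
def precision_bucket(distance_m, around_precision):
    """Group distances into buckets of around_precision meters (assumed model)."""
    return distance_m // around_precision

# With a precision of 10, results at 100 m and 109 m land in the same bucket,
# so they are treated as equally relevant for this criterion.
print(precision_bucket(100, 10) == precision_bucket(109, 10))  # True
print(precision_bucket(100, 10) == precision_bucket(110, 10))  # False
```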

We don’t use geo-location in our example, but you can find a dedicated guide in our documentation.

3. Proximity

For a query that contains two or more words, how physically near are those words in the matching record? Algolia adds 1 point for each word in between query words, with a maximum of 8 points.

  • 0 points means no proximity: there was only one word in the query
  • 1 point means the best possible match: the words are next to each other
  • 2 points means there is one word between the matched query words
  • and so on.

When words are in different attributes, they automatically get the maximum of 8 points per new attribute. So if three query words are in three different attributes, the score is 16. If three words are in two different attributes, the score is 8.

In our example, we have a 3-way tie between records 1, 3 and 5 (‘&’ is considered as a separator and is not taken into account). Record 2 has a word in between the matched query words (Jo T. Black), while record 4 matches in two different attributes:

Rank  Record    Score
1     record 1  1
1     record 3  1
1     record 5  1
2     record 2  2
3     record 4  8
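For a two-word query such as “joe black”, the scores in this table can be reproduced with a short Python sketch. It assumes the matched words have already been located and are passed in as (attribute index, word index) pairs; the function name and that input format are hypothetical, and queries with more than two words are not covered.

```python
MAX_PROXIMITY = 8

def proximity_two_words(first, second):
    """Proximity score for two matched query words.

    Each argument is an (attribute_index, word_index) pair, both counted
    from 0. Separators such as "&" are assumed not to count as words.
    """
    if first[0] != second[0]:
        # Matches in different attributes get the maximum score.
        return MAX_PROXIMITY
    return min(abs(second[1] - first[1]), MAX_PROXIMITY)

print(proximity_two_words((0, 0), (0, 1)))  # record 3, "Joe Black": 1
print(proximity_two_words((0, 0), (0, 2)))  # record 2, "Jo T. Black": 2
print(proximity_two_words((0, 0), (1, 0)))  # record 4, name + company: 8
print(proximity_two_words((1, 1), (1, 2)))  # record 5, "Joey & Blackburn": 1
```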

4. Attributes

This is the order of the attributes that Algolia will follow to search inside a record. Records with a match in the first listed attribute rank higher (that is, get fewer points) than records with a match in an attribute that’s lower on the list. The exact number of points is determined by the position of the first matching word in the attribute.

In our example, say we consider the name as more important than the company. We would then use the setting attributesToIndex:["name", "company"] to indicate that we want to index, or search in, the attributes “name”, and then “company”, in this order of importance.

Lastly, matching text at the beginning of a given attribute will be considered more important than matching text further in this attribute. You can disable this behavior by wrapping the attribute name in unordered(AttributeName). If we considered the position of the match not relevant for the attribute “company”, we would use the setting attributesToIndex:["name", "unordered(company)"].

With “a” being the position of the first matched attribute in the list, and “w” being the position of the first matched word within that attribute (both counted from 1), the number of points a result gets is determined by the formula: ((a - 1) * 1000) + w - 1. This ranks the results by the priority of the attribute they match, and if they tie, by the location of the match within the attribute.

Rank  Record    Score
1     record 1  0
1     record 2  0
1     record 3  0
1     record 4  0
2     record 5  1001
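The attribute formula is simple enough to check by hand. As a small Python sketch (the function name is hypothetical, and both positions are assumed to be counted from 1, as the example scores imply):

```python
def attribute_score(a, w):
    """Points for a first match in attribute number a, at word number w (both 1-based)."""
    return (a - 1) * 1000 + (w - 1)

# Records 1 to 4 match on the first word of "name", the first attribute.
print(attribute_score(1, 1))  # 0
# Record 5 first matches on "Joey", the second word of "company", the second attribute.
print(attribute_score(2, 2))  # 1001
```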

5. Exact

Records with words (not just prefixes) that exactly match the query terms rank higher. A record gets 1 point for every word that is exactly matched, so for this criterion more points are better.

Here is how our records would rank based on exact-matching alone for the query “joe black”:

Rank  Record    Score  Why
1     record 3  2      joe black
1     record 4  2      joe thompson black birds inc
2     record 2  1      jo t. black
3     record 1  0
3     record 5  0
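These exact-match scores can be reproduced with a few lines of Python. The sketch below simply counts the query words that appear as whole words in the searched attributes; the helper names are hypothetical and the tokenization (lowercase alphanumeric words) is an assumption.

```python
import re

records = [
    {"objectID": 1, "name": "Jo Blak", "company": "Utility Trailer Sales"},
    {"objectID": 2, "name": "Jo T. Black", "company": "Steritek Inc"},
    {"objectID": 3, "name": "Joe Black", "company": "Pip Printing"},
    {"objectID": 4, "name": "Joe Thompson", "company": "Black Birds inc"},
    {"objectID": 5, "name": "Deanna Gerbi", "company": "Thompson, Joey & Blackburn ltd"},
]

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"\w+", text.lower())

def exact_score(query, record, attributes=("name", "company")):
    """Count the query words that appear as whole words in the record."""
    words = set()
    for attribute in attributes:
        words.update(tokenize(record[attribute]))
    return sum(1 for term in tokenize(query) if term in words)

for record in records:
    print(record["objectID"], exact_score("joe black", record))
```

“Joey” and “Blackburn” in record 5 only match as prefixes, so they contribute nothing here.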

 

6. Custom / Business metrics

By this point, we’ve figured out how relevant a result is given the user’s query. But now, you can specify additional criteria, like custom business metrics that express a record’s popularity. 

With other search engines, you have to choose between sorting the results according to their relevance to the user’s query or according to their popularity (number of visits, ratings, sales, etc), but you can’t do both. This means users may get results that are outrageously popular, but completely irrelevant to their search. With Algolia, you can integrate popularity (or anything else, like population, or the last date of update) into the relevance calculation. To us, it is just an additional criterion, so it will support – rather than outweigh – classic relevance criteria.

In our example, we may consider people with whom we had many calls more popular than others. People with the same number of calls can simply be sorted alphabetically. We would then use the setting: customRanking:["desc(nbCalls)", "asc(name)"]

For this criterion alone, here’s how our example records rank:

Rank  Record    Score  Why
1     record 2  4      nbCalls=45
2     record 3  3      nbCalls=9, “Joe B” < “Joe T”
3     record 4  2      nbCalls=9
4     record 5  1      nbCalls=7
5     record 1  0      nbCalls=4

The score is actually the order of entries in the index (biggest score being first). There are never equal scores for this criterion. Therefore, custom should always be the last criterion of your ranking, as no subsequent criterion would ever be checked.
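As a rough Python sketch of this idea, the records can be sorted once by descending nbCalls and then ascending name, and each record's score is its position counted from the end. The variable names are hypothetical, and this only mirrors the example's setting rather than the engine's internals.

```python
records = [
    {"objectID": 1, "name": "Jo Blak", "nbCalls": 4},
    {"objectID": 2, "name": "Jo T. Black", "nbCalls": 45},
    {"objectID": 3, "name": "Joe Black", "nbCalls": 9},
    {"objectID": 4, "name": "Joe Thompson", "nbCalls": 9},
    {"objectID": 5, "name": "Deanna Gerbi", "nbCalls": 7},
]

# Equivalent of desc(nbCalls), then asc(name) to break ties.
ordered = sorted(records, key=lambda r: (-r["nbCalls"], r["name"]))

# The score is the position in that order, with the biggest score first.
custom_scores = {
    r["objectID"]: len(ordered) - 1 - position
    for position, r in enumerate(ordered)
}
print(custom_scores)  # {2: 4, 3: 3, 4: 2, 5: 1, 1: 0}
```

Because the order is total, no two records share a score, which is why nothing placed after this criterion would ever be consulted.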

Note: Custom ranking is computed at index time (for performance reasons) and cannot be changed dynamically with each query. If you need to change the ranking depending on context, you need to create one index per desired ranking formula. We recommend using the primary/replica feature to make it easier to keep several indices in sync. You only need to push your updates to the primary and they are automatically replicated to the replica indices (see the replica parameter in index settings).

Determining the overall rank

So what is the exact ranking of our query “joe black”?

1. Typos: After looking at typos, we can already rank record 1 as last. Since records 3 and 4, as well as records 2 and 5, are tied, we need to compare them on the next criterion.

          Typos
Record 3  0
Record 4  0
Record 2  1
Record 5  1
Record 1  2

2. Geo: Not applicable. All records have a score of 0. Next!

          Typos  Geo-distance
Record 3  0      0
Record 4  0      0
Record 2  1      0
Record 5  1      0
Record 1  2

3. Proximity: Record 4 matches in two distinct attributes and thus ranks below record 3. Record 2 has a word (T.) between query terms and thus ranks below record 5.

          Typos  Geo-distance  Proximity
Record 3  0      0             1
Record 4  0      0             8
Record 5  1      0             1
Record 2  1      0             2
Record 1  2

Since each record now has a distinct rank, in this example there’s no need to go through another round and compare scores for the Attributes, Exact and Custom criteria.
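This tie-breaking process is equivalent to sorting records by a tuple of their scores, taken in criterion order. Here is a minimal Python sketch using the scores from the tables above; the data layout and the rank helper are hypothetical, and record 1's geo score is filled in as 0 since geo-location is not used in the example.

```python
# Scores for the query "joe black", taken from the tables above.
scores = {
    1: {"typos": 2, "geo": 0, "proximity": 1},
    2: {"typos": 1, "geo": 0, "proximity": 2},
    3: {"typos": 0, "geo": 0, "proximity": 1},
    4: {"typos": 0, "geo": 0, "proximity": 8},
    5: {"typos": 1, "geo": 0, "proximity": 1},
}

CRITERIA = ["typos", "geo", "proximity"]  # lower is better for all three

def rank(scores, criteria):
    """Sort record IDs by their scores, comparing criteria in the given order.

    Tuples compare element by element, so a later criterion only matters
    when all earlier ones are tied.
    """
    return sorted(scores, key=lambda rid: tuple(scores[rid][c] for c in criteria))

print(rank(scores, CRITERIA))  # [3, 4, 5, 2, 1]
```

Changing the order of CRITERIA changes which criterion dominates, which is what reordering the ranking criteria amounts to in this simplified model.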

A second example

Before jumping to our conclusion, let’s now look at what would be the result for the query composed of the single character ‘j’:

          Typo  Geo-distance  Proximity  Attributes  Exact  Custom
Record 2  0     0             0          0           0      4
Record 3  0     0             0          0           0      3
Record 4  0     0             0          0           0      2
Record 1  0     0             0          0           0      1
Record 5  0     0             0          1001

With such a simple query, we obtain a 4-way tie before checking the custom score, which finally ranks record 2 first because of the large number of calls it received.

Isn’t that cool? Algolia is examining already-relevant results and getting closer to the mark each time the user types an additional letter, all thanks to prefix-matching and this flexible, customizable algorithm. And that’s just the out-of-the-box configuration! If you’d like to take this a step further, check out this article here.

About the author
Nicolas Dessaigne

Co-founder & board member at Algolia
