4 questions to ask for relevant site search results

Relevance – it’s what we’re all going for with our search implementations, but it’s so subjective that it’s nearly impossible to nail down. How in the world are we supposed to optimize our search index to get the most relevant (read: converting and revenue-creating) results to our users?

Way back in 2016, we wrote an article with 10 tips to achieve highly relevant search results. We already had a lot of experience and 1500 customers loving our search toolset, but we’ve grown a lot since then: our satisfied customer base has multiplied by 11 and we’ve become industry leaders. All that experience comes with new lessons, so we wanted to rewrite this guide as a simpler, more straightforward set of four questions to ask yourself to improve the relevance of your search results and rake in all the benefits that come with it.

What does relevance mean to you?

The biggest key to relevance in search is to remove the subjectivity. It’s impossible for us to optimize the algorithm for every single use case, but we give you the tools to optimize it for yours. To make the best use of those tools, though, you have to figure out what exactly makes a result relevant in your application.

For example, if you’re working with a database of movies, it’d definitely make sense to have simple searchable terms split out as their own attributes in the dataset (like the movie’s title, director, year, an array of lead actors, genre, and other stuff people might search for). But you could get away with a lot of the data unformatted in a “description” text field:

{
    "title": "Star Wars: Episode IV - A New Hope",
    "alternative_titles": [
         "Star Wars",
         "Star Wars: Episódio IV - Uma Nova Esperança",
         "Star Wars: Épisode IV - Un nouvel espoir",
         …
    ],
    "genre": [
         "Adventure",
         "Fantasy",
         "Science Fiction"
    ],
    "objectID": "440309800",
    "actors": [
         "Mark Hamill",
         "Carrie Fisher",
         "Harrison Ford",
         "Alec Guinness",
         …
    ],
    "director": "George Lucas",
    "year": 1977,
    "description": "Luke Skywalker joins forces with a Jedi Knight, a cocky pilot, a Wookiee and two droids to save the galaxy from the Empire's world-destroying battle station, while also attempting to rescue Princess Leia from the mysterious Darth Vader."
}

It’s not totally necessary to break up your description into a list of characters or tags. You’re still going to need to have the description around somewhere to eventually display with the search result, so adding super fine-grained attributes is just going to duplicate that data, take up more storage space, and not really help you much. Plus, it could actually decrease relevance: Luke Skywalker would be a far more popular result than any actor named Luke, so if I’m actually searching for an actor, I’d get results from Star Wars films crowding out what I’m probably looking for.

That logic doesn’t hold up in e-commerce, though. If you’ve got a product database where users could search for nearly any product attribute, your search index has to be very fine-grained. Every piece of data that can be split out as a new searchable or facetable attribute gets its own spot, and then you can order them by importance to your application.
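To make "order them by importance" concrete, here is a minimal sketch of attribute priority. It is a toy illustration, not Algolia's ranking algorithm: the priority order, the helper names, and the second record are assumptions made up for the example. In a real index, this ordering is what the `searchableAttributes` setting expresses.

```python
# Toy illustration only: this is NOT Algolia's ranking algorithm.
# Attribute names come from the movie record above; the priority order
# and helper names are assumptions made for this example.
SEARCHABLE_ATTRIBUTES = ["title", "actors", "director", "description"]


def attribute_rank(record: dict, query: str) -> int:
    """Return the index of the most important attribute containing the query.

    Lower is better; records with no match get len(SEARCHABLE_ATTRIBUTES).
    """
    needle = query.lower()
    for rank, attribute in enumerate(SEARCHABLE_ATTRIBUTES):
        value = record.get(attribute, "")
        # Array attributes (like "actors") are flattened into one string.
        text = " ".join(value) if isinstance(value, list) else str(value)
        if needle in text.lower():
            return rank
    return len(SEARCHABLE_ATTRIBUTES)


def search(records: list, query: str) -> list:
    """Keep matching records, best-matching attribute first."""
    ranked = [(attribute_rank(r, query), r) for r in records]
    matches = [pair for pair in ranked if pair[0] < len(SEARCHABLE_ATTRIBUTES)]
    return [r for _, r in sorted(matches, key=lambda pair: pair[0])]


records = [
    {
        "title": "Star Wars: Episode IV - A New Hope",
        "actors": ["Mark Hamill", "Carrie Fisher", "Harrison Ford"],
        "director": "George Lucas",
        "description": "Luke Skywalker joins forces with a Jedi Knight",
    },
    {
        # Hypothetical record, invented for the example.
        "title": "An Example Film",
        "actors": ["Luke Example"],
        "director": "Jane Placeholder",
        "description": "A placeholder description.",
    },
]

for record in search(records, "luke"):
    print(record["title"])
```

Because "actors" outranks "description" here, the film with an actor named Luke is listed before the Star Wars record, whose only match is in its description.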

Takeaway: Figure out exactly what makes a result “relevant” in your application, and structure your search index data around that.

Do I really need to mess with Algolia’s defaults?

The answer is no, not unless you really know what you’re doing. Lots of research and development (and validation by 17,000 customers with production implementations) has gone into those defaults, so you’d really only need to change them in very specific use cases. In years past, before these values were optimized, instructional articles and general advice more commonly suggested changing these defaults. Much of that content still exists on the Internet today, so let’s look at a few pieces of Algolia’s algorithm to set you off on the right foot if you do actually have to change some things.

Typo tolerance

If your record contains “iphone”, you should be able to find it via “ipjone” or “iphoen”. The Algolia engine handles this automatically. If you loosen the typo tolerance further, you’ll get less relevant results popping up, since a couple of substitutions, additions, and deletions could get you all the way to a completely different, perfectly valid query (just think of how often your phone’s hyperactive spellcheck changes a correctly spelled word to something else entirely). On the other hand, turning it down could mean penalizing your users for not spelling their queries perfectly, and that’s no fun for anyone (especially if they’re searching for brand names or other words we don’t need to spell often).
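The trade-off above can be sketched with a simple edit-distance function. This is an illustration under assumptions, not the engine's actual implementation: it uses the optimal string alignment distance (substitutions, insertions, deletions, and adjacent swaps), and the vocabulary and `max_typos` parameter are hypothetical.

```python
def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance between a and b.

    Counts substitutions, insertions, deletions, and swaps of two
    adjacent characters as one edit each.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution (or match)
            )
            # Adjacent transposition, e.g. "en" typed as "ne".
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]


def matches(query: str, vocabulary: list, max_typos: int) -> list:
    """Return the vocabulary words within max_typos edits of the query."""
    return [word for word in vocabulary if osa_distance(query, word) <= max_typos]


vocabulary = ["iphone", "phone", "ipod"]  # hypothetical index terms

print(osa_distance("iphone", "ipjone"))  # 1 (one substitution)
print(osa_distance("iphone", "iphoen"))  # 1 (one adjacent swap)
print(matches("ipjone", vocabulary, max_typos=1))  # ['iphone']
print(matches("ipjone", vocabulary, max_typos=2))  # ['iphone', 'phone']
```

Allowing just one more edit already pulls in a different word, which is the loss of relevance described above.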

Stop words and query modification

Stop words are the most commonly used words in a given language, like “the”, “of”, “to”, “be”, “or”, etc. We’ve come across a lot of developers who think that they need to remove these words to leave more space for the more meaningful words in a query that’ll contribute to finding a more relevant result. But that’s just not true. If your search engine treats them right, these words can be very useful, and removing them could make finding some results almost impossible. For example, Google “To Beta or not to Beta”. It’ll pull up detailed scientific articles on software development, astronomy, ornithology, and the economy. Strip out the stop words and the duplicate, and you’re left with just “beta”. When I Google that, I get results on the Greek letter, the motorcycle company, and Apple’s beta-testing program. Those stop words have significant value! While this might be an exaggerated example chosen specifically to highlight the phenomenon, it shows up on a smaller scale in much smaller datasets and in much less obvious situations.
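Here is a minimal sketch of what that naive stripping does to a query. The stop-word list is a deliberately tiny, hypothetical one, and the function name is made up for the example.

```python
# Hypothetical, deliberately tiny stop-word list for illustration.
STOP_WORDS = {"the", "of", "to", "be", "or", "not"}


def strip_stop_words(query: str) -> list:
    """Lowercase the query, drop stop words, and remove duplicate terms."""
    kept = []
    for word in query.lower().split():
        if word not in STOP_WORDS and word not in kept:
            kept.append(word)
    return kept


print(strip_stop_words("To Beta or not to Beta"))  # ['beta']
print(strip_stop_words("To be or not to be"))      # []
```

The second query shows the extreme case: a phrase made entirely of stop words leaves nothing to search for.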

Search as-you-type

Algolia’s libraries automatically start searching the database from the very first letter that the user types in the search box. So if you’re using InstantSearch, you get this functionality out-of-the-box. But if you’re rolling your own UI, you might be thinking that this is a waste of time to implement, or a waste of HTTP requests. There have been some creative ways of going about it (one of the most interesting I’ve seen is setting a timer after each new letter typed, and only sending the search request if the user hasn’t typed anything new after some fraction of a second), but as “elegant” as those solutions might seem, the data just doesn’t back that approach. We’ve tested it many times, and it’s become clear that any heuristic that launches the query after more than a single letter is typed leads to poor user experience and suboptimal conversion rates. It’s worth the cost of the HTTP requests.
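To see what the timer approach trades away, here is a small simulation of the two strategies. The keystroke timings and the 300 ms delay are assumptions chosen for the example, and the sketch only shows which queries get sent and when. It says nothing about the user-experience and conversion findings mentioned above.

```python
def queries_as_you_type(keystrokes: list) -> list:
    """One search per keystroke: return (send_time_ms, query) pairs."""
    sent = []
    text = ""
    for time_ms, char in keystrokes:
        text += char
        sent.append((time_ms, text))
    return sent


def queries_debounced(keystrokes: list, delay_ms: int) -> list:
    """Timer approach: search only after delay_ms without a new keystroke."""
    sent = []
    text = ""
    for index, (time_ms, char) in enumerate(keystrokes):
        text += char
        is_last = index + 1 == len(keystrokes)
        # The timer fires only if no new key arrives before it expires.
        if is_last or keystrokes[index + 1][0] - time_ms >= delay_ms:
            sent.append((time_ms + delay_ms, text))
    return sent


# Hypothetical user typing "milk" with 120 ms between keys.
keystrokes = [(0, "m"), (120, "i"), (240, "l"), (360, "k")]

print(queries_as_you_type(keystrokes))
# [(0, 'm'), (120, 'mi'), (240, 'mil'), (360, 'milk')]
print(queries_debounced(keystrokes, delay_ms=300))
# [(660, 'milk')]
```

The debounced version saves three requests, but the user sees no results at all until 660 ms after the first keystroke, while the as-you-type version starts responding immediately.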

Natural language handling

Natural languages have a ton of variety in them. Just think about how many different ways there are to make plurals in English! Here’s one article with 8 helpful rules for plural nouns: rule 7 is that some words are the same in the singular and the plural (think sheep), and rule 8 is that there are actually no rules! So how can we match queries to results that use different forms of the same words? There are a lot of techniques out there that factor into the algorithm, like stemming, lemmatization, and phonetization, but Algolia’s algorithm already handles this for you. Rolling your own would really only be necessary if you’re working in a language that we don’t support (and we support 68 of the world’s most common languages as of July 2023, so that’d be a really niche use case). On top of that, rolling your own runs the risk of bungling situations where data in your index isn’t from the language you’d expect. Take last names, for example, which very often don’t line up with the spelling and grammatical rules you’d expect from the context language.
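As a rough illustration of why rolling your own is risky, here is the kind of naive plural handling a homegrown solution might start with. The rule (strip a trailing "s") and the function name are assumptions for the example. It is not how Algolia or any serious stemmer works.

```python
def naive_singular(word: str) -> str:
    """Naive rule: a trailing 's' means plural, so strip it."""
    return word[:-1] if word.endswith("s") else word


for word in ["cars", "bus", "sheep", "children"]:
    print(word, "->", naive_singular(word))
# cars -> car
# bus -> bu
# sheep -> sheep
# children -> children
```

The rule works for "cars", mangles "bus", and never connects "children" to "child". Every extra rule you add to patch those cases brings its own exceptions.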

Takeaway: Unless you really know what you’re doing, let Algolia’s years of experience guide the best configuration for your search engine.

How can I make the search results more relevant for specific users?

Answering this question starts by creating an index that’s mostly returning relevant results to everyone. If you’ve gotten to this point in the article and implemented the suggestions above, you’re already doing a good job thinking from a wide point of view.

But once you get to that point, is there still room to improve? Of course! You don’t have to settle for returning the set of results that would be most likely to contain what some average user is looking for. You can use Algolia’s personalization tools to return what that specific user is looking for! Imagine you’re shopping for grocery delivery online. Wouldn’t you expect that after buying the same brand of milk several times, that brand would show up first in the search results for “milk”, regardless of that brand’s popularity with other customers? From the user’s point of view, that seems like common sense, but until recent years, it wasn’t that common in large-scale search implementations. Remember, users don’t think about relevance from such a wide point of view; they judge your search engine not by how close it gets to returning a result that most users would respond to, but by how close it gets to returning what they personally are searching for. They likely have something specific in mind when they’re searching, and it’s Algolia’s personalization tools that can help you serve it to them. We go way deeper in this documentation guide, if you’d like to learn more about how you can implement it quickly.
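The milk example can be sketched as a simple re-ranking step. This is a toy model, not how Algolia's personalization works internally: the brands, the popularity scores, the `boost_per_purchase` weight, and the function name are all assumptions made for the example.

```python
def personalize(results: list, purchase_counts: dict, boost_per_purchase: float = 10.0) -> list:
    """Re-rank results by global popularity plus a per-user brand boost."""

    def score(item: dict) -> float:
        purchases = purchase_counts.get(item["brand"], 0)
        return item["popularity"] + boost_per_purchase * purchases

    return sorted(results, key=score, reverse=True)


# Hypothetical results for the query "milk", with global popularity scores.
results = [
    {"name": "Brand A whole milk", "brand": "Brand A", "popularity": 90},
    {"name": "Brand B whole milk", "brand": "Brand B", "popularity": 60},
    {"name": "Brand C oat milk", "brand": "Brand C", "popularity": 40},
]

# This user has bought Brand C six times.
history = {"Brand C": 6}

print([item["brand"] for item in personalize(results, {})])
# ['Brand A', 'Brand B', 'Brand C']
print([item["brand"] for item in personalize(results, history)])
# ['Brand C', 'Brand A', 'Brand B']
```

With no history, results follow overall popularity. With this user's history, their usual brand moves to the top even though it is the least popular overall.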

Takeaway: Personalize the results to each individual user so that even niche queries get relevant results.

Am I taking advantage of Algolia’s AI-powered tools?

We’re strong believers in the power of artificial intelligence to augment and improve the human experience. In that vein, we’ve created AI tools that can help you squeeze more relevance (and by extension, revenue) out of your search experience by spotting and making the most of patterns that would be nearly impossible to spot by eye:

Dynamic Synonym Suggestions – Words that mean the same thing can trip up search engines, because users might search for an item by a name we don’t have in our dataset. Usually we’d fix this by manually registering synonyms inside of our Algolia implementation (like specifying that “pants” and “trousers” are the same thing, so a user who searches for “trousers” will still get results for “pants”), but that’s tedious, and it takes a lot of user friction and lost revenue before we even notice that a new synonym is needed. However, searches don’t happen in isolation; if a user just searched for something but didn’t click on any results, and then they immediately search for something else, they’re probably trying a synonym of the same query (e.g. they just changed their query from “trousers” to “pants”). If you’re giving Algolia the data it needs, it can pick up when this happens often enough and register the synonyms for you, saving you the revenue that would have been lost, since the next searchers won’t get frustrated and leave your site.
Dynamic Re-Ranking – Again, searches don’t exist in isolation. They’re also affected by external, cultural forces. For example, drugstore site users in 2019 who searched for “mask” might have expected skincare-related results. But come 2020, they were far more likely to be looking for surgical masks. Why? Because the global pandemic changed what makes those results relevant. The same changes happen all the time on smaller scales and in industry-specific situations, so it’s nearly impossible to keep up with them by just tweaking rules in your index. Dynamic Re-Ranking uses AI to analyze what your users are searching for and what results they actually find, and then prioritizes what users seem to want right now, effectively taking into account whatever external forces are shifting what relevance means for your application.
Query Categorization – Often, results in the same category as what the user was searching for will perform better than results that match textually but aren’t super relevant. For example, say you’re shopping for groceries on Walmart’s website, and you search for “banana”. It’s likely that you’re specifically searching for yellow bananas, so that’s obviously going to be the first search result. But what should the second and third results be? Wouldn’t it be more sensible to add other produce items that people shopping for bananas may also want to buy, like plantains, apples, or strawberries? Or should result #2 be a Halloween banana costume? It’s obvious that the other fruits are more relevant results, even though the costume shares the keyword. Algolia’s AI-powered query categorization will figure out what category your user is searching in, and then boost the results that also match that category, leading to the user’s perception of higher relevance specifically to them and that specific shopping session, even if objectively the results returned are further away from the query by most metrics.
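The signal behind synonym suggestions (a search with no click, immediately followed by a different search) can be sketched in a few lines. This is a simplified illustration of the heuristic described above, not Algolia's implementation: the session format, the `min_occurrences` threshold, and the function name are assumptions.

```python
from collections import Counter


def suggest_synonyms(sessions: list, min_occurrences: int = 2) -> list:
    """Suggest (query, next_query) pairs seen often enough in the logs.

    Each session is a list of (query, clicked) events in order. A pair is
    counted when a search got no click and was immediately followed by a
    different search.
    """
    pair_counts = Counter()
    for events in sessions:
        for (query, clicked), (next_query, _) in zip(events, events[1:]):
            if not clicked and query != next_query:
                pair_counts[(query, next_query)] += 1
    return [pair for pair, count in pair_counts.items() if count >= min_occurrences]


# Hypothetical search logs: (query, did the user click a result?)
sessions = [
    [("trousers", False), ("pants", True)],
    [("trousers", False), ("pants", True)],
    [("trousers", False), ("socks", True)],
    [("pants", True), ("belt", True)],
]

print(suggest_synonyms(sessions))  # [('trousers', 'pants')]
```

The threshold keeps one-off reformulations like "trousers" then "socks" from being treated as synonyms, which is why the feature depends on having enough click and search data.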

Takeaway: Use Algolia’s AI-powered tools to get more out of your search index.

That’s it! Here’s a recap of our lessons learned today:

Figure out exactly what makes a result “relevant” in your application, and structure your search index data around that.
Unless you really know what you’re doing, let Algolia’s years of experience guide the best configuration for your search engine.
Personalize the results to each individual user so that even niche queries get relevant results.
Use Algolia’s AI-powered tools to get more out of your search index.

If you’ve got experiences to share on how you’ve achieved highly relevant search results, we’d love to hear about them! Drop us a line on Discord here.
