Co-founder & CTO at Algolia
Good search relevance is about finding the right information as well as putting customers and online businesses on an equal footing. Search relevance not only tries to satisfy the intent of the customers as expressed in their search query, it also enables online businesses to present their products and services most advantageously without breaking the relevance expectations of their customers.
In this article, we’ll see how this broader sense of search relevance extends to include browsing and discovery, activities that do not necessarily need to start with a search box.
Relevance algorithms attempt to (1) match the text of a query to some underlying content while (2) anticipating and satisfying the needs of both customer and business. This definition is a reasonably good summation, simple in some ways, but loaded with a wealth of unstated potential and power – or complexity, depending on whether you see things half full or half empty.
If we simplify search relevance to be about looking for information, then a more general sense of relevance can mean something in addition to that. Our article on search and browse discussed the advantages of pulling and pushing content, i.e., searching and browsing. Users pull content when they actively search and expect to find the right information. A business pushes content when it actively suggests or surfaces information as the user searches (or browses) the company’s various digital interfaces.
Let’s compare the two:
|Pulling / Searching content||Pushing / Browsing content|
Pulling and pushing are two parts of a whole. You can’t really separate them. Pushing the right information at the right moment (Browse) is not very different from a user pulling the right content as they search (Search). In other words, they are both designed to satisfy user intent – albeit in different ways. While this article focuses more on the pull than on the push (search instead of browse and discovery), in practice, they are never separated; they always act as a complete whole. We call this complete picture the relevance cycle.
The first part of this article focuses on basic search algorithms. There’s a companion article that goes deeper into how search engines achieve the best search relevance. At the end of the article, we provide links for further reading on the second part of the cycle, the browse/push functionality.
As you can see from both images, relevance travels along a cycle that includes both pulling and pushing information. The complete search experience starts with the search query, where the search engine applies finding and ordering algorithms to return a set of relevant search results. Then the search engine enhances the search results with merchandising, content management, AI-powered personalization, and recommendations – all of which depend on how you have configured these features in your system.
A measure of good relevance is that the best matches show up on page 1 of your search results. But that’s not really what relevance is.
A more appropriate definition of search relevance is: Whatever best matches a search query. But what does “best match” mean? Is it some kind of abstract relevance score? Or is it whatever feels right? As in: if it feels like the results match the query, or it appears to be what I was looking for, then it’s relevant?
Unfortunately, this is a bit too subjective. If I type in Beetles and see a lot of scary-looking bugs, did the engine show me relevant search results? Yes – even if I were looking for the famous long-haired musicians, The Beatles.
In fact, there’s a difference between what appears to be relevant and what is actually relevant. Good relevance has no intention of its own. Only the words in the query matter.
However, the best search engines can read between the lines. If you’re selling music, the query “beetles” should find the music of The Beatles – which can be accomplished with techniques like typo tolerance and synonyms.
And suppose you sell both beetles and Beatles. In that case, two other aspects of relevance may come in handy:
There are many other UI possibilities for search results pages – for example, category pages and redirects. The techniques are many, with the goal to satisfy the full possibilities of a searcher’s intention and customer experience.
Taking all of the above into account, one thing is (unfortunately) true: What you consider relevant might not show up on Page 1. This can be a frustrating user experience – and should be avoided wherever possible. And yet, in some contexts, such as music, there will always be a strong subjective aspect.
For example: should we show the Rolling Stones on Page 1 for the query “famous rock groups”? Should the results show only British and American rockers? Other countries were also rocking in the 60s. Besides, how does a search engine know which musical groups were “famous”?
Another common problem is how to resolve a tie. For example, “bea” could equally bring up artists like Beans on Toast and Joe “Bean” Esposito, who have a devoted but smaller audience. There are solutions to this (breaking results into categories, or limiting the number of results for the same artist). But the most important “fix” to this problem is the as-you-type, instant-results interface, where search results instantly appear on the web page as people type. With instant results, fans of Joe “Bean” Esposito can continue typing until their artist shows up. This extra typing is fine because today’s users are comfortable with being more precise to find lesser-known items.
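To make the tie-breaking idea concrete, here is a minimal Python sketch of an as-you-type search. It is an illustration only, not how Algolia implements instant results: the `Artist` class, the `instant_search` function, and the listener counts are all made up for this example, and popularity is assumed to be the tie-breaker.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Artist:
    name: str
    listeners: int  # made-up popularity figure, used only to break ties


# Hypothetical catalog; the numbers are illustrative, not real statistics.
ARTISTS = [
    Artist("The Beatles", 1_000_000),
    Artist("Beans on Toast", 20_000),
    Artist('Joe "Bean" Esposito', 10_000),
    Artist("The Rolling Stones", 900_000),
]


def words(name):
    """Split a name into lowercase words, dropping surrounding quotes."""
    return [w.strip('"').lower() for w in name.split()]


def instant_search(prefix, artists=ARTISTS):
    """Return artists with a word starting with `prefix`, most popular first."""
    p = prefix.strip().lower()
    if not p:
        return []
    hits = [a for a in artists if any(w.startswith(p) for w in words(a.name))]
    return sorted(hits, key=lambda a: -a.listeners)


# Each keystroke re-runs the search; typing more narrows the list.
for typed in ("bea", "bean"):
    print(typed, "->", [a.name for a in instant_search(typed)])
```

With this toy data, “bea” lists the most popular match first, and one more keystroke (“bean”) is enough to bring the lesser-known artists to the top.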
Luckily, not every topic has such a strong subjective element. If someone is looking for the “right” shoes, a good search engine should guide the user to the best answer. This is made possible by structuring the shoe data.
Getting relevance right is no small task. Google, or any other web search engine, has to get it right for billions of people sifting through trillions of (largely) unstructured pieces of information.
On a smaller scale, say an online marketplace like Amazon, a search engine can be far more consistent. Amazon knows its content and knows the queries often used by its customers. It can therefore structure its content around this knowledge.
This is where search engines like Algolia come in, allowing you to customize your search based on what you know about your content. Algolia’s search engine is agnostic about what it searches. Its algorithms can search movies, products on ecommerce sites, blogs, hospital and customer records, Salesforce datasets, newspaper articles, and content for other use cases. You need to structure your content in a way that best represents the subject matter.
We discuss this in our companion piece to this article. But it’s worth taking a quick look at what it means to structure your datasets.
Here’s a good example of what we mean by structure. A shoe has:
Algolia’s relevance algorithms rely entirely on how you make the search engine aware of what information it’s searching for. It starts with creating valuable datasets, as above. However, you also need to tell the engine the meaning of this data. It sounds like a lot of complexity, but it’s actually very intuitive:
We’ve used the terms “best matches”, “quality matches”, and “ranking” often. Here’s what they mean. When we say “relevance”, we often mean “textual relevance”, which refers to how a search engine compares the words of a user’s query to the content and returns matching results. But we also mean ranking, which is about ordering the results by best matches, often referred to as “order by relevance”.
So, if an item matches the words in the search box, the search engine determines whether the match is strong or weak. We’ve already seen that above with the query “bee”, where Bee is stronger than Beetle.
The best matches are the strongest matches.
To illustrate this, imagine a search experience that uses the simplest form of relevance – a letter-by-letter comparison.
Here are four famous quotations:
Textual matching can be exact or partial. A search for “be” matches the text of 3 records: record 1 (“be”), record 2 (“best”), and record 4 (“bed”). In this example, record 1 is the strongest because it’s an exact match, whereas records 2 and 4 are weaker because they are only partial matches.
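The following Python sketch shows one simple way to tell exact matches from partial (prefix) matches and rank them accordingly. The four records are stand-ins chosen so that they reproduce the matches described above; the function names and the two-level strength scale are assumptions made for this illustration, not Algolia's actual algorithm.

```python
import re

# Stand-in records, chosen to contain "be", "best", "what" and "bed".
RECORDS = {
    1: "To be, or not to be, that is the question",
    2: "Honesty is the best policy",
    3: "Ask not what your country can do for you",
    4: "Early to bed and early to rise",
}

EXACT, PREFIX = 2, 1  # higher number = stronger match


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def match_strength(query, text):
    """Return EXACT, PREFIX, or 0 (no match) for a one-word query."""
    q = query.lower()
    best = 0
    for word in tokenize(text):
        if word == q:
            return EXACT
        if word.startswith(q):
            best = PREFIX
    return best


def search(query, records=RECORDS):
    """Return (record id, match type) pairs, strongest matches first."""
    scored = [(match_strength(query, text), rid) for rid, text in records.items()]
    hits = sorted((s, rid) for s, rid in scored if s > 0)
    hits.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(rid, "exact" if s == EXACT else "prefix") for s, rid in hits]


print(search("be"))  # [(1, 'exact'), (2, 'prefix'), (4, 'prefix')]
```

Record 3 is not returned at all because none of its words starts with “be”.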
Continuing with this example, if we allow 1-letter typos, then a search for “that” would find records 1 and 3, with record 1 (“that”) stronger than record 3 (“what”), which differs from the query by 1 letter.
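Typo tolerance of this kind is commonly modeled with edit distance: the number of single-letter insertions, deletions, or substitutions needed to turn one word into another. The sketch below uses the classic Levenshtein distance over stand-in records; the records, function names, and `max_typos` parameter are assumptions for illustration, and a production engine will be considerably more sophisticated.

```python
import re

# Stand-in records, chosen so that only records 1 and 3 are within one typo of "that".
RECORDS = {
    1: "To be, or not to be, that is the question",
    2: "Honesty is the best policy",
    3: "Ask not what your country can do for you",
    4: "Early to bed and early to rise",
}


def edit_distance(a, b):
    """Levenshtein distance between two words (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,               # delete a letter from a
                curr[j - 1] + 1,           # insert a letter into a
                prev[j - 1] + (ca != cb),  # substitute (free if letters match)
            ))
        prev = curr
    return prev[-1]


def typo_search(query, records=RECORDS, max_typos=1):
    """Return (record id, typos) pairs, fewest typos first."""
    q = query.lower()
    hits = []
    for rid, text in records.items():
        words = re.findall(r"[a-z]+", text.lower())
        best = min(edit_distance(q, w) for w in words)
        if best <= max_typos:
            hits.append((best, rid))
    return [(rid, typos) for typos, rid in sorted(hits)]


print(typo_search("that"))  # [(1, 0), (3, 1)]
```

The same measure explains the earlier music example: `edit_distance("beetles", "beatles")` is 1, so a one-typo tolerance is enough to connect the two spellings.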
Now that the search engine has found the best records and orders them by the strongest matches, it’s time to move ahead in the cycle. At this point, the business can consider adding or reordering records based on current sales promotions or industry trends. It can also redirect the user to a page designed for the item or categorization they are interested in. It can also personalize the results, favoring items that the user prefers (learned through analytics, AI, NLP, and machine learning). Finally, it can also start recommending related items alongside the results. We discuss all that elsewhere on our blog. For example:
We’ll end with a new search example. Take a fairly reasonable query: “dark pointy shoes for dancing”.
As discussed, relevance begins not only with the query but with how you structure your content. Thus, the results of the “dark pointy shoes for dancing” query come from how your data structure can answer these questions:
And then sit back and relax, knowing that those pointy dancing shoes will rise to the top.
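As a rough illustration of how structured data helps with a query like this, the Python sketch below stores each shoe as a record with separate attributes and scores it against the words of the query. Everything here is hypothetical: the product names, the attribute names (`color`, `toe_shape`, `activity`), the set of colors treated as “dark”, and the one-point-per-attribute scoring are assumptions made for this example, not Algolia's ranking formula.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shoe:
    name: str
    color: str
    toe_shape: str
    activity: str


# Hypothetical catalog with one attribute per question the query asks.
CATALOG = [
    Shoe("Tango Oxford", "black", "pointy", "dancing"),
    Shoe("Trail Runner", "black", "round", "running"),
    Shoe("Salsa Pump", "red", "pointy", "dancing"),
    Shoe("Ballroom Flat", "navy", "round", "dancing"),
]

# Assumption: which colors count as "dark" is a business decision.
DARK_COLORS = {"black", "navy", "brown"}


def score(shoe, query):
    """Count how many structured attributes the query words match."""
    terms = set(query.lower().split())
    points = 0
    if "dark" in terms and shoe.color in DARK_COLORS:
        points += 1
    if shoe.toe_shape in terms:
        points += 1
    if shoe.activity in terms:
        points += 1
    return points


def search(query, catalog=CATALOG):
    """Return names of matching shoes, best match first."""
    scored = [(score(shoe, query), shoe) for shoe in catalog]
    ranked = sorted((pair for pair in scored if pair[0] > 0), key=lambda pair: -pair[0])
    return [shoe.name for _, shoe in ranked]


print(search("dark pointy shoes for dancing"))
```

Because color, shape, and activity live in their own attributes, the one shoe that satisfies all three parts of the query rises above shoes that satisfy only one or two.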