Creating search relevance for ecommerce and media with great content

Nov 10th 2021 product


We continue our series on search relevance. Our first article defined relevance, describing a relevance cycle that starts with search (i.e., finding and ordering items) and continues with advanced browse and discovery strategies like merchandising, content management, personalization, and recommendations. 

In this article, we focus on the first part of the cycle: search. We walk through some real examples and search configuration strategies that create production-level relevance. We break down, then optimize, the two pillars of search relevance:

  • Textual relevance: finding items that match a query
  • Ranking: ordering the found items by how well they match the query

First, we’ll show that these pillars are largely driven by how you structure and configure your data. To illustrate this, we’ll describe the basics: searchable attributes, faceting, and custom ranking – terms that we’ll define as we go along.

We’ll discuss important configurations like synonyms, language, and typo settings. Finally, we’ll demonstrate the importance of analytics and testing your most important queries to see that they return the desired search results.

Matching words to content

When we say “relevance”, we often mean “textual relevance”, which refers to how a search engine compares the words of a query to your underlying data (search index) and returns matching results.  

There are two questions to ask of relevance: 

  • Does the item in the content match the words in the query? 
  • And if yes, is it a strong or a weak match? The best matches are the strongest matches.

To illustrate what we mean by finding weak and strong matches, imagine a search that uses the simplest form of relevance: letter-by-letter comparisons.

Here are four famous quotations (we already discussed these in part 1, but from a slightly different angle):

  1. “To be, or not to be: that is the question” (Author William Shakespeare)
  2. “It was the best of times, it was the worst of times” (Author Charles Dickens)
  3. “Ask not what your country can do for you – ask what you can do for your country” (US President John F. Kennedy)
  4. “For a long time I used to go to bed early” (Author Marcel Proust)

Textual matching can be exact or partial. A search for “be” matches the text of 3 records: record 1 (“be”), record 2 (“best”), and record 4 (“bed”). In this example, record 1 is the strongest match because it is exact, whereas records 2 and 4 are weaker because they are only partial matches.

Continuing with this example, if we allow 1-letter typos, then a search for “that” would find records 1 and 3, with record 1 (“that”) stronger than record 3 (“what”), which differs from the query by 1 letter.
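The matching described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not how any particular engine works: the function names and the 3/2/1 strength scores are our own, a partial match means a prefix match, and a typo means exactly one substituted letter.

```python
import re

QUOTES = {
    1: "To be, or not to be: that is the question",
    2: "It was the best of times, it was the worst of times",
    3: "Ask not what your country can do for you – ask what you can do for your country",
    4: "For a long time I used to go to bed early",
}

def words(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())

def one_letter_apart(a, b):
    """True if a and b have the same length and differ in exactly one position."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1

def match_strength(query, text, allow_typo=False):
    """3 = exact word match, 2 = prefix match, 1 = one-letter typo, 0 = no match."""
    best = 0
    for w in words(text):
        if w == query:
            return 3
        if w.startswith(query):
            best = max(best, 2)
        elif allow_typo and one_letter_apart(w, query):
            best = max(best, 1)
    return best

def search(query, allow_typo=False):
    """Return matching record numbers, strongest match first."""
    scored = [(match_strength(query, t, allow_typo), rid) for rid, t in QUOTES.items()]
    hits = [(s, rid) for s, rid in scored if s > 0]
    return [rid for s, rid in sorted(hits, key=lambda p: (-p[0], p[1]))]

print(search("be"))                     # [1, 2, 4]
print(search("that", allow_typo=True))  # [1, 3]
```

Records with the same strength are listed in record order here; deciding how to order such ties is the ranking problem we come to later.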

Matching intent

If only it were that simple… The truth is, people who search are far more demanding than that. Search is not just about matching text: users want their results to feel intuitive. Here are some additional questions to ask:

  • In what order should I show these records?
  • What should the word “best” return? Only record 2 (“best of times”)? But what if the user was looking for the idea of “best”, as in the best quotes?
  • What about creating a synonym for best, like “greatest”?
  • Should the query “qestion” (“question” mistyped) return record 1? 
  • What about someone looking only for “literary quotes” (records 1, 2, 4)? 
  • What if I want to find only quotations translated from the French (record 4)? 

To help achieve some of this, we need to consider the quality of the content / data and how best to configure it.

The content, or data, needs to contain more detail. Here’s an updated dataset:

  1. “To be or not to be: that is the question”, shakespeare, british, hamlet, theatre, citations
  2. “It was the best of times, it was the worst of times”, dickens, british, tale of two cities, novel
  3. “Ask not what your country can do for you – ask what you can do for your country”, kennedy, american, speech, politics
  4. “For a long time I used to go to bed early”, proust, french, in search of lost time /  a la recherche du temps perdu, novel

With all that great data, there’s one more step: the engine needs to be configured. We’ll add some synonyms. Synonyms are not part of the data; they are part of an engine’s settings. Let’s add “best = greatest”. We’ll also configure the engine to match singular and plural forms (book = books) and tell it to tolerate typos and misspellings. There are other configurations to make, as we’ll discuss later.
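To make the distinction between data and settings concrete, here is a rough sketch of synonyms and plural handling applied at query time. The `SYNONYMS` list and the helper names are hypothetical, and the plural rule (strip a trailing “s”) is deliberately naive; a real engine uses language-aware rules.

```python
import re

# Engine settings, kept separate from the records themselves.
SYNONYMS = [{"best", "greatest"}]

def singular(word):
    """Naive plural handling: strip a trailing 's' from words longer than 3 letters."""
    return word[:-1] if len(word) > 3 and word.endswith("s") else word

def expand(term):
    """Return the normalized term together with any configured synonyms."""
    term = singular(term.lower())
    for group in SYNONYMS:
        if term in group:
            return set(group)
    return {term}

def matches(query, text):
    """True if every query term (or one of its synonyms) appears in the text."""
    text_words = {singular(w) for w in re.findall(r"[a-z]+", text.lower())}
    return all(expand(term) & text_words for term in query.split())

DICKENS = "It was the best of times, it was the worst of times"
print(matches("greatest", DICKENS))  # True, through the synonym best = greatest
print(matches("time", DICKENS))      # True, "times" and "time" are treated alike
```

Note that the quotation itself never changes: adding or removing a synonym only changes the settings.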

Rolling up our sleeves – let’s create an optimized set of searchable data

The above quotations were used to illustrate what we mean by relevance. Now let’s look at a typical media / ecommerce example: a collection of books (to buy or for streaming).

Here, we can envision the book’s information (i.e., attributes) as follows:

  • Title
  • Description
  • Author
  • Genre
  • Price
  • Cover image
  • Popularity

To search, we don’t need “cover image”, “price”, or “popularity”, and we’ll also ignore “description” for now. The other fields (“title”, “author”, “genre”) will be used to find the book. Selecting only some attributes as searchable keeps the engine focused on the fields that identify a book, which makes matching more precise and results more relevant.

  • Title (searchable)
  • Description
  • Author (searchable)
  • Genre (searchable)
  • Price
  • Cover image
  • Popularity

The other fields are present for other purposes. For example, for displaying in the search results (“price”, “image”) and to sort by (“popularity”, “price”).
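As a minimal sketch of this idea (plain Python, not any engine’s API; the sample records, prices, and popularity scores are invented for illustration), the search below only looks inside the attributes listed as searchable, while the other fields travel along for display and sorting:

```python
BOOKS = [
    {"title": "The Shining", "author": "Stephen King", "genre": "horror",
     "description": "A family looks after an isolated hotel for the winter.",
     "price": 9.99, "popularity": 95},
    {"title": "The Return of the King", "author": "J. R. R. Tolkien", "genre": "fantasy",
     "description": "The final volume of The Lord of the Rings.",
     "price": 12.50, "popularity": 90},
    {"title": "War and Peace", "author": "Leo Tolstoy", "genre": "historical fiction",
     "description": "Russian families during the Napoleonic wars.",
     "price": 14.00, "popularity": 70},
]

SEARCHABLE = ["title", "author", "genre"]

def search(query, records, searchable=SEARCHABLE):
    """Return records whose searchable attributes contain the query text."""
    q = query.lower()
    return [r for r in records if any(q in str(r[a]).lower() for a in searchable)]

print([b["title"] for b in search("king", BOOKS)])
# ['The Shining', 'The Return of the King']
print([b["title"] for b in search("hotel", BOOKS)])
# [] because "hotel" only occurs in the description, which is not searchable
```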

Ordering by relevance, known as Ranking

The order of your searchable information is equally important. We call it Relevance Ranking. For example, if you type in “king” to find books written by Stephen King, you may see books with the word “king” in the “title” before seeing Stephen King books. To fix this, put your attributes in the order below (“author” above “title”, so that the search engine looks at the author first):

  • Author
  • Title
  • Genre
  • Description
  • Price
  • Cover image
  • Popularity

You can also add “genre” at the top, so that a query like “king horror” returns good results.

  • Genre
  • Title
  • Author
  • Description
  • Price
  • Cover image
  • Popularity

The point here is that you can select the priority of what gets searched. This is an important business customization, a unique choice for each bookseller.
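One way to picture attribute priority is to rank each hit by the position of the first attribute in which the query matches. The sketch below assumes single-word queries and simple substring matching; the `rank` function and the two sample records are our own illustration.

```python
BOOKS = [
    {"title": "The Return of the King", "author": "J. R. R. Tolkien", "genre": "fantasy"},
    {"title": "The Shining", "author": "Stephen King", "genre": "horror"},
]

def rank(query, records, priority):
    """Order hits by the highest-priority attribute that contains the query."""
    q = query.lower()

    def best_position(record):
        for pos, attr in enumerate(priority):
            if q in record[attr].lower():
                return pos
        return None  # no searchable attribute matched

    hits = [(best_position(r), r) for r in records]
    hits = [(p, r) for p, r in hits if p is not None]
    return [r["title"] for p, r in sorted(hits, key=lambda h: h[0])]

print(rank("king", BOOKS, ["title", "author", "genre"]))
# ['The Return of the King', 'The Shining']
print(rank("king", BOOKS, ["author", "title", "genre"]))
# ['The Shining', 'The Return of the King']
```

The records are identical in both calls; only the priority list changes, which is exactly the business choice each bookseller gets to make.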

To improve the matching, let’s add “description” as a fallback. However, we need to be careful. Long attributes like “description” can be noisy and generate false positives in relevance. Such attributes often contain too many easy matches. In such cases, you can create a “short-description” attribute that takes a subset of keywords from the “description”.

  • Genre
  • Title
  • Author
  • Short-description
  • Price
  • Cover image
  • Popularity

Lastly, let’s make sure that “title”, “author”, and “short-description” can be matched at the beginning or in the middle of the field. This enables a search for “peace” to find “War and Peace”, even though “peace” is not the first word in the title. This isn’t appropriate for every attribute (“genre”, for example). We use the special modifier “unordered” to accomplish this.

  • Genre
  • Title (unordered)
  • Author (unordered)
  • Short-description (unordered)
  • Price
  • Cover image
  • Popularity

Creating order with custom ranking

Here we focus on the attribute “popularity”. We just need to tell the engine to use “popularity” to help order the records.

  • Genre
  • Title (unordered)
  • Author (unordered)
  • Short-description (unordered)
  • Price
  • Cover image
  • Popularity (custom ranking)

This idea of custom ranking needs some context. When a user searches for “king”, all records that have the word “king” in one of their searchable attributes will show up. Having “author” as a searchable attribute ensures that Stephen King’s books are among them.

The question is: Which Stephen King book should be at the top of the list? Which second? That’s where popularity comes in. Textual relevance alone can’t tell us which to show first, since they all match equally well, so we can order by popularity. We can also order by newest release.

Now, what if there is an unknown author that you want your users to know about? You can add another custom attribute such as promote-book, which is true or false, or a rating from 1 to 5, with 1 being the most important to display to users. Other custom ranking attributes include highest margin or most trendy, or a combination of two or more attributes.

We describe this process as tie-breaking: we compare matches and put the strongest matches at the top and the weaker ones lower in the order. For example, if someone types in “ki”, the authors who match “ki” exactly (for example, Ki Lynn) will show up higher than authors who only match partially. But if the user adds “ng”, making it “king”, Ki Lynn drops (as an inexact match) and Stephen King rises to the top. Additionally, with popularity as a custom ranking, Stephen King will most likely fill up the first 10 pages of results. (There are ways to avoid this, so that a popular author can’t hog the top places in the results. You can limit Stephen King to only 3 results, for example.)
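The tie-breaking logic can be sketched as a two-level sort: match strength first, then popularity, with an optional cap on results per author. Everything here is a simplified assumption: only the author field is searched, the popularity scores are invented, the Ki Lynn title is a placeholder, and in this sketch Ki Lynn simply stops matching once the query becomes “king”.

```python
BOOKS = [
    {"title": "Carrie", "author": "Stephen King", "popularity": 85},
    {"title": "(a Ki Lynn title)", "author": "Ki Lynn", "popularity": 40},
    {"title": "It", "author": "Stephen King", "popularity": 98},
    {"title": "Misery", "author": "Stephen King", "popularity": 90},
    {"title": "The Shining", "author": "Stephen King", "popularity": 95},
]

def match_strength(query, author):
    """2 = a whole word of the author name matches, 1 = prefix match, 0 = no match."""
    names = author.lower().split()
    if query in names:
        return 2
    if any(n.startswith(query) for n in names):
        return 1
    return 0

def search(query, records, max_per_author=None):
    q = query.lower()
    hits = [(match_strength(q, r["author"]), r) for r in records]
    hits = [(s, r) for s, r in hits if s > 0]
    # Strongest textual match first; popularity breaks ties between equal matches.
    hits.sort(key=lambda h: (-h[0], -h[1]["popularity"]))
    results, per_author = [], {}
    for _, r in hits:
        count = per_author.get(r["author"], 0)
        if max_per_author is not None and count >= max_per_author:
            continue  # this author already holds the allowed number of places
        per_author[r["author"]] = count + 1
        results.append(r["title"])
    return results

print(search("ki", BOOKS))
# ['(a Ki Lynn title)', 'It', 'The Shining', 'Misery', 'Carrie']
print(search("king", BOOKS, max_per_author=3))
# ['It', 'The Shining', 'Misery']
```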

Optimizing relevance even further with synonyms, typo tolerance, filters, analytics

Now your data is searchable and will rank properly. The search engine doesn’t need to know much more about your data. You’ve structured it and given it meaning by specifying how to search it. More can be done to handle particular situations, but what we’ve done is largely sufficient to go live with. Still, we can go one step further by creating synonyms and facets (i.e., filters).

Adding synonyms

Every customer has a set of unique words that can be satisfied by creating synonyms. In the case of books, there’s the word “novel” as a synonym of “book”. There are theaters and plays, authors and writers, and such mundane synonyms as “chairs” and “seats”, or “pants”, “slacks”, and “trousers”. Many relevant synonyms can come from a dictionary, but others are very specific to an industry or a company. You can also AI-power your synonyms.

Adding filters

As already alluded to, for books, attributes like “genre” and “author” help users drill down and single out collections of books based on these and other filters. Filters can work in the background or can be placed onscreen as facets.
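Here is a small sketch of both uses, assuming exact-value filters on a list of records. The function names and sample data are illustrative only: `apply_filters` narrows the result set in the background, while `facet_counts` produces the per-value counts you would typically show onscreen next to each facet.

```python
from collections import Counter

BOOKS = [
    {"title": "The Shining", "author": "Stephen King", "genre": "horror"},
    {"title": "Misery", "author": "Stephen King", "genre": "horror"},
    {"title": "The Return of the King", "author": "J. R. R. Tolkien", "genre": "fantasy"},
]

def apply_filters(records, **filters):
    """Keep only the records whose attributes equal every requested value."""
    return [r for r in records if all(r.get(k) == v for k, v in filters.items())]

def facet_counts(records, attribute):
    """Count how many records carry each value of the given attribute."""
    return Counter(r[attribute] for r in records)

print([b["title"] for b in apply_filters(BOOKS, genre="horror")])
# ['The Shining', 'Misery']
print(dict(facet_counts(BOOKS, "genre")))
# {'horror': 2, 'fantasy': 1}
```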

Typos & misspellings

Your search engine should be able to find words that differ from the query by only 1 or 2 letters. Mistyping “shakspear” should nonetheless find “Shakespeare”. This is called typo tolerance.

Language settings

Set the language of your search solution to the native language(s) of your users. If you sell only English books, you need to tell the search engine that your users will be typing in English. The search engine can then apply certain language-specific logic (such as determining plurals or separating words). However, if your audience is also French, you might want to start adding French text to your data, or create two sets of data: one English, the other French.

You’ll also want the search engine to treat singular and plural forms as the same word, and not to be thrown off by “stop” words such as “the”, “what”, and “and”.
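Stop-word handling can be pictured as a small preprocessing step on the query. The sketch below assumes a fixed English stop-word set containing just the three words mentioned above, and it falls back to the original terms when a query consists only of stop words; both choices are ours, for illustration.

```python
STOP_WORDS = {"the", "what", "and"}  # a tiny, English-only sample list

def significant_terms(query):
    """Drop stop words, unless that would leave nothing to search for."""
    terms = query.lower().split()
    kept = [t for t in terms if t not in STOP_WORDS]
    return kept or terms

print(significant_terms("The war and the peace"))  # ['war', 'peace']
print(significant_terms("the what"))               # ['the', 'what']
```

A French index would need its own list, which is one reason the engine has to know which language your users type in.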

Analytics and Testing

Finally, analytics. You want to track the top 10 most typed-in queries. Then, test these top 10 queries, and review other analytics metrics, to see if you are satisfied with the results. Also, avoid returning no results: check whether your top searches return no results, or only a small number of results. And track user clicks and conversions.

Use all your analytics reports, as well as A/B tests, to analyze and improve your search performance.
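If your search log can be exported as pairs of query text and number of hits, the two checks above take only a few lines. The log format and the sample entries below are assumptions made for illustration; an analytics dashboard would normally give you these reports directly.

```python
from collections import Counter

# Hypothetical search log: (query as typed, number of results returned)
LOG = [
    ("king", 12), ("King", 12), ("shakspear", 0),
    ("peace", 3), ("king ", 12), ("qestion", 0),
]

def normalize(query):
    return query.strip().lower()

def top_queries(log, n=10):
    """Most frequently typed queries, ignoring case and surrounding spaces."""
    return Counter(normalize(q) for q, _ in log).most_common(n)

def no_result_queries(log):
    """Distinct queries that returned nothing, as candidates for synonyms or typo settings."""
    return sorted({normalize(q) for q, hits in log if hits == 0})

print(top_queries(LOG, n=1))   # [('king', 3)]
print(no_result_queries(LOG))  # ['qestion', 'shakspear']
```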

Conclusion – Going beyond search

Matching text is a critical starting point. To do this, you want to anticipate your users’ queries and ways of expressing themselves, and then structure your data to match the words of the most common queries. As we’ve seen, simple textual matching does not solve every problem. You also need features like filtering, ranking, attribute priorities, typo handling, synonyms, and other language-based settings to allow the search engine to read between the lines. The next step is to go beyond search and see how you can deepen your relevance with personalization, merchandising, browsing, discovery, and recommendations.

About the author
Peter Villani

Sr. Tech & Business Writer

