Implementing faceted search with dynamic faceting (code included)

This is the third and last article in a three-part series of blog posts that describe the technical and data aspects of facets and faceted search. Here in part 3, we look at the search process, from query to execution to response, and show how to generate facets dynamically.

Our Facets & Data series is technical in focus, outlining the logic and facets data model of facet search. Parts 1 and 2 are in our Tech Blog. The first article, “Facets and faceted search, every JSON attribute counts”, defines what faceting is and explains the critical role that facets play in structuring your data. It also illustrates how JSON is the most flexible way to represent your index data, including facets. The second article, “A facets data model using JSON”, introduces the most common data structures for facets: simple facet values, nested faceting, hierarchical categories, and user and AI tagging, all of which are used for different aspects of facet search. Check out our docs for more information on indexing and faceting.

In this article, data continues to be our central focus, but we also discuss process. This time we dive into the data and processes used in the query cycle, from request to execution to response, all within the context of an advanced facet use case: dynamic faceting.

Before diving into the technical details, let’s see what dynamic faceting is all about.

Overview of dynamic faceting

Dynamic faceting displays to the user a different set of facets depending on the user’s intent. To understand this, consider an ecommerce music store that sells two categories of items: CDs and audio equipment. This business wants to display a set of relevant facets. What good is it to propose “brand” when the user’s intent is to find their favorite music? Likewise, what good is it to propose “musical genre” when the user’s intent is to find audio equipment? Dynamic faceting ensures that only the most appropriate facets show up.

See a live demonstration of dynamic faceting, with a full code base, on our dynamic faceting sandbox.

Ecommerce businesses with a diversity of products benefit from displaying different facets depending on the items the user is searching for:

  • Pharmaceutical companies display different facets for their medical vs. cosmetic products.
  • Newspapers display different facets in their Entertainment and Political sections.
  • Online marketplaces, like the Amazon example below, change facet lists as people navigate through their vast diversity of offerings.

Example use case: ecommerce marketplaces & dynamic faceting  

Amazon uses dynamic faceting for many of its categories. In the image below, you see two queries: “music” on the left, “movies” on the right. As you can see, both sides include the “price” facet, but the musical query includes “customer reviews”, “artist”, and “musical format”, while the movie query includes “director”, “video format” and “movie genre”. 

Picture of Amazon dynamic facets

Amazon uses dynamic faceting to create an enhanced search experience, guiding users in a smart and curated way depending on the products they are searching for.

Let’s see how this is done.

The query cycle and the logic behind facet search

First, some terminology:

  • Facet keys are attributes like “color”, “price”, “shoe_category”, and “sleeves”.
  • Facet values are the key’s values. For example, “color” contains “red” and “green”; “sleeves” contains “short” and “long”.

The dataset

We’ll use a dataset with two kinds of products: shirts and shoes. The example below contains two typical items. Both items include the “price”, “color”, and “clothing_type” facets. However, shirts contain a “sleeves” facet and shoes a “shoe_category” facet.

[
  {
    "name": "Bold Shirt",
    "desc": "Be bold, wear a t-shirt with only one color",
    "image_url": "images/shirt-123.jpg",
    "price": 49.99,
    "color": "white",
    "gender": "male",
    "clothing_type": "shirt",
    "sleeves": "short"
  },
  {
    "name": "Blazing Speed Sneakers",
    "desc": "Sneakers to win races!",
    "image_url": "images/sneakers-789.jpg",
    "brand": "nike",
    "price": 189.99,
    "color": "red",
    "clothing_type": "shoe",
    "shoe_category": "sneaker"
  }
]

The query cycle

A search query follows a 4-part cycle. Here’s an overview. We’ll give more details and code examples in the section that follows. 

  1. Send the user’s query to the search engine. 
  2. Execute the search and retrieve the records that match the query. In this step, you’ll derive the facets from the retrieved records.
  3. Send back the results and facets.
  4. Render the results and facets onscreen.

As you’ll see, instead of using a pre-defined list of facets, the logic consists of dynamically generating a new list with each query. This is possible by doing the following:

  • On the back end, you’ll extract facets from every set of query results.
  • On the front end, you’ll use empty placeholder containers, filled in at render time, instead of one pre-defined container per facet.

The query request: sending the query with or without a filter

The starting point of the cycle is to send a query and any facet value the user has selected to filter their results. Filtering results creates a cohesive result set, which in turn generates a list of facets relevant to all of the items that appear in the results. On the other hand, if the user does not select a facet, the items will be more diverse — and therefore, the facets might not apply to all products. 

However, this is perfectly fine. As you’ll see in the next step, presenting the top 5 most common facets ensures that most items will contain these facets. 

Now for the code. Here’s how Algolia’s API implements the query “Get all short-sleeved summer t-shirts”. (Since all search tools allow filtering, the following code is only one among many ways to do this).

// The filter uses the attribute names from the dataset: "clothing_type" and "sleeves".
results = index.search('summer t-shirt', {
  filters: 'clothing_type:shirt AND sleeves:short'
});
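
When the selected facet values come from the state of the UI, the filter string can be assembled instead of written by hand. Here is a minimal sketch, assuming the selections are held in a plain object. The names `buildFilters` and `selected` are hypothetical and not part of any Algolia library; the output follows the same `attribute:value AND attribute:value` form as the query above.

```javascript
// Hypothetical helper (not an Algolia API): turns the user's selected
// facet values into a filter string of the form "key:value AND key:value".
function buildFilters(selected) {
  return Object.entries(selected)
    // Skip facets for which the user has not picked a value.
    .filter(([, value]) => value !== undefined && value !== null && value !== '')
    .map(([key, value]) => `${key}:${value}`)
    .join(' AND ');
}

const filters = buildFilters({ clothing_type: 'shirt', sleeves: 'short' });
// filters is 'clothing_type:shirt AND sleeves:short'
```

Values that contain spaces would need quoting, which this sketch does not handle.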

The query execution: creating the list of top 5 facets

The dataset we use in this article contains two kinds of products, each with a set of unique facet attributes. After executing the query, the search engine extracts every facet key that shows up in the retrieved records, then selects the 5 facets that appear most often.

Why top 5? Because a screen with 5 facets is usually enough. Ten is an outer limit – any more would be overkill and create unused clutter.

There are two methods to create a list of facets:

  1. A pre-defined list: save a list of facet keys either directly in the code or in a separate dataset or local storage.
  2. A dynamically generated list: extract a distinct list of facet keys from the results of the query and put them in the query response as a separate record. 

We’ve chosen to use method 2, but some implementations use method 1.

Method 1 – Hardcoding the list of facet keys

For this method, you create a fixed set of facets based on what you already know about your products. This list can be either hardcoded, or placed in a file or table in a database that can be manually updated whenever new products are added or modified. 

Whatever the manner of storing the fixed list, it must contain mappings like the following:

  • For every “clothing_type=shirt”, send back the following facets: “color, price, clothing_type, sleeves, gender”.
  • For every “clothing_type=shoe”, send back the following facets: “color, price, clothing_type, shoe_category, brand”.
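
In code, such a fixed list might be a simple lookup table. The following is only a sketch of the idea: `FACETS_BY_TYPE` and `facetsFor` are hypothetical names, and the same mapping could just as well live in a file or a database table.

```javascript
// Method 1 sketch: a hand-maintained mapping from clothing type to facet keys.
const FACETS_BY_TYPE = new Map([
  ['shirt', ['color', 'price', 'clothing_type', 'sleeves', 'gender']],
  ['shoe', ['color', 'price', 'clothing_type', 'shoe_category', 'brand']],
]);

// Returns the fixed facet list for a clothing type, or an empty list
// when no entry has been added for that type yet.
function facetsFor(clothingType) {
  return FACETS_BY_TYPE.get(clothingType) ?? [];
}

const shirtFacets = facetsFor('shirt');
const hatFacets = facetsFor('hat'); // no entry exists, so the list is empty
```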

This is not the preferred method because, as with all hardcoding or semi-hardcoding, it has limited scalability. If you want to add more relations, for example “shoe_styles” = “high-top, leisure, and cross-fit”, you’ll need to manually add a new line to the list. Manual maintenance is extra work and prone to error and delay.

The approach we present in the rest of this article (method 2) removes manual maintenance from the process, making the process deductive and therefore entirely dynamic. 

Method 2 – Dynamically generate the list of facets

In this method, we’ll extract the list of facets from the products themselves.

In a sense, the only difference between the manual and dynamic approaches is how the list of top 5 facets is produced: it is derived from each query’s results instead of being maintained by hand. The resulting list itself is formatted in the same way.

Here are the steps:

  1. Get the query results:
    • Execute the query, find X number of products. 
    • Save those records, to be sent back as the search results.
  2. Get the top 5 facet keys and all their values:
    • From the results, collect all facet attributes (we’ll show below how to identify an attribute as a facet). 
    • Create a record that contains the full list of extracted facet keys. 
    • Determine which facets appear the most often. Sort the list by the highest number of records.
    • Take only the top 5 from this list. These are the most common facets. 
    • Add the product’s values to their respective key.

Done. 
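
The steps above can be sketched as follows. This is an illustration under stated assumptions, not the code from our repo: it assumes each record lists its facet attributes in a `facets` array (the facet metadata described later in this article), and `topFacets` is a hypothetical name. The second shirt record is made up for the example.

```javascript
// Method 2 sketch: derive the top facet keys, with their values, from the results.
// Assumption: each record names its facet attributes in a `facets` array.
function topFacets(records, limit = 5) {
  const stats = new Map(); // facet key -> { count, values }
  for (const record of records) {
    for (const key of record.facets ?? []) {
      if (record[key] === undefined) continue;
      if (!stats.has(key)) stats.set(key, { count: 0, values: new Set() });
      const entry = stats.get(key);
      entry.count += 1;              // number of records that carry this key
      entry.values.add(record[key]); // distinct values seen for this key
    }
  }
  // Sort by number of records, highest first, and keep the top `limit` keys.
  const top = [...stats.entries()]
    .sort(([, a], [, b]) => b.count - a.count)
    .slice(0, limit);
  return Object.fromEntries(top.map(([key, { values }]) => [key, [...values]]));
}

const shirtKeys = ['price', 'color', 'clothing_type', 'sleeves', 'gender'];
const shoeKeys = ['price', 'color', 'clothing_type', 'shoe_category', 'brand'];
const records = [
  { name: 'Bold Shirt', price: 49.99, color: 'white', gender: 'male',
    clothing_type: 'shirt', sleeves: 'short', facets: shirtKeys },
  { name: 'Calm Shirt', price: 39.99, color: 'blue', gender: 'female',
    clothing_type: 'shirt', sleeves: 'long', facets: shirtKeys },
  { name: 'Blazing Speed Sneakers', brand: 'nike', price: 189.99, color: 'red',
    clothing_type: 'shoe', shoe_category: 'sneaker', facets: shoeKeys },
];

const facets = topFacets(records);
// Shirts outnumber shoes here, so "sleeves" and "gender" make the top 5,
// while "shoe_category" and "brand" do not.
```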

The generated list would be the same as with method 1:

  • For every “clothing_type=shirt”, send back the following facets: “color, price, clothing_type, sleeves, gender”.
  • For every “clothing_type=shoe”, send back the following facets: “color, price, clothing_type, shoe_category, brand”.

Thus, if most products are shirts, “sleeves” and “gender” would be the 4th and 5th facets displayed. If most were shoes, the 4th and 5th facets would be “shoe_category” and “brand”.

The query response: sending back the response

Here we’ll simply send back the search results and the generated list of 5 facet keys with their respective values. 

The response should include whatever the front end needs to build its search results page. In all facet search query cycles, the front end needs:

  • A list of products with “name”, “description”, “price”, and “image_url”. (Don’t send image files, as their size will slow down the overall response time of the search). 
  • The list of facet keys and their values.

A response will also contain additional information needed for display purposes or business logic. We do not show those. Go here to see a complete query response.

Results:

Here is how we would return the set of shirts. All attributes will be used as information in the search results, except “objectID”, which will be used to identify a product for technical reasons (click analytics, detailed page view, or other reasons).

"results": [
{
    "objectID": "123",
    "name": "Bold Shirt",
    "desc": "Be bold, wear a t-shirt with only one color",
    "image_url": "images/shirt-123.jpg",
    "price": 49.99,
    "color": "white",
    "sleeves": "short"
}
]

Facets:

The facet response is a combination of facet keys with their values. Here’s a small extract. The example does not include all of the facets:

"facets": {
    "Clothing Type": {
      "Shirts": 100,
      "Sneakers": 50
    },
    "Sleeves": {
      "Short": 30,
      "Long": 10
    }
}

Two things to note:

  • Of the product-specific facets, only “sleeves” and “gender” are returned (the extract shows “sleeves”), not “shoe_category” or “brand”. As mentioned, this is because there are more shirts than shoes.
  • The number after each facet value indicates the number of records that have that value. We discuss this a bit more in the section on adding number of facets.

The front-end display: dynamically displaying the list of facets

The job here is to render the results and facets on the screen. In terms of UX design, the industry standard is to have the results in the middle and the facets on the left.

First, the HTML. Add placeholder-containers for the results and facets:

<div id="wrapper">
    <div id="app-container">
      <div id="left-side">
        <div class="sidebar">
          <div id="facet-lists"></div>
        </div>
      </div>
      <div id="center-side">
        <div id="results-container"></div>
      </div>
    </div>
</div>

There is one container (“facet-lists”) for the 5 facets. The rendering code generates an unordered list to display the facets in that container.

The results go in the “results-container”.

Next, render the data. As this can take many forms, and this article is not strictly a tutorial, take a look at our dynamic faceting GitHub repo for a complete front-end implementation.  
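
As one possible starting point, the sketch below builds the markup for the “facet-lists” container as a string. It assumes the facets arrive in the shape shown in the response extract (facet key, then value-to-count pairs). `renderFacets` and `escapeHtml` are hypothetical helpers, the `<h3>` title is our own choice, and this is not the code from the repo.

```javascript
// Escapes text before it is placed into HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Builds one titled unordered list per facet key.
function renderFacets(facets) {
  return Object.entries(facets)
    .map(([key, values]) => {
      const items = Object.entries(values)
        .map(([value, count]) => `<li>${escapeHtml(value)} (${count})</li>`)
        .join('');
      return `<h3>${escapeHtml(key)}</h3><ul>${items}</ul>`;
    })
    .join('');
}

const html = renderFacets({
  'Clothing Type': { Shirts: 100, Sneakers: 50 },
  Sleeves: { Short: 30, Long: 10 },
});
// In a browser: document.getElementById('facet-lists').innerHTML = html;
```

Because the function only loops over whatever facets it receives, the same placeholder container works for shirts, shoes, or any future product type.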

Making the solution more robust

Adding number of facets

You’ll want to let your users know how many records have a given facet value. These facet counts are useful because they tell users about the makeup of the search results. For example, it is useful to know that there are more short-sleeved shirts than long-sleeved shirts. The counts are normally calculated in the back end during query execution.
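
To make the idea concrete, here is a minimal sketch of how such counts could be computed from a set of records. `countFacetValues` is a hypothetical name and the sample records are invented for the example; a search engine would typically do this work for you during query execution.

```javascript
// Counts how many records carry each value of the given facet keys.
function countFacetValues(records, facetKeys) {
  const counts = {};
  for (const key of facetKeys) {
    const valueCounts = new Map();
    for (const record of records) {
      const value = record[key];
      if (value === undefined) continue; // record does not have this facet
      valueCounts.set(value, (valueCounts.get(value) ?? 0) + 1);
    }
    counts[key] = Object.fromEntries(valueCounts);
  }
  return counts;
}

// Invented sample records.
const shirts = [
  { name: 'Shirt A', sleeves: 'short', color: 'white' },
  { name: 'Shirt B', sleeves: 'short', color: 'red' },
  { name: 'Shirt C', sleeves: 'long', color: 'white' },
];

const counts = countFacetValues(shirts, ['sleeves', 'color']);
// counts.sleeves is { short: 2, long: 1 }
```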

Adding facet metadata

Every record needs to contain information that tells the process which attributes are facets. To do this, add facet metadata to each record, using an additional attribute that lists the record’s facets:

"facets": ["sleeves", "price", ...]

Going one step further, it’s also useful to include the type of attribute: 

"facets": [
  ["sleeves", "string"],
  ["price", "numeric"]
]

With this information, the front-end code can apply a range slider for price and a dropdown for the sleeves.
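
A front end could use that metadata roughly as follows. This is a sketch only: the widget names, the `checkbox-list` fallback, and the `widgetsFor` function are all assumptions made for illustration.

```javascript
// Hypothetical mapping from facet type to a front-end widget name.
const WIDGET_BY_TYPE = new Map([
  ['numeric', 'range-slider'],
  ['string', 'dropdown'],
]);

// Picks a widget for each [key, type] pair in a record's facet metadata.
// Unknown types fall back to a checkbox list (an assumption of this sketch).
function widgetsFor(facetMetadata) {
  return facetMetadata.map(([key, type]) => ({
    key,
    widget: WIDGET_BY_TYPE.get(type) ?? 'checkbox-list',
  }));
}

const widgets = widgetsFor([
  ['sleeves', 'string'],
  ['price', 'numeric'],
]);
// widgets pairs "sleeves" with a dropdown and "price" with a range slider.
```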

Grouping items using facets

You’ll want to treat color differently from the other facet attributes, because the same shirt can come in several colors. The design question is: do you need 1 record per shirt, with an attribute listing all available colors, or 1 record per color, which requires multiple records for each shirt? Typical database thinking would say, of course, only 1 record. However, as discussed at length in our first article, faceted search is different.

We put all searchable items in separate records. This allows people to find “red shirts” using the search bar without needing to click on a color facet. To accomplish this, we give every record a single color, so a shirt sold in 3 colors is stored as 3 records. However, in the response, we don’t need to return all 3 shirt records. We can collapse them into 1 record, creating a new meta attribute: “available_colors”:

"available_colors": "red, green, blue"
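
Here is one way the collapsing step could look. It is a sketch under an explicit assumption: records that share the same `name` are treated as color variants of one product (a real dataset would more likely use a dedicated product identifier). `collapseByColor` is a hypothetical name.

```javascript
// Collapses per-color records into one record per product.
// Assumption: records that share a `name` are color variants of one product.
function collapseByColor(records) {
  const grouped = new Map(); // product name -> { rest, colors }
  for (const record of records) {
    const existing = grouped.get(record.name);
    if (existing) {
      if (!existing.colors.includes(record.color)) existing.colors.push(record.color);
    } else {
      // Keep the first variant's other attributes and start the color list.
      const { color, ...rest } = record;
      grouped.set(record.name, { rest, colors: [color] });
    }
  }
  return [...grouped.values()].map(({ rest, colors }) => ({
    ...rest,
    available_colors: colors.join(', '),
  }));
}

const collapsed = collapseByColor([
  { name: 'Bold Shirt', color: 'red', sleeves: 'short' },
  { name: 'Bold Shirt', color: 'green', sleeves: 'short' },
  { name: 'Bold Shirt', color: 'blue', sleeves: 'short' },
]);
// collapsed holds a single "Bold Shirt" record whose available_colors
// is "red, green, blue".
```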

Parting words

Our main goal in this article was to add process to data in our series on facets & data. We described a particular way to execute a search and display a set of facets, following a query cycle of request, execution, response, and display. 

To stretch your understanding of facets, we did this within the context of an advanced use case: we made our facet search dynamic. Dynamic faceting creates a more intuitive and useful facet search experience, particularly for businesses that offer a diverse collection of products and services.

See the feature live in our dynamic faceting sandbox, and check out our implementations of dynamic faceting on GitHub: dynamic faceting on the front end and dynamic faceting using Query Rules. You can also find out more about indexing and faceting in our docs on facets.

About the author

Peter Villani

Sr. Tech & Business Writer
