Search by Algolia
AI-powered search: From keywords to conversations
ai

AI-powered search: From keywords to conversations

By now, everyone’s had the opportunity to experiment with AI tools like ChatGPT or Midjourney and ponder their inner ...

Chris Stevenson

Director, Product Marketing

Vector vs Keyword Search: Why You Should Care
ai

Vector vs Keyword Search: Why You Should Care

Search has been around for a while, to the point that it is now considered a standard requirement in many ...

Nicolas Fiorini

Senior Machine Learning Engineer

What is AI-powered site search?
ai

What is AI-powered site search?

With the advent of artificial intelligence (AI) technologies enabling services such as Alexa, Google search, and self-driving cars, the ...

John Stewart

VP Corporate Marketing

What is a B2B marketplace?
e-commerce

What is a B2B marketplace?

It’s no secret that B2B (business-to-business) transactions have largely migrated online. According to Gartner, by 2025, 80 ...

Vincent Caruana

Sr. SEO Web Digital Marketing Manager

3 strategies for B2B ecommerce growth: key takeaways from B2B Online - Chicago
e-commerce

3 strategies for B2B ecommerce growth: key takeaways from B2B Online - Chicago

Twice a year, B2B Online brings together industry leaders to discuss the trends affecting the B2B ecommerce industry. At the ...

Elena Moravec

Director of Product Marketing & Strategy

Deconstructing smart digital merchandising
e-commerce

Deconstructing smart digital merchandising

This is Part 2 of a series that dives into the transformational journey made by digital merchandising to drive positive ...

Benoit Reulier
Reshma Iyer

Benoit Reulier &

Reshma Iyer

The death of traditional shopping: How AI-powered conversational commerce changes everything
ai

The death of traditional shopping: How AI-powered conversational commerce changes everything

Get ready for the ride: online shopping is about to be completely upended by AI. Over the past few years ...

Aayush Iyer

Director, User Experience & UI Platform

What is B2C ecommerce? Models, examples, and definitions
e-commerce

What is B2C ecommerce? Models, examples, and definitions

Remember life before online shopping? When you had to actually leave the house for a brick-and-mortar store to ...

Catherine Dee

Search and Discovery writer

What are marketplace platforms and software? Why are they important?
e-commerce

What are marketplace platforms and software? Why are they important?

If you imagine pushing a virtual shopping cart down the aisles of an online store, or browsing items in an ...

Vincent Caruana

Sr. SEO Web Digital Marketing Manager

What is an online marketplace?
e-commerce

What is an online marketplace?

Remember the world before the convenience of online commerce? Before the pandemic, before the proliferation of ecommerce sites, when the ...

Catherine Dee

Search and Discovery writer

10 ways AI is transforming ecommerce
e-commerce

10 ways AI is transforming ecommerce

Artificial intelligence (AI) is no longer just the stuff of scary futuristic movies; it’s recently burst into the headlines ...

Catherine Dee

Search and Discovery writer

AI as a Service (AIaaS) in the era of "buy not build"
ai

AI as a Service (AIaaS) in the era of "buy not build"

Imagine you are the CTO of a company that has just undergone a massive decade long digital transformation. You’ve ...

Sean Mullaney

CTO @Algolia

By the numbers: the ROI of keyword and AI site search for digital commerce
product

By the numbers: the ROI of keyword and AI site search for digital commerce

Did you know that the tiny search bar at the top of many ecommerce sites can offer an outsized return ...

Jon Silvers

Director, Digital Marketing

Using pre-trained AI algorithms to solve the cold start problem
ai

Using pre-trained AI algorithms to solve the cold start problem

Artificial intelligence (AI) has quickly moved from hot topic to everyday life. Now, ecommerce businesses are beginning to clearly see ...

Etienne Martin

VP of Product

Introducing Algolia NeuralSearch
product

Introducing Algolia NeuralSearch

We couldn’t be more excited to announce the availability of our breakthrough product, Algolia NeuralSearch. The world has stepped ...

Bernadette Nixon

Chief Executive Officer and Board Member at Algolia

AI is eating ecommerce
ai

AI is eating ecommerce

The ecommerce industry has experienced steady and reliable growth over the last 20 years (albeit interrupted briefly by a global ...

Sean Mullaney

CTO @Algolia

Semantic textual similarity: a game changer for search results and recommendations
product

Semantic textual similarity: a game changer for search results and recommendations

As an ecommerce professional, you know the importance of providing a five-star search experience on your site or in ...

Vincent Caruana

Sr. SEO Web Digital Marketing Manager

What is hashing and how does it improve website and app search?
ai

What is hashing and how does it improve website and app search?

Hashing.   Yep, you read that right.   Not hashtags. Not golden, crisp-on-the-outside, melty-on-the-inside hash browns ...

Catherine Dee

Search and Discovery writer

Looking for something?

facebookfacebooklinkedinlinkedintwittertwittermailmail

Every search interface relies on a fast back-end data-indexing process that keeps its search results up to date in as timely a manner as possible. But search indexing is only one side of the coin. The other side is the real-time speed of a high-quality relevant search engine. 

For all search engines, the search request is the highest priority, with indexing a (very) close second. There are several reasons for this, but the most important is a business argument: every search is a potential game changer, a path to a conversion. Any slow or dropped search request, or irrelevant result, is a potential financial or business loss.

To achieve maximum speed & relevance, a search engine must:

  • Prioritize search requests over indexing requests
  • Structure its indexes so that queries execute in real-time (milliseconds), with the best relevance 

As a result, it takes a little extra time to update an index. But if you learn to follow a few indexing best practices, you’ll even things out.

“All well and good,” say the full stack and back-end developers. “I understand the priority of search. But I want to know more about my data. How do I get my data onto your servers? Can it handle my use cases? Does it accept any kind of data? Is it simple, secure, fast?” 

In a recent article on indexing, we explored a variety of advanced use cases, and focused on two search indexing essentials: fast updates and wide applicability. Now it’s time to dig into the code and explain some speed-enhancing algorithms and indexing best practices that ensure you get the highest indexing speed for any search use case

There are two primary areas to focus on here:

  • The wide applicability of our indexes
  • The high performance of our indexing API to update your data

The wide applicability of our search indexing

To understand indexing on its own terms, we need to decouple it from search and outline the most popular indexing scenarios:

Indexing for search

A well-structured index provides the foundation for a fast and fully-featured customer-facing search interface, with great relevance. In fact, indexing is so important to search & relevance that it needs to be designed and implemented with as much care and dedication as the front end.

Indexing to create a company-wide, multi-purpose, searchable data layer

Multiple indexes can form a single touch point for all back-office data. When put together in a certain way, your indexes can create a company-wide searchable data layer that lies between your back-office and all front ends used internally (employees) or externally (customers, partners).

Indexing as a “matchmaker” – the collaborative indexing use case

The “matchmaker” scenario is when Company X builds an Algolia index and makes it available to external data providers. In this scenario, Company X builds a collaborative website, such as a marketplace or streaming platform, where it displays the products/media of multiple vendors, partners, and contributors. To accomplish this, Company X exposes its Algolia index to these external data providers, allowing them to send data once they understand the format.

Here’s the main difference between the first two scenarios:

  • A single search interface requires at least one index, which should be structured with that interface in mind.
  • In the company-wide data layer scenario, it’s different: you need to generalize the structure of your index(es). The data that makes up this multi-purpose data layer needs to be structured to (a) allow multiple feeds of data from widely different back-office applications, and (b) serve multiple use cases and interfaces, whether user-facing or system-to-system.

What about indexing performance?

The wide applicability of our indexing wouldn’t be possible nor survive the competitive digital business environment if it were not performant in all situations. While we offer high indexing speed out of the box, this hinges on implementing best indexing practices. That’s what this article is about.

Just a word about what we mean by “out-of-the-box high performance”. Our indexing comes with the following technologies:

  • A search engine using advanced indexing techniques
  • High-performant bare-metal servers configured for performance 
  • A globally available cluster-based cloud infrastructure, with low latency and server redundancy (i.e., no server downtime)
  • An API with a retry method to ensure (contractually) 99.99% availability 

Best practices for fast indexing performance (with code snippets)

The most important indexing practice is to run a batching algorithm that updates multiple records in one indexing operation, in a regular and timely manner. This is true for all use cases. 

Why do we recommend batching? Because there’s a small performance cost to every indexing request. An index request involves a small “reindexing” of your entire index, which could take up to 1 second, or more if the index is very large. Thus, sending 100s of indexing requests, one record at a time, can create an indexing queue that will slow down the entire indexing process. To mitigate this, it’s important to limit the number of indexing requests made on the server by sending less requests.

Taking all that into account, here are the 3 most important indexing best practices (pretty standard fare for data updates):

  1. Batching updates instead of sending updates one record at a time
  2. Incremental updates instead of full (re)indexing
  3. Partial indexing (updating only changed attributes)

1 – Batch indexing instead of updating one record at a time

One common mistake is to send one record at a time. If your back-end data constantly changes, it would be wrong to send each change as it occurs. As stated above, bottlenecks occur when you create a queue of 100’s of indexing requests that are waiting to be processed.

Instead, as a best practice, use batch indexing. You send each change to a temporary cache, and then regularly send that cache to Algolia, for example, every 5 minutes or 30 minutes for larger indexes. Never batch faster than 1 minute, because you’ll end up creating a bottleneck. 

This code example builds a new index. It batch-saves 10000 record-chunks using the save_objects method of Algolia’s Python API.

#python

import json
from algoliasearch import algoliasearch

client = algoliasearch.Client('YourApplicationID', 'YourAdminAPIKey')
algolia_index = client.init_index('bubble_gum')

with open('bubble_gum.json') as f:
  records = json.load(f)

  chunk_size = 10000
  for i in range(0, len(records), chunk_size):
    algolia_index.save_objects(records[i:i + chunk_size])

See how our API has automated the batching process.

2 – Incremental updates instead of full indexing

Improving upon the previous suggestion, you don’t want to send too many records in a single batch. To reduce indexing request sizes, you should perform incremental updates, where you update only the new records.

This code adds a new Bubble Gum series.

#python

algolia_index.save_objects([
  {"objectID": "myID1", "item": "Classic Bubble Gum", "price": "3.99"},
  {"objectID": "myID2", "item": "Raspberry Bubble Gum", "price": "3.99"},
  {"objectID": "myID3", "item": "Cherry Bubble Gum", "price": "3.99"},
  {"objectID": "myID4", "item": "Blueberry Bubble Gum", "price": "3.99"},
  {"objectID": "myID5", "item": "Mulberry Bubble Gum", "price": "3.99"},
  {"objectID": "myID6", "item": "Lemon Bubble Gum", "price": "3.99"}
])

Note: It’s a good idea to do a full reindex of all records every night or at least weekly.

Check out our complete incremental updating solution.

3 – Partial indexing (updating only changed attributes)

To lower the indexing traffic even more, you’ll want to send only the attributes that have changed, not the whole record. For this, you’ll use a partial indexing strategy.

This code changes only the price of some of the Bubble Gums, no other attribute.

#python

algolia_index.save_objects([
  {'objectID': 'myID1', 'price': 4.99},
  {'objectID': 'myID3', 'price': 4.99},
  {'objectID': 'myID6', 'price': 2.99}
])

Check out our complete partial-indexing solution.

Next readings

Our first article on indexing presented a high-level overview of standard and advanced indexing use cases. This article walked you through indexing best practices and the implementation details of a standard indexing process. Our next article discusses how to optimize indexing in advanced use cases.

Our remaining articles will provide front & back end code for some of the advanced indexing use cases we discussed, starting with real-time pricing.

To get started with indexing, you can upload your data for free, or get a customized demo from our search experts today.

About the author
Peter Villani

Sr. Tech & Business Writer

linkedinmediumtwitter

Recommended Articles

Powered byAlgolia Algolia Recommend

An Exploration of Search and Indexing: Fast Indexing Scenarios
product

Peter Villani

Sr. Tech & Business Writer

How to optimize an already fast indexing process (advanced use cases)
engineering

Peter Villani

Sr. Tech & Business Writer

Inside the Algolia Engine Part 1 — Indexing vs. Search
engineering

Julien Lemoine

Co-founder & former CTO at Algolia