
Every search interface relies on a fast back-end data-indexing process that keeps its search results as up to date as possible. But search indexing is only one side of the coin. The other side is the real-time speed of a high-quality, relevant search engine.

For all search engines, the search request is the highest priority, with indexing a (very) close second. There are several reasons for this, but the most important is a business argument: every search is a potential game changer, a path to a conversion. Any slow or dropped search request, or irrelevant result, is a potential financial or business loss.

To achieve maximum speed & relevance, a search engine must:

  • Prioritize search requests over indexing requests
  • Structure its indexes so that queries execute in real-time (milliseconds), with the best relevance 

As a result, it takes a little extra time to update an index. But if you learn to follow a few indexing best practices, you’ll even things out.

“All well and good,” say the full stack and back-end developers. “I understand the priority of search. But I want to know more about my data. How do I get my data onto your servers? Can it handle my use cases? Does it accept any kind of data? Is it simple, secure, fast?” 

In a recent article on indexing, we explored a variety of advanced use cases, and focused on two search indexing essentials: fast updates and wide applicability. Now it’s time to dig into the code and explain some speed-enhancing algorithms and indexing best practices that ensure you get the highest indexing speed for any search use case.

There are two primary areas to focus on here:

  • The wide applicability of our indexes
  • The high performance of our indexing API to update your data

The wide applicability of our search indexing

To understand indexing on its own terms, we need to decouple it from search and outline the most popular indexing scenarios:

Indexing for search

A well-structured index provides the foundation for a fast and fully-featured customer-facing search interface, with great relevance. In fact, indexing is so important to search & relevance that it needs to be designed and implemented with as much care and dedication as the front end.

Indexing to create a company-wide, multi-purpose, searchable data layer

Multiple indexes can form a single touch point for all back-office data. When put together in a certain way, your indexes can create a company-wide searchable data layer that lies between your back-office and all front ends used internally (employees) or externally (customers, partners).

Indexing as a “matchmaker” – the collaborative indexing use case

The “matchmaker” scenario is when Company X builds an Algolia index and makes it available to external data providers. In this scenario, Company X builds a collaborative website, such as a marketplace or streaming platform, where it displays the products/media of multiple vendors, partners, and contributors. To accomplish this, Company X exposes its Algolia index to these external data providers, allowing them to send data once they understand the format.

Here’s the main difference between the first two scenarios:

  • A single search interface requires at least one index, which should be structured with that interface in mind.
  • In the company-wide data layer scenario, it’s different: you need to generalize the structure of your index(es). The data that makes up this multi-purpose data layer needs to be structured to (a) allow multiple feeds of data from widely different back-office applications, and (b) serve multiple use cases and interfaces, whether user-facing or system-to-system.
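As an illustration of point (a), one way to let several back-office feeds share an index is to wrap every record in a common envelope. The sketch below is only an assumption about how that could look: the function name to_shared_record and the source and type attributes are hypothetical and not part of Algolia's API. Only objectID is an attribute Algolia itself requires.

```python
from typing import Any, Dict


def to_shared_record(source: str, record_type: str, source_id: str,
                     attributes: Dict[str, Any]) -> Dict[str, Any]:
    """Wrap a back-office record in a common envelope for a shared index.

    Hypothetical helper: "source" and "type" are illustrative attribute
    names, chosen so that different feeds can coexist and be filtered.
    """
    return {
        **attributes,
        # Envelope fields come last so a feed cannot overwrite them.
        # Prefixing the ID with the source avoids collisions between feeds.
        "objectID": f"{source}:{source_id}",
        "source": source,
        "type": record_type,
    }
```

With an envelope like this, each interface can restrict its queries to the sources and record types it cares about, while every feed writes to the same index in the same shape.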

What about indexing performance?

Our indexing would not be so widely applicable, nor would it survive the competitive digital business environment, if it did not perform well in all situations. While we offer high indexing speed out of the box, getting the most from it hinges on following indexing best practices. That’s what this article is about.

Just a word about what we mean by “out-of-the-box high performance”. Our indexing comes with the following technologies:

  • A search engine using advanced indexing techniques
  • High-performance bare-metal servers configured for speed
  • A globally available cluster-based cloud infrastructure, with low latency and server redundancy (i.e., no server downtime)
  • An API with a retry method to ensure (contractually) 99.99% availability 

Best practices for fast indexing performance (with code snippets)

The most important indexing practice is to run a batching algorithm that updates multiple records in one indexing operation, in a regular and timely manner. This is true for all use cases. 

Why do we recommend batching? Because there’s a small performance cost to every indexing request. An indexing request involves a small “reindexing” of your entire index, which could take up to 1 second, or more if the index is very large. Thus, sending hundreds of indexing requests, one record at a time, can create an indexing queue that slows down the entire indexing process. To mitigate this, limit the number of indexing requests the server has to process by sending fewer, larger requests.
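To make the difference concrete, here is a small arithmetic sketch of how many indexing requests are needed to send the same data one record at a time versus in batches. The helper name request_count and the record counts are illustrative assumptions, and the sketch counts requests only; it says nothing about how long each request takes.

```python
import math


def request_count(num_records: int, batch_size: int) -> int:
    """Number of indexing requests needed to send num_records in batches."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return math.ceil(num_records / batch_size)


# Sending 100,000 records one at a time versus in batches of 10,000
one_at_a_time = request_count(100_000, 1)
batched = request_count(100_000, 10_000)
print(one_at_a_time, batched)  # prints "100000 10"
```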

Taking all that into account, here are the 3 most important indexing best practices (pretty standard fare for data updates):

  1. Batching updates instead of sending updates one record at a time
  2. Incremental updates instead of full (re)indexing
  3. Partial indexing (updating only changed attributes)

1 – Batch indexing instead of updating one record at a time

One common mistake is to send one record at a time. If your back-end data constantly changes, it would be wrong to send each change as it occurs. As stated above, bottlenecks occur when you create a queue of hundreds of indexing requests that are waiting to be processed.

Instead, as a best practice, use batch indexing. Send each change to a temporary cache, and then send that cache to Algolia at regular intervals, for example, every 5 minutes, or every 30 minutes for larger indexes. Never send batches more often than once per minute, because you’ll end up creating a bottleneck.

This code example builds a new index. It saves the records in chunks of 10,000 using the save_objects method of Algolia’s Python API.

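#python

import json
from algoliasearch import algoliasearch

# Placeholder credentials: replace with your own application ID and admin API key
client = algoliasearch.Client('YourApplicationID', 'YourAdminAPIKey')
algolia_index = client.init_index('bubble_gum')

# Load the full record set from disk
with open('bubble_gum.json') as f:
  records = json.load(f)

# Send the records in chunks of 10,000, one indexing request per chunk
chunk_size = 10000
for i in range(0, len(records), chunk_size):
  algolia_index.save_objects(records[i:i + chunk_size])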

See how our API has automated the batching process.
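The temporary-cache approach described earlier in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not part of Algolia's API: the class name BatchBuffer, its send callback, and the injectable clock are all hypothetical. The send callback stands in for whatever sends a list of records, such as the save_objects method used above.

```python
import time
from typing import Callable, Dict, List


class BatchBuffer:
    """Collects record changes and sends them in at most one batch per interval.

    Hypothetical helper for illustration; not part of the Algolia client.
    """

    def __init__(self, send: Callable[[List[dict]], None],
                 min_interval_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._send = send                      # e.g. algolia_index.save_objects
        self._min_interval = min_interval_seconds
        self._clock = clock                    # injectable so it can be tested
        self._pending: Dict[str, dict] = {}
        self._last_flush = clock()

    def add(self, record: dict) -> None:
        # A later change to the same objectID overwrites the earlier one,
        # so each record is sent at most once per batch.
        self._pending[record["objectID"]] = record

    def flush_if_due(self) -> bool:
        """Send the pending records if the interval has elapsed."""
        if not self._pending:
            return False
        if self._clock() - self._last_flush < self._min_interval:
            return False
        self._send(list(self._pending.values()))
        # Only clear the cache once the send has returned without raising.
        self._pending.clear()
        self._last_flush = self._clock()
        return True
```

In use, your application would call add whenever a record changes and call flush_if_due from a scheduler or background loop. The default of 300 seconds mirrors the 5-minute interval suggested above.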

2 – Incremental updates instead of full indexing

Building on the previous suggestion, you also don’t want to send too many records in a single batch. To reduce indexing request sizes, perform incremental updates, where you send only the records that are new or have changed since the last update.

This code adds a new Bubble Gum series.

#python

# Prices are stored as numbers so they can be sorted and filtered numerically
algolia_index.save_objects([
  {"objectID": "myID1", "item": "Classic Bubble Gum", "price": 3.99},
  {"objectID": "myID2", "item": "Raspberry Bubble Gum", "price": 3.99},
  {"objectID": "myID3", "item": "Cherry Bubble Gum", "price": 3.99},
  {"objectID": "myID4", "item": "Blueberry Bubble Gum", "price": 3.99},
  {"objectID": "myID5", "item": "Mulberry Bubble Gum", "price": 3.99},
  {"objectID": "myID6", "item": "Lemon Bubble Gum", "price": 3.99}
])

Note: It’s a good idea to do a full reindex of all records every night or at least weekly.

Check out our complete incremental updating solution.
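One way to work out which records belong in an incremental update is to compare the previous snapshot of your data with the current one. The sketch below assumes both snapshots fit in memory as dictionaries keyed by objectID. The function name diff_records is hypothetical and not part of Algolia's API.

```python
from typing import Dict, List, Tuple


def diff_records(previous: Dict[str, dict],
                 current: Dict[str, dict]) -> Tuple[List[dict], List[str]]:
    """Return (records to save, objectIDs to delete) between two snapshots."""
    # New records are missing from `previous`; changed records compare unequal.
    to_save = [record for object_id, record in current.items()
               if previous.get(object_id) != record]
    # Records that disappeared from the source data should leave the index too.
    to_delete = [object_id for object_id in previous
                 if object_id not in current]
    return to_save, to_delete
```

The first list can then be passed to save_objects. The second list would go to the client's delete method (delete_objects in the Python client, assuming your client version provides it).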

3 – Partial indexing (updating only changed attributes)

To lower the indexing traffic even more, you’ll want to send only the attributes that have changed, not the whole record. For this, you’ll use a partial indexing strategy.

This code changes only the price of some of the Bubble Gums, no other attribute.

#python

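# partial_update_objects changes only the attributes you send.
# save_objects would replace each record entirely and drop its other attributes.
algolia_index.partial_update_objects([
  {'objectID': 'myID1', 'price': 4.99},
  {'objectID': 'myID3', 'price': 4.99},
  {'objectID': 'myID6', 'price': 2.99}
])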

Check out our complete partial-indexing solution.
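If you keep a copy of what you last sent, the partial updates can be derived automatically rather than written by hand. The following sketch makes that assumption and uses a hypothetical helper, build_partial_updates, which is not part of Algolia's API. It handles attributes that were added or changed. It does not handle attributes that were removed.

```python
from typing import Dict, List


def build_partial_updates(previous: Dict[str, dict],
                          current: Dict[str, dict]) -> List[dict]:
    """Return one partial record per objectID, holding only changed attributes."""
    updates = []
    for object_id, record in current.items():
        old = previous.get(object_id)
        if old is None:
            continue  # a brand-new record needs a full save, not a partial update
        changed = {key: value for key, value in record.items()
                   if key != "objectID" and (key not in old or old[key] != value)}
        if changed:
            # objectID is always included so the update can target the record
            updates.append({"objectID": object_id, **changed})
    return updates
```

The resulting list has the same shape as the price update shown above, so it can be sent with a partial-update call.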

Next readings

Our first article on indexing presented a high-level overview of standard and advanced indexing use cases. This article walked you through indexing best practices and the implementation details of a standard indexing process. Our next article discusses how to optimize indexing in advanced use cases.

Our remaining articles will provide front-end and back-end code for some of the advanced indexing use cases we discussed, starting with real-time pricing.

To get started with indexing, you can upload your data for free, or get a customized demo from our search experts today.

About the author
Peter Villani

Sr. Tech & Business Writer
