We don’t always write about the data or content side of search, which we call indexing. To be honest, we take indexing for granted. We say, “send us your data”, and then, once that’s out of the way, you can start focusing on the critical parts – building powerful customer-facing search interfaces, increasing conversions, customer engagement, and catalog discovery.
Not that there’s anything wrong with primarily writing about how to build a great business-value search UI/UX. But by not talking about how Algolia indexes your data, we skip over a step that lies at the heart of what we offer and creates just as much value for your business as the search interfaces that you build.
The value of search indexing is that it can transform any and all back-office data and file types into searchable data, enabling fast and relevant search on any device, globally, for any user – customers, partners, and employees. Once this multitude of data is placed into a single touch point between the back and front ends, you can leverage Algolia’s fully-featured Search & AI in all facets of your business, to handle any use case, internally or online.
In this respect, we can think of Indexing as on a par with Search and Recommend.
Most companies have one or more general-purpose databases that store business-critical content like product data, sales history, and customer, inventory, and financial information. Algolia is not where this data is stored originally. Algolia is not a general-purpose storage database. Nor is it an application that manages a specific business functionality, like customer relations or product and inventory management.
We don’t expect every sale to be sent to Algolia. We don’t provide that kind of search. That’s the territory of a purpose-specific POS system. But Algolia does have a role in the back office: it is built to interface with every source of truth, to provide a fast and relevant searchable dataset fed by any back-office system. Its cloud infrastructure provides a reliable and secure searchable data layer that lies between the back-office data source and any front-end that serves data to users.
Algolia’s power is to help surface the content in these back-office systems, to lead users to the original information. For example, Algolia can be used to store just enough information about products – price, image, description, amount available, and popularity – to help customers buy or employees manage. In this use case, clicking on a product should take the user back to a product page, whose detail comes directly from the original source of data.
To serve all front-end search needs, it’s important that Algolia provide a solid indexing foundation. We will soon be publishing a series of articles on “headless search”, where we’ll discuss how our API-first indexing decouples the back end from the front end to create this middle layer of searchable data. Be the first to find out about these articles, as well as other current and future content, by subscribing to our newsletter.
Let’s break down the differences between Algolia’s search and indexing options and offerings.
We’ve designed and optimized Algolia to provide instantaneous search results, so that there is no delay between typing a letter and getting a response. Our SLA commits us to search availability 99.99% of the time. Here are some details of our software’s commitments and offerings:
Behind every search lies a simple indexing framework – a single call to send any data from any system in a completely flexible format.
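To make the "single call" idea concrete, here is a minimal sketch of what such a request could look like. It assumes the REST batch endpoint (`/1/indexes/{indexName}/batch`) and the `updateObject` action, which adds a record or replaces one with the same `objectID`. The credentials, the index name, the helper `build_batch_request`, and the sample records are all hypothetical placeholders. The request is only built, not sent, so the sketch runs offline.

```python
import json
from urllib.request import Request

# Hypothetical placeholders: replace with real credentials and an index name.
APP_ID = "YOUR_APP_ID"
API_KEY = "YOUR_WRITE_API_KEY"
INDEX_NAME = "products"


def build_batch_request(records):
    """Wrap records from any source system into one batch indexing request."""
    payload = {
        # Assumption: "updateObject" adds the record or replaces an existing one.
        "requests": [{"action": "updateObject", "body": record} for record in records]
    }
    return Request(
        url=f"https://{APP_ID}.algolia.net/1/indexes/{INDEX_NAME}/batch",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "X-Algolia-Application-Id": APP_ID,
            "X-Algolia-API-Key": API_KEY,
            "Content-Type": "application/json",
        },
        method="POST",
    )


# The records share no schema: each one is a free-form JSON object.
records = [
    {"objectID": "sku-1", "name": "Desk lamp", "price": 39.0, "stock": 12},
    {"objectID": "doc-7", "title": "Returns policy", "audience": ["employees"]},
]
request = build_batch_request(records)
# urllib.request.urlopen(request) would send it; omitted here on purpose.
```

Note that a product record and an internal document travel in the same call, which is the flexibility the paragraph above describes.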
Why do we give search priority over indexing? Two main reasons:
The typical search and indexing process goes as follows:
Note that the re-indexing needs to be regular, but not for every sale. A single sale does not change the catalog. And it is rare that a significant change occurs in the course of a single day, let alone minutes. Price, availability, promotion campaigns, and custom rankings like popularity take time to adjust.
Thus, the most important concern for our customers is to ensure that all searches are a success and instantaneous, and that the back-end updates are regular and responsive enough so that significant changes are taken into account in a timely manner. Usually, this means every 30 minutes, or even 1 hour, or overnight. Every company chooses its best timeframe for its various use cases.
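As an illustration of a regular update, the sketch below compares the catalog snapshot from the previous run with the current one and produces only the operations that changed. This is one possible approach under stated assumptions, not a prescribed method: the function `diff_catalog`, the snapshot shape (a dictionary keyed by `objectID`), and the sample data are hypothetical, and the action names assume the batch operation format of the REST API.

```python
def diff_catalog(previous, current):
    """Compare two catalog snapshots keyed by objectID and return batch operations."""
    operations = []
    # New or changed records are sent again in full.
    for object_id, record in current.items():
        if previous.get(object_id) != record:
            operations.append(
                {"action": "updateObject", "body": {"objectID": object_id, **record}}
            )
    # Records that disappeared from the source are removed from the index.
    for object_id in sorted(previous.keys() - current.keys()):
        operations.append({"action": "deleteObject", "body": {"objectID": object_id}})
    return operations


# Hypothetical snapshots taken 30 minutes apart.
previous = {
    "sku-1": {"price": 39.0, "stock": 12},
    "sku-2": {"price": 15.0, "stock": 0},
    "sku-3": {"price": 8.5, "stock": 40},
}
current = {
    "sku-1": {"price": 35.0, "stock": 12},  # price changed
    "sku-3": {"price": 8.5, "stock": 40},   # unchanged, so skipped
    "sku-4": {"price": 22.0, "stock": 5},   # new product
}
operations = diff_catalog(previous, current)
```

Running this on a schedule (every 30 minutes, hourly, or overnight) keeps each indexing request proportional to what actually changed rather than to the size of the catalog.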
There are, however, some active timeframes, where a company needs 10-second updates, or even instant updates. Let’s discuss these unique use cases.
Let’s look at a definition before getting started.
Real-time can mean:
The following use cases will require one of these.
The crisis starts with a container ship lodged in the bed of the Suez Canal. It’s not only incapable of delivering its own goods, it’s also blocking other container ships and therefore a large portion of the world’s supply of goods. On the sales side, ecommerce companies all over the world have to rethink their stock immediately, to avoid out-of-stock orders. They also have to find replacement items and adjust their promotional campaigns, to keep their online catalogs as engaging as before the crisis. While their IT departments re-index the online catalogs to remove the blocked items, their merchandisers set out to procure the replacement items and rethink their promotions.
On the other side of the equation, wholesale suppliers have to match the new demand by removing the blocked items from their B2B catalogs and pushing different products to their buyers.
Indexing shows up in three parts:
Another case requiring a fast indexing response occurs in the 5 weeks between Black Friday and the end of the Christmas season. In this scenario, companies need to constantly monitor their stocks and keep ahead of the market, to ensure they can fulfill every order and respond to their customers’ needs with the best and most important items in every search and promotion. In this case, they monitor fairly often – some companies multiple times throughout the day.
In this scenario, we have a large marketplace with independent vendors who are constantly updating their inventories. For the most part, these vendors do not own a full stock of items; instead, they own one of each product. Therefore, one sale can make a product unavailable, and so they need to remove objects as soon as a sale is made. Some marketplaces allow users to put a temporary hold on objects that are added to a wish list.
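The one-of-each situation can be sketched as a small piece of bookkeeping that turns holds and sales into index operations. Everything here is an assumption made for illustration: the class `OneOfAKindInventory`, the 15-minute hold duration, and the `on_hold` attribute are hypothetical, and the action names assume the batch operation format of the REST API.

```python
import time

HOLD_SECONDS = 15 * 60  # hypothetical hold duration


class OneOfAKindInventory:
    """Tracks single-unit items and emits the matching index operations."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._holds = {}  # objectID -> time at which the hold expires
        self._sold = set()

    def is_available(self, object_id):
        if object_id in self._sold:
            return False
        expiry = self._holds.get(object_id)
        return expiry is None or expiry <= self._clock()

    def place_hold(self, object_id):
        """Return a partial update flagging the hold, or None if unavailable."""
        if not self.is_available(object_id):
            return None
        self._holds[object_id] = self._clock() + HOLD_SECONDS
        # "on_hold" is a hypothetical attribute the front end could filter on.
        return {
            "action": "partialUpdateObject",
            "body": {"objectID": object_id, "on_hold": True},
        }

    def record_sale(self, object_id):
        """Mark the item as sold and return the operation that removes it."""
        self._sold.add(object_id)
        self._holds.pop(object_id, None)
        return {"action": "deleteObject", "body": {"objectID": object_id}}


inventory = OneOfAKindInventory()
hold_operation = inventory.place_hold("vase-1")
sale_operation = inventory.record_sale("vase-1")
```

The clock is injectable so that hold expiry can be exercised in tests without waiting. A hold that has expired simply makes the item available again the next time it is checked.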
Every marketplace website comes up with a strategy to manage this unpredictable and fast-moving flux of out-of-stock and product holds. Here are some strategies:
eBay adds an additional complication, where the buying and selling of items is joined by an auction and bids that change prices on the fly. The way you would handle this kind of real-time updating depends on the UI. However, this case comes very close to needing real-time data processing on both search and indexing. Let’s look closer at this last use case.
The situation here is that every time a vendor makes a change to a product – especially price – they want users to see this change (and any associated calculations) immediately. As such, it is no different than a typical stock market tool, where you can see prices change like flickering lights. Consider an automated trading tool that buys and sells stocks as stock prices change. Any delay impacts the competitiveness of the system and could impact large amounts of money.
One can question whether any B2C scenario has such a need for real-time indexing. Do prices change that much, that quickly? Do vast numbers of people and systems buy and sell goods based on the smallest price change?
Perhaps B2B has a better case for needing instantaneous price changes to be displayed. It is quite possible that suppliers outbid each other in real time.
In any bidding universe – we can add eBay to this – it is critical to have the fastest indexing response times.
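One way to keep up with a rapid stream of bids is to send only the latest price per item in each indexing request. The sketch below is an assumption about how a bidding back end might prepare such a request, not a description of how any particular marketplace works: the function `coalesce_price_updates` and the event shape are hypothetical, and `partialUpdateObject` assumes the batch operation format of the REST API.

```python
def coalesce_price_updates(events):
    """Keep only the latest price per objectID, in first-seen order."""
    latest = {}
    for object_id, price in events:
        latest[object_id] = price  # later bids overwrite earlier ones
    return [
        {"action": "partialUpdateObject", "body": {"objectID": object_id, "price": price}}
        for object_id, price in latest.items()
    ]


# Hypothetical bids collected since the previous indexing request.
events = [("lot-1", 100), ("lot-2", 40), ("lot-1", 105), ("lot-1", 110)]
operations = coalesce_price_updates(events)
```

Four bids become two operations, so the index only ever receives the price that users should currently see.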
This is where we get to the heart of fast indexing. Is any search engine’s indexing going to be as fast as its search functionality? That is, while a search engine provider contractually commits to displaying instant results as users type, it does not commit in the same way to “instant” or near real-time indexing. This is because, in order to achieve real-time search (search in milliseconds), you must index data in a certain way that inevitably takes time (1 to 10 seconds, depending on the size of the index and the number of updates in the indexing request).
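The trade-off can be illustrated with a toy index. This is a deliberately simplified sketch and not how Algolia's engine is implemented: the class `PrefixIndex` is hypothetical and exists only to show the general principle that work done ahead of time at indexing is what makes each keystroke cheap at search.

```python
from collections import defaultdict


class PrefixIndex:
    """Toy search-as-you-type index: slow to write, fast to read."""

    def __init__(self):
        self._by_prefix = defaultdict(set)

    def add(self, object_id, text):
        """Index time: precompute every prefix of every word (the costly part)."""
        for word in text.lower().split():
            for end in range(1, len(word) + 1):
                self._by_prefix[word[:end]].add(object_id)

    def search(self, typed):
        """Query time: a single dictionary lookup per keystroke (the cheap part)."""
        return sorted(self._by_prefix.get(typed.lower(), set()))


index = PrefixIndex()
index.add("sku-1", "Desk lamp")
index.add("sku-2", "Floor lamp")
lamp_matches = index.search("la")
desk_matches = index.search("De")
```

Adding a record touches one entry per prefix, while a search touches exactly one entry. The more that is precomputed for the reader, the longer each write takes.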
Algolia values a faster-than-database search (milliseconds) at the expense of a slower-than-database indexing process (seconds). And as seen in the advanced use cases above, even in a crisis or constantly changing inventory / pricing environment, the speed of our indexing engine is reliable, responsive, and exceeds expectations.
This article has presented a high-level overview of standard and advanced indexing use cases. Our next article walks you through indexing best practices and the implementation details of a standard indexing process. That’s followed up by an article on how to optimize indexing in advanced use cases.
Our remaining articles will provide front & back end code for some of the advanced indexing use cases we discussed, starting with real-time pricing.
To find out how Algolia’s powerful indexing and cloud infrastructure can transform your digital strategy, sign up for free and see it for yourself. Or get a customized demo from our search experts today.
Peter Villani
Sr. Tech & Business Writer