Before comparing Algolia to Elasticsearch, it’s good to understand a few things about the nature of search.

Search architecture is unique

The type and quality of search experience you can deliver depends heavily on your choice of search engine, hardware, datacenter region and front-end web and mobile development frameworks. It’s important to make the right choice for each part of the stack but it’s equally important to make a set of choices that work together as a whole. Because search user experience goals are so demanding, a vertically-integrated approach to architecture is more important than for other types of applications. Latency, for example, is not only a function of the search engine but of every step between the search user interface and backend infrastructure.

Search is mission critical

Search is one of the hardest features to get right, both because users benchmark search experiences against Google and Amazon, and because search is a balance of multiple disciplines, including UX, relevance tuning and performance optimization. Development teams often delay building search because of a lack of confidence in getting it right and the fear that it will take longer than expected. Yet search is often the most mission-critical feature: the quality of an application’s search has a big influence on the perception of the application’s overall quality. In domains like e-commerce, introducing a search bug can cost millions of dollars.

The combination of these factors makes search one of the riskiest areas of development for business and consumer applications. When comparing different ways of delivering a solution, like Algolia and Elasticsearch, we want to look at how each approach specifically addresses the full, end-to-end set of risks. In this comparison, we will look not only at the search engine but the full search architecture, starting with end-to-end latency.

Mission-critical search for a global user base

There are many different types of search applications. To focus the comparison of Algolia and Elasticsearch, we want to home in on a specific family of use cases, which we refer to as consumer-grade search. Consumer-grade search is the type of experience delivered by companies like Google, Amazon and Facebook to billions of people worldwide. It connects people with products, content and key pieces of structured data. It is fast, reliable, works on multiple platforms and the results are highly relevant.

The search tolerates misspellings, alternate phrasings or user mistakes. Relevance is not caveat emptor, it’s caveat venditor – the search must adapt to the user, not the other way around. Consumers have high expectations of relevance but equally demanding expectations for the user interface. They expect an effortless, multi-faceted search and browsing experience, the kind pioneered by sites like Amazon.

Consumer-grade search doesn’t just apply to consumer-facing applications. Today’s business application users have become just as demanding, in part because many business applications are now distributed in app stores and compete directly with consumer versions.

The expectations of the average user can seem unattainably high, but this is why Algolia exists. Algolia is laser-focused on helping customers meet the perfectly unreasonable search expectations of the average Internet user.

About Elasticsearch

As a search engine that also functions as a scalable NoSQL database, Elasticsearch accommodates many different types of applications while not being opinionated toward one specific case. Elasticsearch is used for search but also log processing, real-time analytics, running map-reduce and other distributed algorithms, and even as an application’s primary database. The breadth of Elasticsearch is impressive and it does things that Algolia is not well suited for – streaming logs, map/reduce querying, complex aggregations and operating on billions of documents at a time. Algolia itself has used Elasticsearch internally for tasks like storing logs and computing rollups.

In this comparison, however, we are focusing on consumer-grade search. This is the most common situation we are asked to compare. Building a consumer-grade search application with Elasticsearch requires a nontrivial amount of backend and front-end software engineering. There are many more steps than just provisioning and operating an Elasticsearch cluster.

In this series we’ll dive into what some of those steps are; however, you can already take a look at how Algolia solves for these steps in our Inside the Engine series. In it we explore implementation details like I/O optimization, query tokenization, multi-attribute relevance, highlighting and synonym handling. These are features that must be accounted for in any search project, including those with Elasticsearch at the core.

End-to-end latency budget

The first feature of search is speed. Whole-transaction latency, from keystroke to visible search result, is what forms the user’s first impression of a search. A search application architect needs to have this in mind from the beginning, as a huge number of factors can affect the end-to-end latency.

To make things more difficult, for consumer-grade search the upper bound on satisfactory end-to-end latency is very, very low. Most consumer search experiences, including Google, Facebook and those of Algolia customers, deliver new results with every keystroke. This type of experience, known as instant search, is loved by users for its interactive feel but it only works if search results can be returned in the blink of an eye. Less, even: a human eye blink takes 300-400 milliseconds. An instant search starts to feel laggy at only 100 milliseconds.

For as-you-type search to be as satisfying as possible, Algolia recommends the end-to-end latency be no more than 50ms. This is the speed at which search feels truly real-time, where the user feels in full control of the experience. Under these conditions, users are likely to keep reformulating their query until they find what they’re looking for, rather than abandon or bounce.

If you’re using Elasticsearch or Algolia to power an as-you-type search, these are the important numbers to keep in mind as you design your architecture. It is possible to consistently reach these numbers if you know 1) where latency is likely to accumulate and 2) how to reduce or eliminate it.
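One way to keep these numbers in view is to write the latency budget down and add it up. The sketch below is only an illustration: the 50 ms and 100 ms thresholds come from the text above, while the stage names, the per-stage timings and the "acceptable" label for the range in between are hypothetical placeholders, not measurements of either product.

```python
from dataclasses import dataclass

# Thresholds taken from the text: 50 ms is the recommended ceiling,
# 100 ms is where instant search starts to feel laggy.
RECOMMENDED_MS = 50.0
LAGGY_MS = 100.0


@dataclass(frozen=True)
class Stage:
    """One step between the keystroke and the visible result."""
    name: str
    latency_ms: float


def assess_budget(stages):
    """Sum per-stage latencies and classify the end-to-end total."""
    total = sum(stage.latency_ms for stage in stages)
    if total <= RECOMMENDED_MS:
        verdict = "real-time"
    elif total < LAGGY_MS:
        verdict = "acceptable"  # label invented for this sketch
    else:
        verdict = "laggy"
    slowest = max(stages, key=lambda stage: stage.latency_ms)
    return total, verdict, slowest.name


# Hypothetical measurements for one keystroke-to-result round trip.
example_stages = [
    Stage("DNS lookup (cached)", 1.0),
    Stage("network round trip", 30.0),
    Stage("load balancing / coordination", 4.0),
    Stage("query execution and sorting", 10.0),
    Stage("front-end rendering", 8.0),
]

total_ms, verdict, slowest = assess_budget(example_stages)
print(f"{total_ms:.0f} ms end to end: {verdict}; largest contributor: {slowest}")
```

With these made-up numbers the network round trip dominates, which is the kind of finding that tells you where to spend effort first.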

That’s what we’ll look at in the following side-by-side table: how Algolia reduces latency in each layer of the stack, where latency can accumulate inside of Elasticsearch, and what work can be done inside or on top of an Elasticsearch implementation to reduce the risk of added latency.

| Latency factor | Algolia | Elasticsearch |
| --- | --- | --- |
| Global User Base: Low device-to-datacenter latency requires infrastructure in multiple regions. Tip: add 1-2ms for every 124 miles of distance over fiber. | Automatically replicate indices to any of 15 regions throughout the world using our Distributed Search Network (DSN). | It is possible to cluster Elasticsearch across multiple data centers, but it is not recommended. The recommended solutions involve replicating manually via a messaging queue to clusters that are not aware of each other. |
| RAM: If a query’s data is not all in RAM, it may have to load data from the much slower disk. | Algolia indices are stored in RAM (256GB or more) and memory mapped to the nginx process. No pre-loading (warming) is required to get great performance for the first query. | The ES cluster must have enough RAM and be properly tuned to make sure large indices stay in memory. If you are also supporting an analytics workload, you risk large analytics queries evicting data for searches. |
| Virtualization: In shared hosting environments like AWS, performance can fluctuate because of contention with other customers. | Algolia runs on bare metal hardware with high-frequency 3.5GHz-3.9GHz Intel Xeon E5–1650v3 or v4 CPUs. Clock speed is directly related to search speed. | Elasticsearch can be deployed on bare metal and optimized hardware, but at a premium cost compared to AWS or cloud-based solutions. |
| Sorting: Before results are presented to the user, they have to be put in the right order. | Algolia presorts results at indexing time according to the relevance formula and custom ranking. There is a minimal sorting step at the end to account for dynamic criteria like the number of typos and proximity of words. | Sorting is done at the end of each query. Depending on the number of results to be sorted, this can impact latency. |
| Relevance: Speed is often traded off to get better relevance. | Tokenization required for partial word matching and typo tolerance is done mostly at indexing time. | Advanced techniques like ngrams, shingles and fuzzy matching make indices larger and also require analysis at query time. |
| DNS: DNS can be slow before it’s cached by the user’s device. If a DNS provider is under DDoS, requests will be slow or fail to complete. | Algolia uses two DNS providers to increase reliability. Logic to fall back from one to the other is part of all API clients. | Elasticsearch does not provide out-of-the-box support for redundant DNS, but you could build it yourself. |
| Load Balancing: Load balancing and coordination can cause network congestion and add latency. | Algolia API clients connect directly to the server with data on it. There is no network hop or single point of failure for reaching a cluster. | An ES cluster needs the right ratio of data nodes and coordinating nodes to avoid adding latency. 10G network bandwidth is recommended for large clusters. |
| Garbage Collection: Applications running in the JVM require momentary pauses to free up used memory. During these pauses, requests are queued. | The Algolia engine is written in C++; it does not use the JVM. | The JVM can be tuned to reduce the frequency and impact of GC pauses. The tuning depends on the workload and server resources available. This is a painstaking process about which much has been written. |
| Sharding: Sharding allows data to be scaled across multiple indices. Overloaded shards exhibit degraded performance. | Algolia handles any required sharding behind the scenes; it is invisible to the user. Shards can be dynamically rebalanced to avoid hot spots. | If original shard assumptions are wrong, such as the choice of a shard key, an Elasticsearch cluster will have to be rebuilt or rebalanced down the road to alleviate performance hotspots. |
| Heavy Indexing: Large indexing operations can negatively impact search performance because they compete for the same CPU and memory. | Algolia splits search and indexing into separate processes with different CPU priorities. | The Elasticsearch cluster must be configured to use different nodes for searching and indexing. |
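The Global User Base tip (1-2ms for every 124 miles of fiber) can be turned into a quick estimate of how much of the latency budget distance alone consumes. This is a rough sketch, not a network model: the function name and the example distances are hypothetical, and the source tip does not say whether the figure is one-way or round-trip, so the result is best read as an order of magnitude.

```python
MILES_PER_STEP = 124.0          # distance unit used in the tip (about 200 km)
MS_PER_STEP_RANGE = (1.0, 2.0)  # added latency per step: low and high estimate


def fiber_latency_ms(distance_miles):
    """Return (low, high) added latency in ms for a given fiber distance."""
    if distance_miles < 0:
        raise ValueError("distance must be non-negative")
    steps = distance_miles / MILES_PER_STEP
    low, high = MS_PER_STEP_RANGE
    return steps * low, steps * high


# Hypothetical device-to-datacenter distances, in miles.
examples = [
    ("same metro area", 31),
    ("cross-country", 2480),
    ("intercontinental", 6200),
]

for label, miles in examples:
    low, high = fiber_latency_ms(miles)
    print(f"{label:>16}: {miles:>5} miles adds roughly {low:.2f}-{high:.2f} ms")
```

Under this rule of thumb, a 6,200-mile path by itself costs 50 ms or more, which is the whole recommended budget. That is why the table treats infrastructure in multiple regions as a requirement for a global user base.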
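The Relevance row notes that techniques like ngrams make indices larger. A small, generic illustration of why: with edge n-grams, every prefix of every token becomes its own index term so that partial words can match. The code below is a simplified sketch written for this comparison and is not how either engine implements tokenization; the function names and the sample product title are hypothetical.

```python
def edge_ngrams(token, min_len=1, max_len=None):
    """Return all prefixes of token between min_len and max_len characters."""
    limit = len(token) if max_len is None else min(max_len, len(token))
    return [token[:n] for n in range(min_len, limit + 1)]


def index_terms(text, use_ngrams):
    """Collect the distinct terms an index would store for a piece of text."""
    terms = set()
    for token in text.lower().split():
        if use_ngrams:
            terms.update(edge_ngrams(token))
        else:
            terms.add(token)
    return terms


# Hypothetical product title.
title = "wireless noise cancelling headphones"

plain = index_terms(title, use_ngrams=False)
expanded = index_terms(title, use_ngrams=True)

print(f"whole-word terms: {len(plain)}")
print(f"edge n-gram terms: {len(expanded)}")
print("'headph' matches a stored term:", "headph" in expanded)
```

Four whole words expand into dozens of prefix terms here. Capping `max_len` or raising `min_len` trades some partial-match coverage for a smaller index.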

Conclusion

Latency can creep in from any number of places. Great care needs to be taken at each layer of the stack to avoid exceeding the latency budget and causing users to abandon. Algolia’s hosted search approach means that we can give our customers the benefit of our expertise in reducing latency. For users of Elasticsearch, latency needs to be understood and addressed by the implementing engineering team.

Read other parts of the Comparing Algolia and Elasticsearch for Consumer-Grade Search series:

Part 1 – End-to-end Latency
Part 2 – Relevance Isn’t Luck
Part 3 – Developer Experience and UX

 

About the author
Josh Dzielak
