Harry Logger and the Metrics’ Stone

Introduction

As Algolia grows, we need to reconsider existing legacy systems we have in place and make them more reliable. One of those systems was our metrics pipeline. Each time a user calls the Algolia API, whether the operation involves search or indexing, it generates multiple lines of logs.

These logs contain information about each query that, when harvested, can yield insightful results. From them we compute metrics: all the numbers you can find in the Algolia dashboard, such as average search time or which user agents are using an index. The metrics are also used for billing, since they count the number of objects and the number of operations our customers perform.

As we are big fans of Harry Potter, we nicknamed this project “Harry Logger”.

The Chamber of Logs

The first thing we had to do was transfer the logs from the API machines to our metrics platform, in a resilient way and using as few resources as possible. The old system was doing a fine job, but it worked by having a centralized system pull the logs from each machine. We wanted to move to a push strategy using a producer/consumer pattern.

This shift enabled us to do two things:

  • Replicate the consumers on multiple machines
  • Put a retry strategy closer to the logs, in the producer

We needed something that works reliably and cleanly, so we asked Dobby to do the job. For performance reasons, Dobby was developed in Golang.
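To make the push strategy concrete, here is a minimal sketch of a producer that retries a failed push with exponential backoff. It is not Dobby's actual code: the `Sender` type, `pushWithRetry`, and `flakySender` are hypothetical names introduced for illustration, and the backoff policy is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Sender pushes one batch of log lines to a consumer.
// Hypothetical type for this sketch, not Dobby's real API.
type Sender func(lines []string) error

// pushWithRetry tries to send a batch up to maxAttempts times,
// doubling the wait between attempts (exponential backoff).
func pushWithRetry(send Sender, lines []string, maxAttempts int, baseDelay time.Duration) error {
	var err error
	delay := baseDelay
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if err = send(lines); err == nil {
			return nil
		}
		if attempt < maxAttempts {
			time.Sleep(delay)
			delay *= 2
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", maxAttempts, err)
}

// flakySender simulates a consumer that rejects the first `failures`
// calls. It also returns a pointer to its call counter.
func flakySender(failures int) (Sender, *int) {
	calls := 0
	send := func(lines []string) error {
		calls++
		if calls <= failures {
			return errors.New("consumer unavailable")
		}
		return nil
	}
	return send, &calls
}

func main() {
	send, calls := flakySender(2)
	err := pushWithRetry(send, []string{"log line 1", "log line 2"}, 5, 10*time.Millisecond)
	fmt.Println("error:", err, "calls:", *calls)
}
```

Keeping the retry loop in the producer means a temporary consumer outage only delays the logs on the machine that produced them, rather than losing them.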

The prisoner of SaaS

Our second job was to compute metrics on top of those logs. Our old system was a monolithic application that ran on one machine, meaning it was a single point of failure (SPOF). We wanted the new version to be more reliable, maintainable and distributed.

As SaaS is in our DNA, we went to various companies that specialize in processing metrics based on events (a line of log, in our case). All the solutions we encountered were top notch, but the quantity of data we generate on a daily basis presented an issue. As of today, we generate around 1 billion lines of logs per day, which represent 2TB of raw data, and no vendor was able to handle it. At this point, we were back to square one.
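For a sense of scale, the daily figures can be turned into average rates. The short sketch below does that arithmetic, assuming decimal terabytes (1 TB = 10^12 bytes) and a perfectly even load over the day; real traffic would have peaks well above these averages.

```go
package main

import "fmt"

const (
	linesPerDay   = 1e9     // about 1 billion log lines per day (from the article)
	bytesPerDay   = 2e12    // 2 TB of raw data per day, assuming decimal terabytes
	secondsPerDay = 86400.0 // 24 * 60 * 60
)

// linesPerSecond is the average number of log lines produced each second.
func linesPerSecond() float64 { return linesPerDay / secondsPerDay }

// bytesPerLine is the average size of one log line.
func bytesPerLine() float64 { return bytesPerDay / linesPerDay }

// megabytesPerSecond is the average raw throughput in MB/s.
func megabytesPerSecond() float64 { return bytesPerDay / secondsPerDay / 1e6 }

func main() {
	fmt.Printf("%.0f lines/s, %.0f bytes/line, %.1f MB/s\n",
		linesPerSecond(), bytesPerLine(), megabytesPerSecond())
}
```

Under those assumptions this works out to roughly 11,600 lines per second, about 2 KB per line, and around 23 MB/s sustained.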

The Streams of Fire

After much consideration, we concluded that we had to build our own system. As the logs are a stream of lines, we decided to design our new system to compute metrics on top of a stream of data (and it’s trendy to do stream processing).
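As an illustration of computing a metric over a stream, the sketch below keeps running totals per index and updates them one event at a time, so the average search time is available without storing the lines. The `LogEvent` fields and the `aggregate` function are hypothetical; the real log format is not described here.

```go
package main

import "fmt"

// LogEvent is a hypothetical, already-parsed log line.
type LogEvent struct {
	Index        string
	SearchTimeMs float64
}

// IndexStats holds running totals, updated once per event.
type IndexStats struct {
	Count   int
	TotalMs float64
}

// AverageMs returns the mean search time, or 0 when no event was seen.
func (s IndexStats) AverageMs() float64 {
	if s.Count == 0 {
		return 0
	}
	return s.TotalMs / float64(s.Count)
}

// aggregate consumes events until the channel is closed and
// returns the per-index statistics.
func aggregate(events <-chan LogEvent) map[string]IndexStats {
	stats := make(map[string]IndexStats)
	for ev := range events {
		s := stats[ev.Index]
		s.Count++
		s.TotalMs += ev.SearchTimeMs
		stats[ev.Index] = s
	}
	return stats
}

func main() {
	events := make(chan LogEvent)
	go func() {
		defer close(events)
		sample := []LogEvent{{"products", 10}, {"products", 20}, {"articles", 5}}
		for _, ev := range sample {
			events <- ev
		}
	}()
	for index, s := range aggregate(events) {
		fmt.Printf("%s: %d searches, avg %.1f ms\n", index, s.Count, s.AverageMs())
	}
}
```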

We first tried the typical stream-processing architecture.

As we didn’t want to maintain and host this architecture (we have enough API servers to maintain), we decided to consider a cloud provider. We managed to find every tool off the shelf, which meant we’d have less software to operate and maintain. As always, the issue was price. This streaming platform was 100 times more expensive than our old system. We had a lot of back-and-forth with our cloud provider to try to reduce the cost, but unfortunately, it was inherent to the design of their system. And again, we went back to square one.

The Half-Blood Batches

During our tests, we found that the stream processing software we were using could also work in batch mode. It wasn't trendy, but maybe it was a way to fix our pricing issue? Just by switching to batch mode, the price was reduced by a factor of 10! All we needed was an orchestrator to launch the batches.

After developing a reliable orchestrator, we had a fully working system, but it was still 50% more expensive than we had envisioned.

The Order of DIY

One of our engineers, a bit fed up with the amount of time we were spending on optimizing the system, decided to do a proof of concept without using any framework or PaaS software. After a few days of coding, he managed to develop a prototype that suited our needs, was reliable, and had a running cost 10 times lower than the batches!
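Putting the quoted factors side by side helps compare the three attempts. The sketch below takes the old system's cost as a unit of 1 and assumes the factors from the text (100 times more for streaming, a tenfold reduction for batch mode, another tenfold reduction for the prototype) apply exactly and compose; these are rounded figures, so the result is only indicative.

```go
package main

import "fmt"

// relativeCosts returns the cost of each approach relative to the
// old system (old system = 1), using the factors quoted in the text.
func relativeCosts() (streaming, batch, diy float64) {
	const oldSystem = 1.0
	streaming = oldSystem * 100 // streaming: 100 times the old system
	batch = streaming / 10      // batch mode: reduced by a factor of 10
	diy = batch / 10            // prototype: 10 times lower than the batches
	return streaming, batch, diy
}

func main() {
	streaming, batch, diy := relativeCosts()
	fmt.Printf("streaming: %.0fx, batch: %.0fx, prototype: %.0fx\n", streaming, batch, diy)
}
```

Taken literally, the prototype ends up at roughly the same running cost as the old system.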

The tale of the Wealthy Bard

Migrating our log processing toolchain yielded many outcomes that were valuable to our team. In addition to improving the reliability and the evolvability of our toolchain, we also increased our internal knowledge of our PaaS provider. The process also helped us identify deployment pain points that we will address later this year.

Finally, we iterated by using and composing different solutions to solve the same problem. The best solution, for us, was the one that came closest to no compromise while staying within our budget limits. In our case, it was possible to have both by keeping, in the end, the multi-master key/value storage of our PaaS provider.

About the author
Rémy-Christophe Schermesser

Staff Software Engineer
