We have 2 standard server setups:
- 6-core CPU, 12 threads, 128 GB memory, 2x800 GB SSD RAID-0
- 4-core CPU, 8 threads, 64 GB memory, 2x400 GB SSD RAID-0
Early in our company’s history, we started quite big, with 4 cores, 8 threads, 32 GB of RAM, and 2x120 GB SSD 320-series disks. However, we quickly found that each element needed improvement. For example:
We discovered that 4 cores and 8 threads were not always enough to handle large numbers of operations running in parallel. We ultimately went with 6 cores and 12 threads, which handles an ongoing flow of index operations without impacting the speed of the search. This upgrade also gives us a lot of extra processing speed to manage our servers’ system resources.
For the disks, we experimented with many kinds of SSDs in RAID-0 before choosing the right one. The current SSD S3500 series gives us faster drives, thereby removing a serious performance bottleneck.
To strike the right balance of RAM and disk space, for both caching and data capacity, we ran through a number of use cases, engine tweaks, and countless performance tests before arriving at the right amount of memory and disk size - no more, no less. For example, we needed to increase the memory to process large indexes while leaving enough space to perform in-memory searches.
To get an idea of the importance of balancing RAM size with the right number of cores and threads, consider large indexes. Algolia puts every customer’s full index in RAM (for performance). Additionally, it breaks up large indexes into smaller pieces, where each piece (shard) is given a dedicated thread on the same server. This permits faster updates and searches. Meanwhile, other threads are dedicated to processing searches, while still more threads monitor the system and manage consensus. As you can see, our machines require a large minimum of RAM and cores, with a reasonable buffer for flexibility.
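The shard-per-thread idea described above can be illustrated with a minimal Python sketch. This is not Algolia's engine: the function names (`make_shards`, `search_shard`, `search`), the round-robin split, and the substring matching are all assumptions chosen only to show an in-memory index being divided into shards that are each searched on their own thread.

```python
from concurrent.futures import ThreadPoolExecutor


def make_shards(records, shard_count):
    """Split the in-memory records round-robin into shard_count lists (hypothetical scheme)."""
    return [records[i::shard_count] for i in range(shard_count)]


def search_shard(shard, query):
    """Scan a single shard and return the records that contain the query."""
    needle = query.lower()
    return [record for record in shard if needle in record.lower()]


def search(shards, query):
    """Search every shard on its own worker thread, then merge the hits."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        per_shard_hits = pool.map(search_shard, shards, [query] * len(shards))
        return sorted(hit for hits in per_shard_hits for hit in hits)


# Illustrative data only; a real index is far larger and is not a list of strings.
records = ["red shoes", "blue shirt", "red hat", "green socks", "dark red scarf"]
shards = make_shards(records, 3)
results = search(shards, "red")
print(results)
```

Note that CPython threads do not run CPU-bound work in parallel, so this sketch shows the structure (one worker per shard, results merged at the end) rather than the speedup a native engine would get from dedicated threads.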
To go into far more detail about our engine, take a look at our CTO’s 8-part series Inside the Engine.
Algolia’s bare metal servers
While virtualization is the choice of the vast majority of SaaS services, Algolia has decided to use bare metal servers. Bare metal servers give applications direct access to the physical and software resources of a computer. For example, all Algolia searching and indexing operations are processed by the Algolia engine, which in turn directly interacts with the computer’s essential resources, such as the operating system, CPU, RAM, and disk.
With a virtual machine, on the other hand, an application needs to pass through one or more additional layers of system-level software (the virtual machine) before reaching the services of the underlying physical server. This slows things down, but it also creates flexibility, by spreading a single server’s capacity over many discrete use cases: one customer might want to run a massive SQL Server database on a Windows computer; another customer might want to perform CPU-intensive calculations using an old version of Unix; and another might want to simulate a Macintosh - all of which can be done on a shared server using virtualization.
For Algolia, virtual machines are not necessary. Bare metal servers have been time- and task-slicing for years, without the need to virtualize. With powerful hardware components, a single server can handle countless customers - especially if they are all doing the same operations, which is the case with Algolia.
Additionally, many of our larger customers use dedicated servers (or more precisely, clusters), giving them exclusive access to the entire cluster. Most of our smaller accounts share a server (cluster) and its Algolia engine. However, whether it is dedicated or shared, the principle is the same - Algolia operates directly on the machine.
Note that when we say server, we are actually referring to a cluster of 3 identical servers. Thanks to clusters, Algolia can provide an SLA reliability of 99.99%, because a cluster guarantees that at least 1 of the 3 servers will be available at all times.
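To see why three redundant servers raise availability, here is a small worked sketch. It assumes that servers fail independently, which real clusters only approximate, and the per-server availability of 0.99 is a hypothetical figure for illustration, not an Algolia number. The function name `cluster_availability` is also made up for this example.

```python
def cluster_availability(server_availability, servers=3):
    """Probability that at least one of `servers` machines is up.

    Assumes independent failures (a simplification): the cluster is only
    unavailable when every server is down at the same time.
    """
    if not 0.0 <= server_availability <= 1.0:
        raise ValueError("server_availability must be between 0 and 1")
    all_down = (1.0 - server_availability) ** servers
    return 1.0 - all_down


# Hypothetical per-server availability, chosen only for illustration.
assumed_server_availability = 0.99
single = cluster_availability(assumed_server_availability, servers=1)
triple = cluster_availability(assumed_server_availability, servers=3)
print(f"1 server:  {single:.6f}")
print(f"3 servers: {triple:.6f}")
```

Under these assumptions, the chance that all three servers are down at once is the cube of the chance that one is down, which is why a 3-server cluster can support a much stricter availability target than any single machine.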
Find out more
If you want more hardware and system software details, or to learn more about Algolia’s architectural decision-making process, take a look at the following articles.
- The history of Algolia’s architecture as told by our CTO/architect
- Our architecture in even more detail
- How our architecture is specifically designed for Search-as-a-Service
- A word about our solid-state drives
- An article about how we achieve ultra-low latency
You can monitor your servers and clusters via the dashboard. Go to Dashboard -> API Status, then click on the cluster name.
For Enterprise customers, we have a Monitoring API which provides a window into all cluster and DSN activity.