Large indexing jobs

Algolia always prioritizes search operations over indexing operations, so that indexing doesn’t affect search performance.

Preparation

If possible, keep Algolia in the loop when you plan to index a large number of records. This lets Algolia monitor and optimize the configuration of your servers and indices, for example, by fine-tuning the sharding of internal indices for your specific data.

Contact your customer success manager or the Algolia support team.

Configure indices before uploading records

Configure an index before uploading records. Setting the searchableAttributes parameter beforehand is particularly important to ensure the best indexing speed. By default, Algolia indexes all attributes, but you’ll likely want to search in only a few of them. For more information, see Searchable attributes.
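As a rough illustration, the settings you apply before the first upload might look like the following. This is a minimal sketch: the attribute names are hypothetical, and the helper only builds the JSON body. How you send it depends on your API client's settings method.

```python
import json

# Hypothetical attribute names, chosen for illustration only.
# Replace them with the attributes you actually want to search in.
settings = {
    "searchableAttributes": ["title", "brand", "description"],
}


def settings_payload(index_settings: dict) -> str:
    """Serialize index settings to a JSON string.

    This only prepares the payload. Sending it is left to your API client.
    """
    return json.dumps(index_settings, sort_keys=True)


print(settings_payload(settings))
```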

Ensure the data fits on your servers

For best performance, Algolia stores all indices in memory on your servers. It’s best to keep the combined size of your indices below 80% of the total allocated RAM. When the index size exceeds the RAM capacity, the indices are served from the solid-state drive (SSD), which is much slower.

Your index is usually between two and three times larger than the raw size of your data. The exact factor depends on the structure of your data and your index configuration.
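The two rules of thumb above (an index two to three times the raw data size, kept below 80% of RAM) can be combined into a quick capacity check. This is a minimal sketch, assuming you know the raw data size and the allocated RAM. The function names and example figures are illustrative only.

```python
def estimated_index_size_gb(raw_size_gb: float, factor: float = 3.0) -> float:
    """Estimate the index size from the raw data size.

    The factor is usually between 2 and 3. The default of 3 is the
    pessimistic end of that range.
    """
    return raw_size_gb * factor


def fits_in_ram(raw_size_gb: float, ram_gb: float,
                factor: float = 3.0, ram_budget: float = 0.8) -> bool:
    """Check whether the estimated index stays below the RAM budget (80% by default)."""
    return estimated_index_size_gb(raw_size_gb, factor) <= ram_gb * ram_budget


# 10 GB of raw data on 64 GB of RAM: about 30 GB estimated, budget 51.2 GB.
print(fits_in_ram(10, 64))  # True
# 20 GB of raw data on 64 GB of RAM: about 60 GB estimated, budget 51.2 GB.
print(fits_in_ram(20, 64))  # False
```

Because the exact factor depends on your data and configuration, treat the result as an estimate, not a guarantee.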

Data upload

Batch indexing jobs

Algolia’s API clients have saveObjects helper methods for uploading records in batches. This is more efficient than uploading one record after the other. Batches between 1,000 and 100,000 records tend to be optimal, depending on the average record size. Each batch should be smaller than 10 MB. The API can handle batches of up to 1 GB, but smaller batches lead to faster indexing.
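If you prepare batches yourself before passing them to saveObjects, you can cap them by both record count and payload size. The following is a minimal sketch, not part of any API client: make_batches and its default limits are hypothetical, and the byte count is an approximation based on each record's JSON encoding.

```python
import json
from typing import Iterable, Iterator, List

MAX_BATCH_BYTES = 10 * 1000 * 1000  # 10 MB guideline per batch
MAX_BATCH_RECORDS = 1000            # lower end of the suggested range


def make_batches(records: Iterable[dict],
                 max_records: int = MAX_BATCH_RECORDS,
                 max_bytes: int = MAX_BATCH_BYTES) -> Iterator[List[dict]]:
    """Yield lists of records that respect both a count and a size limit.

    Sizes are estimated from each record's JSON encoding, so the real
    request body is slightly larger. A single record that exceeds
    max_bytes is yielded alone in its own batch.
    """
    batch: List[dict] = []
    batch_bytes = 0
    for record in records:
        size = len(json.dumps(record).encode("utf-8"))
        # Close the current batch if adding this record would break a limit.
        if batch and (len(batch) >= max_records or batch_bytes + size > max_bytes):
            yield batch
            batch, batch_bytes = [], 0
        batch.append(record)
        batch_bytes += size
    if batch:
        yield batch


records = [{"objectID": str(i), "name": "item"} for i in range(2500)]
print([len(b) for b in make_batches(records)])  # [1000, 1000, 500]
```

Each yielded batch can then be handed to your API client's saveObjects helper.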

Multi-thread your indexing

You can use several workers to send multiple indexing requests in parallel.
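One way to do this is with a thread pool, where each worker uploads one batch at a time. This is a minimal sketch: upload_batch is a hypothetical placeholder for the call to your API client, and the worker count of 4 is an arbitrary starting point, not a recommendation.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def upload_batch(batch: List[dict]) -> int:
    """Hypothetical placeholder for a real upload.

    Replace the body with a call to your API client, for example its
    saveObjects helper. Here it only returns the number of records.
    """
    return len(batch)


def upload_in_parallel(batches: Iterable[List[dict]],
                       upload: Callable[[List[dict]], int] = upload_batch,
                       workers: int = 4) -> List[int]:
    """Upload batches with several worker threads.

    Results come back in the same order as the input batches. An
    exception raised by any upload is re-raised when results are collected.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(upload, batches))


batches = [[{"objectID": "1"}, {"objectID": "2"}], [{"objectID": "3"}]]
print(upload_in_parallel(batches))  # [2, 1]
```

Threads suit this workload because each worker spends most of its time waiting on network requests.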

Large datasets

If you have large indexing jobs, you might run into limitations which you can avoid by optimizing the settings of the API clients:

Adjust the batch size when using the saveObjects method. The optimal size depends on your average record size, so you may need to experiment with different values to find what works best.

Compress the record.

You should also inspect the Algolia HTTP error to decide whether the payload is too big to process.
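To get a feel for how much compression can save, you can gzip a JSON-encoded batch and compare sizes. This is a minimal sketch that assumes gzip over the JSON body. Whether and how compression is enabled depends on your API client, so check its documentation. The sample records are made up.

```python
import gzip
import json
from typing import List


def compress_records(records: List[dict]) -> bytes:
    """Gzip the JSON encoding of a list of records."""
    body = json.dumps(records).encode("utf-8")
    return gzip.compress(body)


# Made-up records with repetitive text, which compresses well.
records = [
    {"objectID": str(i), "description": "lorem ipsum " * 50}
    for i in range(100)
]

raw = json.dumps(records).encode("utf-8")
compressed = compress_records(records)
print(f"raw: {len(raw)} bytes, compressed: {len(compressed)} bytes")
```

The ratio you get in practice depends on how repetitive your records are.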
