Format and Structure Your Data

Sending data to Algolia

Before you can search your content with Algolia, you need to send it your data. Algolia doesn’t search in your original data source, but in the data you submit, which we host on our servers.

Here’s what the data workflow looks like:

  1. You fetch data from your data source (e.g., a database, static files).
  2. You transform that data into JSON records.
  3. You send the records to Algolia, using one of our API clients or the Algolia dashboard. This step is what we call indexing your data.
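The three steps above can be sketched as follows. This is a minimal illustration, not an official integration: the in-memory SQLite table, its column names, and the `internal_notes` column are hypothetical stand-ins for your own data source, and the sending step is left as a comment because it depends on the API client you use.

```python
import json
import sqlite3

# Hypothetical data source: an in-memory SQLite table standing in for your database.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows can then be converted to dicts by column name
conn.execute(
    "CREATE TABLE recipes (id INTEGER, title TEXT, likes INTEGER, internal_notes TEXT)"
)
conn.execute(
    "INSERT INTO recipes VALUES (1, 'Blackberry and blueberry pie', 1128, 'draft v3')"
)

# Step 1: fetch only the columns that are useful for search (internal_notes is skipped).
rows = conn.execute("SELECT id, title, likes FROM recipes").fetchall()

# Step 2: transform each row into a JSON-serializable record.
records = [dict(row) for row in rows]

# Step 3: serialize the records. Sending them (API client or dashboard) is not shown.
payload = json.dumps(records, indent=2)
print(payload)
```

The printed JSON is the kind of payload you could paste into the dashboard or pass to an API client.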

1. Fetching data from your data source

Algolia doesn’t search your data source directly. You need to send the data to our servers so we can search it. It doesn’t matter whether your data is in a database, a collection of XML files, spreadsheets, or any other format. What you need to do first is extract data from one or several sources and format it in a way that Algolia recognizes.

You don’t need to extract everything. You should be selective about what goes in the record, gathering only information that’s useful for building a search experience.

2. Transforming the extracted data

You need to transform the extracted data into a format that Algolia recognizes: JSON records.

Formatting and structuring your data are two of the most critical aspects in creating excellent search and relevance. Therefore, in addition to turning your data into JSON records, you need to refine these records by reworking their content, adding or computing new attributes, creating filters, restructuring record relationships, and more.
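As a rough sketch of what this refinement can look like, the function below reworks one extracted row into a record. The input shape (all values as strings, categories as a comma-separated string, a `rating` column) and the `rounded_rating` attribute are assumptions made for the example, not something Algolia requires.

```python
def refine(raw: dict) -> dict:
    """Turn one extracted row (hypothetical shape) into a search-ready record."""
    return {
        # Rework content: trim stray whitespace from the title.
        "title": raw["title"].strip(),
        # Restructure: split a delimited string into a list of normalized values.
        "categories": [
            c.strip().lower() for c in raw["categories"].split(",") if c.strip()
        ],
        # Convert a "0"/"1" flag into a real boolean.
        "gluten_free": bool(int(raw["gluten_free"])),
        # Compute a new attribute: a rounded rating is easier to filter on.
        "rounded_rating": round(float(raw["rating"])),
        # Store the popularity metric as a number, not a string.
        "likes": int(raw["likes"]),
    }


raw_row = {
    "title": " Blackberry and blueberry pie ",
    "categories": "Pie, Dessert,Sweet",
    "gluten_free": "0",
    "rating": "4.6",
    "likes": "1128",
}
record = refine(raw_row)
print(record)
```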

3. Sending your records to Algolia

Finally, you need to send your records to Algolia, using one of our API clients. Records are then stored in an Algolia index. This is all you need to do to start searching your data.

To get started, you can use the Algolia dashboard, which allows you to paste in JSON records directly. You can also write a script to send your data using our API. This script runs on your computer or server, not on Algolia’s. You can write the script in any of the 11 languages that we cover with our official API clients. Check out our quick start guide to learn more.

If the data you plan to send to Algolia lives on various websites, consider using our Algolia Crawler. With a little configuration, the Algolia Crawler directly extracts and uploads records from your sites to Algolia indices.

Algolia records

An Algolia record (or object) is composed of key/value pairs called attributes. Attributes don’t have to respect a schema and can change from one object to another.

You want your records to contain any information that facilitates search, display on the front end, filtering, or relevance. You can leave everything else out.

Here is an example record that contains all four kinds of attributes.

{
  "title": "Blackberry and blueberry pie",
  "description": "A delicious pie recipe that combines blueberries and blackberries.",
  "image": "https://yourdomain.com/blackberry-blueberry-pie.jpg",
  "likes": 1128,
  "sales": 284,
  "categories": ["pie", "dessert", "sweet"],
  "gluten_free": false
}

Attributes for searching

Attributes for searching are the ones that contain the terms that your end users look for. If you want to search for “blueberry pie recipe”, you need attributes that contain those words—in our example, title and description.

Any textual, descriptive attribute that contains searchable keywords, such as summaries, brands, or colors, can be useful for searching.

All attributes are searchable by default, which lets you search in your records right from the start. However, for better relevance and performance, you want to be more selective by setting only some attributes as searchable. You can do this with the searchable attributes feature. You can also use this setting to prioritize your searchable attributes, making some more relevant than others.
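For the example record, such a configuration might look like the sketch below. `searchableAttributes` is the name of the index setting, and the order of the list expresses priority. The choice of `title` before `description` is only an assumption for this example. How you apply the settings (dashboard or API client) is not shown.

```python
import json

# Index settings fragment for the pie recipe example.
# Only title and description are searchable; title is listed first,
# so matches in title count more than matches in description.
settings = {
    "searchableAttributes": ["title", "description"],
}

print(json.dumps(settings, indent=2))
```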

Attributes for displaying

If you want to display images in your results, you need an attribute that contains their URLs. This way, Algolia can return them within search results, and you can use them directly in the front end.

Display attributes include anything that can be useful to see in the results, such as images, titles, and descriptions, or even attributes that you would typically use for filtering and custom ranking, such as likes count or categories. Some display attributes can also be searchable, like title and description. Others shouldn’t be, like image or likes.

Attributes for filtering

If you want to search for only a subset of records based on a category (e.g., only pie recipes, only gluten-free desserts, etc.), you can set some attributes as filters. In our example, this would include categories and gluten_free.

Filterable attributes include boolean attributes (e.g., whether an item is public), lists (e.g., categories), numeric attributes (e.g., price, rounded rating), and normalized text (e.g., color).
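To make the idea concrete, here is a hedged sketch for the example record. `attributesForFaceting` and the `filters` search parameter are Algolia names. The sample records and the local `matches` helper are hypothetical and only mimic, in plain Python, what the filter expression selects.

```python
# Settings fragment: declare which attributes can be used as filters.
settings = {"attributesForFaceting": ["categories", "gluten_free"]}

# Search parameter fragment: only gluten-free pies.
search_params = {"filters": "categories:pie AND gluten_free:true"}

# Hypothetical sample records, used to show which ones the filter keeps.
records = [
    {"title": "Blackberry and blueberry pie", "categories": ["pie", "dessert"], "gluten_free": False},
    {"title": "Almond flour apple pie", "categories": ["pie", "dessert"], "gluten_free": True},
    {"title": "Flourless chocolate cake", "categories": ["cake", "dessert"], "gluten_free": True},
]


def matches(record: dict) -> bool:
    """Local stand-in for the filter 'categories:pie AND gluten_free:true'."""
    return "pie" in record["categories"] and record["gluten_free"] is True


filtered_titles = [r["title"] for r in records if matches(r)]
print(filtered_titles)
```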

Attributes for customizing ranking

If you want the most popular recipes to appear first in your results, you can add business-metric attributes such as the number of likes, ratings, or sales. In our example, this includes likes, sales, and gluten_free.

Attributes for custom ranking are either numeric or boolean.

Custom ranking strengthens and individualizes Algolia’s default ranking formula. Ranking contributes to the relevance of your search results. You can improve upon Algolia’s default ranking by including your own business metrics into the mix. To do this, you can use the custom ranking feature.
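A possible configuration for the example record is sketched below. `customRanking` with `desc(...)` is Algolia's setting syntax. The sample records and the local sort are assumptions that only illustrate the resulting order among records that are otherwise equally relevant. They don't reproduce the engine's full ranking formula.

```python
# Settings fragment: rank by likes first, then by sales, highest first.
settings = {"customRanking": ["desc(likes)", "desc(sales)"]}

# Hypothetical records that match a query equally well.
records = [
    {"title": "Apple pie", "likes": 950, "sales": 310},
    {"title": "Blackberry and blueberry pie", "likes": 1128, "sales": 284},
    {"title": "Cherry pie", "likes": 950, "sales": 402},
]

# Local illustration of the tie-breaking order the setting expresses:
# more likes first, and for equal likes, more sales first.
ranked = sorted(records, key=lambda r: (-r["likes"], -r["sales"]))
ranked_titles = [r["title"] for r in ranked]
print(ranked_titles)
```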

Simplifying your records

When creating a searchable index, you want to simplify your record structure as much as possible.

Each record should contain enough information to be discoverable on its own. You don’t have to follow relational database principles, such as not repeating data or creating hierarchical structures with primary and foreign keys. The Algolia engine returns records as results, so each object in your index should contain enough information to be found and to allow a full display of its content.

Take a book dataset. You can have one record per book, which contains everything about the book, including chapters. The problem is, a search for a common word like “boat” would retrieve too many books, most of which aren’t about boats.

If you want to get better, more relevant matches, you need to break up chapters into individual records. This way, you can search for books on boats with far more relevance by searching through their chapters.
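One way to break a book into chapter records is sketched here. The input shape of `book` and the attribute names (`book_title`, `chapter_number`, and so on) are hypothetical. `objectID` is the attribute Algolia uses to identify a record, and the `<book id>-ch<number>` scheme is just one possible convention. Note that the book's title and author are repeated in every chapter record so that each one can be found and displayed on its own.

```python
def chapter_records(book: dict) -> list:
    """Split one book (hypothetical shape) into one record per chapter."""
    return [
        {
            "objectID": f"{book['id']}-ch{number}",
            # Repeated on purpose: each record must be displayable on its own.
            "book_title": book["title"],
            "author": book["author"],
            "chapter_number": number,
            "chapter_title": chapter["title"],
            "content": chapter["text"],
        }
        for number, chapter in enumerate(book["chapters"], start=1)
    ]


book = {
    "id": "book-42",
    "title": "A Year at Sea",
    "author": "Jane Doe",
    "chapters": [
        {"title": "Leaving Port", "text": "We loaded the boat at dawn."},
        {"title": "The Storm", "text": "The waves rose over the deck."},
    ],
}
records = chapter_records(book)
print([r["objectID"] for r in records])
```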

Algolia index

An index is a collection of records, and it’s created as soon as you send records to Algolia. You can create several indices that contain different sets of objects. All indices reside on Algolia’s servers.

Once you’ve pushed your data to Algolia, you can think about how to organize your indices. This includes how many to have and how to configure each one. You can put all your records into a single index, or spread them across several indices. How you organize your indices depends on how you want to search and display your objects.