
Format and structure your data

Before you can search your content, you need to send your data to Algolia. Algolia doesn’t search in your original data source, but in the data you submit, which Algolia hosts on its servers.

Here’s what the data workflow looks like:

  1. You fetch data from your sources, such as a database or static files.
  2. You transform that data into JSON records.
  3. You send the records to Algolia using an integration, an API client, the Algolia CLI, or the Algolia dashboard. This is the indexing step.

Fetching data from your data source

Algolia doesn’t search your data source directly: you must send the data to Algolia’s servers so the engine can search it. It doesn’t matter whether your data lives in a database, a collection of XML files, spreadsheets, or any other format. First, extract the data from one or several sources, then format it in a way that Algolia recognizes.

You don’t need to extract everything: be selective about what goes in the record and only gather the information that helps build your search experience.

Transforming the extracted data

You need to transform the extracted data into a format that Algolia recognizes: JSON records.

Formatting and structuring your data is one of the most critical aspects of creating excellent search and relevance. Along with turning your data into JSON records, you also need to refine them. This includes reworking their content, adding new or computed attributes, creating filters, and restructuring record relationships.
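As a minimal sketch of this transformation step, the following Python snippet turns one source row into a JSON record. The row layout, the field names, and the `to_record` helper are hypothetical and only illustrate the idea: keep what helps search, convert values to useful types, and drop the rest. The `objectID` attribute is the identifier Algolia uses for a record; here it is assumed to come from the source row’s `id`.

```python
import json

# Hypothetical row, as it might come out of a database query or a CSV file.
rows = [
    {
        "id": 1,
        "title": "  Blackberry and blueberry pie ",
        "likes": "1128",
        "categories": "pie;dessert;sweet",
        "gluten_free": "no",
        "internal_notes": "draft v3",  # not useful for search, so it is left out
    },
]


def to_record(row):
    """Turn one source row into a search-oriented record (illustrative only)."""
    return {
        "objectID": str(row["id"]),                  # assumed mapping from the source id
        "title": row["title"].strip(),               # reworked content
        "likes": int(row["likes"]),                  # numeric, usable for ranking
        "categories": row["categories"].split(";"),  # list, usable for filtering
        "gluten_free": row["gluten_free"] == "yes",  # boolean, usable for filtering
    }


records = [to_record(row) for row in rows]
print(json.dumps(records, indent=2))
```

The resulting list can be serialized to JSON and sent with whichever method you choose.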

Sending data to Algolia

Once your records are ready, send them to Algolia, for example with one of the Algolia API clients. Algolia stores the records in an index, and this is all you need to start searching your data.

To get started, you can use the Algolia dashboard, which allows you to paste in JSON records directly. You can also write a script to send your data using the Algolia API. This script runs on your computer or server, not on Algolia’s. You can write the script in any of the 11 languages that Algolia covers with the official API clients. Check out the quick start guide to learn more.

If the data you send to Algolia lives on various websites, consider using the Algolia Crawler. With a little configuration, the Crawler directly extracts and uploads records from your sites to Algolia indices.

Algolia records

An Algolia record (or object) is a set of key-value pairs called attributes. Attributes don’t have to follow a schema and can differ from one record to another.

You want your records to contain information that facilitates search, display on the frontend, filtering, or relevance. You can leave everything else out.

Here is an example record that contains all four kinds of attributes.

{
  "title": "Blackberry and blueberry pie",
  "description": "A delicious pie recipe that combines blueberries and blackberries.",
  "image": "https://yourdomain.com/blackberry-blueberry-pie.jpg",
  "likes": 1128,
  "sales": 284,
  "categories": ["pie", "dessert", "sweet"],
  "gluten_free": false
}

Attributes for searching

Attributes for searching are the ones that contain the terms that your users look for. For instance, to search for “blueberry pie recipe”, you need attributes that contain those words—in this example, title and description.

Any textual, descriptive attribute that contains searchable keywords, such as summaries, brands, or colors, can be useful for searching.

All attributes are searchable by default, which lets you search in your records right from the start. Yet, for better relevance and performance, be more selective by setting only some attributes as searchable. You can do this with the searchable attributes feature. You can also use this setting to rank your searchable attributes, making some more relevant than others.
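A minimal sketch of what such a configuration could look like, written as a Python dictionary. `searchableAttributes` is the name of the Algolia index setting; the attribute names come from the example record. How you apply the settings (dashboard, CLI, or API client) is left out here, and the sanity check at the end is only an illustration.

```python
# Example record from this guide (shortened).
record = {
    "title": "Blackberry and blueberry pie",
    "description": "A delicious pie recipe that combines blueberries and blackberries.",
    "image": "https://yourdomain.com/blackberry-blueberry-pie.jpg",
    "likes": 1128,
}

# Settings fragment: only title and description are searchable.
# Order matters: attributes listed first count as more relevant.
settings = {
    "searchableAttributes": ["title", "description"],
}

# Illustrative sanity check: every searchable attribute should exist in the record.
missing = [name for name in settings["searchableAttributes"] if name not in record]
print("Missing searchable attributes:", missing)
```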

Attributes for displaying

To display images in your results, you need an image URL attribute in your records. This way, Algolia can return them within search results, and you can use them directly in the frontend.

Display attributes include anything that can be useful to see in the results. These can be images, titles, descriptions, or even attributes that you typically use for filtering and custom ranking, such as the number of likes or categories. Some display attributes can also be searchable, like title and description (some shouldn’t, like image or likes).

Attributes for filtering

To search for a subset of records based on a category, for example, pie recipes or gluten-free desserts, you can set some attributes as filters. In this example, it would include categories and gluten_free.

Filterable attributes include:

  • Booleans (like whether an item is public)
  • Lists (categories, tags)
  • Numeric attributes (price, rounded rating)
  • Normalized text (colors, types, or enumerated types).
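The following sketch shows the idea in Python. `attributesForFaceting` is the Algolia index setting that declares filterable attributes. The second record and the `matches` helper are hypothetical, and the local list comprehension only imitates what a filter does; it isn’t how the Algolia engine evaluates filters.

```python
records = [
    {
        "title": "Blackberry and blueberry pie",
        "categories": ["pie", "dessert", "sweet"],
        "gluten_free": False,
    },
    {
        # Hypothetical second record, added to make the filter visible.
        "title": "Almond flour brownies",
        "categories": ["dessert", "sweet"],
        "gluten_free": True,
    },
]

# Settings fragment: declare which attributes can be used as filters.
settings = {
    "attributesForFaceting": ["categories", "gluten_free"],
}


def matches(record, category, gluten_free):
    """Local stand-in for a filter on a list attribute and a boolean attribute."""
    return category in record["categories"] and record["gluten_free"] is gluten_free


gluten_free_desserts = [r["title"] for r in records if matches(r, "dessert", True)]
print(gluten_free_desserts)
```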

Attributes for customizing ranking

For the most popular recipes to appear first in your results, you can add business-metric attributes such as the number of likes, ratings, or sales. In the recipe example, this includes likes and sales.

Custom ranking strengthens and individualizes Algolia’s default ranking formula. Ranking contributes to the relevance of your search results. You can improve upon Algolia’s default ranking by including your business metrics.

Attributes for custom ranking are either numeric or boolean.
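As a rough sketch, a custom ranking on the recipe example could be declared with the `customRanking` index setting, shown here as a Python dictionary. The second record is hypothetical. The local sort only illustrates the order that "likes descending, then sales descending" implies; it ignores every other criterion in Algolia’s ranking formula.

```python
records = [
    {"title": "Blackberry and blueberry pie", "likes": 1128, "sales": 284},
    {"title": "Classic apple pie", "likes": 1128, "sales": 512},  # hypothetical record
]

# Settings fragment: rank by likes first, then by sales, both in descending order.
settings = {
    "customRanking": ["desc(likes)", "desc(sales)"],
}

# Local illustration of that ordering (not the engine's full ranking).
ordered = sorted(records, key=lambda r: (-r["likes"], -r["sales"]))
print([r["title"] for r in ordered])
```

Because both records have the same number of likes, the one with more sales comes first.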

Simplifying your records

When creating a searchable index, you want to simplify your record structure as much as possible.

Each record should contain enough information to be discoverable on its own. You don’t have to follow relational database principles, such as not repeating data or creating hierarchical structures with primary and foreign keys. The Algolia engine returns records as results, so each record in your index should hold everything users need to find it and everything you need to display it in full.

Take a book dataset. You can have one record per book, which contains everything about the book, including its chapters. The problem is that a search for a common word like “boat” would retrieve too many books, most of which aren’t about boats.

Break up chapters into individual records to get better, more relevant matches. This way, you can search for books on boats with far more relevance by searching through their chapters.
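Here is a minimal sketch of that split in Python. The book structure, the field names, and the `chapter_records` helper are hypothetical. Each chapter record repeats the book-level information so that it can be found and displayed on its own, and gets its own `objectID` (Algolia’s record identifier), built here from an assumed book ID and the chapter number.

```python
# Hypothetical book, as it might look in the original data source.
book = {
    "id": "book-42",
    "title": "Coastal Journeys",
    "author": "A. Example",
    "chapters": [
        {"number": 1, "heading": "Choosing a boat", "text": "How to pick a boat for coastal sailing."},
        {"number": 2, "heading": "Packing food", "text": "What to cook and store on a long trip."},
    ],
}


def chapter_records(book):
    """Yield one self-contained record per chapter."""
    for chapter in book["chapters"]:
        yield {
            "objectID": f"{book['id']}-ch{chapter['number']}",
            # Book-level data is repeated on purpose in every chapter record.
            "book_title": book["title"],
            "author": book["author"],
            "chapter_number": chapter["number"],
            "chapter_heading": chapter["heading"],
            "content": chapter["text"],
        }


records = list(chapter_records(book))
for record in records:
    print(record["objectID"], "-", record["chapter_heading"])
```

With this structure, a search for “boat” matches only the chapter that mentions it, while the repeated `book_title` still lets you show which book the chapter belongs to.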

Algolia index

An index is where the data used by Algolia’s search and discovery engine is stored. It’s the equivalent of a table in a database but optimized for search and discovery operations. An index is created when you send records to Algolia. You can create several indices that contain different sets of objects. All indices live on Algolia’s servers.

Once you’ve pushed your data to Algolia, you can start thinking about organizing your indices. This includes how many indices to have and how to configure each one. You can put all your records into a single index or spread them across several indices. How you organize your indices depends on how you want to search and display your objects.
