Prepare your index structure
AI Personalization is a beta feature according to Algolia’s Terms of Service (“Beta Services”).
To optimize personalization, configure your existing index’s structure to align with AI Personalization requirements. A well-prepared index structure enhances AI Personalization’s ability to deliver personalized search experiences within your website or app.
Use categorical attributes
Categorical attributes are attributes with a fixed number of possible values.
Incorporating such attributes into your index structure categorizes your data into well-defined, relatively smaller buckets.
[
{ "objectID": "ID01", "title": "Chair", "color": "red", "categories": ["Furniture", "Outdoors"] },
{ "objectID": "ID02", "title": "Table", "color": "green", "categories": ["Furniture", "Outdoors"] },
{ "objectID": "ID03", "title": "Water bottle", "color": "blue", "categories": ["Outdoors"] },
{ "objectID": "ID04", "title": "Book", "color": "red", "categories": ["Books"] },
{ "objectID": "ID05", "title": "Headphones", "color": "green", "categories": ["Electronics"] },
{ "objectID": "ID06", "title": "Phone", "color": "blue", "categories": ["Electronics"] },
{ "objectID": "ID07", "title": "Television", "color": "red", "categories": ["Electronics"] },
{ "objectID": "ID08", "title": "Can", "color": "green", "categories": ["Cooking & Dining", "Kitchenware"] },
{ "objectID": "ID09", "title": "Bowl", "color": "blue", "categories": ["Cooking & Dining", "Kitchenware"] }
]
Good categorical attributes
An attribute like color with finite values like red, green, and blue is considered a good categorical attribute because it organizes your index into three distinct buckets.
Other good categorical attributes include gender, brand, or categories.
Bad categorical attributes
Attributes that are unique for each record, such as objectID and title, are poor choices because they don’t offer any grouping into buckets.
Other poor categorical attributes include description, sku, or price.
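One way to spot good and poor candidates before indexing is to count how many distinct values each attribute takes across your records. The following is a minimal sketch, not part of AI Personalization: the helper names `attribute_cardinality` and `unique_per_record` are hypothetical, and the sample records are a subset of the example above.

```python
from collections import defaultdict


def attribute_cardinality(records):
    """Return the number of distinct values seen for each attribute."""
    values = defaultdict(set)
    for record in records:
        for attribute, value in record.items():
            # A list value (such as "categories") contributes each of its items.
            items = value if isinstance(value, list) else [value]
            values[attribute].update(items)
    return {attribute: len(distinct) for attribute, distinct in values.items()}


def unique_per_record(records):
    """Return attributes with as many distinct values as there are records."""
    counts = attribute_cardinality(records)
    return sorted(a for a, n in counts.items() if n == len(records))


records = [
    {"objectID": "ID01", "title": "Chair", "color": "red", "categories": ["Furniture", "Outdoors"]},
    {"objectID": "ID02", "title": "Table", "color": "green", "categories": ["Furniture", "Outdoors"]},
    {"objectID": "ID03", "title": "Water bottle", "color": "blue", "categories": ["Outdoors"]},
    {"objectID": "ID04", "title": "Book", "color": "red", "categories": ["Books"]},
]

cardinality = attribute_cardinality(records)
poor_candidates = unique_per_record(records)
print(cardinality)
print(poor_candidates)
```

Attributes that end up in `poor_candidates` (here objectID and title) have a different value in every record, so they can't group records into buckets. Attributes with a small count relative to the number of records, such as color, are more likely to be useful.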
Avoid nesting attributes
When organizing data within records, prioritize a flat structure. Reserve deeper nesting for hierarchical facets.
Consider the following example where several keys are being nested.
{
  "key1": {
    "key2": {
      "key3": "value"
    }
  }
}
To improve this index structure, flatten the nested keys into a single attribute-value pair. This makes it easier for AI Personalization to process the index.
{
  "key1_key2_key3": "value"
}
To optimize performance, AI Personalization sets a maximum of 50 key-value pairs for nested attributes. Adhering to this limit ensures your nested attributes are processed efficiently. If you need to exceed these constraints, contact the Algolia support team.
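If your records are generated by a script, the flattening shown above can be done before you send them to the index. This is a minimal sketch, assuming records are plain Python dictionaries. The `flatten` helper and the underscore separator are illustrative choices that mirror the key1_key2_key3 example, not an Algolia API.

```python
def flatten(record, parent_key="", separator="_"):
    """Collapse nested dictionaries into a single level of joined keys."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{separator}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested objects and merge their flattened keys.
            flat.update(flatten(value, full_key, separator))
        else:
            # Strings, numbers, and lists are kept as they are.
            flat[full_key] = value
    return flat


nested = {"key1": {"key2": {"key3": "value"}}}
flat = flatten(nested)
print(flat)
```

Note that this sketch drops empty nested objects and doesn't guard against two different paths producing the same joined key. Check for both if your data can contain them.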
Avoid mixing attribute types
Maintaining a consistent type for attributes is an important step in preparing your index structure.
[
{ "color": "red" },
{ "color": ["red", "blue"] },
{ "color": 250000000 }
]
Using this index as it is would lead to unexpected results because the attribute color is defined as a string in one record, an array in another, and an integer in a third.
A structure like this usually indicates an underlying issue with your data. You must evaluate your index and ensure a consistent type for your attributes.
[
{ "color": "red" },
{ "color": "green" },
{ "color": "blue" }
]
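To find attributes with inconsistent types in a larger dataset, you can collect the types seen for each attribute and report those with more than one. This is a minimal sketch, assuming records are plain Python dictionaries loaded from JSON. The `find_mixed_types` helper is hypothetical.

```python
from collections import defaultdict


def find_mixed_types(records):
    """Return attributes whose values have more than one Python type."""
    types_seen = defaultdict(set)
    for record in records:
        for attribute, value in record.items():
            types_seen[attribute].add(type(value).__name__)
    return {
        attribute: sorted(names)
        for attribute, names in types_seen.items()
        if len(names) > 1
    }


records = [
    {"color": "red"},
    {"color": ["red", "blue"]},
    {"color": 250000000},
]

mixed = find_mixed_types(records)
print(mixed)  # {'color': ['int', 'list', 'str']}
```

An empty result means every attribute has a single type across the records that were checked.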
Avoid mixing attributes from different domains
When preparing your index structure, you must ensure that attributes are relevant to a single domain.
For example, an index for articles shouldn’t contain attributes relevant to product information and vice versa.
Mixing languages is another common way for attributes from different domains to end up in the same index.
[
  {
    "objectID": "MH001",
    "color": "red",
    "title": "Bag"
  },
  {
    "objectID": "MH002",
    "color": "rouge",
    "title": "Sac"
  },
  {
    "objectID": "MH003",
    "color": "blue",
    "title": "Chair"
  }
]
The second record is out of place within this index because its attribute values are in French. You can fix this by reviewing the index and ensuring that attributes from different domains haven’t been mixed.
[
  {
    "objectID": "MH001",
    "color": "red",
    "title": "Bag"
  },
  {
    "objectID": "MH002",
    "color": "red",
    "title": "Bag"
  },
  {
    "objectID": "MH003",
    "color": "blue",
    "title": "Chair"
  }
]
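If you know which values an attribute is expected to take, a simple check can surface records that don't fit, such as values in another language. This is a minimal sketch under that assumption. The `find_unexpected_values` helper and the `expected_colors` set are hypothetical, and the check only handles attributes with a single string value.

```python
def find_unexpected_values(records, attribute, expected_values):
    """Return the objectIDs of records whose attribute value isn't expected."""
    return [
        record["objectID"]
        for record in records
        # A missing attribute yields None, which is also reported.
        if record.get(attribute) not in expected_values
    ]


records = [
    {"objectID": "MH001", "color": "red", "title": "Bag"},
    {"objectID": "MH002", "color": "rouge", "title": "Sac"},
    {"objectID": "MH003", "color": "blue", "title": "Chair"},
]

# Assumption: the index is meant to hold English color names only.
expected_colors = {"red", "green", "blue"}

unexpected = find_unexpected_values(records, "color", expected_colors)
print(unexpected)  # ['MH002']
```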
How AI Personalization validates attributes
It’s also important to understand how AI Personalization’s validation process works.
AI Personalization prioritizes attributes that significantly contribute to personalization, ensuring your user profiles are built on meaningful data. It filters out attributes based on user interactions.
- An attribute-value pair must show significant user interaction. AI Personalization doesn’t set a fixed benchmark but instead looks for a significant number of interactions. For the attribute-value pair color:red, if AI Personalization randomly picks a set of users from last month (say 200), it would want to see at least some of them interacting with red products.
- Attributes with minimal relevance are likely to be disregarded. An attribute is likely to be filtered out if it only applies to a tiny fraction of your products, since it won’t generate enough user interactions.
- Attributes with excessive diversity are likely to be disregarded. For example, an attribute like brand is likely to be filtered out if AI Personalization notices that there are thousands of different brands across millions of products and no single brand is getting enough user interaction to be considered important.
- Unusual attribute values will be disregarded. Attributes with lots of exotic values (such as color:pink_with_brown_dots) aren’t excluded entirely, but the exotic values are filtered out and won’t be processed.
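To get a rough feel for which attribute-value pairs might attract enough interaction, you can measure the share of sampled users who interacted with each pair in your own event data. The following sketch is only an illustration of that idea and is not AI Personalization's actual validation logic. The event tuples, the `interaction_share` helper, and the `MIN_SHARE` threshold are all hypothetical, and the text above states that no fixed benchmark is used.

```python
from collections import defaultdict


def interaction_share(events, sampled_users):
    """Return, per attribute-value pair, the share of sampled users who interacted with it."""
    users_per_pair = defaultdict(set)
    for user, attribute, value in events:
        if user in sampled_users:
            users_per_pair[(attribute, value)].add(user)
    return {
        pair: len(users) / len(sampled_users)
        for pair, users in users_per_pair.items()
    }


# Hypothetical events: (user, attribute, value of the product interacted with).
events = [
    ("user1", "color", "red"),
    ("user2", "color", "red"),
    ("user3", "color", "blue"),
    ("user1", "color", "pink_with_brown_dots"),
]
sampled_users = {"user1", "user2", "user3", "user4"}

shares = interaction_share(events, sampled_users)

# Hypothetical cut-off chosen for this example only.
MIN_SHARE = 0.5
kept = sorted(pair for pair, share in shares.items() if share >= MIN_SHARE)
print(kept)  # [('color', 'red')]
```

In this toy data, color:red reaches half of the sampled users, while color:blue and the exotic color:pink_with_brown_dots reach only a quarter each and fall below the example threshold.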