Deduplicate recommendations
With Recommend deduplication, you can refine your recommendations by removing item variants from your AI model training.
The benefits of using Recommend deduplication are:
- More accurate recommendations and faster training times by removing variants directly before the training.
- Improved recommendation quality by using events from all item variants.
- Independent deduplication of recommendations without impacting your search experience. For more information, see Recommend deduplication vs Search deduplication.
How Recommend deduplication works
Recommend deduplication adds two processes to your AI model training:
- Pre-training process: Generates a training dataset with only one variant per item. It also merges all events from variants of the same item.
- Post-training process: Adds all the variants that were dropped during the pre-training process to the generated recommendations.
Variants of the same item share the same recommendations.
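The two processes can be pictured with a minimal Python sketch. It only illustrates the idea described above and is not Algolia's implementation: the function names, the simplified record and event shapes (one objectID per event), and the choice of the first variant encountered as the one kept for training are all assumptions made for this example.

```python
from collections import defaultdict


def pre_training(records, events, variant_attribute):
    """Keep one variant per item and merge events from all its variants.

    Assumption: the first variant seen for each attribute value is kept.
    """
    kept_for_value = {}           # attribute value -> objectID of the kept variant
    variants = defaultdict(list)  # kept objectID -> objectIDs of all its variants
    for record in records:
        kept = kept_for_value.setdefault(record[variant_attribute], record["objectID"])
        variants[kept].append(record["objectID"])

    # Re-attribute every event to the kept variant of its item.
    to_kept = {oid: kept for kept, oids in variants.items() for oid in oids}
    merged_events = [{**event, "objectID": to_kept[event["objectID"]]} for event in events]
    return list(kept_for_value.values()), merged_events, dict(variants)


def post_training(recommendations, variants):
    """Give every variant the recommendations computed for its kept variant."""
    return {
        oid: recs
        for kept, recs in recommendations.items()
        for oid in variants[kept]
    }


# Hypothetical data: three records, two items when grouped by color.
records = [
    {"objectID": "red-xs", "color": "red"},
    {"objectID": "green-s", "color": "green"},
    {"objectID": "green-m", "color": "green"},
]
events = [
    {"objectID": "green-m", "eventType": "click"},
    {"objectID": "green-s", "eventType": "view"},
]

kept, merged_events, variants = pre_training(records, events, "color")

# Stand-in for the model output: recommendations exist only for kept variants.
trained = {"red-xs": ["green-s"], "green-s": ["red-xs"]}
final = post_training(trained, variants)
```

In this sketch, green-m is dropped before training, its click is counted for green-s, and after training it receives the same recommendations as green-s.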
Set up the deduplication for a model
To deduplicate your recommendations, you must first declare an attribute for distinguishing variants, then turn on deduplication when configuring a Recommend model. After that, you should verify that the recommendations are deduplicated.
Configure an attribute for distinguishing variants
First, choose which attribute defines records as variants:
- Go to the Algolia dashboard and select your Algolia application.
- On the left sidebar, select Search.
- Select your Algolia index.
- On the Configuration tab, go to the Deduplication and Grouping page.
- In the Attribute for Distinct box, select or enter the attribute name you want to use to define variants.
Don’t set distinct unless you want to also deduplicate search results.
Enable Recommend deduplication on your model
- Go to the Algolia dashboard and select your Algolia application.
- On the left sidebar, select Recommend.
- Create a new Recommend model or edit an existing one for the index you used when setting the attributeForDistinct.
- In the section Distinct and deduplication of recommendations, select the Deduplicate recommendations option. The attribute you selected for defining variants is shown.
- Continue to configure your Recommend model and click Save.
Verify the recommendations
To check that the deduplication is working on your recommendations, go back to the model configuration after the training is completed.
Go to the Preview section. Use the Search for a record box to search for an item that has variants.
This will display the list of recommendations for the selected item. They shouldn’t contain any variants.
Recommend deduplication vs Search deduplication (distinct)
Before Recommend deduplication was introduced, you could already filter out variants from recommendations.
This was a side effect of enabling the Search distinct feature and setting an attributeForDistinct in your index settings.
With this approach, your AI models were trained with all variants, and the API then removed variants from the recommendations before returning the results.
This could produce inaccurate recommendations or, in some cases, no recommendations at all.
Examples
The following examples illustrate how Recommend deduplication works. The index has records for t-shirts in different colors and sizes:
- One red t-shirt in one size (XS)
- Two green t-shirts in two sizes (S, M)
- Three blue t-shirts in three sizes (L, XL, XXL)
This example uses the Related Products model to recommend the top 3 similar items with and without deduplication (with the color attribute configured as attributeForDistinct).
Without deduplication
Without deduplication, recommendations can include variants of the base item, such as green (M) for green (S), or blue (XL) and blue (XXL) for blue (L). The red t-shirt is the exception because it has no variants.
Base Item | Recommendation 1 | Recommendation 2 | Recommendation 3 |
---|---|---|---|
red (XS) | green (S) | blue (L) | blue (XL) |
green (S) | green (M) | red (XS) | blue (L) |
blue (L) | blue (XL) | blue (XXL) | green (M) |
With deduplication
With deduplication, the recommendations don’t include any variants. Because this example dataset contains only three distinct items (one per color), only two recommendations can be generated for each base item. If you add a new item, such as an orange t-shirt, it's included as the third recommendation.
Base Item | Recommendation 1 | Recommendation 2 | Recommendation 3 |
---|---|---|---|
red (XS) | green (S) | blue (L) | N/A |
green (S) | red (XS) | blue (L) | N/A |
blue (L) | red (XS) | green (M) | N/A |
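
The N/A column follows from simple counting: once variants are grouped by color, an item can only be recommended the other distinct items. The short Python sketch below reproduces that arithmetic for the example dataset. The helper name and the tuple representation of records are assumptions made for illustration and aren't part of any Algolia API.

```python
# The six example records as (color, size) pairs; color plays the role
# of attributeForDistinct.
tshirts = [
    ("red", "XS"),
    ("green", "S"), ("green", "M"),
    ("blue", "L"), ("blue", "XL"), ("blue", "XXL"),
]


def max_deduplicated_recommendations(records, requested):
    """Upper bound on recommendations per item once variants are grouped."""
    distinct_items = {color for color, _size in records}
    # An item is never recommended for itself, so one distinct item is excluded.
    return min(requested, len(distinct_items) - 1)


without_orange = max_deduplicated_recommendations(tshirts, 3)
with_orange = max_deduplicated_recommendations(tshirts + [("orange", "M")], 3)
print(without_orange, with_orange)  # prints "2 3"
```

With three colors, each base item can get at most two recommendations. Adding a fourth color, such as orange, makes a third recommendation possible.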