Deduplicate recommendations

With Recommend deduplication, you can refine your recommendations by removing item variants from your AI model training.

The benefits of using Recommend deduplication are:

  • More accurate recommendations and faster training times, because variants are removed before training.
  • Improved recommendation quality by using events from all item variants.
  • Independent deduplication of recommendations without impacting your search experience. For more information, see Recommend deduplication vs Search deduplication.

How Recommend deduplication works

Recommend deduplication adds two processes to your AI model training:

  • Pre-training process: Generates a training dataset with only one variant per item. It also merges all events from variants of the same item.
  • Post-training process: Adds all the variants that were dropped during the pre-training process to the generated recommendations.

Variants of the same item share the same recommendations.
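
The two processes can be sketched in a few lines of Python. This is a minimal illustration, not Algolia's implementation: the function names, the record and event shapes, and the choice to keep the first variant seen are all assumptions, and the model output is hard-coded rather than trained.

```python
from collections import defaultdict


def pre_training(records, events, attribute):
    """Keep the first record per attribute value and re-point events to it."""
    kept = {}                    # attribute value -> objectID kept for training
    groups = defaultdict(list)   # attribute value -> objectIDs of all variants
    for record in records:
        value = record[attribute]
        kept.setdefault(value, record["objectID"])
        groups[value].append(record["objectID"])
    value_of = {r["objectID"]: r[attribute] for r in records}
    training_ids = list(kept.values())
    # Events on any variant are merged onto the variant kept for training.
    merged_events = [(user, kept[value_of[object_id]]) for user, object_id in events]
    return training_ids, merged_events, dict(groups), kept


def post_training(recommendations, groups, kept):
    """Give every variant the recommendations of its kept variant."""
    expanded = {}
    for value, object_ids in groups.items():
        for object_id in object_ids:
            expanded[object_id] = recommendations[kept[value]]
    return expanded


# Hypothetical records and events, with "color" as the variant attribute.
records = [
    {"objectID": "red-xs", "color": "red"},
    {"objectID": "green-s", "color": "green"},
    {"objectID": "green-m", "color": "green"},
]
events = [("user-1", "green-m"), ("user-2", "green-s"), ("user-1", "red-xs")]

training_ids, merged_events, groups, kept = pre_training(records, events, "color")

# Stand-in for the trained model's output (hard-coded for illustration).
model_output = {"red-xs": ["green-s"], "green-s": ["red-xs"]}
final = post_training(model_output, groups, kept)
```

In this sketch, `green-m` is dropped before training, its event counts toward `green-s`, and after training it receives the same recommendations as `green-s`.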

Set up the deduplication for a model

To deduplicate your recommendations, you must first declare an attribute for distinguishing variants, then turn on deduplication when configuring a Recommend model. After that, you should verify that the recommendations are deduplicated.

Configure an attribute for distinguishing variants

First, choose which attribute defines records as variants:

  1. Go to the Algolia dashboard and select your Algolia application.
  2. On the left sidebar, select Search.
  3. Select your Algolia index:

    [Screenshot: Select your Algolia application and index]

  4. On the Configuration tab, go to the Deduplication and Grouping page.
  5. In the Attribute for Distinct box, select or enter the attribute name you want to use to define variants.

    [Screenshot: Set the `attributeForDistinct` in your index settings]

Don’t set distinct unless you want to also deduplicate search results.
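
If you prefer to set the attribute outside the dashboard, the following sketch builds the equivalent index settings request with Python's standard library. The application ID, API key, index name, and the `color` attribute are placeholders, and the endpoint shape is an assumption that you should check against the current Algolia API reference. The request is built but not sent.

```python
import json
import urllib.request

APP_ID = "YOUR_APP_ID"        # placeholder
API_KEY = "YOUR_API_KEY"      # placeholder: a key allowed to edit index settings
INDEX_NAME = "products"       # hypothetical index name

# Only attributeForDistinct is set. "distinct" is left out on purpose so that
# search results aren't deduplicated.
settings = {"attributeForDistinct": "color"}

request = urllib.request.Request(
    url=f"https://{APP_ID}.algolia.net/1/indexes/{INDEX_NAME}/settings",
    data=json.dumps(settings).encode("utf-8"),
    method="PUT",
    headers={
        "X-Algolia-Application-Id": APP_ID,
        "X-Algolia-API-Key": API_KEY,
        "Content-Type": "application/json",
    },
)

# To apply the settings, send the request:
# urllib.request.urlopen(request)
```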

Enable Recommend deduplication on your model

  1. Go to the Algolia dashboard and select your Algolia application.
  2. On the left sidebar, select Recommend.
  3. Create a new Recommend model or edit an existing one for the index you used when setting the attributeForDistinct.
  4. In the section Distinct and deduplication of recommendations, select the Deduplicate recommendations option. The attribute you selected for defining variants is shown.
  5. Continue to configure your Recommend model and click Save.

    [Screenshot: Enable deduplication in your model training configuration]

Verify the recommendations

To check that the deduplication is working on your recommendations, go back to the model configuration after the training is completed.

Go to the Preview section. Use the Search for a record box to search for an item that has variants.

This will display the list of recommendations for the selected item. They shouldn’t contain any variants.
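
You can run the same check on recommendations you've already fetched. The sketch below assumes each recommendation is a dictionary that includes the attribute you configured as `attributeForDistinct`. The helper name and the sample data are hypothetical.

```python
from collections import Counter


def variant_values(base, hits, attribute):
    """Return attribute values that repeat among hits or match the base item."""
    counts = Counter(hit[attribute] for hit in hits)
    return sorted(
        value for value, n in counts.items()
        if n > 1 or value == base[attribute]
    )


base = {"objectID": "blue-l", "color": "blue"}

# Hypothetical recommendations before and after enabling deduplication.
hits_before = [
    {"objectID": "blue-xl", "color": "blue"},
    {"objectID": "blue-xxl", "color": "blue"},
    {"objectID": "green-m", "color": "green"},
]
hits_after = [
    {"objectID": "red-xs", "color": "red"},
    {"objectID": "green-m", "color": "green"},
]

problems_before = variant_values(base, hits_before, "color")
problems_after = variant_values(base, hits_after, "color")
```

An empty list means that no two recommendations share a value and none is a variant of the base item.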

Recommend deduplication vs Search deduplication (distinct)

Before this deduplication feature was introduced in Recommend, it was already possible to filter out variants from recommendations. This was a side effect of enabling the Search distinct feature and setting an attributeForDistinct in your index settings.

With that setup, your AI models were trained on all variants. The API then filtered variants out of the recommendations before returning the results.

This approach could produce inaccurate recommendations or, in some cases, no recommendations at all.
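
One way to picture the problem with filtering after training: if the model's top results are mostly variants of the same item, filtering them out leaves fewer results than requested. The sketch below is a simplified assumption about the order of operations (truncate to the limit, then filter), not a description of the actual API behavior. The function name and data are hypothetical.

```python
def filter_after_training(ranked_hits, attribute, limit):
    """Take the top `limit` hits, then keep one hit per attribute value."""
    seen = set()
    kept = []
    for hit in ranked_hits[:limit]:
        if hit[attribute] not in seen:
            seen.add(hit[attribute])
            kept.append(hit)
    return kept


# Hypothetical ranking from a model trained on all variants.
ranked_hits = [
    {"objectID": "blue-xl", "color": "blue"},
    {"objectID": "blue-xxl", "color": "blue"},
    {"objectID": "green-m", "color": "green"},
    {"objectID": "red-xs", "color": "red"},
]

filtered = filter_after_training(ranked_hits, "color", limit=3)
filtered_ids = [hit["objectID"] for hit in filtered]
```

Here, three recommendations were requested but only two remain after filtering, even though a third distinct item exists further down the ranking.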

Examples

The following examples illustrate how Recommend deduplication works. The index has records for t-shirts in different colors and sizes:

  • One red t-shirt in one size (XS)
  • Two green t-shirts in two sizes (S, M)
  • Three blue t-shirts in three sizes (L, XL, XXL)

This example uses the Related Products model to recommend the top 3 similar items with and without deduplication (with the color attribute configured as attributeForDistinct).

Without deduplication

Without deduplication, recommendations include variants, such as blue (XXL) or green (M). The red t-shirt is the exception because it has no variants.

Base item | Recommendation 1 | Recommendation 2 | Recommendation 3
red (XS)  | green (S)        | blue (L)         | blue (XL)
green (S) | green (M)        | red (XS)         | blue (L)
blue (L)  | blue (XL)        | blue (XXL)       | green (M)

With deduplication

With deduplication, the recommendations don’t include any variants. But because this example dataset only includes three distinct items (one per color), only two recommendations can be generated for each base item. If you add a new item, such as an orange t-shirt, it’s included as the third recommendation.

Base item | Recommendation 1 | Recommendation 2 | Recommendation 3
red (XS)  | green (S)        | blue (L)         | N/A
green (S) | red (XS)         | blue (L)         | N/A
blue (L)  | red (XS)         | green (M)        | N/A
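
The N/A column follows from simple counting: once variants are collapsed, each base item can only be recommended the other distinct items. The sketch below reproduces that arithmetic for the t-shirt example. The helper name and the objectID values are hypothetical.

```python
def max_deduplicated_recommendations(records, attribute, requested):
    """Upper bound on recommendations when each item counts only once."""
    distinct_items = {record[attribute] for record in records}
    # The base item itself isn't recommended, so subtract one.
    return min(requested, max(len(distinct_items) - 1, 0))


tshirts = [
    {"objectID": "red-xs", "color": "red"},
    {"objectID": "green-s", "color": "green"},
    {"objectID": "green-m", "color": "green"},
    {"objectID": "blue-l", "color": "blue"},
    {"objectID": "blue-xl", "color": "blue"},
    {"objectID": "blue-xxl", "color": "blue"},
]
orange = {"objectID": "orange-m", "color": "orange"}

without_orange = max_deduplicated_recommendations(tshirts, "color", 3)
with_orange = max_deduplicated_recommendations(tshirts + [orange], "color", 3)
```

With six records but only three colors, at most two recommendations are possible. Adding the orange t-shirt raises the limit to three.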