Picture the e-commerce landscape in its early days. Businesses were just beginning to discover the power of personalization. They'd divide their customers into segments based on shared traits like age, location, or past shopping habits. And then, companies would adjust their marketing strategies to suit these segments.
If a customer falls into the 'young adults' segment, the business might prioritize showing products popular among other young adults. This can make the search results more relevant to the customer and increase the likelihood of a purchase.
Similarly, if a customer belongs to a 'luxury shoppers' segment, the e-commerce platform might recommend high-end, luxury products to them. And if a customer belongs to a 'discount seekers' group, the business might recommend items currently on sale or with significant discounts.
It was simple, efficient, and it worked - to an extent. Just like we'd group our friends based on common interests: some might be great companions to go on a summer vacation with their kids and others might be more free-spirited, down to clubbing all night long. Try putting them together and you might have a perfect storm.
Or, consider this: you share your Netflix account with your sibling or roommate. Even though you both enjoy watching shows on Netflix, your preferences could be wildly different. You might be a fan of thrilling crime documentaries, while they might lean towards light-hearted romantic comedies. Since you’re both “grouped” under the same account, the recommendations would be a mishmash of your combined viewing habits.
Suddenly, your Netflix homepage might be suggesting you "Bridgerton" when you're really in the mood for "Mindhunter". This illustrates how segment-based personalization can fall short in providing truly individualized recommendations. It's akin to trying to make a one-size-fits-all shirt fit perfectly on everyone—it's not going to work.
Now, let's fast forward a bit. With the help of machine learning, businesses today are creating a shopping experience that is as unique as each customer. Imagine walking into a store where everything on the shelves was picked out just for you. That's hyper-personalization, 1:1 personalization, everything just for you - a virtual personal shopper who knows your tastes and shopping habits to a tee. It is a game-changer, completely reshaping how businesses interact with first-time visitors, returning users, and repeat customers.
AI is the superhero that makes this transition possible. Imagine having a superpower that allowed you to detect patterns from a pile of user data. That's what deep learning does for your e-commerce website or application, seeking to truly understand user intents and act on that understanding by providing a tailored experience.
But here's the catch, personalization needs to:
Not really a walk in the park, is it? It requires investment in the technology and skills needed to analyze data but, first and foremost, the ability to collect the right type of data. In the context of personalization systems we’re mostly referring to user-centric data: behavioral data, events, clickstreams. That’s what’s needed to build a centralized user profile that can then be leveraged to personalize user experiences across channels.
True personalization starts with the user, not with the content (pages, products, etc.). That’s why I’d like to invite you on a journey, the end-user journey, to understand the 5Ws of orchestrating a modern personalization experience: who, what, when, where and why.
The first question to ask is: What is the new vs. returning visitors rate?
This is the key starting point for understanding where we should focus our personalization strategy. There’s this (misguided) expectation that new users should have the same level of personalized experience as returning users, when in fact new users have little to no interactions. How can you personalize an experience if you don't actually have the slightest indication of their preferences?
At the minimum, location and device data can be collected for first time visitors: city, browser, device brand, operating system, screen resolution. And when their session is a bit longer or they have a rich session in terms of interactions (clicks, views, conversions, etc.) then real-time in-session personalization can be applied.
Yes, new visitors might come to your website in greater numbers but returning visitors are more likely to be responsive to personalization and their conversion rates or average order value (AOV) tend to be higher. That’s why, especially when starting from scratch, devising a personalization plan that targets first and foremost returning users is a smart strategy.
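As a rough illustration, the new vs. returning split can be estimated from a session log. The sketch below assumes a hypothetical list of session records keyed by a `visitor_id`; the field names and the sample data are invented for the example and are not tied to any particular analytics tool.

```python
from collections import Counter

# Hypothetical session log: one record per session (field names are assumptions).
sessions = [
    {"visitor_id": "v1", "session_start": "2023-07-20"},
    {"visitor_id": "v2", "session_start": "2023-07-21"},
    {"visitor_id": "v1", "session_start": "2023-07-22"},
    {"visitor_id": "v3", "session_start": "2023-07-23"},
    {"visitor_id": "v1", "session_start": "2023-07-24"},
]

def new_vs_returning(sessions):
    """Share of visitors with exactly one session (new) vs. more than one (returning)."""
    sessions_per_visitor = Counter(s["visitor_id"] for s in sessions)
    total = len(sessions_per_visitor)
    if total == 0:
        return {"new": 0.0, "returning": 0.0}
    returning = sum(1 for count in sessions_per_visitor.values() if count > 1)
    return {"new": (total - returning) / total, "returning": returning / total}

rates = new_vs_returning(sessions)
print(rates)
```

A real setup would also need to decide on an observation window, since a visitor who looks "new" this week may have visited last month.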
Let’s take another step forward and dissect the returning user types.
A visitor can interact with your website or application repeatedly in any given timeframe (hours, days, weeks, months, etc.), even if that individual didn’t create an account with your business. That in itself might be a signal of their weak intention to engage with your business. Or it might just be that the user didn’t authenticate out of convenience.
Regardless, this is the first state of the visitor that can actually feed valuable signals to our personalization system: time spent on the website, number of sessions, number of pageviews, bounce rate, etc. Of course, depending on how your website is built, they might even convert without creating an account.
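The signals listed above can be derived from raw pageview events. Here is a minimal sketch, assuming a hypothetical event format (`session_id`, `page`, `ts` in seconds) for a single anonymous visitor, and assuming a bounce means a session with exactly one pageview:

```python
from collections import defaultdict

# Hypothetical pageview events for one anonymous (cookie-identified) visitor.
# "ts" is the number of seconds since the start of the observation window.
events = [
    {"session_id": "s1", "page": "/", "ts": 0},
    {"session_id": "s1", "page": "/shoes", "ts": 40},
    {"session_id": "s1", "page": "/shoes/123", "ts": 95},
    {"session_id": "s2", "page": "/", "ts": 5000},
    {"session_id": "s3", "page": "/sale", "ts": 9000},
    {"session_id": "s3", "page": "/sale/42", "ts": 9060},
]

def visitor_signals(events):
    """Aggregate per-visitor engagement signals from pageview events."""
    timestamps_by_session = defaultdict(list)
    for event in events:
        timestamps_by_session[event["session_id"]].append(event["ts"])
    n_sessions = len(timestamps_by_session)
    # Assumption: a bounce is a session with a single pageview.
    bounces = sum(1 for ts in timestamps_by_session.values() if len(ts) == 1)
    # Time spent is approximated as last pageview minus first pageview per session.
    time_spent = sum(max(ts) - min(ts) for ts in timestamps_by_session.values())
    return {
        "sessions": n_sessions,
        "pageviews": len(events),
        "bounce_rate": bounces / n_sessions if n_sessions else 0.0,
        "time_spent_seconds": time_spent,
    }

signals = visitor_signals(events)
print(signals)
```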
The minute visitors authenticate, they become users. They might convert or not, they might be first time visitors that decided to trust your business and create an account - a great indicator of their intent for more: to return to your website/app in the future or even to place an order.
What’s certain is that we now have even more signals that we can utilize to orchestrate a meaningful experience for our newly acquired user: name, mobile number, e-mail address, birthdate, occupation, gender, etc.
This is where we’re stepping into the weeds of personal data and how it should be treated in the context of respecting user privacy laws. While an important topic, it’s not something that we’ll be covering in this article. (See this article for more on user privacy.)
Finally we have users that purchased and, thus, graduated to the “buyer cum laude” status. Visitors can convert in their first session or in any of the next ones and it’s important to understand the richness of their profile and what it can mean for their subsequent interactions with our website/application.
At this stage they’ve explicitly indicated strong preferences when it comes to the products/items we’re selling. Based on their previous orders, especially if they’re repeat buyers, we can tell with decent certainty if they like a certain brand, the size they’re wearing, favorite colors/patterns, etc. Or in the case of groceries, any dietary preferences.
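One simple way to turn order history into explicit preferences is to keep, for each attribute, the value that dominates past orders. The sketch below is only an illustration: the order fields, the brand names and the `min_share` threshold are all assumptions.

```python
from collections import Counter

# Hypothetical order lines for one repeat buyer (fields are assumptions).
orders = [
    {"brand": "Acme", "size": "M", "color": "blue"},
    {"brand": "Acme", "size": "M", "color": "black"},
    {"brand": "Globex", "size": "M", "color": "blue"},
]

def preferred_attributes(orders, min_share=0.5):
    """Keep, per attribute, the most frequent value if it covers at least min_share of orders."""
    preferences = {}
    if not orders:
        return preferences
    for attribute in orders[0]:
        value, count = Counter(order[attribute] for order in orders).most_common(1)[0]
        if count / len(orders) >= min_share:
            preferences[attribute] = value
    return preferences

preferences = preferred_attributes(orders)
print(preferences)
```

Raising `min_share` makes the profile more conservative: with a value of 0.8, only the size would be kept for this buyer.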
Properties we might know at the moment a new or returning user lands on the website

Now that we have a better understanding of the type of users we’re dealing with and their distribution, let’s move to the next question: what is the best personalization strategy for each of these high-level segments?

From simple to complex, the personalization maturity model consists of: content based strategies (non-personalization), orchestrating experiences for segments of users (weak personalization) and finally AI-powered personalization or 1:1 personalization, also known as hyper-personalization, the pinnacle of customer experience in any industry.
This is not a personalization strategy per se because, in fact, we’re not doing any of that at this step. Meaning, we’re not targeting the individual user, at the right time, with the right content. Instead, the experience is based on popularity and/or manually curated content. This approach is mostly suited for scenarios where we do not have enough data on our users at the individual level, but we’re still looking to tailor the experience of our first time visitors as much as possible.
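A popularity-based experience of this kind can be sketched in a few lines. The interaction log and the event weights below are assumptions made for the example; the point is only that the ranking is computed from aggregated data and is identical for every visitor.

```python
from collections import Counter

# Hypothetical aggregated interaction log: (product, event type).
interactions = [
    ("Product A", "click"), ("Product B", "click"), ("Product A", "purchase"),
    ("Product C", "view"), ("Product A", "view"), ("Product B", "purchase"),
]

# Assumed weights: a purchase counts more than a click, a click more than a view.
EVENT_WEIGHTS = {"view": 1, "click": 2, "purchase": 5}

def most_popular(interactions, top_n=2):
    """Rank products by weighted interaction count, the same for all visitors."""
    scores = Counter()
    for product, event in interactions:
        scores[product] += EVENT_WEIGHTS[event]
    return [product for product, _ in scores.most_common(top_n)]

popular = most_popular(interactions)
print(popular)
```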
We can divide content-based mechanisms in two categories: (1) those that only need the product catalog (images, titles, descriptions or attributes) and (2) those that need additional aggregated user data (clicks, views, purchases, etc).

Output similar images, articles or products based on a given image, article or product. In this case, no user interactions (clicks, views, conversions) are taken into account.
Python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Initialize data
data = {
    'title': [
        'Product 1', 'Product 2', 'Product 3', 'Product 4', 'Product 5',
        'Product 6', 'Product 7', 'Product 8', 'Product 9', 'Product 10'],
    'description': [
        'This is the first product.',
        'This is the second product.',
        'This is the third product.',
        'This is the fourth product.',
        'This is the fifth product.',
        'This is the sixth product.',
        'This is the seventh product.',
        'This is the eighth product.',
        'This is the ninth product.',
        'This is the tenth product.'
    ],
    'attributes': [
        'Size: Small, Color: Blue, Material: Cotton',
        'Size: Medium, Color: Red, Material: Silk',
        'Size: Large, Color: Green, Material: Wool',
        'Size: Extra Large, Color: Black, Material: Leather',
        'Size: Small, Color: White, Material: Linen',
        'Size: Medium, Color: Blue, Material: Linen',
        'Size: Small, Color: Black, Material: Silk',
        'Size: Large, Color: White, Material: Leather',
        'Size: Extra Large, Color: Red, Material: Cotton',
        'Size: Medium, Color: Green, Material: Wool'
    ]
}

# Create DataFrame
df = pd.DataFrame(data)

# Print DataFrame
print(df)

# Assuming you have a DataFrame df with 'title', 'description', and 'attributes' as columns.
# In a realistic scenario, the data could be more complex and you might need to perform data cleaning.

# Combine the 'title', 'description', 'attributes' columns into a single 'content' column
df['content'] = df['title'] + ' ' + df['description'] + ' ' + df['attributes']

# Initialize the TfidfVectorizer
tfidf = TfidfVectorizer(stop_words='english')

# Construct the required TF-IDF matrix by fitting and transforming the data
tfidf_matrix = tfidf.fit_transform(df['content'])

# Compute the cosine similarity matrix
cosine_sim = cosine_similarity(tfidf_matrix, tfidf_matrix)

# Function that takes in product title as input and outputs most similar products
def get_recommendations(title, cosine_sim=cosine_sim):
    # Get the index of the product that matches the title
    indices = pd.Series(df.index, index=df['title']).drop_duplicates()
    idx = indices[title]
    # Get the pairwise similarity scores of all products with that product
    sim_scores = list(enumerate(cosine_sim[idx]))
    # Sort the products based on the similarity scores
    sim_scores = sorted(sim_scores, key=lambda x: x[1], reverse=True)
    # Get the scores of the 3 most similar products (position 0 is the product itself)
    sim_scores = sim_scores[1:4]
    # Get the product indices
    product_indices = [i[0] for i in sim_scores]
    # Get the titles of the top 3 most similar products
    titles = df['title'].iloc[product_indices]
    # Create a DataFrame that contains the indices, scores, and titles
    result = pd.DataFrame({
        'index': product_indices,
        'Score': [score for _, score in sim_scores],
        'title': titles
    })
    # Return the result
    return result

# Test the function
print(get_recommendations('Product 3'))
On the other hand, in an item-based collaborative filtering approach we have to determine the set of items that are most similar to the target item based on aggregated user interactions. The idea is to calculate the similarity between each pair of items according to how many clicks/views/purchases users have generated when interacting with both of them.

Python
import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity

# Given data
data = {
    'user_id': ['user1', 'user2', 'user', 'user3', 'user2', 'user', 'user3', 'user1', 'user4', 'user2'],
    'product_title': ['Product A', 'Product B', 'Product C', 'Product A',
                      'Product C', 'Product B', 'Product D', 'Product D', 'Product A', 'Product D'],
    'clicks': [5, 6, 3, 8, 5, 7, 4, 6, 8, 7]
}
df = pd.DataFrame(data)

# Create a user-item matrix
user_item_matrix = df.pivot_table(index='user_id', columns='product_title',
                                  values='clicks').fillna(0)

# Compute item similarity matrix
# .T transposes the matrix to switch users with items
item_similarity = cosine_similarity(user_item_matrix.T)
item_similarity_df = pd.DataFrame(item_similarity,
                                  index=user_item_matrix.columns, columns=user_item_matrix.columns)

def recommend_products(product_title, item_similarity=item_similarity_df):
    # Get similarity scores for the product
    similarity_scores = item_similarity[product_title]
    # Sort in descending order
    similarity_scores = similarity_scores.sort_values(ascending=False)
    # Return the top 3 most similar products (position 0 is the product itself)
    return similarity_scores.iloc[1:4]

print(recommend_products('Product A'))

Of course, there’s also the scenario where both content & aggregated user features are taken into consideration to generate a list of similar images, articles or products. A more comprehensive explanation of popular models and techniques for recommender systems can be found in one of our previous blog posts.
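One way to combine the two signals, offered here only as a sketch, is a weighted average of a content-based similarity table and a behavior-based one. The similarity values, the `alpha` weight and the function names are hypothetical; production systems often learn this trade-off rather than hard-coding it.

```python
def blend_similarities(content_sim, behavior_sim, alpha=0.5):
    """Weighted average of two item-to-item similarity tables (dict of dicts).

    alpha=1.0 uses only content similarity, alpha=0.0 only behavioral similarity.
    """
    blended = {}
    for item, neighbors in content_sim.items():
        blended[item] = {
            other: alpha * score + (1 - alpha) * behavior_sim.get(item, {}).get(other, 0.0)
            for other, score in neighbors.items()
        }
    return blended

def top_similar(blended, item, n=2):
    """Return the n items most similar to `item` according to the blended scores."""
    neighbors = blended[item]
    return sorted(neighbors, key=neighbors.get, reverse=True)[:n]

# Hypothetical similarity scores for product "A" against products "B", "C" and "D".
content_sim = {"A": {"B": 0.9, "C": 0.2, "D": 0.4}}
behavior_sim = {"A": {"B": 0.1, "C": 0.8, "D": 0.5}}

hybrid_top = top_similar(blend_similarities(content_sim, behavior_sim, alpha=0.3), "A")
print(hybrid_top)
```

With `alpha=0.3` the behavioral signal dominates, so "C" and "D" are ranked ahead of "B" even though "B" is the closest match on catalog content alone.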
Remember, when we described above the new visitors we concluded that unless they have interacted in a meaningful way with our website (clicks, views, purchases, etc.) we cannot truly kick off real-time in session personalization for them.
In fact, in case of purely content-based scenarios, real-time is not even a factor that plays into the quality of the experience for new users because much of the content that is going to be displayed is popularity-based or manually curated which doesn’t change in a meaningful way when real-time interactions are included. Of course, that’s not the case for segment-based or AI-based scenarios, which will be detailed in the next section.
So, we can rely on the following content-based methods of engaging new users:
When no user-data is available (see code sample #1 above):
On product detail pages:
When aggregated user-data is included (see code sample #2 above):
On product detail pages:
On the search results page:
Content-based strategies can be applied for returning users whether they’re authenticated or not or whether they previously bought something or not. However, because we know more about these users at the individual level, it makes more sense to explore segment-based and AI-based personalization strategies as they can produce better results.
As a rule of thumb, applying only content-based strategies to returning users is a sub-optimal strategy because it does not utilize the explicit/implicit information that users have provided.
Let's consider a bookstore owner who knows some of her regular customers very well. A content-based strategy would be like recommending books solely based on bestsellers, regardless of whether that customer is a regular or a newcomer.
For example, the owner might recommend fantasy novels by George R. R. Martin just because he’s a trending author that week. This strategy can work for all customers, regulars or newcomers, since it doesn't rely on knowing anything more about them individually other than the fact that they entered the bookstore.
However, for the regular customers, the owner has a lot more information - she knows their tastes, their preferred authors, and genres, how often they buy new books, and so on. Using a segment-based or AI-based personalization strategy would be like using this additional information to tailor her recommendations. For example, she might know that a particular regular customer enjoys mystery novels but also has a passion for historical non-fiction. So, she could recommend a historical mystery novel that a pure content-based strategy might miss.
While we should not completely disregard it, we must understand that relying solely on the content-based methods for these repeat customers would be a sub-optimal strategy. It's like the bookstore owner ignoring all the additional information she knows about her regular customers and treating everyone as new visitors, as strangers. While she might still recommend a book the customer enjoys, she's missing out on the chance to give an even more tailored recommendation that the customer might appreciate more.
Once we have access to individual users’ data (clicks, views, purchases, etc.) we can begin thinking about offering a more personalized experience. The first level of personalization is the segmentation-based approach.
Associating a segment with a new user that is landing on the website/app implies that there’s already a predefined segment (or more) in our system. Once the required user attributes have been correctly identified, the user can be linked to one or more segments and different actions can be triggered based on that.
Let’s take the below user profile as an example; as user123 navigates around and interacts with products, he generates clicks, views and even purchases.
JavaScript
{
  "user_id": "user123",
  "activity": {
    "clicks": [
      {
        "product_id": "product567",
        "title": "Smartphone X",
        "description": "Latest model with 5G connectivity and high resolution camera",
        "price": 999.99,
        "timestamp": "2023-07-24T14:30:00"
      },
      {
        "product_id": "product891",
        "title": "Sports Watch Y",
        "description": "Water resistant with multiple sport modes",
        "price": 199.99,
        "timestamp": "2023-07-24T15:00:00"
      }
    ],
    "views": [
      {
        "product_id": "product234",
        "title": "Novel Z",
        "description": "Bestselling fiction novel by renowned author",
        "price": 14.99,
        "timestamp": "2023-07-24T13:45:00"
      },
      {
        "product_id": "product567",
        "title": "Smartphone X",
        "description": "Latest model with 5G connectivity and high resolution camera",
        "price": 999.99,
        "timestamp": "2023-07-24T14:15:00"
      }
    ],
    "purchases": [
      {
        "product_id": "product891",
        "title": "Sports Watch Y",
        "description": "Water resistant with multiple sport modes",
        "price": 199.99,
        "timestamp": "2023-07-24T15:38:00"
      }
    ]
  },
  "affinities": ["Electronics", "Books", "Fashion", "Sports", "Home Decor"]
}
Say we’ve previously defined the following segments:
JavaScript
{
  "segment1": {
    "name": "High Value Electronics Enthusiasts",
    "description": "Users who have viewed or purchased high-priced electronic items",
    "definition": {
      "affinities": ["Electronics"],
      "activity": {
        "views": {
          "price_range": [500, 1000]
        },
        "purchases": {
          "price_range": [500, 1000]
        }
      }
    }
  },
  "segment2": {
    "name": "Budget-Conscious Book Readers",
    "description": "Users who have viewed or purchased medium-priced books",
    "definition": {
      "affinities": ["Books"],
      "activity": {
        "views": {
          "price_range": [10, 25]
        },
        "purchases": {
          "price_range": [10, 25]
        }
      }
    }
  }
}
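Given the above segment1 and segment2 definitions, let’s say that the following actions are set to be triggered on the website:
1. New users in segment1 are invited to sign up and receive a discount on their first purchase.
2. Returning buyers in segment1 are greeted with a free shipping banner.
3. Users in segment2 see books priced $10 - $25 boosted to the top of the “Books” category.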
The user profile is empty the first time a new user lands on the website. Once there’s more activity (clicks & views) the user can be associated with a certain segment and an action can be triggered.
JavaScript
{
  "user_id": "user123",
  "segments": [
    {
      "segment_id": "segment1"
    }
  ]
}
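The association between a profile and the predefined segments can be sketched as a small rule check. The matching rule below is an assumption made for illustration: a user belongs to a segment when they share at least one affinity with it and at least one of their views or purchases falls inside the segment's price range. The profile and segment data are trimmed versions of the examples above.

```python
# Trimmed version of the user123 profile (only the fields the rule needs).
user_profile = {
    "user_id": "user123",
    "activity": {
        "views": [
            {"product_id": "product234", "price": 14.99},
            {"product_id": "product567", "price": 999.99},
        ],
        "purchases": [{"product_id": "product891", "price": 199.99}],
    },
    "affinities": ["Electronics", "Books", "Fashion", "Sports", "Home Decor"],
}

# Trimmed version of the segment definitions.
segments = {
    "segment1": {
        "affinities": ["Electronics"],
        "activity": {"views": {"price_range": [500, 1000]},
                     "purchases": {"price_range": [500, 1000]}},
    },
    "segment2": {
        "affinities": ["Books"],
        "activity": {"views": {"price_range": [10, 25]},
                     "purchases": {"price_range": [10, 25]}},
    },
}

def matches_segment(profile, definition):
    """Assumed rule: shared affinity AND a view or purchase within the price range."""
    if not set(definition["affinities"]) & set(profile["affinities"]):
        return False
    for activity_type, rule in definition["activity"].items():
        low, high = rule["price_range"]
        events = profile["activity"].get(activity_type, [])
        if any(low <= event["price"] <= high for event in events):
            return True
    return False

def assign_segments(profile, segments):
    """Return the ids of all segments the profile matches."""
    return [segment_id for segment_id, definition in segments.items()
            if matches_segment(profile, definition)]

assigned = assign_segments(user_profile, segments)
print(assigned)
```

Under this rule a brand-new visitor with an empty profile matches no segment, which is consistent with the profile being empty on the first landing.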
In his first session, user123 clicks on a few products priced between $500 and $1,000, which associates him with segment1. When that happens, based on the actions defined above, he is invited to sign up and receive a discount on his first purchase. As a result, user123 ends up purchasing an item.
Only new users who are part of segment1 see this prompt. At the same time, because user123 is not a returning user yet, the 2nd action described above (free shipping banner) is not triggered.
However, upon revisiting the “Books” category, books priced $10 - $25 are boosted to the top for user123 because the 3rd action described above is triggered.
| user123 | Action 1 | Action 2 | Action 3 |
|---|---|---|---|
| Condition | new user & segment1 | returning buyer & segment1 | segment2 |
| Triggered | Yes | No | Yes |
A few days later user123 returns to the website and this time he’s greeted with a free shipping banner because he’s part of segment1.
| user123 | Action 1 | Action 2 | Action 3 |
|---|---|---|---|
| Condition | new user & segment1 | returning buyer & segment1 | segment2 |
| Triggered | No | Yes | Yes |
Additionally, when he revisits the "Books" section he is once again presented with a list of books that have been boosted based on his pricing preferences. Notice that segment2 is not dependent on the type of user (new or returning), thus this action is triggered in both scenarios.
A question you might ask is: can we embed the user type within the definition of our segments? Short answer: of course, that's a valid approach. The only nuance is that we'd end up with three segments instead of two and of course the 3rd action should be tied to segment3.
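The two scenarios in the tables above can be reproduced with a small rule evaluator. This is a sketch only: the action names and the `user_type` labels are hypothetical, and a real system would typically keep such rules in a merchandising tool rather than in code.

```python
# Hypothetical encoding of the three actions: each is tied to a segment and,
# optionally, to a user type (None means the action applies to any user type).
ACTIONS = [
    {"name": "signup_discount_prompt", "user_type": "new", "segment": "segment1"},
    {"name": "free_shipping_banner", "user_type": "returning_buyer", "segment": "segment1"},
    {"name": "boost_books_10_to_25", "user_type": None, "segment": "segment2"},
]

def triggered_actions(user_type, user_segments):
    """Return the names of the actions whose segment and user-type conditions hold."""
    return [
        action["name"]
        for action in ACTIONS
        if action["segment"] in user_segments
        and action["user_type"] in (None, user_type)
    ]

first_visit = triggered_actions("new", ["segment1", "segment2"])
later_visit = triggered_actions("returning_buyer", ["segment1", "segment2"])
print(first_visit)
print(later_visit)
```

Embedding the user type in the segment definitions, as suggested above, would remove the `user_type` field from the actions at the cost of one extra segment.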
AI/ML makes hyper-personalization possible and in turn 1:1 personalization makes human intervention optional. As opposed to manual segmentation, AI-powered personalization doesn’t require a merchandiser to define the conditions under which personalization kicks in.
Take for example the same user as above (user123). In the context of 1:1 personalization, when conducting a search the results would immediately be re-ranked based on his affinities. The same would be true for the rest of the users that have a valid user profile. Thus, even when they use the exact same keywords to search, each user will see slightly different results. Or to be more precise: the results will be re-ranked based on their preferences. That’s true personalization.
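To make the re-ranking idea concrete, here is a minimal sketch in which each search hit has a base relevance score and receives a bonus when its category appears in the user's affinities. The scoring formula, the `boost` parameter and the sample hits are assumptions for illustration and do not describe how any particular search engine implements personalization.

```python
def rerank(results, affinities, boost=1.0):
    """Re-rank hits by adding an affinity bonus to the base relevance score.

    Assumption: affinities are ordered from strongest to weakest, so earlier
    entries receive a larger share of the boost.
    """
    weights = {
        category: boost * (len(affinities) - position) / len(affinities)
        for position, category in enumerate(affinities)
    }
    return sorted(
        results,
        key=lambda hit: hit["score"] + weights.get(hit["category"], 0.0),
        reverse=True,
    )

# Hypothetical hits returned for the same query, with base relevance scores.
results = [
    {"title": "Running Shoes", "category": "Sports", "score": 0.9},
    {"title": "Smartphone X", "category": "Electronics", "score": 0.8},
    {"title": "Novel Z", "category": "Books", "score": 0.7},
]

for_user_a = [hit["title"] for hit in rerank(results, ["Electronics", "Books"])]
for_user_b = [hit["title"] for hit in rerank(results, ["Sports"])]
print(for_user_a)
print(for_user_b)
```

Both users typed the same query, yet the first sees the smartphone at the top while the second keeps the original order.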
As we’ve seen with the other strategies, there are nuances when it comes to applying personalization for new users vs. returning users.
1:1 personalization for new users implies real-time personalization. But there’s a caveat to that: as we’ve seen in the previous sections, there’s not much you could know about the user in the first second they landed on your website as far as 1:1 personalization goes. Yes, you might have access to location & device, but that doesn’t say anything about the user's individual preferences.
Would a 60-second session make any difference in the richness of the user profile?
Maybe: if the user actually interacts with the website. Otherwise, it’s back to content-based strategies instead of 1:1 personalization. Then, the question is: how many interactions are enough to build a decent user profile?
Can we start personalizing after the 1st click?
Not really, because that’s when you run into over-personalization: the content that’s being boosted is overly narrow and lacks diversity. This can lead to several issues:
One way to take into consideration the end-user explicit feedback on whether they’re ready to receive a personalized experience or not is to actually include the option in the search bar, a pattern that is used by Instagram:
Based on our experience, it would seem that the best user profiles in the context of real-time personalization are those that balance session duration with volume & depth of interactions. That’s what we’d call a personalization-ready session capable of generating decent user profiles that in turn can represent the basis of a 1:1 personalization strategy.
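A personalization-ready check could look like the sketch below. Everything in it is an assumption: the session format, the use of distinct products as a proxy for depth, and especially the threshold values, which would need to be tuned against real traffic.

```python
def is_personalization_ready(session, min_duration_s=60, min_interactions=5,
                             min_distinct_products=3):
    """Check whether a session balances duration, volume and variety of interactions.

    All thresholds are hypothetical defaults, not recommended values.
    """
    interactions = session["interactions"]
    distinct_products = {interaction["product_id"] for interaction in interactions}
    return (
        session["duration_s"] >= min_duration_s
        and len(interactions) >= min_interactions
        and len(distinct_products) >= min_distinct_products
    )

# A session with a single click: too thin to personalize without over-fitting.
thin_session = {
    "duration_s": 20,
    "interactions": [{"product_id": "product567", "type": "click"}],
}

# A longer session with several interactions across different products.
rich_session = {
    "duration_s": 180,
    "interactions": [
        {"product_id": "product567", "type": "view"},
        {"product_id": "product567", "type": "click"},
        {"product_id": "product891", "type": "view"},
        {"product_id": "product891", "type": "click"},
        {"product_id": "product234", "type": "view"},
    ],
}

print(is_personalization_ready(thin_session))
print(is_personalization_ready(rich_session))
```

Sessions that fail the check would fall back to the content-based strategies described earlier.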
Here’s a simple user profile that is generated based on a user's product clicks and views. It's important to note that this is a simple representation and real-world scenarios would require more sophisticated techniques:
JavaScript
// Sample user activity data.
let userActivity = {
  "user123": {
    "clicks": [
      {"productId": "product567", "attributes": ["Electronics", "Smartphone"]},
      {"productId": "product891", "attributes": ["Sports", "Watch"]}
    ],
    "views": [
      {"productId": "product234", "attributes": ["Books", "Novel"]},
      {"productId": "product567", "attributes": ["Electronics", "Smartphone"]}
    ]
  }
};

// Calculate user profile
function calculateUserProfile(userId) {
  let userProfile = {"userId": userId, "affinities": []};
  let user = userActivity[userId];
  if (user) {
    let activity = [...user.clicks, ...user.views];
    // Count how often each attribute appears across clicks and views
    let attributeCount = {};
    for (let action of activity) {
      for (let attribute of action.attributes) {
        if (attributeCount[attribute]) {
          attributeCount[attribute]++;
        } else {
          attributeCount[attribute] = 1;
        }
      }
    }
    // Order attributes from most to least frequent
    userProfile.affinities = Object.entries(attributeCount)
      .sort((a, b) => b[1] - a[1])
      .map(item => item[0]);
  }
  return userProfile;
}

console.log(calculateUserProfile("user123"));
The resulting user profile can then be passed to Algolia for personalizing search results in real time:
JavaScript
const algoliasearch = require('algoliasearch');

// Connect to Algolia
const client = algoliasearch('YourApplicationID', 'YourAdminAPIKey');
const index = client.initIndex('products');

// User profile from previous code
const userProfile = {
  "userId": "user123",
  "affinities": ["Electronics", "Smartphone", "Books", "Novel", "Sports", "Watch"]
};

// Generate optional filters based on user profile
let optionalFilters = userProfile.affinities.map(affinity => `category:${affinity}`);

// Search Algolia with user filters
index.search(
  'query string',
  {optionalFilters: optionalFilters}
).then(({ hits }) => {
  console.log(hits);
}).catch(err => {
  console.log(err);
});
1:1 personalization truly shines for non-authenticated, authenticated and repeat buyers because these types of users tend to convert better than new users. In the previous section I gave the example of personalizing search results. Let’s look at how we might personalize recommendations for returning users.
JavaScript
const algoliasearch = require('algoliasearch');
const recommend = require('@algolia/recommend');

// Algolia credentials and settings
const appId = 'YourApplicationID';
const apiKey = 'YourAPIKey';
const indexName = 'YourIndexName';

// Initiate Algolia client
const client = algoliasearch(appId, apiKey);
const recommendClient = recommend(client);

// User profile from previous code
const userProfile = {
  "userId": "user123",
  "affinities": [
    "Electronics",
    "Smartphone",
    "Books",
    "Novel",
    "Sports",
    "Watch"
  ]
};

recommendClient.getRelatedProducts({
  indexName,
  model: 'bought-together', // The model can be bought-together or viewed-together
  objectID: [currentObjectID], // Replace with an actual objectID from your data
  maxRecommendations: 5,
  queryParameters: {
    // Only display products matching any of the selected affinities
    // (a nested array is evaluated as OR; a flat array would require all of them)
    facetFilters: [userProfile.affinities.map(affinity => `category:${affinity}`)]
  }
}).then(({ recommendations }) => {
  console.log(recommendations);
}).catch(err => {
  console.log(err);
});
1:1 personalization mostly applies to search, browse or recommendations. But it’s not limited to that. You can make use of it for sending intelligent push notifications or email messages. Or even in conversational interfaces like chatbots - explore some of these use-cases in the Algolia DevCon ‘23 live coding session on a generative AI e-commerce framework to assist with “long tail” merchandising.