May 5th, 2016 | Product
Search relevance is always top of mind at Algolia. It is one of our differentiating factors, and we are always pushing ourselves to make sure our search engine is as relevant as possible.
One of the ways we ensure relevance is with our custom ranking feature, which is a very powerful tool if you know how to use it. One issue you may run into, however, is that custom ranking attributes that span a wide range of values may have too fine a granularity. Think for example of the number of views a photo might have. If you want to take multiple custom ranking attributes into consideration in order to get a good mix of results, you need to reduce the precision of this attribute or the other attributes may never be used.
To understand why, it’s important to revisit Algolia’s tie-breaking algorithm. Every time multiple records have the same score, we bucket them out based on the currently examined ranking factor and then create smaller and smaller buckets until we have exhausted our ranking factors.
If records have gotten through all of the textual relevance factors and are still tied, we take a look at custom ranking factors. Let’s say that our custom ranking is set up like this:
1. Photo views (descending), with our most popular photos having millions of views and new photos having 0
2. Number of likes (descending), with values ranging from 0 to thousands
3. Date added (descending)
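For reference, a custom ranking like this could be configured in your index settings. Here is a minimal sketch using the algoliasearch JavaScript client (v4-style API); the app ID, API key, and index name are placeholders:

// Configure custom ranking factors; they are applied in order, as successive tie breakers.
import algoliasearch from 'algoliasearch';

const client = algoliasearch('YOUR_APP_ID', 'YOUR_ADMIN_API_KEY');
const index = client.initIndex('photos');

index.setSettings({
  customRanking: ['desc(views)', 'desc(likes)', 'desc(created_at)'],
});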
Since we want the most popular photos to be displayed first, our first factor achieves this. But in most cases it will be the only factor considered, because the values of this attribute are so precise. Consider this example: we have six photos tied in textual relevance, with the following custom ranking attributes:
[
  { objectID: 1, views: 1000, likes: 51, created_at: 1455473199 },
  { objectID: 2, views: 5341, likes: 5, created_at: 473623681 },
  { objectID: 3, views: 1000, likes: 10, created_at: 348862182 },
  { objectID: 4, views: 10, likes: 0, created_at: 1447351798 },
  { objectID: 5, views: 25, likes: 0, created_at: 1455905431 },
  { objectID: 6, views: 9768, likes: 0, created_at: 1771524665 }
]
In this case, the photos would be ranked by descending number of views (9768, 5341, 1000, 1000, 25, 10). Since only two of them land in the same bucket (views equal to 1000), we examine the second custom ranking criterion only for those two photos. And because their like counts differ, we never look at the created date at all.
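To make the bucketing concrete, here is a toy comparator in TypeScript that mimics this tie-breaking over the records above. It illustrates the behavior; it is not Algolia's internal implementation:

type Photo = { objectID: number; views: number; likes: number; created_at: number };

const photos: Photo[] = [
  { objectID: 1, views: 1000, likes: 51, created_at: 1455473199 },
  { objectID: 2, views: 5341, likes: 5, created_at: 473623681 },
  { objectID: 3, views: 1000, likes: 10, created_at: 348862182 },
  { objectID: 4, views: 10, likes: 0, created_at: 1447351798 },
  { objectID: 5, views: 25, likes: 0, created_at: 1455905431 },
  { objectID: 6, views: 9768, likes: 0, created_at: 1771524665 },
];

// Each factor is only consulted when all of the previous factors are tied.
const byCustomRanking = (a: Photo, b: Photo): number =>
  b.views - a.views ||          // 1. photo views (descending)
  b.likes - a.likes ||          // 2. number of likes (descending)
  b.created_at - a.created_at;  // 3. date added (descending)

// Yields objectIDs 6, 2, 1, 3, 5, 4: likes break the 1000-view tie,
// and created_at is never consulted.
const ranked = [...photos].sort(byCustomRanking);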
This doesn't matter if you only want like count and created date to serve as tie breakers. But it matters a lot if you want your results to display a good mix of well-viewed photos, well-liked photos, and new photos.
Because of the precision of the number of views attribute, you’re not much better off in this case than if you had only used this one attribute for your custom ranking.
Quite simply, we need to shrink the range of values by converting fine-grained, continuous values into coarse, discrete ones. We can do this in a few different ways, each with its own benefits and drawbacks.
The first way to do this is to create tiers of values. That is, you separate your values into deciles, quartiles, centiles, or any other set of equal tiers you desire, and then send to Algolia the tier each record belongs to. Our record would then look like this (with 10 being the highest tier):
{ objectID: 1, views_tier: 10, likes_tier: 10, created_at: 1455473199 }
This can be done in-memory or in common databases and is best done with values that don’t change often.
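As an illustration, the tier computation could be done offline before indexing. Here is a TypeScript sketch; the helper name and the decile choice are ours, not a library API:

// Map each distinct value to its tier (1 = lowest, `tiers` = highest),
// based on its position in the sorted list of values.
function toTiers(values: number[], tiers = 10): Map<number, number> {
  const sorted = [...values].sort((a, b) => a - b);
  const tierOf = new Map<number, number>();
  sorted.forEach((v, i) => {
    tierOf.set(v, Math.floor((i / sorted.length) * tiers) + 1);
  });
  return tierOf;
}

const viewTiers = toTiers([1000, 5341, 1000, 10, 25, 9768]);
// viewTiers.get(9768) === 9, viewTiers.get(10) === 1
// Send views_tier: viewTiers.get(record.views) with each record.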
Another easy way of creating tiers is to reduce the precision of the data itself, independently of the other values. For example, a date could be sent by day (20160119) or by hour (2016011922).
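In TypeScript, truncating a Unix timestamp to those precisions could look like this sketch (the helper names are illustrative):

// Reduce a Unix timestamp to a sortable day-level integer (YYYYMMDD, UTC).
function toDayPrecision(unixSeconds: number): number {
  const d = new Date(unixSeconds * 1000);
  return d.getUTCFullYear() * 10000 + (d.getUTCMonth() + 1) * 100 + d.getUTCDate();
}

// Same idea at hour-level precision (YYYYMMDDHH, UTC).
function toHourPrecision(unixSeconds: number): number {
  return toDayPrecision(unixSeconds) * 100 + new Date(unixSeconds * 1000).getUTCHours();
}

toDayPrecision(1455473199);  // 20160214
toHourPrecision(1455473199); // 2016021417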
Another option is to take the logarithm of the values, rounding down to the nearest integer. Whether it's a natural log, log10, or any other base doesn't matter much, which keeps the calculation simple.
This also creates larger buckets at the high end, which is valuable because there's a much larger difference between 10 views and 1,000 views than between 1,000,010 views and 1,001,010 views.
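In code, this is a one-liner. A sketch, adding 1 to the value so that records with 0 views stay valid (log10(0) is -Infinity):

// Collapse raw view counts into coarse logarithmic buckets.
const viewsBucket = (views: number): number => Math.floor(Math.log10(views + 1));

viewsBucket(10);      // 1
viewsBucket(1000);    // 3
viewsBucket(1000010); // 6
viewsBucket(1001010); // 6 -> same bucket, despite 1,000 more views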
A final option is to create a custom score at indexing time. This isn’t really a great option because you lose a lot of what makes Algolia so powerful. We will go into the pros and cons of this approach in an upcoming blog post.
So what’s the right approach for your situation? It really depends on how often your data changes and how many pieces of data there are. With data that changes very often or with a large set of records, a logarithm might make more sense. For records where values are clumped closely together, perhaps a tiering system would work best. In general, we go with the logarithmic system, but give both a try and see what works best for you!