Chief Product Officer
Mar 7th 2023 ai
For online businesses, and especially for marketplaces, content discovery can be challenging because of the vast amount of data they must manage: millions of stock-keeping units (SKUs) divided into hundreds of different categories; terabytes of data related to product descriptions and images; hundreds, if not thousands, of different vendors publishing content; and user-generated content (e.g., reviews) that has minimal quality control.
This is one of the areas in which machine learning can really shine, improving content discovery for even the largest B2C or B2B marketplace catalogs. In this post, I’ll share why we’re so bullish on how AI (artificial intelligence) can power search and optimize discovery across marketplaces.
Shoppers often know exactly what they want; at other times, they’re searching for ideas for a possible solution. On marketplaces especially, the journey begins in the search bar.
The Baymard Institute, which studies ecommerce UX, identified various types of queries that buyers input including:
With such a broad array of query types, search engines can struggle to interpret them and suggest the most relevant or correct choices. To add even more complexity, language is often ambiguous. “Bank” can mean a financial institution or the side of a river. The order of words in a query can matter, too: a search for “fly fishing” is very different from “fishing fly”. Buyers don’t always type words in the sequence they intended, and they may misspell words or use long-tail phrases to describe what they want.
This problem is further compounded because items in marketplaces are often placed in one or more categories, such as the safety vests shown below. Search engines would ideally deliver the right product(s) from the correct categories, but for that, they need to understand intent.
The search bar needs to handle all these kinds of queries and return results instantly. Features like autocomplete can improve the process, but autocomplete is only as good as the algorithms that power it.
Broadly speaking, there are three stages of search: understanding the query, retrieving the data, and ranking the results.
Search engines have introduced very sophisticated techniques for query understanding using natural language processing (NLP). On the back end of the process, ranking is influenced by a variety of factors, from merchandising and personalization, to clicks, recent conversions and other signals. The hardest part to date, however, has been retrieval.
For the last two decades, retrieval has been managed through keyword search techniques that perform lookups to match keywords to their location in the search index. More sophisticated information retrieval techniques such as BM25 improve relevance for a given query. Even with improved query matching, however, site owners were still required to add rules, synonyms, keywords, and language packs to manage errors and avoid the dreaded no-results page.
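To make the keyword-retrieval step concrete, here is a minimal sketch of BM25 scoring in plain Python. It assumes whitespace tokenization, the commonly used defaults `k1=1.5` and `b=0.75`, and the smoothed IDF variant popularized by Lucene; the function name `bm25_scores` and the sample product titles are illustrative only, not part of any particular search engine.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document in `docs` against `query` with a BM25 formula."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(d) for d in tokenized) / n_docs
    # Document frequency: how many documents contain each term.
    df = Counter(term for d in tokenized for term in set(d))
    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue  # pure keyword matching: an unseen term adds nothing
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1)
            length_norm = k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores

# Made-up product titles for illustration.
docs = [
    "usb-c charging cable 2m",
    "usb c to hdmi adapter",
    "hdmi cable 1m",
]
hdmi_scores = bm25_scores("hdmi cable", docs)
usbc_scores = bm25_scores("usbc", docs)
```

In this toy catalog, the third title matches both query terms and scores highest for “hdmi cable”, while the query “usbc” matches no token at all and every score is zero. That second case is the no-results situation that rules and synonyms are meant to patch.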
Take the simple example of a search for a USB-C cable. The query can be submitted as “usbc” vs “usb-c” or “usb c”. Searches for these different variations can return many results, no results, or mismatched results. There are workarounds to these problems, but they can be time-consuming and never-ending.
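One such workaround can be sketched as a normalization step that maps every spelling to the same lookup key. This is a deliberately naive illustration, assuming the key is built for a single term; the `normalize` helper is hypothetical, and applying it to a whole multi-word query would also glue unrelated words together.

```python
import re

def normalize(term):
    """Lowercase a term and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", term.lower())

variants = ["usbc", "usb-c", "usb c", "USB-C"]
keys = {normalize(v) for v in variants}
```

All four variants collapse to the single key `usbc`. Each new product family tends to need its own rule of this kind, which is why the workarounds never really end.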
A recent innovation to address these problems has been the introduction of vector embeddings and the use of deep learning to better understand the searcher’s intent. Keywords (and their associated tokens) are relatively binary with respect to search: particular words either exist or they do not. In contrast, AI search uses the mathematics of vectors to measure closeness (e.g., terminal devices and HVAC are in close proximity in vector space), so the relationship between pieces of text is no longer binary but rather a distribution. Vectors can use hundreds, sometimes thousands, of dimensions to determine meaning.
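Closeness in vector space is often measured with cosine similarity. The sketch below uses hand-made three-dimensional vectors purely for illustration; real embeddings are produced by a trained model and have far more dimensions, and the labels and numbers here are assumptions.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy vectors, invented for this example.
embeddings = {
    "air conditioner": [0.9, 0.1, 0.2],
    "hvac": [0.8, 0.2, 0.1],
    "fishing fly": [0.1, 0.9, 0.3],
}

related = cosine_similarity(embeddings["air conditioner"], embeddings["hvac"])
unrelated = cosine_similarity(embeddings["hvac"], embeddings["fishing fly"])
```

Here `related` comes out much closer to 1 than `unrelated`, even though “air conditioner” and “hvac” share no keywords.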
Whereas vector search is very good at delivering results based on similarity, keyword search still provides tremendous value for certain types of queries. When we marry these two technologies together into a single query result, this “hybrid search” offers even more relevance. Results are ordered from most to least relevant based on the combination of keyword and AI scoring. The optimized results are then sent into the ranking stage for a final sort based on machine and user-defined rules. However, to do this at scale requires a fresh approach to the data.
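How the two scores are combined is not spelled out above, so the following is only one plausible sketch: min-max normalize each score set, then take a weighted sum. The `hybrid_rank` function, the `alpha` weight, and the sample scores are all assumptions made for illustration.

```python
def hybrid_rank(keyword_scores, vector_scores, alpha=0.5):
    """Blend two {doc_id: score} dicts and return doc ids, best first."""
    def min_max(scores):
        if not scores:
            return {}
        lo, hi = min(scores.values()), max(scores.values())
        if hi == lo:
            return {doc: 0.0 for doc in scores}
        return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}

    kw, vec = min_max(keyword_scores), min_max(vector_scores)
    blended = {
        doc: alpha * kw.get(doc, 0.0) + (1 - alpha) * vec.get(doc, 0.0)
        for doc in set(kw) | set(vec)
    }
    # Sort by blended score, breaking ties by doc id for a stable order.
    return sorted(blended, key=lambda doc: (-blended[doc], doc))

# Invented scores for three hypothetical products.
keyword_scores = {"cable-a": 2.1, "cable-b": 0.0, "adapter": 1.0}
vector_scores = {"cable-a": 0.70, "cable-b": 0.95, "adapter": 0.20}

balanced = hybrid_rank(keyword_scores, vector_scores)
keyword_only = hybrid_rank(keyword_scores, vector_scores, alpha=1.0)
```

With equal weights, `cable-b` ranks second thanks to its vector score despite having no keyword match; with `alpha=1.0` the ordering falls back to keyword relevance alone.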
I’ve left out one of the most important pieces of this: optimizing vectors for marketplace scale. Scale and speed are absolutely vital for great results. Amazon found that every 100ms of latency cost them 1% in sales, and Akamai found a similar result: a 100-millisecond delay can hurt conversion rates by as much as 7%. (source)
Vectors are long arrays of floating-point numbers that are typically processed on specialized GPUs or high-end servers. Some of the most prominent methods for finding similar vectors efficiently include HNSW (Hierarchical Navigable Small World), IVF (Inverted File), and PQ (Product Quantization, a technique that compresses vectors by splitting them into sub-vectors and replacing each with a short code). Each technique is designed to enhance a specific performance attribute, such as memory reduction with PQ or rapid but accurate search times with HNSW and IVF. To get the best performance for a given use case, it is common practice to combine several components to create a ‘composite’ index.
While these techniques are quite good, they can be extremely expensive to implement and run, as well as ‘brittle’ with any changes to the index. We took a different approach. We use neural hashing to compress, or binarize, vectors to 1/10th their normal size. The new binary vectors are 500 times faster to compute than non-optimized vectors, making them as fast as, and sometimes faster than, simple keyword search. They can be run on commodity hardware and CPUs, so marketplaces can deliver results to their customers instantly without surprise fees.
The advantage of this approach is particularly evident when it comes to marketplace scale. Ecommerce businesses and marketplaces often update their search index daily or even hourly with new products, customer reviews, inventory changes, changing search trends, etc. Thousands, sometimes millions, of changes occur. With our approach, results are automatically adjusted and re-optimized for conversion in near real-time.
Neural hashes combined with keywords offer the best of both worlds by optimizing all queries, from head queries to the long tail. With tight budgets, thin margins, and a global scramble to find machine learning engineers and build data science teams, most businesses can’t afford to build this type of hybrid AI retrieval in-house. We designed our solution to be composable and API-first so that any business can deliver great results out-of-the-box. Sign up here to learn more.
After retrieving relevant results, a search engine needs to rank them. Learning-to-rank (LTR) is a type of machine learning that improves ranking and assists with precision. It includes supervised, unsupervised, and reinforcement learning. There are also variations like semi-supervised learning. Each of these approaches offers AI ranking capabilities that deliver improved results over simpler statistical methods.
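To give a feel for why binary vectors are cheap to compare, here is a simplified stand-in. It is not neural hashing, which relies on a learned model; it assumes plain sign-based binarization and compares the resulting hashes with Hamming distance, which needs only an XOR and a bit count. The function names and vectors are invented for the example.

```python
def binarize(vector):
    """Pack the sign of each component into the bits of an integer."""
    bits = 0
    for value in vector:
        bits = (bits << 1) | (1 if value > 0 else 0)
    return bits

def hamming_distance(a, b):
    """Count the bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")

# Toy four-dimensional vectors, made up for illustration.
query = binarize([0.7, -0.2, 0.4, -0.9])       # 0b1010
similar = binarize([0.5, -0.1, 0.3, 0.2])      # 0b1011
different = binarize([-0.6, 0.8, -0.3, 0.1])   # 0b0101

near = hamming_distance(query, similar)
far = hamming_distance(query, different)
```

The similar vector differs from the query in one bit and the dissimilar one in all four. Integer XOR and bit counting run comfortably on ordinary CPUs, which is the general property that binary representations exploit.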
Positive reinforcement such as clicks, conversions, signups, ratings, etc., can be used to improve ranking automatically. At Algolia, our Dynamic Re-ranking (DRR) capability mainly uses clicks as there is significantly more available data (faster to get to higher confidence), but we also use later events if there is sufficient data. Results can be further refined with personalization, merchandising, and other curated results. While customers can rely on the automation, they still have control to display or rerank results however they want.
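As a rough illustration of click-driven re-ranking in general (this is not how Dynamic Re-ranking is implemented), the sketch below reorders results by a smoothed click-through rate. The `rerank_by_clicks` function, the prior values, and the sample counts are all hypothetical.

```python
def rerank_by_clicks(results, clicks, impressions,
                     prior_clicks=1, prior_impressions=20):
    """Reorder `results` by smoothed click-through rate, highest first."""
    def smoothed_ctr(doc):
        # The prior pulls items with little data toward a low default rate.
        return (clicks.get(doc, 0) + prior_clicks) / (
            impressions.get(doc, 0) + prior_impressions
        )
    # sorted() is stable, so ties keep their original retrieval order.
    return sorted(results, key=smoothed_ctr, reverse=True)

# Invented event counts for three hypothetical products.
results = ["vest-a", "vest-b", "vest-c"]
clicks = {"vest-a": 5, "vest-b": 40, "vest-c": 1}
impressions = {"vest-a": 500, "vest-b": 400, "vest-c": 2}

reranked = rerank_by_clicks(results, clicks, impressions)
```

`vest-c` has a raw click rate of 50%, but from only two impressions, so the smoothing keeps it just below `vest-b`. This is one simple way to reflect the point that more data means higher confidence.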
We highly encourage organizations to connect as much business data as possible so search results can be optimized automatically. In fact, true AI search is also continually learning from clickstream data. There’s a smart interplay between AI-powered retrieval and ranking.
It took Amazon, eBay, and marketplace providers decades and thousands of specialist engineers to build scalable artificial intelligence for search. With our hybrid AI solution, Algolia NeuralSearch™ Search and Discovery, it can now be accomplished by anyone in minutes. Better results improve user experience, overall conversion rates, customer lifetime value, segment growth, and other KPIs.
Algolia NeuralSearch™ Search and Discovery is coming soon! Sign up today to be the first to evaluate Algolia NeuralSearch for your marketplace.