Product

How do I look? — Image recommendation AI in practice

Hey there! We’re Paul-Louis Nech and Jaden Baptista, two of the team members here at Algolia. We’d like to show you something cool the team has been working on.

Searching by vibe

To get things rolling, let us pose you this question: when you search for something on Google, do you expect that every result contains every single word of your query in that exact order? Probably not. Why? Because you’re not necessarily searching for keywords — that’s not how humans think, that’s how computers think. You’re expecting Google to find the results most relevant to your query, which don’t necessarily have to match any of its keywords. For example, if you Google “soccer cups” in America, you’re not exactly expecting drinking vessels for a birthday party with little soccer balls on them. Yet, that’d probably be the first result if Google used a strict keyword search, since that’s the literal meaning of the query. Instead, you’re probably looking for something like the Wikipedia article called “List of association football competitions”, which has no actual keyword matches in the title.

What an AI came up with for a soccer cup… which is technically not far off.

What’s the point of that little exercise? Here’s the realization all search experts come to at some point: we’re almost never searching for exact keyword matches. Most of the time, we’re searching by vibe, and that involves quantifying the “vibe” of potential results by picking out patterns in the content they contain. Thanks to a lot of the technology Algolia has pioneered, this is a problem we know how to solve.

Applying what we know to image data

The next logical step, though, is to ask: Is textual content the only type of content where we can pick out these patterns and let our users “search by vibe”? There’s no reason why it should be. It makes sense that those patterns should exist in image data, and your own experience probably backs that up. Have you ever been looking for something online, but you didn’t quite have the right query to express it, so you scrolled for a bit through product pictures until you saw what you were looking for and it just clicked? If so, you experienced that type of pattern recognition: you saw in the product image some vague notion of similarity with the thought you had in your head. That’s common among users; they often report not knowing how to express what they’re looking for until they see it, suggesting that something latent in those images matches the image in their mind.

As you can imagine, the main barrier is that the technology to find those patterns in image data isn’t as well-developed as it is for textual data. Well, over the last few years, our colleagues at Algolia have invested in understanding more than textual data. From initial efforts on neural vectors at Search.io to the recent integration of image models, we worked on offering new applications for computer vision — the kind of tech that quantifies those patterns in image data. Since then, we’ve released that work as a finished product that’s 100% available to you right now as a part of Algolia Recommend, with no extra configuration. Add image URLs to your product or content data, and we’ll use them to generate the most effective product or content recommendations in your app.
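Concretely, getting a catalog ready for this comes down to making sure each indexed record carries a public image URL. Here’s a minimal JavaScript sketch of what that might look like — the `image` attribute name, the index name, and the credentials are illustrative assumptions on our part, not anything prescribed by the product:

```javascript
// Pure helper: shape a catalog entry into an Algolia record.
// Using "image" as the attribute name is an assumption for this sketch —
// any attribute holding a publicly reachable image URL would do.
function toRecord({ id, name, imageUrl }) {
  return { objectID: id, name: name, image: imageUrl };
}

const records = [
  { id: 'mosaic-roses', name: 'Roses in a Vase', imageUrl: 'https://example.com/mosaic-roses.jpg' },
  { id: 'mosaic-peonies', name: 'White Peonies', imageUrl: 'https://example.com/mosaic-peonies.jpg' },
].map(toRecord);

// With the official client (npm: algoliasearch), pushing these records
// to an index would look roughly like this:
//   const client = algoliasearch('YOUR_APP_ID', 'YOUR_ADMIN_API_KEY');
//   await client.initIndex('products').saveObjects(records);
```

Once records like these are in your index, the image URLs are all the model needs to start working.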

Our goal was to make it so that you don’t have to be one of the world’s biggest e-commerce firms to make use of computer vision. It’s an expertise- and resource-intensive process, but we wanted to make it easily available to every small business and even every hobbyist. After all, humans are mostly visual creatures, so taking the intuitiveness of “search-by-vibe” and applying it to the medium we understand best represents a huge leap forward in technology, and our whole thing is passing on that technical advancement to everyone.

An example from Algolia

Here’s an example of what we’re talking about: this is a tool that lets you select pieces of art and find other pieces that match their vibe. It’s an Algolia in-house project, but it’s made with the publicly available Algolia Recommend.

Let’s take a look at how that tool works. Take this example first:

Maria Maddalena by Giovanni Bellini

This is the thoughtful portrait of Mary Magdalene (Maria Maddalena in the original Italian) painted around 1500 by the Italian artist Giovanni Bellini as part of a larger religious image. What stands out to you about this piece? Even without a background in art, you might come up with adjectives like “dark”, “pensive”, even “brooding”. Sure enough, if you click on this painting in our discovery tool, you’ll see a good few recommendations that you’d probably describe the same way:

Anna Ónody, A Dancer Of The Ballet Choir Of The National Theater by Karoly Lotz, La Bella by Palma Vecchio, and Tzarina Natalia Alekseevna by Ivan Nikitin

It’s clear to everyone, not just trained professionals, what these images have in common. But describing in detailed text what each image contains — just to make a text-based search and discovery system possible — would be tedious and prone to a whole host of issues. The patterns that make these images similar are going to be far more accurately deduced by a well-trained AI, whereas to us, it’s just vibes✨

You can see how those images differ from something more like this impressionist take on a street on New York’s Lower East Side in 1917:

Houston Street by George Luks

It’s beautiful, but in a completely different way. Sure enough, the recommendations don’t need textual matches to identify the group this piece belongs in — all of the recommendations just feel right:

Portdogue by Salvador Dali, Flower Market At La Madeleine by Edouard Cortes, The Rag Pickers by Robert Spencer

A note from Paul: Back when we first made this, I wanted to include it in a conference talk I was planning, but I lost several of the hours I had set aside for planning just by playing around with this art discovery tool. The talk ended up being shorter than I was hoping.

Examples in the wild

This technology isn’t just for building cool demos, though. Take a look at Mosaic Natural — a company founded by Syrian war refugees that creates beautiful stone mosaic art. They’re using Algolia to power their search and recommendations… but their chosen style of recommendations is customized to their specific use-case. Think about it: they’re selling art, the furthest thing from a generic product. The visual appeal of each individual piece is the entire selling point, so if a customer has shown interest in a particular piece, what kind of recommendations would you show them? The answer is simple: visually similar options would capitalize on their interest and increase the chances of a conversion.

That is exactly what Mosaic Natural did. Take a look at this piece:

A mosaic of roses in a vase, framed and intended to be used as wall art.

Now scroll down to the Visually Similar Mosaics section.

The suggested recommendations include a wall art mosaic of white peonies in a vase, a wall art mosaic of pink peonies in a vase, and a tricolor leaves mosaic rug, as well as three more pages of similar options. Clearly, someone who was looking at the first product would be more likely to be interested in the recommendations. Why? Well, it’s sometimes hard for us to articulate. Looking through the options, they mostly include flowers. Still, that’s not a definitive rule, since not every recommendation contains flowers and not every non-recommended product lacks them. The similarity is the vibe, and Mosaic Natural really loves this one.

They do something similar at DanGuitar.dk, a Danish musical instrument vendor. They’re using Algolia to power their search and discovery as well, and they’re using this type of image recommendation AI too.

Note from Jaden: I really appreciate how DanGuitar is using this tech in a more subtle way, not advertising that the recommendations are based on the image. As a musician who hangs out with a lot of other musicians, I’ve learned that it’s more common than you’d think that people buy instruments based almost completely on outward appearance. The color and finish of an electric guitar, for example, is much more likely to be interesting to an entry-level buyer than, say, the electronics (the part that actually contributes to the sound). That’s especially true if the buyer is actually getting the instrument as a gift for someone else, a fairly common scenario for online instrument stores like this. It’s only really when a musician becomes very advanced that they become particular about the distinct feel and sound of an instrument. Basing recommendations off of visual similarity instead of technical or auditory similarity is a really smart marketing move on their part.

Take a look at this product:

The DanGuitar.dk listing for a Les Paul-style electric guitar. What’s most significant to our research here is the product color, a transparent fade from red to orange called a “sunburst”, very common on guitars.

And the recommendations:

While those recommendations are of different brands (Santana, Sant, Epiphone, Taylor), styles (electric, acoustic, and even classical Spanish), and price points (from $145 USD to over $2,000 USD), they share one feature: their finish. In fact, the first recommendation that isn’t sunburst is way down at spot 11 in the list.

See how useful this tech can be in the real world? Whenever you’re selling something where looks are a key factor, your recommendation system needs to take that into account. It’s a matter of taking one of our existing models, already pretrained to find patterns and categorize data into efficient structures, and using it on your product and customer data. As long as you include image URLs in your product listings in your search index, Algolia Recommend will take care of all of this for you. The best part is that it even auto-updates, since we don’t store those images on our servers anywhere: if you change the URL in the product listing, or if you update the image at that address, Recommend will automatically use the new image.
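To make that concrete, here’s a hedged sketch of what requesting image-based recommendations for a product could look like from JavaScript. The helper below only builds the request payload; the index name, the `objectID`, and the credentials in the commented-out client call are placeholders we made up for illustration:

```javascript
// Build a Recommend request payload for the looking-similar model.
// The index name, objectID, and the default of 6 recommendations are
// illustrative assumptions, not values the product requires.
function lookingSimilarRequest(indexName, objectID, maxRecommendations = 6) {
  return {
    requests: [{ indexName, model: 'looking-similar', objectID, maxRecommendations }],
  };
}

const payload = lookingSimilarRequest('products', 'les-paul-sunburst');

// With the Recommend client (npm: @algolia/recommend), fetching the
// visually similar items would look roughly like this:
//   const client = recommend('YOUR_APP_ID', 'YOUR_SEARCH_API_KEY');
//   const { results } = await client.getRecommendations(payload);
```

The nice part of this shape is that swapping `maxRecommendations` or pointing the request at a different record is a one-line change, so a “Visually Similar” section like the ones above stays a thin layer over the index.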

A more philosophical reasoning

When we talked about writing this article, one point that kept coming up was the psychological and sociological impact of searching by vibe. After all, we humans aren’t built to process huge amounts of data in a reasonable timeframe, so if it wasn’t for search and discovery systems trying to match our human way of thinking, how would we ever find anything on the Internet? Amazon alone gives you access to 350 million products — we’d be completely helpless.

In such a complex world where we’re constantly overloaded with information, image-based AI recommendation systems have a humanistic impact. They connect us to what we want to find more quickly, and they show us content or products that we didn’t even know we wanted until we saw them. Sure, it’s a great revenue-boosting tool for e-commerce companies, but it also raises the quality of life of the average user by juuuuust a little bit every time, because the time they saved can be put to better use.

Ready to get started using this in your app? We’d love to give you a hand. Just sign up here and follow the instructions to set up your first search index and start churning out recommendations. And if you’re curious about what else our AI product recommendation engine can do, check out the docs here.

About the authors

Paul-Louis Nech
Senior ML Engineer

Jaden Baptista
Technical Writer
