
It’s common to search by color on an ecommerce website. Unfortunately though, a purely text-based search index might not have all the information it needs to return the most relevant results when a user wants to search by color.

For example, searching for a “white t-shirt” might return results with thumbnails of clearly red or blue t-shirts just because their descriptions mention that the same cut also comes in white. Whether it’s technically correct or not, including those results gives the user the impression that our search engine is broken. What can we do about this?

The logical next step is to build a system that can automatically identify the color of the object in the thumbnail. Some open-source scripts exist specifically for this, like josip/node-colour-extractor or lokesh/color-thief.

However, they largely work by finding the most common pixel value in an image, which comes with a few problems:

  1. In most cases, that just gives us the color of the background, since it takes up the most space in the image.
  2. It also returns the value in RGB, which is too precise to be useful in our context. We need general words that a user might include in a search query.

So instead, in this article we’re going to make something closer to Vue.ai, which identifies the foreground color as a queryable word. That commercial application will work better than what we come up with here, but if you’d like to see the process, or to judge whether our approach fits your project’s requirements, read on!

Our chosen method

This problem is relatively complex, so the potential solutions are numerous. Most state-of-the-art computer vision frameworks nowadays go down the deep learning path, classifying images with convolutional neural networks. This approach leads to some astounding results, but the huge datasets and specialized hardware it requires place it slightly out of the scope of an experiment like this.

Deep learning frameworks are also notoriously hard to set up and run, and we wanted to release this as open-source so you can take a stab at it. Here’s the process we settled on:

  1. Preprocessing
  2. Focusing on what we’re classifying
  3. Finding clusters of similar pixels
  4. Picking names for the colors

Preprocessing

Since we were imagining using this on a fashion ecommerce website, we thought it would make sense to crop the image to just the foreground. That should make it easier for our algorithm to identify which part of the image really matters for our use case, since it’ll take up more of the frame. Also, because we only cared about the primary color of the main object in the picture, fine detail was probably just going to confuse our algorithm and lengthen the processing time.

So our next step was to shrink all of the images down to about 100x100px. Our tests showed that this was close to optimal for our case. Here’s what that process looked like for this image:

 

[Image: Resizing and cropping (original on the left)]
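Here’s a minimal sketch of what that resizing step might look like with Pillow. The helper name and the resampling filter are our illustrative choices here, not necessarily what the final script does:

```python
from PIL import Image

TARGET_SIZE = (100, 100)

def preprocess(path: str) -> Image.Image:
    """Load an image and shrink it to roughly 100x100 pixels."""
    img = Image.open(path).convert("RGB")
    # thumbnail() preserves the aspect ratio, so the result fits within
    # 100x100 instead of being distorted to exactly that size.
    img.thumbnail(TARGET_SIZE, Image.LANCZOS)
    return img
```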

Focusing on what we’re classifying

The background could be plain white or very complex, but either way, we don’t want to get any data from it. How do we separate it from the data that we care about? It’d be easy to make rules or assumptions, like mandating that the background be a plain color or where the main object should be in the picture.

But to allow for broader use cases, we’ll try to combine several more common algorithms to handle reasonably complex backgrounds.

Let’s start by thresholding the image: take the delta E distance between each pixel and the four corner pixels of the image, and consider a pixel part of the background if the result is below some threshold. This step is particularly useful on more complex backgrounds because it doesn’t rely on edge detection, but that also makes it vulnerable to incorrectly picking up gradients and shadows.

 

[Image: Global thresholding struggling with shadows]
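In code, that corner-based pass could look something like this (using scikit-image; we use the simple CIE76 variant of delta E here, and the threshold value is only illustrative):

```python
import numpy as np
from skimage.color import rgb2lab, deltaE_cie76

def background_mask(rgb: np.ndarray, threshold: float = 15.0) -> np.ndarray:
    """Mark pixels whose delta E to any corner pixel is below `threshold`."""
    lab = rgb2lab(rgb)  # rgb is an (H, W, 3) float array in [0, 1]
    h, w, _ = lab.shape
    corners = [lab[0, 0], lab[0, w - 1], lab[h - 1, 0], lab[h - 1, w - 1]]
    mask = np.zeros((h, w), dtype=bool)
    for corner in corners:
        # Compare every pixel against this corner's color.
        dist = deltaE_cie76(lab, corner[np.newaxis, np.newaxis, :])
        mask |= dist < threshold
    return mask  # True where the pixel likely belongs to the background
```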

To fix that, we’ll use flood filling and edge detection to remove any shadows and gradients. The technique is largely inspired by an article on the Lyst engineering blog. The trick here is that backgrounds are usually defined by the absence of sharp edges, so if we smooth the image and apply a Scharr filter, we’ll usually get a clear outline of our foreground.

The caveat is just that complex backgrounds usually fool this test, so we’ll need to combine it with the thresholding from earlier for the best results.

[Image: Shadows are handled by edge detection and flood filling]
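Sketched out, that edge-based pass might look like this (again with scikit-image; the smoothing sigma and edge threshold are illustrative, not tuned values):

```python
import numpy as np
from skimage.color import rgb2gray
from skimage.filters import gaussian, scharr
from skimage.segmentation import flood

def background_by_flooding(rgb: np.ndarray, edge_threshold: float = 0.05) -> np.ndarray:
    """Flood-fill the background inward from the corners, stopping at edges."""
    smoothed = gaussian(rgb2gray(rgb), sigma=1)
    edges = scharr(smoothed) > edge_threshold  # True on sharp transitions
    h, w = edges.shape
    mask = np.zeros_like(edges, dtype=bool)
    for seed in [(0, 0), (0, w - 1), (h - 1, 0), (h - 1, w - 1)]:
        if not edges[seed]:
            # flood() grows the region of identical values around the seed,
            # so it stops as soon as it reaches the foreground's outline.
            mask |= flood(edges, seed)
    return mask
```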

Sometimes one of these two steps misfires and removes far too many pixels, in which case we have to ignore the result of our background detection entirely. They should work well on clean input, though, so we’ll call that part done.

The last thing we’ll want to remove from our image before trying to classify the color is skin. Images of clothing worn by models (especially swimwear) will often contain more pixels representing skin than representing the clothing item, so we’ll need to get it out of the image or our algorithm will just return that as the primary foreground color.

This is another rabbit hole we could easily dive down, but the simpler answer is to just filter out pixels in that general color range. We chose a filter that tolerates some false negatives to reduce the risk of classifying orange-tinted clothes entirely as skin, but because that’s still a possibility, we made this step completely optional in the final script.

 

[Image: Detecting skin pixels]
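One common heuristic for this kind of filter is a fixed chroma range in the YCbCr space. The bounds below are a classic rule of thumb from the skin-detection literature, not the exact values our script uses:

```python
import numpy as np
from skimage.color import rgb2ycbcr

def skin_mask(rgb: np.ndarray) -> np.ndarray:
    """Flag pixels whose chroma falls in a typical skin-tone range."""
    ycbcr = rgb2ycbcr(rgb)  # rgb is an (H, W, 3) float array in [0, 1]
    cb, cr = ycbcr[..., 1], ycbcr[..., 2]
    # Widely used bounds: Cb in [77, 127] and Cr in [133, 173].
    return (cb >= 77) & (cb <= 127) & (cr >= 133) & (cr <= 173)
```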

Finding clusters of similar pixels

Earlier we mentioned that raw RGB values weren’t going to do; we need textual output, something a user might actually search for on a fashion ecommerce website. But now that we’ve isolated the clothing item in the image, how do we extract such a subjective label from it?

In the blouse the woman is wearing in the image above, you can see how the precise color changes from pixel to pixel depending on how the fabric folds in the wind, where the light source is, and how many layers of fabric are visible at that exact spot. We need to find clusters of pixels of similar color, average each one, and figure out which color category it belongs to.

We don’t know how many clusters we’ll need, though, since different pictures can contain different numbers of distinct colors in the foreground. We can work that out with the jump method: set some error threshold and only include pixels in a cluster if they’re (a) connected to that cluster and (b) within the threshold color distance of the pixel at the center of the cluster. This creates as many clusters as appear necessary, and then (if we wanted to) we could walk along the borders of the clusters and re-evaluate which group each pixel should belong to.

That step would give us finer edges, but it’s not really necessary for our use case, and it would just waste processing time.
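To make the idea concrete, here’s a heavily simplified sketch of threshold-driven clustering. It drops the connectivity requirement and the border re-evaluation, and the threshold is illustrative; the point is just that the number of clusters emerges from the data rather than being fixed up front:

```python
import numpy as np

def cluster_colors(pixels: np.ndarray, threshold: float = 20.0):
    """Greedy one-pass clustering: a pixel joins the first cluster whose
    running mean color is within `threshold` (Euclidean distance in Lab);
    otherwise it seeds a new cluster."""
    means, counts = [], []
    labels = np.empty(len(pixels), dtype=int)
    for i, p in enumerate(pixels):  # pixels: (N, 3) array of Lab values
        for j, mean in enumerate(means):
            if np.linalg.norm(p - mean) < threshold:
                # Fold the pixel into the matched cluster's running mean.
                means[j] = (mean * counts[j] + p) / (counts[j] + 1)
                counts[j] += 1
                labels[i] = j
                break
        else:
            means.append(p.astype(float))
            counts.append(1)
            labels[i] = len(means) - 1
    return labels, means
```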

This process gives us few enough clusters that similar colors are grouped together (for easy categorization):

[Image: Clustering a gray T-shirt (from left to right: original, background, skin, clusters)]

 

At the same time, distinctly different colors in the foreground still get separate clusters:

 

[Image: Rainbow clusters (from left to right: original, background, clusters)]

Picking names for the colors

The last step is categorizing our clusters’ average colors into readable English names. That sounds like a difficult problem, given how subjective (not to mention culturally loaded) the boundaries of each color category are. We turned to a k-nearest-neighbors algorithm to assign color names to RGB values, thanks to the XKCD Color Survey. The survey gives us 200,000 RGB values labeled with 27 different color names (black, green, teal, and so on) that we use to train a scikit-learn KNeighborsClassifier.
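The training step itself is short. This sketch assumes you’ve already loaded the survey into an array of RGB triples and a matching list of names; the value of k is illustrative:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def train_color_namer(rgb_values: np.ndarray, names: list) -> KNeighborsClassifier:
    """Fit a KNN classifier mapping (R, G, B) triples to color names."""
    clf = KNeighborsClassifier(n_neighbors=5)
    clf.fit(rgb_values, names)
    return clf

# namer = train_color_namer(survey_rgb, survey_names)
# namer.predict([[250, 250, 252]])  # -> e.g. array(['white'])
```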

It works fairly well, but there is one big edge case in which it fails: gray. In the RGB system (where each channel is a dimension), the shades of gray all lie along the diagonal axis where R = G = B, running straight through the middle of the space. That makes it hard to define a region where every value we want categorized as gray is closer to the center of the gray region than to the centers of the regions representing other colors.

We ended up adding an extra step that computes the distance between an RGB value and its projection onto that gray axis; if the distance is below a certain threshold, we call the color gray and skip the classifier. The overall process isn’t especially elegant and it has its drawbacks, but it works well for this case.
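The projection is cheap to compute: the nearest gray to any RGB point is just its channel mean repeated three times. A small sketch (the threshold is illustrative):

```python
import numpy as np

def is_gray(rgb: np.ndarray, threshold: float = 12.0) -> bool:
    """True if `rgb` (a length-3 array) lies close to the gray axis."""
    projection = np.full(3, rgb.mean())  # nearest point on the R = G = B axis
    return float(np.linalg.norm(rgb - projection)) < threshold
```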

The final result

This experiment is available for you to mess around with on GitHub! It’s a standalone Python script that you can use to enrich your search index with extracted color tags, and we also made a library and a CLI for your convenience. Everything is completely configurable: we’re using sane defaults but left the option open to tweak every step above. You can even add support for other languages by editing the color dataset. We’re not using deep learning (yet), but we managed to hit ~84% accuracy even without the dramatic boost that route would give us.

Given that this tool is designed to be used in conjunction with textual search, boosting images whose colors match the query, this seems like a great place to leave version 1. Check back next time for version 2!

About the author
Léo Ercolanelli

Software Engineer
