
Locating hard-to-find information without using a search bar may sound paradoxical, but take the example of package labels. Delivery companies structure a label’s information in many different and unpredictable ways. Normally, we just read the labels ourselves and find the relevant information — in this case, who the package is for. But we’d prefer to integrate OCR into search.

But what if there are thousands of labels, all differently structured — different content, sometimes handwritten, in shades of nearly unreadable grays, in odd directions of text, with random lines, graphics, smudges, and torn-off corners? Can OCR help?

Optical character recognition (OCR) makes a best effort to extract text from images, but the resulting text will often be large and unstructured. Additionally, if the OCR software can’t decipher some letters, the text will have typos.

This is where an integrated search technology comes in. A search engine with a robust, adaptable relevance algorithm can match the OCR’s unstructured text against a structured set of data and return the correct results.

Integrate OCR into search

We integrated our search engine with two technologies:

  • Google Cloud Vision API, which handles the OCR
  • BambooHR, our HR system, which holds the structured employee data we search against

Essentially, we scanned a label and used the Google Cloud Vision API to convert the label to text. We then fed the unpredictable output into our search engine, which matched it against the structured data from BambooHR, finding and returning the recipient’s name. Importantly, we didn’t need to pre-process or parse the input data. This workflow also works with stickers, stamps, and even movie posters on a wall.

Online retailers and media companies are leveraging this OCR + search integration to query their back-end systems.

Our story: Why we needed to integrate OCR into search

Every day, Algolia employees receive loads of packages at the Paris office. Kumiko, our office coordinator, had been taking care of them. Every time a new package arrived, Kumiko would read the label to find out who it was for, then find that person on Slack and let them know their package was waiting at the front desk.

But Algolia was rapidly growing. Handling package distribution by hand started taking more and more time for Kumiko. During the holiday seasons, it got really out of hand:

[Image: packages with different labels]

Obviously, manual handling doesn’t scale. I thought there should be a faster, easier, scalable way to help dispatch packages. I decided to build a web application for it. My goal was to automate the process as much as possible, from scanning the label to notifying people on Slack.

I initially thought of using the barcode. Unfortunately, I quickly discovered that a barcode doesn’t contain the same kind of data as a QR code. Most of the time, barcodes only contain EAN identifiers, which are meant for querying private carrier APIs to fetch a package’s details.

So I decided to read the package label with an optical character recognition engine (OCR) and send the OCR text to the search engine as-is, matching it against the correct record in the index.

How to integrate OCR into search

Step 1: Finding the right OCR software

There are several open source libraries for handling the OCR part. The most popular one is Tesseract. However, you typically need to perform some pre-processing on the image before sending it to Tesseract to recognize the characters (e.g., desaturation, contrast adjustment, de-skewing). Also, some of the package labels we receive are handwritten! Tesseract is not good at reading handwritten words.
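
To give a sense of what that pre-processing can involve, here’s a minimal sketch using the sharp image library. This is purely illustrative and not part of the project: sharp is an assumption on my side, and de-skewing would need a separate tool.

const sharp = require("sharp");

// Prepare a label photo for OCR: desaturate and stretch the contrast.
// De-skewing is not shown; it usually requires a dedicated step.
async function preprocessLabel(inputPath, outputPath) {
  await sharp(inputPath)
    .grayscale()  // desaturation
    .normalize()  // stretch luminance to improve contrast
    .toFile(outputPath);
}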

Google’s Vision API offers OCR capabilities, so I decided to give it a go. Among other things, it provides:

  • 1,000 free API calls per month (which is more than enough to start)
  • Handwritten character detection

We’ll see how this works in step 3. First, let’s look at the code that integrated Algolia search with OCR.

Step 2: Creating the React app

I created a React app and installed the React Webcam component to access the device’s camera. Internally, this React component leverages the getUserMedia API.
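
Here’s a minimal sketch of that capture step. The component name, the /api/scan-label endpoint, and the labelImage field name are illustrative assumptions, not the app’s actual code:

import React, { useRef, useCallback } from "react";
import Webcam from "react-webcam";

function LabelScanner({ onMatch }) {
  const webcamRef = useRef(null);

  const capture = useCallback(async () => {
    // getScreenshot() returns the current frame as a base64-encoded data URL
    const imageSrc = webcamRef.current.getScreenshot();

    // Convert the data URL to a Blob and post it to the Express backend
    const blob = await (await fetch(imageSrc)).blob();
    const form = new FormData();
    form.append("labelImage", blob, "label.jpg");

    const response = await fetch("/api/scan-label", { method: "POST", body: form });
    onMatch(await response.json()); // the matching employee record
  }, [onMatch]);

  return (
    <div>
      <Webcam ref={webcamRef} screenshotFormat="image/jpeg" />
      <button onClick={capture}>Scan label</button>
    </div>
  );
}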

Once the user captures a label with their phone, the app sends it to an Express backend, which takes care of proxying the uploaded image to the Google Vision API. Vision then returns a JSON payload containing the recognized text.

// Initialize the Google Cloud Vision client
const vision = require("@google-cloud/vision");
const visionClient = new vision.ImageAnnotatorClient();

// Ask the Vision API to return the text from the label
// https://cloud.google.com/vision/docs/ocr
const [result] = await visionClient.textDetection({
  image: {
    content: labelImage.data, // Uploaded image data
  },
});

const detections = result.textAnnotations; // This contains all the detected text
const labelText = detections[0].description.replace(/\n/g, " "); // Replace line breaks with spaces
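
For context, the Express route that receives the upload might look roughly like the following sketch. The express-fileupload middleware, the /api/scan-label route, and the findRecipient helper (standing in for the Vision and Algolia calls shown in this post) are all assumptions for illustration:

const express = require("express");
const fileUpload = require("express-fileupload");

const app = express();
app.use(fileUpload()); // exposes uploaded files on req.files

app.post("/api/scan-label", async (req, res) => {
  const labelImage = req.files.labelImage; // the image posted by the React app

  // Run the Vision call from the snippet above, then search Algolia (step 5)
  const employee = await findRecipient(labelImage);
  res.json(employee);
});

app.listen(3000);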

Step 3: Reading the label with Google Vision API 

Here’s what Google Vision gave us (and what we will eventually send as the query to the search engine):

ORY1\n0.7 KG\nDENOIX Clément\nALGOLIA\n55 rue d'Amsterdam\n75008 Paris, France\nC20199352333\nDIF4\nCYCLE\nlove of boo\nAnod

As you can see, labels aren’t pretty. They contain a lot of noise: the relevant information is buried in there, surrounded by data meant for the delivery company, such as label numbers, the sender’s address, and so on. Additionally, the order isn’t consistent and the information isn’t always complete, so we can’t rely on word ordering or element position to extract the relevant sections before sending them to Algolia. We’ll see how the search handles that in Step 5. First, let’s take a look at the back-end data we’ll be searching.

Step 4: Indexing BambooHR’s back-end data

There’s no need to provide much code for this part. Indexing data from other systems is the basis of all search engines: the idea is to take relevant data from one or more systems and push it into a separate data source called an index. This runs on the back end at a frequency that matches how often your data changes. Note that the search engine only needs the data that’s relevant to search, for querying, display, sorting, and filtering.

Algolia’s API provides update methods to achieve this. Our documentation offers tutorials on how to send data.
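
The original project doesn’t show code for this step, but a minimal sketch could look like the following, assuming the employee records have already been fetched from BambooHR. The field names mirror the record shown in step 5; the admin API key variable and the indexEmployees helper are assumptions:

const algoliasearch = require("algoliasearch");

const client = algoliasearch(process.env.ALGOLIA_APP_ID, process.env.ALGOLIA_ADMIN_API_KEY);
const index = client.initIndex(process.env.ALGOLIA_INDEX_NAME);

// `employees` would come from the BambooHR API; we only keep search-relevant fields
async function indexEmployees(employees) {
  const records = employees.map((employee) => ({
    objectID: employee.id,
    displayName: employee.displayName,
    firstName: employee.firstName,
    lastName: employee.lastName,
    location: employee.location,
    slack: employee.slack,
  }));

  await index.saveObjects(records);
}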

Step 5: Searching with Algolia

As you saw, Google’s Vision API gave us great information. But how does the search engine locate the name? 

Fortunately, the Algolia search API has an interesting parameter: removeWordsIfNoResults.

When you set this parameter to allOptional and the engine fails to find any results with the original query, it makes a second attempt, treating all words as optional. This is equivalent to transforming the implicit AND operators between words to OR.

// Initialize the Algolia client and the Algolia employees index.
const algoliasearch = require("algoliasearch");
const algoliaClient = algoliasearch(process.env.ALGOLIA_APP_ID, process.env.ALGOLIA_API_KEY);
const index = algoliaClient.initIndex(process.env.ALGOLIA_INDEX_NAME);

// Search our employees index for a match, using the `removeWordsIfNoResults=allOptional` option.
// https://www.algolia.com/doc/api-reference/api-parameters/removeWordsIfNoResults/
const algoliaResult = await index.search(labelText, {
  removeWordsIfNoResults: 'allOptional'
});

Note that labelText contains the exact string the Google Vision API sent back, without any preprocessing (except replacing the '\n' line breaks with spaces). The name (DENOIX Clément) is what the search engine pulls out from the noise on the label, the cherished needle in the haystack:

ORY1 0.7 KG DENOIX Clément ALGOLIA 55 rue d'Amsterdam 75008 Paris, France C20199352333 DIF4 CYCLE love of boo Anod

Usually, this parameter helps improve results when a query is too restrictive. In my case, it allowed me to send the extracted data unprocessed: I could trust the Algolia engine to “ignore” the extraneous words in my query and only take the important ones into account. Here’s the matching employee record it returned:

{
  "displayName": "Clement Denoix",
  "firstName": "Clement",
  "lastName": "Denoix",
  "location": "Paris",
  "slack": {
    "id": "U0000000",
    "handle": "clement.denoix",
    "image": "https://avatars.slack-edge.com/2018-04-03/340713326613_2890719b5a8d4506f30c_512.jpg"
  }
}

This left only a few steps: extracting the first hit from the list of Algolia search results and displaying it. From there, our office manager could confirm the result, and automatically send a Slack message to the right employee.
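
That notification step isn’t shown in the original code either. Here’s a minimal sketch of what it could look like with the @slack/web-api client, assuming a bot token and reusing the record shape above; the notifyEmployee helper is a placeholder name:

const { WebClient } = require("@slack/web-api");

const slackClient = new WebClient(process.env.SLACK_BOT_TOKEN);

// `employee` is the first Algolia hit (same shape as the record above)
async function notifyEmployee(employee) {
  await slackClient.chat.postMessage({
    channel: employee.slack.id, // direct message to the employee's Slack ID
    text: `Hi ${employee.firstName}! A package is waiting for you at the front desk.`,
  });
}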

Here’s a diagram of the app’s complete process:

[Image: diagram of the OCR label-reading process]

As seen here: We take a picture of the package label. The app sends it to the Google Vision API through the Express backend. Google Vision returns a JSON payload with the recognized text, which the back end sends to Algolia as a search query. The search engine uses the removeWordsIfNoResults option to ensure a successful match. Algolia then returns a list of matching records, from which the back end extracts the first hit and returns it to the React app.

Conclusion & next steps

Algolia’s powerful search engine isn’t limited to a search box. With a little imagination, you can push Algolia far beyond the box and solve a variety of problems.

Label reading is only one example. There’s image recognition, where online retailers can recognize the type, style, color, and size of clothing from images. There’s also voice recognition, where a website can interact with the unstructured ways people speak.

There are many ways to achieve this. In this case, we relied on the search engine’s built-in features, which let it adapt its relevance algorithm to the variety and unpredictability of unstructured query data. The next step is to couple that with AI and machine learning, making a search engine’s adaptability and use-case scope even greater.

About the author
Clément Denoix

Software Engineer
