The theme of our 2022 Algolia Developer Conference was “Index the world and put your data in motion,” so naturally, as soon as the last video was uploaded to YouTube, talk turned to how we could put all of this great new content in motion for our customers.

I knew I wanted the videos to be searchable by title and description and discoverable by category, but I wanted to do more. I wanted to be able to help developers find the exact spot in the video that matched their search. This meant indexing the transcripts of the videos. We’ve tried to do this in the past using YouTube’s captioning capabilities with mixed results. Fortunately for us, at almost the exact same time, the team at OpenAI released a new neural network called Whisper for automatic speech recognition.

In the rest of this post, I will describe the toolchain I used to build the A/VSearch CLI, an integrated command line for generating and indexing transcripts from a YouTube channel or playlist.

You can check out the results on our demo site!

Machine Learning – Whisper

As mentioned above, OpenAI recently released Whisper, a general-purpose speech recognition model. It can recognize many different languages and can even translate speech into English. Since it ships as a Python package, it was the perfect candidate to team up with Algolia’s Python API Client. Overall, I was thoroughly impressed with the transcription quality. Even with the medium model, it transcribed videos without a single mistake, apart from some specific technology names.
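As a rough illustration of what working with Whisper from Python looks like, here is a minimal sketch. `whisper.load_model` and `model.transcribe` come from the `openai-whisper` package; the helper names `transcribe_audio` and `simplify_segments` and the sample data are assumptions made for this example, not necessarily how A/VSearch is written.

```python
from typing import Any, Dict, List


def transcribe_audio(path: str, model_name: str = "medium") -> Dict[str, Any]:
    """Run Whisper on a local audio file and return its raw result dict."""
    import whisper  # imported lazily so the helper below works without it

    model = whisper.load_model(model_name)
    return model.transcribe(path)


def simplify_segments(result: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Keep only the fields needed for indexing: start, end, and text."""
    return [
        {
            "start": int(seg["start"]),  # whole seconds are enough for seeking
            "end": int(seg["end"]),
            "text": seg["text"].strip(),
        }
        for seg in result.get("segments", [])
    ]


# A hand-written stand-in for a Whisper result, so this runs without a model.
sample_result = {
    "text": " Hi everyone and welcome to DevCon 2022.",
    "segments": [
        {
            "id": 0,
            "start": 0.0,
            "end": 12.4,
            "text": " Hi everyone and welcome to DevCon 2022.",
        },
    ],
}

print(simplify_segments(sample_result))
```

With a real audio file, `simplify_segments(transcribe_audio("talk.mp3"))` would produce the same shape of output (the file name here is hypothetical).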

One big feature I personally wish Whisper had is speaker diarization, where the model identifies the different speakers throughout the recording and determines when each one spoke. Right now, you would have to manually clean up and assign segments to a speaker. It’s possible to combine Whisper with another tool, such as PyAnnote, to do this, and I’d like to add that as a feature in the future. I also believe Whisper has a limitation with multiple languages spoken in the same audio file, but this could improve over time.

Because the segments Whisper provides are sometimes quite short, it can be hard to determine the true context of a segment. To address this, we added a context field that contains the previous segment and the following one. This way it is clear what is being discussed in that particular segment, which should lead to a higher success rate for finding a specific clip.
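The context field described above can be sketched as follows. The `before`/`after` layout mirrors the record shown later in this post; the function name `add_context` and the handling of the first and last segments (empty text, with the segment’s own boundary as the start) are assumptions for illustration.

```python
from typing import Any, Dict, List


def add_context(segments: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Attach the previous and following segment to each segment."""
    enriched = []
    for i, seg in enumerate(segments):
        prev_seg = segments[i - 1] if i > 0 else None
        next_seg = segments[i + 1] if i + 1 < len(segments) else None
        enriched.append(
            {
                **seg,
                "context": {
                    # Assumption: the first segment gets an empty "before".
                    "before": {
                        "start": prev_seg["start"] if prev_seg else seg["start"],
                        "text": prev_seg["text"] if prev_seg else "",
                    },
                    # Assumption: the last segment gets an empty "after".
                    "after": {
                        "start": next_seg["start"] if next_seg else seg["end"],
                        "text": next_seg["text"] if next_seg else "",
                    },
                },
            }
        )
    return enriched


segments = [
    {"start": 0, "end": 12, "text": "Hi everyone and welcome to DevCon 2022."},
    {"start": 12, "end": 20, "text": "I'm thrilled to be here with you today."},
]

for record in add_context(segments):
    print(record["text"], "->", record["context"])
```

Storing the neighbors on the record itself means a single search hit carries enough text to render a preview without a second lookup.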

How I built it

Since Whisper requires an audio file to run the transcription, I needed a way to take a YouTube video and convert it to an audio file. YouTube-DL was my choice, as it is well supported in Python and can download only the audio of a video, which saves me from having to convert any downloads before transcription. Since some users would want to use the program from the command line, I added the Click library to power a command-line interface.
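For readers who haven’t embedded YouTube-DL before, an audio-only download might look like the sketch below. The `format`, `outtmpl`, and `quiet` options and the `YoutubeDL` class are part of the `youtube_dl` package; the helper names and the output directory are assumptions, and A/VSearch itself may configure the download differently.

```python
import os
from typing import Any, Dict


def build_audio_options(output_dir: str) -> Dict[str, Any]:
    """youtube-dl options that request the best audio-only stream."""
    return {
        # Prefer an audio-only stream; fall back to the best combined one.
        "format": "bestaudio/best",
        # Name each file after its video ID, e.g. audio/<id>.<ext>
        "outtmpl": os.path.join(output_dir, "%(id)s.%(ext)s"),
        "quiet": True,
    }


def download_audio(url: str, output_dir: str = "audio") -> None:
    """Download the audio track for a video, playlist, or channel URL."""
    import youtube_dl  # imported lazily so the options can be inspected alone

    os.makedirs(output_dir, exist_ok=True)
    with youtube_dl.YoutubeDL(build_audio_options(output_dir)) as ydl:
        ydl.download([url])


print(build_audio_options("audio"))
```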

Sometimes there are words (or company names) that Whisper can’t detect, so I added the ability to supply search-and-replace patterns. My colleague, Chuck, had a great idea to also add a categorization feature, where you provide keywords for A/VSearch to detect during transcription and it automatically applies predefined categories. To use these features, you can simply pass a JSON file with the patterns defined in it, and A/VSearch will parse and use them during the process.
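To make the idea concrete, here is a small sketch of pattern replacement and keyword-based categorization driven by JSON. The JSON layout (`replacements`, `categories`) and the function names are hypothetical and chosen for this example; check the GitHub repository for the format A/VSearch actually expects.

```python
import json
import re
from typing import Dict, List

# Hypothetical config layout, for illustration only.
CONFIG_JSON = r"""
{
  "replacements": [
    {"pattern": "\\bAlgolio\\b", "replacement": "Algolia"}
  ],
  "categories": {
    "Search": ["index", "query"],
    "AI": ["neural", "model"]
  }
}
"""


def apply_replacements(text: str, rules: List[Dict[str, str]]) -> str:
    """Run each regex replacement over the text, ignoring case."""
    for rule in rules:
        text = re.sub(
            rule["pattern"], rule["replacement"], text, flags=re.IGNORECASE
        )
    return text


def detect_categories(text: str, categories: Dict[str, List[str]]) -> List[str]:
    """Return every category with at least one keyword in the text."""
    lowered = text.lower()
    return sorted(
        name
        for name, keywords in categories.items()
        if any(keyword.lower() in lowered for keyword in keywords)
    )


config = json.loads(CONFIG_JSON)
segment = "Welcome to algolio, where every query hits the index."
cleaned = apply_replacements(segment, config["replacements"])

print(cleaned)
print(detect_categories(cleaned, config["categories"]))
```

This sketch matches keywords as plain substrings, which is simple but can over-match short keywords; word-boundary regexes would be a stricter alternative.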

How to use it

Using A/VSearch is super simple. You can download a release from GitHub and install it that way, or use pip to install it from the GitHub URL, which will load the latest release. Since it has a fully featured CLI, you can just export your Algolia credentials as environment variables and get into the action! The CLI accepts URLs for playlists, channels, and individual videos and writes the transcript records to the Algolia index name you provide.

Whisper transcribes faster with access to a GPU. On an NVIDIA Tesla T4 GPU, transcribing a three-minute video took 25 seconds, while the same video took 45 seconds on a 32-vCPU VM, a speedup of roughly 1.8x. The difference is especially helpful for longer videos, where the time saved adds up.

# Create and activate a virtualenv
python3 -m venv av-search-test && cd av-search-test
source bin/activate

# Install via Pip or grab a release from GitHub
python3 -m pip install git+https://github.com/algolia-samples/avsearch

export ALGOLIA_APP_ID=AAAAA12345
export ALGOLIA_INDEX_NAME=transcriptions
export ALGOLIA_API_KEY=6c4dba625a960b4cc54b7b5312f9117d

# Transcribe a video, playlist, channel, etc.
av-search --targets "https://www.youtube.com/watch?v=epSVL87_sqA"

More information on advanced usage can be found in the GitHub repository.

How to automate it

The best way to automate A/VSearch is to integrate it into a Python application. This way you can handle any errors gracefully and easily integrate any other solutions that may be required (event notifications, for example).

from avsearch import AVSearch
import os

avs = AVSearch(app_id='AAAAA12345', api_key=os.environ.get('ALGOLIA_API_KEY'), ...)
result = avs.transcribe([
    "https://www.youtube.com/watch?v=qSBm7d3McRI"
])

print(result)
# [
#    {
#      "objectID": "zOz-Sk4K-64-0",
#      "videoID": "zOz-Sk4K-64",
#      "videoTitle": "Welcome to Algolia DevCon! Keynote and product demos",
#      "videoDescription": "...",
#      "url": "https://youtu.be/zOz-Sk4K-64?t=0",
#      "thumbnail": "https://i.ytimg.com/...",
#      "text": "Hi everyone and welcome to DevCon 2022.",
#      "start": 0,
#      "end": 12,
#      "categories": [],
#      "context": {
#        "before": {
#          "start": 0,
#          "text": ""
#        },
#        "after": {
#          "start": 12,
#          "text": "I'm thrilled to be here with you today at Algolia's first ever developer conference"
#        }
#      }
#    },
#    ...
# ]
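One way to get the graceful error handling and notifications mentioned above is to wrap the transcription call, as in this sketch. It only assumes a callable that takes a list of URLs and returns records, as `avs.transcribe` does in the example above; `transcribe_all`, `fake_transcribe`, and the shape of the notification payload are hypothetical.

```python
import logging
from typing import Any, Callable, Dict, Iterable, List

logger = logging.getLogger("avsearch-job")


def transcribe_all(
    transcribe: Callable[[List[str]], Any],
    urls: Iterable[str],
    notify: Callable[[Dict[str, Any]], None],
) -> Dict[str, Dict[str, Any]]:
    """Transcribe URLs one at a time so a single failure can't stop the batch."""
    succeeded: Dict[str, Any] = {}
    failed: Dict[str, str] = {}
    for url in urls:
        try:
            succeeded[url] = transcribe([url])
        except Exception as exc:  # record the failure and keep going
            logger.warning("Transcription failed for %s: %s", url, exc)
            failed[url] = str(exc)
    # Send one summary event, e.g. to a chat webhook or a queue.
    notify({"succeeded": sorted(succeeded), "failed": failed})
    return {"succeeded": succeeded, "failed": failed}


def fake_transcribe(urls: List[str]) -> List[Dict[str, str]]:
    """Stand-in for avs.transcribe so the sketch runs without credentials."""
    if "bad" in urls[0]:
        raise ValueError("video unavailable")
    return [{"videoID": urls[0], "text": "..."}]


events: List[Dict[str, Any]] = []
report = transcribe_all(fake_transcribe, ["ok-video", "bad-video"], events.append)
print(report["failed"])
```

In a real application you would pass `avs.transcribe` instead of `fake_transcribe` and replace `events.append` with whatever notification hook you use.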

Configuration

Once you have some data in the index, there are some settings you should adjust to deliver the best search experience. We can do this via the Algolia Dashboard, or my personal favorite, the Algolia CLI! We’ve prepared a configuration file you can upload directly to your newly created Index to have the best settings out of the box:

# Download the settings file from the repository or fetch it manually
wget https://raw.githubusercontent.com/algolia-samples/avsearch/main/examples/settings.json.example

# Transcription index name
export MY_INDEX_NAME=''

# Overwrite index settings
algolia settings set $MY_INDEX_NAME -F settings.json.example

If you are new to our CLI, you can find some more information on it here. If the Dashboard is more your speed, you can also upload the configuration by navigating to your Index, clicking ‘Manage Index’, and selecting ‘Import Configuration’.

Building a frontend

I built an autocomplete search experience (including a cmd-K binding) to simplify integrating with the existing Algolia Developer Conference homepage. Making the search interface a modal lets me provide a rich UX with space for previews and thumbnails without needing to redesign the whole home page. Algolia’s AutocompleteJS library is great for building this type of autocomplete experience. I used our own documentation search as a model. The large preview pane gives users more context from the video transcripts to help them find the right clip.

A/VSearch Example Frontend

I leaned into AutocompleteJS’s plugin architecture, including the official plugins for query suggestions and click events. I also created a custom plugin (createLoadVideoPlugin) to load the selected video into an embedded iframe on the same webpage.

You can see this example front end in the examples directory in the code repo or try out the live demo.

Wrap Up

If you have any questions about A/VSearch such as how it works, implementation questions, or feature requests, feel free to drop us a line over at our Discourse forum! Our team would love to hear from you about A/VSearch or about any other Algolia-related questions you may have.

Want to get started transcribing your own content? Head over to the GitHub repository and grab the latest release!


We hope that you enjoyed this in-depth look at A/VSearch and how we used it to power the DevCon Session search feature. If you’re new to Algolia, you can try us out by signing up for a free tier account.

About the author
Michael King

Developer Advocate
