
The theme of our 2022 Algolia Developer Conference was “Index the world and put your data in motion,” so naturally, as soon as the last video was uploaded to YouTube, talk turned to how we could put all of this great new content in motion for our customers.

I knew I wanted the videos to be searchable by title and description and discoverable by category, but I wanted to do more. I wanted to be able to help developers find the exact spot in the video that matched their search. This meant indexing the transcripts of the videos. We’ve tried to do this in the past using YouTube’s captioning capabilities with mixed results. Fortunately for us, at almost the exact same time, the team at OpenAI released a new neural network called Whisper for automatic speech recognition.

In the rest of this post, I will describe the toolchain I used to build the A/VSearch CLI, an integrated command-line tool for generating and indexing transcripts from a YouTube channel or playlist.

You can check out the results on our demo site here!

Machine Learning – Whisper

As mentioned above, OpenAI recently released Whisper, a general-purpose speech recognition model. It can recognize many different languages and can even translate between them. Since it is natively exposed in Python, it was the perfect candidate to team up with Algolia's Python API client. Overall, I was thoroughly impressed with the transcription quality. Even using the medium model, it transcribed our videos without a single mistake, aside from a few specific technology names.
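
Here's a minimal sketch of what calling Whisper from Python looks like; the file name is just a placeholder for whatever audio you feed it:

import whisper

# Load the medium model and transcribe a local audio file.
model = whisper.load_model("medium")
result = model.transcribe("devcon-keynote.mp3")

# Whisper returns timestamped segments that can be indexed individually.
for segment in result["segments"]:
    print(f"{segment['start']:>7.1f}s  {segment['text']}")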

One big feature I personally wish Whisper had is speaker diarization, where the model identifies the different speakers throughout the recording and determines when each person spoke. Right now, you have to manually clean up and assign segments to a speaker. It's possible to combine Whisper with another tool, like PyAnnote, to do this, which I'd like to add as a feature in the future. Whisper also appears to have a limitation when multiple languages are spoken in the same audio file, but this could improve over time.

Because the segments Whisper produces are sometimes quite short, it can be hard to determine their true context. To address this, we added a context field that contains the previous segment and the following one. This makes it clear what is being discussed in a particular segment, which should make it easier to find a specific clip.
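
Here's a rough sketch of the idea (not the library's actual code): given Whisper's segment list, attach each neighbor's text, leaving the text empty at the edges, matching the record shape shown later in this post:

def with_context(segments):
    # segments is Whisper's list of dicts with "start", "end", and "text".
    enriched = []
    for i, segment in enumerate(segments):
        before = segments[i - 1] if i > 0 else None
        after = segments[i + 1] if i + 1 < len(segments) else None
        enriched.append({
            **segment,
            "context": {
                "before": {"start": before["start"], "text": before["text"]}
                if before else {"start": segment["start"], "text": ""},
                "after": {"start": after["start"], "text": after["text"]}
                if after else {"start": segment["end"], "text": ""},
            },
        })
    return enriched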

How I built it

Since Whisper requires an audio file to run transcription, I needed a way to take a YouTube video and convert it to audio. I chose YouTube-DL because it is well supported in Python and can download only the audio track of a video, saving me from converting anything before transcription. And since some users will want to run the program from the command line, I added the Click library to power a CLI interface.
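
Here's a simplified sketch of that download step for single-video URLs (the real CLI does more, such as handling playlists and channels):

import click
import youtube_dl  # yt-dlp exposes the same Python interface

@click.command()
@click.option("--targets", multiple=True, required=True, help="YouTube video URLs")
def download_audio(targets):
    # Request only the audio stream and let ffmpeg convert it to mp3,
    # so there's nothing to convert before handing it to Whisper.
    options = {
        "format": "bestaudio/best",
        "outtmpl": "%(id)s.%(ext)s",
        "postprocessors": [{"key": "FFmpegExtractAudio", "preferredcodec": "mp3"}],
    }
    with youtube_dl.YoutubeDL(options) as downloader:
        downloader.download(list(targets))

if __name__ == "__main__":
    download_audio()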

Sometimes there are words (or company names) that Whisper can't detect, so I added support for supplying search-and-replace patterns. My colleague Chuck had a great idea to also add a categorization feature: you provide keywords for A/VSearch to detect during transcription, and it automatically applies predefined categories. To use these features, simply pass in a JSON file with the patterns defined, and A/VSearch will parse and apply them during processing.
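
Conceptually, that step works something like the snippet below; the exact JSON schema is documented in the repository, so treat these structures as illustrative only:

import re

# Illustrative structures, not the real config file format.
replacements = {r"algo\s?lia": "Algolia"}
categories = {"devcon": ["keynote", "product demo"]}

def clean_and_categorize(text):
    # Apply search/replace patterns to fix words Whisper misheard.
    for pattern, replacement in replacements.items():
        text = re.sub(pattern, replacement, text, flags=re.IGNORECASE)
    # Tag the segment with any category whose keywords appear in it.
    matched = [
        category
        for category, keywords in categories.items()
        if any(keyword in text.lower() for keyword in keywords)
    ]
    return text, matched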

How to use it

Using A/VSearch is simple: you can download a release from GitHub and install it that way, or use pip to install it directly from the GitHub URL, which pulls the latest release. Since it has a fully featured CLI, you can just export your Algolia credentials as environment variables and get into the action! The CLI accepts URLs for playlists, channels, and individual videos and writes the transcript records to the Algolia index name you provide.

Whisper's transcription runs significantly faster with access to a GPU. On an NVIDIA Tesla T4, transcribing a three-minute video took 25 seconds, while the same video on a 32-vCPU VM took 45 seconds. This speedup is especially helpful for longer videos, which can be processed in a fraction of the time.
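
If you're calling the library directly rather than using the CLI, Whisper lets you pick the device explicitly. A small sketch, assuming PyTorch can see your GPU:

import torch
import whisper

# Use the GPU when one is available, otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = whisper.load_model("medium", device=device)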

# Create and activate a virtualenv
python3 -m venv av-search-test && cd av-search-test
source bin/activate

# Install via Pip or grab a release from GitHub
python3 -m pip install git+https://github.com/algolia-samples/avsearch

export ALGOLIA_APP_ID=AAAAA12345
export ALGOLIA_INDEX_NAME=transcriptions
export ALGOLIA_API_KEY=6c4dba625a960b4cc54b7b5312f9117d

# Transcribe a video, playlist, channel, etc.
av-search --targets "https://www.youtube.com/watch?v=epSVL87_sqA"

More information on advanced usage can be found in the GitHub repository.

How to automate it

The best way to automate A/VSearch is to integrate it into a Python application. That way you can handle any errors gracefully and easily hook in any other pieces you need (event notifications, for example).

from avsearch import AVSearch
import os

avs = AVSearch(app_id='AAAAA12345', api_key=os.environ.get('ALGOLIA_API_KEY'), ...)
result = avs.transcribe([
    "https://www.youtube.com/watch?v=qSBm7d3McRI"
])

print(result)
# [
#    {
#      "objectID": "zOz-Sk4K-64-0",
#      "videoID": "zOz-Sk4K-64",
#      "videoTitle": "Welcome to Algolia DevCon! Keynote and product demos",
#      "videoDescription": "...",
#      "url": "https://youtu.be/zOz-Sk4K-64?t=0",
#      "thumbnail": "https://i.ytimg.com/...",
#      "text": "Hi everyone and welcome to DevCon 2022.",
#      "start": 0,
#      "end": 12,
#      "categories": [],
#      "context": {
#        "before": {
#          "start": 0,
#          "text": ""
#        },
#        "after": {
#          "start": 12,
#          "text": "I'm thrilled to be here with you today at Algolia's first ever developer conference"
#        }
#      }
#    },
#    ...
# ]
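
For example, you could wrap the call so a failure triggers whatever alerting you already use. This is only a sketch, and notify_on_failure is a hypothetical helper:

def safely_transcribe(avs, urls):
    try:
        return avs.transcribe(urls)
    except Exception as error:  # narrow this to the errors you actually expect
        notify_on_failure(error)  # hypothetical hook: Slack, PagerDuty, email, ...
        raise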

Configuration

Once you have some data in the index, there are some settings you should adjust to deliver the best search experience. We can do this via the Algolia Dashboard, or my personal favorite, the Algolia CLI! We’ve prepared a configuration file you can upload directly to your newly created Index to have the best settings out of the box:

# Download the settings file from the repository or fetch it manually
wget https://raw.githubusercontent.com/algolia-samples/avsearch/main/examples/settings.json.example

# Transcription index name
export MY_INDEX_NAME=''

# Overwrite index settings
algolia settings set $MY_INDEX_NAME -F settings.json.example

If you are new to our CLI, you can find some more information on it here. If the Dashboard is more your speed, you can also upload the configuration by navigating to your Index, clicking ‘Manage Index’, and selecting ‘Import Configuration’.
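
And if you'd rather manage settings from code, you can do the same thing with the Python API client. The attribute choices below are assumptions based on the record shape shown earlier, not the exact contents of settings.json.example:

import os
from algoliasearch.search_client import SearchClient

client = SearchClient.create(os.environ["ALGOLIA_APP_ID"], os.environ["ALGOLIA_API_KEY"])
index = client.init_index(os.environ["ALGOLIA_INDEX_NAME"])

# Make the transcript text and video metadata searchable, and allow
# filtering by category (assumed attributes; adjust to the real settings file).
index.set_settings({
    "searchableAttributes": ["videoTitle", "text", "videoDescription"],
    "attributesForFaceting": ["searchable(categories)", "videoID"],
})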

Building a frontend

I built an autocomplete search experience (including a Cmd+K binding) to simplify integrating with the existing Algolia Developer Conference homepage. Making the search interface a modal lets me provide a rich UX with space for previews and thumbnails without redesigning the whole home page. Algolia's AutocompleteJS library is great for building this type of experience, and I used our own documentation search as a model. The large preview pane gives users more context from the video transcripts to help them find the right clip.

A/VSearch Example Frontend

I leaned into AutocompleteJS's plugin architecture, including the official plugins for query suggestions and click events. I also created a custom plugin, createLoadVideoPlugin, to load the selected video into an embedded iframe on the same page.

You can see this example front end in the examples directory in the code repo or try out the live demo.

Wrap Up

If you have any questions about A/VSearch, whether about how it works, implementation details, or feature requests, feel free to drop us a line over at our Discourse forum! Our team would love to hear from you about A/VSearch or any other Algolia-related questions you may have.

Want to get started transcribing your own content? Head over to the GitHub repository and grab the latest release!

We hope that you enjoyed this in-depth look at A/VSearch and how we used it to power the DevCon Session search feature. If you’re new to Algolia, you can try us out by signing up for a free tier account.

About the author
Michael King

Developer Advocate

