It’s time for one of our favorite annual traditions: revealing the holiday gift we built for the developer community. This year, we’ve made a tool that helps users discover key moments at conferences by searching within video transcripts. It’s called TalkSearch, and we’d like to share a few details about why we built it and how it works.
What is TalkSearch?
TalkSearch is an interactive search and video experience that any conference or meetup organizer can offer to their community. All the event needs is a YouTube channel or playlist to get started. TalkSearch indexes video titles, descriptions and transcripts and serves up an instant search experience that plays the right video at the right time based on the search results.
Conference and meetup organizers are invited to fill out the TalkSearch request form and get the process started. Once the indexing is complete, a standalone video search page will be built just for that conference. The search experience can then be embedded on the conference’s website (coming soon).
Past community holiday gifts
Every year since 2014, Algolia has built something to address a pain point or opportunity in the dev community. Here are a few past examples:
- 2014: GitHub Awesome Autocomplete, with recently improved UX
- 2015: DocSearch, now used by 450+ open source documentation sites and API portals, including Stripe, webpack and React
- 2016: Yarn package search, now at 700,000 user searches per month!
The inspiration for TalkSearch
We wanted to give back to an especially important community for Algolia — the organizers of events. We at Algolia go to over a hundred events every year, whether as sponsors, speakers, or participants. We benefit greatly from the communities that conference organizers bring together and the opportunities they afford us.
We’re also scratching our own itch here. When we return from a conference, we often want to watch talks again or see the ones that we missed. Or we want to leap right to the moments that were the most relevant to us. We might also want to share a particular moment in a talk or a particular slide with our co-workers. TalkSearch was conceived and built to make this process easier, and to help unlock more value from all of the time and energy that go into writing talks and producing videos.
How does it work?
As with any Algolia implementation, TalkSearch can be broken down into two main parts: the indexing process (crawling or scraping the data) and the search interface that users will interact with.
Indexing – YouTube API
TalkSearch uses the YouTube API to loop through a conference’s channel or playlist of videos. It extracts the essential information from each video — title, author, transcript (YouTube “captions”), tags — and pushes that data into the index.
Each conference’s TalkSearch data is structured as follows:
- There is one record per “phrase” of the talk, all stored in one index.
- Each phrase contains 5 to 10 words on average.
- Each phrase has a start time and duration.
- The title, author, and description of the video are added to each record.
Here’s what an example record looks like:
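As a rough sketch of the structure described above, the following Python shows how one record per phrase could be assembled from a video's metadata and captions. Apart from `objectID`, which Algolia uses as a record's unique identifier, the field names (`videoId`, `start`, `duration`, and so on) and the `Caption` and `build_records` helpers are assumptions made for illustration, not the actual TalkSearch schema.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Caption:
    """One transcript phrase (hypothetical shape, for illustration)."""
    text: str
    start: float     # seconds from the beginning of the video
    duration: float  # seconds the phrase stays on screen


def build_records(video: Dict[str, str], captions: List[Caption]) -> List[dict]:
    """Return one record per phrase, each carrying the video's metadata."""
    records = []
    for position, caption in enumerate(captions):
        records.append({
            # Unique per phrase: video id plus the phrase's position.
            "objectID": f"{video['id']}-{position}",
            "videoId": video["id"],
            # Video-level fields are repeated on every phrase record.
            "title": video["title"],
            "author": video["author"],
            "description": video["description"],
            # Phrase-level fields.
            "text": caption.text,
            "start": caption.start,
            "duration": caption.duration,
        })
    return records
```

Repeating the video-level fields on every record means a single index can answer queries that match either the talk's title or a specific spoken phrase, at the cost of some duplicated data.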
Search – standalone or embedded; single or multiple videos
When a new conference is indexed, Algolia automatically hosts the TalkSearch user interface on a standalone page. We do recommend, however, that conferences use an embeddable version to place it directly in their site, which will create a more seamless experience for their users.
When the user first lands on the standalone page or embedded widget, they can search through all videos at once. For each video, up to three locations that match the search query are shown. When the user clicks a search result, a full-size panel appears containing just the selected video, which starts playing at the start time recorded in the transcript data. Within this view, the user can search within just the current video to find other moments of interest.
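The behavior described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the TalkSearch implementation: the hit fields `videoId` and `start` and the helpers `group_hits_by_video` and `watch_url` are hypothetical names. The only external fact relied on is that a YouTube watch URL accepts a `t` parameter giving the start offset in seconds.

```python
from collections import defaultdict
from typing import Dict, List

MAX_HITS_PER_VIDEO = 3  # "up to three locations" per video


def group_hits_by_video(hits: List[dict], limit: int = MAX_HITS_PER_VIDEO) -> Dict[str, List[dict]]:
    """Keep at most `limit` hits per video, preserving the ranking order."""
    grouped = defaultdict(list)
    for hit in hits:
        bucket = grouped[hit["videoId"]]
        if len(bucket) < limit:
            bucket.append(hit)
    return dict(grouped)


def watch_url(hit: dict) -> str:
    """Build a YouTube link that starts playback at the phrase's start time."""
    seconds = int(hit["start"])  # drop the fractional part of the offset
    return f"https://www.youtube.com/watch?v={hit['videoId']}&t={seconds}s"
```

Searching within a single video then amounts to restricting the same query to records whose `videoId` matches the one currently playing.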
Open source on GitHub
The search is built with the React InstantSearch library. React InstantSearch and the family of InstantSearch libraries provide a set of components and building blocks that make it easy to build dynamic, full-page search experiences like TalkSearch.
All of the TalkSearch code is open source and you can also run it on your own. See algolia/talksearch-scraper and algolia/talksearch on GitHub. Whether you’re a new developer or an experienced pro, we welcome your feedback and contribution in the form of issues and PRs.
We sincerely hope you’ll find TalkSearch useful as a conference-goer or organizer. If you’d like to create a TalkSearch experience for your event, please start by filling out the request form.