
Build a voice-to-text search

Your app needs three things to build a voice search experience:

  • A voice input using a speech-to-text service
  • An output to display results
  • The suggested Algolia settings for optimizing your voice search

Input - the speech-to-text layer

Since Algolia only handles text searching, you must convert your user’s speech to text. If you’re building on top of a voice assistant like Amazon Alexa, you get built-in speech-to-text support. This is also the case if you’re building iOS or Android native apps or explicitly targeting the Chrome browser. For all other web apps, you’ll need an external service. Some options are Google Cloud Speech to Text, Azure Cognitive Services, or AssemblyAI.
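
For the Chrome case, the built-in support is the browser's Web Speech API. The following is a minimal sketch, assuming the page runs in a browser that exposes SpeechRecognition or webkitSpeechRecognition. The helper name createVoiceInput and the onTranscript callback are illustrative and not part of any Algolia library:

```javascript
// Sketch only: wraps the browser's Web Speech API.
// `createVoiceInput` and `onTranscript` are hypothetical names.
function createVoiceInput(
  onTranscript,
  Recognition = globalThis.SpeechRecognition || globalThis.webkitSpeechRecognition
) {
  // No built-in support: the caller should fall back to an external service
  if (!Recognition) return null;

  const recognition = new Recognition();
  recognition.lang = 'en-US';          // assumed language
  recognition.interimResults = false;  // only report final transcripts

  recognition.onresult = event => {
    // First alternative of the first result holds the recognized text
    onTranscript(event.results[0][0].transcript);
  };

  return recognition;
}

// Usage (browser only, element ID is hypothetical):
// const voiceInput = createVoiceInput(text => console.log('Heard:', text));
// document.querySelector('#mic-button')
//   .addEventListener('click', () => voiceInput && voiceInput.start());
```

When the function returns null, the browser has no built-in recognition and the app needs one of the external services listed above.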

You need to send the user’s speech to a speech-to-text service, receive the text, and then send that text to Algolia as a search query. For example, the sample app uses Google’s Speech-to-Text API to transcribe the voice input:

function startRecognitionStream(io) {
  this.recognizeStream = this.gcpClient
    .streamingRecognize(this.request)
    .on("error", console.error)
    .on("data", data => {
      const result = data.results[0];
      const alternative = result && result.alternatives[0];

      // No transcript means the stream reached its time limit
      if (!alternative) {
        process.stdout.write(
          `\n\nReached transcription time limit, press Ctrl+C\n`
        );
        return;
      }

      process.stdout.write(`Transcription: ${alternative.transcript}\n`);

      // Send the transcript to the connected client
      io.emit("dataFromGCP", alternative.transcript);

      // Stop speech recognition when the user stops talking
      if (result.isFinal) {
        io.emit("endSpeechRecognition", {});
      }
    });
}
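
On the receiving side, the dataFromGCP and endSpeechRecognition events emitted above still have to be turned into an Algolia query. The following is a minimal sketch, assuming a Socket.IO-style client object with an on method and a search callback you supply. The wireVoiceSearch helper and both parameters are hypothetical and not taken from the sample app:

```javascript
// Sketch only: `wireVoiceSearch`, `socket`, and `search` are hypothetical.
// `socket` is any object with an `on(eventName, handler)` method,
// `search` is any function that sends a query string to Algolia.
function wireVoiceSearch(socket, search) {
  let latestTranscript = '';

  // Keep the most recent transcript while the user is still speaking
  socket.on('dataFromGCP', transcript => {
    latestTranscript = transcript;
  });

  // When the server signals the end of speech, send the text to Algolia
  socket.on('endSpeechRecognition', () => {
    const query = latestTranscript.trim();
    if (query) search(query);
    latestTranscript = '';
  });
}
```

Searching only once the final transcript arrives avoids sending a query for every interim result. If you prefer results that update while the user speaks, call search from the dataFromGCP handler instead.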

Output

You can display the search results as text (as in the sample app) or convert them to speech.

If you need text-to-speech support in the browser, one option is the SpeechSynthesis API. For example, the following code uses it to announce the titles of search results:

// The search input
const searchInput = document.querySelector('#search-input');

// The search results
const searchResults = document.querySelector('#search-results');

// Create an Algolia InstantSearch instance
const search = instantsearch({
  appId: 'YourApplicationID',
  apiKey: 'YourSearchOnlyAPIKey',
  // Replace with the name of the index you want to search
  indexName: 'YourIndexName',
  // Seed the first search with the current value of the search input
  searchParameters: {
    query: searchInput.value
  }
});

// Initialize the search
search.start();

// Listen for search results
search.on('render', () => {
  // Get the titles of the search results.
  // querySelectorAll returns a NodeList, which has no map(),
  // so convert it to an array first
  const titles = Array.from(searchResults.querySelectorAll('.Hits-item'))
    .map(item => item.querySelector('.title').textContent);

  // Drop announcements still queued from a previous render
  window.speechSynthesis.cancel();

  // Speak the titles using the SpeechSynthesis API
  titles.forEach(title => {
    const msg = new SpeechSynthesisUtterance(title);
    window.speechSynthesis.speak(msg);
  });
});

Customize this code to fit your specific needs. For example, you could add a button to control when the titles are spoken, or change the output to include information from other attributes.
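
Both customizations can be sketched as follows. This assumes your hits are plain objects. The attribute names (title, brand, price), the helper names, and the #speak-button element are hypothetical:

```javascript
// Sketch only: helper and attribute names are hypothetical.

// Build the sentence to speak from the chosen attributes of one hit
function describeHit(hit, attributes = ['title']) {
  return attributes
    .map(name => hit[name])
    .filter(value => value !== undefined && value !== null && value !== '')
    .join(', ');
}

// Speak every hit, replacing any announcement that is still playing
function announceHits(
  hits,
  attributes,
  synth = globalThis.speechSynthesis,
  Utterance = globalThis.SpeechSynthesisUtterance
) {
  if (!synth || !Utterance) return; // no text-to-speech support here
  synth.cancel();
  hits.forEach(hit => synth.speak(new Utterance(describeHit(hit, attributes))));
}

// Usage (browser only): speak on demand instead of on every render
// document.querySelector('#speak-button')
//   .addEventListener('click', () => announceHits(currentHits, ['title', 'brand']));
```

Triggering the announcement from a button, rather than from every render, keeps the page from talking over itself while the user is still refining the query.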

Algolia settings

  • Set removeStopWords to true or the appropriate language code (for example, en). This will remove words like “a,” “an,” or “the” that don’t add value to the query.
  • Set ignorePlurals to true or the appropriate language code. This makes words like “car” and “cars” equivalent.
  • Send the entire query string as optionalWords. When searching conversationally, searchers might use words that aren’t in your index. Making all words optional means that records don’t need to match every word, but records matching more words will rank higher.
  • Use analyticsTags if you want to identify a search as voice-driven.

As an alternative to setting removeStopWords and ignorePlurals individually, you can set both behaviors at once with the naturalLanguages parameter.
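
The settings in this section can be combined into one set of search parameters. The following is a minimal sketch. The helper name buildVoiceSearchParams, its options, and the voice tag value are assumptions, and the language defaults to en:

```javascript
// Sketch only: `buildVoiceSearchParams` and the 'voice' tag are hypothetical.
function buildVoiceSearchParams(
  transcript,
  { language = 'en', useNaturalLanguages = false } = {}
) {
  const query = transcript.trim();

  // Either set both behaviors through naturalLanguages,
  // or set removeStopWords and ignorePlurals individually
  const languageSettings = useNaturalLanguages
    ? { naturalLanguages: [language] }
    : { removeStopWords: [language], ignorePlurals: [language] };

  return {
    query,
    ...languageSettings,
    // Every word of the spoken query is optional
    optionalWords: query ? query.split(/\s+/) : [],
    // Marks the search as voice-driven in analytics
    analyticsTags: ['voice'],
  };
}
```

Pass the returned object as the search parameters of whichever Algolia client call your app already uses.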

Add dynamic filters with Rules

To help users refine their search to find more relevant results, consider adding rules to apply filters based on what they say.
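
As an illustration, a rule can turn a spoken facet value into a filter. The following is a minimal sketch of such a rule object, assuming a facet attribute named brand that is declared in attributesForFaceting. The helper name and the objectID format are hypothetical:

```javascript
// Sketch only: `buildFacetFilterRule` and the objectID format are hypothetical.
// The attribute must be declared in attributesForFaceting on your index.
function buildFacetFilterRule(attribute) {
  return {
    objectID: `voice-filter-${attribute}`,
    // Trigger when the query contains any value of the facet
    conditions: [
      { pattern: `{facet:${attribute}}`, anchoring: 'contains' },
    ],
    // Apply the matched facet value as a filter
    consequence: {
      params: {
        automaticFacetFilters: [attribute],
      },
    },
  };
}

const brandRule = buildFacetFilterRule('brand');
```

With this rule saved on the index, through the dashboard or your API client, a spoken query that contains a known brand value filters the results to that brand.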
