In our last post on this topic, we covered Astro Starlight, Algolia DocSearch, and how to use DocSearch with a Starlight-built documentation site. Now we’ve made it even easier to use the two together by creating a default crawler template for Starlight that works out of the box. In this update, I’ll cover more of the steps it takes to get started with DocSearch, what crawler templates are, and how they reduce time-to-first-search-result, using the Astro Starlight template as the shining example.
Algolia provides a free plan, DocSearch, for technical documentation and blogs. It comes with two components: a front-end UI and a back-end crawler hosted by Algolia. At one time, DocSearch was only available for OSS projects, but we have since opened it up to all technical documentation and blogs! To learn more, take a look through the docs. Let’s review the process: applying to the program, adding the front-end UI, and configuring the crawler.
Before you can use DocSearch, you must fill out the short application form. It usually takes only a day or two for us to review a submitted form and get back to you. We’ll simply check your provided URL against the criteria of the program (essentially: it’s technical documentation or a blog, it’s public, and you own it). If accepted, you will get an email with the information you need to get started, as well as additional resources. If we find your site is not in line with the criteria, we will let you know, along with guidance on how you can still use Algolia even if you’re not using DocSearch.
The most important pieces of information in the email will be:
These will come in your acceptance email and look similar to this:
Note: Not actual production information.
The DocSearch front-end, built on top of the Algolia Autocomplete library, is typically the easiest way to get started. It provides a search box and a modal dialog for results. You can use the information from the email to deploy a DocSearch Autocomplete UI in seconds.
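As a rough sketch of that wiring, the snippet below gathers the values a DocSearch UI typically needs into a single options object. It assumes the email supplies an application ID, a search-only API key, and an index name. The field names `container`, `appId`, `apiKey`, and `indexName` follow the options of `docsearch()` in the `@docsearch/js` package; the `#docsearch` selector, the placeholder values, and the `buildDocSearchOptions` helper are invented for illustration and are not part of the library.

```typescript
// Local, simplified mirror of the core options accepted by docsearch().
interface DocSearchOptions {
  container: string; // CSS selector of the element that will host the search box
  appId: string;
  apiKey: string; // search-only key, intended to be shipped to the browser
  indexName: string;
}

// Hypothetical helper: fails fast if any value was left blank.
function buildDocSearchOptions(
  container: string,
  appId: string,
  apiKey: string,
  indexName: string,
): DocSearchOptions {
  const fields: [string, string][] = [
    ["container", container],
    ["appId", appId],
    ["apiKey", apiKey],
    ["indexName", indexName],
  ];
  fields.forEach(([name, value]) => {
    if (value.trim() === "") {
      throw new Error(`Missing DocSearch option: ${name}`);
    }
  });
  return { container, appId, apiKey, indexName };
}

// Placeholder values only; substitute the ones from your acceptance email.
const docSearchOptions = buildDocSearchOptions(
  "#docsearch",
  "YOUR_APP_ID",
  "YOUR_SEARCH_API_KEY",
  "YOUR_INDEX_NAME",
);

// In a browser bundle, this object would then be handed to the library:
//   import docsearch from "@docsearch/js";
//   import "@docsearch/css";
//   docsearch(docSearchOptions);
```

Keeping the values in one validated object makes it harder to ship a page with a blank key or index name, which would otherwise only show up as a search box that returns nothing.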
If you’re using a documentation framework that has a native integration with Algolia, it’ll be easier if you use their ready-made integration. For example, if you’re using Astro Starlight, we covered how to use their plugin. Another example would be the Docusaurus Algolia config.
The Algolia-hosted crawler populates your index with records based on the data found on your documentation site. Each crawler has its own configuration, which contains metadata like where to crawl, when to crawl, the settings to use on the index, and, critically, how to correctly create records from your site. We have ready-made configurations, called templates, that apply to many of the most popular documentation frameworks. When your application is approved, we attempt to identify your site’s documentation framework. If we can identify it, the application provisioning process automatically applies the proper crawler template.
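To make those four pieces concrete, here is a minimal sketch of how they map onto a crawler configuration. The field names `startUrls`, `schedule`, `initialIndexSettings`, `actions`, `indexName`, `pathsToMatch`, and `recordExtractor` are taken from the Algolia Crawler configuration format, but the TypeScript types are a simplified stand-in written for this post, and the URLs, index name, and stub extractor are placeholders. A real template contains considerably more.

```typescript
// Simplified, hypothetical typing; the real configuration has many more fields.
type CrawlerRecord = { [key: string]: unknown };

interface CrawlerActionSketch {
  indexName: string;
  pathsToMatch: string[];
  // The real extractor also receives a Cheerio instance and helpers.
  recordExtractor: (page: { url: string }) => CrawlerRecord[];
}

interface CrawlerConfigSketch {
  startUrls: string[]; // where to crawl
  schedule: string; // when to crawl
  initialIndexSettings: {
    [indexName: string]: { [setting: string]: string[] };
  }; // settings to use on the index
  actions: CrawlerActionSketch[]; // how to create records
}

const crawlerConfigSketch: CrawlerConfigSketch = {
  startUrls: ["https://docs.example.com/"],
  schedule: "at 05:10 on Saturday",
  initialIndexSettings: {
    "example-docs": {
      searchableAttributes: ["hierarchy.lvl0", "hierarchy.lvl1", "content"],
    },
  },
  actions: [
    {
      indexName: "example-docs",
      pathsToMatch: ["https://docs.example.com/**"],
      // Stub extractor: emits one record per page, keyed by its URL.
      recordExtractor: (page) => [{ objectID: page.url, url: page.url }],
    },
  ],
};
```

A template is essentially a pre-filled version of this object in which the `recordExtractor` and index settings have already been tuned for one documentation framework.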
If for any reason you want to use a different template, you can find them all in our documentation. Just pick the one you want, go to crawler.algolia.com, open the editor, and update the JSON. If you want to suggest changes to existing templates or submit your own template, we would love to see feedback or additions from the community via a GitHub PR.
Now, getting back to the previous blog post on using Starlight with Algolia, there was one little piece of the puzzle missing to make the end-to-end process dead easy: the crawler template! By now you should see that having a crawler template for Starlight means that, if your site uses Starlight, your crawler will be automatically configured to create records in your index that are optimized for Starlight sites. The same is true if your site is based on any of the frameworks with pre-configured templates, such as Docusaurus, Vite, and others. In this way, pre-configured crawler templates designed for your documentation framework reduce the time-to-first-search-result. The Starlight record extractor is the magic behind pulling the right data from your website and creating records in your index.
The record extractor supports both Cheerio and CSS selectors, and here we use both to build a hierarchy for the records. I chose to grab the top-level sidebar menu item as the lvl0. For lvl1, the h1 of the article seemed like an easy choice. You could also use the selected item from the sidebar; however, these are usually the same. For more details on this process, the record extractor is covered in our docs.
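The hierarchy choice described above can be illustrated with a small standalone sketch. This is not the Starlight template itself, and it does not use Cheerio: it assumes the sidebar group, the h1, and the section headings have already been read from the page into a plain object, and only shows how they would be arranged into lvl0, lvl1, and lvl2. The `PageSnapshot` type, the `buildHierarchyRecords` function, and the `"Documentation"` fallback are all hypothetical names chosen for this example.

```typescript
// Hypothetical snapshot of what an extractor has already read from one page.
interface PageSnapshot {
  url: string;
  sidebarGroup: string | null; // top-level sidebar menu item containing the page
  h1: string;
  sections: { heading: string; text: string }[]; // one entry per h2 section
}

interface HierarchyRecord {
  objectID: string;
  url: string;
  hierarchy: { lvl0: string; lvl1: string; lvl2: string | null };
  content: string | null;
}

function buildHierarchyRecords(page: PageSnapshot): HierarchyRecord[] {
  // lvl0: the sidebar group, with an assumed fallback when the page has none.
  const lvl0 = page.sidebarGroup !== null ? page.sidebarGroup : "Documentation";
  // lvl1: the article's h1.
  const lvl1 = page.h1;

  // One record for the page itself...
  const records: HierarchyRecord[] = [
    {
      objectID: `${page.url}#0`,
      url: page.url,
      hierarchy: { lvl0, lvl1, lvl2: null },
      content: null,
    },
  ];

  // ...and one per section, nested under the same lvl0 and lvl1.
  page.sections.forEach((section, index) => {
    records.push({
      objectID: `${page.url}#${index + 1}`,
      url: page.url,
      hierarchy: { lvl0, lvl1, lvl2: section.heading },
      content: section.text,
    });
  });

  return records;
}

const exampleRecords = buildHierarchyRecords({
  url: "https://docs.example.com/guides/getting-started/",
  sidebarGroup: "Guides",
  h1: "Getting Started",
  sections: [{ heading: "Install", text: "Run the installer." }],
});
```

Because every record from a page shares the same lvl0 and lvl1, search results can be grouped under the sidebar section and article title a reader already recognizes from the site navigation.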
Getting started with DocSearch is already easy, but it’s even easier when you’re using a documentation framework that has a customized crawler template and a native plugin, as Astro Starlight does. We welcome any input on crawler templates and are happy to include new ones. Just submit a PR as described above. If you have any trouble getting started with DocSearch, you can always reach us on our Discord.
James Gray
Senior Program Manager at Algolia