To provide the most accurate answers, Ask AI relies on cleanly structured content.
To achieve this, it’s best to create a separate text-only index for Ask AI.
This is especially important for documentation sites where layout elements, such as navigation components,
might dilute your content.
To split plain text into chunks,
you can use the helpers.splitTextIntoRecords helper in your crawler configuration.
This works with many plain-text formats, but Markdown is especially well-suited for this purpose.
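To make the idea of chunking concrete, here is a minimal sketch of byte-bounded splitting. It is an illustration only and not the implementation of helpers.splitTextIntoRecords: the function name splitTextSketch, the maxBytes parameter, the blank-line splitting rule, and the objectID suffix are all assumptions made for this example.

```javascript
// Hypothetical illustration only: NOT how helpers.splitTextIntoRecords works internally.
// Splits text on blank lines and packs paragraphs into records of at most maxBytes (UTF-8).
// A single paragraph larger than maxBytes is kept whole in this sketch.
function splitTextSketch({ text, baseRecord, maxBytes, orderingAttributeName }) {
  const byteLength = (s) => new TextEncoder().encode(s).length;
  const paragraphs = text.split(/\n{2,}/).filter((p) => p.trim() !== "");

  const chunks = [];
  let current = "";
  for (const paragraph of paragraphs) {
    const candidate = current === "" ? paragraph : `${current}\n\n${paragraph}`;
    if (current !== "" && byteLength(candidate) > maxBytes) {
      // Adding this paragraph would exceed the limit: close the current chunk.
      chunks.push(current);
      current = paragraph;
    } else {
      current = candidate;
    }
  }
  if (current !== "") chunks.push(current);

  // Each record copies the base attributes and stores its position in the page.
  return chunks.map((chunk, index) => ({
    ...baseRecord,
    objectID: `${baseRecord.objectID}#${index}`, // assumed naming scheme
    [orderingAttributeName]: index,
    text: chunk,
  }));
}

const records = splitTextSketch({
  text: "# Title\n\nFirst paragraph.\n\nSecond paragraph.",
  baseRecord: { objectID: "https://example.com/docs/a", url: "https://example.com/docs/a" },
  maxBytes: 30,
  orderingAttributeName: "part",
});
```

With a limit of 30 bytes, the heading and first paragraph fit in one record and the second paragraph starts a new one. A larger limit would keep the whole page in a single record.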
Update your crawler configuration
Add the following code to the actions array in your crawler configuration:
// actions: [ ...,
{
  indexName: "my-markdown-index",
  pathsToMatch: ["https://example.com/docs/**"],
  recordExtractor: ({ $, url, helpers }) => {
    // Target only the main content, excluding navigation
    const text = helpers.markdown(
      "main > *:not(nav):not(header):not(.breadcrumb)",
    );
    if (text === "") return [];
    const language = $("html").attr("lang") || "en";
    const title = $("head > title").text();
    // Get the main heading for better searchability
    const h1 = $("main h1").first().text();
    return helpers.splitTextIntoRecords({
      text,
      baseRecord: {
        url,
        objectID: url,
        title: title || h1,
        heading: h1, // Add main heading as separate field
        lang: language,
      },
      maxRecordBytes: 100000, // Higher = fewer, larger records. Lower = more, smaller records.
      // Note: Increasing this value may increase the token count for LLMs, which can affect context size and cost.
      orderingAttributeName: "part",
    });
  },
},
// ...],
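When checking what the crawler produced, it can help to see how the chunks of one page fit together. The sketch below assumes each record carries the url and text attributes from the configuration above, plus a sequential number in the attribute named by orderingAttributeName ("part" here). The reassemblePages function is a hypothetical debugging helper, not part of the Crawler or Ask AI.

```javascript
// Hypothetical helper: groups records by url and joins their text in "part" order.
// Assumes the ordering attribute holds a sequential number for each chunk of a page.
function reassemblePages(records, orderingAttributeName = "part") {
  const pages = new Map();
  for (const record of records) {
    if (!pages.has(record.url)) pages.set(record.url, []);
    pages.get(record.url).push(record);
  }

  const result = {};
  for (const [url, parts] of pages) {
    // Sort a page's chunks by their position before joining them.
    parts.sort((a, b) => a[orderingAttributeName] - b[orderingAttributeName]);
    result[url] = parts.map((r) => r.text).join("\n\n");
  }
  return result;
}

const pages = reassemblePages([
  { url: "https://example.com/docs/a", part: 1, text: "Second" },
  { url: "https://example.com/docs/b", part: 0, text: "Only" },
  { url: "https://example.com/docs/a", part: 0, text: "First" },
]);
```

If the rebuilt text for a page still contains navigation or breadcrumb fragments, the selector passed to helpers.markdown probably needs tightening.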
Update the index settings:
// initialIndexSettings: { ...,
"my-markdown-index": {
  attributesForFaceting: ["lang"],
  ignorePlurals: true,
  minProximity: 1,
  removeStopWords: false,
  searchableAttributes: ["title", "heading", "unordered(text)"],
  removeWordsIfNoResults: "lastWords",
  attributesToHighlight: ["title", "text"],
  typoTolerance: false,
  advancedSyntax: false,
},
// ...},
Run the crawler
After updating the crawler configuration:
- Publish the configuration in the Crawler dashboard to save and activate it.
- Run the crawler to index your Markdown content.
Integrate the Markdown index with Ask AI
Once your crawler and index are configured,
set up your frontend to use both your main keyword index and your Markdown index for Ask AI.
Here’s how you might configure DocSearch to use your main keyword index for search and your Markdown index for Ask AI:
docsearch({
  indexName: "INDEX_NAME", // Main DocSearch keyword index
  apiKey: "ALGOLIA_SEARCH_API_KEY",
  appId: "ALGOLIA_APPLICATION_ID",
  askAi: {
    indexName: "INDEX_NAME-markdown", // Markdown index for AskAI
    apiKey: "ALGOLIA_SEARCH_API_KEY", // (or a different key if needed)
    appId: "ALGOLIA_APPLICATION_ID",
    assistantId: "ALGOLIA_ASSISTANT_ID",
  },
});
- indexName refers to your main DocSearch index.
- askAi.indexName refers to the dedicated Markdown index.
Best practices
- Use clear, consistent titles and headings for better discoverability.
- Structure your content with headings and lists for better chunking.
- Add facets to support filtering in your search UI or the Ask AI assistant.
  For example, you can add attributes like lang, version, and tags to your records and declare them as attributesForFaceting.
- Adjust record size by changing maxRecordBytes.
  - If your answers seem too broad or fragmented, increase maxRecordBytes to create fewer, larger records.
    This might increase the token count for LLMs,
    which can affect the size of the context window and the cost of each Ask AI response.
  - If you have large Markdown files, decrease maxRecordBytes to create smaller, more focused records.
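The faceting advice above could look like the following sketch. The buildBaseRecord helper is hypothetical, and the version and tags attributes (including the "latest" default) are assumptions chosen for illustration. How you derive them from a page, for example from the URL or a meta tag, depends on your site.

```javascript
// Hypothetical helper: builds the baseRecord passed to helpers.splitTextIntoRecords.
// "version" and "tags" are illustrative facet attributes, not required by Ask AI.
function buildBaseRecord({ url, title, heading, lang, version, tags }) {
  return {
    url,
    objectID: url,
    title: title || heading,
    heading,
    lang: lang || "en",
    version: version || "latest", // assumed default for unversioned pages
    tags: Array.isArray(tags) ? tags : [],
  };
}

// Every attribute you want to filter on should also be declared for faceting
// in the index settings.
const attributesForFaceting = ["lang", "version", "tags"];

const example = buildBaseRecord({
  url: "https://example.com/docs/v2/install",
  title: "",
  heading: "Install",
  lang: "fr",
  version: "v2",
  tags: ["setup"],
});
```

Keeping the facet list next to the record builder makes it easier to notice when a new attribute is added to records but not declared in attributesForFaceting.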
Example configuration
// In your Crawler config:
// actions: [ ...,
{
  indexName: "my-markdown-index",
  pathsToMatch: ["https://example.com/**"],
  recordExtractor: ({ $, url, helpers }) => {
    // Target only the main content, excluding navigation
    const text = helpers.markdown(
      "main > *:not(nav):not(header):not(.breadcrumb)",
    );
    if (text === "") return [];
    const language = $("html").attr("lang") || "en";
    const title = $("head > title").text();
    // Get the main heading for better searchability
    const h1 = $("main h1").first().text();
    return helpers.splitTextIntoRecords({
      text,
      baseRecord: {
        url,
        objectID: url,
        title: title || h1,
        heading: h1, // Add main heading as separate field
        lang: language,
      },
      maxRecordBytes: 100000, // Higher = fewer, larger records. Lower = more, smaller records.
      // Note: Increasing this value may increase the token count for LLMs, which can affect context size and cost.
      orderingAttributeName: "part",
    });
  },
},
// ...],
// initialIndexSettings: { ...,
"my-markdown-index": {
  attributesForFaceting: ["lang"], // Recommended if you add more attributes outside of objectID
  ignorePlurals: true,
  minProximity: 1,
  removeStopWords: false,
  searchableAttributes: ["title", "heading", "unordered(text)"],
  removeWordsIfNoResults: "lastWords",
  attributesToHighlight: ["title", "text"],
  typoTolerance: false,
  advancedSyntax: false,
},
// ...},