Ask AI’s token usage, and your costs, depend on how much content it sends to the large language model (LLM). To reduce tokens while maintaining response quality, apply these strategies.
| Strategy | How it helps | How to apply |
| --- | --- | --- |
| Reduce hits per LLM request | Ask AI includes 7 search results ("hits") by default in each LLM request. While this provides useful context, it also increases the number of tokens sent, raising costs. Reducing the number of hits lowers token usage. | In your Ask AI assistant configuration, change **Set a maximum number of search hits per LLM request**. |
| Split records into smaller chunks | Smaller record chunks ensure the LLM receives only relevant context. For example, split long documentation pages into smaller records based on headings. | See Markdown indexing. |
| Shorten large records | Return only the most relevant excerpts from records instead of the full content. | Use the `attributesToSnippet` parameter or configure it in the Algolia dashboard. |
| Limit record size | Prevent overly large records from inflating token usage. Set a maximum record size to truncate long records before indexing. | Use the `maxRecordBytes` parameter when indexing content. |
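The heading-based splitting strategy can be sketched as a small pre-indexing helper. This is an illustrative function, not part of the Algolia API; the record shape (`title`, `content`, `url`) is an assumption for the example.

```javascript
// Sketch: split a Markdown page into one record per heading section,
// so each Ask AI hit carries only a focused chunk of context.
// Illustrative helper only -- not an Algolia API.
function splitByHeadings(markdown, pageUrl) {
  const sections = [];
  let current = { title: '', content: [], url: pageUrl };
  for (const line of markdown.split('\n')) {
    const heading = line.match(/^#{1,6}\s+(.*)/);
    if (heading) {
      // Start a new record at every heading; keep the previous one if non-empty.
      if (current.title || current.content.length) sections.push(current);
      current = { title: heading[1], content: [], url: pageUrl };
    } else {
      current.content.push(line);
    }
  }
  sections.push(current);
  return sections.map((s) => ({
    title: s.title,
    content: s.content.join('\n').trim(),
    url: s.url,
  }));
}
```

Each returned object becomes one index record, so a long page yields several small records instead of a single large one.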
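For the snippeting strategy, a minimal settings sketch might look like the following. The attribute name `content` and the 40-word budget are placeholder assumptions; `attributesToSnippet` and `snippetEllipsisText` are standard Algolia index settings.

```javascript
// Sketch: index settings that return short snippets instead of full records.
// Attribute names and lengths here are placeholders, not values from this doc.
const settings = {
  // Return at most ~40 words of the `content` attribute per hit,
  // centered on the query match, instead of the whole attribute.
  attributesToSnippet: ['content:40'],
  // Marker inserted where the snippet was cut off.
  snippetEllipsisText: '…',
};

// With an Algolia JavaScript client these settings would typically be
// applied to the index, e.g. via a setSettings call.
```

The same settings can be configured in the Algolia dashboard under the index's configuration tab.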
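The record-size limit can be illustrated with a local truncation helper. This is only a sketch of the idea behind a byte budget; in practice the `maxRecordBytes` option is applied by the indexing tool itself, and the `content` field name is an assumption.

```javascript
// Sketch: truncate a record's `content` so the serialized record stays
// under a byte budget before indexing. Illustrative only -- the real
// maxRecordBytes option is handled by the indexing pipeline.
function truncateRecord(record, maxBytes) {
  const encoder = new TextEncoder();
  const size = (r) => encoder.encode(JSON.stringify(r)).length;
  if (size(record) <= maxBytes) return record;
  let content = record.content;
  // Shrink `content` until the whole serialized record fits the budget.
  while (content.length > 0) {
    content = content.slice(0, -50);
    const candidate = { ...record, content };
    if (size(candidate) <= maxBytes) return candidate;
  }
  return { ...record, content: '' };
}
```

Truncating before indexing keeps every hit sent to the LLM within a predictable token range.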