
Data Sanitization

Algolia accepts all data without altering it. The same applies to responses: Algolia returns the data in your index as is, including, for example, HTML and XML tags and their properties.

However, Algolia’s search algorithm ignores HTML and XML tags. Users can’t search for tag names or their properties.

For example, Algolia can save a record that contains the HTML tag <strong>.

{
  "description": "She is amazingly <strong>powerful</strong>, deeply visionary."
}

However, because the engine strips tags during search, searching for the word “strong” wouldn’t return this record.
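
To make the effect concrete, here is a rough illustration in Python. It is not Algolia's actual implementation: it assumes tags are removed with a simple regular expression before the text is split into words, and the helper name `searchable_words` is hypothetical.

```python
import re

# Naive pattern for an HTML/XML tag. This is an assumption made for
# illustration only, not the rule the engine really applies.
TAG_PATTERN = re.compile(r"<[^>]+>")


def searchable_words(text: str) -> set:
    """Return the lowercase words left after removing tags from the text."""
    without_tags = TAG_PATTERN.sub(" ", text)
    return set(re.findall(r"\w+", without_tags.lower()))


record = {
    "description": "She is amazingly <strong>powerful</strong>, deeply visionary."
}
words = searchable_words(record["description"])

print("powerful" in words)  # True: the text between the tags stays searchable
print("strong" in words)    # False: the tag name itself is dropped
```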

Some characters are systematically removed (not escaped) from the API’s response:

Cleaning your indices

Since Algolia doesn’t sanitize your data and returns it as is, you need to sanitize it yourself. Otherwise, you risk a cross-site scripting (XSS) attack.

To avoid this, you can escape or strip dangerous characters at one of two points: before indexing, or when displaying results.
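
As a minimal sketch of both options, the following Python example uses the standard library's `html.escape`. The helper name `escape_record` is hypothetical, it only handles flat records (nested objects and arrays would need a recursive version), and the step that actually sends records to Algolia is left out.

```python
import html


def escape_record(record: dict) -> dict:
    """Return a copy of the record with every top-level string value HTML-escaped."""
    return {
        key: html.escape(value) if isinstance(value, str) else value
        for key, value in record.items()
    }


record = {
    "description": "She is amazingly <strong>powerful</strong>, deeply visionary."
}

# Option 1: escape before indexing, so the stored value is already safe to display.
safe_record = escape_record(record)

# Option 2: index the raw record and escape each value when rendering a result.
rendered = "<p>" + html.escape(record["description"]) + "</p>"

print(safe_record["description"])
print(rendered)
```

Escaping at display time leaves the stored data unchanged, while escaping before indexing means every consumer of the index receives already-escaped text.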

Cleaning your users’ search input

It’s also essential to sanitize what users type in the search input. Any HTML or code they enter in the search bar exposes you to an XSS attack, because Algolia sends the query back in the API response. Escape or strip tags and code from the query before displaying it.
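
The same idea applies to the echoed query. The sketch below assumes a server-rendered page that shows the query back to the user. The function name `render_query_summary` and the surrounding markup are hypothetical.

```python
import html


def render_query_summary(query: str) -> str:
    """Build an HTML snippet that echoes the user's query with markup escaped."""
    return "<p>Results for: " + html.escape(query) + "</p>"


# A query that would run a script if it were inserted into the page unescaped.
malicious_query = '<script>alert("xss")</script>'
summary = render_query_summary(malicious_query)

print(summary)
# <p>Results for: &lt;script&gt;alert(&quot;xss&quot;)&lt;/script&gt;</p>
```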
