Ruby Documentation

Algolia Search is a search API that provides hosted full-text, numerical and faceted search. Algolia's Search API makes it easy to deliver a great search experience in your apps & websites providing:

  • REST and JSON-based API
  • search among infinite attributes from a single searchbox
  • instant-search after each keystroke
  • relevance & popularity combination
  • typo-tolerance in any language
  • faceting
  • 99.99% SLA
  • first-class data security

This Ruby client wraps our REST API and lets you easily use the Algolia Search API from your app.

Notes: Our API clients are open source. You can check the reference documentation, source code and unit tests on GitHub.

Setup

Install the algoliasearch gem.

gem install algoliasearch

Initialization

The client must be initialized with your Algolia credentials. You can find them in your Algolia Account page.

require 'rubygems'
require 'algoliasearch'

Algolia.init :application_id => "YourApplicationID",
             :api_key        => "YourAPIKey"

Initial import

To index your existing data for the first time, you need to iterate through your database and send each row to Algolia's servers. You can use our API clients or one of our connectors.

Using the API

The following lines load all your database records and send them to Algolia's servers.

Notes: Are you a Ruby on Rails user? You may prefer to have a look at our Rails integration.

Algolia.init :application_id => "YourApplicationID", :api_key => "YourAPIKey"

def load_data_from_database
  records = []
  # [...]
  return records
end

index = Algolia::Index.new("YourIndexName")
# `load_data_from_database` must return an array of Hash representing your objects
load_data_from_database.each_slice(1000) do |batch|
  index.add_objects(batch)
end

Notes: We recommend sending your records in batches of 1,000 or 10,000 records (depending on their size): this reduces the number of network calls and increases the overall indexing performance.
Please name your records' identifier objectID. When an object already exists for the specified objectID, the object is replaced entirely: existing attributes that are not part of the new object are deleted.
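
As a rough illustration of the two notes above, the following sketch turns database rows into records carrying an objectID and groups them into batches. The helper name to_batches and the :id key of the rows are assumptions made for this example; they are not part of the API client.

```ruby
# Hypothetical helper: rows are assumed to be hashes with an :id key.
# The :id value is moved to :objectID, then records are grouped in batches.
def to_batches(rows, batch_size = 1000)
  records = rows.map do |row|
    attributes = row.reject { |key, _| key == :id }
    { :objectID => row[:id] }.merge(attributes)
  end
  records.each_slice(batch_size).to_a
end

sample_rows = (1..2500).map { |i| { :id => i, :title => "Show #{i}" } }
sample_batches = to_batches(sample_rows)
puts sample_batches.map(&:size).inspect
# Each batch could then be sent with index.add_objects(batch).
```

With 2,500 rows and the default batch size, this yields three batches, so only three network calls would be needed instead of 2,500.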

Sample record

{
  "objectID": 42,                 // record identifier (needed for updates)
  "title": "Breaking Bad",        // string attribute
  "episodes": [                   // array of strings attribute
    "Crazy Handful of Nothin'",
    "Gray Matter"
  ],
  "like_count": 978,              // numerical attribute
  "avg_stuff": 1.23456,
  "actors": [                     // nested objects attribute
    {
      "name": "Walter White",
      "portrayed_by": "Bryan Cranston"
    },
    {
      "name": "Skyler White",
      "portrayed_by": "Anna Gunn"
    }
  ]
}

Using the JDBC connector beta

This connector synchronizes your existing SQL database with Algolia's indexes without requiring you to write a single line of code in your application or website. MySQL, PostgreSQL and Sqlite3 are currently supported.

First, it will list all your rows with the selectQuery and perform the initial indexing. Then, every refreshRate seconds, it will look for updated rows using the updateQuery and send the updates to Algolia. Deletions will be detected every deleteRate minutes performing a full scan of both your database and index.

The full documentation is available on Github: algolia/jdbc-java-connector

For example, to index the whole content of table mytable from a MySQL database MYDB, use the following commands:

$> curl -fsSL https://raw.github.com/algolia/jdbc-java-connector/master/dist/jdbc-connector.sh > jdbc-connector.sh
$> ./jdbc-connector.sh --source "jdbc:mysql://localhost/MYDB" --username USER --password PASSWORD --selectQuery "SELECT * FROM mytable" --updateQuery "SELECT * FROM mytable WHERE updated_at > _$" --primaryField id --updatedAtField updated_at --applicationId APPLICATION_ID --apiKey API_KEY --index INDEX_NAME

Using the MongoDB connector beta

This connector synchronizes your existing MongoDB database with Algolia's indexes without requiring you to write a single line of code in your application or website.

The full documentation is available on Github: algolia/mongo-connector

Warning: Your mongod server must be started with --replSet, which enables the connector to act as a replica and allows it to read the operations log.

$> git clone https://github.com/algolia/mongo-connector.git
$> cd mongo-connector && git checkout algolia
$> virtualenv ./env # we recommend using virtualenv to get a clean environment
$> ./env/bin/python setup.py install
$> ./env/bin/python ./mongo_connector/connector.py -m localhost:27017 -n myDb.myCollection -d ./mongo_connector/doc_managers/algolia_doc_manager.py -t MyApplicationID:MyApiKey:MyIndex

Note: If you're using a hosted MongoDB instance (MongoLab, ...), you'll probably need to be on a dedicated cluster so that you have access to the admin collection. In fact, the user used by the connector has to be in the admin table (-a and -p options).

Ranking & Relevance

Each index is configured through a set of settings. Index settings can either modify the way records are indexed or set default query parameters.

We expose several settings allowing you to tune your overall index relevancy, but the most important ones are:

  • attributesToIndex: the ordered subset of indexed attributes, used to query your records.
  • customRanking: the list of attributes representing the record popularity (can be a rating, a number of views, an amount of revenue, ...) and their associated sort order.

Warning: you must edit these settings to get results from the first keystroke.

Using the admin interface

  1. Log into your Algolia Dashboard
  2. Go to the Indexes menu
  3. Select the index using the dropdown menu
  4. Go to the Settings tab

Using the API

To do it programmatically using the API client, use the following code:

index.set_settings({
  :attributesToIndex => ["name", "description", "url"],
  :customRanking => ["desc(vote_count)", "asc(name)"]
})
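
To make the meaning of the customRanking value above more concrete, here is a small local sketch that orders records the way ["desc(vote_count)", "asc(name)"] reads: by vote_count descending, then by name ascending to break ties. The helpers parse_custom_ranking and sort_by_custom_ranking are hypothetical and only illustrate the ordering; in the engine, this ordering is applied through the custom ranking criterion rather than by client-side code.

```ruby
# Parse "asc(attr)" / "desc(attr)" strings into [direction, attribute] pairs.
def parse_custom_ranking(criteria)
  criteria.map do |criterion|
    match = criterion.match(/\A(asc|desc)\((.+)\)\z/)
    raise ArgumentError, "invalid criterion: #{criterion}" unless match
    [match[1].to_sym, match[2]]
  end
end

# Compare records criterion by criterion; later criteria only break ties.
# Attribute values are assumed to be present and mutually comparable.
def sort_by_custom_ranking(records, criteria)
  rules = parse_custom_ranking(criteria)
  records.sort do |a, b|
    result = 0
    rules.each do |direction, attribute|
      result = a[attribute] <=> b[attribute]
      result = -result if direction == :desc
      break unless result.zero?
    end
    result
  end
end

sample = [
  { "name" => "b", "vote_count" => 10 },
  { "name" => "a", "vote_count" => 10 },
  { "name" => "c", "vote_count" => 42 }
]
puts sort_by_custom_ranking(sample, ["desc(vote_count)", "asc(name)"]).inspect
```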

Index Settings

For a detailed overview of all available index settings, please refer to the following table:

Indexing parameters
Name Type Description
attributesToIndex string array The list of fields you want to index. If set to null, all textual attributes of your objects are indexed, but you should update it to get optimal results. This parameter has two important uses:
  • Limit the attributes to index. For example if you store a binary image in base64, you want to store it and be able to retrieve it but you don't want to search in the base64 string.
  • Control part of the ranking (see the ranking parameter for a full explanation). Matches in attributes at the beginning of the list are considered more important than matches in attributes further down the list. Within one attribute, matching text at the beginning of the attribute is considered more important than text further in. You can disable this behavior by wrapping the attribute in unordered(AttributeName), for example attributesToIndex:["title", "unordered(text)"].
Notes: All numerical attributes are automatically indexed as numerical filters. If you don't need filtering on some of your numerical attributes, please consider sending them as strings to speed up the indexing.
You can decide to have the same priority for several attributes by passing them in the same string using comma as separator. For example title and alternative_title have the same priority in this example: attributesToIndex:["title,alternative_title", "text"]
attributesForFaceting string array The list of fields you want to use for faceting. All strings in the attribute selected for faceting are extracted and added as a facet. By default, no attribute is used for faceting.
attributeForDistinct string The attribute name used for the Distinct feature. This feature is similar to the SQL "distinct" keyword: when enabled in query with the distinct=1 parameter, all hits containing a duplicate value for this attribute are removed from results. For example, if the chosen attribute is show_name and several hits have the same value for show_name, then only the best one is kept and others are removed.
Note: This feature is disabled if the query string is empty and no tagFilters, facetFilters or numericFilters parameter is set.
ranking string array Controls the way results are sorted. We have nine available criteria:
  • typo: sort according to number of typos,
  • geo: sort according to decreasing distance when performing a geo-location based search,
  • words: sort according to the number of query words matched by decreasing order. This parameter is useful when you use optionalWords query parameter to have results with the most matched words first.
  • proximity: sort according to the proximity of query words in hits,
  • attribute: sort according to the order of attributes defined by attributesToIndex,
  • exact:
    • if the user query contains one word: sort objects having an attribute that is exactly the query word before others. For example, if you search for the "V" TV show, you want to find it with the "V" query and avoid having all the popular TV shows starting with the letter v ranked before it.
    • if the user query contains multiple words: sort according to the number of words that matched exactly (and not as a prefix).
  • custom: sort according to a user defined formula set in customRanking attribute.
  • asc(attributeName): sort according to a numeric attribute by ascending order. attributeName can be the name of any numeric attribute of your records (integer, a double or boolean).
  • desc(attributeName): sort according to a numeric attribute by descending order. attributeName can be the name of any numeric attribute of your records (integer, a double or boolean).
The standard order is ["typo", "geo", "words", "proximity", "attribute", "exact", "custom"].
customRanking string array Lets you specify part of the ranking. The syntax of this condition is an array of strings containing attributes prefixed by asc (ascending order) or desc (descending order) operator. For example "customRanking": ["desc(population)", "asc(name)"]
separatorsToIndex string Specify the separators (punctuation characters) to index. By default, separators are not indexed. Use "+#" to be able to search Google+ or C#.
slaves string array The list of indexes on which you want to replicate all write operations. In order to get response times in milliseconds, we pre-compute part of the ranking during indexing. If you want to use different ranking configurations depending on the use case, you need to create one index per ranking configuration. This option enables you to perform write operations only on this index, and to automatically update slave indexes with the same operations.
Query expansion
Name Type Description
synonyms array of string arrays An array of arrays of words considered as equal. For example, you may want to retrieve your black ipad record when your users are searching for dark ipad, even if the dark word is not part of the record: so you need to configure dark as a synonym of black. For example "synonyms": [ [ "black", "dark" ], [ "small", "little", "mini" ], ... ].
placeholders hash(string, array of strings) This is an advanced use case to define a token substitutable by a list of words without having the original token searchable. It is defined by a hash associating placeholders to lists of substitutable words. For example, "placeholders": { "<streetnumber>": ["1", "2", "3", ..., "9999"]} lets you match all street numbers (we use the < > tag syntax to define placeholders in an attribute). For example:
  • Push a record with the placeholder: { "name" : "Apple Store", "address" : "<streetnumber> Opera street, Paris" }
  • Configure the placeholder in your index settings: "placeholders": { "<streetnumber>" : ["1", "2", "3", "4", "5", ... ], ... }.
disableTypoToleranceOn string array Specify a list of words on which the automatic typo tolerance will be disabled.
altCorrections object array Specify alternative corrections that you want to consider. Each alternative correction is described by an object containing three attributes:
  • word the word to correct
  • correction the corrected word
  • nbTypos the number of typos (1 or 2) that will be considered for the ranking algorithm (1 typo is better than 2 typos)
For example: "altCorrections": [ { "word": "foot", "correction": "feet", "nbTypos": 1 }, { "word": "feet", "correction": "foot", "nbTypos": 1 } ].

Warning: The synonyms and placeholders features only support single-word expansions.
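
The synonyms setting can be pictured with a short local sketch: each query word is replaced by the group of words considered equal to it. The helpers expand_word and expand_query are hypothetical names used for illustration only; the actual expansion happens inside the engine, and this sketch deliberately handles single words only, in line with the warning above.

```ruby
# Return the group of words considered equal to `word`, or the word alone.
def expand_word(word, synonym_groups)
  group = synonym_groups.find { |words| words.include?(word) }
  group || [word]
end

# Expand every word of the query into its list of alternatives.
def expand_query(query, synonym_groups)
  query.split.map { |word| expand_word(word.downcase, synonym_groups) }
end

sample_synonyms = [["black", "dark"], ["small", "little", "mini"]]
puts expand_query("dark ipad", sample_synonyms).inspect
```

In this sketch, the query dark ipad can match a record containing black ipad because dark expands to both black and dark.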

Default query parameters (can be overridden at query-time)
Name Type Description
minWordSizefor1Typo integer The minimum number of characters to accept one typo (default = 3).
minWordSizefor2Typos integer The minimum number of characters to accept two typos (default = 7).
hitsPerPage integer The number of hits per page (default = 10).
attributesToRetrieve string array Default list of attributes to retrieve in objects. If set to null, all attributes are retrieved.
attributesToHighlight string array Default list of attributes to highlight. If set to null, all indexed attributes are highlighted.
attributesToSnippet string array Default list of attributes to snippet alongside the number of words to return (syntax is 'attributeName:nbWords'). If set to null, no snippet is computed.
queryType string Select how the query words are interpreted. It can be one of the following values:
  • prefixAll : all query words are interpreted as prefixes,
  • prefixLast : only the last word is interpreted as a prefix (default behavior),
  • prefixNone : no query word is interpreted as a prefix. This option is not recommended.
highlightPreTag string Specify the string that is inserted before the highlighted parts in the query result (defaults to "<em>").
highlightPostTag string Specify the string that is inserted after the highlighted parts in the query result (defaults to "</em>").
optionalWords array of string Specify a list of words that should be considered as optional when found in the query.
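
The minWordSizefor1Typo and minWordSizefor2Typos settings above can be read as length thresholds. The following sketch shows one plausible reading, in which a word must have at least the given number of characters before one or two typos are accepted. The helper allowed_typos is hypothetical, and the exact rule applied by the engine is an assumption here.

```ruby
# Number of typos tolerated for a query word, given the two thresholds.
# Defaults mirror the documented defaults (3 and 7).
def allowed_typos(word, min_for_1_typo = 3, min_for_2_typos = 7)
  return 2 if word.length >= min_for_2_typos
  return 1 if word.length >= min_for_1_typo
  0
end

%w[tv jim jimmie breaking].each do |word|
  puts "#{word}: #{allowed_typos(word)} typo(s) accepted"
end
```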

Incremental updates

To keep your index up-to-date with the latest additions and changes on your application or website, you need to call the update object function for new or changed objects, and the delete object function for deleted objects.

To save your new or updated objects, use the following code:

# update the record with objectID="myID1"
# the record is created if it doesn't exist
index.save_object({ :name => "Jimmy", :company => "Paint Inc." }, "myID1")

To update a subset of the attributes of an object, use the following code:

# update (or add if it doesn't exist) the attribute "firstname" of object "myID1"
index.partial_update_object({ :firstname => "Jimmie", :objectID => "myID1" })

To delete your deleted objects, use the following code:

# delete the record with objectID="myID1"
index.delete_object("myID1")
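
If you do not track changes as they happen, one way to decide which calls to make is to compare a previous snapshot of your data with the current one. The sketch below is a minimal illustration under that assumption: both snapshots are hashes keyed by objectID, and plan_sync is a hypothetical helper, not a function of the API client.

```ruby
# Compare two snapshots keyed by objectID and list what needs to be sent.
def plan_sync(previous, current)
  to_save = current.reject { |id, record| previous[id] == record }.keys
  to_delete = previous.keys - current.keys
  { :save => to_save, :delete => to_delete }
end

before = {
  "myID1" => { :name => "Jimmy" },
  "myID2" => { :name => "Jim" },
  "myID3" => { :name => "Joe" }
}
after = {
  "myID1" => { :name => "Jimmie" },  # changed
  "myID2" => { :name => "Jim" },     # unchanged
  "myID4" => { :name => "Jane" }     # new
}
puts plan_sync(before, after).inspect
# Objects listed under :save could be sent with index.save_object,
# and ids listed under :delete with index.delete_object.
```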

Reindexing

Clear an index

Clearing an index removes all its records and keeps its settings.

To clear an index, use the following code:

index.clear_index

Atomic reindexing

In some cases, you may want to totally change the way your index is structured and need to reindex all your data. In order to keep your existing service running while re-importing your data, we recommend using a temporary index plus an atomic move.

# import all your data to a temporary `YourIndexName_temp` index
# [...]

# rename the temporary index to its final name
Algolia.move_index('YourIndexName_temp', 'YourIndexName')

Search

A search is composed of a full-text query and optional query parameters. If the query is empty, all records will match. Query parameters can be used to override default query-time index settings.

We recommend storing all the attributes needed to display a search result (hit) directly in your index (even if those attributes are not used for search) in order to bypass your database when displaying the results page. The results page will render faster and your database will be offloaded.

We recommend the usage of our JavaScript client to perform queries directly from the end-user's browsers without going through your servers. It will both reduce the overall search latency and offload your servers.

Using the JavaScript Client

The latest version of the JavaScript client can be downloaded from GitHub: algolia/algoliasearch-client-js/master/dist/algoliasearch.min.js (do not link to this URL directly; copy the file to your project instead, because GitHub serves it with a text/plain content type, which breaks any JavaScript include from raw.github.com).

The full documentation is available on Github: algolia/algoliasearch-client-js

Security: Since your Algolia credentials will be accessible through your JavaScript source code, you must use the search only API_KEY we have provided to you. This API_KEY doesn't allow any write operations on your indexes.

Set up the JavaScript API client and index using the following code:

<script type="text/javascript" src="//rawgithub.com/algolia/algoliasearch-client-js/master/dist/algoliasearch.min.js"></script>
<script type="text/javascript">
  var client = new AlgoliaSearch('YourApplicationID', 'YourSearchOnlyAPIKey'); // public credentials
  var index = client.initIndex('YourIndexName');
</script>

Notes: All answers are received asynchronously, which is why the JavaScript client makes extensive use of callbacks.

You can perform queries using the following code:

<script type="text/javascript">
  function searchCallback(success, content) {
    if (success) {
      for (var i = 0; i < content.hits.length; ++i) {
        console.log(content.hits[i]);
      }
    }
  }

  // perform query "jim"
  index.search("jim", searchCallback);

  // the last optional argument can be used
  // to add search parameters
  index.search("jim", searchCallback, { hitsPerPage: 5, facets: '*' });
</script>

The content variable contains the JSON answer:

{
  // array of matched hits
  "hits": [
    {
      "objectID": "1",
      "name": "Jim",
      // [... other attributes ...]
      "_highlightResult": {
        // [...]
        // See "Highlighting" section
      }
    },
    {
      "objectID": "2",
      "name": "Jimmie",
      // [... other attributes ...]
      "_highlightResult": {
        // [...] 
        // See "Highlighting" section
      }
    }
  ],
  // current page number
  "page": 0,
  // total number of matched hits in the index
  "nbHits": 2,
  // total number of accessible pages
  "nbPages": 1,
  // number of hits per page
  "hitsPerPage": 20,
  // backend processing time (in milliseconds)
  "processingTimeMS": 1,
  // full-text query
  "query": "jim",
  // query parameters
  "params": "query=jim"
}

Using the Ruby Client

To perform queries from your backend, use the following code:

index = Algolia::Index.new("YourIndexName")
answer = index.search("jim")
answer = index.search("jim", { hitsPerPage: 5 })
answer = index.search("jim", { hitsPerPage: 5, facets: '*' })
# answer object contains a "hits" attribute that contains all results
# each result contains your attributes and a _highlightResult attribute that contains highlighted version of your attributes

Query multiple indexes

You can query multiple indexes with a single API call by building a batch of queries.

<script type="text/javascript">
  function searchMultiCallback(success, content) {
    if (success) {
      var categories = content.results[0];
      for (var i = 0; i < categories.hits.length; ++i) {
        console.log(categories.hits[i]);
      }

      var products = content.results[1];
      for (var i = 0; i < products.hits.length; ++i) {
        console.log(products.hits[i]);
      }
    }
  }

  var query = "appl"; // Your query

  client.startQueriesBatch();
  client.addQueryInBatch("categories", query, { hitsPerPage: 3 });
  client.addQueryInBatch("products", query, { hitsPerPage: 5 });
  client.sendQueriesBatch(searchMultiCallback);
</script>

Highlighting

Highlighting is an important part of a good search experience. It contributes to the feeling of interactivity when typing and also explains the resulting hits, even when a match includes a typo. Users must feel that they obtained the hits they asked for. When you perform a query, Algolia automatically adds a _highlightResult attribute to the result objects. This attribute replicates each object's structure with highlighting information. Each string attribute is replaced by a JSON object containing:

  • value: the original string with matched elements between highlighting tags (<em> and </em> by default) - note that prefixes and typos are also highlighted,
  • matchLevel: a value that is set to full if all the query terms were found in the attribute, partial if only some of the query terms were found or none if none of the query terms were found,
  • matchedWords: an array of strings that indicates which query terms were found in the attribute.

For example, if your index contains the following objects:

{
  "title": "Apple MacBook Pro 15.4-Inch Laptop with Retina Display",
  "technical_details": [
    "2.3 GHz Intel Core-i7 quad-core processor (Turbo Boost up to 3.5GHz) with 6MB shared L3 cache",
    "512 GB PCIe-based flash storage; 16GB 1600MHz DDR3L onboard memory",
    "15.4-inch (diagonal) Retina display, 2880x1800 pixel Resolution; LED-backlit with IPS technology",
    "NVIDIA GeForce GT 750M with 2GB of GDDR5 memory",
    "Mac OS X Mavericks, Up to 8 hours of battery life"
  ],
  "url": "http://www.amazon.com/Apple-MacBook-ME294LL-15-4-Inch-Display/dp/B0096VD85I"
},
{
  "title": "Apple iPhone 5S - 32GB",
  "technical_details": [
    "4-inch Retina display",
    "A7 chip with M7 motion coprocessor",
    "Touch ID fingerprint sensor",
    "New 8MP iSight camera with True Tone flash and 1080p HD video recording"
  ],
  "url": "http://www.amazon.com/Apple-iPhone-5s-32GB-Space/dp/B00F3J4KYA"
},
...

The iphone finger query will return one result with the _highlightResult attribute containing the following highlighting information:

{
  "hits": [
    {
      "title": "Apple iPhone 5S - 32GB",
      "technical_details": [
        "4-inch Retina display",
        "A7 chip with M7 motion coprocessor",
        "Touch ID fingerprint sensor",
        "New 8MP iSight camera with True Tone flash and 1080p HD video recording"
      ],
      "url": "http://www.amazon.com/Apple-iPhone-5s-32GB-Space/dp/B00F3J4KYA",
      "objectID": "2",
      "_highlightResult": {
        "title": {
          "value": "Apple <em>iPhone</em> 5S - 32GB", // highlighted
          "matchLevel": "partial",
          "matchedWords": [
            "iphone"
          ]
        },
        "technical_details": [
          {
            "value": "4-inch Retina display",
            "matchLevel": "none",
            "matchedWords": []
          },
          {
            "value": "A7 chip with M7 motion coprocessor",
            "matchLevel": "none",
            "matchedWords": []
          },
          {
            "value": "Touch ID <em>finger</em>print sensor", // highlighted
            "matchLevel": "partial",
            "matchedWords": [
              "finger"
            ]
          },
          {
            "value": "New 8MP iSight camera with True Tone flash and 1080p HD video recording",
            "matchLevel": "none",
            "matchedWords": []
          }
        ],
        "url": {
          "value": "http://www.amazon.com/Apple-<em>iPhone</em>-5s-32GB-Space/dp/B00F3J4KYA", // highlighted
          "matchLevel": "partial",
          "matchedWords": [
            "iphone"
          ]
        }
      }
    }
  ],
  "page": 0,
  "nbHits": 1,
  "nbPages": 1,
  "hitsPerPage": 20,
  "processingTimeMS": 1,
  "query": "iphone finger",
  "params": "query=iphone+finger"
}
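
Because _highlightResult replicates the structure of the object, a small recursive walk is enough to collect the attributes that actually matched. The sketch below is one possible approach; matched_highlights is a hypothetical helper, and it assumes the result has been parsed into Ruby hashes and arrays with string keys. It would misread a record attribute that itself has value and matchLevel keys.

```ruby
# Collect [path, highlighted value] pairs for every string attribute whose
# matchLevel is not "none". Array positions are added to the path.
def matched_highlights(node, path = [])
  case node
  when Array
    node.each_with_index.flat_map { |child, i| matched_highlights(child, path + [i]) }
  when Hash
    if node.key?("value") && node.key?("matchLevel")
      node["matchLevel"] == "none" ? [] : [[path.join("."), node["value"]]]
    else
      node.flat_map { |key, child| matched_highlights(child, path + [key]) }
    end
  else
    []
  end
end

sample_highlight = {
  "title" => { "value" => "Apple <em>iPhone</em> 5S - 32GB", "matchLevel" => "partial", "matchedWords" => ["iphone"] },
  "technical_details" => [
    { "value" => "4-inch Retina display", "matchLevel" => "none", "matchedWords" => [] },
    { "value" => "Touch ID <em>finger</em>print sensor", "matchLevel" => "partial", "matchedWords" => ["finger"] }
  ]
}
matched_highlights(sample_highlight).each { |path, value| puts "#{path}: #{value}" }
```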

Query parameters

Each query can be configured by setting query parameters. Query parameters control:

  • the way full-text search queries are interpreted,
  • the results pagination,
  • the applied filters (either numerical or category-based),
  • the faceting,
  • the grouping,
  • or the geo-localization.

For a detailed overview of all available query parameters, please refer to the following table:

Full Text Search parameters
Name Type Description
query string The instant-search query string. All words of the query are interpreted as prefixes (for example "John Mc" will match "John Mccamey" and "Johnathan Mccamey"). If no query parameter is set, retrieves all objects.
queryType string Selects how the query words are interpreted:
  • prefixAll: all query words are interpreted as prefixes,
  • prefixLast: only the last word is interpreted as a prefix (default behavior),
  • prefixNone: no query word is interpreted as a prefix. This option is not recommended.
typoTolerance boolean If set to false, disable the typo-tolerance. Defaults to true.
minWordSizefor1Typo integer The minimum number of characters in a query word to accept one typo in this word. Defaults to 3.
minWordSizefor2Typos integer The minimum number of characters in a query word to accept two typos in this word. Defaults to 7.
allowTyposOnNumericTokens boolean If set to false, disable typo-tolerance on numeric tokens (numbers). Defaults to true.
restrictSearchableAttributes string List of attributes you want to use for textual search (must be a subset of the attributesToIndex index setting). Attributes are separated with a comma (for example "name,address" ), you can also use a JSON string array encoding (for example encodeURIComponent("[\"name\",\"address\"]") ). By default, all attributes specified in attributesToIndex settings are used to search.
advancedSyntax boolean Enable the advanced query syntax. Defaults to 0 (false).
  • Phrase query: a phrase query defines a particular sequence of terms. A phrase query is built by Algolia's query parser for words surrounded by ". For example, "search engine" will retrieve records having search next to engine only. Typo-tolerance is disabled on phrase queries.
  • Prohibit operator: The prohibit operator excludes records that contain the term after the - symbol. For example search -engine will retrieve records containing search but not engine.
analytics boolean If set to false, this query will not be taken into account in the analytics feature. Defaults to true.
synonyms boolean If set to false, this query will not use synonyms defined in configuration. Defaults to true.
replaceSynonymsInHighlight boolean If set to false, words matched via synonyms expansion will not be replaced by the matched synonym in the highlight result. Defaults to true.
optionalWords string Specify a list of words that should be considered as optional when found in the query. The list of words is comma separated. This list will be appended to the one defined in your index settings.
Pagination parameters
Name Type Description
page integer Pagination parameter used to select the page to retrieve. Page is zero-based and defaults to 0. Thus, to retrieve the 10th page you need to set page=9
hitsPerPage integer Pagination parameter used to select the number of hits per page. Defaults to 20.
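
Since page is zero-based, a little arithmetic is needed when building a pager. The sketch below uses hypothetical helpers (page_param, page_count and first_hit_position) to convert between the page numbers shown to users and the page parameter. The number of pages actually returned in nbPages is computed by the engine, so page_count is only an estimate.

```ruby
# Convert a 1-based page number shown to users into the zero-based parameter.
def page_param(displayed_page_number)
  displayed_page_number - 1
end

# Estimated number of pages needed to list nb_hits hits.
def page_count(nb_hits, hits_per_page = 20)
  (nb_hits + hits_per_page - 1) / hits_per_page
end

# Zero-based position of the first hit of a given page.
def first_hit_position(page, hits_per_page = 20)
  page * hits_per_page
end

puts page_param(10)          # value to send to retrieve the 10th page
puts page_count(41)          # pages needed for 41 hits at 20 hits per page
puts first_hit_position(2)   # position of the first hit on page=2
```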
Parameters to control results content
Name Type Description
attributesToRetrieve string List of object attributes you want to retrieve (lets you minimize the answer size). Attributes are separated with a comma (for example "name,address" ), you can also use a JSON string array encoding (for example encodeURIComponent("[\"name\",\"address\"]") ). By default, all attributes are retrieved. You can also use * to retrieve all values when an attributesToRetrieve setting is specified for your index.
attributesToHighlight string List of attributes you want to highlight according to the query. Attributes are separated by a comma. You can also use a JSON string array encoding (for example encodeURIComponent("[\"name\",\"address\"]") ). If an attribute has no match for the query, the raw value is returned. By default all indexed text attributes are highlighted. You can use * if you want to highlight all textual attributes. Numerical attributes are not highlighted. A matchLevel is returned for each highlighted attribute and can contain:
  • full: if all the query terms were found in the attribute,
  • partial: if only some of the query terms were found,
  • none: if none of the query terms were found.
attributesToSnippet string List of attributes to snippet alongside the number of words to return (syntax is 'attributeName:nbWords'). Attributes are separated by a comma (example: attributesToSnippet=name:10,content:10). You can also use a JSON string array encoding (example: encodeURIComponent("[\"name:10\",\"content:10\"]")). By default no snippet is computed.
getRankingInfo integer If set to 1, the result hits will contain ranking information in the _rankingInfo attribute.
Numeric search parameters
Name Type Description
numericFilters string The list of numeric filters you want to apply separated by a comma. The syntax of one filter is AttributeName followed by Operator followed by Value .
Supported operators are: < <= = > >= != .

You can easily perform range queries via the : operator (equivalent to combining >= and <= operators), for example numericFilters=price:10 to 1000.

You can also mix OR and AND operators. The OR operator is defined with a parenthesis syntax. For example (code=1 AND (price:[0-100] OR price:[1000-2000])) translates to encodeURIComponent("code=1,(price:0 to 100,price:1000 to 2000)").

You can also use a JSON string array encoding. For example: encodeURIComponent("[\"price>100\",\"code=1\"]")
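
Building the numericFilters string by hand is error-prone, so a tiny helper can assemble it from Ruby arrays. In this sketch, numeric_filters is a hypothetical helper: top-level entries are joined with commas (AND) and a nested array becomes a parenthesized OR group, following the syntax described above. The JSON variant relies only on Ruby's standard json library.

```ruby
require 'json'

# Join filters with commas (AND); nested arrays become "(a,b)" OR groups.
def numeric_filters(filters)
  filters.map { |f| f.is_a?(Array) ? "(#{f.join(',')})" : f }.join(",")
end

puts numeric_filters(["code=1", ["price:0 to 100", "price:1000 to 2000"]])

# JSON string array encoding of a flat list of filters.
puts JSON.generate(["price>100", "code=1"])
```

The resulting string still needs to be URL-encoded when it is sent as a raw query parameter.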
Category search parameter
Name Type Description
tagFilters string Filter the query by a set of tags. You can AND tags by separating them with commas. To OR tags, you must add parentheses. For example: tagFilters=tag1,(tag2,tag3) means tag1 AND (tag2 OR tag3). At indexing, tags should be added in the _tags attribute of objects (for example {"_tags":["tag1","tag2"]}). You can also use a JSON string array encoding, for example encodeURIComponent("[\"tag1\",[\"tag2\",\"tag3\"]]") means tag1 AND (tag2 OR tag3). Negations are supported via the - operator, prefixing the value. For example: tagFilters=tag1,-tag2.
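
The AND, OR and negation semantics of tagFilters can be checked locally with a few lines of Ruby. The sketch below mirrors the JSON array encoding: top-level entries are ANDed, nested arrays are ORed, and a leading - negates a tag. The helpers tag_matches? and matches_tag_filters? are hypothetical and only illustrate the semantics; the filtering itself is done by the engine.

```ruby
# True if a single (possibly negated) tag condition holds for the record tags.
def tag_matches?(record_tags, tag)
  if tag.start_with?("-")
    !record_tags.include?(tag[1..-1])
  else
    record_tags.include?(tag)
  end
end

# Top-level entries are ANDed; a nested array is an OR group.
def matches_tag_filters?(record_tags, filters)
  filters.all? do |filter|
    if filter.is_a?(Array)
      filter.any? { |tag| tag_matches?(record_tags, tag) }
    else
      tag_matches?(record_tags, filter)
    end
  end
end

sample_filters = ["tag1", ["tag2", "tag3"]]   # tag1 AND (tag2 OR tag3)
puts matches_tag_filters?(["tag1", "tag3"], sample_filters)
puts matches_tag_filters?(["tag1"], sample_filters)
```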
Distinct parameter
Name Type Description
distinct boolean Enable the distinct feature (disabled by default) if the attributeForDistinct index setting is set. This feature is similar to the SQL "distinct" keyword: when enabled in a query with the distinct=1 parameter, all hits containing a duplicate value for the attributeForDistinct attribute are removed from results. For example, if the chosen attribute is show_name and several hits have the same value for show_name, then only the best one is kept and others are removed.
Note: This feature is disabled if the query string is empty and no tagFilters, facetFilters or numericFilters parameter is set.
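
The effect of distinct can be pictured as removing duplicates from an already ranked list while keeping the first (best) hit for each value. The sketch below assumes the hits are sorted best-first; apply_distinct is a hypothetical helper and the sample hits are made up for the example.

```ruby
# Keep only the first hit for each value of the given attribute.
# Hits are assumed to be sorted best-first already.
def apply_distinct(hits, attribute)
  hits.uniq { |hit| hit[attribute] }
end

sample_hits = [
  { "objectID" => "1", "show_name" => "Breaking Bad" },
  { "objectID" => "2", "show_name" => "Breaking Bad" },
  { "objectID" => "3", "show_name" => "V" }
]
puts apply_distinct(sample_hits, "show_name").map { |hit| hit["objectID"] }.inspect
```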
Faceting parameters
Name Type Description
facets string List of object attributes that you want to use for faceting. Attributes are separated with a comma (for example "category,author" ). You can also use a JSON string array encoding (for example encodeURIComponent("[\"category\",\"author\"]") ). Only attributes that have been added in attributesForFaceting index setting can be used in this parameter. You can also use * to perform faceting on all attributes specified in attributesForFaceting.
facetFilters string Filter the query by a list of facets. Facets are separated by commas and each facet is encoded as attributeName:value. To OR facets, you must add parentheses. For example: facetFilters=(category:Book,category:Movie),author:John%20Doe. You can also use a JSON string array encoding (for example encodeURIComponent("[[\"category:Book\",\"category:Movie\"],\"author:John Doe\"]") ). Negations are supported via the - operator, prefixing the facet value. For example: encodeURIComponent("[\"category:Book\",\"category:-Movie\",\"author:John Doe\"]") .
maxValuesPerFacet integer Limit the number of facet values returned for each facet. For example: maxValuesPerFacet=10 will retrieve at most 10 values per facet.
Geo-search parameters
Name Type Description
aroundLatLng string Search for entries around a given latitude/longitude (specified as two floats separated by a comma, for example aroundLatLng=47.316669,5.016670). You can specify the maximum distance in meters with the aroundRadius parameter. At indexing time, you should specify the geoloc of an object with the _geoloc attribute (in the form "_geoloc":{"lat":48.853409, "lng":2.348800}).
aroundRadius integer Control the radius associated with an aroundLatLng query, in meters.
aroundPrecision integer Control the precision of an aroundLatLng query, in meters. For example, if you set aroundPrecision=100, two objects that are separated by less than 100 meters will be considered as identical by the geo ranking parameter.
insideBoundingBox string Search for entries inside a given area defined by the two extreme points of a rectangle (defined by 4 floats: p1Lat,p1Lng,p2Lat,p2Lng, for example insideBoundingBox=47.3165,4.9665,47.3424,5.0201). At indexing time, you should specify the geoloc of an object with the _geoloc attribute (in the form "_geoloc":{"lat":48.853409, "lng":2.348800}).

Sorting

To achieve the best performance possible, we pre-compute your sort criteria at indexing time. The consequence of this approach is that your sort criteria are statically defined in your index settings: each index has unique sort criteria. This is an optimization (and one of the main differences we have with other engines) to ensure you will always have outstanding performance at query time.

Default Ranking

Most search engines rank results based on a unique float value that is hard, if not impossible, to decipher. Instead of assigning such a global float score to each result, Algolia's ranking algorithm rates each matching record on several criteria (such as the number of typos or the geo-distance), to which it individually assigns an integer value. Those values are then used to compare matching records against each other, one criterion after another until they differ: it's a kind of tiebreak-based ranking (read more about that in our blog post).

By default, Algolia ranks every matching record by using the following criteria, in the order listed below. The higher up the criterion on the list, the more importance it has on ranking.

  • typo: the number of typos, lower is better
  • geo: if applicable, the distance (in meters) from the user, lower is better
  • words: the number of query-words that matched (applicable only if you're using optionalWords), higher is better
  • proximity: a value reflecting how close the query words are to each other in the record, lower is better
  • attribute: a value reflecting the best matching attribute (combines importance of the attribute and position of the match), lower is better
  • exact: the number of words that are exactly matching (not just prefixes), higher is better
  • custom: the popularity of your record (defined in the customRanking criteria)

Note: You can easily change this order if you want, but we have found that this default order is the best one in 90% of the use cases.
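
To make the tiebreak idea concrete, here is a toy Ruby model of the comparison described above. It is not Algolia's engine: the records and their integer values are made up, and the custom criterion is assumed to be a "higher is better" popularity.

```ruby
# Each hash holds the per-criterion integer values of one matching record.
# The values are made up for illustration.
records = [
  { id: "a", typo: 1, geo: 0, words: 2, proximity: 1, attribute: 0, exact: 2, custom: 10 },
  { id: "b", typo: 0, geo: 0, words: 2, proximity: 3, attribute: 1, exact: 1, custom: 50 },
  { id: "c", typo: 0, geo: 0, words: 2, proximity: 1, attribute: 2, exact: 2, custom: 5 },
  { id: "d", typo: 0, geo: 0, words: 2, proximity: 1, attribute: 2, exact: 2, custom: 90 }
]

# Arrays compare element by element, so the first differing criterion decides.
# "Higher is better" criteria are negated so a plain ascending sort works.
ranked = records.sort_by do |r|
  [r[:typo], r[:geo], -r[:words], r[:proximity], r[:attribute], -r[:exact], -r[:custom]]
end

puts ranked.map { |r| r[:id] }.join(", ")
```

Record "a" comes last because it has a typo, "b" loses to "c" and "d" on proximity, and "c" and "d" are only separated by the final custom criterion.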

Figure: Index Setting ranking

Record Popularity: Custom Ranking

Algolia's default ranking includes a custom criterion which allows you to add business metrics to the relevance calculation. It's the last criterion of the default ranking configuration.

This custom criterion can be configured using the customRanking index setting and is defined by a list of attributes+orders:

  • asc(YourAttributeName): sort using an ascending order
  • desc(YourAttributeName): sort using a descending order

Which information should you use for this setting? Very easy: think of how you would want all your records ranked when the search box is empty. Amazon would probably want to show the products with the highest number of sales first (desc(sale_count)), LinkedIn might want to show the people with the most connections (desc(connection_count)), and TripAdvisor the website rank associated to each place (asc(rank)).

Figure: Index Setting customRanking

Sort by Attribute

If you want to sort your results by a numerical attribute (by price, by date, by number of reviews, etc...) you can add an extra criterion on top of the ranking formula. As a consequence, your sort criterion will be applied before the default text relevance criteria.

We support either ascending or descending order:

  • asc(YourAttributeName): sort using an ascending order
  • desc(YourAttributeName): sort using a descending order

Notes: Only numerical (integers or doubles) attributes can be used here.
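
As a minimal sketch of "adding an extra criterion on top of the ranking formula", the following Ruby builds a ranking setting that sorts by highest price first. The helper name ranking_sorted_by is hypothetical, and the default criteria are copied from the Default Ranking list above.

```ruby
# Default criteria, in the order listed in "Default Ranking".
DEFAULT_RANKING = %w[typo geo words proximity attribute exact custom]

# Put the sort criterion first so it is applied before text relevance.
def ranking_sorted_by(attribute, order = :desc)
  raise ArgumentError, "order must be :asc or :desc" unless [:asc, :desc].include?(order)
  ["#{order}(#{attribute})"] + DEFAULT_RANKING
end

settings = { "ranking" => ranking_sorted_by("price", :desc) }
p settings
# With an initialized client, this hash could be passed to index.set_settings(settings).
```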

Figure: Index Setting ranking, sort by Highest Price

Multiple Sort Criteria

In some use cases you want to be able to sort by different criteria, for example ascending sort on price, descending sort on price, popularity, ...

To achieve the best performance possible and as described before, we pre-compute your sort criteria at indexing-time. The consequence of this approach is that you cannot change the sort criteria at query-time: each index has unique sort criteria. If you want to have two different sorts, you will need to duplicate your data in two indexes and set different sort settings on them. In order to avoid sending your data twice, you can use the master-slave feature.

The master-slave feature allows you to replicate the content of one index (called "master") in other indexes (called "slaves") that can have different settings (in our case different sort criteria). Your code will always push the data to the "master" index.

To declare slave indexes, you need to list their names in the master index settings. The resulting slave indexes will be created with the same content as your master index but have their own settings.
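
The setup described above could be sketched as follows. The index names (products, products_price_asc, products_price_desc) are hypothetical; only the slaves and ranking setting names come from this page.

```ruby
# Hypothetical index names, one slave per sort order.
master_name = "products"
slave_rankings = {
  "products_price_asc"  => "asc(price)",
  "products_price_desc" => "desc(price)"
}
default_ranking = %w[typo geo words proximity attribute exact custom]

# Settings for the master: list the slave index names.
master_settings = { "slaves" => slave_rankings.keys }

# Settings for each slave: its own sort criterion on top of the default ranking.
slave_settings = slave_rankings.map do |name, criterion|
  [name, { "ranking" => [criterion] + default_ranking }]
end.to_h

puts "#{master_name}: #{master_settings.inspect}"
slave_settings.each { |name, settings| puts "#{name}: #{settings.inspect}" }
# With an initialized client, each hash would be applied with
# Algolia::Index.new(name).set_settings(settings).
```

Your indexing code keeps pushing records to the master index only; the query code picks the index that matches the sort order the user selected.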

Figure: Index Setting slaves

Filtering

There are several ways to filter a result set. Depending on the attribute you want to filter by, please consider one of the following methods:

Numeric-Search

Filter by numerical value

Numerical Values Indexing

Algolia supports indexing of numerical values (integers and doubles). This can be used for searching for products in a given price range for example.

To enable it, you need to have objects with numerical attributes (ensure your numerical values are not encoded as strings).

Considering the following object with its price attribute:

{
  "title": "Apple MacBook Pro 15.4-Inch Laptop with Retina Display",
  "price": 2594,
  "url": "http://www.amazon.com/Apple-MacBook-ME294LL-15-4-Inch-Display/dp/B0096VD85I"
},
{
  "title": "Apple iPhone 5S - 32GB",
  "price": 969.99,
  "url": "http://www.amazon.com/Apple-iPhone-5s-32GB-Space/dp/B00F3J4KYA"
},
...

You can search with numeric conditions on the price. We support six operators: <, <=, =, >, >= and !=.

# search only with a numeric filter
puts index.search("", { "numericFilters" => "price>1000" })
# search with a query string and a numeric filter
puts index.search("appl", { "numericFilters" => "price>1000" })

You can easily perform range queries via the : operator (equivalent to combining >= and <= operators):

# search by query string and numeric range
puts index.search("appl", { "numericFilters" => "price:1000 to 3000" })

You can also mix OR and AND operators. The OR operator is defined with a parenthesis syntax (warning: != cannot be ORed).
For example (code=1 AND (price:[1000-3000] OR price:[10-100])) translates to:

# search by query string with combined AND/OR numeric conditions
puts index.search("appl", { "numericFilters" => "code=1,(price:1000 to 3000,price:10 to 100)" })
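
If you assemble such filters dynamically, a few small helpers can keep the syntax readable. The helpers below (numeric_range, any_of, all_of) are hypothetical and not part of the client; they only produce the string format shown above.

```ruby
# Range condition, equivalent to combining >= and <=.
def numeric_range(attribute, from, to)
  "#{attribute}:#{from} to #{to}"
end

# OR group: conditions wrapped in parentheses.
def any_of(*conditions)
  "(#{conditions.join(',')})"
end

# AND list: conditions separated by commas.
def all_of(*conditions)
  conditions.join(",")
end

numeric_filters = all_of(
  "code=1",
  any_of(numeric_range("price", 1000, 3000), numeric_range("price", 10, 100))
)
puts numeric_filters
# The string can then be passed as in the example above:
# index.search("appl", { "numericFilters" => numeric_filters })
```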

If you're performing the query from your JavaScript code, please use the following syntax:

<script type="text/javascript">
  // search only with a numeric filter
  index.search('', function(success, content) {
    // TODO
    console.log(content.hits);
  }, {
    numericFilters: 'price>1000'
  });

  // search by query string and complex numerical conditions
  index.search('los', function(success, content) {
    // TODO
    console.log(content.hits);
  }, {
    numericFilters: 'code=1,(price:1000 to 3000,price:10 to 100)'
  });
</script>

One-to-Many Association

Our engine supports the indexing of arrays of numerical values. Therefore, you're able to index one-to-many associations with an attribute containing the list of associated IDs.

{
  "objectID": 1,
  "name": "My project title",
  "contributor_ids": [12, 42, 34]
}

You can then search for all projects that include a specific contributor ID using a regular numericFilters query parameter:

p index.search("", { "numericFilters" => "contributor_ids=42" })

Search by date

Filter by date

Algolia supports indexing of dates. To enable search by date, you must convert your dates into numeric values. We recommend using a Unix timestamp, as illustrated by the following code:

require 'date'
# add date attribute as a timestamp (Date#to_time uses the local time zone)
index.save_object({ :name => "Jimmy", :company => "Paint Inc.", :date => Date.parse('2013-03-10').to_time.to_i }, "myID1")

You can then use standard numeric operators in your search. You can even search by date range by combining two operators as illustrated by the following example:

# search by date between 2013-03-10 & 2013-04-20
puts index.search("", { "numericFilters" => "date>=1362873600,date<=1366416000" })
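
The two timestamps used above are midnight UTC on each date. A minimal way to compute them and build the filter string in Ruby, using only the core Time class:

```ruby
# Midnight UTC timestamps for the bounds of the range.
from = Time.utc(2013, 3, 10).to_i
to   = Time.utc(2013, 4, 20).to_i

date_filter = "date>=#{from},date<=#{to}"
puts date_filter
# prints date>=1362873600,date<=1366416000
```

Using UTC on both the indexing and the query side avoids off-by-a-few-hours mismatches between machines in different time zones.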

Category-Search

Search by tag

Filter by tag

Algolia supports indexing of categories (tags) that you can use when searching for a specific kind of object.

To enable it, you need to index objects with a _tags attribute that contains the list of their categories. (You can also use faceting: tags are just a simplified version of faceting.)

Here is an example indexing products with tags:

{
  "title": "Apple MacBook Pro 15.4-Inch Laptop with Retina Display",
  "_tags": ["laptop", "computer", "retina"],
  "url": "http://www.amazon.com/Apple-MacBook-ME294LL-15-4-Inch-Display/dp/B0096VD85I"
},
{
  "title": "Apple iPhone 5S - 32GB",
  "_tags": ["phone", "smartphone", "retina"],
  "url": "http://www.amazon.com/Apple-iPhone-5s-32GB-Space/dp/B00F3J4KYA"
},
...

You can then easily search for one or multiple tags:

# search only by tags (smartphone AND retina)
puts index.search("", { "tagFilters" => "smartphone,retina" })
# search by query string and tags
puts index.search("appl", { "tagFilters" => "smartphone,retina" })

You can also mix OR and AND operators. The OR operator is defined with a parenthesis syntax. For example (retina AND (laptop OR smartphone)) translates to:

# search by query string and tags (retina AND (smartphone OR laptop))
puts index.search("appl", { "tagFilters" => "retina,(smartphone,laptop)" })

Negations are also supported via the - operator, prefixing the value. For example (retina AND NOT(smartphone)) translates to:

# search by query string and tags (retina AND NOT(smartphone))
puts index.search("appl", { "tagFilters" => "retina,-smartphone" })
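
To clarify how these tagFilters strings are read, here is a toy local evaluator applied to the two sample products above. It is an illustration of the AND / OR / negation semantics only: the real filtering happens on Algolia's servers, and the helper names are hypothetical.

```ruby
# Split a tagFilters string into AND-ed terms; a parenthesized group becomes an OR list.
def parse_tag_filters(filters)
  filters.scan(/\(([^)]*)\)|([^,()]+)/).map do |group, single|
    group ? group.split(",").map(&:strip) : [single.strip]
  end
end

# A term prefixed with "-" matches when the tag is absent.
def tag_match?(tags, term)
  term.start_with?("-") ? !tags.include?(term[1..-1]) : tags.include?(term)
end

def matches_tag_filters?(record, filters)
  tags = record["_tags"] || []
  parse_tag_filters(filters).all? do |alternatives|
    alternatives.any? { |term| tag_match?(tags, term) }
  end
end

macbook = { "title" => "Apple MacBook Pro", "_tags" => ["laptop", "computer", "retina"] }
iphone  = { "title" => "Apple iPhone 5S",   "_tags" => ["phone", "smartphone", "retina"] }

["smartphone,retina", "retina,(smartphone,laptop)", "retina,-smartphone"].each do |filters|
  matching = [macbook, iphone].select { |record| matches_tag_filters?(record, filters) }
  puts "#{filters}: #{matching.map { |record| record['title'] }.join(', ')}"
end
```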

Search by facet

Filter by facet

You can also filter your result set using facets. Please refer to the Faceting documentation.

Faceting

Algolia supports faceting and faceted search (or filtering, navigation). To enable it, you need to set the list of attributes on which you want to enable faceting in the index settings. The attributes can be of any type (we extract all numerical and string values in the attribute).

Here is an example of a book listing:

{
  "title": "The Hitchhiker's Guide to the Galaxy",
  "authors": "Adams Douglas",
  "type": "Literature & Fiction",
  "url": "http://www.amazon.com/Hitchhikers-Guide-Galaxy-Douglas-Adams/dp/0345391802"
},
{
  "title": "Remote: Office Not Required",
  "authors": ["Jason Fried", "David Heinemeier Hansson"],
  "type": "Business & Investing",
  "url": "http://www.amazon.com/Remote-Office-Required-Jason-Fried/dp/0804137501"
},
{
  "title": "Rework",
  "authors": ["Jason Fried", "David Heinemeier Hansson"],
  "type": "Business & Investing",
  "url": "http://www.amazon.com/Rework-Jason-Fried/dp/0307463745"  
},
...

You can enable faceting on the authors and type attributes with the following code:

index.set_settings({"attributesForFaceting" => ["authors", "type"]})

In the query you need to specify the list of attributes on which you want to enable faceting (* is a shortcut to enable faceting on all attributes specified in index settings).

<script type="text/javascript">
  // search by query string with faceting on all attributes
  index.search('appl', searchCallback, { facets: '*' })
  // search by query string with faceting on authors attribute
  index.search('appl', searchCallback, { facets: 'authors' })
  // search by query string with faceting on authors & type attributes
  index.search('appl', searchCallback, { facets: 'authors,type' })
</script>

Or with Ruby code:

# search by query string with faceting on all attributes
puts index.search("appl", { "facets" => "*" })
# search by query string with faceting on authors attribute
puts index.search("appl", { "facets" => "authors" })
# search by query string with faceting on authors & type attributes
puts index.search("appl", { "facets" => "authors,type" })

Filtering / Navigation

You can implement faceted navigation (or facet filtering) by specifying the list of facet values you want to use as refinements using the facetFilters query parameter:

Warning: Do not forget to configure your attributesForFaceting index setting with the list of attributes you want to facet on, otherwise you will not be able to use them at query time.

<script type="text/javascript">
  // filter on author=Adams Douglas AND type=Literature & Fiction
  index.search('appl', searchCallback, { facets: '*', facetFilters: ['authors:Adams Douglas', 'type:Literature & Fiction'] })
</script>

Or with Ruby code:

# filter on author=Adams Douglas AND type=Literature & Fiction
puts index.search("appl", {
  "facets" => "*",
  "facetFilters" => ["authors:Adams Douglas", "type:Literature & Fiction"]
})

Refinements are ANDed by default (Conjunctive selection).

Figure: Example of a conjunctive facet

You can get facets' counts in the facets attribute of the JSON answer:

{
  "hits": [ ... ],
  "page": 0,
  "nbHits": 2,
  "nbPages": 1,
  "hitsPerPage": 20,
  "processingTimeMS": 1,
  "query": "appl",
  "params": "query=appl&facets=*",
  "facets": {
    "authors": {
      "Jason Fried": 2,
      "David Heinemeier Hansson": 1,
      "Adams Douglas": 1
    },
    "type": {
      "Literature & Fiction": 1,
      "Business & Investing": 2
    }
  }
}
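
In Ruby, the facet counts can be read from the parsed answer like any nested hash. The sketch below uses a hash literal with the same shape as the facets attribute above instead of a live response, and the helper name sorted_facet_values is hypothetical.

```ruby
# Same shape as the "facets" attribute of the JSON answer above.
answer = {
  "facets" => {
    "authors" => { "Jason Fried" => 2, "David Heinemeier Hansson" => 1, "Adams Douglas" => 1 },
    "type"    => { "Literature & Fiction" => 1, "Business & Investing" => 2 }
  }
}

# Most frequent values first; ties are broken alphabetically.
def sorted_facet_values(counts)
  counts.sort_by { |value, count| [-count, value] }
end

answer["facets"].each do |facet, counts|
  puts facet
  sorted_facet_values(counts).each { |value, count| puts "  #{value} (#{count})" }
end
```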

To OR refinements, you must use nested arrays. For example, to refine on "Business & Investing" books written by Jason Fried or David Heinemeier Hansson:

<script type="text/javascript">
  index.search('appl', searchCallback, { facets: '*', facetFilters: [['authors:Jason Fried', 'authors:David Heinemeier Hansson'], 'type:Business & Investing'] })
</script>

Or with Ruby code:

puts index.search("appl", {
  "facets" => "*",
  "facetFilters" => [["authors:Jason Fried", "authors:David Heinemeier Hansson"], "type:Business & Investing"]
})

Negations are also supported via the - operator, prefixing the facet value. For example, to refine on "Business & Investing" books written by Jason Fried and not David Heinemeier Hansson:

<script type="text/javascript">
  index.search('appl', searchCallback, { facets: '*', facetFilters: ['authors:Jason Fried', 'authors:-David Heinemeier Hansson', 'type:Business & Investing'] })
</script>

Or with Ruby code:

puts index.search("appl", {
  "facets" => "*",
  "facetFilters" => ["authors:Jason Fried", "authors:-David Heinemeier Hansson", "type:Business & Investing"]
})

Disjunctive Faceting

The most common use case for faceted search or navigation is to select at most one value per facet, but there are at least two ways in which a user might select multiple values from the same facet:

  • Conjunctive "AND" selection (standard Navigation, described above)
  • Disjunctive "OR" selection. Selecting hotel ratings (e.g., hotels with 4 OR 5 stars) may be a kind of disjunctive selection. Checkboxes are usually used to represent such navigation capabilities.

Figure: Example of a disjunctive facet

We've implemented a JavaScript helper to help you generate such pages:

Note: Check out our Instant-Search Tutorial.

var helper = new AlgoliaSearchHelper(algoliaClient, /* index name */ 'hotels', {
  facets: ['facilities'],       // list of conjunctive facets
  disjunctiveFacets: ['stars'], // list of disjunctive facets
  hitsPerPage: 10
});
helper.search('luxury', function(success, content) {
  for (var i = 0; i < content.hits.length; ++i) {
    // display the result set
    // [...]
  }
  for (var facet in content.facets) {
    // display link-based refinements
    // [...]
  }
  for (var facet in content.disjunctiveFacets) {
    // display checkbox-based refinements
    // [...]
  }
});

Internals: Disjunctive faceting results in querying the index several times:

  • a query is performed to display the result set ANDing refined conjunctive facets and ORing refined disjunctive facets,
  • and a query used to display each disjunctive facet (with the associated number of hits that would be added to the result set if selected) ANDing the refined conjunctive facets.

For example, if a user is looking for a hotel matching the full-text query luxury with 4 OR 5 stars AND with a wifi facility, the following queries must be performed:

  • To display the result set and the conjunctive facet facilities:
    index.search("luxury", { facets: "facilities", facetFilters: "facilities:wifi,(stars:4,stars:5)" }).
  • and to display the disjunctive facet stars:
    index.search("luxury", { facets: "stars", facetFilters: "facilities:wifi" }).
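
The two queries above can be derived mechanically from the current refinements. The following Ruby sketch does so with hypothetical helpers; when several disjunctive facets are refined, it keeps the OR groups of the other disjunctive facets in each per-facet query, which is an assumption of this sketch rather than something stated above.

```ruby
# Format the refinements of one facet as "facet:value" strings.
def refinements(facet, values)
  values.map { |value| "#{facet}:#{value}" }
end

# conjunctive and disjunctive are hashes of facet name => refined values.
def disjunctive_queries(conjunctive, disjunctive)
  and_filters = conjunctive.flat_map { |facet, values| refinements(facet, values) }
  or_groups = {}
  disjunctive.each do |facet, values|
    or_groups[facet] = "(#{refinements(facet, values).join(',')})" unless values.empty?
  end

  # Query for the result set and the conjunctive facets.
  main = { "facets" => conjunctive.keys.join(","),
           "facetFilters" => (and_filters + or_groups.values).join(",") }

  # One query per disjunctive facet, without that facet's own refinements.
  per_facet = disjunctive.keys.map do |facet|
    others = or_groups.reject { |name, _| name == facet }.values
    { "facets" => facet, "facetFilters" => (and_filters + others).join(",") }
  end

  [main] + per_facet
end

queries = disjunctive_queries({ "facilities" => ["wifi"] }, { "stars" => [4, 5] })
queries.each { |params| p params }
# Each hash could then be sent with index.search("luxury", params).
```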

Aggregations & Stats

All numerical-based facets return the associated min, max & avg values. The values are available in the facets_stats attribute of the JSON answer. They are computed on all facets before the application of the maxValuesPerFacet parameter.

{
  "hits": [ ... ],
  "page": 0,
  "nbHits": 2,
  "nbPages": 1,
  "hitsPerPage": 20,
  "processingTimeMS": 1,
  "query": "appl",
  "params": "query=appl&facets=*",
  "facets": {
    "price": {
      "42": 4,
      "12": 3,
      "1": 1
    }
  },
  "facets_stats": {
    "price": {
      "min": 1,
      "max": 42,
      "avg": 25.625
    }
  }
}
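
As a sanity check of how these statistics relate to the facet counts, the sketch below recomputes them locally in Ruby, weighting each value by its number of hits. This weighting is an assumption, but it reproduces the figures of the example answer above.

```ruby
# Facet counts as returned in the "facets" attribute above: value => number of hits.
price_counts = { "42" => 4, "12" => 3, "1" => 1 }

# Expand each value by its count, then compute the statistics.
values = price_counts.flat_map { |value, count| [value.to_f] * count }
stats = {
  "min" => values.min,
  "max" => values.max,
  "avg" => values.inject(:+) / values.size
}
p stats
```

With these counts the average is (42*4 + 12*3 + 1*1) / 8 = 25.625, matching the avg value in facets_stats.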

Geo-Search

Algolia supports location-aware search, which is particularly useful when building GPS-aware mobile applications.

To enable it, you need to index objects with a _geoloc attribute that contains their latitude and longitude.

Warning: Ensure your lat/lng attributes are not encoded as strings, but floats.

Here is an example indexing cities with their geo-location:

{
  "Name": "Shanghai",
  "Country": "China",
  "Population": 14608512,
  "_geoloc": {
    "lat": 31.222219,
    "lng": 121.458061
  }
},
{
  "Name": "Buenos Aires",
  "Country": "Argentina",
  "Population": 13076300,
  "_geoloc": {
      "lat": -34.613152,
      "lng": -58.377232
  }
},
{
  "Name": "Mumbai",
  "Country": "India",
  "Population": 12691836,
  "_geoloc": {
      "lat": 19.072830,
      "lng": 72.882607
  }
},
...

You can then easily search around a position by specifying its latitude/longitude and the radius inside which you want to search.

# search only by geo-distance
puts index.search("", { "aroundLatLng" => "34.052231,-118.243683", "aroundRadius" => 10000})
# search by query string and geo-distance
puts index.search("los", { "aroundLatLng" => "34.052231,-118.243683", "aroundRadius" => 10000})

If you're performing the query from your JavaScript code, please use the following syntax:

<script type="text/javascript">
  // search only by geo-distance
  index.search('', function(success, content) {
    // TODO
    console.log(content.hits);
  }, {
    aroundLatLng: '34.052231,-118.243683',
    aroundRadius: 10000 // max 10km around
  });

  // search by query string and geo-distance
  index.search('los', function(success, content) {
    // TODO
    console.log(content.hits);
  }, {
    aroundLatLng: '34.052231,-118.243683',
    aroundRadius: 10000 // max 10km around
  });
</script>

Note: the radius parameter is in meters, so 10000 means 10km.

Geo-search queries take the distance between hits and the specified position into account to return the closest hits first. By default, geo-distance is the second criterion of the ranking sequence:

  1. Sort by increasing number of typos between hits and query string,
  2. Sort by increasing distance in meters between hits and geo-search point,
  3. Sort according to the proximity of query words in hits,
  4. Sort according to the order of attributes defined by attributesToIndex,
  5. Sort according to the number of words that matched exactly the query words (that is not as a prefix),
  6. Sort according to a user defined formula defined by customRanking.

You can retrieve the matching information for each hit by passing the getRankingInfo parameter to your query. You can also change the precision used when sorting by setting the aroundPrecision parameter (setting it to 100 will consider two hits that are less than 100m apart as identical):

puts index.search("", { "aroundLatLng" => "34.052231,-118.243683", "aroundRadius" => 10000,
                        "aroundPrecision" => 100, "getRankingInfo" => 1})

If you only want to filter results that are in a given area you can also search inside a bounding box:

# search with a string & inside a geo bounding box (defined by 4 floats: p1Lat,p1Lng,p2Lat,p2Lng)
puts index.search("los", { "insideBoundingBox" => "34.04,-118.24,34.06,-118.26"})
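
To make the meaning of aroundRadius, aroundPrecision and insideBoundingBox more tangible, here is a local Ruby approximation. It is only a sketch: the haversine formula, the Earth radius constant and the way distances are bucketed by precision are assumptions, and the engine's actual computation may differ.

```ruby
EARTH_RADIUS_M = 6_371_000.0 # mean Earth radius, an approximation

# Great-circle (haversine) distance in meters between two lat/lng points.
def distance_m(lat1, lng1, lat2, lng2)
  to_rad = Math::PI / 180
  dlat = (lat2 - lat1) * to_rad
  dlng = (lng2 - lng1) * to_rad
  a = Math.sin(dlat / 2)**2 +
      Math.cos(lat1 * to_rad) * Math.cos(lat2 * to_rad) * Math.sin(dlng / 2)**2
  2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(a))
end

# Keep records within `radius` meters, sorted by distance bucketed to `precision` meters.
def around(records, lat, lng, radius, precision = 1)
  records
    .map { |r| [r, distance_m(lat, lng, r["_geoloc"]["lat"], r["_geoloc"]["lng"])] }
    .select { |_, d| d <= radius }
    .sort_by { |_, d| (d / precision).floor }
    .map(&:first)
end

# True when the point lies in the rectangle defined by two opposite corners
# (boxes crossing the 180th meridian are not handled).
def inside_bounding_box?(lat, lng, p1_lat, p1_lng, p2_lat, p2_lng)
  lat.between?(*[p1_lat, p2_lat].minmax) && lng.between?(*[p1_lng, p2_lng].minmax)
end

places = [
  { "Name" => "Nearby place (hypothetical)", "_geoloc" => { "lat" => 34.05, "lng" => -118.25 } },
  { "Name" => "Shanghai", "_geoloc" => { "lat" => 31.222219, "lng" => 121.458061 } }
]
puts around(places, 34.052231, -118.243683, 10_000, 100).map { |r| r["Name"] }
puts inside_bounding_box?(34.05, -118.25, 34.04, -118.24, 34.06, -118.26)
```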

Configure Typo Tolerance

By default, our typo tolerance is configured to tolerate most users’ misspelled queries, even with very short words. The downside of this approach is that you can end up with a lot of approximately matching results.

You can change this behavior by changing the two following parameters:

  • minWordSizefor1Typo configures the minimum number of letters in a word to tolerate one typo. The default setting is 3; you can increase it to 4 or 5 to be stricter.
  • minWordSizefor2Typos configures the minimum number of letters in a word to tolerate two typos. The default setting is 7; you can increase it to 8 or 9 to be stricter.

These two settings can be either set at query time using query parameters (dynamically defined for each query) or in your index settings (overrides default values).

Notes: Although we do not recommend it, you can disable typo tolerance by setting minWordSizefor1Typo and minWordSizefor2Typos to a high value (1000 for example).
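
The effect of the two thresholds can be illustrated with a small Ruby function. This is a simplified reading of the description above (word length compared against each threshold), not the engine's implementation, and tolerated_typos is a hypothetical helper.

```ruby
# Number of typos tolerated for a word, given the two thresholds.
def tolerated_typos(word, min_size_1_typo = 3, min_size_2_typos = 7)
  return 2 if word.length >= min_size_2_typos
  return 1 if word.length >= min_size_1_typo
  0
end

%w[tv ipad macbook].each do |word|
  puts "#{word}: default=#{tolerated_typos(word)}, stricter=#{tolerated_typos(word, 5, 9)}"
end

# Stricter values expressed as index settings; with an initialized client
# this hash could be passed to index.set_settings.
stricter_settings = { "minWordSizefor1Typo" => 5, "minWordSizefor2Typos" => 9 }
p stricter_settings
```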

Multilingual Search

It's pretty common to have attributes translated into several languages. In order to provide a multilingual search to your users (depending on their language) you'll need to push all variations of your attributes to Algolia. For example, your record will look like:

{
  "objectID": 1,
  "title_en": "my title",
  "title_fr": "mon titre",
  "title_de": "mein Titel"
}

By default, our engine will search in all attributes, which is rarely what you want when you have multilingual attributes: English users must search in the English attributes, French users in the French attributes and so on.

Each index is configured with a static list of searchable attributes (configured through the attributesToIndex setting). Therefore, if you want to search in a specific attribute, you will need 1 index per attribute. In our case, 1 index would be configured with attributesToIndex=["title_en"], 1 index would be configured with attributesToIndex=["title_fr"] and 1 index would be configured with attributesToIndex=["title_de"].

To ease such setup, we've developed a master/slave feature. It allows you to replicate the content of one index (identified as "master") in other indexes (identified as "slaves") that can have different settings (in our case different attributesToIndex settings). Your code will always push the data to the "master" index only: no changes required in your code.

To declare slave indexes, you need to list their names in the master index settings. The resulting slave indexes will be created with the same content as your master index but have their own settings.
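
One possible layout for the three languages of the example record is sketched below. The index names (articles, articles_fr, articles_de) and the choice of English as the master are hypothetical; only the attributesToIndex and slaves setting names come from this page.

```ruby
LANGUAGES   = %w[en fr de]
MASTER_LANG = "en"
BASE_NAME   = "articles" # hypothetical index name

# The master keeps the base name, each slave gets a language suffix.
def index_name_for(lang)
  lang == MASTER_LANG ? BASE_NAME : "#{BASE_NAME}_#{lang}"
end

settings_by_index = LANGUAGES.map do |lang|
  settings = { "attributesToIndex" => ["title_#{lang}"] }
  if lang == MASTER_LANG
    # The master lists the other indexes as its slaves.
    settings["slaves"] = (LANGUAGES - [MASTER_LANG]).map { |l| index_name_for(l) }
  end
  [index_name_for(lang), settings]
end.to_h

settings_by_index.each { |name, settings| puts "#{name}: #{settings.inspect}" }
# With an initialized client, each hash would be applied with
# Algolia::Index.new(name).set_settings(settings).
```

At query time, the application would search the index matching the user's language while still pushing records to the master index only.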

Indexing Several Types

It's common to search for objects of different types. Think for example about singers and songs, products and their reviews or groups and their members. The most common way to index several types is to have one index per type. You can then perform one query per index after each keystroke and combine results in the UI.

You can learn how to create a simple search bar with multiple object types in this tutorial.

Indexing Relations

You can index your relations with an array of values as we support indexing of arrays in your JSON. It is usually better to index each element of the array separately in order to obtain a better ranking.

For example, if you want to index TV shows with their associated episode names, you have two options:

1. Index one object for each TV show that includes all episode names:

{
  "episodes": [
    "A Scandal in Belgravia",
    "The Reichenbach Fall",
    "A Study in Pink",
    "The Great Game",
    "The Hounds of Baskerville",
    "The Blind Banker",
    "Unaired Pilot",
    "The Empty Hearse",
    "The Sign of Three",
    "His Last Vow"
  ],
  "show_name": "Sherlock",
  "popularity": 891
}

2. Index one object for each TV show and one object for each episode:

{
  "show_name": "Sherlock",
  "popularity": 891
},
{
  "episode_name": "A Scandal in Belgravia",
  "episode_of": "Sherlock",
  "popularity": 891
},
{
  "episode_name": "The Reichenbach Fall",
  "episode_of": "Sherlock",
  "popularity": 891
},
{
  "episode_name": "A Study in Pink",
  "episode_of": "Sherlock",
  "popularity": 891
},
...
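
If your data is stored in the shape of the first option, the objects of the second option can be derived with a few lines of Ruby. The helper name flatten_show is hypothetical, and episodes simply inherit the popularity of their show, as in the example above.

```ruby
# Turn one show (with an "episodes" array) into one object for the show
# plus one object per episode.
def flatten_show(show)
  show_object = { "show_name" => show["show_name"], "popularity" => show["popularity"] }
  episode_objects = show["episodes"].map do |episode|
    { "episode_name" => episode,
      "episode_of"   => show["show_name"],
      "popularity"   => show["popularity"] }
  end
  [show_object] + episode_objects
end

sherlock = {
  "show_name"  => "Sherlock",
  "popularity" => 891,
  "episodes"   => ["A Scandal in Belgravia", "The Reichenbach Fall", "A Study in Pink"]
}

objects = flatten_show(sherlock)
objects.each { |object| p object }
# With an initialized client, the array could be sent with index.add_objects(objects).
```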

The second option is far better in terms of ranking. Let's try to perform a query on a set of 10,000 TV shows.

Here are the first two results for the Game of Thrones query with the first option (one object per show that includes episode names):

{
  "show_name": "Game of Thrones",
  "popularity": 110000,
  "episodes": [
    "The Rains of Castamere",
    "Blackwater",
    ...
  ],
  ...
},
{
  "show_name": "Stargate SG-1",
  "popularity": 64460,
  "episodes": [
    "Children of the Gods",
    "Fair Game",
    "Romancing the Throne",
    ...
  ],
  ...
},
...

And here are the first two results for the Game of Thrones query with the second option (one object per show and one object per episode):

{
  "show_name": "Game of Thrones",
  "popularity": 110000,
  ...
},
{
  "episode_of": "Game of Thrones",
  "episode_name": "The Rains of Castamere",
  "popularity": 110000,
  ...
},
...

Note that with the first option, the first hit is great as it matches all query words in the show_name attribute, but the second result is pretty ugly: Stargate SG-1 was considered a correct hit because it contains three episodes each matching one of the query words. The second option does not have this problem and provides better results.

To conclude, here are the three important tips you should know to better index your objects:

1. It is better to have several small objects than a big one. This reduces the probability of returning a wrong result.

2. When sharing information between several objects, it is better to use a different name for each attribute. This lets you use attributes to order matches by importance. For example, in the second option we have set the attributesToIndex index setting (attributes are sorted by decreasing order of importance) as follows:

index.set_settings({"attributesToIndex" => ["show_name", "episode_name", "episode_of"]})

As a result, the query Game of Thrones the rains of castamere retrieves the hit matching both show name and episode name while still ensuring that shows are always returned before episodes when the query contains only the show name.

3. Finally, to get an excellent ranking, you can use the customRanking index setting to introduce the popularity of hits. In this case we have used:

index.set_settings({"customRanking" => ["desc(popularity)"]})

It's of course even better if you have a specific popularity for each of your nested objects.

If you want to search only inside objects of a certain kind, you can have a look at the Category-Search guide (for example search only in shows or search only in episodes).

Synonyms

We support the configuration of single-word synonyms. You can use our synonyms feature to define several words as equal at query-time. For example, you may want to retrieve your black ipad record when your users are searching for dark ipad, even if the dark word is not part of the record.

Synonyms are configured in your index settings through the synonyms key, and are defined with an array of arrays of words:

index.set_settings({
  :synonyms => [
    ["black", "dark"],
    ["small", "little", "mini"]
  ]
})
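
To illustrate what "defining several words as equal at query-time" means, here is a toy Ruby expansion of a query using the synonym groups above. The real expansion is done by the engine; expand_query is a hypothetical helper.

```ruby
SYNONYMS = [
  ["black", "dark"],
  ["small", "little", "mini"]
]

# For each query word, list the word itself followed by the other members
# of its synonym group (if any).
def expand_query(query, synonyms)
  query.downcase.split.map do |word|
    group = synonyms.find { |words| words.include?(word) }
    group ? [word] + (group - [word]) : [word]
  end
end

p expand_query("dark ipad", SYNONYMS)
# prints [["dark", "black"], ["ipad"]]
```

A record containing black ipad therefore matches the query dark ipad, because black is accepted as an alternative for the first query word.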

There is nothing special to do at query time: the query is expanded automatically with all matching synonyms. The resulting _highlightResult object will contain an updated version of your original attribute: we integrate the matching synonym in the highlighted form in order to explain the result to the user. For example:

{
  "hits" : [
    {
      "objectID" : 1,
      "title" : "ipad retina",                // original attribute
      "_highlightResult" : {
        "title" : {
          "value" : "<em>tablet</em> retina", // replaced by the matching synonym
          "matchLevel" : "full",
          "matchedWords" : ["tablet"]
        }
      }
    }
  ],
  "nbHits" : 1,
  "page" : 0,
  "nbPages" : 1,
  "hitsPerPage" : 10,
  "processingTimeMS" : 1,
  "query" : "tablet",                         // the user query
  "params" : "query=tablet"
}

Security

Algolia provides several ways to secure how data is searched and retrieved from your indexes, with the ability:

  • to create restricted API keys using advanced ACL,
  • to set query rate limits,
  • or even to define per-user security filters.

ACL

The admin API key provides full control of all your indexes. You can generate user API keys to control security. These API keys can be restricted to a set of operations and/or restricted to a given index.

You can define the following rights while creating user API keys:

  • search: allows searching,
  • browse: allows retrieving all index content via the browse API,
  • addObject: allows adding/updating an object in the index,
  • deleteObject: allows deleting an existing object,
  • deleteIndex: allows deleting index content,
  • settings: allows getting index settings,
  • editSettings: allows changing index settings.

Rate Limits

Once you offer a search engine, people can use it to grab the indexed data. This is not a problem when your data is public, but if it is critical you may want to restrict its access. We provide several mechanisms to protect your data without sacrificing search quality:

  • Restrict the number of API calls: you can limit the allowed number of API calls per hour and per IP.
  • Restrict the number of hits per API call: by default an API call can retrieve up to 1000 hits, but if your application only needs 10 hits per API call you can set this limit to 10.
  • Ephemeral API keys: you can generate ephemeral API keys that can be used to grant temporary access to your data.

All these limits can be easily combined when you create an API key in the administration interface or with a single line of code. For example you can create an API key that can only perform search queries in all your indexes, with a limit of 100 queries per hour per IP, and with a limit of 20 records fetched per API call:

res = Algolia.client.add_user_key(["search"], 0, 100, 20)
puts res['key']

Note: The second argument (0) indicates that the API key is not ephemeral. Set it to a number of seconds of validity if you want an ephemeral API key.

You can also create an API key with the same limits but restricted to one index:

res = index.add_user_key(["search"], 0, 100, 20)
puts res['key']
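
The argument order used in the calls above can be made explicit with a small wrapper. This is a sketch under stated assumptions: `user_key_args` and `ONE_HOUR` are hypothetical names, not part of the gem, and the actual call is left as a comment because it needs an initialized client.

```ruby
# Hypothetical helper (not part of the gem): returns the positional arguments
# in the order used by the add_user_key calls above:
# rights, validity in seconds, queries per hour per IP, hits per API call.
def user_key_args(acls, max_queries_per_ip_per_hour, max_hits_per_query, validity_seconds = 0)
  [acls, validity_seconds, max_queries_per_ip_per_hour, max_hits_per_query]
end

ONE_HOUR = 60 * 60 # validity is expressed in seconds

permanent = user_key_args(["search"], 100, 20)           # 0 => not ephemeral
ephemeral = user_key_args(["search"], 100, 20, ONE_HOUR) # valid for one hour

# With an initialized client, the arguments would be passed as:
#   res = Algolia.client.add_user_key(*ephemeral)
```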

Per-User Security

You may have a single index containing per-user data. In that case, all records should be tagged with their associated user_id so that you can add a tagFilters=user_42 filter at query time and retrieve only what a user has access to. If you set this filter from the JavaScript client, it creates a security breach, since the user can change the tagFilters by modifying the code in the browser. To keep using the JavaScript client (recommended for optimal latency) and still target secured records, you can generate secured API keys from your backend and use them in your public JavaScript code.

Those secured API keys are generated by hashing (HMAC-SHA-256) the following criteria together:

  • a private API key (any API key other than the admin API key). In general this is a search-only API key. Since this API key is used as a signature, you should keep it private.
  • a list of tags defining the security filters,
  • and an optional token identifying your user, if you want to use this token instead of the IP for rate limits (you need to pass it as the third argument of the secured key generation and call the setUserToken method in your JavaScript client).

All queries using a secured API key will be automatically filtered by the list of tags specified while generating the key (no need to specify them twice using query parameters).
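
The hashing step can be sketched with Ruby's standard OpenSSL bindings. This is only an illustration: `sketch_secured_key` is a hypothetical name, and the exact message layout (the tag string immediately followed by the optional user token) is an assumption made for this sketch. In an application, rely on the client's own generate_secured_api_key method.

```ruby
require 'openssl'

# Hypothetical sketch: signs the tag filters (and optional user token)
# with the private API key using HMAC-SHA-256.
# Assumption: the signed message is the tag string followed by the token.
def sketch_secured_key(private_api_key, tag_filters, user_token = nil)
  message = "#{tag_filters}#{user_token}"
  OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new('sha256'), private_api_key, message)
end

key = sketch_secured_key('YourSearchOnlyAPIKey', '(public,user_42)')
puts key.length # an HMAC-SHA-256 hex digest is 64 characters long
```

Because the tags are part of the signed message, changing them in the browser without the private key produces a key that no longer matches.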

Example

Let's say you have both public and per-user records in your application. Adding a _tags:["public"] or _tags:["user_XXXX"] attribute to your records will allow you to filter future queries with tagFilters=(public,user_XXXX) to retrieve public OR private content owned by user XXXX (the parentheses are used to OR tags).
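
The tagging scheme can be pictured with a few sample records. This is a sketch with made-up data: the record contents and the helpers `tag_filters_for` and `visible_to` are hypothetical, and `visible_to` only simulates the OR semantics locally, since the real filtering is performed by Algolia's servers.

```ruby
# Sample records (made up for illustration), each tagged as public or per-user.
records = [
  { "objectID" => "1", "title" => "Release notes", "_tags" => ["public"] },
  { "objectID" => "2", "title" => "Draft for 42",  "_tags" => ["user_42"] },
  { "objectID" => "3", "title" => "Draft for 7",   "_tags" => ["user_7"] }
]

# Builds the OR filter string described above, e.g. "(public,user_42)".
def tag_filters_for(user_id)
  "(public,user_#{user_id})"
end

# Local simulation of the OR semantics: keeps records carrying
# at least one of the allowed tags.
def visible_to(records, user_id)
  allowed = ["public", "user_#{user_id}"]
  records.select { |record| (record["_tags"] & allowed).any? }
end

puts tag_filters_for(42)                                        # prints (public,user_42)
puts visible_to(records, 42).map { |r| r["objectID"] }.inspect  # prints ["1", "2"]
```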

If the logged-in user has ID=42, generate her secured & public API key using the following code:

public_key = Algolia.generate_secured_api_key('YourSearchOnlyAPIKey', '(public,user_42)')

Then configure the JavaScript API client with the following code, which prevents the user from searching any content that doesn't match the tags public OR user_42:

<script type="text/javascript">
  var algolia = new AlgoliaSearch('YourApplicationID', 'PublicApiKeyGeneratedForUser42');
  algolia.setSecurityTags('(public,user_42)'); // must be the same as those used at generation time
  algolia.initIndex('YourIndex').search($('#q').val(), function(success, content) {
    // [...]
  });
</script>

UI

Here are some examples of UI built on top of Algolia.

Figure: Typeahead search demo

Auto-completion

To create a nice-looking typeahead UI, we recommend using Twitter's typeahead.js.

Inspired by twitter.com's autocomplete search functionality, typeahead.js is a flexible JavaScript library that provides a strong foundation for building robust typeaheads.

An example is available on GitHub: algolia/algoliasearch-client-js/blob/master/example/autocomplete.html

The following HTML snippet turns the search field into a typeahead. You can also follow our complete tutorial.

<html>
  <head>
    <meta content='text/html; charset=utf-8' http-equiv='Content-Type'>
    <link rel="stylesheet" type="text/css" href="//demos.algolia.com/simple-ui.css">
    <meta name="viewport" content="width=device-width,initial-scale=1">
  </head>
  <body>
    <script src="//code.jquery.com/jquery-1.10.1.min.js"></script>
    <script src="//rawgithub.com/algolia/algoliasearch-client-js/master/dist/algoliasearch.min.js"></script>
    <script src="//rawgithub.com/algolia/algoliasearch-client-js/master/vendor/typeahead.jquery.js"></script>

    <div class="demo">
      <input class="typeahead" type="text" placeholder="Start typing" 
             id="typeahead-algolia"  spellcheck="false"/>
    </div>

    <script type="text/javascript">
      $(document).ready(function() {
        var algolia = new AlgoliaSearch('YourApplicationID', 'YourSearchOnlyAPIKey');
        // replace YourIndexName by the name of the index you want to query.
        var index = algolia.initIndex('YourIndexName');

        $('#typeahead-algolia').typeahead(null, {
          source: index.ttAdapter({ "hitsPerPage": 10 }),
          displayKey: 'YourAttributeName' // attribute used for display
        });
      });
    </script>

  </body>
</html>

You can also use the Hogan.js templating library to customize how results are displayed. You should add the following script tag and update typeahead to include a template. In this example we have used the url and highlighted title attributes of our objects:

<script src="http://twitter.github.com/hogan.js/builds/2.0.0/hogan-2.0.0.js"></script>
<script type="text/javascript">
  // use Hogan as templating engine
  var template = Hogan.compile('<a href="{{url}}">{{{_highlightResult.title.value}}}</a>');
  // `index` is the index object created with algolia.initIndex() in the previous snippet
  $('#typeahead-algolia').typeahead(null, {
    source: index.ttAdapter(),
    displayKey: 'title', // attribute displayed once selected
    templates: {
      suggestion: function(hit) {
        return template.render(hit); // mustache template rendered by Hogan
      }
    }
  });
</script>

Figure: Instant search demo

Instant-search

This example will show you how to build a Google-like instant search: updating the results page from the first keystroke.

Basically, the JavaScript code updates the content of #hits at each keystroke.

An example is available on GitHub: algolia/algoliasearch-client-js/blob/master/example/instantsearch.html

<input autocomplete="off" class="autocomplete" id="q" placeholder="Start typing..." type="text" spellcheck="false" />
<div id="hits"></div>

<script type="text/javascript" src="//rawgithub.com/algolia/algoliasearch-client-js/master/dist/algoliasearch.min.js"></script>
<script type="text/javascript">
  function searchCallback(success, content) {
    if (content.query != $("#q").val()) {
      // do not take out-dated answers into account
      return;
    }

    if (content.hits.length == 0) {
      // no results
      $('#hits').empty();
      return;
    }

    // Scan all hits and display them
    var html = '';
    for (var i = 0; i < content.hits.length; ++i) {
      var hit = content.hits[i];

      // For example, display all properties that have at least
      // one highlighted word (matchLevel = full or partial)
      html += '<div class="hit">';
      for (var propertyName in hit._highlightResult) {
        var el = hit._highlightResult[propertyName];
        if (Object.prototype.toString.call(el) !== '[object Array]' && el.matchLevel !== 'none') {
          html += '<div class="attribute"><span>' + propertyName.substr(0,1).toUpperCase() +
            propertyName.substr(1) + ": </span>" + el.value + "</div>";
        }
      }
      html += '</div>';
    }

    $('#hits').html(html);
  }

  $(document).ready(function() {
    var $inputfield = $("#q");

    // Replace the following values with your ApplicationID and a search-only API key
    // (never expose your admin API key in browser code).
    var algolia = new AlgoliaSearch('YourApplicationID', 'YourSearchOnlyAPIKey');
    // replace YourIndexName by the name of the index you want to query.
    var index = algolia.initIndex('YourIndexName');

    $inputfield.keyup(function() {
      index.search($inputfield.val(), searchCallback);
    }).focus();
  });
</script>

Figure: Full-featured results page

Results page with faceting

This example will show you how to build a full-featured results page with faceting: updating the results page and the facet counts and handling refinements from the first keystroke.

Basically, the JavaScript code updates the content of #hits and #facets at each keystroke and stores the current refinements in the refinements variable.

An example is available on GitHub: algolia/algoliasearch-client-js/blob/master/example/instantsearch+faceting.html

<input autocomplete="off" class="autocomplete" id="q" placeholder="Start typing" type="text" spellcheck="false"/>

<div class="facets-wrapper">
  <h1>Facets</h1>
  <div id="facets"></div>
</div>
<div class="hits-wrapper">
  <h1>Results</h1>
  <div id="hits"></div>
</div>

<script type="text/javascript" src="//rawgithub.com/algolia/algoliasearch-client-js/master/dist/algoliasearch.min.js"></script>
<script type="text/javascript">
  $(document).ready(function() {
    var refinements = {};
    var $inputfield = $("#q");
    // Replace the following values with your ApplicationID and a search-only API key
    // (never expose your admin API key in browser code).
    var algolia = new AlgoliaSearch('YourApplicationID', 'YourSearchOnlyAPIKey');
    // replace YourIndexName by the name of the index you want to query.
    var index = algolia.initIndex('YourIndexName');

    $inputfield.keyup(function() {
      search();
    }).focus();

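    // Exposed on window so that the javascript: links generated in
    // searchCallback can call it from outside this closure.
    window.toggleRefine = function(refinement) {
      refinements[refinement] = !refinements[refinement];
      search();
    };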

    function search() {
      var filters = [];
      for (var refinement in refinements) {
        if (refinements[refinement]) {
          filters.push(refinement);
        }
      }
      index.search($inputfield.val(), searchCallback, { facets: '*', facetFilters: filters });
    }

    function searchCallback(success, content) {
      if (content.query != $inputfield.val()) {
        // do not consider out-dated queries
        return;
      }
      if (content.hits.length == 0 || content.query.trim() === '') {
        // no results
        $('#hits').empty();
        $('#facets').empty();
        return;
      }

      // Scan all hits and display them
      var hits = '';
      for (var i = 0; i < content.hits.length; ++i) {
        var hit = content.hits[i];

        // For this hit, display all properties that have at least
        // one highlighted word (matchLevel = full or partial)
        hits += '<div class="hit">';
        for (var propertyName in hit._highlightResult) {
          var el = hit._highlightResult[propertyName];
          if (Object.prototype.toString.call(el) !== '[object Array]' && el.matchLevel !== 'none') {
            hits += '<div class="attribute"><span>' + propertyName.substr(0,1).toUpperCase() +
              propertyName.substr(1) + ": </span>" + el.value + "</div>";
          }
        }
        hits += '</div>';
      }
      $('#hits').html(hits);

      // Scan all facets and display them
      var facets = '';
      for (var facet in content.facets) {
        facets += '<h4>' + facet + '</h4>';
        facets += '<ul>';
        var values = content.facets[facet];
        for (var value in values) {
          var refinement = facet + ':' + value;
          facets += '<li class="' + (refinements[refinement] ? 'refined' : '') + '">' +
              '<a href="javascript:toggleRefine(\'' + refinement + '\')">' +
              value + '</a> (' + values[value] + ')' +
            '</li>';
        }
        facets += '</ul>';
      }
      $('#facets').html(facets);
    }
  });
</script>