
Query Rules Usage

Last updated 21 September 2017

How does Query Rules work?

Technical Overview

Rules are essentially If-Then configurations, or condition/consequence pairs. The "if" part parses the query and checks whether one or more parts of the text satisfy a condition. If a condition is satisfied, the search engine adapts its behavior according to the consequence associated with that condition. If the text contains “red”, for example, the consequence can be that the engine filters on the facet “color”. That is only one kind of consequence. As described above and presented in technical detail below, there are many kinds of consequences, which not only adapt the search but also alter the relevance.

These condition/consequence pairs are set up in your query rules as you configure your indexes. The index is thus prepared in advance, so that when a search is performed, the data is ready for the If-Then matching.

Note that text matching is case insensitive and can include matching on prefixes, suffixes, substrings, and also multiple word phrases.

All of this comes with no noticeable impact on search performance, because most of the work is done during indexing.

API Overview

As suggested above, Query Rules allow you to fine-tune results for queries matching specific patterns. They do so in two complementary ways:

  1. Query pre-processing. Rules may alter the query parameters (i.e. not only the query string, but also filters, facets, etc.) before the query is processed.

  2. Results post-processing. Rules may cause results (hits) to be ranked differently for specific queries. They may also add user data to the results.

Rules are complementary to the traditional ranking and textual relevance settings: while settings act globally on every query to an index, rules act selectively on specific queries.

The features of Query Rules

{
    "condition": { /* What the query must match for the rule to be applied.  */ },
    "consequence": { /* How the query will be modified if the rule is applied. */ }
}

Condition

A rule’s condition identifies query strings matching a specific pattern.

More precisely, it is composed of:

  • A mandatory query pattern (acting on the full text query string, i.e. the query search parameter).

  • An optional context, which must match one of the contexts supplied at query time (via the ruleContexts parameter).

A rule with a context is said to be contextual; a rule without a context is general.

Query pattern

The query pattern is the most important piece of the rule’s condition. It consists of a sequence of tokens, treated as a phrase (i.e. all tokens must appear contiguously and in the specified order). The allowed token types are:

  • Literal: a plain word that must appear as is. Matching is case insensitive, but not typo tolerant. Plurals and synonyms are not taken into account.

  • Facet value placeholder: will match any value of a given facet in the same index. The facet must have been declared in attributesForFaceting. Matching is case insensitive. Contrary to literals, facet values may be phrases.

Because the pattern is implicitly a phrase, the order of words matters: foo bar and bar foo are not identical patterns.
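To make the two token types concrete, here is a minimal Python sketch that splits a pattern into literal and facet placeholder tokens. The {facet:name} notation comes from the examples on this page; the function name parse_pattern and the tuple representation are assumptions made for illustration and are not part of the API.

```python
import re

# Matches a facet value placeholder such as "{facet:brand}".
PLACEHOLDER = re.compile(r"\{facet:([^}]+)\}")


def parse_pattern(pattern):
    """Split a pattern into ("literal", word) and ("facet", name) tokens."""
    tokens = []
    for part in pattern.split():
        match = PLACEHOLDER.fullmatch(part)
        if match:
            tokens.append(("facet", match.group(1)))
        else:
            # Literals match case-insensitively, so normalize them here.
            tokens.append(("literal", part.lower()))
    return tokens


print(parse_pattern("{facet:brand} smartphone"))
print(parse_pattern("foo bar") == parse_pattern("bar foo"))
```

Keeping the tokens in a list preserves their order, which is what makes foo bar and bar foo distinct patterns.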

In addition, the pattern has an anchoring type, depending on whether its boundaries (beginning, end) must coincide with the boundaries of the query string:

  • is: the pattern must exactly match the entire query string;
  • starts with: the pattern must match the beginning of the query string, but there may be extra words at the end;
  • ends with: the pattern must match the end of the query string, but there may be extra words at the beginning;
  • contains: the pattern can match any subsequence of words from the query string.
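The four anchoring types can be sketched as follows. This is a simplified illustration, not the engine's implementation: it handles literal tokens only, the function name find_match is hypothetical, and the anchoring strings are the labels used in the list above (the values accepted by the API may be spelled differently).

```python
def find_match(pattern_words, query_words, anchoring):
    """Return the index in query_words where the pattern matches, or None.

    Only literal tokens are handled here; matching is case insensitive.
    """
    p = [w.lower() for w in pattern_words]
    q = [w.lower() for w in query_words]
    n, m = len(q), len(p)
    if m == 0 or m > n:
        return None
    if anchoring == "is":
        # The pattern must cover the entire query string.
        starts = [0] if m == n else []
    elif anchoring == "starts with":
        starts = [0]
    elif anchoring == "ends with":
        starts = [n - m]
    elif anchoring == "contains":
        # Any contiguous run of words may match.
        starts = range(n - m + 1)
    else:
        raise ValueError(f"unknown anchoring: {anchoring}")
    for start in starts:
        if q[start:start + m] == p:
            return start
    return None


query = "red apple iphone case".split()
print(find_match(["apple", "iphone"], query, "contains"))
print(find_match(["apple", "iphone"], query, "starts with"))
```

Returning the start index rather than a boolean is a design choice for this sketch: the position of the match is what the precedence logic described later needs.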

Context

A rule can have one context at most, but the query may specify multiple contexts which are treated as disjunctive (OR). When one or more contexts are specified at query time, contextual rules matching any of those contexts are activated. Note that general rules are always activated, no matter whether contexts are specified or not.

The context’s primary goal is to conditionally enable rules for only a subset of queries (e.g. in a specific category of an e-commerce application).

If present, the context is a string that must be passed at query time in the ruleContexts search parameter with the exact same value for the rule to be triggered. Matching is case sensitive.

The ruleContexts parameter is an array of contexts, which allows you to enable multiple contexts at the same time.
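The activation logic described above can be sketched in a few lines of Python. The rule dictionaries mirror the JSON shape used on this page; the function name active_rules is an assumption for illustration only.

```python
def active_rules(rules, rule_contexts=()):
    """Keep general rules plus contextual rules whose context was supplied."""
    supplied = set(rule_contexts)
    return [
        rule for rule in rules
        # A missing context means the rule is general and always active.
        if rule["condition"].get("context") is None
        # Contexts are OR-ed; the comparison is exact and case sensitive.
        or rule["condition"]["context"] in supplied
    ]


rules = [
    {"objectID": "general", "condition": {"pattern": "sale"}},
    {"objectID": "phones",
     "condition": {"pattern": "smartphone", "context": "electronics_phones"}},
]
print([r["objectID"] for r in active_rules(rules)])
print([r["objectID"] for r in active_rules(rules, ["electronics_phones", "tv"])])
```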

Example

Here is what a typical rule condition looks like:

{
    "condition": {
        "pattern": "{facet:brand} smartphone",
        "anchoring": "contains",
        "context": "electronics_phones"
    },
    "consequence": { /* [...] */ }
}

For the complete schema, please refer to the API Reference.

Consequence

A rule’s consequence is composed of any non-empty combination of the following actions:

  • Add query parameters: Any number of any valid search parameters are supported. Note that these parameters are literals, i.e. constants.
{
    "condition": { /* [...] */ },
    "consequence": {
        "params": {
            "optionalWords": ["page", "line"],
            /* Any other query parameter is allowed. */
        }
    }
}
  • Automatic facet filter: Transform a facet value placeholder into a facet filter: the value from the query, as captured by the placeholder, is used as the value for the facet filter. (Of course, the targeted facet is the same as the placeholder’s.)
{
    "condition": {
        "pattern": "{facet:brand}",
        /* [...] */
    },
    "consequence": {
        "params": {
            "automaticFacetFilters": ["brand"]
        }
    }
}
  • Remove a word from the query string. The word is identified by the same notation as in the query pattern (literal, facet value placeholder…). If the removed word has multiple occurrences in the query string, all occurrences are removed.
{
    "condition": {
        "pattern": "{facet:brand}",
        /* [...] */
    },
    "consequence": {
        "params": {
            "query": {
                "remove": "{facet:brand}"
            }
        }
    }
}
  • Replace the query string entirely. (This is mutually exclusive with removing a word.) It may also have an impact on subsequent rules (see Matching algorithm).
{
    "condition": { /* [...] */ },
    "consequence": {
        "params": {
            "query": "this will replace the query string"
        }
    }
}
  • Promote specific hits. One or more objects from the same index, identified by their objectID, are promoted to specific positions in the hits. This works seamlessly with Algolia’s built-in pagination, mingling promoted hits with regular hits while keeping pages the same size. Hit promotion is detailed below.
{
    "condition": { /* [...] */ },
    "consequence": {
        "promote": [
            { "objectID": "a1b2c3d4", "position": 0 }, // positions are zero-based
            { "objectID": "e5f6xyzt", "position": 5 }
        ]
    }
}
  • Return user data. This user data is returned outside of the hits, in a dedicated userData array. If multiple rules are applied, each user data is simply appended to this array. User data can be used to display specific information that does not affect pagination.
{
    "condition": { /* [...] */ },
    "consequence": {
        "userData": {
            "message": "Summer Sales going on!",
            "blink": true,
            "url": "https://www.my-awesome-shop.com/sales/summer"
        }
    }
}
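To illustrate how the query-related consequences interact, here is a rough Python sketch covering three of them: automatic facet filters, word removal, and full query replacement. The function apply_consequence, its captured argument (the facet values captured by placeholders) and the "facet:value" filter strings it returns are assumptions made for this example; the real processing happens inside the engine.

```python
def apply_consequence(query, params, captured):
    """Apply a simplified subset of a rule's "params" to a query.

    query    -- the full-text query string
    params   -- the rule's consequence["params"] dictionary
    captured -- facet values captured by placeholders, e.g. {"brand": "apple"}
    Returns (new_query, facet_filters).
    """
    facet_filters = [
        f"{facet}:{captured[facet]}"
        for facet in params.get("automaticFacetFilters", [])
        if facet in captured
    ]
    new_query = params.get("query", query)
    if isinstance(new_query, str):
        # Either no "query" key (query unchanged) or a full replacement.
        return new_query, facet_filters
    # Otherwise the value is {"remove": token}.
    token = new_query["remove"]
    if token.startswith("{facet:") and token.endswith("}"):
        target = captured[token[len("{facet:"):-1]]
    else:
        target = token
    target_words = target.lower().split()
    words, kept, i = query.split(), [], 0
    while i < len(words):
        window = [w.lower() for w in words[i:i + len(target_words)]]
        if target_words and window == target_words:
            i += len(target_words)  # drop every occurrence of the word or phrase
        else:
            kept.append(words[i])
            i += 1
    return " ".join(kept), facet_filters


params = {"automaticFacetFilters": ["brand"], "query": {"remove": "{facet:brand}"}}
print(apply_consequence("apple iphone case", params, {"brand": "apple"}))
```

Removal and replacement share the "query" key, which is why they are mutually exclusive: the value is either a string or an object with a "remove" field.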

Hit promotion

Only objects coming from the same index can be promoted. Promoted objects have to be explicitly identified by their objectID.

A promoted object will always be considered a hit, even if it doesn’t match the query. If it would have matched the query, it is removed from its original position and inserted at its promoted position, even if the original position would have been better than the promoted position (in other words, promoted hits can also be “demoted”). For performance reasons, promoted positions are restricted to the range [0, 50] (keep in mind that positions are zero-based).

Inside the same rule, each promoted object must have a different promoted position. If promoted objects from two distinct rules are triggered for the same query:

  • Any duplicates are merged, using the best position.
  • If the resulting positions conflict between distinct objects, objects are shifted down until a free slot is found.
  • All regular hits are shifted down as many times as necessary to ensure that all promoted objects get as close to their promoted position as possible (modulo conflicts between objects, as stated above).
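The merging steps above can be sketched as follows. This is an illustration under stated assumptions rather than the engine's algorithm: the function name merge_promotions is hypothetical, ties between distinct objects promoted to the same position are broken by input order, and promoted objects whose position lies beyond the last regular hit are simply appended at the end.

```python
def merge_promotions(regular_hits, promotions):
    """Merge promoted objects into a ranked list of regular hits.

    regular_hits -- objectIDs in their normal ranking order
    promotions   -- (objectID, position) pairs gathered from all applied rules
    """
    # Merge duplicates, keeping the best (lowest) position for each object.
    best = {}
    for object_id, position in promotions:
        best[object_id] = min(position, best.get(object_id, position))

    # Assign slots; on conflict, shift the object down to the next free slot.
    slots = {}
    for object_id, position in sorted(best.items(), key=lambda item: item[1]):
        while position in slots:
            position += 1
        slots[position] = object_id

    # Promoted objects leave their original position in the regular hits.
    queue = [hit for hit in regular_hits if hit not in best]

    merged, position = [], 0
    while queue or slots:
        if position in slots:
            merged.append(slots.pop(position))
        elif queue:
            merged.append(queue.pop(0))
        else:
            # No regular hits left: append leftover promoted objects in order.
            merged.extend(slots[p] for p in sorted(slots))
            break
        position += 1
    return merged


print(merge_promotions(["h1", "h2", "h3", "h4"], [("p1", 0), ("h3", 1), ("p1", 2)]))
```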

User data

User data lets you inject data into the results that does not come from objects in the index and, as such, does not compete with other hits for pagination. A typical use case would be to display a banner on top of the result list.

User data can be any JSON object. It is not interpreted by the API whatsoever.

Matching algorithm

Because an index may contain many rules, each matching a different part of the query, with possible conflicts between them, the API applies a strict precedence logic to decide between rules in a deterministic fashion.

Precedence acts mainly along two axes: specificity (the more specific a rule is, the higher precedence it has—similar to CSS selectors) and query text.

Furthermore, a given word in the query can match only one rule (“no overlap” principle). If multiple rules match the same word, precedence logic is applied, and only the rule with the highest precedence is applied. Note that multiple rules can still match a given query, provided they match distinct sets of words.

Precedence logic

The precedence logic sorts rules along the following criteria. This is a tie-breaking algorithm, much like the ranking formula. In other words, a criterion is only considered when all its preceding criteria rank equal.

  • Position: The earliest match wins (i.e. closest to the beginning of the query string).
  • Match length: The longest match wins (in terms of number of words from the query string).
  • Anchoring: is > starts with > ends with > contains.
  • Context: A contextual rule has higher priority than a general rule.
  • Literals over placeholders: If a word could match either a literal or a facet value, the literal takes precedence.
  • Rule ID: If there are still conflicts after all other criteria have been applied, the rule with the smallest objectID in lexicographical order wins. This step guarantees that every tie is resolved, but it should only be reached if your set of rules contains duplicates.

A consequence of the precedence logic is that rules are applied from left to right (more precisely, from the beginning of the query string to its end).
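The tie-breaking order and the "no overlap" principle can be expressed as a sort key followed by a greedy selection. The sketch below is a simplified model: the match dictionaries, the literal flag (standing in for the word-level literal versus placeholder comparison) and the function names are all assumptions for illustration, and the anchoring strings are the labels used in the list above.

```python
ANCHORING_RANK = {"is": 0, "starts with": 1, "ends with": 2, "contains": 3}


def precedence_key(match):
    """Sort key for a candidate match; smaller tuples have higher precedence."""
    return (
        match["position"],                   # earliest match wins
        -match["length"],                    # longest match wins
        ANCHORING_RANK[match["anchoring"]],  # is > starts with > ends with > contains
        0 if match["context"] else 1,        # contextual beats general
        0 if match["literal"] else 1,        # literal beats placeholder
        match["objectID"],                   # smallest rule ID breaks remaining ties
    )


def select_rules(matches):
    """Pick matches in precedence order, never reusing a query word."""
    used, selected = set(), []
    for match in sorted(matches, key=precedence_key):
        words = set(range(match["position"], match["position"] + match["length"]))
        if not words & used:
            used |= words
            selected.append(match["objectID"])
    return selected


# Candidate matches for a three-word query such as "red shoes sale".
matches = [
    {"objectID": "general-red", "position": 0, "length": 1,
     "anchoring": "contains", "context": None, "literal": True},
    {"objectID": "red-shoes", "position": 0, "length": 2,
     "anchoring": "contains", "context": None, "literal": True},
    {"objectID": "sale", "position": 2, "length": 1,
     "anchoring": "ends with", "context": "promo", "literal": True},
]
print(select_rules(matches))
```

Because position is the first element of the key, sorting by it is what makes rules apply from the beginning of the query string to its end.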

Edge cases

If a rule removes a word from the query string, all subsequent rules that would have been triggered by this word (be it via a literal or a facet placeholder) are disabled.

If a rule replaces the query string entirely, all subsequent rules are disabled.
