
Best practices for federated search

Federated search essentially runs a search query across multiple sites, applications, or data sources, and displays the results in a single or multi-column layout. 

With a federated search interface, a media streaming platform could display a list of songs, artists, and podcasts, all coming from different data sources, with a single query. Ecommerce sites could benefit as well by displaying recommendations and different product categories in different parts of the page.

In a website context, you could create a real-time search application that cuts across multiple domains or subdomains. It’s as easy as adding a new domain or subdomain to the list of sites to include in a central index, then creating a few additional rules for how search results should appear. For example, if you have a “parent” website and several “sister” websites, you may want the parent website to display results from all sites equally, while each sister site favors its own results.
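
The parent/sister rule above can be sketched in a few lines. This is a minimal illustration, not any particular engine's API: the `Hit` record, the `rank_for_origin` function, the example domains, and the `boost` value of 1.5 are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    url: str
    site: str     # domain the record was indexed from
    score: float  # base relevance score (hypothetical values)


def rank_for_origin(hits, origin_site, parent_site, boost=1.5):
    """Rank hits, favoring the originating site unless it is the parent."""
    def adjusted(hit):
        # Sister sites boost their own results; the parent treats all sites equally.
        if origin_site != parent_site and hit.site == origin_site:
            return hit.score * boost
        return hit.score

    return sorted(hits, key=adjusted, reverse=True)


hits = [
    Hit("https://parent.example/a", "parent.example", 1.0),
    Hit("https://sister.example/b", "sister.example", 0.8),
]

on_parent = rank_for_origin(hits, "parent.example", "parent.example")
on_sister = rank_for_origin(hits, "sister.example", "parent.example")
```

Searching from the parent keeps the original order, while the same query from the sister site moves that site's record to the top.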

For enterprise search, federated search could span different information sources such as intranets, databases, and other datasets. You could centralize your back-end data or execute the same query on different applications and datasets.

In all cases, federated search starts with a single query and returns diverse results from multiple data sources to help your users pinpoint the information they need.

Federated search case study: NSW.gov.au

NSW.gov.au is the public face of the Australian state of New South Wales. As with other governments around the world, there’s a top-level government website plus dozens of other agency sites such as the department of treasury, ministry of health, department of industry, etc.

The team at NSW.gov.au found that visitors were often searching on questions around driver’s licenses, liquor licenses, moving, and other topics that were available on a sister agency site, ServiceNSW.


Driver’s license renewal information is on ServiceNSW. NSW.gov.au offers federated search across both sites to provide a better user experience for site visitors.

To accommodate visitors, NSW.gov.au blended datasets from ServiceNSW (including web pages, PDF, and DOCX content) and delivered results alongside the parent website’s content.

Web searches on NSW.gov.au include results from both sites, while relevance scoring and machine learning automatically adjust the ranking for certain keyword searches. For example, visitors who type “rego” (Australian shorthand for vehicle registration) on NSW.gov.au get results from ServiceNSW.

Each site is crawled independently and the data is consolidated into a federated search index; no additional connectors were needed. In other situations, you may need to use our APIs and other developer resources to build a central index.

Considerations for federated search across websites

There are several considerations for determining exactly how to federate search data across sites. Here are just a few. If you want more information, take a look at our in-depth implementation of federated search.

Organizing search results

Every site has its own goals and objectives. The content and audience can vary considerably from site to site, so it’s important to ensure that the right content is delivered for the right query.

You can configure rules to deliver results on one or more sites in many different ways. For example, you could organize by category, by filters on the front end, or even by your backend indexing processes and query executions. And these are just a few ways to think about it. Let’s look at each of them briefly. 

  • Categories: Each site could have its own search synonyms or promotions. In that case, you may want to configure the system to search across categories differently on each site depending on where the query originates. 
  • Filters & facets: You could have different filters or facets on each site to narrow results to just that site or by topic. In other words, the filters could be explicitly by domain or, if the sites have very different content, by content type.
  • Indexing: Your indexing process can tag each record as it comes in, for example giving content from websites A and B a shared tag and content from site C a different one. You could then build rules for search faceting by tag. Or maybe you want to bias results toward the website someone is searching on. In that case, you would add steps at query time that boost that domain’s results. You can also do filtering in the pipeline itself.
  • Multiple queries: You can execute the same query in parallel on different data sources and return the results from each dataset separately.
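
As a rough sketch of the "multiple queries" option, the same query can be fanned out to several sources concurrently and the results kept grouped by source. The in-memory `SOURCES` dictionary and the substring matching in `search_source` are stand-ins invented for this example; a real deployment would call each source's own search backend.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data sources; each list stands in for a separate index or API.
SOURCES = {
    "site_a": ["Renew a driver licence", "Apply for a liquor licence"],
    "site_b": ["Driver licence fees", "Moving to NSW"],
}


def search_source(name, query):
    """Naive case-insensitive substring search over one source."""
    needle = query.lower()
    return [doc for doc in SOURCES[name] if needle in doc.lower()]


def federated_search(query):
    """Run the same query against every source in parallel."""
    with ThreadPoolExecutor(max_workers=len(SOURCES)) as pool:
        futures = {
            name: pool.submit(search_source, name, query) for name in SOURCES
        }
        # Results stay grouped by source so the UI can render them separately.
        return {name: future.result() for name, future in futures.items()}


results = federated_search("driver")
```

Keeping results grouped by source suits a multi-column layout; merging them into a single ranked list would need an extra scoring step.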

Most likely, your solution will be some mixture of all the above. In other words, there are many different options for a federated search solution, both at index time and at query time, to deliver relevant information. Start by determining the goals and outcomes for visitors on each site; then you can determine exactly how to accomplish those goals with search.

Indexing, schema, and data transformation

Perhaps the biggest challenge of federating site search is indexing and managing radically different schemas and site organization.

  • Different schema: Sites can each use different schemas such as Dublin Core, Open Graph, or Schema.org which have different metadata fields and date and time formats. 
  • Domain structure: Each site could have a very different URL structure and hierarchy. Search engines can use the URL paths (e.g., /index, /products/, /services/, /services/details/, etc.) to categorize results and improve relevance.
  • Tagging: The index can be impacted by (1) how h1, h2, h3, etc., tags are structured and (2) what metadata is included within tags (e.g., meta labels and properties).

To manage these differences, you can add rules to transform data as it’s being indexed. For example, you will want to store records and data, such as time/date, in a consistent format. You may also choose to transform content for the search index. One website may call it “Corona virus” and another might call it “COVID-19,” so you’ll want the index to treat the two terms as synonyms. This can also be handled through more advanced vector analysis that clusters the data into numeric topics.
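
A minimal sketch of such index-time transformation rules follows. The record fields (`title`, `published`), the two accepted date formats, and the `SYNONYMS` table are assumptions chosen for illustration; real sites will need their own lists.

```python
from datetime import datetime

# Assumed input date formats; extend to match what each site actually emits.
DATE_FORMATS = ["%Y-%m-%d", "%d/%m/%Y"]

# Hypothetical synonym table mapping variants to one canonical topic.
SYNONYMS = {
    "corona virus": "covid-19",
    "coronavirus": "covid-19",
    "covid-19": "covid-19",
}


def normalize_date(raw):
    """Return the date as YYYY-MM-DD, or None if no known format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw, fmt).date().isoformat()
        except ValueError:
            continue
    return None


def canonical_topics(text):
    """Collect the canonical topic for every synonym variant found in the text."""
    lowered = text.lower()
    return sorted({topic for variant, topic in SYNONYMS.items() if variant in lowered})


def transform(record):
    """Produce the record shape stored in the shared index."""
    return {
        "title": record["title"],
        "published": normalize_date(record["published"]),
        "topics": canonical_topics(record["title"]),
    }


site_a = transform({"title": "Corona virus update", "published": "05/03/2021"})
site_b = transform({"title": "COVID-19 testing sites", "published": "2021-03-05"})
```

Both records end up with the same date format and the same `covid-19` topic, so a query for either term can match both.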

Another consideration is duplication of content. Different sources of data may have the same type of content, e.g., /about or /company pages, so when someone is searching for information about the business, they could come across both. You’ll want to decide how to handle duplicate or very similar content. 
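
One simple way to handle exact or whitespace-level duplicates is to fingerprint each record's text and keep a single record per fingerprint. The sketch below assumes records with hypothetical `site`, `url`, and `body` fields and a caller-supplied site priority; it does not catch near-duplicates with different wording, which would need a similarity measure instead of a hash.

```python
import hashlib
import re


def fingerprint(text):
    """Hash the text after collapsing whitespace and lowercasing."""
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def dedupe(records, site_priority):
    """Keep one record per fingerprint, preferring sites listed earlier."""
    rank = {site: position for position, site in enumerate(site_priority)}
    lowest = len(rank)  # sites missing from the priority list rank last
    best = {}
    for record in records:
        key = fingerprint(record["body"])
        current = best.get(key)
        if current is None or rank.get(record["site"], lowest) < rank.get(current["site"], lowest):
            best[key] = record
    return list(best.values())


records = [
    {"site": "sister.example", "url": "/about", "body": "About  our Company"},
    {"site": "parent.example", "url": "/company", "body": "about our company"},
    {"site": "sister.example", "url": "/contact", "body": "Contact us"},
]
unique = dedupe(records, ["parent.example", "sister.example"])
```

Here the parent site's /company page wins over the sister site's /about page because the parent is listed first in the priority order.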

Filters

Search filters help users narrow their search query to find exactly what they want. With federated search, you can create filters that cut across all sites or that are specific to each domain.


Generally speaking, there are three different types of filters. These are not mutually exclusive; you can use one or all three if you want.

  • Static filters, which allow end users to filter content after entering a search query. For example, you can give visitors a way to filter results by topic or rating.
  • Dynamic filters (also called facets), which are generated based on the values in the search result set. As an example, if a user is searching for “car” you could display all the brands (Toyota, Ford, Volvo, etc.) available for that category, which would be different from a search for “boats” or “motorcycles.”
  • Filter expressions, which are applied to the query itself so that end users always see the filtered results. For example, you could limit results to only one site or part(s) of your site(s). If you sell shirts, shoes, and jewelry, you could exclude results from one or more sections. In other words, categories are a natural place to start your filters.
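
To make the difference between a filter expression and a dynamic facet concrete, here is a small sketch. The record fields (`site`, `brand`) and both helper functions are illustrative assumptions, not a specific product's API.

```python
from collections import Counter


def apply_site_filter(records, allowed_sites):
    """Filter expression: always restrict results to the allowed sites."""
    return [record for record in records if record["site"] in allowed_sites]


def facet_counts(records, field):
    """Dynamic facet: count the values of a field within the current result set."""
    return Counter(record[field] for record in records if field in record)


results = [
    {"site": "a.example", "brand": "Toyota"},
    {"site": "a.example", "brand": "Ford"},
    {"site": "b.example", "brand": "Toyota"},
]

all_brands = facet_counts(results, "brand")
site_a_brands = facet_counts(apply_site_filter(results, {"a.example"}), "brand")
```

Because the facet counts are computed from whatever result set remains, applying the site filter first changes which brands and counts the visitor sees.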

Whichever kind of filter(s) you choose, you’ll want to consider how they show up on each site. You can use the same filters on each site, or deliver filters contextually. 

Analytics

A brief note on analytics: if you’ve added federated searching across different sources, how do you know it’s working? Each site owner will want a view into site search performance on their site. Metrics such as click-through rate (CTR), conversions, popular queries, and searches that return poor or no results should be monitored to ensure visitors are finding what they need.
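
A per-site view of two of these metrics could be computed roughly as follows. The event shape (`site`, `clicked`, `result_count`) is hypothetical, and CTR is taken here to mean the share of searches that led to at least one click, which is one of several possible definitions.

```python
from collections import defaultdict


def site_metrics(events):
    """Aggregate search events into CTR and no-results rate per site."""
    totals = defaultdict(lambda: {"searches": 0, "clicks": 0, "no_results": 0})
    for event in events:
        site_totals = totals[event["site"]]
        site_totals["searches"] += 1
        if event["clicked"]:
            site_totals["clicks"] += 1
        if event["result_count"] == 0:
            site_totals["no_results"] += 1
    return {
        site: {
            "ctr": t["clicks"] / t["searches"],
            "no_results_rate": t["no_results"] / t["searches"],
        }
        for site, t in totals.items()
    }


events = [
    {"site": "parent.example", "clicked": True, "result_count": 12},
    {"site": "parent.example", "clicked": False, "result_count": 0},
    {"site": "parent.example", "clicked": True, "result_count": 3},
    {"site": "parent.example", "clicked": False, "result_count": 7},
    {"site": "sister.example", "clicked": True, "result_count": 5},
    {"site": "sister.example", "clicked": False, "result_count": 2},
]
metrics = site_metrics(events)
```

Splitting the numbers by originating site lets each site owner see their own performance even though the index is shared.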

Conclusion

Federated search can provide a better search experience for end users, but it requires a good deal of planning to ensure the results match expectations on each site. It’s worth pointing out that federating search doesn’t mean that each site needs to use the same CMS or adhere to the exact same schema or metadata standards. As long as the data in the search index can be standardized, you can deliver great results.

About the author

Jon Silvers

Director, Digital Marketing
