
Bringing the T to ETL data pipelines: Solving real-world problems with Algolia data transformations


It can be challenging for customers to format data optimally for use by Algolia. Struggling with data formatting and sourcing not only slows down implementation, it can limit your use of Algolia’s advanced search and discovery features.

To help customers get the most out of Algolia, we’ve added a new data transformation capability to our code editor that addresses these data preparation and integration issues.

Using data transformations minimizes manual work and lets team members solve data challenges themselves. By simplifying the data preparation process, you can reduce implementation time, enhance data quality and ensure optimum deployment of Algolia’s advanced features.

What is Algolia’s data transformation editor?

The data transformation editor is a code editor with pre-built transformation functions that solve common data use cases. The functions are accessible directly inside the user interface.

Pre-built functions let teams prepare data for search-specific tasks, such as hierarchical categories and custom rankings for search personalization. At the same time, transformations are customizable through code editing, so that you can maintain full flexibility over data integration.

Overall, the data transformation functions are designed to help you prepare your data more efficiently. Because the functions are set up for you, they’re easy to implement.

How can you use data transformations?

To illustrate how the data transformation editor works, let’s take a fictional example. Ecommerce retailer Gadget Tech has identified two problems with how data is ingested on its online portal.

The challenges

First, whenever the company offers a seasonal deal or discount, it updates its discount percentages manually. This approach not only wastes time and resources, but can lead to customer confusion and lost sales. Gadget Tech needs a way to automatically calculate and display percentages.

Second, Gadget Tech isn’t taking optimal advantage of personalized product rankings. As a result, customers struggle to find popular products that are uniquely relevant to them. 

Gadget Tech wants to optimize its ranking system by introducing a secondary data source that includes a new attribute: popularity scores.

Solutions

1. Data transformation to display discounted percentages

The first step is to create a new JSON pipeline in the Algolia dashboard, in the Data sources section, under Connectors. Once that source is entered, create a new transformation.

The new editor lets you write the transformation function directly in JavaScript. Because it is built on Monaco, the editor that powers VS Code, auto-complete works on the record and on all JavaScript APIs.

Use the helper functions to set up the necessary data transformations. They’re grouped by category at the bottom of the code editor.

You’ll find the Compute discount percentage function under Compute new properties. From the drop-down menu, you can choose Use the sample at the beginning of the function.

This adds a discount percentage property called discountPercentage to the record. The percentage is rounded by default, but you can change this by editing the function.

Use the right side of the editor to test the function. 

After running the data transformation, the Transformation preview shows that a discountPercentage has been successfully calculated, with a value of 7.
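As a rough illustration of what such a function might look like, here is a minimal sketch. The field names `price` and `originalPrice`, the sample values, and the function name `addDiscountPercentage` are assumptions made for this example; the built-in helper may differ.

```javascript
// Sketch of a discount-percentage transformation.
// Assumption: the record carries `price` (current) and `originalPrice`
// (pre-discount). These field names are illustrative only.
function addDiscountPercentage(record) {
  const { price, originalPrice } = record;

  // Leave the record untouched when the inputs cannot yield a percentage.
  if (
    typeof price !== "number" ||
    typeof originalPrice !== "number" ||
    originalPrice <= 0
  ) {
    return record;
  }

  const discount = ((originalPrice - price) / originalPrice) * 100;

  // Math.round gives a whole-number percentage; use
  // Number(discount.toFixed(1)) instead to keep one decimal.
  return { ...record, discountPercentage: Math.round(discount) };
}

// Hypothetical sample record: (699 - 649) / 699 is about 7.15%.
const sample = { id: "fairphone5", originalPrice: 699, price: 649 };
console.log(addDiscountPercentage(sample).discountPercentage); // 7
```

Returning a new object rather than mutating the input keeps the function easy to test against a single sample record before it runs over the whole source.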

The data transformation is applied to your full file. In other words, all items in your data source will be transformed, one by one.

The code editor is especially useful if you use an API client to ingest your data. This capability is currently in Beta, but you’ll soon be able to use an API client to push the record directly, then apply the transformation.

2. Data transformation to introduce personalized ranking

Gadget Tech’s personalized ranking challenge is more complex, since it involves calling an external service. Because the service is authenticated, it involves a “secret” function.

To call the service, use a helper function under the Fetch / API section at the bottom of your screen. Use the Fetch with API key (in header) function to fetch from any third-party API and call any URL. Adding the function at the current cursor position will add the {record.id} to the URL.

Next, manage the secret. Use the helper objects in the signature to get the secret, rather than copying and pasting in plaintext. 

In this example, Gadget Tech uses a secret called ‘rankingKey’. To create and add it to the vault, click Secrets in the lower panel and fill in the field next to rankingKey. Just like basic credentials, the key is stored securely.
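The call and the secret lookup described above could be sketched roughly as follows. The endpoint URL, the `X-API-Key` header name, the `helper.secrets.get` accessor, and the function name `fetchRanking` are all assumptions made for illustration; check the signature that the editor generates for the actual names.

```javascript
// Sketch of an authenticated lookup against a secondary data source.
// Assumptions (hypothetical, not confirmed by the article): the endpoint
// URL, the header name, and the shape of the `helper` object.
const RANKING_ENDPOINT = "https://ranking.example.com/products";

async function fetchRanking(record, helper, fetchImpl = globalThis.fetch) {
  // Read the key from the vault at run time so it never appears in the code.
  const apiKey = helper.secrets.get("rankingKey");

  // The record id becomes part of the URL, escaped for safety.
  const url = `${RANKING_ENDPOINT}/${encodeURIComponent(record.id)}`;
  const response = await fetchImpl(url, {
    headers: { "X-API-Key": apiKey, Accept: "application/json" },
  });

  if (!response.ok) {
    throw new Error(`Ranking service returned HTTP ${response.status}`);
  }

  // Expected payload in this sketch: { ranking: <number> }
  return response.json();
}
```

Passing the fetch implementation as a parameter is a design choice for this sketch: it lets you exercise the function with a stub before pointing it at the real service.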

Next, use console.log to test that you’re calling the service correctly. Under the Helper functions, choose the Debug function to print the record.id and the result of the external service. 

You can find the result of the console.log in the >_Console panel at the bottom of the code editor. If there are any JavaScript errors or log errors, they’ll appear here.

When you run the data transformation, you’ll see the record.id you’re working with (in this example, it’s fairphone5, as per the input record). In the example, the external service replied with a ranking of 50.

With this data in hand, the next step is to jump to the ranking algorithm.

From Helper functions, choose Ranking. Then select the Gravity/Decay algorithm to compute a blended ranking score. This model is especially effective because it takes recency into account, so that older records have less influence on the ranking.

To compute the age of the record, use the created_at property directly extracted from the record. To determine the popularity score, use the result of the secondary source, which is under result.ranking.

Next, compute the gravity ranking. The transformation preview shows a “gravityRanking” property (in this example, 0.06).
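The article does not spell out the formula behind the helper, so the following is only a sketch of one common gravity/decay model, the one popularized by Hacker News: the popularity score divided by the record’s age raised to a "gravity" exponent. The `+ 2` offset, the default gravity of 1.8, and the function names are assumptions; the built-in helper may use different constants and will produce different values.

```javascript
// Sketch of a gravity/decay score:
//   score = popularity / (ageInHours + 2) ^ gravity
// Assumptions: the formula, the +2 offset, and gravity = 1.8 are
// illustrative and not taken from Algolia's helper.
function gravityRanking(popularity, createdAt, now = new Date(), gravity = 1.8) {
  const msPerHour = 60 * 60 * 1000;
  // `createdAt` may be an ISO 8601 string or a millisecond timestamp.
  const ageMs = now.getTime() - new Date(createdAt).getTime();
  const ageInHours = Math.max(0, ageMs / msPerHour);
  return popularity / Math.pow(ageInHours + 2, gravity);
}

// Combine the record's own `created_at` with the secondary source's
// `result.ranking`, as described above.
function addGravityRanking(record, result, now = new Date()) {
  const score = gravityRanking(result.ranking, record.created_at, now);
  // Two decimals keep the attribute readable in the index.
  return { ...record, gravityRanking: Number(score.toFixed(2)) };
}
```

A higher gravity value makes scores fall off faster with age, which is the main parameter to tune if older products stay ranked too high.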

With this value, you can now configure ranking directly in your index so that results are sorted by each record’s popularity attribute.
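One way to express this is through the index’s `customRanking` setting, a standard Algolia index setting that accepts `asc(...)` and `desc(...)` modifiers. The fragment below is a minimal sketch; using `gravityRanking` as the only custom ranking attribute is an assumption for this example.

```javascript
// Index settings fragment: order results by the computed score, highest first.
// Assumption: `gravityRanking` is the only custom ranking attribute needed.
const indexSettings = {
  customRanking: ["desc(gravityRanking)"],
};
```

These settings can be applied from the index configuration in the dashboard or through an API client.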

At this point, to continue the workflow, give your function a new name and save it. Create a new index and destination, then launch a new run from this ingestion task.

You can monitor what’s happening on the dashboard by clicking Status or Connector Debugger. In the example, you can see the system has successfully completed the following steps:

  • fetched from the source

  • applied the transformation

  • dumped the record in the Algolia index

With the data transformation pipeline successfully completed, Gadget Tech can go to the Indices section of its dashboard to see that values for both discountPercentage and gravityRanking have been added to each record.

Start using data transformations to maximize search

Pre-built functions simplify and optimize data ingestion to improve search-specific performance. You can use them to calculate custom percentages, fetch data, and set up a custom ranking model. 

Algolia continues to build tools that streamline data preparation for search. Learn more about Data Transformations.

And check out our DevCon video: Bringing the T to ETL data pipelines: Solving real-world problems with Algolia data transformations.
