Search by Algolia
Feature Spotlight: Query Rules
product

Feature Spotlight: Query Rules

You’re running an ecommerce site for an electronics retailer, and you’re seeing in your analytics that users keep ...

Jaden Baptista

Technical Writer

An introduction to transformer models in neural networks and machine learning
ai

An introduction to transformer models in neural networks and machine learning

What do OpenAI and DeepMind have in common? Give up? These innovative organizations both utilize technology known as transformer models ...

Vincent Caruana

Sr. SEO Web Digital Marketing Manager

What’s the secret of online merchandise management? Giving store merchandisers the right tools
e-commerce

What’s the secret of online merchandise management? Giving store merchandisers the right tools

As a successful in-store boutique manager in 1994, you might have had your merchandisers adorn your street-facing storefront ...

Catherine Dee

Search and Discovery writer

New features and capabilities in Algolia InstantSearch
engineering

New features and capabilities in Algolia InstantSearch

At Algolia, our business is more than search and discovery, it’s the continuous improvement of site search. If you ...

Haroen Viaene

JavaScript Library Developer

Feature Spotlight: Analytics
product

Feature Spotlight: Analytics

Analytics brings math and data into the otherwise very subjective world of ecommerce. It helps companies quantify how well their ...

Jaden Baptista

Technical Writer

What is clustering?
ai

What is clustering?

Amid all the momentous developments in the generative AI data space, are you a data scientist struggling to make sense ...

Vincent Caruana

Sr. SEO Web Digital Marketing Manager

What is a vector database?
product

What is a vector database?

Fashion ideas for guest aunt informal summer wedding Funny movie to get my bored high-schoolers off their addictive gaming ...

Vincent Caruana

Sr. SEO Web Digital Marketing Manager

Unlock the power of image-based recommendation with Algolia’s LookingSimilar
engineering

Unlock the power of image-based recommendation with Algolia’s LookingSimilar

Imagine you're visiting an online art gallery and a specific painting catches your eye. You'd like to find ...

Raed Chammam

Senior Software Engineer

Empowering Change: Algolia's Global Giving Days Impact Report
algolia

Empowering Change: Algolia's Global Giving Days Impact Report

At Algolia, our commitment to making a positive impact extends far beyond the digital landscape. We believe in the power ...

Amy Ciba

Senior Manager, People Success

Retail personalization: Give your ecommerce customers the tailored shopping experiences they expect and deserve
e-commerce

Retail personalization: Give your ecommerce customers the tailored shopping experiences they expect and deserve

In today’s post-pandemic-yet-still-super-competitive retail landscape, gaining, keeping, and converting ecommerce customers is no easy ...

Vincent Caruana

Sr. SEO Web Digital Marketing Manager

Algolia x eTail | A busy few days in Boston
algolia

Algolia x eTail | A busy few days in Boston

There are few atmospheres as unique as that of a conference exhibit hall: the air always filled with an indescribable ...

Marissa Wharton

Marketing Content Manager

What are vectors and how do they apply to machine learning?
ai

What are vectors and how do they apply to machine learning?

To consider the question of what vectors are, it helps to be a mathematician, or at least someone who’s ...

Catherine Dee

Search and Discovery writer

Why imports are important in JS
engineering

Why imports are important in JS

My first foray into programming was writing Python on a Raspberry Pi to flicker some LED lights — it wasn’t ...

Jaden Baptista

Technical Writer

What is ecommerce? The complete guide
e-commerce

What is ecommerce? The complete guide

How well do you know the world of modern ecommerce?  With retail ecommerce sales having exceeded $5.7 trillion worldwide ...

Vincent Caruana

Sr. SEO Web Digital Marketing Manager

Data is king: The role of data capture and integrity in embracing AI
ai

Data is king: The role of data capture and integrity in embracing AI

In a world of artificial intelligence (AI), data serves as the foundation for machine learning (ML) models to identify trends ...

Alexandra Anghel

Director of AI Engineering

What are data privacy and data security? Why are they  critical for an organization?
product

What are data privacy and data security? Why are they critical for an organization?

Imagine you’re a leading healthcare provider that performs extensive data collection as part of your patient management. You’re ...

Catherine Dee

Search and Discovery writer

Achieving digital excellence: Algolia's insights from the GDS Retail Digital Summit
e-commerce

Achieving digital excellence: Algolia's insights from the GDS Retail Digital Summit

In an era where customer experience reigns supreme, achieving digital excellence is a worthy goal for retail leaders. But what ...

Marissa Wharton

Marketing Content Manager

AI at scale: Managing ML models over time & across use cases
ai

AI at scale: Managing ML models over time & across use cases

Just a few years ago it would have required considerable resources to build a new AI service from scratch. Of ...

Benoit Perrot

VP, Engineering

Looking for something?

facebookfacebooklinkedinlinkedintwittertwittermailmail

We invited our friends at Starschema to write about an example of using Algolia in combination with PostgreSQL. Enjoy this two-part series by Software Developer Mátyás Budavári!

If you would like to skip ahead to part two, where the implementation is covered, click here.


Hello, I’m Mátyás, a software developer at Starschema. This post is about the audit of one of our products and how we can leverage the speed and distributed nature of Algolia to let our moderators search through the audit data without having access to the underlying PostgreSQL database.

Use Case

Our tool lets the users of our enterprise partners access all their reports from different Business Intelligence (BI) software and other resources in a centralized place. The most popular BI tools that we work with are Tableau and PowerBI. BI tools help communicate, present and analyze data with powerful interactive visualizations that can work even with real-time data. Managing multiple data sources and dashboards is often a frustrating and time-consuming experience filled with noise that harms the ability to derive insights and leads to fallible decisions.

We developed a service to improve the data analysis and presentation process by enabling individuals or teams to curate and collaborate with customized shared workspaces. Users can focus on analyzing data and communicating insights without having to shuffle between applications and authenticate multiple times. The contents of these reports are still served from the original tools, and we provide a unified experience on top of them. In addition, our users can modify the visible name of these reports and add additional tags and other metadata that make it easier for their teams to organize these reports.

Indexing Business Intelligence Data

We collect usage data periodically, which enables us to fine-tune the application based on how our users actually use it. In addition to collecting these metrics, we must comply with security requirements. We need to be able to tell who could access certain reports at a given time and what changes were made in the application by whom in a specific time range. We collect so-called audit logs to be able to extract this information. This data must be handled with care, and only people with higher privileges shall have access to the raw data.

In a recent project, we needed to develop an optional extension to extract anonymized audit log information into an external place out of our PostgreSQL database to let a team of content managers access that data quickly. These audit logs can contain JSON metadata, and it is hard to set up advanced search features with PostgreSQL. It’s better to index the data so that it will speed up the queries.

What we need

We need data about the usage distribution to present to the creators of the BI reports. Both Tableau and PowerBI provide usage metrics on their reports, but we’d like to show the usage information in a unified manner. We can show them detailed information on how their content is accessed through our application.

Since the report owners often want to know the peak periods of their insights, we need audit information to see what reports were accessed by whom through our application.

Note that the external BI sources manage permissions in their specific way. For this reason, the creators need to set the audience access up in the tool they use to publish the reports, we synchronize this information, and only those BI reports will show up for the given users what they can access.

How we can achieve it

We mainly use our application in internal and protected networks. Most of the time, due to strict policies, it’s not possible to connect dedicated analytics services so we save our analytics data into our database. However, most of the time, our analytics team does not have access to the database. We don’t send out actual user data and report metadata. Sometimes, we can send out anonymized data and tell them what specific identifiers represent in a different medium if it’s relevant and we’re allowed to share.

All in all, we need a way to query the usage logs fast without access to the original database.

We decided to use Algolia as the search index. We don’t want to host, manage and maintain our search index service per installation, and it’s much better to have a managed centralized place where we can send our data. Our focus groups can use it to compose the reports they need based on the data sent over there.

Out of security and respect for our clients, it’s an opt-in feature.

Architecture

Our backend service of the application is written in Golang. We store the connection information and report metadata for the BI Tools and the usage information in a PostgreSQL database.

We aim for a solution as close to the database as possible. We want to leverage the language features that PostgreSQL provides without any extensions. And we don’t want to modify our code too much to support this data gathering, though we’re open to running small microservices that communicate with Algolia.

To send our data to Algolia, we need to write some kind of code outside PostgreSQL.

We want to create the data load component as a separate service with the single purpose of sending data to the index. Due to the privacy of our users and reports, we can not expose our database to the above-mentioned external auditors. We can only send them a selected set of information.

Proposed solution

Indexing Business Intelligence data

The audit log table can grow large over time in proportion to usage. We don’t want to query the whole log table every time we need to send out data. Having a small queue, that takes care of the data load is better. Each log line is necessary, but we can’t rely on these small services to be available at any point in time. We have to save what rows are needed to be processed.

We plan to create a new table in PostgreSQL that stores the filtered data that is yet to be sent to Algolia. This table will act as a LIFO (last-in-first-out) queue to prioritize the new data over the old ones.

We can send our already collected data into Algolia upon creating this new queue table.

We aim to point a trigger to the source table, and on every insert statement, a PostgreSQL procedure copies the necessary fields into this new table.

We plan to write a new service that periodically reads the new table, gets the last few items, and sends them to Algolia. It would be even better if multiple such mini-services could run simultaneously without interrupting each other and work on the queue independently.

In the next part of this article series, we’ll put the plan in motion by setting up the required triggers, seeding some random test data, and finally setting up the Producer and Consumer code to handle syncing data from PostgreSQL to our Algolia Index.

About the author
Mátyás Budavári

Software Developer Team Lead

Recommended Articles

Powered byAlgolia Algolia Recommend

Part 2: Business Intelligence Data Indexing from PostgreSQL using Algolia – Implementation and conclusion
engineering

Mátyás Budavári

Software Developer Team Lead

Part 2: Supercharging search for ecommerce solutions with Algolia and MongoDB — Proposed solution and design
engineering

Soma Osvay

Full Stack Engineer, Starschema

Introducing our new navigation
product

Craig Williams

Director of Product Design & Research