Indexing Markdown content with Algolia

I teamed up with Starschema Full Stack Engineer Soma Osvay to write about a project near and dear to my heart: developer documentation. Soma and I both hope you enjoy this article where we cover the use-case and challenges and the implementation Soma came up with.


Here at Starschema, many of the consulting projects we’ve delivered are documented in markdown. When we support these existing solutions or start a new project, we often want to search through all of that documentation. We currently have no way to do so, which costs us hours of manual work.

Use Case

We need to be able to search for projects related to specific topics to check implementation specifics, get ideas from the code, and much more. The Sales team also needs a solution to determine what kind of projects our team has completed for certain topics and be able to quickly communicate it back to potential clients. It would also be great to have something like an internal Stack Overflow for our Developers since we are doing a lot of deep technical work in Tableau. Project Managers also need to be able to determine which employees have worked with certain technologies so they can get questions answered quickly and easily.

In short, we need to be able to search for our own projects based on the following attributes:

  1. Technologies used (coding languages, databases, etc)
  2. Project keywords (tags)
  3. Project documentation itself such as install instructions, technical documentation, etc.
  4. Timeframe of when the project occurred

All of these attributes exist within the internal documentation markdown files – we just need a way to search them.

Implementation Plan

The plan is to build a NodeJS CLI as a proof of concept that will:

  1. Scrape the top public GitHub repositories and grab each README markdown file (these stand in for our internal projects)
  2. Store the documentation files in Algolia alongside each project’s basic information (title, programming language(s), tags, etc.)

The CLI will contain advanced logging and command line arguments for ease of use. We also want to host it on the web so people can try it out.

Challenge

The biggest challenge at hand is the record size: Algolia only allows our records to be 100KB at most, and most markdown documentation files are much larger. The solution is to split each markdown file into multiple records within the index. We then also need to make sure that a search returns each project only once, even though it is split across multiple records.

Fortunately, Algolia has a distinct feature so we can de-duplicate these results very easily.
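As a rough sketch of what that configuration could look like: Algolia's `distinct` and `attributeForDistinct` index settings collapse records that share an attribute value into a single hit. The attribute name `repoId` and the helper `collapseByRepo` are assumptions made for illustration, not names taken from the original project. The helper only mimics the effect locally and is not how Algolia implements the feature.

```typescript
// Index settings sketch: collapse all chunks of one project into a single hit.
// `repoId` is a hypothetical attribute shared by every chunk of the same README.
const dedupeSettings = {
  attributeForDistinct: "repoId",
  distinct: true, // return only the best-matching record per repoId
};

// Minimal shape of a search hit for this illustration (hypothetical).
interface ChunkHit {
  objectID: string;
  repoId: string;
}

// Local illustration of the de-duplication effect: keep the first hit for each
// repoId, assuming the hits arrive ranked best-first.
function collapseByRepo(hits: ChunkHit[]): ChunkHit[] {
  const seen = new Set<string>();
  return hits.filter((hit) => {
    if (seen.has(hit.repoId)) return false;
    seen.add(hit.repoId);
    return true;
  });
}
```

With settings like these applied to the index, a query that matches several chunks of the same README would surface that project once.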

Indexer Implementation

To make using the Indexer as easy as possible, I opted to create a CLI for it as mentioned above. After supplying the arguments required, the tool will automatically initialize the repository, remove any existing records, and configure the relevancy settings.

The tool uses the GitHub API to fetch the requested number of top repositories, extract their metadata, and download each README file. It also filters out repositories that are missing an owner or a README file, which gives us cleaner results. Finally, it converts the markdown content to HTML for easier rendering in the frontend.
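The filtering step might look something like the sketch below. The interface lists a small subset of fields that GitHub's REST API returns for a repository; the `readme` field and the function name `keepIndexable` are assumptions for illustration, standing in for whatever the indexer attaches after downloading the file.

```typescript
// Subset of GitHub repository fields, plus a hypothetical `readme` field that
// the indexer would fill in after downloading the README.
interface RepoCandidate {
  full_name: string;
  owner: { login: string } | null;
  language: string | null;
  topics: string[];
  forks_count: number;
  readme: string | null; // assumption: null (or blank) when no README was found
}

// Keep only repositories that have both an owner and a non-empty README.
function keepIndexable(repos: RepoCandidate[]): RepoCandidate[] {
  return repos.filter(
    (repo) =>
      repo.owner !== null &&
      repo.readme !== null &&
      repo.readme.trim().length > 0
  );
}
```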

To keep the record size in check, the tool automatically splits READMEs longer than 50k characters into additional records. This way no record gets too large, but a single record is still enough for almost all repositories. It then syncs this information to Algolia 100 records at a time to follow the batching recommendations described in the Algolia documentation.
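A minimal sketch of the splitting and batching logic, assuming a simple fixed-width split. The record shape (`repoId`, `part`, `content`) and the function names are hypothetical. A production version would probably prefer to cut at paragraph or tag boundaries so that a chunk never ends in the middle of an HTML element.

```typescript
const MAX_CHARS = 50000; // threshold quoted above
const BATCH_SIZE = 100; // records sent per indexing call

// Hypothetical record shape: one record per README chunk.
interface ReadmeChunk {
  objectID: string;
  repoId: string;
  part: number; // zero-based position of the chunk within the README
  content: string;
}

// Cut the README into consecutive slices of at most `maxChars` characters.
function splitReadme(repoId: string, html: string, maxChars: number = MAX_CHARS): ReadmeChunk[] {
  const chunks: ReadmeChunk[] = [];
  for (let start = 0, part = 0; start < html.length; start += maxChars, part++) {
    chunks.push({
      objectID: `${repoId}-${part}`,
      repoId,
      part,
      content: html.slice(start, start + maxChars),
    });
  }
  return chunks;
}

// Group records into batches of at most `size` items for upload.
function toBatches<T>(items: T[], size: number = BATCH_SIZE): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}
```

Each batch returned by `toBatches` could then be handed to the Algolia client's batch-save call in turn. That call is left out here so the sketch stays independent of any particular client version.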

Frontend Implementation

To serve as my starting point, I took advantage of the create-instantsearch-app library released by Algolia and launched an InstantSearch.js boilerplate. From here, I was able to add additional widgets provided by InstantSearch.js such as the pagination and the page size selector which worked great.

Because we collected repository metadata along with the markdown, we also needed to customize the hits component to show this additional information. The metadata is often as important as the library description, so developers can see at a glance whether it’s a popular library, who released it, the tags, and much more. I also added facets so users could filter by programming language, tags, or how many times a repository has been forked.
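For faceting to work, the attributes have to be declared in the index settings through `attributesForFaceting`. The sketch below shows one possible declaration together with a small helper that turns a user's selection into Algolia's `facetFilters` format, where values in the same inner array are combined with OR and the inner arrays are combined with AND. The attribute names `language`, `tags`, and `forks` and the helper `buildFacetFilters` are assumptions for illustration.

```typescript
// Settings sketch: attributes that may be used as facets.
// The names are hypothetical and must match the fields on the indexed records.
const facetSettings = {
  attributesForFaceting: ["language", "tags", "forks"],
};

// Convert a selection such as { language: ["Go", "Rust"] } into facetFilters.
// Attributes with no selected values are skipped.
function buildFacetFilters(selected: { [attribute: string]: string[] }): string[][] {
  return Object.keys(selected)
    .filter((attribute) => (selected[attribute] ?? []).length > 0)
    .map((attribute) =>
      (selected[attribute] ?? []).map((value) => `${attribute}:${value}`)
    );
}
```

In an InstantSearch.js app the widgets build these filters for you, so a helper like this is mainly useful for understanding what is sent with the query.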

The final piece to the puzzle was adding the ‘Open documentation’ button allowing you to quickly and easily read the markdown content for the repository in a pop-up without leaving the application. If the record we are clicking on has multiple rows, it’ll automatically load the additional records and concatenate them in the display—awesome!
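Reassembling a README that was split across several records can be sketched as follows, assuming each record carries a numeric `part` field that gives its position. The field and function names are hypothetical.

```typescript
// Hypothetical shape of one README chunk as loaded from the index.
interface DocPart {
  repoId: string;
  part: number; // zero-based position within the original README
  content: string; // HTML fragment
}

// Sort the chunks by position and join them back into one document.
// The input array is copied first so the caller's ordering is left untouched.
function joinParts(parts: DocPart[]): string {
  return [...parts]
    .sort((a, b) => a.part - b.part)
    .map((p) => p.content)
    .join("");
}
```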

Conclusion

This project was a fun test and really showed me how flexible Algolia is for use cases such as ours. The ready-to-go widgets saved me a lot of time during prototyping, and getting relevant results from the first few keystrokes is super impressive. It would also be interesting to harness the power of Algolia Recommend, provided we can generate enough events from users clicking on projects internally.

You can view a live demo of the GitHub test project here, with a button that will set you up with the default demo credentials to view our index. Interested in the backend indexing code? You can find that here on GitHub!


We hope that you enjoyed this in-depth article from Soma! If you are looking for more content like this, we have many more topics that we’ve covered on the Algolia Engineering Blog! If you’re new to Algolia, you can try it out by signing up for a free tier account.

About the authors
Michael King

Developer Advocate

Soma Osvay

Full Stack Engineer, Starschema
