Search on the Yarn website started with the documentation. We wanted people to easily find information on how to use Yarn. As with 300 other programming community websites, we went for Algolia’s DocSearch and this was merged in yarnpkg/website#105. Then another Yarn contributor (@thejameskyle) asked in yarnpkg/website#194 if there should be package searching abilities, much like npm had.
This is where Algolia came into play. We are a search engine, Yarn wanted search, and we are heavy users of Yarn, so we figured: let’s do this!
This is how it started on December 5th in our #2016-community-gift Slack channel:
In early 2017, we met with the Yarn team in London for a one-day, in-person brainstorming session. The goal was to think about how the search experience should evolve and to define a package details page. Algolia proposed designs for what search could be, and from those we drafted a master plan.
^ This is not sped up. It is THAT fast (try it!). Yes, it still wows even us every time.
Instead of showing a dropdown of results, we chose to replace the page completely with the search results. This requires more data to be available immediately, but gives more context on the decisions you make while searching for a fitting package. Having the search page be the main entry point will make sure that you don’t need to know exactly what you want before “committing” to a search.
After using npm search many times, we knew what was missing and what was superfluous from the search results and package detail pages. We brainstormed a bit and iteratively added a lot of useful metadata.
Here’s a comparison between the two search results pages (npm on the left, Yarn on the right):
npm search results on the left, Yarn search results on the right
In the Yarn search results, we decided to directly display, for example, the number of downloads for every package, the license, direct links to GitHub, and the owning organization.
This metadata saves the user from opening many package detail pages before getting the information they want.
For the package detail page, we took a similar approach. We started with the same base metadata as npm shows, but also took the opportunity to add a lot more. We decided to show changelogs (when available), GitHub stargazers, commit activity, deprecation messages, dependencies and file browsing.
Here’s what it looks like:
npm detail page on the left, Yarn detail page on the right
— John-David Dalton (@jdalton) March 30, 2017
This is an iterative process, and suggestions and feedback are always welcome.
The npm registry is exposed as a CouchDB database, which has a replication protocol that can be used to either set up your own npm registry, or in our case a service (the Algolia index) that has the same data as the npm registry.
Replication in CouchDB is a very simple but powerful system that assigns an “update sequence” (a number) to every change made to a database. To replicate a database and stay in sync, you only need to go from update sequence 0 (zero) to the last update sequence, while saving the last update sequence you replicated on your end. For example, right now, the last update sequence known on the npm registry is 5647452 (more than five million changes).
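The checkpointing idea can be sketched in a few lines. The Python below is a minimal in-memory illustration, not our replication code and not the CouchDB API: the `Change` type, the `replicate` function and the sample feed are hypothetical names made up for this example.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Change:
    seq: int  # update sequence assigned by the source database
    id: str   # identifier of the document that changed


def replicate(changes: Iterable[Change], since: int,
              apply: Callable[[Change], None]) -> int:
    """Apply every change newer than `since` and return the new checkpoint."""
    checkpoint = since
    for change in sorted(changes, key=lambda c: c.seq):
        if change.seq <= since:
            continue  # already replicated in an earlier run
        apply(change)
        checkpoint = change.seq  # remember how far we got
    return checkpoint


# Hypothetical feed of changes, standing in for the registry.
feed = [Change(1, "left-pad"), Change(2, "react"), Change(3, "left-pad")]

seen = []
checkpoint = replicate(feed, since=0, apply=lambda c: seen.append(c.id))

# Two more changes arrive later. Resuming from the saved checkpoint
# only processes the new ones.
feed += [Change(4, "yarn"), Change(5, "react")]
checkpoint = replicate(feed, since=checkpoint, apply=lambda c: seen.append(c.id))
```

Because the checkpoint is the only state the replica has to persist, a crashed or restarted process can pick up from the last saved sequence instead of starting again from zero.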
Early on we saw that going from 0 to 5647452 was very slow (multiple hours) and we wanted it to be faster. So, we made a replication system consisting of three phases:
For all of those phases, we use the PouchDB module, which we recommend because it has an awesome API for interacting with CouchDB databases.
All the phases go through the same steps to get the metadata required for display. Some of the metadata, such as the GitHub data (stargazers, commit activity), is also retrieved directly on the frontend.
Here are all our sources:
One Algolia feature that really helped us is Query Rules. When you search for a package, there are two things to surface: the package whose name you typed exactly, and the package you probably meant. We found that other search engines often don’t show the exact match near the top of the results, even though it exists.
Our solution is a query rule that applies when the user types the name of a package either exactly or without special characters (to give people some leeway in how they type dashes).
Example query rule to boost exact matches
This allows a query like `reactnative` to return `react-native`, which is very popular, as the first result, and `reactnative` as the second. The latter is deprecated and not supposed to be used, but it is still exactly what the user typed and may be what they are looking for.
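To make the effect concrete, here is a rough imitation in plain Python. It is not Query Rules syntax and not how the index is configured: `normalize`, `rank`, the substring matching and the download counts are all assumptions invented for this sketch.

```python
import re
from typing import Dict, List


def normalize(name: str) -> str:
    """Lowercase the name and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def rank(query: str, packages: List[Dict]) -> List[str]:
    """Put names that equal the query once special characters are removed
    first, then the other matches; ties are broken by download count."""
    q = normalize(query)
    matches = [p for p in packages if q in normalize(p["name"])]
    matches.sort(key=lambda p: (normalize(p["name"]) != q, -p["downloads"]))
    return [p["name"] for p in matches]


# Made-up download counts, for illustration only.
packages = [
    {"name": "reactnative", "downloads": 100},
    {"name": "react-native", "downloads": 500_000},
    {"name": "react-native-vector-icons", "downloads": 600_000},
]

print(rank("reactnative", packages))
# ['react-native', 'reactnative', 'react-native-vector-icons']
```

Both `react-native` and `reactnative` count as exact matches once dashes are ignored, so they outrank the more downloaded `react-native-vector-icons`, and popularity then decides the order between the two.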
For a package search, we can’t make assumptions like “Maybe the user was looking for this package instead of what they typed”. Instead, we always want to present both the exact package match, if any, and then the popular ones.
Building on that we want to add new features like:
We did not stop at the search feature. I am proud to be a frequent contributor to the Yarn website, helping to add translations, reviewing and merging PRs, and updating the Yarn CLI documentation.
This project wouldn’t have been feasible without the help of everyone from the Yarn and Algolia teams. Since our first contact with the Yarn team, communication has always been great and allowed everyone to feel confident about shipping new features to the Yarn website.
We also want to thank the npm team very much for being responsive and advising us while we were building our replication mechanism.
We hope you enjoyed this article. See you soon for this year’s community gift 🚀