tl;dr: Vector search is a search method that represents semantic concepts as numbers (vectors) produced by machine learning models, and finds results by comparing those vectors.

The recent explosion in practical AI usage in modern business has introduced us to a lot of jargon.

But like, what is vector search? How does it work? Good questions! We’ll get into that in this article. We’ll also look into why vector search is not just a good idea, but absolutely critical to implement for your ecommerce site. And as it turns out, it’s not that tough to jump on board the vector search train because almost all of the hard work has already been done for you! Keep reading to find out how to take advantage.

## The problem we’re solving

Language is often ambiguous and fuzzy. Two words can mean the same thing (synonyms) or the same word can have multiple meanings (polysemes). In English for example, “fantastic” and “awesome” can sometimes be synonymous, but “awesome” can also mean many different things: inspiring, daunting, divine, or even plentiful.

## Enter vectors.

Vectors are basically numbers with a direction attached, so they’re often displayed as arrows on a graph, like this:

This particular vector lives in a world of only 2 dimensions (left-right and up-down). Our world is 3-dimensional. The more dimensions you add, the more information a vector in that space contains. So if we were to set up a space with many, many dimensions, we could train an AI to associate human concepts with certain directions in that space. When we feed that AI words (or other short pieces of textual data called tokens), it can essentially translate those words into vectors, something we can do math with!

## Comparing vectors

Now that linguistic meaning has been encoded into these vectors, we can actually do some really straightforward math with them. For example, if the vector embeddings for two different words are super similar, then those words are probably synonyms. This approach to synonym generation is markedly different from previous approaches — now instead of creating synonyms manually based on vibes, they’re mathematically derived.
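To make “super similar” concrete, here is a minimal sketch using cosine similarity, one common way to compare vectors. The 3-dimensional embeddings and the `cosine_similarity` helper are made up for illustration; real embeddings come from a trained model and usually have hundreds of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between a and b (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings, invented for this example only.
toy_embeddings = {
    "fantastic": [0.90, 0.80, 0.10],
    "awesome":   [0.85, 0.75, 0.20],
    "keyboard":  [0.10, 0.20, 0.95],
}

sim_synonyms = cosine_similarity(toy_embeddings["fantastic"], toy_embeddings["awesome"])
sim_unrelated = cosine_similarity(toy_embeddings["fantastic"], toy_embeddings["keyboard"])

# The likely synonyms score much closer to 1.0 than the unrelated pair.
print(f"fantastic vs awesome:  {sim_synonyms:.3f}")
print(f"fantastic vs keyboard: {sim_unrelated:.3f}")
```

A search engine could treat any pair of words scoring above some threshold as candidate synonyms; where to set that threshold is a tuning decision, not something the math decides for you.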

We can take that a step further too by doing basic addition and subtraction with the vectors. For example, from a purely linguistic point of view, what do you think should be the answer to `king - man + woman`? Our minds understand this question intuitively: if you were to take away the concept of “man” from “king” you get a genderless “monarch”. When we add back in the concept of “woman”, we should get “queen”. Now, with the vector embeddings of these words, we can get that result mathematically.

An AI trained on text where these gendered words show up enough to demonstrate a correlation would be able to subtract the vector representing “man” from the vector representing “king”, and get a vector super close to the one representing “monarch”. Then adding in “woman”, it’ll get extremely close to the vector for “queen”. This kind of math can be extended further to do increasingly complex operations on data that was previously thought unquantifiable.
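Here is a toy version of that arithmetic. The 2-dimensional embeddings (one axis for “royalty”, one for “gender”) are hand-picked assumptions so the math is easy to follow, and the helper names are hypothetical. With real learned embeddings the result only lands near “queen” rather than exactly on it.

```python
import math

# Hypothetical embeddings in the form [royalty, gender], invented for illustration.
toy_embeddings = {
    "king":    [0.95,  0.90],
    "queen":   [0.95, -0.90],
    "monarch": [0.95,  0.00],
    "man":     [0.05,  0.90],
    "woman":   [0.05, -0.90],
}

def add(a, b):
    return [x + y for x, y in zip(a, b)]

def subtract(a, b):
    return [x - y for x, y in zip(a, b)]

def nearest_word(vector, embeddings):
    """Return the word whose embedding is closest (Euclidean distance) to vector."""
    return min(embeddings, key=lambda word: math.dist(vector, embeddings[word]))

# king - man removes the "gender" component and keeps the "royalty" one.
genderless = subtract(toy_embeddings["king"], toy_embeddings["man"])
print(nearest_word(genderless, toy_embeddings))  # prints "monarch"

# Adding woman puts the opposite gender component back in.
result = add(genderless, toy_embeddings["woman"])
print(nearest_word(result, toy_embeddings))  # prints "queen"
```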

## A brief history of vector embeddings

Some of the earliest models and attempts to represent words as vectors go back to the 1950s with roots in computational linguistics. In the 1960s, research on semantic differentials attempted to measure the semantics, or meaning, of words. Natural language processing (NLP), a way to analyze text to infer meaning and structure, began with complex sets of handwritten rules, but turned to new machine learning models in the 1980s. NLP is still used today in search engines to help structure queries.

### Latent semantic analysis

It was in the late 1980s that a new statistical model, latent semantic analysis (LSA), also called latent semantic indexing (LSI), was developed for creating vectors and performing information retrieval. LSA is very good at understanding document relatedness by analyzing what terms are frequently used together to build a model of semantic relatedness (e.g., “royalty” and “queen”).

It is a good approach for handling certain kinds of problems, such as synonyms and polysemes, and for measuring distance (or similarity) between objects. However, it has difficulty scaling. LSA can be computationally expensive, especially as the number of vectors increases or as the underlying data changes (for example, every time you update your catalog).

Because of the complexity involved, this was traditionally a job only undertaken by tech giants like Google and Amazon. These companies have hired thousands of engineers and data scientists, and some have even developed their own computer chips to run machine learning more easily.

### Word2Vec

In 2013, Word2Vec was introduced as a new model to understand word similarity using neural networks. Like LSA, Word2Vec can be trained to create word embeddings, which can then be used to find text that is semantically similar.

As the name suggests, neural networks are machine learning networks that resemble the neurons in a brain. Neural networks with many layers are the basis of a type of machine learning known as deep learning. Every “neuron” in a neural network is essentially just a mathematical function. For each neuron, a weighted total of its inputs is calculated; the larger an input’s weight, the more it influences the neuron’s output.
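A single “neuron” can be sketched in a few lines. The weighted total is the part described above; the bias term and the sigmoid squashing function are common choices that we are assuming here, not details specific to Word2Vec.

```python
import math

def neuron(inputs, weights, bias=0.0):
    """Weighted total of the inputs, squashed into the range (0, 1) by a sigmoid."""
    weighted_total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-weighted_total))

# Hypothetical weights: the first input matters far more than the second.
weights = [2.0, 0.1]

baseline = neuron([1.0, 1.0], weights)
without_first = neuron([0.0, 1.0], weights)   # switch off the heavily weighted input
without_second = neuron([1.0, 0.0], weights)  # switch off the lightly weighted input

print(f"baseline:       {baseline:.3f}")
print(f"without first:  {without_first:.3f}")
print(f"without second: {without_second:.3f}")
```

Switching off the heavily weighted input moves the output much more than switching off the lightly weighted one, which is all that “influence” means here. Training a network amounts to nudging those weights until the outputs are useful.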

This kind of technology is not only rather simple to create by ourselves thanks to more educational tools being available, but it’s now being perfected by companies like OpenAI, who are then making it available to the public through API subscriptions.

Even better, since these models work with standardized vectors instead of loose, unstructured data, we can keep giving a model access to more recent data if we just vectorize it first, without retraining the model itself. One approach that relies on this, where relevant vectorized data is retrieved and handed to the model at query time, is called retrieval-augmented generation, and you can learn more about it here.

You can find deep learning used in voice assistants, facial recognition, self-driving cars, and many other applications. Deep learning can be trained on enormous datasets and is able to recognize a large number of complex patterns.

## Examples of vector search

Nowadays, there is a wide diversity of vector embedding models to process different data such as images, videos, and audio. There are also many freely available vector databases with vector embeddings and distance metrics that represent nearness or similarity between vectors.

There are also various algorithms and libraries which can be used to search a vector database to find similarity. These include:

• ANN (approximate nearest neighbor): a family of algorithms that trade a little accuracy for speed when locating nearby vectors.
• kNN (k-nearest neighbors): an algorithm that finds the k closest vectors, often used to make predictions about grouping.
• SPTAG (Space Partition Tree and Graph): a library for large-scale approximate nearest neighbor search.
• Faiss: Facebook’s library for similarity search.
• HNSW (Hierarchical Navigable Small World): a multilayered graph approach for determining similarity.

There are tradeoffs between these different techniques and often you’ll see multiple techniques being used to deliver results faster and with greater accuracy. These various techniques will deliver better results even for hard-to-process queries.
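To give a feel for what these techniques are speeding up, here is the simplest possible baseline: an exact, brute-force k-nearest-neighbors search that measures the distance from the query to every vector in the index. The `knn_search` function, the tiny catalog, and its 2-dimensional vectors are all hypothetical. Approximate approaches like HNSW exist precisely because this full scan gets slow as the index grows.

```python
import math

def knn_search(query_vector, index, k=2):
    """Exact kNN: rank every item by Euclidean distance to the query, keep the top k."""
    ranked = sorted(index.items(), key=lambda item: math.dist(query_vector, item[1]))
    return [name for name, _ in ranked[:k]]

# Hypothetical product embeddings, invented for illustration.
catalog = {
    "usb-c cable":    [0.90, 0.10],
    "usb-c charger":  [0.80, 0.30],
    "coffee mug":     [0.10, 0.90],
    "espresso beans": [0.20, 0.80],
}

# Pretend this is the embedding of a query like "usb c".
results = knn_search([0.85, 0.15], catalog, k=2)
print(results)  # prints ['usb-c cable', 'usb-c charger']
```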

For example, when searching an electronics catalog, people sometimes type “usbc”, “usb-c”, or “usb c”. Do these mean the same thing, or are they three different items? Keyword engines can struggle with this kind of formatting, and typically you might need to create if/then rules to teach the search engine how to manage this query. However, with vector search, this isn’t a problem. Vector search engines will just know to deliver similar results because the vector embeddings of these queries are almost identical.

Here’s a more interesting example:

In our test database with more than 20,000 products, which includes only product titles and brand names, we performed a search for “coffee gift card” (above). The term “coffee” is not in the Starbucks gift card’s description; however, the vector engine can make the connection between “coffee” and “starbucks” easily because their vectors are in the same general region.

## Vector search challenges

Vector embeddings help us to find similarity between documents. When it comes to relevance, vector search is superior to keyword search for many types of queries. If they’re so great, why don’t we use vector search for everything? In fact, for many query types, keyword search still provides better relevance. We’ve written more in-depth on this in the past, but here’s a summary:

### Accuracy vs keyword search

Vector search is terrific for fuzzy or broad searches, but keyword search still rules the roost for precise queries. As the name suggests, keyword search tries to match keywords exactly. Other popular features like autocomplete, instant search, and filters are also much easier to implement with keyword search.

For example, when you query for “Adidas” on a keyword engine, by default you will only see the Adidas brand. The default behavior in a vector engine is to return similar results (Nike, Puma, Adidas, etc.), since they are all in the same conceptual space. Keyword search still provides better results for short queries with specific intention.

### Speed and scale

Bottlenecks are more likely with vector search because queries must do complex vector calculations to predict relationships, as opposed to just reading column-based indexes.

To cope, vector search engines either need more compute power or must instead process the same queries faster. Vector search companies have been pushing the benefits of vector AI for years, but the cost and performance issues have impeded its progress.

Some companies that offer vector search module add-ons will attempt to skirt the problem by only running the vector search if the keyword search result is poor. The message is that you can have one or the other — keywords or vectors, speed or quality — but not both running at the same time.

Some have suggested that caching is a good way around this problem. The argument goes that by caching results you can virtually eliminate costs and provide results instantly. In practice, search queries vary considerably and the cost benefit of caching is often questionable. The cache hit rate of search can be extremely low, especially for sites with massive long-tail content (using our own customer data, we have seen that, on average, 50% of traffic consists of long-tail queries that are not frequent enough to be cached).

One fix to all of these problems — accuracy, speed, scalability, and cost — is called neural hashing. We’ll explain briefly how it works.

## Binary vectors

Vectors work, but as mentioned above, have speed and scale limitations that affect performance and cost. We took a different approach, called neural hashing, that leverages vectors without tradeoffs.

Neural hashing makes vector-based search as fast as keyword search, and it does so without the need for GPUs or specialized hardware. Neural hashing uses neural networks to hash vectors, compressing them into binary hashes (or binary vectors). You may have heard of hashes; cryptographic hashing is a commonly used technique in security that produces a small, fixed-size output, for example for protected password comparisons.
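To show why binary hashes are so cheap to compare, here is a rough sketch. It is not neural hashing: instead of a trained neural network, it uses random-hyperplane hashing, a classic locality-sensitive hashing technique, as a stand-in. All function names, the 64-bit hash size, and the 4-dimensional vectors are assumptions for illustration. The key idea carries over, though: once vectors are bit strings, comparing two of them is just an XOR and a bit count.

```python
import random

def make_hyperplanes(num_bits, dim, seed=42):
    """Draw one random hyperplane (stored as its normal vector) per output bit."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]

def binary_hash(vector, hyperplanes):
    """Pack one bit per hyperplane: 1 if the vector lies on its positive side."""
    bits = 0
    for plane in hyperplanes:
        dot = sum(v * p for v, p in zip(vector, plane))
        bits = (bits << 1) | (1 if dot >= 0 else 0)
    return bits

def hamming_distance(hash_a, hash_b):
    """Count the differing bits: XOR the two hashes, then count the 1s."""
    return bin(hash_a ^ hash_b).count("1")

NUM_BITS = 64
planes = make_hyperplanes(NUM_BITS, dim=4)

# Hypothetical embeddings, invented for illustration.
query = [0.90, 0.80, 0.10, 0.20]
similar = [0.85, 0.75, 0.20, 0.25]
opposite = [-0.90, -0.80, -0.10, -0.20]

query_hash = binary_hash(query, planes)
print("query vs similar: ", hamming_distance(query_hash, binary_hash(similar, planes)))
print("query vs opposite:", hamming_distance(query_hash, binary_hash(opposite, planes)))
```

Vectors pointing in similar directions fall on the same side of most hyperplanes, so their hashes differ in only a few bits, while a vector pointing the opposite way flips every bit.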

Performance-wise, these hashed vectors can be run on commodity hardware, retain 96% (or more!) of the vector information, and can be calculated potentially hundreds of times faster than vectors alone.

Now, if only there were some way to get keyword search and neural hashing into the same query…

## Hybrid search

Hybrid search is a new method to combine a full-text keyword search engine and a vector search engine into a single API to get the best of both worlds.

There is tremendous complexity in running both keyword and vector engines at the same time for the same query. Some companies have opted to go around the complexity by running these processes sequentially: they run a keyword search and then, if a certain relevance threshold isn’t met, run a vector search. This approach comes with poor tradeoffs in speed, accuracy, filtering, and heap sorting. These so-called dual systems suffer because the vector databases often don’t have the same (or any) filtering capabilities, so they return massive amounts of unnecessary information.

True hybrid search is different. By combining full-text keyword search and vector search into a single query, customers can get more accurate results fast. For Algolia, we’ve combined neural hashes with our world-class and blazingly fast keyword search technology into a single API call. It scales to meet the needs of any size dataset — even for indexes that have a lot of changes with frequent updates and deletions — without any additional overhead.
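To make the idea of blending both signals in one query concrete, here is a toy sketch. It is not how Algolia (or any particular engine) implements hybrid search; the weighted-sum scoring, the `keyword_weight` parameter, the function names, and the 2-dimensional vectors are all assumptions for illustration.

```python
import math

def keyword_score(query, text):
    """Fraction of query terms that appear verbatim in the text."""
    terms = query.lower().split()
    if not terms:
        return 0.0
    words = set(text.lower().split())
    return sum(term in words for term in terms) / len(terms)

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def hybrid_search(query, query_vector, products, keyword_weight=0.5):
    """Rank products by a weighted blend of keyword and vector scores."""
    ranked = []
    for title, vector in products.items():
        score = (keyword_weight * keyword_score(query, title)
                 + (1 - keyword_weight) * cosine_similarity(query_vector, vector))
        ranked.append((score, title))
    ranked.sort(reverse=True)
    return [title for _, title in ranked]

# Hypothetical product embeddings, invented for illustration.
products = {
    "Adidas running shoes": [0.90, 0.10],
    "Nike running shoes":   [0.88, 0.15],
    "Starbucks gift card":  [0.10, 0.90],
}
query_vector = [0.88, 0.16]  # pretend embedding of the query "adidas"

vector_only = hybrid_search("adidas", query_vector, products, keyword_weight=0.0)
blended = hybrid_search("adidas", query_vector, products, keyword_weight=0.5)
print(vector_only[0])  # prints "Nike running shoes"
print(blended[0])      # prints "Adidas running shoes"
```

With vectors alone, the conceptually similar Nike product edges out the exact brand match; once the keyword signal is blended in, the exact match rises to the top while similar products still follow it.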

## The ecommerce implications

We’ll keep this section short. Here’s the gist: connect potential customers with stuff they want to buy, and they’ll buy more. Seems logical, right?

Vector search has been shown to connect customers with stuff they want to buy more easily, and that makes sense, since it actually understands the query instead of just blindly matching it to similar strings of text. If 61% of shoppers are likely to engage with relevant additional content in the search results, we should see the average conversion rate of people who searched for specific products go up from 79% to 92% when we also surface similar content using vector search.

Convinced this is the way to go yet? Us too. If you’re ready to get started, you don’t have to make it all yourself. Just sign up for a free account and start toying around on our huge free plan immediately.

Dustin Coates

Product and GTM Manager

Freelance Writer at Authors Collective
