Improving Search for Twitter Handles

Hello Twitter,

I have been using your service for a while, and I love it!

At first, I was skeptical about what you could offer: broadcasting to all my friends that I was eating a pizza or taking a walk is not really my cup of tea. But 3 years ago I figured out what Twitter was really meant for, and how it could help me in a totally different way from what I first thought:

  • sharing interesting articles,
  • checking whether /replace with the service provider you want/ is down,
  • or catching up on HackerNews.

More recently, I discovered you had a feature that could help me even more: I can now ask for support by tweeting. Tweeting is often faster and more productive than sending an email. You taught me to include the recipient’s Handle in my tweets, and your current Handle auto-completion implementation works pretty well. But what if you could provide better typo-tolerance and ranking? (I’m NOT speaking about your official OSX/iOS native clients and their totally unusable auto-completion feature… btw, could you explain to me why it is different from the one on your website?)

I have been leading a search-engine development team for the last 5 years and I’m now VP of Engineering at Algolia. I am aware that, given my job, I have kind of an “expert” point of view about search. But search has become so essential that I am convinced it must be irreproachable. Did you know that 1.7M+ people are currently following @twittersearch? I’m not the only one expecting great things from your search engine, Twitter 🙂

We can surely improve it. Let’s code!

First of all Twitter, I need your Handles database 🙂

  • I used your Streaming API to crawl 20M+ accounts in ~2 weeks: it’s not blazing fast, but I must admit it does the job (and it’s free). That’s about 5 lines of Ruby with TweetStream, good job guys!
  • I used Daemonize to create a bin/crawler executable.

[code lang="ruby"]#! /usr/bin/env ruby
require File.expand_path(File.join(File.dirname(__FILE__), '..', 'config', 'environment'))

# Lines marked (*) were truncated in the source and are reconstructed from context.
daemon = TweetStream::Daemon.new('crawler', :log_output => true) # (*)
daemon.on_inited do
  ActiveRecord::Base.logger = Logger.new(File.open(File.join(Rails.root, 'log/stream.log'), 'w+')) # (*)
end
daemon.on_error do |message|
  puts "Error: #{message}"
end
daemon.sample do |status|
  Handle.create_from_status(status) # (*)
end
[/code]

For each new tweet you send me, I store the author (name + screen_name + description + followers_count) and all of their user mentions.

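[code lang="ruby"]class Handle < ActiveRecord::Base

  # Lines marked (*) were truncated in the source and are reconstructed from context.

  # Create or update the Handle of a tweet's author.
  def self.create_from_user(user)
    h = Handle.find_or_initialize_by(screen_name: user.screen_name)
    puts h.screen_name if h.new_record?
    h.name = user.name # (*)
    h.description = (user.description || "")[0..255]
    h.followers_count = user.followers_count
    h.updated_at ||= Time.now # (*)
    h.save # (*)
  end

  # Store the author of a tweet and count every account it mentions.
  def self.create_from_status(status)
    Handle.create_from_user(status.user) # (*)
    status.user_mentions.each do |mention|
      m = Handle.find_or_initialize_by(screen_name: mention.screen_name)
      m.name = mention.name # (*)
      m.updated_at ||= Time.now # (*)
      m.mentions_count ||= 0
      m.mentions_count += 1
      m.save # (*)
    end
  end
end
[/code]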


And every minute, I re-index the last-updated accounts with a batch request using algoliasearch-rails, scheduled by Whenever:

[code lang="ruby"]every 1.minute, roles: [:cron] do
  runner "Handle.where('updated_at >= ?', 1.minute.ago).reindex!"
end
[/code]

The result order is based on several criteria:

  • the number of typos,
  • the matching attributes: the name/handle is more important than the description,
  • the proximity between matched words,
  • and the followers count (I also use the “mentions count” if my crawler didn’t get the followers count yet).
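
As a rough illustration of how such ordered criteria compose, here is a minimal sketch that compares candidates criterion by criterion, moving to the next one only on a tie. It is not Algolia's actual implementation: the hash keys (`:typos`, `:attribute_rank`, `:proximity`, `:score`) and the sample values are made up for the example.

```ruby
# Sort hits criterion by criterion: fewer typos first, then the best matching
# attribute (0 = name/handle, 1 = description), then the tightest proximity
# between matched words, and finally the highest custom score.
# All keys are hypothetical and used for illustration only.
def rank(hits)
  hits.sort_by do |h|
    [h[:typos], h[:attribute_rank], h[:proximity], -h[:score]]
  end
end

# Made-up candidates for the query "twitter".
hits = [
  { screen_name: 'pizza_fan',  typos: 0, attribute_rank: 1, proximity: 1, score: 900 },
  { screen_name: 'twiter',     typos: 1, attribute_rank: 0, proximity: 1, score: 5_000_000 },
  { screen_name: 'twitter',    typos: 0, attribute_rank: 0, proximity: 1, score: 30_000_000 },
  { screen_name: 'twitterdev', typos: 0, attribute_rank: 0, proximity: 1, score: 400_000 }
]

puts rank(hits).map { |h| h[:screen_name] }.inspect
# → ["twitter", "twitterdev", "pizza_fan", "twiter"]
```

Note that the account with one typo ends up last even though its score is higher than two of the others: the score only breaks ties between hits that are equal on every earlier criterion.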

I could have improved the results by using the user’s list of followers/following, but I was limited by your Rate Limits. Instead, I chose to emphasize your top-users (accounts having 10M+ followers).

Here is the configuration I used:

[code lang="ruby"]class Handle < ActiveRecord::Base

  include AlgoliaSearch
  algoliasearch per_environment: true, auto_index: false, auto_remove: false do
    # add an extra score attribute
    add_attribute :score

    # add an extra full_name attribute: screen_name + name
    add_attribute :full_name

    # do not take `full_name`'s words order into account, `full_name` is more important than `description`
    attributesToIndex ['unordered(full_name)', :description]

    # list of attributes to highlight
    attributesToHighlight [:screen_name, :name, :description]

    # use followers_count OR mentions_count to sort results (last sort criteria)
    customRanking ['desc(score)']

    # @I_love_you
    separatorsToIndex '_'

    # tag top-users
    tags do
      followers_count > 10000000 ? ['top'] : []
    end
  end

  def full_name
    # consider screen_name and name equal
    # the name should not match exact so we concatenate it with the screen_name
    [screen_name, "#{screen_name} #{name}"]
  end

  # the custom score
  def score
    return followers_count if followers_count > 0
    if mentions_count < 10
      mentions_count # reconstructed: this branch's value was missing in the source
    elsif mentions_count < 100
      mentions_count * 10
    elsif mentions_count < 1000
      mentions_count * 100
    else
      mentions_count * 1000
    end
  end
end
[/code]


Each user query is composed of 2 backend queries:

  • the first one retrieves all matching top-users (it could be replaced by a query targeting your followers/following only),
  • the second one retrieves all the other matching accounts.
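
The combination of the two queries can be sketched as follows. This is only an illustration under stated assumptions: `search_top` and `search_others` are hypothetical callables standing in for the two backend queries (in a real setup they would presumably be two index searches, one restricted to the `top` tag and one excluding it), and the accounts are stub data.

```ruby
# Merge the two backend queries: top-users first, then everyone else.
# `search_top` and `search_others` are hypothetical callables; each takes the
# query string and returns an array of hashes with a :screen_name key.
def combined_search(query, search_top:, search_others:, limit: 5)
  top = search_top.call(query)
  others = search_others.call(query)
  # Keep top-users ahead of the rest and drop accounts returned by both queries.
  (top + others).uniq { |hit| hit[:screen_name] }.first(limit)
end

# Stub data replacing a real index (made-up accounts, for illustration only).
accounts = [
  { screen_name: 'kat_star', followers_count: 50_000_000 },
  { screen_name: 'katy_dev', followers_count: 1_200 },
  { screen_name: 'kat',      followers_count: 80_000 }
]

prefix_match = ->(q) { accounts.select { |a| a[:screen_name].start_with?(q) } }
search_top = ->(q) { prefix_match.call(q).select { |a| a[:followers_count] > 10_000_000 } }
search_others = lambda do |q|
  prefix_match.call(q)
              .reject { |a| a[:followers_count] > 10_000_000 }
              .sort_by { |a| -a[:followers_count] }
end

results = combined_search('kat', search_top: search_top, search_others: search_others)
puts results.map { |r| r[:screen_name] }.inspect
# → ["kat_star", "kat", "katy_dev"]
```

Passing the two searches in as callables keeps the merge logic independent of the search backend, so the first one can later be swapped for a query limited to the user's followers/following.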

Try it for yourself, and enjoy relevant and highlighted results after the first keystroke: Twitter Handles Search.

About the author
Sylvain Utard

VP of Engineering

