The Washington Post publishes around 1,200 stories, graphics, and videos per day. You read that correctly: one thousand two hundred pieces of content every single day. That’s nearly half a million pieces of content per year. It’s a prolific output, but high volume is nothing unusual in the industry. The Wall Street Journal and The New York Times each produce around 240 articles and blog posts a day. BuzzFeed, a company known for its high output, publishes 220 listicles, blogs, articles, and videos a day.
The point is: newspapers produce a prodigious amount of material. It’s not just newspapers, either. Most media companies — whether they’re in broadcasting, news, publications, audio, or streaming — produce and manage enormous content libraries. Their archives often live on a home-built content management system (CMS) — and there are good reasons for this.
Media companies have very particular needs: specific content catalogs, unique editorial workflows, and particular publication strategies. There are few (if any) off-the-shelf solutions that are powerful and flexible enough to fit every need. So companies build, rather than buy. For the purpose of production and publication, their home-built CMSs do work. But they inadvertently lock content away in inaccessible archives.
The native search within home-built platforms is consistently poor. Relevance is sub-par, the user interface is crowded, and the advanced features are lacking — if present at all. Often, the search function will run on simple keyword matching, which is fine for small search indices but not a good fit for colossal media libraries.
Consider a journalist writing a weekly update on low-carbon technologies in their country. If they search their CMS for low-carbon, a basic keyword search will return hundreds or thousands of results. A quick experiment reveals more than 32,000 results on The Guardian. Such a heavy-handed approach surfaces far too many results to sift through manually. Unless the journalist in question hits the jackpot, they aren’t going to stumble on what they’re looking for by chance. But outmoded search technology isn’t the only challenge. Data consistency burdens internal search, too.
Few media companies produce all their content internally. Newspapers publish wire stories from external journalists. Magazines run syndicated stories from partner publications. Television networks buy shows from a myriad of production companies. Music streaming services amalgamate back catalogs from hundreds of independent labels. Without fail, each source will have a different metadata structure. And there’s no guarantee that the creator (journalist, filmmaker, editor) will have stuck to their in-house structure, either. Inconsistent metadata means there’s no guarantee that you’re searching your entire content library.
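To make the metadata problem concrete, here is a minimal Python sketch of the kind of normalization step that has to happen before records from different sources can share one search index. The field names and alias lists are hypothetical, invented for illustration; a real pipeline would be built around each source's actual schema.

```python
# Hypothetical aliases: each source names the same concept differently.
FIELD_ALIASES = {
    "title": ("title", "headline", "name"),
    "author": ("author", "byline", "creator"),
    "published": ("published", "pub_date", "date"),
}


def normalize(record: dict) -> dict:
    """Map a source record onto one shared schema; missing fields become None."""
    normalized = {}
    for field, aliases in FIELD_ALIASES.items():
        # Take the first alias present in the record, or None if none is.
        normalized[field] = next(
            (record[alias] for alias in aliases if alias in record), None
        )
    return normalized


# Two illustrative records with different metadata structures.
wire_story = {
    "headline": "Wind power hits record",
    "byline": "A. Reporter",
    "pub_date": "2021-03-01",
}
in_house = {"title": "Low-carbon steel explained", "author": "B. Writer"}

records = [normalize(wire_story), normalize(in_house)]
```

After normalization, both records answer to the same field names, and a gap (such as the missing publication date on the second record) shows up as an explicit None rather than silently excluding the item from a filtered search.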
Both challenges (search technology and metadata consistency) make internal search unreliable, and at times misleading, for many media companies. But why does that matter?
Time, productivity, and impact
Think back to that journalist reporting on low-carbon technologies. It’s a simple, straightforward article, but one with lots of opportunities for enrichment. The reporter could include a graph showing low-carbon rollout statistics by industry, or embed an interview they recently did with a bio-technologist. They could link back to earlier modeling projects or cross-link to an opinion piece on decarbonization strategies.
But the journalist can only do that if they can find old content in their archives.
Underpowered search harms organizations in three ways.
First, employees lose time. If an old piece of content is necessary to a new story, there’s no getting around endless scrolling and searching, page after page, search term after search term. Often people will resort to Google’s site: search operator (site:example.com “search term”) instead. Remember that these are highly skilled, creative professionals we’re talking about: journalists, filmmakers, editors, and others. Do organizations really want to pay them to spend time on mundane information-gathering tasks? Almost certainly not.
The second consequence is productivity loss. While employees lose time searching, the knock-on effects go wider. Slogging through endless pages of search results is frustrating and energy-sapping. It leaves people feeling disengaged and uninspired. In the brutally competitive world of media, that is the last thing you want.
Lastly, companies lose the opportunity for impact. Media companies are like icebergs: you see only a fraction of their mass. The Washington Post publishes 1,200 stories a day but its archive runs to tens of millions of items. Faced with an undercooked search function, a lot of people will walk away, abandoning the opportunity for reuse, enrichment, and addition. Think again about the journalist who gives up looking for the perfect graph. This represents a colossal waste. It means media companies are leaving their archive to languish, ineffective, unused, and unprofitable.
But search doesn’t have to be a blocker.
In fact, modern search services can unleash media libraries and archives, putting information and content at your fingertips.
Your technology stack. Better search technology.
The future of search and discovery isn’t in improving the CMS’s native search. It’s impossible to build an off-the-shelf platform that fits everyone’s needs — let alone one with a powerful search function. Instead, leading media companies like The Times and POLITICO are turning to search microservices like Algolia.
Think of microservices as building blocks that can attach to your existing technologies. They augment narrow functions in your CMS, rather than replace it wholesale. It means you can revolutionize your internal search (and likely your user-facing search, too) without disrupting your editorial workflow.
Consider POLITICO Europe. Back in 2010, they launched POLITICO Pro, which aggregates journalism, data, and tools to deliver real-time intelligence and analysis to policy professionals. Publishing 500 daily articles, the platform faced the same challenge most media companies do: How do you manage such a huge content library in a way that keeps it accessible?
To cut a long story short, they deployed a modern search microservice for their customers. It integrated with their existing tech stack, uniting their media libraries and allowing users to query multiple content indices at once. Overnight, paid users gained a way to intuitively and effectively search through “24 million votes, 700,000 amendments, 16,000 legislators, and 7 million searchable items.” But here’s the interesting thing: it wasn’t just POLITICO Pro’s paid users who started using the service.
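Querying multiple content indices at once can be pictured with a small Python sketch. This is a toy, in-memory illustration only: the index names, records, and the search_all function are hypothetical, and it is not how POLITICO's platform or any particular search service is implemented.

```python
def search_all(query: str, indices: dict[str, list[dict]]) -> list[dict]:
    """Run one query against every index and tag each hit with its origin."""
    terms = query.lower().split()
    hits = []
    for index_name, records in indices.items():
        for record in records:
            text = record["text"].lower()
            # A record matches only if every query term appears in its text.
            if all(term in text for term in terms):
                hits.append({"index": index_name, **record})
    return hits


# Hypothetical indices, one per content type.
indices = {
    "votes": [{"id": "v1", "text": "Vote on the renewable energy directive"}],
    "amendments": [{"id": "a1", "text": "Amendment 12 to the energy directive"}],
    "legislators": [{"id": "l1", "text": "Member of the transport committee"}],
}

results = search_all("energy directive", indices)
```

The point of the design is that the user types one query and gets hits from every content type, each labeled with where it came from, instead of repeating the search in several separate tools.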
POLITICO Europe’s staff journalists began tapping into the new search function to make sense of the platform’s immense archive. Suddenly they could surface long-lost insights and once-hidden data points. It helped them do their job better and faster.
This is the future of search. Similar to what we described in our article on back-office search, it’s creating a Google-esque experience — only better. Third-party search engines own, control, and hide the search algorithm. You are at the mercy of their whims and experiments. On the other hand, search services reveal the inner workings of the machine and hand control to you. If you want to prioritize recency, you can. If you want to promote content from certain sources, you can. If you want to rank pieces by popularity, you can. You control the search experience.
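What does controlling the search experience look like in practice? The Python sketch below is one hedged illustration: a scoring function whose weights for recency, popularity, and promoted sources are all in the editor's hands. The weights, field names, decay formula, and sample items are assumptions made up for this example, not the ranking formula of any real search service.

```python
from datetime import date

# Hypothetical weights; a team would tune these to its own priorities.
WEIGHTS = {"recency": 0.5, "popularity": 0.3, "source": 0.2}
PROMOTED_SOURCES = {"in-house"}


def score(item: dict, today: date) -> float:
    """Combine recency, popularity, and source into one ranking score."""
    age_days = (today - item["published"]).days
    recency = 1 / (1 + age_days / 30)  # decays as the item ages
    popularity = min(item["views"] / 10_000, 1.0)  # capped at 1.0
    source = 1.0 if item["source"] in PROMOTED_SOURCES else 0.0
    return (
        WEIGHTS["recency"] * recency
        + WEIGHTS["popularity"] * popularity
        + WEIGHTS["source"] * source
    )


def rank(items: list[dict], today: date) -> list[dict]:
    """Return items ordered from highest to lowest score."""
    return sorted(items, key=lambda item: score(item, today), reverse=True)


# Illustrative items only.
items = [
    {"id": "old-viral", "published": date(2019, 1, 1), "views": 50_000, "source": "wire"},
    {"id": "fresh-in-house", "published": date(2021, 5, 25), "views": 500, "source": "in-house"},
    {"id": "fresh-wire", "published": date(2021, 5, 30), "views": 2_000, "source": "wire"},
]

ranked_ids = [item["id"] for item in rank(items, today=date(2021, 6, 1))]
```

With these weights, the recent in-house piece ranks first, the recent wire story second, and the old but popular item last. Shift the weight toward popularity and the order changes, which is exactly the kind of control a closed third-party engine does not give you.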
Empower human-led creation
Media sits at the crossroads of human-led and machine-led creation. While algorithms and automated recommendations have revolutionized content publication, there is still something magical about having a human take control. Media companies need to double down on that magic by empowering their employees to search their media libraries and surface valuable content. We already have the technology to do so. All it requires is leadership to take the plunge.
Learn more about search in the media industry.