Aug. 13, 2019

Can I index PDF/Word/… documents?

Yes, but not directly.

You’ll have to first extract the textual content of your documents and index that information.

If you have big documents, we also recommend splitting the content into smaller chunks. For example, you could split your text into paragraphs and index each paragraph as its own record.
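As a rough illustration of the chunking step, the sketch below splits already-extracted text on blank lines and builds one record per paragraph. The function names, the `document_id`, `title`, `position` and `content` attribute names, and the sample text are all assumptions made for this example; only `objectID` is the identifier attribute Algolia records use. Extracting the text from the PDF or Word file is assumed to have happened beforehand.

```python
import re


def split_into_paragraphs(text):
    """Split text on blank lines and drop empty chunks."""
    chunks = re.split(r"\n\s*\n", text)
    return [chunk.strip() for chunk in chunks if chunk.strip()]


def build_records(document_id, title, text):
    """Turn one document into one record per paragraph.

    Every record carries the same document_id (a hypothetical attribute
    name) so that the chunks can later be grouped back into one document.
    """
    return [
        {
            "objectID": f"{document_id}-{position}",  # unique per chunk
            "document_id": document_id,               # shared by all chunks
            "title": title,
            "position": position,
            "content": paragraph,
        }
        for position, paragraph in enumerate(split_into_paragraphs(text))
    ]


# Sample text standing in for the output of a text-extraction tool.
sample_text = (
    "First paragraph.\n\n"
    "Second paragraph,\nstill the second.\n\n\n"
    "Third paragraph."
)
records = build_records("report-2019", "Annual report", sample_text)
```

Each entry in `records` can then be sent to your index like any other record. Keeping a shared `document_id` on every chunk is what makes it possible to group the chunks again at query time.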

Then, you can use our distinct feature to return each document only once, even when several of its chunks match the query.
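To make the deduplication idea concrete, here is a minimal sketch. The `attributeForDistinct` and `distinct` names are index settings; `document_id` is a hypothetical attribute assumed to be shared by all chunks of one document. The `keep_first_per_document` helper is not part of any Algolia client: it only mimics locally, on made-up hits, the effect of keeping the best-ranked chunk per document.

```python
# Index settings to apply through your API client or the dashboard.
# "document_id" is a hypothetical attribute name chosen for this example.
index_settings = {
    "attributeForDistinct": "document_id",
    "distinct": True,
}


def keep_first_per_document(hits, key="document_id"):
    """Local illustration only: keep the first hit seen for each document.

    Hits are assumed to be sorted by relevance already, so the first
    chunk of each document is also its best-ranked one.
    """
    seen = set()
    unique = []
    for hit in hits:
        if hit[key] not in seen:
            seen.add(hit[key])
            unique.append(hit)
    return unique


# Made-up, already-ranked hits: two chunks come from the same document.
ranked_hits = [
    {"objectID": "report-2019-4", "document_id": "report-2019"},
    {"objectID": "manual-7-0", "document_id": "manual-7"},
    {"objectID": "report-2019-1", "document_id": "report-2019"},
]
deduplicated = keep_first_per_document(ranked_hits)
```

With the settings in place, the search engine does this grouping for you; the helper is only there to show what the result looks like.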

Our community member Omar Bahareth wrote an excellent step-by-step tutorial: Indexing PDF or Other Files using Tika and Nokogiri.
