How to search through PDF media files?

I have a blog with a reasonable number of PDF files. How can I make WordPress look inside the PDF files when I search the site?

Thanks

2 comments

  1. You could use the open-source utility xpdf to convert the PDF file into a plain text file and search through it like any other text file. Incorporating this into WordPress is another thing to worry about, though.

    I used it on my Mac to extract data from my company’s quarterly financial statements. You could install it under Mac OS X if you’ve got MacPorts installed.

    port install xpdf
    

    You would then need to use the utility xpdf-pdftotext to convert the PDF file to a plain text file.

    You could simply store the result in a PHP variable:

    $file_name = '/usr/local/src/file1.pdf';
    
    // On Mac (MacPorts) the utility is called xpdf-pdftotext; on Linux I believe it is pdftotext
    
    if (file_exists($file_name)) {
        // escapeshellarg() protects against spaces and shell metacharacters in the path
        $file_contents = shell_exec('/opt/local/bin/xpdf-pdftotext ' . escapeshellarg($file_name) . ' - 2>&1');
    
        echo $file_contents;
    }
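
    Once the text is in a variable, searching it is ordinary string handling. Here is a minimal sketch, assuming a hypothetical helper named find_matching_lines (my own name, not part of xpdf or WordPress) and a hard-coded sample string standing in for real pdftotext output:

    ```php
    <?php
    // Hypothetical helper (not part of xpdf or WordPress): return the lines of
    // already-extracted PDF text that contain $term, ignoring case.
    function find_matching_lines(string $text, string $term): array
    {
        $matches = [];
        if ($term === '') {
            return $matches; // an empty search term matches nothing
        }
        // Split on any newline style; pdftotext also emits form feeds between pages.
        $lines = preg_split('/\r\n|\r|\n|\f/', $text);
        foreach ($lines as $index => $line) {
            if (stripos($line, $term) !== false) {
                $matches[$index + 1] = trim($line); // key is the 1-based line number
            }
        }
        return $matches;
    }

    // Sample text standing in for the pdftotext output captured earlier.
    $sample_text = "Quarterly report\nRevenue rose 4%\n\fNet revenue by region";
    print_r(find_matching_lines($sample_text, 'revenue'));
    ```

    The keys are line numbers within the extracted text, not PDF page numbers.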
    

    If you’d like to install xpdf from source, it’s available on their website, which offers both Linux and Windows versions.
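
    Because the binary ends up in different places depending on the platform and how it was installed, it may help to probe a few locations before calling it. This is a rough sketch: find_pdftotext is a hypothetical helper, and only the MacPorts path comes from the example above. The other two paths are guesses at common install locations.

    ```php
    <?php
    // Hypothetical helper: return the first candidate path that exists and is
    // executable, or null if none of them is.
    function find_pdftotext(array $candidates): ?string
    {
        foreach ($candidates as $path) {
            if (is_file($path) && is_executable($path)) {
                return $path;
            }
        }
        return null;
    }

    // Only the first path comes from the MacPorts example; the others are
    // assumptions and may not match your system.
    $pdftotext = find_pdftotext([
        '/opt/local/bin/xpdf-pdftotext',
        '/usr/bin/pdftotext',
        '/usr/local/bin/pdftotext',
    ]);

    echo $pdftotext === null ? "pdftotext not found\n" : "Using $pdftotext\n";
    ```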

    Check out the xpdf man pages to get an idea of the utility; they’ve got different tools for different applications. Here’s the man page for the pdftotext utility.
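
    As one example of what the man page covers, pdftotext accepts -f and -l to limit the page range and -layout to keep the original physical layout, which can help with tables such as financial statements. Here is a small sketch of building such a command; build_pdftotext_command is a hypothetical helper name, and the binary name passed in is an assumption about your install:

    ```php
    <?php
    // Hypothetical helper: build a pdftotext command line for a page range.
    // -f, -l and -layout are pdftotext options; "-" as the output file name
    // sends the extracted text to stdout.
    function build_pdftotext_command(string $binary, string $pdf, int $first, int $last, bool $layout = false): string
    {
        $parts = [escapeshellarg($binary), '-f', (string) $first, '-l', (string) $last];
        if ($layout) {
            $parts[] = '-layout'; // keep the original physical layout
        }
        $parts[] = escapeshellarg($pdf); // quote the path for the shell
        $parts[] = '-';
        return implode(' ', $parts);
    }

    // Pages 1 to 3 of the sample file, layout preserved.
    echo build_pdftotext_command('pdftotext', '/usr/local/src/file1.pdf', 1, 3, true), "\n";
    ```

    The resulting string can be passed to shell_exec() in the same way as the earlier snippet.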

    Best of luck.