Searching content of pdf files


  1. Posts : 158
    Windows 7 Professional SP1 64-bit
       #1

    Searching the contents of pdf files takes a long time when there are a number of them, and it doesn't seem to return any results.

    Can anything be done to make it work?


  2. Posts : 9,582
    Windows 8.1 Pro RTM x64
       #2

    What are you using to search? If you wish to search the contents of PDF files themselves, you will need to use a PDF reader such as Acrobat from Adobe.


  3. Posts : 158
    Windows 7 Professional SP1 64-bit
    Thread Starter
       #3

    I'm searching with Windows Explorer. I have about 25 large pdf files, and I want to search for a word or phrase in their content, so that I know which 2 or 3 pdf files to open and explore further.
    XP did it, albeit very slowly, but Windows 7 returns no results.


  4. Posts : 5
    Windows 7
       #4

    I'm having the same problem


    I am a student with a number of research-related pdfs through which I would like to search for text from an explorer window and the start menu search box.

    I have gone into Indexing Options->Advanced->File types->pdf->Index Properties and File Contents but that did not solve the problem that I share with macnab.

    I looked into iFilters from Adobe. But Adobe claims that the latest iFilters are bundled with the latest versions of Acrobat and Reader, so I should have them.

    Does anyone have a solution? I want to find the word “causative” among pdfs in a folder with a few dozen of them.


  5. Posts : 1,814
    XP / Win7 x64 Pro
       #5

    The file has to have recognizable text stored in it before it can be searched. A Word file that is then converted to a pdf may have this information if the user saved the file as a searchable PDF. However, something that has been scanned into a pdf, or anything similar, will not. In the cases in which the text has not been recognized or was not stored, you will need to use software that can perform OCR (optical character recognition) on the file and then save it again with the stored, recognized text. Adobe Acrobat Pro, for example, has this feature built in, and there are other dedicated programs for doing this as well. In fact, MS Office Tools used to have a built-in OCR tool that allowed you to OCR a scanned tiff image, but I'm not sure if it has it any longer.
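
    A rough way to tell which files in a folder have any stored text at all, before spending time on OCR, is to check whether each PDF references a font. This is only a heuristic sketch, not how Windows Search or Adobe's iFilter decide anything: it assumes a scan-only PDF contains images and no /Font resources, and the names looks_text_searchable and scan_folder are made up for illustration. A file that passes can still have pages that are pure images.

```python
import re
import zlib
from pathlib import Path

# Matches the raw bytes between "stream" and "endstream" in a PDF file.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def looks_text_searchable(pdf_bytes: bytes) -> bool:
    """Heuristic: True if the PDF appears to reference a font resource.

    Assumption: text drawn on a page needs a /Font entry, while a
    scan-only PDF holds images and no fonts.
    """
    if b"/Font" in pdf_bytes:
        return True
    # Newer PDFs may keep dictionaries inside Flate-compressed object
    # streams, so inflate every stream that is valid zlib and look again.
    for match in STREAM_RE.finditer(pdf_bytes):
        try:
            # decompressobj tolerates trailing bytes (e.g. the newline
            # before "endstream") that zlib.decompress would reject.
            data = zlib.decompressobj().decompress(match.group(1))
        except zlib.error:
            continue  # not Flate data (e.g. a JPEG image), skip it
        if b"/Font" in data:
            return True
    return False


def scan_folder(folder: str) -> dict:
    """Map each PDF file name in folder to the heuristic result."""
    return {
        path.name: looks_text_searchable(path.read_bytes())
        for path in sorted(Path(folder).glob("*.pdf"))
    }
```

    Files reported as False are the ones most likely to need OCR before any search tool can find words in them.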


  6. Posts : 5
    Windows 7
       #6

    Cool, but didn't work


    FliGi7 said:
    In the cases in which the text has not been recognized or was not stored, you will need to use software that can perform OCR (optical character recognition) on the file and then save it again with the stored, recognized text.
    Thanks for the reply. That is an interesting tip! However, I encountered 2 problems:

    1. Most of the pdfs I tried to use Adobe Pro's OCR function on had "renderable text" on each page. I'm not sure where that renderable (i.e. editable) text is, but it prevented the OCR from working.

    2. Some of the pdfs from older papers could be OCR'd, but that still didn't allow the text inside them to be indexed into Windows 7's search. (I tried it even after making sure that indexing was "complete" in the "Indexing Options" tool).

    You seem to know what you're about. Any more ideas? Is there some iFilter I need to download? Some setting I need to tweak? I don't want to have to learn to use Mendeley Desktop...


  7. Posts : 5
    Windows 7
       #7

    Adobe iFilter 9 for 64-bit systems instructions


    Here's something new that might help:

    I have tried downloading and installing the Adobe iFilter 9 for 64-bit systems. Afterwards, Windows search still does not work.

    However, the installation instructions include this:

    "After installing the PDF iFilter, it is recommended that you set your system PATH environment variable to the "bin" folder of the "Ifilter" installation. For example, from the "Control Panel\System\Advanced\Environment Variables" tab, append to PATH "C:\Program Files\Adobe\Adobe PDF iFilter 9 for 64-bit platforms\bin\" and then restart the computer. "

    I used Windows Search to look for "environment variables" (an example of its utility) and found the control panel item. However, it is not entirely clear to me that these instructions apply to Windows 7, and I don't want to screw up my computer.

    There are two sets of variables, for User and for System. The system variables are not editable. The current path points to Windows Live. Should I change that path?
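
    For what it's worth, the quoted instructions say to append to PATH, which means keeping whatever is already there and adding the iFilter folder as one more entry after a semicolon, not replacing the existing value. A minimal sketch of that string operation follows. It only builds the new value and does not touch any system setting; append_to_path is a made-up helper name, and the existing PATH value in the usage lines is a hypothetical example, not taken from a real machine.

```python
def append_to_path(current_path: str, new_dir: str, sep: str = ";") -> str:
    """Return current_path with new_dir added as the last entry.

    Existing entries are kept in order. If new_dir is already listed
    (compared case-insensitively, ignoring a trailing slash, as Windows
    does for paths), the value is returned without a duplicate entry.
    """
    entries = [entry for entry in current_path.split(sep) if entry]

    def normalize(path: str) -> str:
        return path.rstrip("\\/").lower()

    if normalize(new_dir) in {normalize(entry) for entry in entries}:
        return sep.join(entries)
    return sep.join(entries + [new_dir])


# Hypothetical existing value; yours will differ.
existing = "C:\\Windows\\system32;C:\\Program Files\\Windows Live\\Shared"
# Folder named in Adobe's installation instructions.
ifilter_bin = "C:\\Program Files\\Adobe\\Adobe PDF iFilter 9 for 64-bit platforms\\bin\\"

updated = append_to_path(existing, ifilter_bin)
print(updated)
```

    The point of the sketch is that the Windows Live entry stays where it is and the new folder goes on the end.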


  8. Posts : 1,814
    XP / Win7 x64 Pro
       #8

    1. The renderable text problem happens when there is an image of text as well as some actual text in the document. The OCR tool finds this "renderable text" and concludes there's nothing else it can do from there. The best bet in that scenario is to save it as a PDF again (print to PDF with CutePDF Writer or something of the sort), then do the OCR again on the document.

    2. Are you saving them again after you OCR them? If so, then the OCR is not properly recognizing the text inside the document that you're trying to search for. Have you tried right-clicking on the folder they're in and doing a search there instead of using the Windows indexed search?


 
