Searching content of pdf files

macnab

Search takes a long time when you have a number of PDF files, and even then it doesn't seem to return any results.

Can anything be done to make it work?
 

My Computer

At a glance: Windows 7 Professional SP1 64-bit, Intel i3-3220 3.3GHz, 4GB, NVIDIA GeForce GTX650 1GB
Computer type
PC/Desktop
Computer Manufacturer/Model Number
Built-up
OS
Windows 7 Professional SP1 64-bit
CPU
Intel i3-3220 3.3GHz
Motherboard
Gigabyte
Memory
4GB
Graphics Card(s)
NVIDIA GeForce GTX650 1GB
Sound Card
On-board Realtek
Monitor(s) Displays
LGL1719S
Screen Resolution
1280 x 1024
Hard Drives
Samsung HD321HJ, Samsung HD103SI, Seagate ST2000DL, Seagate ST3100524AS
PSU
450W
Case
AOpen
Cooling
Case fan, CPU fan, PSU fan.
Keyboard
Microsoft
Mouse
AOpen
Internet Speed
1Mb/s
Antivirus
MSSE
Browser
Firefox
What are you using to search? If you wish to search the contents of PDF files themselves, you will need to use a PDF reader such as Acrobat from Adobe.
 

My Computer

At a glance: Windows 8.1 Pro RTM x64, Intel Core-i5-3570K 4-core @ 3.4GHz (Ivy Brid..., 4 x 4GB DDR3-1600 Corsair Vengeance CMZ8GX3M2..., MSI GeForce GTX770 Gaming OC 2GB
Computer type
PC/Desktop
Computer Manufacturer/Model Number
Dwarf Dwf/11/2012 r09/2013
OS
Windows 8.1 Pro RTM x64
CPU
Intel Core-i5-3570K 4-core @ 3.4GHz (Ivy Bridge) (OC 4.4GHz)
Motherboard
ASRock Z77 Extreme4-M
Memory
4 x 4GB DDR3-1600 Corsair Vengeance CMZ8GX3M2A1600C9B (16GB)
Graphics Card(s)
MSI GeForce GTX770 Gaming OC 2GB
Sound Card
Realtek High Definition on board solution (ALC 898)
Monitor(s) Displays
ViewSonic VA1912w Widescreen (VGA)
Screen Resolution
1440x900
Hard Drives
OCZ Agility 3 SSD 120GB SATA III x2 (RAID 0)
Samsung HD501LJ 500GB SATA II x2
Hitachi HDS721010CLA332 1TB SATA II
Iomega 1.5TB Ext USB 2.0
WD 2.0TB Ext USB 3.0
PSU
XFX Pro Series 850W Semi-Modular
Case
Gigabyte IF233
Cooling
1 x 120mm Front Inlet 1 x 120mm Rear Exhaust
Keyboard
Microsoft Comfort Curve Keyboard 3000 (USB)
Mouse
Microsoft Comfort Mouse 3000 for Business (USB)
Internet Speed
NetGear DG834Gv3 ADSL Modem/Router (Ethernet) ~4.0 Mb/s (O2)
Antivirus
Avast! 8.0.1497
Browser
IE 11
Other Info
Optical Drive: HL-DT-ST BD-RE BH10LS30 SATA Bluray
Lexmark S305 Printer/Scanner/Copier (USB)
WEI Score: 8.1/8.1/8.5/8.5/8.25
Asus Eee PC 1011PX Netbook (Windows 7 x86 Starter)
Searching with Windows Explorer. I have about 25 large PDF files, and I want to search by specifying a word or phrase in the content, so that I know which 2 or 3 PDF files to open and explore further.
XP did it, albeit very slowly, but Windows 7 returns no results.
 

I'm having the same problem

I am a student with a number of research-related pdfs through which I would like to search for text from an explorer window and the start menu search box.

I have gone into Indexing Options->Advanced->File types->pdf->Index Properties and File Contents but that did not solve the problem that I share with macnab.

I looked into iFilters from Adobe. But Adobe claims that the latest iFilters are bundled with the latest versions of Acrobat and Reader, so I should have them.

Does anyone have a solution? I want to find the word “causative” among pdfs in a folder with a few dozen of them.
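
For anyone who just needs the list of matching files while Windows Search is not cooperating, here is a rough Python sketch of a folder search. It is a workaround, not a fix for the index. It assumes the third-party pypdf package is installed for the text extraction; the function names are made up for this example, and any other extractor can be passed in instead.

```python
from pathlib import Path
from typing import Callable, List


def extract_text_with_pypdf(pdf_path: Path) -> str:
    """Return all text pypdf can extract (assumes `pip install pypdf`)."""
    from pypdf import PdfReader  # imported lazily so the search logic works without it

    reader = PdfReader(str(pdf_path))
    # extract_text() may return an empty string for image-only (scanned) pages
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def find_pdfs_containing(
    folder,
    phrase: str,
    extract_text: Callable[[Path], str] = extract_text_with_pypdf,
) -> List[Path]:
    """Return the PDFs directly inside `folder` whose text contains `phrase`."""
    needle = phrase.casefold()  # case-insensitive comparison
    hits = []
    for pdf_path in sorted(Path(folder).glob("*.pdf")):
        try:
            text = extract_text(pdf_path)
        except Exception as exc:  # damaged or encrypted file: report it and move on
            print(f"skipped {pdf_path.name}: {exc}")
            continue
        if needle in text.casefold():
            hits.append(pdf_path)
    return hits
```

Calling find_pdfs_containing(r"C:\Papers", "causative") (the folder name is only a placeholder) returns the paths of the files that contain the word. Scanned PDFs without stored text will never match, whatever tool does the searching.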
 

My Computer

At a glance: Windows 7
OS
Windows 7
The file has to have recognizable text stored in it before it can be searched. A Word file that is converted to a PDF may have this information if the user saved the file as a searchable PDF. However, something that has been scanned into a PDF, or anything similar, will not. Where the text has not been recognized or was not stored, you will need software that can perform OCR (optical character recognition) on the file and then save it again with the recognized text stored in it. Acrobat Pro, for example, has this feature built in, and there are other dedicated programs for doing this as well. In fact, MS Office Tools used to have a built-in OCR tool that allowed you to OCR a scanned TIFF image, but I'm not sure whether it still does.
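
A quick way to tell the two kinds of PDF apart without opening each one is to look for text-drawing operators inside the file. The following is only a heuristic sketch using the Python standard library: it handles uncompressed and Flate-compressed content streams, ignores other filters and encrypted files, and only checks for the Tj and TJ operators. The function names are invented for this example.

```python
import re
import zlib

# Raw bytes between the "stream" and "endstream" keywords of each PDF stream object.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)
# A string or array operand followed by a text-showing operator (Tj or TJ).
TEXT_OP_RE = re.compile(rb"[)\]>]\s*(?:Tj|TJ)")


def looks_textual(data: bytes, sample_size: int = 2048) -> bool:
    """True if the start of the stream is mostly printable ASCII (not image data)."""
    sample = data[:sample_size]
    if not sample:
        return False
    printable = sum(1 for b in sample if b in b"\t\n\r" or 32 <= b < 127)
    return printable / len(sample) > 0.9


def has_text_layer(pdf_bytes: bytes) -> bool:
    """Heuristic: True if some content stream draws text between BT and ET."""
    for match in STREAM_RE.finditer(pdf_bytes):
        raw = match.group(1)
        try:
            # decompressobj tolerates the end-of-line bytes left after the data
            data = zlib.decompressobj().decompress(raw)
        except zlib.error:
            data = raw  # not Flate-compressed: inspect the bytes as they are
        if looks_textual(data) and b"BT" in data and TEXT_OP_RE.search(data):
            return True
    return False
```

A file for which has_text_layer(open(path, "rb").read()) is False is very likely a pure scan and needs OCR first. A True result only says that some text is stored; it does not guarantee that the stored text is complete or that an indexer can read it.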
 

My Computer

At a glance: XP / Win7 x64 Pro, Intel Quad-Core Q9450 @ 3.2GHz, 2x2GB GSkill DDR2, NVIDIA GeForce 8600 GTS (EVGA)
OS
XP / Win7 x64 Pro
CPU
Intel Quad-Core Q9450 @ 3.2GHz
Motherboard
Asus P5-E
Memory
2x2GB GSkill DDR2
Graphics Card(s)
NVIDIA GeForce 8600 GTS (EVGA)
Monitor(s) Displays
Dell 2408WFP
Screen Resolution
1920x1200
Cool, but didn't work

"In the cases in which the text has not been recognized or was not stored, you will need to use software that can perform OCR (optical character recognition) on the file and then save it again with the stored, recognized text."

Thanks for the reply. That is an interesting tip! However, I encountered 2 problems:

1. Most of the pdfs I tried to use Adobe Pro's OCR function on had "renderable text" on each page. I'm not sure where that renderable (i.e. editable) text is, but it prevented the OCR from working.

2. Some of the pdfs from older papers could be OCR'd, but that still didn't allow the text inside them to be indexed into Windows 7's search. (I tried it even after making sure that indexing was "complete" in the "Indexing Options" tool).

You seem to know what you're about. Any more ideas? Is there some iFilter I need to download? Some setting I need to tweak? I don't want to have to learn to use Mendeley Desktop...
 

Adobe iFilter 9 for 64-bit systems instructions

Here's something new that might help:

I have tried downloading and installing the Adobe iFilter 9 for 64-bit systems. Afterwards, Windows search still does not work.

However, the installation instructions include this:

"After installing the PDF iFilter, it is recommended that you set your system PATH environment variable to the "bin" folder of the "Ifilter" installation. For example, from the "Control Panel\System\Advanced\Environment Variables" tab, append to PATH "C:\Program Files\Adobe\Adobe PDF iFilter 9 for 64-bit platforms\bin\" and then restart the computer. "

I used Windows Search to look for "environment variables" (an example of its utility) and found the control panel. However, it is not totally clear to me that these instructions are for Windows 7, and I don't want to screw up my computer.

There are two sets of variables, for User and for System. The system variables are not editable. The current path points to Windows Live. Should I change that path?
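
On the "append" wording in those instructions: appending means adding the iFilter folder to the end of the existing PATH value after a semicolon, not replacing what is already there. Here is a small Python sketch of that string operation. It only computes the new value and changes nothing on the machine; the helper names are invented for this example, and the folder is the one quoted from the installation instructions.

```python
import ntpath  # Windows path rules; importable on any OS, which keeps this testable

# Folder named in the Adobe installation instructions quoted above.
IFILTER_BIN = r"C:\Program Files\Adobe\Adobe PDF iFilter 9 for 64-bit platforms\bin"


def _normalize(entry: str) -> str:
    """Compare entries the way Windows would: ignore case, quotes, trailing slash."""
    return ntpath.normcase(ntpath.normpath(entry.strip().strip('"')))


def path_contains(path_value: str, directory: str) -> bool:
    """True if `directory` is already one of the semicolon-separated PATH entries."""
    target = _normalize(directory)
    return any(_normalize(e) == target for e in path_value.split(";") if e.strip())


def append_to_path(path_value: str, directory: str) -> str:
    """Return PATH with `directory` added at the end, keeping every existing entry."""
    if path_contains(path_value, directory):
        return path_value  # already present: leave the value untouched
    return path_value.rstrip(";") + ";" + directory if path_value else directory
```

Passing the current PATH text and IFILTER_BIN to append_to_path returns the value to paste back into the Environment Variables dialog. The existing entries, the Windows Live one included, stay exactly as they were.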
 

1. The renderable text problem happens when the document contains an image of text as well as some actual text. The OCR tool finds this "renderable text" and concludes there is nothing more for it to do. The best bet in that scenario is to save it as a PDF again (print to PDF with CutePDF Writer or something of the sort), then run the OCR again on the new document.

2. Are you saving them again after you OCR them? If so, then the OCR is not properly recognizing the text inside the document that you're trying to search for. Have you tried right-clicking on the folder they're in and doing a search there instead of using the Windows indexed search?
 
