Searching in bulk for files within a folder!

Page 1 of 2

  1. Posts : 12
    windows 10 64
       #1


    I have a task at work (where I can't download any programs or run the command-line tool) to OCR 10,000 documents, so I left Adobe Acrobat running overnight to do it.

    When I came back in the morning, Acrobat had crashed and stopped OCR-ing. When I checked the target folder where the OCRed files are saved, I found that only about 4,000 files had been processed. Now I have to find the files that were not OCRed so I can process them in Adobe Acrobat.

    I have also managed to extract the names of the files that were not OCRed from the main list and put them in a txt file (just the file names, without extensions). However, I couldn't figure out how to separate those files from the OCRed ones in the source folder. Can you please help?
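
For readers who are able to run a script (the thread starter says they cannot install or run tools at work), here is a minimal Python sketch of the task described above. It assumes a text file with one base name per line, no extension, and matches names case-insensitively. The function names `read_names` and `move_listed_files` and all folder paths are hypothetical, chosen only for illustration.

```python
from pathlib import Path
import shutil


def read_names(list_file: Path) -> set[str]:
    """Read one base name per line (no extension), skipping blank lines."""
    text = list_file.read_text(encoding="utf-8")
    return {line.strip().lower() for line in text.splitlines() if line.strip()}


def move_listed_files(source: Path, names: set[str], target: Path) -> list[str]:
    """Move every file in `source` whose base name is in `names` into `target`.

    Returns the names of the files that were moved, in sorted order.
    """
    target.mkdir(parents=True, exist_ok=True)
    moved = []
    # Take a sorted snapshot first so moving files does not disturb the iteration.
    for path in sorted(source.iterdir()):
        if path.is_file() and path.stem.lower() in names:
            shutil.move(str(path), str(target / path.name))
            moved.append(path.name)
    return moved
```

A call such as `move_listed_files(Path("source"), read_names(Path("todo.txt")), Path("to_ocr"))` would leave the already processed files in place and collect the rest in `to_ocr`. Trying it on a copy of the folder first is the safer route.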


  2. Posts : 7,351
    Windows 7 HP 64
       #2

    - Make a copy of the folder with the OCR files. Make sure the copies have different dates than the original files.
    - Change the extension of the OCR files to match that of the original (non-OCR) files.
    - Select all the original (non-OCR) files in the original folder, choose Copy, go to the copied OCR folder whose extensions you changed, and paste.
    - Windows will tell you that files with the same names already exist and ask whether you want to replace them. Press Shift + Skip. Only the files that haven't been OCRed will be copied into the copied folder (the one with the renamed extensions).
    - Sort by date. Select all the files dated by the OCR run and delete them. The remaining files are the ones that haven't been OCRed.
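
The copy-and-skip steps above amount to comparing file names while ignoring extensions. A minimal Python sketch of that comparison follows, assuming each OCR output keeps the base name of its source file and only the extension changes. The function name `not_yet_ocred` is hypothetical, and this is an alternative to the Explorer procedure, not a description of what Explorer does internally.

```python
from pathlib import Path


def not_yet_ocred(source_names: list[str], ocr_names: list[str]) -> list[str]:
    """Return the source file names that have no OCR output with the same base name."""
    # Compare base names only, case-insensitively, so ABC.tif matches abc.pdf.
    done = {Path(name).stem.lower() for name in ocr_names}
    return [name for name in source_names if Path(name).stem.lower() not in done]
```

The two lists could be built from real folders with something like `[p.name for p in Path("source").iterdir()]`. The approach does not depend on file dates at all.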
    Last edited by Megahertz07; 21 Sep 2017 at 17:44.


  3. Posts : 1,784
    Linux Mint 18.2 xfce 64-bit (VMWare host) / Windows 8.1 Pro 32-bit (VMWare guest)
       #3

    I'm not sure how to do specifically what you are asking. However, you might try breaking up the 10,000 files into separate subdirectories. For example, move all of the files that start with a, b, or c into one folder, those that start with d, e, or f into another, and so on. Then run the process several times, once on each of the folders you created. That way, each run of Adobe would have far fewer files to deal with.
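
A minimal Python sketch of the batching idea, for anyone who can run a script. Note the swap: it splits by a fixed batch size instead of by first letter, since file names may not be spread evenly across the alphabet. The function name `split_into_batches` and the `batch_NN` folder names are hypothetical.

```python
from pathlib import Path
import shutil


def split_into_batches(folder: Path, batch_size: int = 1000) -> list[Path]:
    """Move the files in `folder` into numbered subfolders of at most `batch_size` files."""
    # Snapshot the file list before creating subfolders or moving anything.
    files = sorted(p for p in folder.iterdir() if p.is_file())
    batches = []
    for start in range(0, len(files), batch_size):
        batch_dir = folder / f"batch_{start // batch_size + 1:02d}"
        batch_dir.mkdir(exist_ok=True)
        for path in files[start:start + batch_size]:
            shutil.move(str(path), str(batch_dir / path.name))
        batches.append(batch_dir)
    return batches
```

With 10,000 files and the default size, this would produce ten folders, so a crash partway through costs at most one batch.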


  4. Posts : 7,107
    W7 home premium 32bit/W7HP 64bit/w10 tp insider ring
       #4

    Files with any extension can be found once you click on the required folder.
    Use the search box at the top right; in your case, search for .txt.
    The matching files are then listed.

    Roy


  5. Posts : 12
    windows 10 64
    Thread Starter
       #5

    Megahertz07
    Thanks for your reply; however, I don't think your solution would work for me.

    - Make a copy of the folder with the OCR files. Make sure the copies have different dates than the original files.
    I was working on all the files on the same date, so why should they have different dates?
    - Change the extension of the OCR files to match that of the original (non-OCR) files.
    The OCRed files list doesn't have extensions. I don't have a list of the OCRed files, but I have attached a spreadsheet with three lists/sheets: the first shows all the file names with extensions (TIF and PDF), the second holds all the file names again but without extensions, and the third holds the remaining files that still need to be OCRed (without extensions). The spreadsheet is shared via Microsoft OneDrive.
    - Select all the original (non-OCR) files in the original folder, choose Copy, go to the copied OCR folder whose extensions you changed, and paste.
    Do you want me to overwrite the OCRed files with the original (non-OCRed) files? That would take us back to the starting point!

    - Windows will tell you that files with the same names already exist and ask whether you want to replace them. Press Shift + Skip. Only the files that haven't been OCRed will be copied into the copied folder (the one with the renamed extensions).
    I'm a bit confused here. What do you mean by "Press Shift + Skip"? Also, the OCRed files are all converted to PDF once they are OCRed, while the source files are a mix of TIFs and PDFs.
    - Sort by date. Select all the files dated by the OCR run and delete them. The remaining files are the ones that haven't been OCRed.
    The dates won't be that distinct!

    mrjimphelps
    Thank you for your reply. Your solution comes too late for this batch, but I'm using it for the next 10,000 files :)

    torchwood
    Thanks, I know that, but why do you think I need to search by extension? Both the source and the target folders contain PDF files, and the source/original folder contains both TIFs and PDFs.


  6. Posts : 7,351
    Windows 7 HP 64
       #6

    The OCRed files MUST have an extension. If they are of a known type (*.pdf, *.doc, *.tif, etc.) and Explorer is set to hide extensions, you won't see the extension. But they have one.
    Open Control Panel\All Control Panel Items\Folder Options, go to the View tab, and uncheck "Hide extensions for known file types". For security reasons it should always remain unchecked.
    Once you find out what extension the OCR files have, please report back.


  7. Posts : 12
    windows 10 64
    Thread Starter
       #7

    #Megahertz07

    Well, I think it's my mistake because I didn't make it clear to you. The OCRed files have the PDF extension, as that is the standard format for any OCRed file. What I was talking about is the extensions on the list that I sent you. Did you check that?


  8. Posts : 7,351
    Windows 7 HP 64
       #8

    Yes I did.
    What name is given to the OCR file when the source file is named ACE87603...tif?
    Is it the same name but with a pdf extension (ACE87603...pdf)?


  9. Posts : 12
    windows 10 64
    Thread Starter
       #9

    Yes indeed, it will have the same name with the pdf extension


  10. Posts : 7,351
    Windows 7 HP 64
       #10

    You may not have understood the solution I posted (#2), but it will work.
    Can you run a cmd file?
    Let's run a test.

    Open Notepad and paste the commands below:

    time
    pause

    Save it as test.cmd on your desktop and then run it.
    It will show the computer's time.
    Press Enter twice.


 