File size of zipped password protected files larger than expected


  1. Posts : 5
    Windows 7 Ultimate 32bit
       #1



    When I compress a password-protected Word or Excel file by adding it to a zip archive (compressed folder), the archive is much larger than when I add an unprotected file. For example, if I compress a 5 MB password-protected Excel file, the zip archive is a little under 5 MB, a compression ratio of about 8%. If I compress the same file without password protection, the archive is about 1 MB, a compression ratio of about 82%.

    I guess I can somewhat understand this. The act of compressing the file is a manner of changing it. But then why do I get some minimal (8%) compression at all?

    I have tried this with combinations of Windows XP, Windows 7, Office 2003, and Office 2010, and the low compression (large archive size) with password-protected files seems to be specific to systems running Office 2010. With Office 2003, the compression ratio is about 86%-88% regardless of password protection. Is this part of the "enhanced security" of Office 2010? Is there a workaround for this, or is it an inescapable result of compressing a password-protected file when using Office 2010?


  2. Posts : 10,485
    W7 Pro SP1 64bit
       #2

    If you had a simple text file filled with the word TEST over and over, it would be highly compressible: the compression app just needs to note the word (and any spaces) and then record how many times it should appear in the decompressed document.
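The TEST example can be tried directly. This is only a minimal sketch using Python's standard `zlib` module (the same DEFLATE family that zip tools typically use); the helper name `compression_ratio` and the data sizes are made up for illustration.

```python
import os
import zlib

def compression_ratio(data: bytes) -> float:
    """Return the fraction of space saved by DEFLATE (0.0 means no savings)."""
    compressed = zlib.compress(data, 9)
    return 1 - len(compressed) / len(data)

# About 1 MB of the same word repeated, as in the TEST example.
repetitive = b"TEST " * 200_000

# Same size, but random bytes: there are no repeated patterns to exploit.
random_like = os.urandom(len(repetitive))

print(f"repetitive: {compression_ratio(repetitive):.1%} saved")
print(f"random:     {compression_ratio(random_like):.1%} saved")
```

The repetitive input should shrink to a tiny fraction of its size, while the random input should not shrink at all (it can even grow by a few bytes of overhead).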

    My theory for your findings with spreadsheets is:
    An Excel spreadsheet must have lots of repeated patterns that can be highly compressed. When you encrypt that spreadsheet, you scramble some of those repeated patterns - thus less compressibility.

    To test this theory, I saved a spreadsheet (Office 2010) and opened it with a hex editor. Just as I thought, there are lots of repeated patterns in that spreadsheet file. Then I made a copy of that spreadsheet and password protected the entire workbook. The hex editor showed fewer repeated patterns in the encrypted file, but I cannot put a percentage on it. If I had filled the test spreadsheet with a lot more data/labels/charts/text, I suspect the difference in repeated patterns (and thus compressibility) would have been greater. I would hope that the compression app cannot detect any pattern in the part of the encrypted file that holds the secured contents.
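The same experiment can be roughly imitated without Office. The sketch below is an assumption-heavy stand-in: `toy_keystream_xor` is a made-up, insecure toy cipher (a SHA-256 counter keystream XORed over the data), not the scheme Office 2010 uses, and the "spreadsheet" is just fake CSV-like rows. It only illustrates that scrambling repeated patterns removes what the compressor relies on.

```python
import hashlib
import zlib

def toy_keystream_xor(data: bytes, password: str) -> bytes:
    """XOR data with a SHA-256 counter keystream (illustration only, NOT secure)."""
    key = hashlib.sha256(password.encode("utf-8")).digest()
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        block = hashlib.sha256(key + counter).digest()  # 32 keystream bytes
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)

def saved(data: bytes) -> float:
    """Fraction of space saved by DEFLATE at the highest compression level."""
    return 1 - len(zlib.compress(data, 9)) / len(data)

# Hypothetical stand-in for spreadsheet contents: many near-identical rows.
rows = b"".join(b"Region,Quarter,%05d,Total\r\n" % i for i in range(20_000))
scrambled = toy_keystream_xor(rows, "hunter2")

print(f"plain rows:     {saved(rows):.1%} saved")
print(f"scrambled rows: {saved(scrambled):.1%} saved")
```

Because XOR with the same keystream is its own inverse, calling `toy_keystream_xor(scrambled, "hunter2")` gives back the original rows, so no information is lost; only the visible patterns are gone.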

    I don't feel like installing Office 2003 to try and ferret out why you still saw high compression rates on protected files... but it seems that the protection schemes differ between Office 2003 and Office 2010.

    All of the above is just conjecture on my part. I would not put too much stock in my ramblings.


  3. Posts : 2,497
    Windows 7 Pro 64 bit
       #3

    UsernameIssues is probably correct. Encryption tends to greatly reduce the compressibility of a file, regardless of its type or content. Office 2010 considerably improved its encryption over Office 2003.

    Remember that the purpose of improving encryption is to improve security. The difference in compressibility is rarely even considered.
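One rough way to see why encrypted output resists compression is to measure how evenly its byte values are spread. The sketch below computes byte-frequency (Shannon) entropy in bits per byte; `bits_per_byte` is a hypothetical helper, and random bytes from `os.urandom` are used here as a stand-in for well-encrypted data. Treat it as an indicator only: it ignores repeated sequences, so it is not an exact predictor of what a zip tool will achieve.

```python
import math
import os
from collections import Counter

def bits_per_byte(data: bytes) -> float:
    """Shannon entropy of the byte-value histogram, from 0.0 up to 8.0."""
    counts = Counter(data)
    total = len(data)
    return -sum(c / total * math.log2(c / total) for c in counts.values())

plain = b"TEST " * 10_000             # only four distinct byte values
encrypted_like = os.urandom(50_000)   # stand-in for well-encrypted output

print(f"plain text:     {bits_per_byte(plain):.2f} bits per byte")
print(f"encrypted-like: {bits_per_byte(encrypted_like):.2f} bits per byte")
```

Values close to 8 bits per byte mean every byte value is about equally likely, which leaves a general-purpose compressor almost nothing to shorten.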


 

Windows 7 Forums is an independent web site and has not been authorized, sponsored, or otherwise approved by Microsoft Corporation. "Windows 7" and related materials are trademarks of Microsoft Corp.
