.NET Zip Library/Utility updates and fixes


This is an update to my zip library and utility.  You can read previous entries on this here


[Update 30 October 2007]: I moved this library to a CodePlex project. See www.codeplex.com/DotNetZip  It has been steadily improved since the time of this post. 


A reader mailed me to say that the Zip library did not work with some zip archives.  I knew that would be true.  The zip library code I had originally posted does not handle all zip files.  The zip spec is pretty rich, and has a bunch of advanced features.  The simple DotNetZip library I produced initially did not support them all.


But this particular reader had a pretty simple request. They needed support for the Data Descriptor field when reading a zip archive.  


There’s a part of the PKZIP spec that allows the CRC and the compressed and uncompressed sizes of a file to be placed after the actual compressed file data within the zip archive, in what the spec calls a “Data Descriptor”.  This is to support zip creation where the output stream does not support seeking: the writer cannot go back and fill in those values in the entry header once the data has been written.  The presence of the Data Descriptor is flagged in a zip file when the “BitField” (the general purpose bit flag) for a Zip entry has bit 3 set.  The current APPNOTE from PKWARE says:



This descriptor exists only if bit 3 of the general purpose bit flag is set (see below). It is byte aligned and immediately follows the last byte of compressed data. This descriptor is used only when it was not possible to seek in the output .ZIP file, e.g., when the output .ZIP file was standard output or [another] non-seekable device. 
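To make the bit test and the descriptor layout concrete, here is a minimal C# sketch. It is not the DotNetZip source; the class and method names are hypothetical, chosen only for illustration. It assumes the reader is already positioned at the right offset (start of a local file header, or just past the last byte of compressed data), and it treats the 0x08074b50 descriptor signature as optional, since some tools write it and some do not. A CRC that happens to equal that signature value would be misread by this simplified check.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of DotNetZip: illustrates the APPNOTE layout only.
public static class DataDescriptorSketch
{
    const uint LocalHeaderSignature = 0x04034b50;     // "PK\x03\x04"
    const uint DataDescriptorSignature = 0x08074b50;  // "PK\x07\x08", optional

    // True when bit 3 of the general purpose bit flag is set.
    public static bool HasDataDescriptor(ushort bitField)
    {
        return (bitField & 0x0008) != 0;
    }

    // Reads the first three fields of a local file header and returns
    // the general purpose bit flag. BinaryReader reads little-endian,
    // which matches the zip format.
    public static ushort ReadBitField(BinaryReader reader)
    {
        uint signature = reader.ReadUInt32();
        if (signature != LocalHeaderSignature)
            throw new InvalidDataException("Not a local file header.");
        reader.ReadUInt16();          // version needed to extract
        return reader.ReadUInt16();   // general purpose bit flag
    }

    // Reads a Data Descriptor: CRC-32, compressed size, uncompressed size,
    // optionally preceded by the descriptor signature.
    public static void ReadDataDescriptor(BinaryReader reader,
        out uint crc, out uint compressedSize, out uint uncompressedSize)
    {
        uint first = reader.ReadUInt32();
        crc = (first == DataDescriptorSignature) ? reader.ReadUInt32() : first;
        compressedSize = reader.ReadUInt32();
        uncompressedSize = reader.ReadUInt32();
    }
}
```

When bit 3 is set, the CRC and size fields in the local header itself are written as zero, so a reader has to get the real values from the descriptor (or from the central directory) instead.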


The person who emailed me also sent a zip archive that included this data descriptor, and asked “why can’t your library read this zip?”.  Using that as a test case, I’ve updated the Zip library to support the Data Descriptor for reading/extracting.  It now reads and extracts such archives correctly.


Keep in mind that DotNetZip, the zip library implementation I built, is not a streaming implementation, so it will never produce (write) a zip archive that includes a Data Descriptor like this.  [update:  this is no longer true, as of 2008]  It will produce zips that are interoperable with WinZip and PKZIP, as well as the Java JAR tool, etc.


[update: See the codeplex project for source code ]

DotNetZip-src-v1.1.zip

Comments (4)

  1. daydreamsy2k says:

    I am the one who sent you the problem. It has been solved in this release.

    I just want to thank you for your fast response and for fixing the problem.

    Your class so good and so small not like SharpZipLib so big and a lot of classes.

    Please do not stop improving (SharpZipLib) and do not forget to always keep it so simple because there are a lot of people like me want only simple zip/unzip in C#

    Kind Regards

  2. Andrew says:

    Fantastic library. I’m enjoying using it very much. However, there are some caveats I’m finding.

    For instance, I’m zipping up a large number of .wav files. I use tugzip for my normal, everyday zipping operations. The tugzip archive comes out to about 1.6 mb. The library’s produced zip is about 2.6 mb. When I open the library-created zip in any zip app (in this case i tested with winzip, tugzip and winrar) theyre all reporting a negative ratio for most files, meaning that the packed file is larger than the file on disk. For example, a 55k wav file was ‘compressed’ and showed as 67kb in the zip.

    I’m also unable to figure out how to add files to a ZipFile from say.. ‘C:\Windows\Temp’ and have just the file inserted into the zip without a directory structure. ie. The directory structure in the zip ends up like this…

    C:

     Windows

       Temp

         somefile.extension

    The same occurs if I just pass C:\Windows\Temp to the AddDirectory method.

    Any help or explanation would be greatly appreciated!

  3. andrew_ says:

    I found some more information on the ratio issues. It seems to be happening only with .wav files (so far, as far as i can tell)

    The size on disk of my test wav file is 55802 bytes. The size in the zip is 67904 bytes.

    It seems as if the line..

    Int32 isz = (Int32)_UnderlyingMemoryStream.Length;

    is reporting the length of the stream (file) as 67904.

    This much is confusing. I’m not sure why the stream contains more bytes than the file is on disk. At any rate, it’s more info to work from 🙂

  4. cheeso says:

    I mentioned this in an update to the first post on the Zip library: The DeflateStream class in .NET can actually Inflate the size of the stream. The output is still valid, but it isn’t compressing as you’d expect.  Because my example Zip library and utilities just use the built-in DeflateStream for the compression, the zip library and utilities can exhibit the same behavior.  I hate to sound like I am deflecting responsibility, but my zip library is just a wrapper. 

    The base class library team at Microsoft is aware of this behavior and is considering it. If you’d like to weigh in on this behavior, and I encourage you to do so if you find Deflate/Zip support to be important, please use Microsoft Connect:

    http://connect.microsoft.com/VisualStudio/feedback/SearchResults.aspx?SearchQuery=DeflateStream
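
The DeflateStream behavior described in the last comment can be checked directly by deflating a buffer in memory and comparing lengths. The following is a rough sketch, not code from the zip library; the class and method names are made up for illustration. The idea of falling back to "stored" (method 0, which the zip format permits) when deflation does not help is shown only as one possible workaround, not as something the library does.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper, not part of DotNetZip.
public static class DeflateSizeCheck
{
    // Deflates the input with the built-in DeflateStream and returns the result.
    public static byte[] Deflate(byte[] input)
    {
        using (var output = new MemoryStream())
        {
            // leaveOpen: true so the MemoryStream survives disposal of the
            // DeflateStream; disposing the DeflateStream flushes the final block.
            using (var deflater = new DeflateStream(output, CompressionMode.Compress, true))
            {
                deflater.Write(input, 0, input.Length);
            }
            return output.ToArray();
        }
    }

    // True when deflating did not make the data any smaller, in which case
    // a zip writer could choose to store the entry uncompressed instead.
    public static bool ShouldStore(byte[] input)
    {
        return Deflate(input).Length >= input.Length;
    }
}
```

Feeding this a buffer of random bytes (which has no redundancy to exploit) should report that storing is the better choice, while a buffer of repeated bytes should deflate to something much smaller. How much the output grows for poorly compressible input depends on the .NET version in use.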