Working with Zip Files in .NET [Richard Lee]

Article
06/28/2010

Before getting started, I’ll introduce myself. My name is Richard Lee, and I’m a developer intern on the BCL for the summer. I’ve only been here for a few weeks, but it’s been great working here. The people, the environment, and my project are all great. Speaking of which, my project is to add general purpose .NET APIs for reading and writing Zip files, which we’re considering adding to the next version of the .NET Framework.

The most common Zip tasks are extracting to a directory and archiving a directory. For these mainline scenarios, we have static convenience methods. The following code takes all of the files in the Zip file, photos.zip, and extracts them to a folder on the file system:

 ZipArchive.ExtractToDirectory("photos.zip", @"photos\summer2010");

This code does the reverse, putting all of the files in the folder into the Zip file:

 ZipArchive.CreateFromDirectory(@"docs\attach", "attachment.zip");

For more sophisticated manipulations of Zip archives, there are two main classes. ZipArchive represents a zip archive, which is a collection of entries, and ZipArchiveEntry represents an archived file entry. The following code extracts only text files from the given archive.

 using (var archive = new ZipArchive("data.zip"))
{
    foreach (var entry in archive.Entries)
    {
        if (entry.FullName.EndsWith(".txt", StringComparison.OrdinalIgnoreCase))
        {
            entry.ExtractToFile(Path.Combine(directory, entry.FullName));
        }
    }
}

Zip archives can also be created on-the-fly. This example creates a new archive with a readme file that is created without the need for a corresponding file on disk, and a file from the file system.

 using (var archive = new ZipArchive("new.zip", ZipArchiveMode.Create))
{
    var readmeEntry = archive.CreateEntry("Readme.txt");
    using (var writer = new StreamWriter(readmeEntry.Open()))
    {
        writer.WriteLine("Included files: ");
        writer.WriteLine("data.dat");
    }

    archive.CreateEntryFromFile("data.dat", "data.dat");
}

The ZipArchive class supports three modes:

In Read mode, data is read from the file on demand, using only a small buffer.
In Create mode, data is written directly to the file using only a small buffer. Only one entry may be held open for writing at a time.
In Update mode it is possible to read and write from existing archives, as well as rename or delete entries. This mode requires loading the entire archive into memory, and as such we recommend that it be used only with small archives when this functionality is needed.

Below is our current thinking on what the public API listing will look like (note that this hasn’t been finalized yet).

 namespace System.IO.Compression
{
    public enum ZipArchiveMode { Read, Create, Update }

    public class ZipArchive : IDisposable {
        // Constructors
        public ZipArchive(String path);
        public ZipArchive(String path, ZipArchiveMode mode); 
        public ZipArchive(Stream stream);
        public ZipArchive(Stream stream, ZipArchiveMode mode);
        public ZipArchive(Stream stream, ZipArchiveMode mode, Boolean leaveOpen);

        // Properties
        public ReadOnlyCollection<ZipArchiveEntry> Entries { get; }
        public ZipArchiveMode Mode { get; }
        
        // Instance methods
        public ZipArchiveEntry GetEntry(String entryName);
        public ZipArchiveEntry CreateEntry(String entryName);

        public void Dispose();
        protected virtual void Dispose(Boolean disposing);

        public override String ToString();

        // Instance convenience methods
        public ZipArchiveEntry CreateEntryFromFile(String sourceFileName, String entryName);

        public void ExtractToDirectory(String destinationDirectoryName);

        // Static convenience methods
        public static void CreateFromDirectory(String sourceDirectoryName, String destinationArchive);
        public static void CreateFromDirectory(String sourceDirectoryName, String destinationArchive, Boolean includeBaseDirectory);

        public static void ExtractToDirectory(String sourceArchive, String destinationDirectoryName);
    }

    public class ZipArchiveEntry {
        // Properties
        public DateTimeOffset LastWriteTime { get; set; }
        public String FullName { get; }
        public String Name { get; }
        public Int64 Length { get; }
        public Int64 CompressedLength { get; }
        public ZipArchive Archive { get; }

        // Methods
        public Stream Open();
        public void Delete();
        public void MoveTo(String destinationEntryName);

        // Convenience methods
        public void ExtractToFile(String destinationFileName);
        public void ExtractToFile(String destinationFileName, Boolean overwrite);

        public override String ToString();
    }
}

We would love to hear what you think of the APIs so far, and how you plan on using them.

Working with Zip Files in .NET [Richard Lee]

Additional resources