InfoQ

Support for Zip Files Still Lacking In .NET 3.0

Posted by Jonathan Allen on Jan 02, 2007

The ability to use file compression formats like the venerable ZIP format is very important to many developers. For developers using .NET, that has meant dropping to a command shell or using a third-party component. With .NET 3.0, there is built-in support for ZIP files, though the implementation is somewhat questionable.

The first step is to create a ZipPackage object. One would expect this to work just like any other object.

   Dim zipFile As New ZipPackage("C:\Temp\test.zip", FileMode.Create)
   ZipPackage zipFile = new ZipPackage("C:\\Temp\\test.zip", FileMode.Create);

Unfortunately that isn't the case. Instead, one has to use a factory method in the base class Package.

   Dim zipFile As ZipPackage = Package.Open("C:\Temp\test.zip", FileMode.Create)
   ZipPackage zipFile = (ZipPackage)Package.Open("C:\\Temp\\test.zip", FileMode.Create);

Though ZipPackage is the default return type for Package.Open, there is no way to actually specify that in any of the overloads, making it somewhat difficult to create your own implementations.

Moving on, adding files to the zip package in the normal use case is exceedingly painful. One would think the code would look something like:

   zipFile.AddFile("C:/temp/someFile.txt", CompressionOption.Maximum)
   zipFile.AddFile("C:/temp/someFile.txt", CompressionOption.Maximum);

To add files the .NET way, one has to:

  1. Create a new URI object that will represent the name of the file inside the zip file.
  2. Determine the correct MIME type for the file.
  3. Create a new PackagePart using the aforementioned information.
  4. Open the source file as a Stream.
  5. Copy said stream into the PackagePart stream.

The code below shows how to do this using a very crude stream copy loop. Note that copying through a buffer should be much faster than reading the stream one byte at a time.

   ' Name of the file as it will appear inside the package
   Dim newUri As New Uri("/someFile.txt", UriKind.Relative)
   Dim part1 As PackagePart = zipFile.CreatePart(newUri, _
       System.Net.Mime.MediaTypeNames.Text.Plain, CompressionOption.Maximum)
   Using output As Stream = part1.GetStream(), _
         input As FileStream = File.OpenRead("C:/temp/someFile.txt")
       ' Crude copy: one byte at a time until ReadByte signals end of stream
       Dim value As Integer = input.ReadByte()
       Do Until value = -1
           output.WriteByte(CByte(value))
           value = input.ReadByte()
       Loop
   End Using
   ' Close the package so the zip file is flushed to disk
   zipFile.Close()
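
The five steps and the buffered copy mentioned above can be sketched as a pair of small helpers. This is only an illustration: CopyStream, AddFile, and the ZipHelpers module are hypothetical names that do not exist in the framework, the 4 KB buffer size is arbitrary, and the caller is assumed to know the MIME type. The project is assumed to reference WindowsBase.dll, where System.IO.Packaging lives.

```vb
Imports System
Imports System.IO
Imports System.IO.Packaging

Module ZipHelpers

    ' Hypothetical helper: copies input to output in 4 KB chunks
    ' instead of one byte at a time.
    Sub CopyStream(ByVal input As Stream, ByVal output As Stream)
        Dim buffer(4095) As Byte   ' 4096 bytes; the size is an arbitrary choice
        Dim bytesRead As Integer = input.Read(buffer, 0, buffer.Length)
        Do While bytesRead > 0
            output.Write(buffer, 0, bytesRead)
            bytesRead = input.Read(buffer, 0, buffer.Length)
        Loop
    End Sub

    ' Hypothetical helper: wraps the five steps in a single call.
    ' The caller supplies the MIME type; nothing here detects it.
    Sub AddFile(ByVal pkg As Package, ByVal sourcePath As String, _
                ByVal contentType As String)
        ' Step 1: use the file name, rooted at "/", as the part name.
        Dim partUri As Uri = PackUriHelper.CreatePartUri( _
            New Uri(Path.GetFileName(sourcePath), UriKind.Relative))
        ' Steps 2-3: create the part with the caller-supplied content type.
        Dim part As PackagePart = pkg.CreatePart( _
            partUri, contentType, CompressionOption.Normal)
        ' Steps 4-5: open the source file and copy it into the part.
        Using input As FileStream = File.OpenRead(sourcePath)
            Using output As Stream = part.GetStream()
                CopyStream(input, output)
            End Using
        End Using
    End Sub

    Sub Main()
        Using zipFile As Package = Package.Open("C:\Temp\test.zip", FileMode.Create)
            AddFile(zipFile, "C:\Temp\someFile.txt", _
                    System.Net.Mime.MediaTypeNames.Text.Plain)
        End Using   ' disposing the package flushes and closes it
    End Sub

End Module
```

With helpers like these, adding a file comes close to the one-line call one would have hoped for, at the cost of maintaining the wrapper yourself.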

While there isn't a simple way to decompress a zip file, it is far less painful than creating the file in the first place. This code lists all of the files in a zip file and dumps the text ones to the screen.

   zipFile = CType(ZipPackage.Open("C:\Temp\test.zip", IO.FileMode.Open), ZipPackage)
   For Each part As ZipPackagePart In zipFile.GetParts
       Console.WriteLine(part.Uri)
       Console.WriteLine(vbTab & "Type:" & part.ContentType)
       Console.WriteLine(vbTab & "Option:" & part.CompressionOption)
       If part.ContentType.ToLower.Contains("text/") Then
           Using output As New StreamReader(part.GetStream)
               Console.WriteLine(output.ReadToEnd)
           End Using
       End If
   Next
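
Writing the parts out to disk takes only a little more code than listing them. The following is a rough sketch, not a framework feature: ExtractAll and the ZipExtract module are hypothetical names, the mapping from part URI to file path is an assumption, and every part the package reports is written out without any filtering.

```vb
Imports System
Imports System.IO
Imports System.IO.Packaging

Module ZipExtract

    ' Hypothetical helper: writes every part of the package at zipPath
    ' into targetFolder, using the part URI as the relative file path.
    Sub ExtractAll(ByVal zipPath As String, ByVal targetFolder As String)
        Using pkg As Package = Package.Open(zipPath, FileMode.Open, FileAccess.Read)
            For Each part As PackagePart In pkg.GetParts()
                ' Part URIs look like "/folder/someFile.txt"; drop the leading
                ' slash and undo URI escaping such as %20.
                Dim relativePath As String = _
                    Uri.UnescapeDataString(part.Uri.OriginalString.TrimStart("/"c))
                Dim targetPath As String = Path.Combine(targetFolder, _
                    relativePath.Replace("/"c, Path.DirectorySeparatorChar))
                Directory.CreateDirectory(Path.GetDirectoryName(targetPath))

                ' Buffered copy from the part stream to the target file.
                Using input As Stream = part.GetStream(FileMode.Open, FileAccess.Read)
                    Using output As FileStream = File.Create(targetPath)
                        Dim buffer(4095) As Byte
                        Dim bytesRead As Integer = input.Read(buffer, 0, buffer.Length)
                        Do While bytesRead > 0
                            output.Write(buffer, 0, bytesRead)
                            bytesRead = input.Read(buffer, 0, buffer.Length)
                        Loop
                    End Using
                End Using
            Next
        End Using
    End Sub

End Module
```

A call such as ExtractAll("C:\Temp\test.zip", "C:\Temp\extracted") would then unpack the file created earlier; both paths are placeholders.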

Even when Maximum compression is chosen, the compression ratio is very poor compared to that of WinZip. We tested this using a 2KB plain text file containing the readme for an application. WinZip achieved 46% compression while .NET 3.0 managed only 4%. For WinZip the setting "Maximum (portable)" was used. For .NET 3.0, the above code was used.

Actually that comparison isn't fair, because buried in the docs is this note:

   "For the default ZipPackage subclass, the CreatePart method only supports two compressionOption values, NotCompressed or Normal compression. Other CompressionOption values of Maximum, Fast, or SuperFast use Normal compression."

One last warning: this method doesn't create standard zip files. While the files can be read by normal tools, the zip files will contain an additional file called "[Content_Types].xml". Likewise, .NET 3.0 cannot read zip files unless they contain "[Content_Types].xml". If that file is missing, it silently fails to find any files.

In conclusion, .NET 3.0's support for the ZIP format is so highly specialized that it is useless in the general case.

    .NET Zip support

    Jan 2, 2007 9:01 AM by Rob Eisenberg

    There is quite a bit of inaccuracy in this post. To begin with, Zip support has been available in .NET since version 1.1, although it was buried in some j# or vb libraries. No one really knew about it. However, .NET 2.0 introduced the System.IO.Compression namespace which contains two implementations: DeflateStream and GZipStream. I believe the 3.0 specific functionality mentioned above (System.IO.Packaging) is specifically related to XPS documents. Perhaps someone can confirm this? As to the quality of any of these algorithms, I am no expert.

    Re: .NET Zip support

    Jan 2, 2007 10:27 AM by Jonathan Allen

    > There is quite a bit of inaccuracy in this post. To begin with, Zip support has been available in .NET since version 1.1, although it was buried in some j# or vb libraries. No one really knew about it.

    Can you give me a reference for that?

    > However, .NET 2.0 introduced the System.IO.Compression namespace which contains two implementations: DeflateStream and GZipStream.

    While those can be used for accessing ZIP files, the code needed to do it isn't trivial. Even just getting the file list requires manually parsing the stream to get the header information. There is a sample at MSDN.

    > I believe the 3.0 specific functionality mentioned above (System.IO.Packaging) is specifically related to XPS documents. Perhaps someone can confirm this?

    Most of the implementation details for that namespace are in the XPS documents.

    Much of the fault with the namespace is that it was described as a general solution in the documentation, and only when you really dig into it do you see that it is just support code for XPS.

    Re: .NET Zip support

    Jan 2, 2007 9:44 PM by Birger Halfmeier

    > There is quite a bit of inaccuracy in this post. To begin with, Zip support has been available in .NET since version 1.1, although it was buried in some j# or vb libraries. No one really knew about it.

    > Can you give me a reference for that?

    Have a look at this MSDN Magazine article:
    Using the Zip Classes in the J# Class Libraries to Compress Files and Data with C#

    Re: .NET Zip support

    Jan 4, 2007 3:14 PM by Ted Neward

    > There is quite a bit of inaccuracy in this post. To begin with, Zip support has been available in .NET since version 1.1, although it was buried in some j# or vb libraries. No one really knew about it.

    > Can you give me a reference for that?

    It's in J#; look at java.util.zip packages. It's a port of the ZIP support introduced in JDK 1.1, and it may not be any better than what you see in System.IO.Compression, but it is there. There was an MSDN article from some years back that demonstrated how to use those libraries from C#, by the way--a quick Google search (which I'm too lazy to run at the moment) should dig it up.
