CreateMHTMLBody Won't Read Web Pages Compressed with GZIP

The title says it all.  If you've got a web page hosted on a site that does GZIP compression, you can't point CDO's CreateMHTMLBody at it.  You'll get back one of a few different errors, depending on your IE setup.  The ones I've seen are (with IE 6.0 SP1 installed):

  • "Invalid Syntax" (if you don't have "Automatically Detect Settings" checked in your LAN Settings)
  • "Interface not registered" (if you do have it checked).

Both of these problems stem from one of CDO's dependencies: URLMON.DLL.  CDO uses this component to do all of the work of actually fetching the web content.  We just pass the URL off to URLMON and wait for it to hand the HTML back.  The problem is that URLMON handles this through an asynchronous call, so it spins off a worker thread to do the actual work.  When this thread gets the content back from the server, it sees that it is GZIP-compressed.  It then tries to create a COM object (one that URLMON itself implements) to decompress the content.  This is where the fun begins.

URLMON never calls CoInitialize on this worker thread, so of course we run into problems.  The two different errors are just different flavors of the same COM problem.  The first error is caused by a failure to create an IClassFactory interface (COM's not initialized, remember?).  The second error results from COM having been initialized by the process, but not on that specific thread.  Unfortunately, the COM object responsible for decompressing GZIP content works only on an STA thread.

All is not lost though!  CDO developers can get around this.  You have a number of options.  You could disable compression on your website (assuming that the site you're using is under your control).  You could turn off HTTP 1.1 (1.1 allows for compression, 1.0 doesn't) in your IE settings (assuming that your app is running in a user's context).  Or, you could fix it the right way ;-)

You can disable HTTP 1.1 for just your process using InternetSetOption.  If you do this before making your CreateMHTMLBody call, it will work like a champ.

In case you're interested, here's how to do it in VB.NET:

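    ' These Imports belong at the top of the source file.
    Imports System.Runtime.InteropServices
    Imports System.Windows.Forms

    ' WinINet option that selects the HTTP version to use.
    Private Const INTERNET_OPTION_HTTP_VERSION As Integer = 59

    ' Two DWORDs, the same layout as WinINet's HTTP_VERSION_INFO structure.
    <StructLayout(LayoutKind.Sequential, CharSet:=CharSet.Auto)> _
    Private Class INTERNET_VERSION_INFO
        Public dwMajorVersion As Int32
        Public dwMinorVersion As Int32
    End Class

    ' A class instance passed ByVal is marshaled as a pointer to the structure.
    <DllImport("wininet.dll", SetLastError:=True)> _
    Private Shared Function InternetSetOption(ByVal hInternet As IntPtr, _
        ByVal dwOption As Integer, _
        ByVal lpBuffer As INTERNET_VERSION_INFO, _
        ByVal dwBufferLength As Integer) As Boolean
    End Function

    Private Sub DisableHTTP1_1()
        Dim verInfo As New INTERNET_VERSION_INFO

        ' Ask for HTTP 1.0.
        verInfo.dwMajorVersion = 1
        verInfo.dwMinorVersion = 0

        ' IntPtr.Zero (a NULL handle) means no specific Internet handle is targeted.
        Dim bRet As Boolean
        bRet = InternetSetOption(IntPtr.Zero, INTERNET_OPTION_HTTP_VERSION, _
            verInfo, Marshal.SizeOf(verInfo))

        If bRet Then
            MessageBox.Show("HTTP Version set to 1.0", "Success")
        Else
            MessageBox.Show("HTTP Version option not set!", "Error")
        End If
    End Sub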
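
To show where the call fits, here is a rough sketch of using it right before CreateMHTMLBody.  Treat it as an illustration, not tested production code: the Sub name FetchPageAsMhtml and the example.com URL are made up, and it assumes late binding (Option Strict Off) so that no CDO interop reference is required.

```vbnet
' Hypothetical usage sketch; the Sub name and URL are placeholders.
' Assumes Option Strict Off so the CDO object can be late bound.
Private Sub FetchPageAsMhtml()
    ' Switch to HTTP 1.0 first so the server does not send GZIP content.
    DisableHTTP1_1()

    ' "CDO.Message" is the ProgID of the CDO for Windows 2000 message object.
    Dim msg As Object = CreateObject("CDO.Message")

    ' Arguments: URL, flags (0 = cdoSuppressNone), user name, password.
    msg.CreateMHTMLBody("http://www.example.com/", 0, "", "")
End Sub
```

The order is the whole point: the option has to be set before CDO hands the URL to URLMON, otherwise the request still goes out as HTTP 1.1.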