SSL Troubleshooting for IIS Web Sites contd...

Recently a colleague of mine was working on a customer case that was a
critical-level incident. High-pressure job, huh!

The issue was with SSL not working for one of their web sites. They were
seeing "Page cannot be displayed" when trying to access this site over SSL. It
worked just fine over HTTP.

In the System event log we were seeing this intermittently:

Event Type: Error
Event Source: W3SVC
Event Category: None
Event ID: 1114
Description:
One of the IP/Port combinations for site 'NNNNN' has already been configured
to be used by another program. The other program's SSL configuration will be
used.

We troubleshot this issue for hours without luck :-(. We tried, I guess, all
the steps mentioned here.

Here is everything we tried:

  • Checked the Certificate properties to ensure it was a valid one. It was
    good.
  • Even so, we replaced the current certificate with a new one; still no luck.
  • The customer had all the sites running on different IP addresses. All the
    other sites worked over SSL; only this one did not :-(.
  • We ran SSLDiag which gave a misleading error.
  • We tried running the site on a different SSL port, still no luck.
  • We set up the SecureBindings metabase property for the web site in
    question; still no luck.
  • We ran netstat -ano to check for any other process listening on this port;
    everything looked clean.
  • We disabled all the 3rd party non-MS services, restarted Windows Server in
    selective startup mode, no luck.
  • We installed the Windows Server 2003 Service Pack 1 32-bit Support Tools on
    the server and ran "httpcfg query iplisten". It gave a clean output, with
    no specific IP entries listed.
  • We restarted the IIS/HTTP services umpteen times during the course of
    troubleshooting; no luck whatsoever. We even rebooted a couple of times.
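
The netstat check in the list above is easy to misread by eye on a busy server.
Here is a minimal Python sketch of that step, assuming the usual five-column
layout of "netstat -ano" output (Proto, Local Address, Foreign Address, State,
PID). The function name and the sample text are made up for illustration; they
are not output captured from the customer's server.

```python
# Illustrative sample only, not output from the real server.
SAMPLE_NETSTAT = """
  Proto  Local Address          Foreign Address        State           PID
  TCP    0.0.0.0:80             0.0.0.0:0              LISTENING       4
  TCP    192.168.100.234:443    0.0.0.0:0              LISTENING       4
  TCP    192.168.100.234:1443   10.0.0.5:52311         ESTABLISHED     1716
  UDP    0.0.0.0:443            *:*                                    980
"""


def listeners_on_port(netstat_text, port):
    """Return (local_address, pid) pairs for TCP sockets listening on `port`."""
    found = []
    for line in netstat_text.splitlines():
        fields = line.split()
        # Keep only rows shaped like: TCP <local> <foreign> LISTENING <pid>
        if len(fields) != 5 or fields[0] != "TCP" or fields[3] != "LISTENING":
            continue
        local = fields[1]
        # The port is whatever follows the last colon of the local address.
        if local.rsplit(":", 1)[-1] == str(port):
            found.append((local, int(fields[4])))
    return found


if __name__ == "__main__":
    for local, pid in listeners_on_port(SAMPLE_NETSTAT, 443):
        print(local, "is held by PID", pid)
```

On a real server you would feed it the text saved from "netstat -ano" and look
up each PID it reports to see which process owns the listener.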

Finally, after a few hours of troubleshooting, we decided to run this site on a
different IP address (we had thought of this earlier, but our customer was under
a constraint) and hurray, it worked this time! Everything was now set, but we
still had a lingering question: why, why, why did this site not work on the
original IP address? It had an entry in the Advanced TCP/IP Settings and, to
the best of our knowledge, was valid in every sense.

Finally we figured out that there was a problem with the IIS SSL listener.

To get a list of the IP and port configurations bound to a certificate, run
"httpcfg query ssl". Here is an excerpt from a TechNet article:

The HTTP API enables applications to communicate over HTTP without
using Microsoft Internet Information Services (IIS). Applications can
register to receive HTTP requests for particular URLs, receive HTTP
requests, and send HTTP responses. The HTTP API includes SSL support so
applications can also exchange data over secure HTTP connections
without depending on IIS. It is also designed to work with I/O
completion ports.
....Such meta-information is maintained by the HTTP API in a metastore, and
is used to locate certificates for certificate exchange in HTTPS
sessions.

Below is a sample of a working and a non-working scenario:
------------------------------------------------------------------------------

\Program Files\Support Tools> httpcfg.exe query ssl

Working scenario:

IP                      : 192.168.100.118:443
Hash                    : c96667684997887f 5b889b7b3f737c8c4da5f16
Guid                    : {4dc3e181-e14b-4a21-b022-59fc669b0914}
CertStoreName           : MY
CertCheckMode           : 0
RevocationFreshnessTime : 0
UrlRetrievalTimeout     : 0
SslCtlIdentifier        :
SslCtlStoreName         :
Flags                   : 0

Non-working scenario:

IP                      : 192.168.100.234:443
Hash                    :
Guid                    : {00000000-0000-0000-0000-000000000000}
CertStoreName           : (null)
CertCheckMode           : 0
RevocationFreshnessTime : 0
UrlRetrievalTimeout     : 0
SslCtlIdentifier        : (null)
SslCtlStoreName         : (null)
Flags                   : 0

Here the Hash has the same value as the Thumbprint of your SSL certificate. You
will notice that the Guid is all zeros in the non-working scenario, while the
Hash may either hold some value or be blank. Even if we remove the certificate
from the web site and then run "httpcfg query ssl", the entry whose Guid is all
zeros is still listed. If you see the Guid as
"{00000000-0000-0000-0000-000000000000}", there is a problem.
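
When a server has many SSL bindings, spotting the all-zero Guid by eye gets
tedious. Below is a minimal Python sketch of that check, assuming the output of
"httpcfg query ssl" has been captured as text in the "Key : Value" layout shown
above, with each record starting at its IP line. The function names are
hypothetical, and the sample reuses the two entries from this post.

```python
ZERO_GUID = "{00000000-0000-0000-0000-000000000000}"

# The two entries from this post, laid out as "Key : Value" lines.
SAMPLE_OUTPUT = """
IP                      : 192.168.100.118:443
Hash                    : c96667684997887f 5b889b7b3f737c8c4da5f16
Guid                    : {4dc3e181-e14b-4a21-b022-59fc669b0914}
CertStoreName           : MY
Flags                   : 0

IP                      : 192.168.100.234:443
Hash                    :
Guid                    : {00000000-0000-0000-0000-000000000000}
CertStoreName           : (null)
Flags                   : 0
"""


def parse_ssl_bindings(text):
    """Split captured `httpcfg query ssl` text into one dict per binding."""
    bindings = []
    for line in text.splitlines():
        # Split on the first colon only, so "IP : 1.2.3.4:443" keeps its port.
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank or separator line
        key, value = key.strip(), value.strip()
        if key == "IP":
            bindings.append({})  # an IP line starts a new record
        if bindings:
            bindings[-1][key] = value
    return bindings


def broken_bindings(bindings):
    """Return the IP:Port of every binding whose Guid is all zeros."""
    return [b["IP"] for b in bindings if b.get("Guid") == ZERO_GUID]


if __name__ == "__main__":
    for ip_port in broken_bindings(parse_ssl_bindings(SAMPLE_OUTPUT)):
        print("Suspect SSL binding:", ip_port)
```

The check keys on the Guid alone rather than on an empty Hash, since the Hash
in a broken entry may or may not be blank.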

We need to remove this entry by running the command "httpcfg delete ssl -i
<IP:Port Number>". In the above example, we need to type "httpcfg delete
ssl -i 192.168.100.234:443". Once the entry is removed, we need to reinstall
the certificate on the web site.

Once the certificate is installed, run "httpcfg query ssl" again at the command
prompt to confirm that the Guid is no longer all zeros.
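
If you ever need to script the cleanup, it is worth validating the IP:Port
value before handing it to httpcfg, because a typo here deletes the wrong
binding. The sketch below only builds the argument list for the delete command
quoted above and does not run anything. The function name is hypothetical, and
it assumes IPv4-style "IP:Port" input as used in this post.

```python
import ipaddress


def httpcfg_delete_command(ip_port):
    """Build the argument list for `httpcfg delete ssl -i <IP:Port>`."""
    host, sep, port = ip_port.rpartition(":")
    if not sep:
        raise ValueError(f"expected IP:Port, got {ip_port!r}")
    ipaddress.ip_address(host)  # raises ValueError on a malformed address
    if not port.isdigit() or not 0 < int(port) < 65536:
        raise ValueError(f"invalid port in {ip_port!r}")
    return ["httpcfg", "delete", "ssl", "-i", ip_port]


if __name__ == "__main__":
    print(" ".join(httpcfg_delete_command("192.168.100.234:443")))
```

The returned list could be passed to subprocess.run on a server where
httpcfg.exe is on the PATH (an assumption about your setup), followed by
reinstalling the certificate and re-running "httpcfg query ssl" to verify.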

This fixed the issue for the web site on the failing IP address.

Hope this helps someone.

Till next time, Cheers!