In this series of posts, we’re going to take a look at some of the tools we at Microsoft Developer Support -- Database commonly use, how to use them, and the types of issues they can be used to resolve.
In this article, Adam Saxton starts down the long road of double-hop troubleshooting by providing a gentle introduction to Kerberos, and how to recognize Kerberos+SQL related issues. Future articles will move beyond identification, and discuss the tools and processes we use to troubleshoot such issues.
We receive numerous calls where the problem falls under Kerberos. This will present itself when trying to establish a connection to SQL Server (or a Linked Server). I’m going to discuss what Kerberos is, how to recognize when there is a problem with it, and go over some of the items that need to be looked at in order to resolve a Kerberos issue with SQL Server. As there is a lot of information on this topic, I’ll be splitting it up over a few blog posts.
What is Kerberos?
Kerberos is a secure method for authenticating a request for a service in a computer network, and is used to verify a user or host identity. Kerberos lets a user request an encrypted "ticket" from an authentication process that can then be used to request a particular service from a server. The user's password does not have to pass through the network.
Put simply, it is a more secure form of Windows Authentication versus NTLM. Also, understand that Kerberos is not a Windows specific technology. Windows will always try to use Kerberos first over TCP. If that doesn’t work, it will drop down to NTLM. The exception to this is if you are trying to access a resource that is local. NTLM will always be used when accessing local resources. Also note that Kerberos cannot currently be used over Named Pipes or Shared Memory.
How does Kerberos work?
There are a lot of details into how Kerberos works. I’m not going to get that deep into it within this blog series. I’m going to relate this to how it works with SQL. To do that, I’m going to use the classic example of a Double-hop with a Web Server. We use the work Double-hop to explain that the IIS Server is using a resource that is located on a different server. In this case the first “hop” is from the web browser client to the IIS Server (ASP or ASPX Page); the second hop is to the SQL Server on a different machine. The server on the second hop is also referred to as the Backend Server.
In order to successfully establish a connection using Kerberos, a SQL ServicePrincipalName (SPN) needs to exist within Active Directory. I will talk more about SPN’s and where they are located in a later post as the topic is complex enough to deserve its own. I would say about 80-90% of calls we receive relating to Kerberos involve either missing SPN or duplicate SPN’s.
When you log onto your machine, you will get what we call a Kerberos Ticket. To be more specific, you get a Ticket-Granting Ticket (TGT). You use the TGT as a master ticket to access all Kerberos services on a network. A Kerberos ticket includes all the user credentials or computer credentials in an encrypted format. These credentials are used to identify a specific user or a specific computer on a network for access to Kerberos services. When you access a Kerberos service, that service uses the ticket that you received to authenticate you. After the service verifies your identity and authenticates you, the service issues a service ticket. This is where the SPN’s come into play. Think of the SPN as a pointer for Kerberos so it knows where to go. That’s why, if it’s missing or there are duplicates, it doesn’t know what to do.
How to recognize a Kerberos error with SQL Server:
There are a few errors that we see where we can make an educated guess that a connectivity issue is Kerberos related.
Cannot Generate SSPI Context
Login failed for user ‘(null)’
Login failed for user ‘NT AUTHORITYANONYMOUS’
Those are the three biggest errors we see in relation to Kerberos with SQL Server. These errors occur when the credentials from the first hop cannot be passed to the second hop. Usually, this is because we are failing over to NTLM or our SPN configuration is not correct. There are also other settings within Active Directory that come into play as well as certain machine configuration based on how services are setup. For now, let’s focus on the basics, so I’ll get more into those setting and machine configurations in a later post.
One of the last things I’d like to mention is that I’ve seen customers get the following error confused with a Kerberos issue:
Login failed for user ‘SomeUser’
The SomeUser could be whatever user (SQL or Windows) that you are trying to connect with. If you see the actual username in the error that means the credentials actually got to the SQL Server. The issue at that point lies in the SQL Security configuration itself (either the server as a whole, or the database you are trying to connect to). This is not a Kerberos related issue, but instead is a much simpler login failure.
Posted By: Adam Saxton