Issues with SQL Server Login Failures and dependency on Windows Password Policy Check

Good afternoon techies! I recently worked with some customers on SQL Server 2005 login failure issues. Thought I’d make a blog post on what exactly we do @ PSS in such situations and specifically share my thoughts on login failure issues relating with password policy checks.

Symptoms

Typical symptoms you would notice are these:-

a) Your SQL Server logins start failing

b) SA account is unable to login

c) Any SQL Agent jobs for which a SQL login is the owner start failing.

d) Users/Application which use SQL authenticated logins are experiencing login failures.

For e.g., you might notice these error messages in the job history,

Message: The job failed. Unable to determine if the owner (sa) of job ITALIAN_JOB has server access (reason: Unable to connect to server - check SQL Server and SQL Server Agent errorlogs).

If you look up the SQL Error logs, you should see this,

2009-09-29 11:38:35.65 Logon Error: 18456, Severity: 14, State: 10.
2009-09-29 11:38:35.65 Logon Login failed for user 'SqlLogin'. [CLIENT: 172.30.199.199]

Let me tell you something more about “State: 10”. This state is reached while evaluating the password policy checks. So any SQL Login which has is_policy_checked=1, are eligible for state 10 failures.

You might ask me as to how does this affect your SQL Logins??

It does! If the SQL Server service account gets locked out on the domain controller, all SQL Server authenticated logins which use domain password policy enforcement feature (CHECK_POLICY) will be unable to login to the SQL Server until the service account is "unlocked".

And more importantly check_policy is by default ON for any SQL login you create, unless you explicitly turn it off.

 

So how does this check_policy work?

SQL Server after all is an application and it has to rely on the exposed Windows API’s to do work for it when it comes to any external authentication. This is implemented by call to WinAPI NetValidatePasswordPolicy to implement password complexity, history, lockout etc. Now, in SQL the thread which calls this API runs under the context of the SQLSvc account. So, if the SvcAccount is locked out, this operation fails & hence the login to SQL fails. Make sense ?

If you look up the documentation for the NetValidatePasswordPolicy API, it does 3 types of validation

NetValidateAuthentication (for checking password expiration and account lockout policy)

NetValidatePasswordChange (password validation when change password is done)

NetValidatePasswordReset (password validation during when password reset is done. Can also reset the lockout state)

Next up, check if all the accounts which owned jobs or experienced login failures have the check_policy turned ON. You can verify all the logins which have the check_policy turned on running the query,

select * from sys.sql_logins where is_policy_checked = 1 

 

name

type_desc

is_disabled

is_policy_checked

is_expiration_checked

SA

SQL_LOGIN

0

1

0

SQLUSER1

SQL_LOGIN

0

1

0

 

More importantly, since SQL 2005 SP2, we have added new ring buffer entries (sys.dm_os_ring_buffers) for various security errors. For more info on this read RDorr’s blog at https://blogs.msdn.com/psssql/archive/2008/03/24/how-it-works-sql-server-2005-sp2-security-ring-buffer-ring-buffer-security-error.aspx

So, lets query the security ring buffer for more information,

select * from sys.dm_os_ring_buffers where ring_buffer_type = 'RING_BUFFER_SECURITY_ERROR'

The Record column will contain the API Name as well as any ErrorCode that was returned by the API.

Sample output with error

RING_BUFFER_SECURITY_ERROR 3435668357 <Record id = "14" type ="RING_BUFFER_SECURITY_ERROR" time ="3435668357">

<Error>
<SPID>67</SPID>
<APIName>NetValidatePwdPolicy</APIName>
<CallingAPIName>CAPIPwdPolicyManager::ValidatePwdForLogin</CallingAPIName>
<ErrorCode>0x8</ErrorCode>
</Error>

As you can see a WinAPI call to NetValidatePwdPolicy is made to validate the password and there was a failure there. This is returned by GetLastError WinAPI. In the above logs the error returned was "0x8"

 

0x8 -> 8 (decimal) -> Not enough storage is available to process this command.

Interesting you should see this here. This means your system event logs should certainly have some kind of Kerberos/netlogon errors reported. Here is a sample,

11/11/2009 2:32:11 PM NETLOGON Error None 5719 N/A SFRDBC2 "This computer was not able to set up a secure session with a domain controller in domain LITTLER-US due to the following: Not enough storage is available to process this command.  

This may lead to authentication problems. Make sure that this computer is connected to the network. If the problem persists, please contact your domain administrator.

11/11/2009 2:32:11 PM Kerberos Error None 7 N/A SFRDBC2 The kerberos subsystem encountered a PAC verification failure. This indicates that the PAC from the client _besadmin in realm STARWARS.COM had a PAC which failed to verify or was modified. Contact your system administrator

These are some of the things you can do:-

1) Verify if the service account is locked out. If it is, then have your sysadmin unlock it. But you still need to unlock your SQL account or restart sql service for them to work again. Also thinking ahead you might want to audit and find out who/why locked out the service account.

Now this can happen for any of the following reasons :-

 

a) Invalid attempts (by someone or some application) using the SQLSvc account to login to Windows.

b) SvcAccount is disabled on the DC.

c) Password has expired for the SvcAccount etc.

2) If service account is NOT locked out and you are experiencing errors similar to ones above, the quick fix for this issue, is to disable CHECK_POLICY for the SQL Logins

In case you this is not an option and you want to avoid such issues in the future, we have introduced a new trace flag T4614. Trace flag 4614 when enabled allows SQL Server authenticated logins (eg. SA) that use Windows domain password policy enforcement (check_policy = ON) to log on to the instance even though the SQL Server service account is locked out or disabled on the Windows domain controller.

Note: this was introduced in build 2005.90.2194 and can be enabled as a dynamic trace flag using DBCC TRACEON (4614,-1). Read more about it at https://support.microsoft.com/default.aspx?scid=kb;EN-US;925744

Here is a sample script I wrote using which you can change disable check_policyfor ALL SQL logins (please test before use).

DECLARE @name sysname
DECLARE @tmpname nvarchar (1024)
set @tmpname = ''
DECLARE login_curs CURSOR FOR
select name from sys.sql_logins where is_policy_checked = 1;

OPEN login_curs
fetch next from login_curs into @name
WHILE @@fetch_status = 0
BEGIN
print 'Altering property CHECK_POLICY for Login: ' + @name
set @tmpname = N'ALTER LOGIN '+@name+' WITH CHECK_POLICY = OFF'
exec sp_executesql @tmpname
FETCH NEXT FROM login_curs INTO @name
END

CLOSE login_curs
DEALLOCATE login_curs
GO

3) Get your System/Active Directory administrator to look into the Kerberos warnings/failures to see what the issue is.

I hope this helped clear up some of the concepts and helps you troubleshoot Login failures in SQL. As always, stay tuned for more. Cheers!

Sudarshan Narasimhan
Technical Lead, Microsoft SQL Server