People Picker and User Account Resolution in Domains from Different Forests

Ever run into a situation where the People Picker in SharePoint fails to resolve usernames that belong to a domain in a completely different forest? And you assumed the trust relationships were set up properly? Well, think again. Here is a quick checklist:

  • Ensure your people picker property is configured correctly in SharePoint.
  • Configure your trust relationships properly.
  • Ensure the ports required for inter-server communication are open. A list can be found <insert hyperlink>
  • Ensure your DNS configuration is correct. This is especially important because the web server needs to locate the Global Catalog servers and the Domain Controllers in the source and target domains.

Each of the above points is a big task in itself. A failure in any one of these components will cause People Picker to fail. To allow SharePoint to query the AD of a different domain, you need to configure it to use a specific account from the trusted domain. Here’s how you do that using the STSADM command line:

  • stsadm -o setapppassword -password ********
    stsadm -o setproperty -pn peoplepicker-searchadforests -pv "domain:FQDN of trusted domain,Account in trusted domain,Password" -url URL of App
    Recycle the AppPool for the changes to take effect (optional)
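If you script this step, a minimal Python sketch like the one below can assemble the two stsadm calls as argument lists, which sidesteps the smart-quote and dash problems that creep in when commands are copied from a web page. The function name and every value in the example call (contoso.com, the service account, the URL) are hypothetical placeholders; only the stsadm operations and switches come from the commands above.

```python
def build_people_picker_commands(trusted_fqdn, account, password,
                                 web_app_url, app_password):
    """Return the two stsadm argument lists shown above (hypothetical helper)."""
    # First call: the setapppassword operation from the text.
    set_app_password = ["stsadm", "-o", "setapppassword",
                        "-password", app_password]

    # Second call: the property value is "domain:<FQDN>,<account>,<password>".
    property_value = f"domain:{trusted_fqdn},{account},{password}"
    set_property = ["stsadm", "-o", "setproperty",
                    "-pn", "peoplepicker-searchadforests",
                    "-pv", property_value,
                    "-url", web_app_url]
    return [set_app_password, set_property]


if __name__ == "__main__":
    # All values below are made-up placeholders.
    for args in build_people_picker_commands(
            "contoso.com", "CONTOSO\\svc_picker", "P@ssw0rd",
            "http://portal.example.com", "app-key"):
        print(args)
```

Each list can be handed to subprocess.run on the SharePoint server; passing a list rather than a single string means the property value needs no extra quoting.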

Alright. Now that we have the basic checklist, we need to know how this works normally before we can troubleshoot any issues. In other words, unless you know what is normal, you cannot spot the abnormal.

Here’s a description of how PeoplePicker works in SharePoint.

The web server contacts one of the DCs in its domain and requests a SID lookup using the LsarLookupNames4 method, which translates a batch of security principal names to their SID form. This traffic is encrypted, and the web server and domain controller talk via RPC. The RPC endpoint mapper interface has the UUID E1AF8308-5D1F-11C9-91A4-08002B14A0FA. Because this call is initiated from LSASS, the LSARPC interface identifier is 12345778-1234-ABCD-EF00-0123456789AB. You should see both of these in a network trace. A successful request/response pair indicates that RPC communication is working.
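Before capturing a trace, it can help to confirm that the web server can reach the RPC endpoint mapper on the DC at all. The sketch below is a plain TCP connect test, assuming the endpoint mapper listens on its well-known TCP port 135. It says nothing about the dynamic port the endpoint mapper hands out afterwards, or about whether the LSARPC bind succeeds. The helper name and the host in the usage note are hypothetical.

```python
import socket

# Well-known TCP port of the RPC endpoint mapper.
RPC_ENDPOINT_MAPPER_PORT = 135


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

For example, can_connect("dc01.contoso.com", RPC_ENDPOINT_MAPPER_PORT) returning False points at a firewall or DNS problem rather than at SharePoint.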

So with the LsarLookupNames4 call, we should get a SID. The next thing that happens is an LDAP query that looks up this SID and checks whether the name matches what the user entered. For this to work, Kerberos traffic must be flowing properly. If Kerberos is working, you should see that traffic just before the LDAP query, with the username that you configured within SharePoint. After Kerberos authentication, the SharePoint server sends the LDAP query to one of the DCs in the trusted domain and does a search, something like:

LDAP:Search Request, MessageID: 26, BaseObject: DC=SharePoint,DC=com, SearchScope: WholeSubtree, SearchAlias: neverDerefAliases
LDAP:Search Result Entry, MessageID: 26, Status: Success

A filter is also passed to indicate that the search is based on the SID. Filter: (&(objectSID=))
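If you want to reproduce that SID-based search by hand with an LDAP tool, the SID has to be supplied in its binary form with each byte escaped as a backslash and two hex digits. The sketch below shows one way to build such a filter from a string SID. The function names are hypothetical, and the example uses the well-known SID S-1-5-32-544 (BUILTIN\Administrators) rather than a real user.

```python
import struct


def sid_to_bytes(sid):
    """Convert a string SID such as 'S-1-5-32-544' to its binary form."""
    parts = sid.split("-")
    if len(parts) < 3 or parts[0] != "S":
        raise ValueError(f"not a SID string: {sid!r}")
    revision = int(parts[1])
    authority = int(parts[2])
    sub_authorities = [int(p) for p in parts[3:]]
    # Layout: revision (1 byte), sub-authority count (1 byte),
    # identifier authority (6 bytes, big-endian),
    # then each sub-authority as a 4-byte little-endian integer.
    header = struct.pack("BB", revision, len(sub_authorities))
    header += authority.to_bytes(6, "big")
    body = b"".join(struct.pack("<I", s) for s in sub_authorities)
    return header + body


def sid_filter(sid):
    """Build an LDAP filter that matches objectSid against the binary SID."""
    escaped = "".join(f"\\{b:02x}" for b in sid_to_bytes(sid))
    return f"(objectSid={escaped})"


if __name__ == "__main__":
    print(sid_filter("S-1-5-32-544"))
```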

The search result contains the properties requested for the user including the user’s SID. If everything matches, then we are done and the user’s full name should be displayed.

So that’s how it is “expected” to work. But most of the time, a support engineer looking at the problem will not find the above traffic. Instead, they are looking at traffic from the broken scenario, and there may be several reasons why the feature is not able to find the user. For example:

  • What if the trust relationship is not set up properly? Can we verify that using a network trace?
  • What if MSRPC is broken? Can we determine that using a network trace?
  • What if the DNS entries are not set up properly? Can we determine that using a network trace?

Well, the answer is “it depends”. A lot of the time you can draw a sound conclusion from what you see in the network trace, provided you have domain-specific knowledge.

  • If you never see the MSRPC bind requests getting a success response, chances are that the trust is not set up properly.
  • If you do not see Kerberos traffic, or you do not see a connection made with the username specified in SharePoint, then your SharePoint configuration is probably not correct.
  • If you see DNS related errors in the network trace (filter by DNS traffic), then your DNS is probably broken and needs to be fixed.
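For the DNS case, it often helps to query the locator records directly from the web server rather than waiting for them to show up in a trace. The sketch below only builds the SRV record names that Windows clients use to find Domain Controllers, Kerberos KDCs and Global Catalog servers, and prints matching nslookup commands. The function names and the contoso.com names are hypothetical placeholders.

```python
def locator_srv_names(domain_fqdn, forest_fqdn):
    """Return the DNS SRV record names used to locate DCs, KDCs and GCs."""
    return {
        "domain controllers": f"_ldap._tcp.dc._msdcs.{domain_fqdn}",
        "kerberos KDCs": f"_kerberos._tcp.{domain_fqdn}",
        "global catalogs": f"_gc._tcp.{forest_fqdn}",
    }


def nslookup_commands(domain_fqdn, forest_fqdn):
    """Return one nslookup command line per SRV record to check."""
    names = locator_srv_names(domain_fqdn, forest_fqdn)
    return [f"nslookup -type=SRV {name}" for name in names.values()]


if __name__ == "__main__":
    # Placeholder names: substitute the trusted domain and its forest root.
    for command in nslookup_commands("child.contoso.com", "contoso.com"):
        print(command)
```

If any of these lookups fails when run on the SharePoint web server for the trusted domain, fix name resolution (conditional forwarders, stub zones or similar) before looking further.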

Obviously, what needs to be fixed depends on what is broken. No matter what works and what does not, at the end of the day, when you perform a Check Name operation within People Picker, the user must be matched with a SID. To do that, SharePoint goes to great lengths. SharePoint will attempt a query based on Person or Group and also perform a wildcard search. Here’s an example of the filters that you may see in the network trace for the LDAP queries:

First Attempt:

filter: (objectCategory=person)
filter: (objectClass=user)
filter: (!(BIT_AND: (userAccountControl)&2))
filter: (|(name=Sharepoint\Skumar)(displayName=Sharepoint\Skumar)(cn=Sharepoint\Skumar)(mail=Sharepoint\Skumar)(samAccountName=Sharepoint\Skumar)(proxyAddresses=SMTP:Sharepoint\Skumar)(proxyAddresses=sip:Sharepoint\Skumar))

filter: (objectCategory=group)
filter: (BIT_AND: (groupType)&2147483648)
filter: (|(name=Sharepoint\Skumar)(displayName=Sharepoint\Skumar)(cn=Sharepoint\Skumar)(samAccountName=Sharepoint\Skumar))

Result: None

Second Attempt:

filter: (objectCategory=person)
filter: (objectClass=user)
filter: (!(BIT_AND: (userAccountControl)&2))
filter: (|(name=Sharepoint\Skumar*)(displayName=Sharepoint\Skumar*)(cn=Sharepoint\Skumar*)(mail=Sharepoint\Skumar*)(sn=Sharepoint\Skumar*)(SamAccountName=Skumar*)(proxyAddresses=SMTP:Sharepoint\Skumar)(proxyAddresses=sip:Sharepoint\Skumar))

filter: (objectCategory=group)
filter: (BIT_AND: (groupType)&2147483648)
filter: (|(name=Sharepoint\Skumar*)(displayname=Sharepoint\Skumar*)(cn=Sharepoint\Skumar*)(SamAccountName=Skumar*))

Chances are that you will get a response back from the wildcard search. That is what happened in the second case on my machine: the OR search matched a record on SamAccountName=SKumar*, whereas the first attempt found nothing with SamAccountName=SharePoint\SKumar. However, right after that, the system picks the SID from the response (if any) and attempts a match on that SID. If that fails, the Check Name operation throws an error saying that it could not find the user. So the key to getting this to work is to ensure that the SID lookup succeeds.
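To replay these searches outside SharePoint, the decoded filter lines from the trace need to be written as a single LDAP filter string. The sketch below is one way to do that for the user query, under two assumptions that are mine rather than confirmed by the trace: that the separate filter lines are combined with AND, and that only a subset of the attributes is needed to illustrate the point. The BIT_AND lines correspond to the matching rule OID 1.2.840.113556.1.4.803. The function names are hypothetical.

```python
# OID of the bitwise-AND matching rule shown as BIT_AND in the trace.
BIT_AND = "1.2.840.113556.1.4.803"


def escape_filter_value(value):
    """Escape characters that are special inside an LDAP filter value."""
    replacements = {"\\": r"\5c", "*": r"\2a", "(": r"\28",
                    ")": r"\29", "\x00": r"\00"}
    return "".join(replacements.get(ch, ch) for ch in value)


def user_filter(term, wildcard=False):
    """Approximate the user search from the trace (simplified attribute set)."""
    suffix = "*" if wildcard else ""
    full = escape_filter_value(term) + suffix
    # Mirrors the trace: the wildcard attempt drops the domain prefix
    # for sAMAccountName, the exact attempt does not.
    account = term.split("\\", 1)[-1]
    sam = (escape_filter_value(account) + suffix) if wildcard else full
    clauses = [f"({attr}={full})"
               for attr in ("name", "displayName", "cn", "mail")]
    clauses.append(f"(sAMAccountName={sam})")
    # Assumption: the individual filter lines are ANDed together.
    return ("(&(objectCategory=person)(objectClass=user)"
            f"(!(userAccountControl:{BIT_AND}:=2))"
            f"(|{''.join(clauses)}))")


if __name__ == "__main__":
    print(user_filter("Sharepoint\\Skumar"))
    print(user_filter("Sharepoint\\Skumar", wildcard=True))
```

The backslash in the typed name is escaped as \5c, which makes it easy to see why the exact attempt cannot match a sAMAccountName that holds only the bare account name.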

So what tools are there to check if we can resolve the SIDs?

Microsoft Support uses PsGetSid from Sysinternals. It is one of the tools you can use to verify that SID lookup is working properly in your environment, and it is really easy to use. From a command line, run: PSGetSid <domain\username>.

If this tool fails to get the SID, ignore the SharePoint part and focus on fixing your environment first.
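If you need to run the PsGetSid check against many accounts, a small wrapper can help. The sketch below assumes the executable is on the PATH under the name psgetsid, and it does not rely on the exact layout of the tool's output: it simply picks out the first thing that looks like a SID string. The function names are hypothetical.

```python
import re
import subprocess

# Matches SID strings such as S-1-5-21-...-1105 anywhere in a block of text.
SID_PATTERN = re.compile(r"\bS-1-\d+(?:-\d+)+\b")


def extract_sids(text):
    """Return every SID-looking string found in text."""
    return SID_PATTERN.findall(text)


def lookup_sid(account, tool="psgetsid"):
    """Run PsGetSid for 'domain\\username' and return the first SID, or None.

    Assumes the tool is on the PATH; raises FileNotFoundError if it is not.
    """
    completed = subprocess.run([tool, account], capture_output=True,
                               text=True, check=False)
    sids = extract_sids(completed.stdout)
    return sids[0] if sids else None
```

A None result for an account in the trusted domain is the signal to stop looking at SharePoint and start looking at the trust, DNS and RPC connectivity.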