MOSS 2007 | Search Query Mechanism | Part 1

This post is “Part One” of a series describing the high-level architecture of the search query mechanism in MOSS 2007. Its purpose is to explain what really happens in the background when a user performs a SharePoint search. The diagram below gives a pictorial view of the high-level communication between IIS, MSSearch.exe, the Search DB and the Full-Text Index.

[Diagram: high-level communication between IIS, MSSearch.exe, the Search DB and the Full-Text Index]

Below is a breakdown of the six stages.

Stage 1:

The first thing needed is the search query itself. The Web Part or the Web Service builds the search query and passes it to the Query Object Model (OM). This step happens between the client machine that submits the query and the web front end (WFE). A network trace shows the following GET request and its response:

Request: 

GET /searchcenter/Pages/Results.aspx?k=sharepoint&s=All%20Sites HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*
Accept-Language: en-us
UA-CPU: x86
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; SV1; .NET CLR 1.1.4322)
Host: wfe
Connection: Keep-Alive
Authorization: NTLM lRMTVNTUAADAAAAGAAYAHQAAAAYABgAjAAAAA4ADgBIAAAAGgAaAFYAAAAEAAQAcAAAAAAAAACkAAAABYKIogUCzg4AAAAPQwBPAE4AVABPAFMATwBBAGQAbQBpAG4AaQBzAHQAcgBhAHQAbwByAEQAQwApEBdASjCmUQAAAAAAAAAAAAAAAAAAAABd0mXyd+vSlujhEVc5rpVdPtQYe1WTXa0=
Cookie: WSS_KeepSessionAuthenticated=80; MSOWebPartPage_AnonymousAccessCookie=80

Note: In the GET request above, ‘k’ is the keyword that was searched for and ‘s’ is the scope.
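
When reading traces like this one, the two parameters can be pulled out of the request line with Python's standard library. This is only a small sketch: the request target is copied from the GET line above, and the variable names are made up for the example.

```python
from urllib.parse import urlsplit, parse_qs

# Request target copied from the GET line in the trace above.
request_target = "/searchcenter/Pages/Results.aspx?k=sharepoint&s=All%20Sites"

# Split off the query string and decode it into a dict of lists.
params = parse_qs(urlsplit(request_target).query)

keyword = params["k"][0]  # the search keyword
scope = params["s"][0]    # the search scope, with %20 decoded to a space

print(keyword)  # sharepoint
print(scope)    # All Sites
```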

Response:

HTTP/1.1 200 OK
Cache-Control: private, max-age=0
Content-Length: 76654
Content-Type: text/html; charset=utf-8
Expires: Tue, 08 Dec 2009 22:33:34 GMT
Last-Modified: Wed, 23 Dec 2009 22:33:34 GMT
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
MicrosoftSharePointTeamServices: 12.0.0.6318
X-AspNet-Version: 2.0.50727
Set-Cookie: WSS_KeepSessionAuthenticated=80; path=/
Set-Cookie: MSOWebPartPage_AnonymousAccessCookie=80; expires=Wed, 23-Dec-2009 23:03:34 GMT; path=/
Set-Cookie: http%3A%2F%2Fwfe%2FSearchCenter%2FDiscovery=WorkspaceSiteName=U2VhcmNo&WorkspaceSiteUrl=aHR0cDovL3dmZS9TZWFyY2hDZW50ZXI=&WorkspaceSiteTime=MjAwOS0xMi0yM1QyMjozMzozNQ==; expires=Fri, 22-Jan-2010 22:33:35 GMT; path=/_vti_bin/Discovery.asmx
Date: Wed, 23 Dec 2009 22:33:34 GMT
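
The three values in the Discovery cookie of the response look like Base64. The sketch below decodes them, assuming they are plain Base64-encoded UTF-8 strings; the trace itself does not say so, and the variable names are only for illustration.

```python
import base64

# Value of the Discovery cookie, copied from the Set-Cookie header above.
cookie_value = (
    "WorkspaceSiteName=U2VhcmNo"
    "&WorkspaceSiteUrl=aHR0cDovL3dmZS9TZWFyY2hDZW50ZXI="
    "&WorkspaceSiteTime=MjAwOS0xMi0yM1QyMjozMzozNQ=="
)

decoded = {}
for pair in cookie_value.split("&"):
    # Split on the first '=' only, because Base64 padding also uses '='.
    name, _, value = pair.partition("=")
    # Assumption: each value is Base64-encoded UTF-8 text.
    decoded[name] = base64.b64decode(value).decode("utf-8")

for name, value in decoded.items():
    print(f"{name}: {value}")
```

Under that assumption the values decode to the site name, the Search Center URL and a timestamp matching the response time.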

Stage 2:

The Query Object Model in turn calls the Query Processor. The role of the Query Processor is to join results from the Full-Text Index with the Search DB. On the SQL Server, two databases are queried:

· SSP database.

· SSP Search database.

On the SSP database, proc_MSS_GetKeywordInformation is run to get the keywords to display on the site if Office SharePoint Search is being used. On the Search database, proc_MSSGetMultipleResults is run to get the properties from the MSSDocProps table. Both stored procedures, and more details about them, can be seen in a SQL trace.

Stage 3:

The Query Processor then opens a query pipe to the query machine. The link between the Web Front End and the query server is a query pipe that only MSSearch.exe provides. When the query pipe's network traffic is examined, it turns out to be SMB traffic; Network Monitor 3 identifies it as the CIS protocol. A high-level overview of what a Network Monitor capture shows follows. Communication happens between the WFE and the query server: the WFE tries ports 139 and 445 on the query server and finally establishes the connection over SMB.

In the network capture you will also notice a 'Tree Connect AndX Request' for the path \\QueryServer\IPC$ from a random port on the WFE to port 445 on the query machine. Port 139 is the NetBIOS session service port, the older transport for SMB. Once the tree connection is established, the path is changed to \OSearch in order to query the flat files in the query server's file system. Note, however, that the default 'searchindexpropagation' share, which is used to propagate the index from the index server to the query server, is not used at this point to retrieve the query responses.
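
Since this stage depends on the WFE reaching the query server on ports 139 and 445, a quick reachability check can help when troubleshooting. The sketch below only tests whether a TCP connection can be opened; it does not speak SMB, and the function names are made up for this example.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable or unresolvable host.
        return False


def check_smb_ports(host: str) -> dict:
    """Probe the two ports mentioned above (139 and 445) on the given host."""
    return {port: is_port_open(host, port) for port in (139, 445)}
```

Run from the WFE, check_smb_ports("QueryServer") returns a dictionary mapping each port to True or False, where QueryServer stands for the name of your query machine.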

Stage 4:

The Indexer Plug-in (see the diagram) on the query machine retrieves the results from the index. The Indexer Plug-in is the only part of Query that accesses the Full-Text Index. Later posts in this series will cover this in more detail.

Stage 5:

Now for the part where the results are returned. The results come back from the Full-Text Index as DocIds. This task is based on the data that the web front end receives back from the query and SQL servers.

Stage 6:

The end result is the work of the Query Processor (see the diagram above). The Query Processor joins the results from the Search DB and the Full-Text Index: it takes the DocIds and joins them against the Search DB to get the document properties (title, display URL, document format, size, etc.). This task is based on the data that the web front end receives back from the query and SQL servers.
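
The join described in this stage can be pictured with a small Python sketch. This is not how the Query Processor is implemented; it only illustrates the shape of the operation. The DocIds, property values and the join_results function are all hypothetical.

```python
# Hypothetical DocIds returned by the Full-Text Index, already in rank order.
doc_ids = [42, 7, 19]

# Hypothetical property rows fetched from the Search DB, keyed by DocId.
doc_props = {
    7: {"Title": "Team Site", "DisplayUrl": "http://wfe/sites/team", "Size": 2048},
    19: {"Title": "Budget.xls", "DisplayUrl": "http://wfe/docs/Budget.xls", "Size": 51200},
    42: {"Title": "SharePoint Guide", "DisplayUrl": "http://wfe/docs/Guide.doc", "Size": 10240},
}


def join_results(doc_ids, doc_props):
    """Attach properties to each DocId, keeping the rank order from the index."""
    results = []
    for doc_id in doc_ids:
        props = doc_props.get(doc_id)
        if props is None:
            continue  # no property row for this DocId, so skip it
        results.append({"DocId": doc_id, **props})
    return results


for row in join_results(doc_ids, doc_props):
    print(row["DocId"], row["Title"], row["DisplayUrl"], row["Size"])
```

The index supplies the order, and the database supplies what is shown for each hit.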