Search 2010 Architecture and Scale – Part 2 Query


Several things have changed in SharePoint 2010 Query.   The Query infrastructure is now componentized, so you only provision what you need.   This post defines the query components, explains how they work together, and shows how to provision them.


Special shout-out goes to Jon Waite for his valuable technical input\review…



Query Basics


Just like crawl, Query has been componentized, and the following goals are met:



  • Sub-second query latency

  • Index is no longer a single point of failure and is stored on Query servers

  • Query consists of components which can be scaled out among multiple servers to improve performance

 



Query Flow



  1. A search is performed by a user

  2. The WFE serving the call uses the associated search service application proxy to connect to a server running the Query and Site Settings Service also known as the Query Processor.  It uses WCF for this communication.

  3. The QP connects to the following components to gather results, merges and security trims them, and returns the results to the WFE:




      • Query Component – holds entire index or partition of an index

      • Property Store DB – holds metadata\properties of indexed content

      • Search Admin DB – holds Security Descriptors\Configuration data

  4. WFE displays search results to the user


 


Several Query components can be scaled out as an index\property store grows.  A single search service application can have multiples of the following:





      • Property Store DB

      • Query Components

      • Query Processors


 


Query Component and Property Store DB



I’ll use Query Server and Query Component interchangeably throughout this post.  A Query Server is a server that runs one or more Query Components.   These servers hold all or part of the search index, and they are now the sole owners of the index on the file system.   As stated in the previous post, the indexer crawls content and builds a temporary index.  The indexer then propagates portions of the temporary index to the Query Servers to be indexed.  The full or partial copy of the index held by a Query Server is referred to as an Index Partition, and Query Components run under the context of an Index Partition.   Query Components are responsible for serving search queries and run under MSSearch.exe.   A Query Component is mapped to only one Property Store DB.

By now, you should have noticed that the databases have been split up (for example, Property Store DB and Crawl DB).   Separating these databases accomplishes the following:





      • Overall Database size is reduced

      • Database performance is improved

 


Also, by carving out the databases, performance hits like writing crawled data to the Crawl Store DB won’t affect query performance, which depends heavily on the Property Store DB.


It’s possible to provision multiple Property Store databases and Query Components for a single Search service application.  There are several reasons for doing this, and most of them are explained throughout this post.   Query Components can be provisioned to partition an index and\or mirror an index in order to provide fault tolerance.  Both Property Store databases and Query Components can be created by using either Central Administration or PowerShell.  To simplify things a bit, I’ll cover how to do it in Central Administration.  To make changes to the Search topology, access the Search Administration page via the following path:


 



Central Administration\Application Management\Manage Service Applications\Select the Search Service Application and select Manage from the Ribbon


Scroll to the bottom of the page to view\change the search topology.




 


Provisioning happens in 3 stages:



  1. Click the Modify button

  2. Select New Property Database or Query Component and enter the appropriate options

  3. Apply Topology Changes


 


Fault tolerance + Performance


 



Query Component (Fault tolerance)


It’s highly recommended to make your index fault tolerant.   This is accomplished by mirroring a Query Component onto a different server.   Under the Search Application Topology, select the Query Component and choose Add mirror.


The end result is a second query component within the same Index Partition.





Note:  The Query Processor will distribute requests across both Query Components. 




 


Question: What if I don’t want queries being served by one of my mirrored Query Components?


Answer:  On the Add mirror query component page, you can mark the mirror as failover only.



This doesn’t prevent the failover Query Component from ever receiving queries.  The Query Processor prefers active Query Components, that is, those not marked as failover.  If all active Query Components are down, the Query Processor submits requests to the Query Components flagged as failover.
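The preference described above can be sketched in a few lines of Python. This is only an illustration of the selection rule, not SharePoint code: the `QueryComponent` class, its fields, and the component names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class QueryComponent:
    # Hypothetical model of a Query Component inside one Index Partition.
    name: str
    failover_only: bool = False
    online: bool = True


def eligible_components(components):
    """Prefer online active components; use failover ones only if no active one is up."""
    active = [c for c in components if c.online and not c.failover_only]
    if active:
        return active
    return [c for c in components if c.online and c.failover_only]


partition = [
    QueryComponent("QC-1"),
    QueryComponent("QC-1-mirror", failover_only=True),
]

# While the active component is up, the failover mirror is not used.
before_outage = [c.name for c in eligible_components(partition)]

# Once every active component is down, the failover mirror takes the queries.
partition[0].online = False
during_outage = [c.name for c in eligible_components(partition)]

print(before_outage, during_outage)
```

With these sample values the first list contains only the active component and the second only the failover mirror.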



 


Property Store (Fault tolerance)



We fully support SQL mirroring to achieve fault tolerance for Property Store DBs on the back end.






 


Query Component (Performance)



In previous builds of SharePoint, every query server stored the entire index.   While this achieved fault tolerance it didn’t help with performance.    There is a direct correlation between the size of an index and query latency.  The size of an index can easily become a bottleneck for query performance.  


For Example:



  • Index contains 10 million documents =  Average of 2 seconds per query

  • Index contains 20 million documents = Average of 4 seconds per query

This problem has been solved in SharePoint 2010.   An Index Partition can contain the entire index or a portion of it.  Creating an additional non-mirrored Query Component creates a new Index Partition that owns a portion of the index.



For Example:



If the entire index is 8 GB and contains 20 million documents:

  • Query Server 1 – Index Partition 1: holds 50% (4 GB of index, 10 million documents)

  • Query Server 2 – Index Partition 2: holds 50% (4 GB of index, 10 million documents)
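The two examples above can be combined into a rough back-of-the-envelope calculation. The sketch below assumes latency grows linearly with the number of documents in a partition, which is only an extrapolation from the two illustrative figures in this post (10 million documents = 2 seconds, 20 million = 4 seconds). It also ignores the cost of merging results from several partitions, and the function names are made up for this example.

```python
def docs_per_partition(total_docs, partition_count):
    """Documents held by each partition, assuming an even split."""
    return total_docs / partition_count


def estimated_latency_seconds(docs, seconds_per_10_million=2.0):
    """Assumed linear model taken from the illustrative figures in this post."""
    return docs / 10_000_000 * seconds_per_10_million


total_docs = 20_000_000

# One partition holding everything versus two partitions holding half each.
single = estimated_latency_seconds(docs_per_partition(total_docs, 1))
split = estimated_latency_seconds(docs_per_partition(total_docs, 2))

print(single, split)
```

Under that assumption, splitting the 20 million document index into two partitions brings the per-partition estimate back to the 10 million document figure.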


By partitioning large indexes, query times are reduced and this type of bottleneck is resolved.   Partitioning an index is as simple as provisioning new Query Components from the Search Application Topology section in Central Administration.


Question: If an index is partitioned out with multiple Query Components, how does the crawler distribute the indexed content?



Answer: The crawler evenly distributes crawled content to Index Partitions using a hash algorithm based on Doc IDs.
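To make the idea concrete, here is a minimal Python sketch of hash-based distribution. The post does not say which hash algorithm is used, so the modulo function below is purely a stand-in, and `partition_for` is a hypothetical name.

```python
from collections import Counter


def partition_for(doc_id, partition_count):
    """Map a Doc ID to an Index Partition (stand-in modulo hash, not the real algorithm)."""
    return doc_id % partition_count


# Spread 20 sample Doc IDs across 2 Index Partitions and count what lands where.
counts = Counter(partition_for(doc_id, 2) for doc_id in range(1, 21))

print(dict(counts))
```

Because the partition is derived from the Doc ID alone, the same document always lands in the same partition, and sequential IDs spread evenly.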






 


Property Store DB (Performance)



Just like Query Components, the Property Store DB can be scaled out to share the load of the metadata it stores.   If the Property Store DB becomes a bottleneck due to the size of the database and\or it strains the disk subsystem with high I/O latency on the back end, a new Property Store DB can be provisioned to share the load.  Just like the Crawl DB, the Property Store DB is useless unless it’s mapped to something.  In this case, a Property Store DB must be mapped to a Query Component.   If you decide to provision an additional Property Store DB to boost performance, an additional non-mirrored Query Component must be provisioned and mapped to it.


The following is a true statement:


“Creating an additional Property Store DB requires the index to be partitioned, because provisioning a new Query Component is required.”







 


Query Processor



Understanding Property Store DB and Query Component scale out is only half of the battle.   The Query Processor still plays a vital role in Search 2010.  It is responsible for processing a query and runs under the w3wp.exe process.  It retrieves results from the Property Store DB and the Index\Query Components.   Once results are retrieved, they are packaged\security trimmed and delivered back to the requester, which is the WFE that initiated the request.  The Query Processor load balances requests if more than one (mirrored) Query Component exists within the same Index Partition.  The exception to this rule is a Query Component that is marked as failover only.


Question: What if I partitioned off my index and I have multiple Query Components provisioned each serving a partition of the index?  How does Query Processor know which partition to connect to in order to accurately retrieve results?


Answer:  It doesn’t!  The Query Processor will connect to every single non-mirrored Query component that contains a partition of the Index to retrieve results.  




Question:  What if I created multiple Property Store Databases for performance reasons?   How does Query Processor know which Property store to connect to in order to accurately retrieve results?


Answer: It doesn’t!  The Query Processor will connect to every single Property Store DB to retrieve results.  
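The two answers above describe a fan-out and merge pattern: ask every partition, then combine what comes back. The Python sketch below shows only that pattern. The dictionaries standing in for Index Partitions, the document names, and the rank scores are all invented for illustration, and the Property Store lookup and security trimming are left out.

```python
import heapq


def run_query(term, partitions, top_n=3):
    """Send the term to every partition, then merge the hits by descending rank."""
    hits = []
    for partition in partitions:
        # Each partition only knows about its own share of the documents.
        hits.extend(partition.get(term, []))
    return heapq.nlargest(top_n, hits, key=lambda hit: hit[1])


# Two hypothetical Index Partitions, each mapping a term to (document, rank) pairs.
partitions = [
    {"sharepoint": [("doc-2", 0.91), ("doc-8", 0.40)]},
    {"sharepoint": [("doc-5", 0.77)]},
]

results = run_query("sharepoint", partitions)
print(results)
```

No partition can be skipped, because the Query Processor has no way of knowing in advance which partition holds the matching documents.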


 


In SharePoint 2007, the Query Processor ran on any WFE.   In SharePoint 2010, any server can run the Query Processor.  It’s no longer tied to a server running the Query role.   You provision the Query Processor role on a server by performing the following steps:



  1. In Central Administration, go to System Settings, then Services on Server

  2. Start the Search Query and Site Settings Service




Note:  After provisioning, a new web service is created within IIS on that server.








 


 


Query Processor Scale Out



Just like the Query Component and Property Store DB, the Query Processor role can be scaled out to multiple servers.  The Query Processor can become a bottleneck, for example:



  • It is not able to keep up with inbound requests, or the box and/or the W3WP.exe process hosting the Query Processor is CPU\Memory bound.


In this case, you provision additional Query Processors as needed.  Requests are then load balanced in a round robin fashion across the servers hosting a Query Processor.


The same case can be made for achieving fault tolerance.  With two servers hosting the Query Processor role, if one goes down, the other will be used.
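Round robin load balancing with fallback can be sketched as follows. This is a generic illustration in Python rather than the actual proxy logic, and the server names and the `make_dispatcher` helper are hypothetical.

```python
from itertools import cycle


def make_dispatcher(servers, is_up):
    """Return a function that hands out servers in round robin order, skipping any that are down."""
    ring = cycle(servers)

    def next_server():
        # Try each server at most once per request before giving up.
        for _ in range(len(servers)):
            server = next(ring)
            if is_up(server):
                return server
        raise RuntimeError("no Query Processor available")

    return next_server


down = set()
dispatch = make_dispatcher(["QP-A", "QP-B"], lambda server: server not in down)

# With both servers up, requests alternate between them.
healthy = [dispatch() for _ in range(3)]

# After QP-B goes down, every request is served by QP-A.
down.add("QP-B")
degraded = [dispatch() for _ in range(2)]

print(healthy, degraded)
```

The same loop covers both goals from this section: spreading load while everything is healthy and continuing to serve queries when one Query Processor is lost.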




 


Query Processor functions in Parent\Child Farm


In a Publishing/Consumer farm scenario, the Query Processor always runs in the farm where the Search Service Application resides.   So if the Search Service Application resides in the Publishing farm, the Query Processor only runs in the Publishing farm.   The Consumer farm uses the associated Search Service Application proxy to make the connection over WCF to a Query Processor in the Publishing farm.



 


Observe is Step 1 and Taking Action is Step 2


Before arbitrarily provisioning new Query Components and Property Store DBs, observe the current environment\query health so that you have evidence to support this important decision.  The obvious reasons of fault tolerance and query latency are covered in the previous sections, so I won’t discuss those further.  Checking for system\hardware bottlenecks is a good first step before adding more Query Components\Property Store DBs.


 


Monitoring Query Server


Observation:  The Query server is almost maxed on CPU and\or is at the peak of available physical memory and query latency has increased as a result.


Action Taken:  Provision a new query component 


Monitoring SQL Server


Observation:  Property Store DB is I/O bound on SQL and disk latency is unexpectedly high.


Action Taken:  Provision a new Property Store on same/different SQL server




Important:  These are very basic approaches to system bottlenecks.   For example, don’t assume that a general observation of a spiked CPU automatically requires provisioning additional Query Components.   More analysis is required, such as finding answers to the following questions:



  1. Does CPU only spike during crawl times?

  2. Which process is spiking?

  3. Has the overall size of the index/Property Store DB increased?

  4. Does SP health monitoring or Performance monitor reveal anything of use?

  5. Etc….

 


Thanks,


Russ Maxwell, MSFT


Comments (18)

  1. Alex says:

    Hi Russ,

    great post! Could you tell me how to make the admin component fault tolerant? It seems that this component is running on one server only -if this server gets down, (how) can I still move it to another server?

    I guess via powershell you could do it (set-SPEnterprisesearchAdministrationComponent) but i have not been able to get this to work.

    Thanks in advance,

    Alex

  2. Vipin says:

    Hi Russ,

    Best Search SP2010 Post , i have seen . Have read it almost twice .

    Do you know how to take a approx size of Index ? Need to find out Web Server Disk size .

    I am considering Web Server disk size to be 80 GB system drive + Index size + some buffer .

  3. Suhaib says:

    Man this is really Cool!!!

    Thanks a lot.

  4. Russmax says:

    Thanks guys  :)  Only one admin component is allowed per SSA, so fault tolerance can't be achieved on the server hosting it.  However, the Search Admin DB can probably be configured with SQL mirroring on the backend…

  5. Umar says:

    This is great. I do have one question. After reading the whole article I was able to decide, but I also posted a question on TechNet which no one replied to, so I will post it here. Hopefully you will suggest something about it. Thanks

    I have 2 WFEs and 1 App server running sp2010. I want my WFEs to provide query role for the farm and Application server (APPServer) will be used as index server. Can you  please help me to figure out if this is the right configuration, if not what will be the best configuration. How can I assign query roles and index roles to the servers in the farm. Thanks

    Admin

            Administration component                   running on APP Server

    Crawl – DBServerSSA-CrawlStoreDB

             Crawl Component 0                           running on APPServer

    Databases                                                

            Admin  database                                running on dbserver

             Crawl Database                                running on dbserver

             Property data base                             running on dbserver  

    Index partition 0   DBserverSSA-PropertyStoreDB            Running on APPServer

         Query Component1                           Running on APPServer

    Index Partition 1   DBserverSSA-PropertyStoreDB                    

         Query Component2                        Running on WFE1

    Index Partition 2    DBserverSSA-PropertyStoreDB      

        Query Component3                       Running on WFE2

    I am also running services on the following servers

    Search Query and Site Settings Service       Running on APPServer+ WFE1+WFE2

    Sharepoint Foundation Search                      Running on APPServer

    Sharepoint Search Server                            Running on APPserver + WFE1 + WFE2

  6. venkata says:

    Hi, We are in deep water with 2010 here. Hopefully you could help us with this… We are leveraging the benefits of the Metadata Service, using it as the base of our relationship model. Everything with metadata is fine, but in our server farm the application pool for the managed metadata service is spiking so much that the application pool crashes every time it reaches the range of 2 GB – 3 GB RAM. We did dig really deep into this metadata service and found that the WCF service calls are not able to get the message back from the database.

    Every time the application pool *w3wp.exe crashes we get

    Message: Exception of type 'System.OutOfMemoryException' was thrown.

    StackTrace:    at System.String.GetStringForStringBuilder(String value, Int32 startIndex, Int32 length, Int32 capacity)

      at System.Text.StringBuilder.GetNewString(String currentString, Int32 requiredLength)

      at System.Text.StringBuilder.Append(Char[] value, Int32 startIndex, Int32 charCount)

      at System.IO.StringWriter.Write(Char[] buffer, Int32 index, Int32 count)

      at System.Xml.XmlEncodedRawTextWriter.FlushBuffer()

      at System.Xml.XmlEncodedRawTextWriter.WriteAttributeTextBlock(Char* pSrc, Char* pSrcEnd)

      at System.ServiceModel.Dispatcher.ChannelHandler.HandleRequest(RequestContext request, OperationContext currentOperationContext)

      at System.ServiceModel.Dispatcher.ChannelHandler.AsyncMessagePump(IAsyncResult result)

        at System.Threading._IOCompletionCallback.PerformIOCompletionCallback(UInt32 errorCode, UInt32 numBytes, NativeOverlapped* pOVERLAP)

    Could you give us some pointers on how to solve these out of memory exceptions… as we haven't done any configuration changes w.r.t. the client.config and web.config for Metadata at C:\Program Files\Microsoft Office Servers\14.0

    Thank you for your time

  7. Russmax says:

    Usually this occurs because you're not disposing objects properly.  You could potentially work around this by setting your application pools to recycle prior to the 2 GB memory usage mark.  However, in that scenario you're adding a band-aid and not actually fixing the problem.  If you need deeper analysis, go to the ULS logs and filter them based on the Thread ID of the exception.  If you need further assistance, I would contact Microsoft to create a support case.

  8. Vegas says:

    Great post. Really helped me with understanding.

    Thanks!

  9. Phil Thornton says:

    Hi,

    I have a specific question.

    When splitting up the index onto separate boxes with associated query components, what is the recommended best practice for physically placing the index? I know by default it goes to the c: drive. But would there be any benefit in, say, putting the index on the d: drive?

    Any response would be greatly appreciated

    Thanks,

    Phil

  10. allenwang says:

    What a great post by senior SEE, Russ!

  11. Pablo Alejandro Fain says:

    Great post, Russ! Thanks for sharing your knowledge.

    Pablo Alejandro Fain

    MCP, MCSA+M, MCTS, MCITP

  12. Marli says:

    I am changing the Server Farm to a 3-tier topology and want to change the Query Component to run on the Web Server.

    When I change the Search Service Application Topology for the Query Component to run on the Web Server and not on the Application Server then I get the following error:

    Errors were encountered during the configuration of the Search Service Application.

    Microsoft.Office.Server.Search.Administration.SearchConfigWizard+SearchConfigWizardException: Topology provisioning failed due to an error.Crawl component '…..' on Server cannot be dismounted. Check that the server is available. at Microsoft.Office.Server.Search.Administration.SearchConfigWizard.WaitForTopologyTimerJobToFinish() at Microsoft.Office.Server.Search.Administration.SearchConfigWizard.UpdateSearchApp() at Microsoft.Office.Server.Search.Administration.SearchConfigWizard.ProvisionSearchServiceApplication() at Microsoft.Office.Server.Search.Administration.SearchConfigWizard.ExecuteTimerJob()

    Date & Time

    I did run the "net start SPTimerV4" command in command prompt on the servers. The Query Component stays on the initializing Status and I keep on getting this error.

    Please help?!

    Thank you!!

  13. Sandeep Nadig says:

    Awesome post!

  14. babu says:

    Hi All,

    I am having a strange result with SP 2010 Enterprise Search (Not Fast Search)

    When I Performed Search for a word, with default refinement option I get the result of 9 items (which includes the result from 3 web apps )

    And with the same word when I filter with Web app in refinement Panel as below

    Web app 1 – 8 Items

    Web app 2 – 1 Item

    Web app 3 – 1 Item

    How it can be total 10 Items where I am expecting only 9 items

    When the numbers are larger, the above calculation becomes more inconsistent

    And also, what is meant by the count shown in the screenshot below? When I am on page number 1 the result shows 29 K, and when I move to page number 3 the result shows 28 K

  15. Ambarish Singh says:

    This one of the best and most simple to understand article I have ever read. Thanks Russ.

  16. Femi Bello says:

    The article states documents are stored in the index. I do not think this is the case. Otherwise the index size would be equal to the sum of all the document sizes. The index stores the location of items in the content sources. Once the search result appears, the actual location of the document will still need to be contacted to retrieve the document. Think of it like the index at the back of your book. The index at the back of your book tells you where to find content. You will still need to go to the location of the content to retrieve it! And the number of index pages in a book is not the same as the size of the entire book! Just for clarity!

  17. Keerthan Shetty says:

    Really helpful,Thanks a lot

  18. sateesh says:

    We added two new SQL boxes to our farm and we are trying to configure the index role on them. The index already exists on the old SQL boxes for the same SSA for the respective web application.

    Here at this point we are getting some confusion on below points.

    1)  if we created index role on new sql boxes will it disturb the old index that was already existing in the old sql servers???

    2) is this new index creation is going to modify/replace the existing index that is on old servers ???

    3)  in between are we able to search the items with help of old index ???