SQL Server 2008/R2 Fulltext Search Fix for improving queries performance during concurrent index updates

SQL Server 2008/R2 Fulltext Search Fix for improving queries performance during concurrent index updates - https://support.microsoft.com/kb/958947

As you may already know, Microsoft has made significant improvements since SQL Server 2005 to fully integrate full-text search components inside the SQL Server execution engine and storage (https://technet.microsoft.com/en-us/library/cc721269(SQL.100).aspx). Full-text index population and storage happen within SQL Server (not by a separate process and not in separate files in the file system anymore!). This provided a tightly integrated and optimized query execution and resource governance infrastructure for mixed queries (containing both tabular data and full-text search keywords in a single query – for example search for all documents that contain “SQL Server” and “Improvement” and that were created between October 2008 and April 2011) as well as tremendous manageability benefits. Now DBAs do not have to deal with a separate set of files, but simply the database itself. (Note: Filters and word breakers continue to run in a separate much leaner process – FDHost - to ensure that third-party components do not pose any security or reliability risk to the SQL engine).

That said, some of our customers who deployed SQL Server 2008 ran into a situation where one long running query created a backlog of many more pending queries that would have otherwise finished fairly quickly. This problem occurred under the following conditions:

  • A heavy transactional load that was also indexed for full-text searches. For example, end-users entering notes or comments that are also being continually searched upon at web scale.
  • A background index update started while there is a pending query.

Our engineering team promptly took on the issue and worked closely with customers to get to the bottom of it. The problem was caused by the following sequence of operations.

  1. A particular query lasts long enough in its execution cycle with a Schema Stability (Sch-S) lock on metadata for index fragments (see Schema Locks in SQL Server at https://msdn.microsoft.com/en-us/library/ms175519.aspx).
  2. A background index-merge kicks in (to reduce number of full-text index fragments) by the full-text engine and tries to acquire a Schema Modification (Sch-M) lock on the same metadata.
  3. New queries are issued while the background index merge is waiting on the original query to finish.

Now all new queries, irrespective of their individual duration (when run in isolation) get queued behind the background merge and the first query. Therefore all queries start to take a very long time, reducing the full-text query performance of the entire system.

As soon as the problem was discovered and reported, our dev team got to work tirelessly. They addressed customer concerns by suggesting query changes to ensure that live systems continued to work. In parallel they investigated code improvements and made a number of changes that were thoroughly reviewed, tested and deployed in customer environments before releasing them in this publicly downloadable QFE.

The fix entails following code enhancements:

  1. Release locks on index fragment metadata as soon as it is read by queries so that background index merge can proceed.
  2. Reduce memory blocks and associated computations in query preparation, compilation and execution code blocks to improve parallelism.
    1. Reduce size of histograms to ensure:
      1. They can be efficiently evaluated on parallel threads.
      2. They can be efficiently copied across different components (from compilation to optimization) and freed-up to improve throughput.
    2. Calculate cardinality separately and discard swiftly after use.
    3. Use significantly smaller allocation increments (by a factor of 64!).

Thus by releasing locks as soon as possible, coupled with reducing the amount of memory consumed during the query execution cycle we improved the throughput of the entire query subsystem and eliminated the problem with queued pending queries and an associated background index merge.

Now with the SQL Server Denali release, our development team has made significant further improvements:

  • We have improved parallelism in query execution (rewriting CONTAINS and FREETEXT internally as a Streaming TVF, optimizing execution tree with predicate folding whenever possible and slicing execution on parallel threads for document ranges based on histograms,…).
  • We have improved background index merges (moving away from locking the index fragments to a timestamp model where only timestamp update gets locked and therefore both query execution and index updates continue to happen more concurrently)

We invite our customers to download the latest CTP of the Denali release here and give the new and much improved fully integrated Full-text Search engine a run on their production systems. We have some very happy TAP customers who are already reporting significant performance boost.