Which one is better? PERSISTED Column or XML Indexes?

I had a chance to work on the XML area. The customer had a performance issue with some queries which were related to XML data. The customer was using a banking application called TEMENOS T24 on Microsoft SQL Server. The tables in the databases have just 2 columns RECID and XMLRECORD which has all the columns. Tables structure are like below

 

 [RECID] [varchar](128) NOT NULL,
[XMLRECORD] [xml] NULL,

I used a couple of queries to tune the performance issue

 select wait_type, 
coalesce (db_name(st.dbid), db_name(convert (int, pa.value)), 'Empty') as DBName, 
SUBSTRING(st.text, (qs.statement_start_offset/2)+1, 
        ((CASE qs.statement_end_offset
          WHEN -1 THEN DATALENGTH(st.text)
         ELSE qs.statement_end_offset
         END - qs.statement_start_offset)/2) + 1) AS StatementText,
st.text as ProcedureTextOrBatchText,         *
from sys.dm_exec_requests qs
cross apply sys.dm_exec_sql_text(sql_handle) st
cross apply sys.dm_exec_query_plan(plan_handle) qp
cross apply sys.dm_exec_plan_attributes(qs.plan_handle) pa
where attribute = 'dbid' and session_id <> @@SPID

 

Most of the queries were waiting on the RESOURCE_SEMAPHORE wait type.

RESOURCE_SEMAPHORE: Occurs when a query memory request cannot be granted immediately because of other concurrent queries. High waits and wait times can indicate excessive number of concurrent queries or excessive memory request amount.

COMMON for DSS like workload and large queries such as hash joins; must wait for memory quota (grant) before it is executed.

SQL Server 2005 Waits and Queues

https://download.microsoft.com/download/4/7/a/47a548b9-249e-484c-abd7-29f31282b04d/Performance_Tuning_Waits_Queues.doc

The queries below can be used to troubleshoot memory related issues

 SELECT grantee_count, waiter_count
FROM sys.dm_exec_query_resource_semaphores
where pool_id > 1

grantee_count 4

waiter_count 5

It looks that there are 5 queries waiting for the memory

Used another query like below

 SELECT text, session_id, requested_memory_kb, granted_memory_kb, required_memory_kb 
FROM sys.dm_exec_query_memory_grants
cross apply sys.dm_exec_sql_text(sql_handle)

 

text

session_id

requested_memory_kb

granted_memory_kb

required_memory_kb

a

175

12840320

12840320

512

b

147

12840320

NULL

512

c

158

34456

34456

512

d

162

12840320

NULL

512

e

164

12840320

12840320

512

f

160

12840320

NULL

512

g

174

12840320

12840320

512

h

168

12840320

12840320

512

It looks that 3 queries (b, d, f) are waiting for the memory (granted_memory_kb is null) and those queries require 12.8GB memory. It is too much. Each of the other queries (a, c, e, g, h) has already allocated 12.8GB memory. Those queries are really expensive.

The server has 98GB memory and max server memory has been set to 86GB memory.

Check the queries below to double check the memory pending issue

 select  cntr_type as [Memory Grants Outstanding]
from sys.dm_os_performance_counters
where object_name = 'SQLServer:Memory Manager'
and counter_name = 'Memory Grants Outstanding'

select cntr_value as [Memory Grants Pending]
from sys.dm_os_performance_counters
where object_name = 'SQLServer:Memory Manager'
and counter_name = 'Memory Grants Pending'

It look that “Memory Grants Pending” is 4.

So it is understood that the queries which are killing the memory should be improved.

Let’s work on the performance improvement of one of the queries. As you remember there is an XML data so which one is better? To create XML indexes or PERSISTED column?

Table design

 CREATE TABLE [dbo].[XMLDEMO_XMLINDEX](
    [RECID] [varchar](128) NOT NULL,
    [XMLRECORD] [xml] NULL,
 CONSTRAINT [PK_XMLDEMO] PRIMARY KEY CLUSTERED 
(
    [RECID] ASC
)
)

Fill the table with XML data. The below script is just an example. You generate XML data for each row and insert it to test table called XMLDEMO_XMLINDEX

 SELECT TOP 1 * FROM [AdventureWorksDW2012].[dbo].[FactResellerSalesPtnd]
FOR XML RAW, ELEMENTS

Create a PRIMARY XML Index and SECONDARY XML PATH Index on the table and test the query and see the results

 
CREATE PRIMARY XML INDEX [IX_PRIMARY_XML] ON [dbo].[XMLDEMO_XMLINDEX]
(
    [XMLRECORD]
)
GO
CREATE XML INDEX [IX_SECONDARY_XML_PATH] ON [dbo].[XMLDEMO_XMLINDEX]
(
    [XMLRECORD]
)
USING XML INDEX [IX_PRIMARY_XML] FOR PATH 
GO
 Check the size of the table and index
 sp_spaceused 'XMLDEMO_XMLINDEX'

rows = 616669

reserved = 5842864 KB

data = 1332880 KB

index_size = 4509240 KB

unused = 744 KB

Check the query performance below (Table has Clustered Index, PRIMARY XML Index and SECONDARY XML Indexes)

 SET STATISTICS IO ON
GO
SET STATISTICS TIME ON
GO
SELECT RECID FROM [dbo].[XMLDEMO_XMLINDEX] 
WHERE XMLRECORD.exist(N'/row[SalesOrderNumber/text()="SO43676"]') = 1
 /*
SQL Server parse and compile time: 
   CPU time = 0 ms, elapsed time = 0 ms.

(572 row(s) affected)
Table 'XMLDEMO_XMLINDEX'. Scan count 0, logical reads 2288, physical reads 0, read-ahead reads 0, 
Table 'xml_index_nodes_1298103665_256000'. Scan count 1, logical reads 2298, physical reads 0, read-ahead reads 0, 

 SQL Server Execution Times:
   CPU time = 15 ms,  elapsed time = 18 ms.
*/

Create the similar table with the same data like below

 CREATE TABLE [dbo].[XMLDEMO_PERSISTEDCOLUMN](
    [RECID] [varchar](128) NOT NULL,
    [XMLRECORD] [xml] NULL,
 CONSTRAINT [PK_XMLDEMO_PERSISTEDCOLUMN] PRIMARY KEY CLUSTERED 
(
    [RECID] ASC
)
)
GO
INSERT INTO [XMLDEMO_PERSISTEDCOLUMN] (RECID, XMLRECORD)
SELECT RECID, XMLRECORD FROM [XMLDEMO_XMLINDEX]
GO
 sp_spaceused 'XMLDEMO_PERSISTEDCOLUMN'

rows = 616669

reserved = 991944 KB

data = 986672 KB

index_size = 5128 KB

unused = 144 KB

Check the query performance below (Table has Clustered Index)

 SET STATISTICS IO ON
GO
SET STATISTICS TIME ON
GO
SELECT RECID FROM [dbo].XMLDEMO_PERSISTEDCOLUMN 
WHERE XMLRECORD.exist(N'/row[SalesOrderNumber/text()="SO43676"]') = 1
 /*
SQL Server parse and compile time: 
   CPU time = 0 ms, elapsed time = 0 ms.

(572 row(s) affected)
Table 'XMLDEMO_PERSISTEDCOLUMN'. Scan count 9, logical reads 140107, physical reads 0, read-ahead reads 0, 

 SQL Server Execution Times:
   CPU time = 55677 ms,  elapsed time = 7168 ms.
*/

Create a PERSISTED Column on the table and Proper index on the persisted column like below

 CREATE FUNCTION [dbo].[udf_XMLDEMO_PERSISTEDCOLUMN_SalesOrderNumber](@xmlrecord XML) 
RETURNS nvarchar(14) 
WITH SCHEMABINDING 
BEGIN
RETURN @xmlrecord.value('(/row/SalesOrderNumber/text())[1]','nvarchar(20)')
END
GO
ALTER TABLE XMLDEMO_PERSISTEDCOLUMN
ADD SalesOrderNumber AS dbo.udf_XMLDEMO_PERSISTEDCOLUMN_SalesOrderNumber(XMLRECORD) PERSISTED
GO
CREATE INDEX [ix_XMLDEMO_PERSISTEDCOLUMN_SalesOrderNumber] ON XMLDEMO_PERSISTEDCOLUMN(SalesOrderNumber)
 sp_spaceused 'XMLDEMO_PERSISTEDCOLUMN'

rows = 616669

reserved = 1011536 KB

data = 986672 KB

index_size = 24640 KB

unused = 224 KB

Check the query performance below (Table has Clustered Index, Persisted Column, Non-Clustered Index)

 SET STATISTICS IO ON
GO
SET STATISTICS TIME ON
GO
SELECT RECID FROM [dbo].XMLDEMO_PERSISTEDCOLUMN  
WHERE SalesOrderNumber = 'SO43676'
 /*
SQL Server parse and compile time: 
   CPU time = 0 ms, elapsed time = 0 ms.
SQL Server parse and compile time: 
   CPU time = 0 ms, elapsed time = 0 ms.

(572 row(s) affected)
Table 'XMLDEMO_PERSISTEDCOLUMN'. Scan count 1, logical reads 6, physical reads 0, read-ahead reads 0, 

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.
*/

Summary of this comparison

Table with a PRIMARY KEY CLUSTERED

Table 'XMLDEMO_PERSISTEDCOLUMN'. Scan count 9, logical reads 140107, physical reads 0, read-ahead reads 0,

SQL Server Execution Times: Elapsed time = 7168 ms

Table with a PRIMARY KEY CLUSTERED and PRIMARY XML INDEX and SECONDARY XML PATH INDEX

Table 'XMLDEMO_XMLINDEX'. Scan count 0, logical reads 2288, physical reads 0, read-ahead reads 0,

Table 'xml_index_nodes_1298103665_256000'. Scan count 1, logical reads 2298, physical reads 0, read-ahead reads 0,

SQL Server Execution Times: Elapsed time = 18 ms

Table with a PRIMARY KEY CLUSTERED and PERSISTED Column and NON-CLUSTERED Index

Table 'XMLDEMO_PERSISTEDCOLUMN'. Scan count 1, logical reads 6, physical reads 0, read-ahead reads 0,

SQL Server Execution Times: Elapsed time = 0 ms