SQL Server 2005 / 2008 table partitioning : Important things to consider when switching-out partitions.

An ISV recently found an anomoly in their implementation of a "drop table partition" function which could lead to unexpected partitions being switched out.

Typically, to do this "drop partition logic" SQL Server partitions are manipulated using the following operations:

· switch-out the partition which isn't needed anymore (or being archived) into an empty target table

· merge / reset the partition function ranges of the partitioned source table. For example; move some of the remaining data into a new partition / filegroup

· possibly drop the target table

The simple, standard way to switch-out a partition is to specify a partition number. The ISV implemented a select statement which used a boundary value of a partition range to return the corresponding partition number. Then the TSQL switch command was used to switch out this partition. This works fine as long as there is only one process doing the "switching".

But now let's assume that two processes would try this at the same time. Both processes will run the selects to get the partition numbers. Then the first one does the switch-out and the merge of the partition function. Unfortunately, the latter command will change the partition numbers as they are dynamically maintained by SQL Server. An activity like merge or splits of partitions will trigger a re-enumeration of all or parts of the partitions of a the table.

Afterwards the partition number which the second process got before the re-enumaration (tiggered by the first process) might not be accurate any more. A switch-out using the old partition number could result in switching out the wrong partition.

The solution is simple. We recommend using the $PARTITION function for the switch-out which allows you to specify a partition boundary value instead of a partition number.

The boundary values are not dynamic and therefore this issue won't come up.

Repro :

The repro script below does the following:

· create a partitioned test table with 5 boundary values: 1960, 1970, 1980, 1990, 2000

· insert 1 row into the 1970 range, 2 rows into the 1980 range and 5 rows into the 1990 range

· now switch out the 1970 range and the 1980 range by specifying the boundary values

· as expected, three ranges remain: 1960, 1990 with 5 rows and 2000

· now repeat the same test by specifying a partition number instead of the boundary value

· the output after creating the test table shows partition number 2 for boundary 1970 and 3 for boundary 1980

· using "hard-coded" partition numbers for the "switch partition" function simulates the select mentioned above

· however, now the results look different. Like before the boundaries 1960, 1990 and 2000 remain as expected. But the number of rows are not at all what we would expect!

· instead of 5 rows for boundary 1990 we see only 2 ! What happened ?

· well - the first process did the merge of the partition function. This changed the partition number of boundary 1990 from 4 to 3 and the one of boundary value 1980 from 3 to 2. The call of the "switch partition" function with partition number 3 will now switch-out the rows of boundary value 1990. Therfore the 5 rows are gone and the 2 rows of 1980 will be kept. The merge function would still use the correct boundary value. So the list of boundary values looks ok but the content is wrong !

The solution to use the $PARTITION function also works in case two processes interfere between switch-out and the partition function merge.

TSQL Repro:

if exists ( select * from sys.procedures where name = 'dp_reset_test' )

drop procedure dp_reset_test

if exists ( select * from sys.procedures where name = 'dp_list_partitions' )

drop procedure dp_list_partitions

if exists ( select * from sys.procedures where name = 'dp_switch_partition_via_boundary' )

drop procedure dp_switch_partition_via_boundary

if exists ( select * from sys.procedures where name = 'dp_switch_partition_via_partno' )

drop procedure dp_switch_partition_via_partno

go

set nocount on

go

-- stored procedure to create partitioned test table

create procedure dp_reset_test

as

begin

if exists ( select * from sys.objects where name = 'dp_test1' and type = 'U' )

drop table dp_test1

if exists ( select * from sys.objects where name = 'dp_test1_clone' and type = 'U' )

drop table dp_test1_clone

if exists ( select * from sys.partition_schemes where name = 'ps_year' )

drop partition scheme ps_year

if exists ( select * from sys.partition_functions where name = 'pf_year' )

drop partition function pf_year

CREATE PARTITION FUNCTION pf_year (int)

AS

RANGE LEFT FOR VALUES ( 1960, 1970, 1980, 1990, 2000 )

CREATE PARTITION SCHEME ps_year AS PARTITION pf_year ALL TO ([PRIMARY])

create table dp_test1

( col1 int default 99 ,

col2 int,

col3 int primary key

) on ps_year(col3)

create table dp_test1_clone

( col1 int default 99 ,

col2 int,

col3 int primary key

)

insert into dp_test1 values ( 1966,1966,1966 )

insert into dp_test1 values ( 1971,1971,1971 )

insert into dp_test1 values ( 1972,1972,1972 )

insert into dp_test1 values ( 1984,1984,1984 )

insert into dp_test1 values ( 1985,1985,1985 )

insert into dp_test1 values ( 1986,1986,1986 )

insert into dp_test1 values ( 1987,1987,1987 )

insert into dp_test1 values ( 1988,1988,1988 )

print ''

end

go

-- stored procedure to print partitions info

create procedure dp_list_partitions ( @tabname char(20) )

as

begin

declare @p_number int

declare @p_rows int

declare @boundary_value int

declare p_details cursor for

select partition_number, rows, convert(int,sprv.value)

from sys.partitions sp,

sys.partition_functions spf,

sys.partition_range_values sprv

where object_id = OBJECT_ID(@tabname) and

spf.function_id = sprv.function_id and

sprv.boundary_id = sp.partition_number and

( sp.index_id = 1 or sp.index_id = 0 ) and

spf.name = 'pf_year'

order by partition_number

open p_details

FETCH NEXT FROM p_details

INTO @p_number, @p_rows, @boundary_value

print 'part no ' +

'# rows ' +

'boundary'

WHILE @@FETCH_STATUS = 0

begin

print convert(char(10), @p_number) +

' ' +

convert(char(10), @p_rows) +

' ' +

convert(char(10), @boundary_value)

FETCH NEXT FROM p_details

INTO @p_number, @p_rows, @boundary_value

end

close p_details

deallocate p_details

print ''

end

go

-- stored procedure to get rid of a partition by specifying the partion number

create procedure dp_switch_partition_via_partno ( @partno int, @boundary int )

as

begin

truncate table dp_test1_clone

alter table dp_test1 switch partition @partno to dp_test1_clone

alter partition function [pf_year]() merge range (@boundary)

end

go

-- stored procedure to get rid of a partition by specifying the boundary value

create procedure dp_switch_partition_via_boundary ( @boundary int, @merge_flag int )

as

begin

if( @merge_flag = 1 )

begin

truncate table dp_test1_clone

alter table dp_test1 switch partition $PARTITION.pf_year(@boundary) to dp_test1_clone

alter partition function [pf_year]() merge range (@boundary)

end

if( @merge_flag = 2 )

begin

truncate table dp_test1_clone

alter table dp_test1 switch partition $PARTITION.pf_year(@boundary) to dp_test1_clone

end

if( @merge_flag = 3 )

begin

alter partition function [pf_year]() merge range (@boundary)

end

end

go

-- test sample

-- create partitioned test table

execute dp_reset_test

print 'Test table with 5 boundary values : '

print ''

execute dp_list_partitions 'dp_test1'

print 'Switch partitions with boundary 1970,1980 via boundary value : '

print ''

execute dp_switch_partition_via_boundary 1970, 1

execute dp_switch_partition_via_boundary 1980, 1

execute dp_list_partitions 'dp_test1'

print 'as expected boundary values 1970 and 1980 are gone and 1990 remains with 5 rows'

print ''

print ''

-- reset test table

execute dp_reset_test

print 'Switch partitions with boundary 1970,1980 via partno : '

print ''

execute dp_switch_partition_via_partno 2, 1970

execute dp_switch_partition_via_partno 3, 1980

execute dp_list_partitions 'dp_test1'

print 'boundary values 1970 and 1980 are gone but 1990 remains with 2 rows which is unexpected !'

print ''

print ''

-- reset test table

execute dp_reset_test

print 'Switch partitions with boundary 1970,1980 via boundary value with '

print '"deferred merge of the partition function" : '

print ''

-- switch out only

execute dp_switch_partition_via_boundary 1970, 2

execute dp_switch_partition_via_boundary 1980, 2

-- merge partition function only

execute dp_switch_partition_via_boundary 1970, 3

execute dp_switch_partition_via_boundary 1980, 3

execute dp_list_partitions 'dp_test1'

print 'as expected boundary values 1970 and 1980 are gone and 1990 remains with 5 rows'

print ''