Minimal Logging changes in SQL Server 2008 (part-3)

 


So far we had discussed  (http://blogs.msdn.com/sqlserverstorageengine/archive/2008/03/23/minimal-logging-changes-in-sql-server-2008-part-2.aspx ) how minimal logging changes impact when you are moving data from one table to another table. Now let us look at how does this change more conventional bulk import. As you will see, conventional bulk import takes advantage on these changes as well but these changes are only useful when importing into a btree as conventional bulk import already provides minimal logging for heaps. Here are the series of scenarios that I tried


 


 


 


(1)  Inserting into an HEAP. No changes in this behavior


 


begin tran


 


— this is optimized load


— (1) show that individual rows are not logged


— (2) show the BU lock


 


bulk insert t_heap


from ‘C:\sql-server-test\minimal-logging\t_source-dat.dat’


  with (TABLOCK)


 


Logging:: Minimal Logging


 


 


— if we don’t specify TABLOCK, it leads to full logging. This is same what we always had


— (1) without TABLOCK, no minmal logging.


bulk insert t_heap


from ‘C:\sql-server-test\minimal-logging\t_source-dat.dat’


 


 


LOGGING: Fully logged



 


 


 


(2) Insert into BTREE


 


a)    Into empty BTREE


create table t_ci (c1 int,  c2 int, c3 char(100), c4 char(1000))


go


 


create clustered index ci on t_ci(c1)


go


 


begin tran


 


— this is optimized load


— (1) show that individual rows are not logged


— (2) Lock: X lock on the table


bulk insert t_ci


from ‘C:\sql-server-test\minimal-logging\t_source-dat.dat’


  with (TABLOCK, datafiletype = ‘char’)



 


 


 


— Now, what if don’t specify TABLOCK? This is a change from earlier versions. and this is also optimized load. You will need to use TF-610


— this functionality


— (1) we get minimal logging


— (2) IX: at table level


— (3) you can use this to do parallel load as long as you are


— (4) importing into disjoint ranges


 


bulk insert t_ci


from ‘C:\sql-server-test\minimal-logging\t_source-dat.dat’


  with (datafiletype = ‘char’)


 


 


Here are the top 10 log records. Note, in this case, only the range lock is taken


 


b)    Into non-empty btree (when inserting into an increasing range)


 


— Inserting rows with disjoint range, we still get minimal logging


— if you are inserting into a btree but the range you are inserting overlaps partially with the existing data,


— the logging will go between minimal and full logging. The rule is that you will get minimal logging


— only when you allocate a new page.


 


 


bulk insert t_ci


from ‘C:\sql-server-test\minimal-logging\t_newrange-dat.dat’


  with (datafiletype = ‘char’)


Here are the top 10 log records. Note, in this case, only the range lock is taken. Since the SQL Server does not take X lock on the table, you can do parallel bulk import with minimal logging into a btree.


 


c)   Into a table with clustered and non-clustered indexes


 


create table t_ci_nci (c1 int, c2 int, c3 char(100), c4 char(1000))


go


 


create clustered index ci on t_ci_nci(c1)


go


 


create index nci on t_ci_nci(c2, c3)


go


 


begin tran


 


— this is optimized load


— (1) show that individual rows are not logged


— (2) Lock: IX lock on the table


bulk insert t_ci_nci


from ‘C:\sql-server-test\minimal-logging\t_source-dat.dat’


  with (datafiletype = ‘char’)


 


Query Plan:: Notice NCI and CI data is being sorted



Logged Records:: For the clustered index


 


 



Logged Records:: For the non-clustered index. Note, that index key is atleast > 100 bytes, so clearly it is minimally logged


 



 


As you can see, with the minimal logging changes in SQL2008, you have lot more options to bulk import the data with minimal logging. These changes are available on SQL2008/RTM.


 Thanks


Sunil