SQL 2016 - It Just Runs Faster - BULK INSERT Uses Vector Instructions (SSE/AVX)


SQL Server 2016 enhanced the bulk insert activities (BULK INSERT), leveraging the CPUs vector instructions.   The vector based approach allows the bulk data to be processed faster.

 

For example, when inserting data from a text file the integer conversion leverages SSE instructions when the character length is more than 4 characters and trace flag 4809 is disabled. 

 

Take the value of '12345678' needing to be converted to the integer value 1,2345,678.  A common algorithm used to determine the integer value of each position one might loop over the character array subtracting the value of character '0' from each entry so it can be casted to a BYTE value and shifted to accommodate base 10 mathematics.

 

Using a single vector instruction, all 8 characters (BYTES) can subtract the '0' character value.  The pseudocode looks like this:

 

vASCIIZero  = { '0','0','0','0','0','0','0','0'};
p128Bits     = "12345678";

 

SSE Subtract Instruction(p128Bits, vASCIIZero)

 

In a single SSE instruction the value is adjusted instead of requiring 8 loops to process the bytes.

 

'It Just Runs Faster' - SQL Server 2016 takes advantage of CPU vector instructions to improve bulk insert performance.

 

Suresh Kandoth -Senior SQL Server Escalation Engineer

Bob Dorr - Principal SQL Server Software Engineer

 

DEMO - It Just Runs Faster: Bulk Insert Character To Integer Conversion

Overview

This demonstration shows the use of vector instructions to perform character string to integer conversions during the BULK INSERT command.

 

In order to demonstrate the vector instruction capabilities, this demonstration is a narrow reproduction focused on a single vector improvement, not the only vector based improvement.  The demonstration highlights ~10% performance increase on a very simple data set.
 

Steps

  1. Use SQL Server Management Studio (SSMS) to connect to a SQL Server 2012 or 2014 instance.
     
  2. Paste the script below in a new query window and execute (ATL+X) the script to populate demonstration data. 
     

use tempdb

go

 

create table tblBCP

(

iVal1        bigint                NOT NULL,

iVal2        bigint                NOT NULL,

iVal3        bigint                NOT NULL,

iVal4        bigint                NOT NULL,

iVal5        bigint                NOT NULL,

iVal6        bigint                NOT NULL,

iVal7        bigint                NOT NULL,

iVal8        bigint                NOT NULL

)

go

 

set nocount on

go

 

insert into tblBCP values (12345678, 98765432, 91827364, 54321678, 1234567812345678, 9876543212345678, 9182736412345678, 5432167812345678)

go

 

insert into tblBCP select * from tblBCP

go 20

 

  1. From a Windows Command prompt export the demonstration data into a text file.

    bcp.exe tempdb..tblBCP out "c:\temp\Ints.bcp" -T -S.\S33 -c -t~

     
  2. Copy the script below into an SSMS Query Window and execute it, noting the execution outcomes.
     

create table tblTests

(

iFlag                int NOT NULL,

iMSDuration        int        NOT NULL

)

 

dbcc traceoff(-1, 4809)

go

 

dbcc freeproccache

go

 

declare @iTraceFlag int = 0

declare @iLoops int = 8

 

while(@iLoops > 0)

begin

set @iLoops = @iLoops -1

truncate table tblBCP

 

declare @dtStart datetime = GetUTCDate()

 

BULK INSERT tempdb..tblBCP FROM 'c:\temp\Ints.bcp' WITH (DATAFILETYPE = 'CHAR', FIELDTERMINATOR='~')

 

insert into tblTests values (@iTraceFlag, DATEDIFF(ms, @dtStart, GetUTCDate()) )

end

go

 

 

--        Disable use of SSE

dbcc traceon(-1, 4809)

go

 

dbcc freeproccache

go

 

declare @iTraceFlag int = 4809

declare @iLoops int = 8

 

while(@iLoops > 0)

begin

set @iLoops = @iLoops -1

truncate table tblBCP

 

declare @dtStart datetime = GetUTCDate()

 

BULK INSERT tempdb..tblBCP FROM 'c:\temp\Ints.bcp' WITH (DATAFILETYPE = 'CHAR', FIELDTERMINATOR='~')

 

insert into tblTests values (@iTraceFlag, DATEDIFF(ms, @dtStart, GetUTCDate()) )

end

go

 

select (([Avg No SSE] / [AvgSSE]) -1.0) * 100.0 as [PERCENT Faster], [Avg No SSE], [AvgSSE]

from

(

select (select avg(iMSDuration) * 1.0 from tblTests where iFlag = 0) as [AvgSSE],

   (select avg(iMSDuration) * 1.0 from tblTests where iFlag = 4809) as [Avg No SSE]

) as a

go
 

Actual Scenarios

SQL Server 2016 has been vetted by a wide range of customers.   The positive impact of these changes has been realized by:
 

  • Online retailer can load and respond to competitive price changes faster.
     
  • Credit Card fraud service improved response times from incoming data feed inception to detection.
     

Sample Results  (~10% faster)
 

Machine

32GB RAM, 4 Core Hyper-threaded enabled 2.8Ghz, SSD Storage

SQL Server

Out of the box, default installation

 

SSE Enabled

3157ms

SSE Disabled

3479ms

Comments (3)

  1. Sean McCown says:

    Are there any actual performance numbers? How much faster is faster? And what types of data can take advantage of this? Does it only matter for the example you gave?

  2. Glenn Berry says:

    Does the code take advantage of newer versions of SSE (like SSE2, SSE3, SSE4.1, SSE4.2, etc.) that are supported in modern processors?

  3. Chris Adkin says:

    The title of the article mentions SSE/AVX, can you please elaborate on the circumstances under which AVX instructions will be used instead of SSE, I ask this because the code provided explicitly refers to the use of a trace flag to enable/disable SSE.

Skip to main content