Preliminary findings on getting good performance from SqlBulkCopy

We want to develop a C# application that needs to load data into a local DB quickly, and found that the initial SqlBulkCopy implementation had poor performance (only 14 MB/s). Here are my findings from trying to improve it.

 

Test Baseline:

  • Implement a mock data reader. Its pure read throughput is several GB/s, so reading won't be the bottleneck for DB insertion.
    • The schema is 4 string columns, and each field is a Guid.
    • The data size is set to 1.6 GB, around 10 million rows.
  • Use SqlBulkCopy.WriteToServerAsync(reader, token) to insert the data, with BatchSize hard-coded to 1000 (see the sketch after this list).
  • The throughput is 14 MB/s.
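
For reference, below is a minimal sketch of this baseline. The connection string, table name (dbo.BulkTarget), and column names are hypothetical, and the mock reader does real work only in the IDataReader members that SqlBulkCopy exercises in this scenario; the rest simply throw.

    using System;
    using System.Data;
    using System.Data.SqlClient;
    using System.Threading;
    using System.Threading.Tasks;

    // Mock reader that emits 4 Guid-string columns per row, matching the
    // schema above. Read/FieldCount/GetValue carry the real traffic.
    sealed class MockGuidReader : IDataReader
    {
        private readonly long _rows;
        private long _pos = -1;
        private readonly string[] _row = new string[4];

        public MockGuidReader(long rows) { _rows = rows; }

        public int FieldCount { get { return 4; } }

        public bool Read()
        {
            if (++_pos >= _rows) return false;
            for (int i = 0; i < _row.Length; i++) _row[i] = Guid.NewGuid().ToString();
            return true;
        }

        public object GetValue(int i) { return _row[i]; }
        public string GetString(int i) { return _row[i]; }
        public int GetValues(object[] values) { _row.CopyTo(values, 0); return _row.Length; }
        public string GetName(int i) { return "Col" + i; }
        public int GetOrdinal(string name) { return int.Parse(name.Substring(3)); }
        public Type GetFieldType(int i) { return typeof(string); }
        public bool IsDBNull(int i) { return false; }

        public object this[int i] { get { return GetValue(i); } }
        public object this[string name] { get { return GetValue(GetOrdinal(name)); } }
        public int Depth { get { return 0; } }
        public bool IsClosed { get { return false; } }
        public int RecordsAffected { get { return -1; } }
        public bool NextResult() { return false; }
        public void Close() { }
        public void Dispose() { }

        // Not exercised by SqlBulkCopy for this string-only schema.
        public DataTable GetSchemaTable() { throw new NotSupportedException(); }
        public string GetDataTypeName(int i) { throw new NotSupportedException(); }
        public bool GetBoolean(int i) { throw new NotSupportedException(); }
        public byte GetByte(int i) { throw new NotSupportedException(); }
        public long GetBytes(int i, long fieldOffset, byte[] buffer, int bufferOffset, int length) { throw new NotSupportedException(); }
        public char GetChar(int i) { throw new NotSupportedException(); }
        public long GetChars(int i, long fieldOffset, char[] buffer, int bufferOffset, int length) { throw new NotSupportedException(); }
        public IDataReader GetData(int i) { throw new NotSupportedException(); }
        public DateTime GetDateTime(int i) { throw new NotSupportedException(); }
        public decimal GetDecimal(int i) { throw new NotSupportedException(); }
        public double GetDouble(int i) { throw new NotSupportedException(); }
        public float GetFloat(int i) { throw new NotSupportedException(); }
        public Guid GetGuid(int i) { throw new NotSupportedException(); }
        public short GetInt16(int i) { throw new NotSupportedException(); }
        public int GetInt32(int i) { throw new NotSupportedException(); }
        public long GetInt64(int i) { throw new NotSupportedException(); }
    }

    static class BaselineTest
    {
        static async Task Main()
        {
            // Hypothetical local connection string and target table.
            using (var conn = new SqlConnection("Server=(local);Database=PerfTest;Integrated Security=true"))
            {
                conn.Open();
                using (var bulk = new SqlBulkCopy(conn))
                using (var reader = new MockGuidReader(10000000)) // ~10 million rows
                {
                    bulk.DestinationTableName = "dbo.BulkTarget";
                    bulk.BatchSize = 1000; // hard-coded, as in the baseline
                    await bulk.WriteToServerAsync(reader, CancellationToken.None);
                }
            }
        }
    }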

 

Improvement 1:

  • Use the sync API in SqlBulkCopy instead of the async API (see the snippet below).
  • The improved throughput is 30 MB/s.
    • The sync API performs better than the async one, which matches my earlier finding with SqlDataReader.
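
A sketch of this change, reusing MockGuidReader and the hypothetical table from the baseline sketch; everything stays the same except the write call.

    // Improvement 1: identical setup to the baseline, but the synchronous
    // WriteToServer replaces WriteToServerAsync.
    static void RunSync(SqlConnection conn)
    {
        using (var bulk = new SqlBulkCopy(conn))
        using (var reader = new MockGuidReader(10000000))
        {
            bulk.DestinationTableName = "dbo.BulkTarget";
            bulk.BatchSize = 1000;
            bulk.WriteToServer(reader); // sync call instead of await WriteToServerAsync(...)
        }
    }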

 

Improvement 2:

  • Use adaptive BatchSize logic based on row size, instead of a hard-coded value (see the sketch after this list).
    • My logic sets BatchSize to (8 MB) / (row size), where the row size can be estimated in our application.
  • The further improved throughput is 42 MB/s.
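
Below is a sketch of this rule. BatchSizing and ForRowSize are hypothetical names, and reading "8 MB" as 8 × 1024 × 1024 bytes is an assumption; the 160 bytes/row example follows from 1.6 GB over ~10 million rows.

    using System;

    // Adaptive BatchSize: aim for a roughly constant number of bytes per
    // batch, with the per-row size estimated by the application.
    static class BatchSizing
    {
        private const int TargetBatchBytes = 8 * 1024 * 1024; // ~8 MB per batch (assumed interpretation)

        public static int ForRowSize(int estimatedRowBytes)
        {
            return Math.Max(1, TargetBatchBytes / Math.Max(1, estimatedRowBytes));
        }
    }

    // Usage: bulk.BatchSize = BatchSizing.ForRowSize(160); // ≈ 52,428 rows per batch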

 

Row Size and Performance:

  • Ran some tests with different schemas. Throughput can reach 60 MB/s when the row size is larger than 100 bytes.
  • For very small rows (e.g. rows containing only 4 int values), throughput drops to about 35 MB/s (see the worked examples below).
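
For context, here is how the Improvement 2 rule (BatchSizing from the sketch above) plays out at these two row sizes; the 16 bytes/row figure assumes 4 four-byte int columns and ignores per-row overhead.

    // How the adaptive rule scales with row size:
    int wideBatch = BatchSizing.ForRowSize(160); // 4 Guid strings -> 52,428 rows/batch
    int tinyBatch = BatchSizing.ForRowSize(16);  // 4 int columns  -> 524,288 rows/batch

Since the small-row case still measures slower even with a proportionally larger batch, per-row overhead, rather than batch sizing, is likely the limiter there.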