Load data to Azure Data Lake Store with high throughput

In general, the recommended approach is to use Azure Data Factory to copy data between Azure Data Lake Store and many other data sources, such as Azure Blob, SQL Server, and Azure SQL DW. It is code-free, and it handles performance optimization, resilience, and scheduling for you. You could even leverage multiple VMs to have an…


Column mappings in SqlBulkCopy are case sensitive

When you use SqlBulkCopy and set the ColumnMappings property, the column names are case sensitive regardless of the case sensitivity setting of the target DB. A generic error message, “The given ColumnMapping does not match up with any column in the source or destination”, is thrown in that case, which makes it very hard to identify what column…
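Because the error message does not name the offending column, one option is to compare the mapping names against the real column names with an exact, case-sensitive comparison before calling SqlBulkCopy. The following is only a minimal sketch: `ColumnMappingCheck` and `FindMismatches` are hypothetical names, not part of SqlBulkCopy, and the caller is assumed to supply the actual column names (for example, read from the destination table's schema).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not part of SqlBulkCopy): reports mapping names that
// have no exact, case-sensitive match among the actual column names.
public static class ColumnMappingCheck
{
    public static List<string> FindMismatches(
        IEnumerable<string> mappedNames,
        IEnumerable<string> actualColumns)
    {
        // Ordinal comparison mirrors the case-sensitive matching described above.
        var exact = new HashSet<string>(actualColumns, StringComparer.Ordinal);
        return mappedNames.Where(name => !exact.Contains(name)).ToList();
    }
}
```

Running the check against both the source and the destination column lists should point directly at the name whose casing differs.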


Be Careful when using Table-Valued Parameter

Recently we found two hidden traps when using Table-Valued Parameters, either of which can easily lead to unexpected behavior. The first is that passing decimal data requires additional precision/scale information; otherwise the decimal values are rounded as if they were long values. This is easy to miss because no exception is thrown, and we can only…


Build warning msb3247 and msb3276 and msb3277

We frequently run into build warnings when a project indirectly references the same assembly in different versions. I recently noticed that such version conflicts can cause three similar but distinct build warnings: MSB3247, MSB3276 and MSB3277. The messages are very similar, and all contain words like “Found conflicts between different versions of the same dependent assembly”. However, the solution…


Use PsTools to view certificate installed by System account

If you want to view certificates installed by the System account on an Azure VM, connecting to the VM remotely does not show them, because you are logged in as a remote account. I just found an easy way to do this by using PsTools. Download PsTools from https://technet.microsoft.com/en-us/sysinternals/bb897553.aspx. Copy the bits to the VM…


Use Data-Driven Test for Parameterized Tests

In daily development, we often need to write parameterized test cases. I used to use the following two simple methods, Method 1 and Method 2, to meet such requirements, but their shortcomings are obvious. The first method works for a few variations, but it is inefficient if there are too many parameters. The second…


A quite efficient compression algorithm: Blosc

I recently learned about a compression algorithm that was new to me, Blosc (http://www.blosc.org/trac). It has amazing performance, but a poor compression ratio (by design). I have done a performance comparison between Blosc and the .NET built-in Deflate compression algorithm. Here is my PC environment: CPU: Intel Xeon CPU E5-1620 3.6 GHz; RAM: 16.0 GB. For Blosc,…
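For the Deflate side of such a comparison, a measurement could look roughly like the sketch below. It is not the benchmark used in the post: `DeflateBenchmark`, `Compress` and `Measure` are hypothetical names, and it only times a single in-memory pass with the built-in `DeflateStream`.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.IO.Compression;

// Hypothetical helper: measures compression ratio and throughput of the
// built-in DeflateStream for one in-memory buffer.
public static class DeflateBenchmark
{
    public static byte[] Compress(byte[] input)
    {
        using (var output = new MemoryStream())
        {
            // leaveOpen keeps the MemoryStream usable after the DeflateStream
            // is disposed, which is what flushes the final compressed bytes.
            using (var deflate = new DeflateStream(output, CompressionMode.Compress, true))
            {
                deflate.Write(input, 0, input.Length);
            }
            return output.ToArray();
        }
    }

    // Returns (original size / compressed size, MB of input processed per second).
    public static (double ratio, double mbPerSecond) Measure(byte[] input)
    {
        var watch = Stopwatch.StartNew();
        byte[] compressed = Compress(input);
        watch.Stop();

        // Guard against a zero elapsed time on very small inputs.
        double seconds = Math.Max(watch.Elapsed.TotalSeconds, 1e-9);
        double ratio = (double)input.Length / compressed.Length;
        double mbPerSecond = input.Length / (1024.0 * 1024.0) / seconds;
        return (ratio, mbPerSecond);
    }
}
```

A single pass like this is noisy, so averaging several runs over representative data would give more trustworthy numbers.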


Preliminary findings to get good performance of SqlBulkCopy

We want to develop a C# application that needs to load data into a local DB quickly, and found that our initial SqlBulkCopy implementation had poor performance (only 14 MB/s). Here are my findings from trying to improve it. Test baseline: implement a mock data reader. The pure read throughput is several GB/s, which won’t…


Study Notes of "Pattern-Oriented Design"

Here are my study notes from a training I recently attended, “Pattern-Oriented Design”. The lecturer is Amir Kolsky. Six Do’s: Do the right thing   Process/PMs Do it right Understanding Do it efficiently Do it safely Do it predictably (in estimated time) The above three items are about design, which is the developer’s responsibility. Do it sustainably Manager…


Azure Storage Blobs: the API DownloadToStream is always faster than OpenRead

An interesting finding when using Azure Storage is that the API ICloudBlob.DownloadToStream is always over 35% faster than ICloudBlob.OpenRead, regardless of whether the blob is a page blob or a block blob. The Storage Client Library we use is 2.0.0. Here is the code for the two APIs: DownloadToStream: blob.DownloadToStream(stream); OpenRead: var blobStream = blob.OpenRead(); blobStream.CopyTo(stream); The following table…