Comparing data between two tables in SQL Server

As a database developer sometime or the other you might have come across this requirement and it can be achieved using many ways including built-in features like Checksum and TableDiff utility. However, I would like to share an alternate way to achieve the objective, which might be useful in some situations.

Usually the intention of comparison is to find out the missing rows from either of the tables or both. This where the operator "EXCEPT" comes very handy. Let us see it with an example.

--Create two Tables--

CREATE TABLE TableA(ID Int, Name Varchar(256))

GO

CREATE TABLE TableB(ID Int, Name Varchar(256))

GO

INSERT INTO TableA VALUES(1,'A'),(2,'B'),(3,'D')

INSERT INTO TableB VALUES(1,'A'),(2,NULL),(3,'C')

--Get rows from TableA that are not found in TableB--

SELECT * FROM TableA

EXCEPT

SELECT * FROM TableB

--Get rows from TableB that are not found in TableA--

SELECT * FROM TableB

EXCEPT

SELECT * FROM TableA

I have used the above concept for comparing data between heavy tables and the performance is satisfactory. I could get the result in 3 Min 55 sec when I compared a table having 1.18 million rows. The comparison can't be performed for some of the data types of the columns like XML, Text and Image etc. So, I have written a generic stored procedure that does the comparison between two tables by excluding the columns that can't be compared. It also allows to compare tables across databases and schemas. Validations and error handling are not added as the intention is to just present the idea.

----------------------------------------------------------------------------------------------------------------------

CREATE PROCEDURE CompareTableData

(

@SourceDB sysname

,@SourceSchema sysname

,@SourceTable sysname

,@TargetDB sysname

,@TargetSchema sysname

,@TargetTable sysname

)

AS

BEGIN

SET NOCOUNT ON

DECLARE @SQL NVarchar(Max)

DECLARE @ColList Varchar(Max)

--Concatenate the column list by excluding the data types that can't be used in comparision--

SET @SQL = 'SELECT @ColList = CASE WHEN @ColList IS NULL THEN '''' ELSE @ColList + '','' END + SC.Name FROM ' + @SourceDB + '.sys.columns SC '

SET @SQL = @SQL + ' INNER JOIN ' + @SourceDB + '.sys.Types ST ON SC.system_type_id = ST.system_type_id WHERE object_id = object_id(@SourceTable) '

SET @SQL = @SQL + ' AND ST.Name NOT IN (''xml'',''Text'',''Image'',''Geometry'',''Geography'')'

EXEC sp_executesql @SQL,N'@ColList varchar(Max) OUTPUT,@SourceTable sysname',@ColList OUTPUT,@Sourcetable

--Get the rows that are missing from Target table--

SET @SQL = 'SELECT ''' + @TargetTable + ''' as RowMissingFrom,' + @ColList + ' FROM ' + @SourceDB + '.' + @SourceSchema + '.' + @SourceTable

SET @SQL = @SQL + ' EXCEPT '

SET @SQL = @SQL + ' SELECT ''' + @TargetTable + ''' as RowMissingFrom,' + @ColList + ' FROM ' + @TargetDB + '.' + @TargetSchema + '.' + @TargetTable

EXEC sp_executesql @SQL,N'@ColList varchar(Max)',@ColList

--Get the rows that are missing from Source table--

SET @SQL = ' SELECT ''' + @SourceTable + ''' as RowMissingFrom,' + @ColList + ' FROM ' + @TargetDB + '.' + @TargetSchema + '.' + @TargetTable

SET @SQL = @SQL + ' EXCEPT '

SET @SQL = @SQL + 'SELECT ''' + @SourceTable + ''' as RowMissingFrom,' + @ColList + ' FROM ' + @SourceDB + '.' + @SourceSchema + '.' + @SourceTable

EXEC sp_executesql @SQL,N'@ColList varchar(Max)',@ColList

END

-----------------------------------------------------------------------------------------------------------

Sample usgae of the procedure is:  EXEC  CompareTableData 'DB1','dbo','TableA','DB2','HR','TableB'

 Please share your feedback if you find this interesting.