Part2: Creating a Custom SSMA Report

This is part 2 of the article for creating custom SSMA report using SSIS and SSRS. In the previous post, we discussed how to use SSIS package to extract XML files containing SSMA assessment report information into SQL Server table. This week, we will discuss how to parse the XML data.

First, let's create tables to hold the parsed data:

 CREATE TABLE tblSSMAReport_Object (
     Project VARCHAR(255),
     Session VARCHAR(26),
     ObjectID CHAR(36),
     Path VARCHAR(255),
     Category Varchar(32),
     Name Varchar(32),
     ConvertedWithError TINYINT,
     SourceStatement Varchar(max),
     PRIMARY KEY (Project, Session, ObjectID)
     )
  
 CREATE TABLE tblSSMAReport_MessageDetails (
     Project VARCHAR(255),
     Session VARCHAR(26),
     ObjectID CHAR(36),    
     Code CHAR(8), 
     SourceCodeLink VARCHAR(800),
     Description VARCHAR(MAX),
     PRIMARY KEY (Project, Session, ObjectID, Code)
     )

tblSSMAReport_Object table contains the information about the database object and its overall status. tblSSMAReport_MessageDetails table stores conversion message .

We will need to parse text in order to extract information based on relative location. I created a function which we will use several times in this example:

 CREATE FUNCTION [dbo].[UDF_ParsePath]
     (
     @string VARCHAR(800),
     @delimiter VARCHAR(16),
     @mode VARCHAR(6) = 'last',
     @keyword VARCHAR(16) = null
     )
 RETURNS VARCHAR(100)
 AS
 BEGIN
     IF CHARINDEX(@delimiter, @string) > 0
     BEGIN
         DECLARE 
             @first_position INT,
             @last_position INT 
         SET @first_position = 
         CASE  
             WHEN LOWER(@mode) = 'last' THEN 
                 LEN(REPLACE(@string,' ','|')) - 
                 (CHARINDEX(REVERSE(ISNULL(@keyword,@delimiter)), REVERSE(@string))  + 
                 LEN(REPLACE(ISNULL(@keyword,@delimiter),' ','|')) - 1) + 
                 LEN(REPLACE(@delimiter,' ','|'))
             WHEN LOWER(@mode) = 'after' AND CHARINDEX(@keyword, @string) > 0 THEN 
                 CASE 
                     WHEN CHARINDEX(@delimiter, @string, CHARINDEX(@keyword, @string) + LEN(REPLACE(@keyword,' ','|'))) = 0 
                         THEN 0 
                     ELSE
                         CHARINDEX(@delimiter, @string, CHARINDEX(@keyword, @string) + 
                         LEN(REPLACE(@keyword,' ','|')))+ LEN(REPLACE(@delimiter,' ','|')) 
                 END
             WHEN LOWER(@mode) = 'before' AND CHARINDEX(@keyword, @string) > 0 THEN 
                 LEN(REPLACE(@string,' ','|')) - 
                     (CHARINDEX(REVERSE(@delimiter), REVERSE(@string), 
                                 CHARINDEX(REVERSE(@keyword), REVERSE(@string)) + 
                                 LEN(REPLACE(@keyword,' ','|')) + 
                                 LEN(REPLACE(@delimiter,' ','|')))
                     ) + 2
             ELSE 0
         END
         SET @last_position = CASE 
             WHEN CHARINDEX(@delimiter, @string, @first_position) = 0 
                 THEN LEN(REPLACE(@string,' ','|')) + 1 
                 ELSE CHARINDEX(@delimiter, @string, @first_position) 
             END
         IF @first_position > 0   AND @last_position - @first_position > 0
             RETURN SUBSTRING(@string, @first_position, @last_position - @first_position)
     END
     RETURN NULL
 END

The UDF returns segment of the string based on the following parameter value:

  • Last: return an entire value of last segment (as identified by delimiter character) containing the keyword
  • After: return the segment after the segment containing the keyword
  • Before: return the segment before the segment containing the keyword

For example:

 DECLARE @string VARCHAR(255) = 'C:\SSMA\Projects\ProjectName\report\report_2010_09_22T09_34_43\GUID\path.xml'
 SELECT  
     dbo.UDF_ParsePath(@string, '\','before','report\')  as Project, 
     dbo.UDF_ParsePath(@string, '\','last','report')     as Session,
     dbo.UDF_ParsePath(@string, '\','before','path.xml') as ObjectID

results in the following:

Project Session ObjectID
ProjectName

report_2010_09_22T09_34_43

0027b853-988c-4be4-b937-9859503d014d

We can then use the user defined function to parse the XML and populate tblSSMAReport_Object:

 INSERT INTO tblSSMAReport_Object (ObjectID, Project, Session, Path)
 SELECT  ObjectID, Project, SESSION,
     MAX(CASE WHEN rn=6 THEN path + ' > ' ELSE '' END)
     + MAX(CASE WHEN rn=5 THEN path + ' > ' ELSE '' END)
     + MAX(CASE WHEN rn=4 THEN path + ' > ' ELSE '' END)
     + MAX(CASE WHEN rn=3 THEN path + ' > ' ELSE '' END)
     + MAX(CASE WHEN rn=2 THEN path + ' > ' ELSE '' END)
     + MAX(CASE WHEN rn=1 THEN path  ELSE '' END) 
     As PATH
  FROM  
     (SELECT     ROW_NUMBER() OVER (PARTITION BY FileName ORDER BY FileName ) as rn,
                 dbo.UDF_ParsePath(rpt.FileName, '\','before','report\') as Project,
                 dbo.UDF_ParsePath(rpt.FileName, '\','last','report') as Session,
                 dbo.UDF_ParsePath(rpt.FileName, '\','before','path.xml') as ObjectID,  
                 col.value('./@name','VARCHAR(100)') As Path
            FROM    tblRawData rpt
     CROSS APPLY rpt.ObjectPath.nodes('//path/node') Tab(col) ) AS Data
 GROUP BY ObjectID, Project, Session

The inner statement parse object path and XML values while the outer statement perform string aggregation to form the object path across multiple rows. The tblSSMAReport_Object table needs to be updated with additional attribute such as object category, name and conversion status. The object category can be parsed from the object path. I use the following UDF to help with the parsing:

 CREATE FUNCTION [dbo].[UDF_IdentifyObjectType]
     (
     @string VARCHAR(800) 
     )
 RETURNS VARCHAR(32)
 AS
 BEGIN
     DECLARE @ObjectType VARCHAR(32)
     DECLARE @ObjectTypes TABLE(Seq INT IDENTITY(1,1), Name VARCHAR(32), Match VARCHAR(32))
     INSERT INTO @ObjectTypes (Name, Match)
     VALUES ('packaged function', 'Packaged Functions'),('packaged type', 'Packaged Types'),('private packaged function', 'Private Packaged Functions'),('private packaged procedure', 'Private Packaged Procedures'),('private packaged type','Private Packaged Types'),('index','Indexes'),('trigger','Triggers'),('function','Functions'),('package','Packages'),('procedure','Procedures'),('sequence','Sequences'),('statement','Statements'),('synonym','Synonyms'),('table','Tables'),('user defined type','User Defined Types'),('view','Views')
     SELECT TOP 1 @ObjectType=  Name  from @ObjectTypes
     WHERE CHARINDEX(Match,@string) > 0
     ORDER BY Seq
     RETURN @ObjectType
 END
 GO

The function locates the specified keyword from the path and returns the first value found based on the sequence specified in the function.

For example:

SELECT [dbo].[UDF_IdentifyObjectType]('Schemas > HR > Tables > LOCATIONS > Indexes > LOC_COUNTRY_IX')

returns 'Index' even though both 'Tables' and 'Indexes' keyword exists, but since index appears first in the sequence, the index keyword is returned. With UDF_IdentifyObjectType created, I can now update the object table and cross apply with the value from cat.xml values in the tblRawData to get the conversion status.

 UPDATE  obj 
 SET
 obj.Category = dbo.UDF_IdentifyObjectType(Path),
 obj.Name = dbo.UDF_ParsePath(Path, ' > ','after',dbo.UDF_IdentifyObjectType(Path)),
 obj.ConvertedWithError = CASE WHEN NotCvt.NotConvertedCount > 0 THEN 1 ELSE 0 END 
 FROM tblSSMAReport_Object obj 
 LEFT JOIN tblRawData rawdata  ON dbo.UDF_ParsePath(rawdata.FileName, '\','before','src.sql.txt') = obj.ObjectID
     AND dbo.UDF_ParsePath(rawdata.FileName, '\','before','report\') = obj.Project
     AND dbo.UDF_ParsePath(rawdata.FileName, '\','last','report') = obj.Session
 LEFT JOIN (
 SELECT        dbo.UDF_ParsePath(rpt.FileName, '\','before','cat.xml') as ObjectID, 
     dbo.UDF_ParsePath(rpt.FileName, '\','before','report\') as Project,
     dbo.UDF_ParsePath(rpt.FileName, '\','last','report') as Session,
             col.value('./@category','VARCHAR(32)') As Category, 
             col.value('./@not-converted-count','INT') As NotConvertedCount 
 FROM        tblRawData rpt
 CROSS APPLY rpt.ObjectCategory.nodes('//statistics/statistic') Tab(col)) NotCvt ON NotCvt.ObjectID = obj.ObjectID
 AND NotCvt.Session = obj.Session and NotCvt.Project = obj.Project
 and   dbo.UDF_IdentifyObjectType(Path) = NotCvt.Category
 where obj.Category is null
  
  
 INSERT INTO tblSSMAReport_MessageDetails (Project, Session, ObjectID, Code, Description)
 SELECT     
     dbo.UDF_ParsePath(rpt.FileName, '\','before','report\') as Project,
     dbo.UDF_ParsePath(rpt.FileName, '\','last','report') as Session,
     dbo.UDF_ParsePath(rpt.FileName, '\','before','messages.xml') as ObjectID,
     LEFT(col.value('./@name','VARCHAR(max)'),8) as Code, 
     col.value('./@name','VARCHAR(max)') As Description
 FROM        tblRawData rpt
 CROSS APPLY rpt.ObjectMessage.nodes('//categories/category/node/record') Tab(col)

In the part 3 of this article, I will show an example of SSRS report which consume from the tables we just created.