IIS SEO Toolkit – Report Comparison

One of my favorite features in the IIS Search Engine Optimization (SEO) Toolkit is what we call Report Comparison. Report Comparison lets you compare two different versions of the results of crawling the same site to see what changed in between. This is a really convenient way to track not only changes in SEO violations but also changes in any attribute of the pages, such as Title, Heading, Description, Links, Violations, etc.

How to access the feature

There are a couple of ways to get to this feature.

1) Use the Compare Reports task. While in the Site Analysis Reports listing you can select two reports by using Ctrl+Click, and if both reports are compatible (e.g. they use the same Start URL) the "Compare Reports" task will be shown. Clicking it opens the comparison.

CompareReportsTask

2) Use the Compare to another report menu item. While in the Dashboard view of a Report you can use the "Report->Compare To Another Report" menu item which will show a dialog where you can either select an existing report or even start a new analysis to compare with.

CompareReportsMenu

Report Comparison Page

In both cases you will get the Report Comparison Page displaying the results as shown in the next image.

CompareResults

The Report Comparison page includes a couple of "sections" with data. At the very top it includes links showing the Name of each report and the Date when it was run. Clicking one of them opens that report directly, just as if you had used the Site Analysis report listing view.

The next section shows a lot of interesting built-in data, such as:

Total # of URLs: The total # of URLs found in each version. When clicking a link you will get the listing of URLs based on the version of the report you chose.
New and Removed: The number of URLs that were added in the new version and the number that were removed from the old version. When clicking the added link you will get the listing of URLs based on the new version of the report, and if you click the removed link you will get the listing based on the old version.
Changed and Unchanged: The number of URLs that were modified or not modified. These are calculated by comparing the hashes of the files in both versions. When clicking the links you will get a query that displays a comparison of both versions of the URLs showing their content length. (See below)
Total # of Violations: The total # of violations found in each version.
New in existing pages and Fixed in existing pages: The number of violations introduced or removed on URLs that exist in both reports. When clicking the added link you will get the listing of violations based on the new version of the report, and if you click the removed link you will get the listing based on the old version.
Introduced in new pages: The number of violations introduced on URLs that are found only in the new report. When clicking the link you will get the listing of violations based on the new version of the report.
Fixed by page removal: The number of violations that were removed because their URLs were no longer found in the new report. When clicking the link you will get the listing of violations based on the old version of the report.
Others: There are a number of additional reports that compare different attributes of URLs that are found in both reports. They compare things like Time Taken, Content Length, Status Code and # of Links. When clicking the links you will get a query that displays a comparison of both versions of the reports showing the relevant fields. (See below)
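
To make the counts above more concrete, here is a minimal sketch in Python of how such numbers could be derived from two crawl results. It is not the Toolkit's actual implementation: the `Page` type, the `compare_reports` function, the use of SHA-256 as the hash, and the representation of violations as a set of names per URL are all assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Page:
    """One crawled URL (hypothetical shape): raw content plus violation names."""
    content: bytes
    violations: set[str] = field(default_factory=set)


def content_hash(content: bytes) -> str:
    # SHA-256 is an assumption; the article only says that hashes are compared.
    return hashlib.sha256(content).hexdigest()


def compare_reports(old: dict[str, Page], new: dict[str, Page]) -> dict[str, int]:
    """Return the summary counts for two reports keyed by URL."""
    old_urls, new_urls = set(old), set(new)
    common = old_urls & new_urls      # URLs that exist in both reports
    added = new_urls - old_urls       # URLs only in the new report
    removed = old_urls - new_urls     # URLs only in the old report
    changed = {
        url for url in common
        if content_hash(old[url].content) != content_hash(new[url].content)
    }
    return {
        "new": len(added),
        "removed": len(removed),
        "changed": len(changed),
        "unchanged": len(common) - len(changed),
        # Violations on URLs present in both reports
        "new_in_existing_pages": sum(
            len(new[u].violations - old[u].violations) for u in common),
        "fixed_in_existing_pages": sum(
            len(old[u].violations - new[u].violations) for u in common),
        # Violations that appear or disappear together with their URL
        "introduced_in_new_pages": sum(len(new[u].violations) for u in added),
        "fixed_by_page_removal": sum(len(old[u].violations) for u in removed),
    }
```

The key point is that every violation count is tied to which set its URL falls into (both reports, only new, only old), which is why a fixed violation and a removed page are reported separately.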

Whenever you click the links you get a query dialog that you can customize just like any query in the Query Builder, where you can add or remove columns, add filters, etc.

My favorite one is the "Modified URLs" source, where you can add filters that compare the URLs coming from the two different reports.

QueryDialog
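
Conceptually, a filter over the "Modified URLs" source is a condition evaluated against the old and new versions of the same URL. The Python sketch below illustrates that idea only; it is not how the Toolkit implements queries, and the `UrlInfo` type, its fields, and the `modified_urls` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class UrlInfo:
    """A few per-URL attributes (hypothetical shape)."""
    status_code: int
    content_length: int
    time_taken_ms: int


def modified_urls(
    old: dict[str, UrlInfo],
    new: dict[str, UrlInfo],
    predicate: Callable[[UrlInfo, UrlInfo], bool],
) -> list[tuple[str, UrlInfo, UrlInfo]]:
    """Join both reports on URL and keep rows where predicate(old, new) holds."""
    rows = [
        (url, old[url], new[url])
        for url in old.keys() & new.keys()
        if predicate(old[url], new[url])
    ]
    return sorted(rows, key=lambda row: row[0])  # stable, URL-ordered output
```

For example, `modified_urls(old, new, lambda o, n: o.status_code != n.status_code)` would list the URLs whose status code differs between the two crawls.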

Note that when you double-click any of the rows, or use "right-click -> Compare Details", you get a side-by-side comparison of everything in the URL:

SideBySideDialog

Again, you can use any of the tabs to see things side by side, such as the Content of the pages, the Links both versions have, the Violations, or pretty much anything else that you can see for a single URL.

SideBySideDialog2

Finally, you can also right-click in the Query dialog and choose "Compare Contents". This will launch whatever file comparison tool you have configured using "Edit Feature Settings". In this case I have configured WinDiff.exe, which shows something like:

SideBySideContents

Summary

As you can see, Report Comparison is a powerful feature that lets you keep track of changes between two different reports, so you can understand how your site has been affected by changes over time. Site managers can use it to query and maintain a history of all the changes. You can imagine an automated build process that runs an IIS SEO Toolkit crawl whenever a build is made, stores the report somewhere, and potentially annotates it with the build number; with that in place you could even correlate changes in code with the results of crawling the Web site.
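
As a rough illustration of that last idea, the Python sketch below files a report under its build number and keeps a small index of what was archived. It assumes the crawl has already produced a report file on disk; how that file is produced and what format it has are outside this sketch, and the `archive_report` function, the directory layout and the `index.json` file are all hypothetical.

```python
import json
import shutil
from datetime import datetime, timezone
from pathlib import Path


def archive_report(report_file: Path, build_number: str, archive_dir: Path) -> Path:
    """Copy a saved report into archive_dir/<build_number>/ and index it."""
    target_dir = archive_dir / build_number
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / report_file.name
    shutil.copy2(report_file, target)

    # index.json (hypothetical) lists every archived report, oldest first.
    index_path = archive_dir / "index.json"
    index = json.loads(index_path.read_text()) if index_path.exists() else []
    index.append({
        "build": build_number,
        "report": target.relative_to(archive_dir).as_posix(),
        "archived_at": datetime.now(timezone.utc).isoformat(),
    })
    index_path.write_text(json.dumps(index, indent=2))
    return target
```

A build script could call this after each crawl, and any two archived reports for the same site could then be picked out by build number and compared.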