In the August CTP of SQL Server 2008 R2, we introduced the map report item in Reporting Services. As you might already be aware, the map in SSRS supports the spatial data types and functions that are available in SQL Server. In this post, we’ll look at what you can aggregate spatial data for your reporting needs.
In this scenario, we are going to build a map of the United States broken into regions (West, Central, and East).
Step 1: Getting the Map Data
To start with, we need to find a set of maps to support this exercise. The subject of finding maps is one of the most common questions we receive when people are using the map report item. In this case, all of the maps we need can be found on the http:///www.census.gov web site.
- The state files are located at :http://www.census.gov/cgi-bin/geo/shapefiles/national-files
- The county files are located at: http://www.census.gov/geo/www/cob/co2000.html
For those of you who are already working with the map report item in SSRS, at this point you might be asking yourself “Why not just use the included maps in the map gallery for this effort?”. We certainly could do that, however, there’s another wrinkle to this scenario that would make this approach difficult and we’ll discuss that later in this post.
Step 2: Converting Shapefiles into SQL Server Spatial Data
Now that we have the shape files for states and counties, how do we get them into SQL Server spatial? To do this, we will use a tool from http://www.sharpgis.net/page/Shape2SQL.aspx called “Shape2SQL”. All you need to do here is point the tool at the shapefiles you downloaded previously, and now your data is stored as spatial data types in SQL Server.
Step 3: Creating the Regions
Now that we have the US states and counties in our database as SQL spatial columns, we need to address how to create the regions. Below is the image of the map that shows the regions we need to create.
I suspect that this will be a very common requirement when building map reports. You’ll have a set of shapefile data based on political units, but your company or organization has units of analysis that are made up of these units. These units of analysis, in our case regions, we’ll change over time. Stores move between regions, regions get realigned, etc. Do you want to have to create new shapefiles every time this happens? I doubt it, plus if you have dimensional tables in your data warehouse that track the composition of these regions, wouldn’t it be great to just rely on those definitions that are already defined. We have a couple of options we can rely on here.
Step 3: (Option 1) Aggregate Polygons in SSRS
One of the features that we added in this release is support for the UNION() aggregate on spatial data types directly in Reporting Services.
Let’s look at how we would do this. Add a dataset to your report using the following query based on the database this article links to. This is simply a join between a table that defines our regions and the data that stores the spatial data for our states. I’m taking the liberty of removing the non continental states from the query, but that’s just for the sake of the appearance in the map :).
FROM dbo.Regions r, test s
WHERE r.statefips= s.statefp
and stateFIPS < 60 and State <> ‘Alaska’
ORDER by statefips desc
Now, add a map to the report. You could use the wizard here, but to demonstrate this particular feature I would just insert a blank map.
On the map, choose the option to add a Polygon Layer.
Next, open up the Layer Data property page by right-clicking on the Polygon Layer. On the General page, set the spatial field to an expression below.
The Union() aggregate now supports SQL spatial data types! Great, but we need to group it by something.
To do the grouping, open up the property grid and under the data node, open up the Group Property Page.
Add a group based on the [Region] field.
Now, when you run the report, you should have three distinct regions (West, Central, and East).
I would like to thank Robert Bruckner, who has a great blog over at http://blogs.msdn.com/robertbruckner/, who championed and built the prototype of this very cool feature.
Step 3: (Option 2) Aggregate Polygons in SQL Server
Support for the UNION() aggregate in SSRS is a great feature, but you will notice that if you are aggregating a lot of polygons at report run-time, that your report is going to take awhile to run. This aggregate is not cheap. If you are looking for a best practice in this area, I would recommend that you look at aggregating this data in SQL Server and storing the results of those aggregations in your database, just as if you were building aggregate tables in a traditional data warehouse.
While there is an STUnion() function in SQL Server spatial, it operates on two spatial objects, so you can’t group multiple items and use a traditional GROUP BY query in SQL. Thankfully, the SQL Spatial team has built a function library that can be added to SQL Server that will enable you to use a standard GROUP BY clause in your queries! You can download this function library at:
Follow the instructions for enabling the functions on your SQL Server, and when you’re done, you should be able to write a query like this:
MAX(Region), dbo.GeographyUnionAggregate(s.geom.Reduce(2)) as geo2
FROM dbo.Regions r, USStates s
WHERE r.statefips= s.statefp
and s.statefp < 60
GROUP BY r.region
This query is a join against the table that defines our regions and the table that contains the imported shape data that is now stored in SQL spatial.
The reduce function that is included above is designed to work around a current issue in the product in which the way the globe is projected in SQL Server 2008 spatial.
Here is what the query result looks like.
You can include this query as part of your SSIS packages in which you do your ETL operations. Then, in your report, you can you can write a simple SELECT statement against this new table that returns the aggregated shapes in your query. You can then add a layer to your map based on this dataset. This would give you the best performance at report run-time.
I’ve added the database backup and report that shows the RDL UNION() aggregate to the location below.