I recently had the opportunity to build a data mart in order to analyse web logs from IIS servers. The solution utilised SQL Server 2012 Database Engine, Integration Services, Analysis Services and Excel 2010 (to slice/dice the data).
Loading the IIS logs (space-delimited text files) was fairly challenging because the column definitions would change throughout the files. I therefore used the .NET StreamReader class to read the metadata rows and detect changes in the fields.
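To illustrate the idea outside of SSIS, here is a minimal standalone sketch in Python (the actual load used a .NET StreamReader inside a Script Component): whenever a `#Fields:` directive appears in a W3C log, the column mapping is rebuilt before parsing subsequent data rows. The function name is my own, not from the original solution.

```python
# Sketch: handle IIS W3C logs whose field list changes mid-file.
# A "#Fields:" directive (re)defines the columns for the rows that follow.

def parse_iis_log(lines):
    """Yield one dict per data row, keyed by the currently active field names."""
    fields = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#Fields:"):
            # Metadata row: rebuild the column mapping.
            fields = line[len("#Fields:"):].split()
        elif line.startswith("#") or not line:
            continue  # other directives (#Software, #Date, ...) or blank lines
        else:
            yield dict(zip(fields, line.split(" ")))
```

The same pattern (watch for the metadata row, remap, carry on) is what the StreamReader-based Script Component implements.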
The data flow is shown below.
A snippet of the code in the Load Data (Source Script Component) is presented below.
Extracting the data from the file was relatively straightforward: I split each line into an array based on the spacing between fields. From the data, I was able to extract useful information such as the browser type (e.g. Safari, Chrome, IE) and even the browser version.
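As a rough illustration of the browser classification, the sketch below derives a browser name and version from an IIS cs(User-Agent) value (in the logs, spaces in the user-agent string are encoded as `+`). The patterns are illustrative only, not the exhaustive rules used in the solution:

```python
import re

def classify_browser(user_agent):
    """Return (browser_name, version) parsed from an IIS-logged user agent."""
    ua = user_agent.replace("+", " ")  # IIS logs encode spaces as '+'
    # Order matters: Chrome user agents also contain the token "Safari".
    for name, pattern in [
        ("IE", r"MSIE (\d+(?:\.\d+)?)"),
        ("Chrome", r"Chrome/(\d+(?:\.\d+)?)"),
        ("Safari", r"Version/(\d+(?:\.\d+)?).*Safari"),
    ]:
        match = re.search(pattern, ua)
        if match:
            return name, match.group(1)
    return "Other", None
```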
IP addresses were mapped to geolocations using the free GeoLite data from http://www.maxmind.com/app/geolite (CSV files imported into the database). I converted the longitude and latitude to the spatial geography data type and presented this against a world map, as I wanted to validate that the GeoLite data correctly mapped IP addresses to countries and cities.
USE BiKitchen;

DECLARE @SRID int = 4326;
DECLARE @pLat nvarchar(max);
DECLARE @pLong nvarchar(max);
DECLARE @g geography;

-- Check longitude and latitude for London
SET @pLat  = (SELECT CAST(Latitude  AS nvarchar(max)) FROM [GeoData].[GeoLiteCity-Location] WHERE locid = 13547);
SET @pLong = (SELECT CAST(Longitude AS nvarchar(max)) FROM [GeoData].[GeoLiteCity-Location] WHERE locid = 13547);

SET @g = geography::STPointFromText('POINT(' +
             @pLong + ' ' +
             @pLat + ')', @SRID).BufferWithTolerance(11000, 1000, 0);

SELECT @pLat AS Latitude, @pLong AS Longitude;
-- SELECT @g.Lat, @g.Long

-- Map the geography type to the base world map data
-- View the result in the spatial results tab to validate the coordinates
SELECT @g AS SpatialLocation
UNION ALL
SELECT geog FROM World_Borders;
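The GeoLite CSVs key their IP ranges on integer values, so joining the client IPs from the logs to a location record requires the standard dotted-quad-to-integer conversion. A minimal sketch of that lookup is below; the block structure `(start, end, locid)` mirrors the GeoLiteCity-Blocks CSV, but the function names are my own:

```python
def ip_to_int(ip):
    """Convert a dotted-quad IPv4 address to the integer form GeoLite uses."""
    a, b, c, d = (int(part) for part in ip.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d

def lookup_locid(ip, blocks):
    """blocks: iterable of (start_int, end_int, locid) rows, as in GeoLiteCity-Blocks."""
    n = ip_to_int(ip)
    for start, end, locid in blocks:
        if start <= n <= end:
            return locid
    return None
```

In the database itself, the equivalent join is a `BETWEEN` predicate on the block start/end columns against the converted IP integer.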
The star schema was built and deployed in SQL Server 2012 Analysis Services (UDM). I found named calculations to be an incredibly powerful way of extending the data model and making attributes more meaningful for end users.
The data was presented using Excel 2010; a screenshot is shown below. I found slicers to be extremely useful.
I thought it would be interesting to see what Excel 2013 had to offer, so I tried to create a Power View report, but this is not currently supported against the UDM. There are, however, some nice enhancements to the chart types, so I’ll be looking at these in more detail.