Building a data mart to analyse web log traffic

I recently had the opportunity to build a data mart to analyse web logs from IIS servers. The solution utilised the SQL Server 2012 Database Engine, Integration Services, Analysis Services and Excel 2010 (to slice and dice the data).

Loading the IIS logs (space-delimited text files) was fairly challenging because the column definitions would alter throughout a file. I therefore used the .NET StreamReader class to read the metadata and detect changes in the fields.

The data flow is shown below.


A snippet of the code in the Load Data (Source Script Component) is presented below.

Code Snippet

    // Get variables
    strSourceFile = Variables.vCurrentSourceFileName;
    intAuditLogFileIdentifier = Variables.vAuditExecutionFileIdentifier;

    try
    {
        // Create an instance of StreamReader to read from a file.
        // The using statement also closes the StreamReader.
        using (StreamReader sr = new StreamReader(strSourceFile))
        {
            String line;
            int intNumberOfFields = 0;
            string[] strListOfFields = null;

            Trace.WriteLine("Log File: " + strSourceFile);

            // Output the source file name as the first line (debugging purposes)
            OutputLogFileRawDataBuffer.AddRow();
            OutputLogFileRawDataBuffer.colRawData = strSourceFile;

            // Read lines from the file until the end of the file is reached.
            while ((line = sr.ReadLine()) != null)
            {
                // IIS writes metadata as comment lines; the #Fields: directive
                // lists the current column layout, which can change mid-file.
                if (line.StartsWith("#Fields:"))
                {
                    strListOfFields = line.Substring("#Fields:".Length).Trim().Split(' ');
                    intNumberOfFields = strListOfFields.Length;
                }

                // Pass each raw line downstream for parsing.
                OutputLogFileRawDataBuffer.AddRow();
                OutputLogFileRawDataBuffer.colRawData = line;
            }
        }
    }
    catch (Exception)
    {
        // Fail the file; SSIS will log the error.
        throw;
    }


Extracting the data from the file was relatively straightforward. I placed each line into an array based on the spacing between fields. From the data, I was able to extract useful information such as the browser type (e.g. Safari, Chrome, IE) and even the browser version.
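As a minimal sketch of that approach (the helper names here are hypothetical, not the actual script-component code): split each data line on spaces, pair the values with the field names captured from the #Fields: directive, and do a rough user-agent match for the browser family.

```csharp
using System;
using System.Collections.Generic;

static class LogLineParser
{
    // Pair each value in a space-delimited W3C log line with its field name.
    public static Dictionary<string, string> Parse(string[] fieldNames, string line)
    {
        string[] values = line.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        var row = new Dictionary<string, string>();
        for (int i = 0; i < fieldNames.Length && i < values.Length; i++)
            row[fieldNames[i]] = values[i];
        return row;
    }

    // Rough browser-family detection from the cs(User-Agent) value.
    // Note IIS replaces spaces in the user-agent string with '+'.
    public static string BrowserFamily(string userAgent)
    {
        if (userAgent.Contains("MSIE") || userAgent.Contains("Trident")) return "IE";
        if (userAgent.Contains("Chrome")) return "Chrome";
        if (userAgent.Contains("Safari")) return "Safari";
        return "Other";
    }
}
```

The Chrome check must come before the Safari check, because Chrome user-agent strings also contain the token "Safari".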

IP addresses were mapped to geolocations using the free GeoLite data (CSV files imported into the database). I converted the longitude and latitude to the spatial geography data type and presented the points against a world map, to validate that the GeoLite data mapped IP addresses to the correct country/city.

    USE BiKitchen;

    DECLARE @SRID int = 4326
    DECLARE @pLat nvarchar(max)
    DECLARE @pLong nvarchar(max)
    DECLARE @g geography

    -- Check longitude and latitude for London
    SET @pLat = (SELECT CAST(Latitude AS nvarchar(max)) FROM [GeoData].[GeoLiteCity-Location] WHERE locid = 13547)
    SET @pLong = (SELECT CAST(Longitude AS nvarchar(max)) FROM [GeoData].[GeoLiteCity-Location] WHERE locid = 13547)

    SET @g = geography::STPointFromText('POINT(' +
            @pLong + ' ' +
            @pLat + ')', @SRID).BufferWithTolerance(11000, 1000, 0)

    SELECT @pLat AS Latitude, @pLong AS Longitude
    -- SELECT @g.Lat, @g.Long

    -- Map the geography type to the base world map data
    -- View the result in the spatial tab to validate the coordinates
    SELECT @g AS SpatialLocation
    UNION ALL SELECT geog FROM World_Borders
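The lookup in the other direction — from a logged client IP to a GeoLite location — can be sketched as below. This assumes the GeoLite blocks CSV was imported alongside the location table as [GeoData].[GeoLiteCity-Blocks] (startIpNum, endIpNum, locId); the table and column names are assumptions based on the standard GeoLite CSV layout, not taken from the project.

```sql
-- Hypothetical IP-to-location lookup against the imported GeoLite CSVs.
DECLARE @ip varchar(15) = '212.58.244.20';

-- GeoLite stores IPv4 addresses as 32-bit integers:
-- a.b.c.d -> a*16777216 + b*65536 + c*256 + d
DECLARE @ipNum bigint =
      CAST(PARSENAME(@ip, 4) AS bigint) * 16777216
    + CAST(PARSENAME(@ip, 3) AS bigint) * 65536
    + CAST(PARSENAME(@ip, 2) AS bigint) * 256
    + CAST(PARSENAME(@ip, 1) AS bigint);

SELECT TOP (1) l.country, l.city, l.Latitude, l.longitude
FROM [GeoData].[GeoLiteCity-Blocks] b
JOIN [GeoData].[GeoLiteCity-Location] l ON l.locid = b.locId
WHERE @ipNum BETWEEN b.startIpNum AND b.endIpNum;
```

PARSENAME is a convenient trick here because it splits any dot-separated string into up to four parts, numbered from the right.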


The star schema was built and deployed in SQL Server 2012 Analysis Services (UDM). I found named calculations to be an incredibly powerful way of extending the data model and making attributes more meaningful for end users.
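A named calculation in the data source view is simply a SQL expression evaluated against the underlying table. As a hypothetical illustration (the [sc-status] column name follows the IIS log field, not the project's actual schema), one could bucket HTTP status codes into friendlier groups for browsing in the cube:

```sql
-- Hypothetical named calculation: group sc-status into readable categories.
CASE
    WHEN [sc-status] BETWEEN 200 AND 299 THEN 'Success'
    WHEN [sc-status] BETWEEN 300 AND 399 THEN 'Redirect'
    WHEN [sc-status] BETWEEN 400 AND 499 THEN 'Client Error'
    WHEN [sc-status] BETWEEN 500 AND 599 THEN 'Server Error'
    ELSE 'Other'
END
```

The expression then surfaces as an ordinary attribute for end users, with no ETL change required.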


The data was presented using Excel 2010; a screenshot is shown below. I found slicers to be extremely useful.




I thought it would be interesting to see what Excel 2013 had to offer, so I tried to create a Power View report, but this is not currently supported against the UDM. There are, however, some nice enhancements to chart types, so I’ll be looking at these in more detail.


Comments (4)

  1. Anonymous says:

    ah. but did you get a good mark for it 🙂

  2. Glenn Wilson says:

    Any chance of publishing the project?

  3. Anonymous says:

    Very nice, superlatch.

  4. Anonymous says:

    I have also created a comprehensive series (8 parts) on how to build a data mart from scratch using MS BI stack. You can view the introductory post (with the remaining 7 as per links provided) @
