For my #SQLPASS Summit 2012 talk, SQLCAT: Big Data – All Abuzz About Hive (slides available to all | recording available to PASS Summit 2012 attendees), I showed a mash-up of Hive, SQL Server, and Excel data that had been imported into PowerPivot and then displayed via Power View in Excel 2013 (using the new SharePoint-free self-service option). PowerPivot brings the new world of unstructured data from Hadoop together with structured data from more traditional relational and multi-dimensional sources, so you can gain new business insights and break down data silos. We took very recent data from Hurricane Sandy, which occurred the week before the PASS Summit, and quickly built a report to pinpoint some initial areas of interest. The report is a sample foundation for further exploration. If you need more background on Big Data, Hadoop, and Hive, please see my previous blogs and talks.
I will walk you through the steps to create the report: loading population demographics (census), weather (NOAA), and lookup table (state abbreviations) data into Hive, SQL Server, Excel, and PowerPivot, and then creating visualizations in Power View. Our initial goal is to see whether particular geographic areas in the path of Hurricane Sandy might need extra assistance with evacuation. One hypothesis is that people above a given age might be more likely to need assistance, so we want to compare age data with the projected rainfall patterns related to the path of the hurricane. Once you see this basic demonstration, you can envision all sorts of additional data sets that could add value to the model, along with different questions that could be asked of the existing data sets. Data from the CDC, pet ownership figures, housing details, job statistics, zombie predictions, and public utility data could be added to Hive or pulled directly from existing sources and added to the report. The resulting insights might, for example, help first responders during future storms, help your business understand the various ways it can help after a storm or major cleanup effort, or aid future research into reducing the damage done by natural disasters.
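To make the age-versus-rainfall comparison concrete before we get to the real tools, here is a minimal Python sketch of the idea. It is not how the report itself was built (that uses Hive, SQL Server, Excel, PowerPivot, and Power View). The rows, the thresholds, and the `flag_states` helper are all hypothetical and made up purely for illustration; none of the numbers come from the census or NOAA data.

```python
# Hypothetical, made-up rows standing in for the three data sources.
# None of these values come from the census, NOAA, or the actual report.
census = [  # (state name, share of population aged 65 or older)
    ("New Jersey", 0.14),
    ("New York", 0.13),
    ("Ohio", 0.15),
]
rainfall = [  # (state abbreviation, projected rainfall in inches)
    ("NJ", 8.0),
    ("NY", 5.5),
    ("OH", 1.0),
]
# Lookup table that lets the two data sets join on state.
state_lookup = {"New Jersey": "NJ", "New York": "NY", "Ohio": "OH"}


def flag_states(census, rainfall, lookup, min_senior_share, min_rain_inches):
    """Return sorted abbreviations of states at or above both thresholds."""
    rain_by_abbr = dict(rainfall)
    flagged = []
    for name, senior_share in census:
        abbr = lookup.get(name)
        if abbr is None or abbr not in rain_by_abbr:
            continue  # skip rows that do not join across the data sets
        if (senior_share >= min_senior_share
                and rain_by_abbr[abbr] >= min_rain_inches):
            flagged.append(abbr)
    return sorted(flagged)


# Thresholds are arbitrary illustrative choices, not recommendations.
print(flag_states(census, rainfall, state_lookup, 0.13, 5.0))
```

The lookup table matters because the demographic rows are keyed by state name while the weather rows are keyed by abbreviation, which is the same role the state abbreviations data plays in the report.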
Cindy Gross, Microsoft SQLCAT PM