Microsoft is BIG in Big Data
Last week I chaired our High Tech Executive Customer Advisory Board (CAB), made up of a dozen CIOs from across the High Tech ecosystem, and the focus was Big Data. The conversation was led by one of our Big Data experts, Shoshanna Budzianowski, and it triggered a lot of interest among the CAB members.
What if we could detect industry trends early enough to influence the timing of capital investments such as semiconductor factories? What if we could harness Supply Chain industry trends? What if we could extract best-practice knowledge by mining the vast amount of support, design or manufacturing operations data? Are companies about to realize they are sitting on a gold mine of structured AND unstructured data?
What is Big Data?
There is a set of broad industry trends which are each putting pressure on traditional data management and business intelligence platforms and tools. These include:
- Increasing Data Volumes: The annual growth of worldwide information volume is 59% and continues to rise. This explosion of new data is driven by a full range of traditional and non-traditional sources like sensors, devices, bots and crawlers. According to an IDC report, the volume of digital records is forecast to hit 1.2 zettabytes (1 zettabyte is 10^21 bytes) this year, and is predicted to grow 44x over the next decade.
- Increasing Data & Analysis Complexity: The real growth in data mentioned above is coming from unstructured data. The myth that 80% of unstructured data has no value has been debunked by examples like the success of search engine providers and e-retailers who unlocked the value of click-stream data. The requirement to store, analyze and mine structured and unstructured data together is becoming the new norm.
- Changing Economics and Emerging Technologies: Cloud computing and commodity hardware have radically reduced the acquisition cost of computational and storage capacity and are fundamentally changing the economics of data processing. Commodity hardware is being complemented by new distributed parallel processing frameworks like Hadoop, which, when combined with a rich ecosystem of tools, provide a platform for tackling massive data processing tasks.
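To make the idea of a distributed parallel processing framework a little more concrete, here is a minimal sketch of the map/shuffle/reduce programming model that Hadoop popularized, applied to the click-stream example above. It runs in a single Python process and does not use Hadoop at all. The record format ("user_id url") and every function name are assumptions made purely for illustration.

```python
from collections import defaultdict
from itertools import chain


def map_phase(log_line):
    """Map step: turn one click-stream record into (url, 1) pairs.

    Assumes a hypothetical record format of 'user_id url'.
    """
    parts = log_line.split()
    if len(parts) != 2:
        return []  # skip malformed records
    _user_id, url = parts
    return [(url, 1)]


def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(url, counts):
    """Reduce step: collapse the values for one key into a total."""
    return url, sum(counts)


def count_page_views(log_lines):
    """Run map, shuffle and reduce in sequence on an iterable of records."""
    mapped = chain.from_iterable(map_phase(line) for line in log_lines)
    grouped = shuffle(mapped)
    return dict(reduce_phase(url, counts) for url, counts in grouped.items())


if __name__ == "__main__":
    sample = ["u1 /home", "u2 /home", "u1 /cart", "malformed"]
    print(count_page_views(sample))
```

In a real cluster the map and reduce steps would run in parallel across many commodity machines, with the framework handling the shuffle, scheduling and failures. The sketch only shows the shape of the computation.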
The buzzword “Big Data” captures in a nutshell the trends, technologies and promise for businesses to generate real insights from petabytes of structured and unstructured data. Simply put, it’s about the 3Vs: Volume, Velocity and Variety of data.
What problems are customers trying to solve?
Organizations have always produced data that is unstructured and high-volume, e.g. medical images, Web logs, RFID, sensor and location data. Historically, customers threw away most of the data they could collect to avoid a data deluge. Spurred by plummeting storage and computation costs, coupled with the value resident in this data, customers are now demanding new business insight from every bit of data they have access to, in a cost-effective and scalable way. Examples include:
- Understanding user behavior and interactions online
- Identifying trends and popular topics in social media through sentiment analytics
- Optimizing and targeting advertising campaigns
- Discovering medical epidemiological trends (e.g. identifying the next flu outbreak)
- Identifying financial fraud within public sector transactions
What is Microsoft’s experience with addressing these challenges?
Microsoft was doing “Big Data” long before it became a mega-trend and brings over a decade of Big Data expertise to the market. Microsoft is in fact one of the largest users of “Big Data” technologies. For example, we use it at Bing to deliver you the best search results (over 100 PB of data), at Microsoft Advertising for ad targeting (14 billion ads per month), at Xbox Kinect, where machine learning on facial and body gestures tunes Kinect’s response to you, and at Exchange Hosted Services for spam detection (2-4 billion emails per day). The list goes on.
What is Microsoft’s approach to Big Data?
Our approach focuses on four key areas:
Second, our goal is to make Hadoop enterprise ready. This means enabling data movement between Hadoop and SQL Server and delivering a distribution of Hadoop that integrates with existing IT infrastructure on Windows (System Center, Active Directory, etc.) and provides enterprise capabilities like security and predictable performance, with the option to deploy Hadoop in hybrid IT scenarios, on premises and in the cloud.
Third, we want to facilitate access to the world’s data through the combination of internal and external data and services. The Windows Azure Marketplace offers a Data Market service that provides a wide variety of data, including demographic, environmental, financial, retail and sports data. Microsoft is pioneering other services such as Microsoft Codename "Social Analytics", which enables customers to improve profitability by integrating social media data with business applications. See new concepts and ideas on the SQL Labs website.
Fourth, our goal is to bring insights from ‘Big Data’ to all users wherever they are. By integrating with our industry-leading BI Platform, including SQL Server Analysis and Reporting Services and SharePoint, and by enabling access through award-winning Self Service BI tools like PowerPivot for Excel and Power View on any device, we will make insights into Big Data available to a broader class of end users, as shown in the illustration below.
For more information on Microsoft and Big Data, please use the resource link below:
Have a BIG fantastic day!