Data Insights Global Practice – Blog Kick-off

Today we’re kicking off a new blog intended to provide technical and architectural guidance for developing analytics solutions on Microsoft Azure. Over the past 12 months, the breadth of managed service capabilities that enable end-to-end analytics workloads on the Azure platform has grown substantially. At the same time, existing services have continued to mature their feature sets, which may present an opportunity to review an existing cloud architecture for optimization, or lower the barrier to migrating solutions into the cloud. As more customers move to adopt Platform-as-a-Service (PaaS) and Software-as-a-Service (SaaS) capabilities, we identified an opportunity to provide real-world examples of existing implementations to help accelerate learning in this relatively new space. Hence this blog.

The content published here is primarily intended for developers and architects who focus on delivering analytics workloads. The majority of this content will be provided by Architects and Delivery Consultants from within Microsoft Services, each of whom has first-hand experience delivering Azure-based solutions to customers as part of the Data Insights Global Practice.

Our primary goal is to help customers and partners realize the true potential of the cloud for analytics-focused solutions. Its key characteristics include on-demand scale and performance, a low cost of entry, and the economics of scalable, elastic solutions that often have a finite lifespan.

We engage with Microsoft customers and partners on a day-to-day basis, and one of the most common pain points in cloud adoption is related to scale. Building performant and robust applications for the cloud relies on scale-out architecture patterns, which are quite the opposite of the traditional building blocks. Two key principles to always keep in mind when developing cloud solutions are “Distribution” and “Federation”.

The principle of distribution relates to the concept of separating data and compute across multiple physical nodes, effectively breaking a large problem into multiple smaller parts. The Microsoft managed service offerings that make up the Cortana Analytics Suite are designed to enable distribution under the covers, so developers need only worry about how data is distributed, and not how to manage a distributed platform. Federation is a little different in that it describes an implementation where data is processed locally, where it sits at rest, rather than moving large volumes for the sake of centralized processing. A lot of the content on this blog will explain various scale-out approaches to data processing in Azure, often citing one or both of these principles in the explanation and examples.
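To make the two principles concrete, here is a minimal sketch in plain Python. It is not an Azure API and not how any particular service implements distribution; the function names (partition, local_totals, combine), the three-node layout, and the sample sales rows are all assumptions made up for illustration. Rows are spread across nodes by hashing a key (distribution), each node aggregates its own rows where they sit, and only the small per-node results are merged (the idea behind federation).

```python
import zlib
from collections import defaultdict


def partition(rows, key, num_nodes):
    """Distribution: assign each row to a node by hashing its key value."""
    nodes = [[] for _ in range(num_nodes)]
    for row in rows:
        # crc32 is stable across runs, unlike Python's built-in hash() for str.
        node_id = zlib.crc32(str(row[key]).encode("utf-8")) % num_nodes
        nodes[node_id].append(row)
    return nodes


def local_totals(node_rows, key, value):
    """Process data where it lives: sum the value column per key on one node."""
    totals = defaultdict(float)
    for row in node_rows:
        totals[row[key]] += row[value]
    return dict(totals)


def combine(partials):
    """Merge the small per-node results instead of moving the raw rows."""
    merged = defaultdict(float)
    for partial in partials:
        for k, v in partial.items():
            merged[k] += v
    return dict(merged)


# Hypothetical sample data, for illustration only.
sales = [
    {"region": "east", "amount": 10.0},
    {"region": "west", "amount": 7.5},
    {"region": "east", "amount": 15.0},
    {"region": "north", "amount": 4.0},
    {"region": "west", "amount": 2.5},
]

nodes = partition(sales, "region", num_nodes=3)
partials = [local_totals(rows, "region", "amount") for rows in nodes]
result = combine(partials)
print(result)
```

Because the rows are partitioned on the same column they are grouped by, every region lands on exactly one node and no raw rows have to move between nodes to answer the query. That is the main reason the choice of distribution key matters so much in a scale-out design.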

It’s easy to write about product features, but we’ve got a lot of engineering documentation for that already. Instead, we intend to deliver posts that describe, in context, how one or more Azure services can interoperate to support part of an all-up solution, and that provide the rationale for a given architecture over potential alternatives. The scope of this blog will align tightly with the bundle of services that comprise the recently announced Cortana Analytics Suite, including:

  • Azure Stream Analytics

  • HDInsight (Spark, Hive, Pig etc.)

  • Azure SQL Data Warehouse (and by association Azure SQL Database)

  • Azure Machine Learning

  • Azure Data Factory

  • Power BI

Given the rapid growth in the Azure portfolio of analytics services, we’re taking on a pretty big surface area in a space that is moving incredibly quickly. New features are added to one service or another on a weekly basis. In an effort to keep up with this pace and keep the content valid on an ongoing basis, we’re committed to publishing frequently, so come back regularly!

And finally, we’re always looking for feedback on the content we produce so that we can continue to refine our approach and potentially provide more targeted scenarios in areas where Microsoft customers and partners are planning to develop and deploy solutions. Please feel free to provide feedback on any of the articles posted.

Charles Feddersen
charlesf@microsoft.com