Most SSIS Developers know that they can use Integration Services for more than just moving data from A to B. In the era of cloud computing you can also use Integration Services to create your own compute resources before you actually do the typical datawarehouse work (ETL, Data Enrichment, Feature Engineering, Machine Learning, etc.).
The combination of Integration Services and Powershell has several advantages:
- You can provide azure resources in parallel. (For example you can create an HDInsight Cluster and several VMs at the same time.)
- Metadata about your environment can be stored inside SQL Server and generated using T-SQL.
- A nice looking user interface shows the progress of provisioning, data movement and computing tasks.
- The SSIS Package is a visual documentation of your environment and workflows.
- Data movement, processing and environment provisioning is orchestrated in a single place.
- Visual Studio can be used as development environment. (including TFS Source Control)
- You pay only the resources you need. The final step in your package could be to remove the environment.
- No money tied up in potentially suboptimal hardware investments. The environment will be extremely flexible. You can create and test a new configuration every day.
- (Practically) infinite scalability. You can copy and paste parts of your SSIS Package to provide additional resources.
Here is an example:
The basics are quite simple:
1) You store your azure credentials by executing the AddAzureAccount Commandlet in the PowerShell ISE.
2) It`s important to know how to execute Powershell commands inside SSIS Packages:
For this purpose you can use the “Execute Process Task”. Here is an example on how to execute a single Powershell Command in Integration Services:
3) You can also execute an entire Powershell Script file and also pass parameters to it:
It might happen that your azure credentials expire. In this case just open a Powershell ISE Window again and execute the AddAzureAccount commandlet and enter your Subscription Credentials. This should fix execution issues. Btw if you have an MSDN Subscription you get around 100$ azure usage for free every month.
A cool use case for this budget is to create several SQL Server VMs with slightly different server and database settings and run a workload script against them. This allows you to figure out which configuration works best. It’s a brute force approach to performance tuning. The data can be collected from all VMs using SQL Server Extended Events and can be stored in a SQL Azure Datawarehouse. Usually it is a good idea to create a Tabular Model and use Power BI to explore the result.