This article explains how to perform functional testing of pipelines in Azure Data Factory.
This article applies to Azure Data Factory V2.
Credits - Thanks to the ADFv2 Product Group (Anudeep Sharma, Hermine Vermishyan & Abhishek Narain) for reviewing and for all their assistance with ADFv2, and to my partner in awesomeness David Kuo for testing the script.
During the development phase of an Azure Data Factory pipeline, the UX provides debugging functionality which you can use to debug a pipeline and find out if anything is failing for reasons such as an incorrect connection string, an object not present in the database, a wrong query, etc.
With the same idea in mind, I have extended debugging from the pipeline development stage to the release stage. You can use a PowerShell script to perform functional testing of pipelines in a Test, Integration or UAT environment, ensuring that issues are caught at that stage instead of in the Production environment. In this article series I will share the process of achieving this with a PowerShell script, and the good news is that you can use the same script in your environment without making any code change.
Functional testing with ADF can be used to validate that a pipeline is working; if anything related to functionality is failing, you stop the release, fix the issue and iterate. It is closely aligned with build verification: it ensures that objects required by a pipeline, such as linked services, tables, storage, etc., are successfully validated before releasing to the production environment.
The scope of functional testing is to verify that configurations such as databases, servers, storage, etc. are set up correctly and that a pipeline is not failing due to misconfiguration, such as a database that does not exist. Errors due to data quality, performance or anything related to data sanctity are outside the scope of functional testing.
How it Works
Let's look in detail at how functional testing works with ADFv2.
First of all, you need to understand how debugging works in a pipeline. The good news is that there is a very informative video by Gaurav Malhotra from the ADF team (highly recommended if you want insight into pipeline debugging): Iterative development and debugging using Data Factory
As you have seen in this video, debugging means running a pipeline in debug mode with the parameters defined for that pipeline, and the output shows whether the pipeline succeeded or failed. With the same approach in mind, I wrote this functionality with the following flow:
Functional Testing Flow: -
Now let's look at each step in this flow:
1. Azure Active Directory (AAD) Authentication - AAD is required for silent authentication of the PowerShell script, which is needed when the testing is automated. The PowerShell script has to access Azure Data Factory to get the list of pipelines, so we need to ensure that the Application ID also has access to Azure Data Factory. There are a few handy articles which you can follow for creating an Application ID:
b. How to give access to Application ID
i. Go to Azure Data Factory in Azure Portal
ii. Under All settings select Users and click Add, as shown in the following figure:
iii. Clicking Add opens another window which allows you to add the Application ID that you created in Step 1. As we only need to get the list of pipelines, we make the application a member of the Reader role, as shown in the following figure, where the application name is FunctionalTesting and it has been added as a member of the Reader role:
iv. In PowerShell we will leverage the Application ID for silent authentication. It is recommended to use Key Vault for pulling secrets; in this case we need the Application ID secret (password). In the PowerShell script I have parameterized the secret field, so you can either pass it as plain text or make a call to Key Vault during the release process and pass along the secret. The script performs silent authentication using the Application ID, which is also known as a Service Principal; you can refer to this article for getting the TenantID.
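As a rough illustration of the silent sign-in described above, here is a minimal sketch. It assumes the Az PowerShell module (Az.Accounts); the original script may well use different cmdlets. New-ServicePrincipalCredential is a hypothetical helper name introduced only for this example, and the variable names in the commented usage mirror the script parameters shown later in this article.

```powershell
# Hypothetical helper (not from the original script): wraps an Application ID
# and its secret in a PSCredential for non-interactive sign-in.
function New-ServicePrincipalCredential {
    param(
        [Parameter(Mandatory)] [string] $ApplicationId,
        [Parameter(Mandatory)] [string] $ApplicationSecret
    )
    # The secret arrives as plain text (for example, pulled from Key Vault during
    # release), so convert it to a SecureString before building the credential.
    $secureSecret = ConvertTo-SecureString -String $ApplicationSecret -AsPlainText -Force
    return [System.Management.Automation.PSCredential]::new($ApplicationId, $secureSecret)
}

# Usage (needs the Az.Accounts module and real values, so it is left commented out):
# $credential = New-ServicePrincipalCredential -ApplicationId $applicationID -ApplicationSecret $applicationSecret
# Connect-AzAccount -ServicePrincipal -Credential $credential -Tenant $tenantID -Subscription $azureSubscription | Out-Null
```

Keeping the credential construction in its own function makes it easy to swap the plain-text secret for a Key Vault lookup without touching the sign-in call.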
2. Functional Testing Logic -
a. Get list of Pipeline from ADF
b. Check for Pipeline Exclusion
Example of exclusion file:-
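The exclusion check can be sketched roughly as follows. The file format here is an assumption made for illustration (a plain text file with one pipeline name per line); the actual exclusion file used by the script may look different. Get-PipelinesToTest and the file name excludedPipelines.txt are hypothetical, and the commented usage assumes the Az.DataFactory module.

```powershell
# Hypothetical helper: returns the pipeline names that are NOT listed in the exclusion file.
function Get-PipelinesToTest {
    param(
        [Parameter(Mandatory)] [string[]] $PipelineNames,
        [Parameter(Mandatory)] [string] $ExclusionFile
    )
    $excluded = @()
    if (Test-Path -LiteralPath $ExclusionFile) {
        # Assumed format: one pipeline name per line; blank lines are ignored.
        $excluded = @(Get-Content -LiteralPath $ExclusionFile |
            ForEach-Object { $_.Trim() } |
            Where-Object { $_ -ne '' })
    }
    # -notin compares strings case-insensitively by default.
    return @($PipelineNames | Where-Object { $_ -notin $excluded })
}

# Usage (needs Az.DataFactory and a signed-in session, so it is left commented out):
# $names  = (Get-AzDataFactoryV2Pipeline -ResourceGroupName $resourceGroupName -DataFactoryName $dataFactoryName).Name
# $toTest = Get-PipelinesToTest -PipelineNames $names -ExclusionFile '.\excludedPipelines.txt'
```

If the exclusion file does not exist, nothing is excluded and every pipeline in the factory is tested.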
c. Pipeline test file - In my case I have a pipeline named pl-testValidation in the data factory, and that pipeline expects a parameter named SubFolder. I need to create a parameter file for this pipeline, which the testing logic uses as the functional test file. The test file name must match the pipeline name, and the file must contain all parameters defined in the pipeline. If this file is missing, the pipeline is marked as failed with a missing test file.
Example of test file: as the pipeline name is pl-testValidation, I have created the test file as pl-testValidation.json (PipelineName.json), containing the parameter consumed by the pipeline and a TestTimeout value of 80, as I want to run this test for 80 seconds. If I omit this value, it defaults to 30 seconds.
d. The script iterates through the pipelines, checks whether PipelineName.json exists, and reads the TestTimeout value, defaulting to 30 seconds if it is missing.
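The test-file lookup and timeout default can be sketched like this. The JSON layout is an assumption: the sketch expects TestTimeout to sit at the top level next to the pipeline parameters, for example `{ "SubFolder": "some-folder", "TestTimeout": 80 }`, where the SubFolder value is made up. Get-PipelineTestConfig is a hypothetical helper name, not taken from the original script.

```powershell
# Hypothetical helper: finds <PipelineName>.json in the test folder and resolves the timeout.
function Get-PipelineTestConfig {
    param(
        [Parameter(Mandatory)] [string] $PipelineName,
        [Parameter(Mandatory)] [string] $TestFolder,
        [int] $DefaultTimeoutSeconds = 30
    )
    $path = Join-Path $TestFolder "$PipelineName.json"
    if (-not (Test-Path -LiteralPath $path)) {
        # No test file: report it so the caller can mark the pipeline as failed.
        return [pscustomobject]@{
            PipelineName = $PipelineName; TestFile = $path
            TestFileFound = $false; TimeoutSeconds = $null
        }
    }
    $json = Get-Content -LiteralPath $path -Raw | ConvertFrom-Json
    $timeout = $DefaultTimeoutSeconds
    # Assumed layout: TestTimeout is a top-level property of the test file.
    if ($json.PSObject.Properties.Name -contains 'TestTimeout') {
        $timeout = [int]$json.TestTimeout
    }
    return [pscustomobject]@{
        PipelineName = $PipelineName; TestFile = $path
        TestFileFound = $true; TimeoutSeconds = $timeout
    }
}

# Usage: resolve the test configuration for one pipeline from the current folder.
# $config = Get-PipelineTestConfig -PipelineName 'pl-testValidation' -TestFolder '.\'
# if (-not $config.TestFileFound) { Write-Warning "Missing test file: $($config.TestFile)" }
```

Returning an object instead of throwing on a missing file lets the caller keep iterating and report all pipelines with missing test files in one pass.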
e. Run the pipeline test, capturing the run ID in $runId
f. Use the value of $runId to get the current status (refer to the Git PS script for the full code)
g. Use the value of $runId to get the status (refer to the Git PS script for the full code)
h. Fail the test if it exceeds the timeout setting
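The run, poll and timeout steps above could look roughly like the following sketch. It is not the code from the Git script. Wait-PipelineRun is a hypothetical helper that takes the status lookup as a script block so the polling logic stays independent of Azure; the commented usage assumes the Az.DataFactory cmdlets and, as a further assumption, that the test file is passed directly as the parameter file.

```powershell
# Hypothetical helper: polls a status provider until the run finishes or the timeout elapses.
function Wait-PipelineRun {
    param(
        [Parameter(Mandatory)] [scriptblock] $GetStatus,
        [int] $TimeoutSeconds = 30,
        [int] $PollIntervalSeconds = 5
    )
    $deadline = (Get-Date).AddSeconds($TimeoutSeconds)
    while ($true) {
        $status = & $GetStatus
        # Any status other than these is treated as final (Succeeded, Failed, Cancelled).
        if ($status -notin @('Queued', 'InProgress', 'Canceling')) { return $status }
        # Still running past the deadline: report a timeout so the test fails.
        if ((Get-Date) -ge $deadline) { return 'TimedOut' }
        Start-Sleep -Seconds $PollIntervalSeconds
    }
}

# Usage (needs Az.DataFactory and a signed-in session, so it is left commented out):
# $runId = Invoke-AzDataFactoryV2Pipeline -ResourceGroupName $resourceGroupName `
#     -DataFactoryName $dataFactoryName -PipelineName 'pl-testValidation' `
#     -ParameterFile '.\pl-testValidation.json'
# $result = Wait-PipelineRun -TimeoutSeconds 80 -GetStatus {
#     (Get-AzDataFactoryV2PipelineRun -ResourceGroupName $resourceGroupName `
#         -DataFactoryName $dataFactoryName -PipelineRunId $runId).Status
# }
# if ($result -ne 'Succeeded') { throw "Pipeline pl-testValidation finished with status $result" }
```

Throwing on any result other than Succeeded is what would make a release step fail, which matches the behaviour described in the next step.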
i. If you are integrated with Release Manager, the script raises an exception in case of failure, resulting in a failed release, and outputs the failed pipelines that need to be fixed before deploying to the production environment. But don't be disheartened: you can still run the script without Release Manager by using the following command.
.\AzureDatafactoryFunctionalTesting.ps1 -dataFactoryName "Your Factory Name" -resourceGroupName "Factory Resource Group Name" -azureSubscription "Subscription ID" -applicationSecret "Application Secret from Point 1" -applicationID "Application ID from Point 1" -tenantID "Tenant ID from Point 1" -excludedPipelinesFilePath ".\" -functionalTestJsonFilePath ".\"
In this article I have gone step by step through achieving functional testing with ADFv2. In a subsequent article I will discuss how to integrate it with Release Manager so that you can have this testing framework as part of the pipeline release process to Production.
Git – feel free to download and contribute