PowerShell Script To Invoke ML Scoring Part I

By Earle Sinnatamby, Consultant

Objective

The purpose of this blog post is to provide a PowerShell alternative to using Azure Data Factory to perform Machine Learning (ML) scoring.

The pilot engagement required on-premises data to be uploaded to Azure Blob Storage every day. Each uploaded data file had to be scored that day with an ML script provided by the client, and the results were used to flag factory equipment that might require maintenance.

The initial plan was to use Azure Data Factory (ADF) to perform this task. However, the daily uploads could not be identified through parameter settings in ADF, primarily because every file name contained a date and timestamp in yyyyMMdd_hhmmss.ddd format. The hhmmss.ddd portion varied widely from file to file, so it could not be hardcoded.

The resulting PowerShell script was deployed on a previously provisioned virtual machine and scheduled to run daily with Windows Task Scheduler.

PowerShell Solution

The following PowerShell code snippets were used:

The Import-AzurePublishSettingsFile cmdlet imports a publish settings file that lets you manage your Azure accounts in Windows PowerShell. The file is an XML file with a .publishsettings file name extension, and it contains an encoded certificate that provides management credentials for your Azure subscriptions.

#Import the publish settings file to set up Azure credentials

Import-AzurePublishSettingsFile C:\file-containing-Azure-credentials.publishsettings

 

Set-AzureSubscription creates or changes an Azure subscription entry (here, it sets the subscription's current storage account). Select-AzureSubscription then selects that subscription as the current one.

$subscriptionName = "mySubscription" #Azure Subscription

$storageAccountName = "myStorageAccountName" #Blob storage account name

$containerName = "myContainerName" #Container where files to be processed are stored

Set-AzureSubscription -SubscriptionName $subscriptionName -CurrentStorageAccountName $storageAccountName

Select-AzureSubscription -SubscriptionName $subscriptionName

 

Get-Date gets the current date and time. AddDays(-1) moves it back one day to the prior day's date, and ToString formats the result as yyyyMMdd.

$fileDate = (Get-Date).AddDays(-1).ToString('yyyyMMdd') #Date of previous day

 

Assign to the variable $prefixFileName the file name up to and including the date portion; this is the prefix to search for.

$prefixFileName = "AzureBlobDirectoryContainingFilesToBeProcessed/fileName_${fileDate}"
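The date and prefix steps above can be sketched as one small function. This is only an illustration: the function name Get-DailyBlobPrefix and its parameters are hypothetical and not part of the original script, and the fixed reference date is used purely to make the result predictable.

```powershell
# Hypothetical helper (not in the original script): builds the blob-name
# prefix for the day before $ReferenceDate.
function Get-DailyBlobPrefix {
    param(
        [datetime]$ReferenceDate = (Get-Date),
        [string]$BasePath = 'AzureBlobDirectoryContainingFilesToBeProcessed/fileName_'
    )
    # Invariant culture keeps yyyyMMdd on the Gregorian calendar regardless of
    # the regional settings of the machine running the script.
    $fileDate = $ReferenceDate.AddDays(-1).ToString(
        'yyyyMMdd', [System.Globalization.CultureInfo]::InvariantCulture)
    return "$BasePath$fileDate"
}

# A fixed date shows that AddDays(-1) handles the month rollover.
Get-DailyBlobPrefix -ReferenceDate ([datetime]'2024-03-01')
# -> AzureBlobDirectoryContainingFilesToBeProcessed/fileName_20240229
```

Passing the reference date as a parameter also makes it easy to rerun the job for an earlier day if a scheduled run was missed.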

 

Get-AzureStorageBlob returns the list of blobs in the specified container whose names start with the given prefix

$files = Azure.Storage\Get-AzureStorageBlob -Container $containerName -Prefix $prefixFileName
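To see why a date-only prefix is enough even though the time portion varies, the match can be imitated locally. The sketch below does not call Azure at all: the blob names, the dir/ path, and the .csv extension are made up for illustration, and the local StartsWith test is only a stand-in for the filtering that the -Prefix parameter requests from the storage service (assumed here to be case-sensitive).

```powershell
# Hypothetical blob names; the first two differ only in the time portion.
$blobNames = @(
    'dir/fileName_20240229_013015.123.csv',
    'dir/fileName_20240229_174502.456.csv',
    'dir/fileName_20240228_090000.000.csv'
)
$prefix = 'dir/fileName_20240229'

# Local stand-in for the -Prefix filter: keep names that start with the prefix.
$matched = @($blobNames | Where-Object {
    $_.StartsWith($prefix, [System.StringComparison]::Ordinal)
})
$matched
```

Both files dated 20240229 are selected whatever their hhmmss.ddd values, while the file from the previous day is left out.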

 

Iterate over $files, the list of blobs whose names start with the prefix. Note that $files.Count returns the number of blobs in the list

for ($i=0; $i -lt $files.Count; $i++) {

 

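Skip the file if it is empty, i.e. zero bytes in size, because no ML scoring is required. Continue is used rather than Break so that the remaining files in the list are still processed

if($files[$i].Length -le 0) {Continue;}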

 

Obtain the name of the file, without its directory path, with the following cmdlet; the name is then passed on to the ML scoring function

$filename = Split-Path $files[$i].Name -Leaf

 

Invoke the ML scoring function, which will be covered in a future blog post

# call ML Scoring function, to be covered by my colleague in a future blog post

}
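The loop can be tried without an Azure connection by feeding it mock objects. This is a sketch under stated assumptions: the mock objects expose only the Name and Length properties that the loop reads, the file names are invented, and Invoke-MLScoringStub is a hypothetical placeholder for the real scoring call, which this post does not cover. It uses continue so that one empty blob does not stop the remaining files from being scored.

```powershell
# Hypothetical placeholder for the real ML scoring call.
function Invoke-MLScoringStub {
    param([string]$FileName)
    "scored:$FileName"
}

# Mock blobs with only the two properties the loop needs.
$files = @(
    [pscustomobject]@{ Name = 'dir/fileName_20240229_013015.123.csv'; Length = 2048 },
    [pscustomobject]@{ Name = 'dir/fileName_20240229_090000.000.csv'; Length = 0 },
    [pscustomobject]@{ Name = 'dir/fileName_20240229_174502.456.csv'; Length = 512 }
)

$results = foreach ($file in $files) {
    # Skip zero-byte blobs; there is nothing to score.
    if ($file.Length -le 0) { continue }

    # Strip the directory portion of the blob name.
    $leaf = Split-Path $file.Name -Leaf
    Invoke-MLScoringStub -FileName $leaf
}
$results
```

With these mock inputs the empty middle file is skipped and the two non-empty files are passed to the stub by leaf name.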

 

Scheduling the PowerShell Script

Use Windows Task Scheduler to schedule the script for execution.

Ensure the following are set correctly:

  1. The setting for Program/script in Action is cmd.exe
  2. The setting for Add arguments is: /c powershell.exe -File "C:\PowerShell directory\PowerShell script.ps1"

Alternatively, to capture console output in a log file, use the Write-Host cmdlet (optionally with -ForegroundColor) in the script together with the following settings:

  1. The setting for Program/script in Action is cmd.exe
  2. The setting for Add arguments is: /c powershell.exe -File "C:\PowerShell directory\PowerShell script.ps1" >> C:\DailyJobLogs\DailyMLScoring.log 2>&1
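The same task could also be registered from PowerShell instead of through the Task Scheduler user interface. The following is a minimal sketch, assuming the ScheduledTasks module that ships with Windows 8 / Windows Server 2012 and later. The function name Register-DailyScoringTask, the task name, and the 02:00 start time are hypothetical. Unlike the settings above, it launches powershell.exe directly rather than through cmd.exe, so it does not include the log redirection.

```powershell
# Hypothetical helper: registers a daily task that runs the scoring script.
# Defining the function has no side effects; the task is only created when
# the function is called (typically from an elevated session).
function Register-DailyScoringTask {
    param(
        [Parameter(Mandatory = $true)]
        [string]$ScriptPath,
        [string]$TaskName = 'DailyMLScoring',
        [datetime]$At = '02:00'
    )
    # Quote the script path so that directories containing spaces work.
    $action = New-ScheduledTaskAction -Execute 'powershell.exe' `
        -Argument "-NoProfile -File `"$ScriptPath`""
    $trigger = New-ScheduledTaskTrigger -Daily -At $At
    Register-ScheduledTask -TaskName $TaskName -Action $action -Trigger $trigger
}

# Example call (commented out so that nothing is registered by accident):
# Register-DailyScoringTask -ScriptPath 'C:\PowerShell directory\PowerShell script.ps1'
```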