Retrieving Resource Metrics and Creating Alert Rules via Azure PowerShell

UPDATE: This post has been updated to use the new Azure Resource Manager (ARM) cmdlets released in March 2016.

Azure Insights is intended to provide a coherent monitoring and alerting framework for all Azure resources. Michael Collier wrote a post describing how to retrieve Azure Insights metric data using the Azure Resource Manager Insights REST API and the Azure Insights .NET library. Resource metrics can be viewed and alert rules created using the Azure Portal.

This post shows how to use Azure PowerShell to download Azure Insights metric data and create alert rules triggered when specified metric thresholds are exceeded. It also shows how to configure alert rules triggered when a specified ARM management operation is performed.

The following Azure PowerShell cmdlets in the AzureRm.Insights module are used:

  • Get-AzureRmMetricDefinition – downloads the definitions of the available metrics for a resource
  • Get-AzureRmMetric – downloads the Azure Insights metric data for a resource
  • Add-AzureRmMetricAlertRule – creates an alert rule that is raised when a metric threshold is exceeded for a resource
  • Get-AzureRmAlertRule – retrieves the definition of an alert rule
  • Remove-AzureRmAlertRule – deletes an alert rule
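  • Add-AzureRmLogAlertRule – creates an alert rule that is raised when a management operation is performed
  • New-AzureRmAlertRuleEmail – creates the email action listing the addresses notified when an alert rule is triggered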

Metric Definitions

The Get-AzureRmMetricDefinition cmdlet downloads the definitions of the Azure Insights metrics available for a resource. For example, the following retrieves the definitions for a VM named myVM in a resource group named myRG:

$resourceId = '/subscriptions/SUBSCRIPTION_GUID/resourceGroups/myRG/providers/Microsoft.Compute/virtualMachines/myVM'

Get-AzureRmMetricDefinition -ResourceId $resourceId `
-DetailedOutput

The result is a set of metric definitions. The following is the metric definition for average CPU % for a VM:

MetricAvailabilities   :
    Location  :
    Retention : 10675199.02:48:05.4775807
    Values    : 01:00:00
    Location  :
    Retention : 10675199.02:48:05.4775807
    Values    : 00:01:00
Name                   : \Processor(_Total)\% Processor Time
Properties             :
PrimaryAggregationType : Average
Unit                   : Percent

Each metric is collected and stored with one or more metric availabilities with different time-grain values, e.g., 1 hour and 1 minute in the example above.
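
Time grains are ordinary .NET TimeSpan values, so they can be parsed and compared directly in PowerShell. The following is a minimal sketch, assuming the grain strings are copied by hand from the output above (the variable names are illustrative only). It picks the finest grain and also shows that the Retention value is simply the largest TimeSpan that .NET can represent.

```powershell
# Time grains copied by hand from the metric definition output above.
$grainStrings = @('01:00:00', '00:01:00')

# Parse with the invariant culture so the result does not depend on locale.
$invariant = [System.Globalization.CultureInfo]::InvariantCulture
$timeGrains = $grainStrings | ForEach-Object { [TimeSpan]::Parse($_, $invariant) }

# TimeSpan is comparable, so Sort-Object orders grains from finest to coarsest.
$finestGrain = $timeGrains | Sort-Object | Select-Object -First 1

# Compare the Retention value from the output with TimeSpan.MaxValue.
$retention = [TimeSpan]::Parse('10675199.02:48:05.4775807', $invariant)
$retentionIsMax = ($retention -eq [TimeSpan]::MaxValue)

"Finest grain: $finestGrain; retention is TimeSpan.MaxValue: $retentionIsMax"
```

The finest grain found this way is the value that would be passed as the time grain when downloading metric data.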

Metrics

The Get-AzureRmMetric cmdlet retrieves the Azure Insights metrics for a resource. This cmdlet has the following parameters:

  • ResourceId – fully-qualified resource Id, including subscription Id, resource group name and resource name
  • MetricNames – name of the Azure Insights metric
  • TimeGrain – PowerShell TimeSpan matching one of the time grains for the metric
  • StartTime – start time for data retrieval
  • EndTime – end time for data retrieval
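
Since the ResourceId has a fixed shape, it can be assembled from its parts rather than typed by hand. The following is a rough sketch under that assumption: Get-VmResourceId is a hypothetical helper written for this post, not a cmdlet from the AzureRm.Insights module, and SUBSCRIPTION_GUID remains a placeholder.

```powershell
# Hypothetical helper (not part of AzureRm.Insights): assembles a VM resource Id.
function Get-VmResourceId {
    param(
        [Parameter(Mandatory)] [string] $SubscriptionId,
        [Parameter(Mandatory)] [string] $ResourceGroupName,
        [Parameter(Mandatory)] [string] $VmName
    )
    $format = '/subscriptions/{0}/resourceGroups/{1}/providers/Microsoft.Compute/virtualMachines/{2}'
    $format -f $SubscriptionId, $ResourceGroupName, $VmName
}

# SUBSCRIPTION_GUID is a placeholder for a real subscription Id.
$resourceId = Get-VmResourceId -SubscriptionId 'SUBSCRIPTION_GUID' `
    -ResourceGroupName 'myRG' -VmName 'myVM'

# A 40-minute retrieval window ending now, suitable for StartTime and EndTime.
$endTime = Get-Date
$startTime = $endTime.AddMinutes(-40)

$resourceId
```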

For example, the following downloads the average CPU % for the last 40 minutes collected at 1 minute intervals.

$endTime = Get-Date

$startTime = $endTime.AddMinutes(-40)

$timeGrain = '00:01:00'

$resourceId = '/subscriptions/SUBSCRIPTION_GUID/resourceGroups/myRG/providers/Microsoft.Compute/virtualMachines/myVM'

$metricName = '\Processor(_Total)\% Processor Time'

(Get-AzureRmMetric -ResourceId $resourceId `
-TimeGrain $timeGrain -StartTime $startTime `
-EndTime $endTime `
-MetricNames $metricName).MetricValues

This retrieves an output set comprising a collection of metric points, like the following, for the average CPU %:

Average        : 1.238538
Count          : 3
Last           :
Maximum        : 2.39034
Minimum        : 0.659043
Total          : 3.715614
Timestamp      : 4/27/2016 2:25:00 AM
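
Once the metric points are in a variable they can be summarized with the standard PowerShell pipeline. The following is a minimal sketch that uses hand-built sample objects so it runs without an Azure subscription; the first Average and Maximum pair comes from the output above, and the other two points are made-up values. In practice $metricValues would hold the metric points returned by Get-AzureRmMetric.

```powershell
# Sample points: the first Average/Maximum pair is from the output above;
# the other two points are made-up values for illustration only.
$metricValues = @(
    [pscustomobject]@{ Timestamp = [datetime]'2016-04-27T02:25:00'; Average = 1.238538; Maximum = 2.39034 }
    [pscustomobject]@{ Timestamp = [datetime]'2016-04-27T02:26:00'; Average = 0.9; Maximum = 1.5 }
    [pscustomobject]@{ Timestamp = [datetime]'2016-04-27T02:27:00'; Average = 3.1; Maximum = 4.2 }
)

# Aggregate the per-minute averages across the whole window.
$summary = $metricValues | Measure-Object -Property Average -Average -Maximum

# Find the point with the highest per-minute peak.
$peak = $metricValues | Sort-Object -Property Maximum -Descending | Select-Object -First 1

'Mean of averages: {0:N3}; peak {1} at {2:HH:mm}' -f $summary.Average, $peak.Maximum, $peak.Timestamp
```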

In addition to the metric points, Get-AzureRmMetric returns metadata describing the metric, like the following:

Name           : \Processor(_Total)\% Processor Time
EndTime        : 4/27/2016 3:27:46 AM
StartTime      : 4/27/2016 2:57:46 AM
TimeGrain      : 00:01:00
Unit           : Percent

Alert Rules

Azure Insights supports the concept of an alert rule that is raised automatically whenever an ARM management operation is invoked for a VM or an Azure Insights metric for a resource exceeds a specified threshold. When an alert is raised, an email can be sent to the service administrators and to other email accounts configured to receive it. The Add-AzureRmLogAlertRule cmdlet creates management operation alert rules and Add-AzureRmMetricAlertRule creates metric alert rules. Get-AzureRmAlertRule and Remove-AzureRmAlertRule respectively retrieve and delete an alert rule of either type.

A management operation alert rule specifies which management operation on a specified resource is the trigger for the alert, and whether the alert should be sent to the service administrators and other email addresses. Typically, an alert email is sent a few minutes after the operation is initiated.

The Add-AzureRmLogAlertRule cmdlet has the following basic parameters:

  • Name – name of the alert rule
  • Description – description of the alert rule
  • ResourceGroup – resource group of the alert rule resource
  • TargetResourceGroup – resource group containing the target resource
  • Location – location for the alert rule resource
  • OperationName – fully-qualified name of the management operation which triggers the alert
  • Actions – object describing specific email addresses to which the alert email should be sent as well as any webhooks which should be invoked
  • Status – optional specification of a management log status that will trigger the alert
  • SendToServiceOwners – indicates that an alert email should be sent to the service owners when the alert is raised
  • EmailAddress – comma-separated list of additional email addresses to send alerts to

For example, the following commands configure a management operation alert rule that sends an alert to the specified email address whenever a VM is started in the myRG resource group:

$actionEmail = New-AzureRmAlertRuleEmail `
-CustomEmail me@contoso.com

Add-AzureRmLogAlertRule -Name StartAlert `
-Location 'East Asia' -ResourceGroup myRG `
-OperationName Microsoft.Compute/virtualMachines/start/action `
-TargetResourceGroup myRG -Actions $actionEmail

A metric alert rule has a condition and an action. The condition specifies the threshold to be exceeded and how long it must be exceeded – for example, average percentage CPU greater than 90% for 5 minutes. The action specifies whether emails should be sent to the service administrators and/or other email accounts.

The Add-AzureRmMetricAlertRule cmdlet has the following basic parameters:

  • Name – name of the alert rule
  • Description – description of the alert rule
  • ResourceGroup – resource group of the alert rule resource
  • TargetResourceId – fully-qualified resource Id, including subscription Id, resource group name and resource name of the target resource
  • Location – location for the alert rule resource
  • Operator – operation for the alert rule: GreaterThan, GreaterThanOrEqual, LessThan, LessThanOrEqual
  • Threshold – threshold value for the alert rule
  • WindowSize – timespan over which the threshold value must satisfy the operator
  • MetricName – name of the metric for the alert rule
  • TimeAggregationOperator – aggregation to apply to the metric for the WindowSize: Average, Total
  • SendToServiceOwners – indicates that an alert email should be sent to the service owners when the alert is raised
  • EmailAddress – comma-separated list of additional email addresses to send alerts to
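
To make the interplay of Operator, Threshold, WindowSize and TimeAggregationOperator concrete, the condition can be imitated locally. The following is only a sketch of one plausible reading of these parameters (aggregate the values in the window, then compare the result with the threshold). Test-MetricThreshold is a hypothetical helper, not part of the AzureRm.Insights module, and it does not claim to reproduce how Azure itself evaluates alert rules.

```powershell
# Hypothetical helper for local experimentation; not part of AzureRm.Insights
# and not a description of how Azure itself evaluates alert rules.
function Test-MetricThreshold {
    param(
        [double[]] $Values,
        [ValidateSet('Average', 'Total')]
        [string] $TimeAggregationOperator,
        [ValidateSet('GreaterThan', 'GreaterThanOrEqual', 'LessThan', 'LessThanOrEqual')]
        [string] $Operator,
        [double] $Threshold
    )

    # Collapse the values in the window to a single number.
    $stats = $Values | Measure-Object -Sum -Average
    $aggregate = if ($TimeAggregationOperator -eq 'Average') { $stats.Average } else { $stats.Sum }

    # Compare the aggregate with the threshold using the requested operator.
    switch ($Operator) {
        'GreaterThan'        { return $aggregate -gt $Threshold }
        'GreaterThanOrEqual' { return $aggregate -ge $Threshold }
        'LessThan'           { return $aggregate -lt $Threshold }
        'LessThanOrEqual'    { return $aggregate -le $Threshold }
    }
}

# Five made-up one-minute CPU readings covering a 00:05:00 window.
$window = 92.5, 95.0, 91.0, 97.5, 94.0
$wouldFire = Test-MetricThreshold -Values $window -TimeAggregationOperator Average `
    -Operator GreaterThan -Threshold 90
$wouldFire
```

With these sample readings the average is above 90, so a rule using GreaterThan with a threshold of 90 would be satisfied under this reading.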

For example, the following commands create an alert rule that sends alert emails to service owners whenever the average CPU % for the VM exceeds 90% for more than 5 minutes:

$alertName = 'Threshold_Alert_CPU_Over_90'

$resourceGroupName = 'myRG'

$resourceId = '/subscriptions/SUBSCRIPTION_GUID/resourceGroups/myRG/providers/Microsoft.Compute/virtualMachines/myVM'

$metricName = '\Processor(_Total)\% Processor Time'

$alertDescription = 'CPU Percentage > 90%'

Add-AzureRmMetricAlertRule -Name $alertName -Location westus `
-ResourceGroup $resourceGroupName `
-Description $alertDescription -SendToServiceOwners `
-Operator GreaterThan -Threshold 90 `
-WindowSize 00:05:00 -TargetResourceId $resourceId `
-MetricName $metricName -TimeAggregationOperator Average
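
Long commands with many backtick continuations are easy to break, so it may be worth collecting the arguments in a hashtable and splatting them instead. The following is a sketch of that approach, assuming the parameter names from the list above; the final call is commented out because it needs the AzureRm.Insights module and a signed-in session, and SUBSCRIPTION_GUID is still a placeholder.

```powershell
# Collect the arguments in a hashtable; names follow the parameter list above.
$alertParams = @{
    Name                    = 'Threshold_Alert_CPU_Over_90'
    Description             = 'CPU Percentage > 90%'
    ResourceGroup           = 'myRG'
    Location                = 'westus'
    TargetResourceId        = '/subscriptions/SUBSCRIPTION_GUID/resourceGroups/myRG/providers/Microsoft.Compute/virtualMachines/myVM'
    MetricName              = '\Processor(_Total)\% Processor Time'
    Operator                = 'GreaterThan'
    Threshold               = 90
    WindowSize              = [TimeSpan]'00:05:00'
    TimeAggregationOperator = 'Average'
}

# Requires the AzureRm.Insights module and a signed-in session, so it is left commented out:
# Add-AzureRmMetricAlertRule @alertParams

# Show the arguments that would be passed.
$alertParams.GetEnumerator() | Sort-Object -Property Name | Format-Table -AutoSize
```

Splatting keeps each argument on its own line and lets the same hashtable be reused or adjusted for several similar rules.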

When the alert is raised an email with the following subject is sent to all of the service owners:

  • [ALERT ACTIVATED] - \Processor(_Total)\% Processor Time GreaterThan 90 (Percent) in the last 5 minutes

Note that alert rules can also be configured to be raised following Azure management events. The Azure Resource Manager REST API documentation covers both alert rules and management events.