Azure Monitor

Azure Monitor is a monitoring solution that collects and aggregates data for further interpretation and response.

Nobl9 integration with Azure Monitor supports collecting Azure Monitor metrics, including Application Insights. With it, Nobl9 users can retrieve metrics and build SLOs based on them.

Azure Monitor parameters and supported features in Nobl9

General support:
- Release channel: Beta
- Connection method: Agent, Direct
- Replay and SLI Analyzer: Historical data limit 30 days
- Event logs: Supported
- Query checker: Not supported
- Query parameters retrieval: Supported
- Timestamp cache persistence: Supported

Query parameters:
- Query interval: 1 min
- Query delay: 5 min
- Jitter: 15 sec
- Timeout: 60 sec

Agent details and minimum required versions for supported features:
- Plugin name: n9azure_monitor
- Query delay environment variable: AZURE_MONITOR_QUERY_DELAY
- Replay and SLI Analyzer: 0.69.0-beta01
- Query parameters retrieval: 0.71.0-beta
- Timestamp cache persistence: 0.69.0-beta01

Additional notes:
- Support for Azure Monitor Metrics and Azure Monitor Logs

Creating SLOs with Azure Monitor

Nobl9 integration with Azure Monitor supports the Azure Monitor Metrics and Azure Monitor Logs data types. For both data types, you can create threshold metrics and ratio metrics (good over total or bad over total).

These methods are available in the UI and by applying YAML via sloctl.

Nobl9 Web

Follow the instructions below to create your SLOs with Azure Monitor in the UI:

Navigate to Service Level Objectives.
Click .
Select a Service.
It will be the location for your SLO in Nobl9.
Select your Azure Monitor data source.
Modify Period for Historical Data Retrieval, when necessary.
- This value defines how far back in the past your data will be retrieved when replaying your SLO based on Azure Monitor.
- A longer period can extend the data loading time for your SLO.
- Must be a positive whole number up to the maximum period value you've set when adding the Azure Monitor data source.
Select the Data Type:

Azure Monitor Metrics
Azure Monitor Logs

Azure Monitor Metrics: captures numeric data from your monitored resources.

Specify the Resource you need to collect metrics for.
Set the path to your required resource using the Subscription, Resource Group, and Resource fields. Make sure your selected resource holds the metrics you need to collect.
Select a Namespace from the list of available namespaces, when required.
A namespace is how Azure Monitor groups similar metrics together.
Configure Metric:

Select the metric type.
- A Threshold metric where a single time series is evaluated against a threshold.
- A Ratio metric that allows you to enter two time series for comparison. You can choose one of the following metric types:
  - Good metric, meaning a ratio of good requests to total requests.
  - Bad metric, meaning a ratio of bad requests to total requests.
Choose the Data Count Method for your ratio metric:

Non-incremental: counts incoming metric values one by one, so the resulting SLO graph is spike-shaped.
Incremental: counts the incoming metric values incrementally, adding each new value to the previous ones. It results in a constantly increasing SLO graph.

Select the Metric Name in the list of available metrics.
When you cannot find the required metric name in the list or the list displays No matching data, check if your selected resource holds the metric you need.
Choose the metric Aggregation to determine how the incoming data is processed.
- Sum: the sum of all datapoints in an aggregation window
- Average: the average of the datapoints in an aggregation window. Usually, it's Sum/Count
- Maximum: the greatest datapoint value in an aggregation window
- Minimum: the lowest datapoint value in an aggregation window
- Count: the number of datapoints in an aggregation window. This type considers only how many datapoints are received, not their values
Select Dimensions if any are applied to your chosen metric in Azure.
You can select the Value from the list or enter the required one. An entered value can contain up to 255 characters (ASCII only).
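To make the Aggregation options above concrete, here is a minimal Python sketch that reduces the datapoints of one aggregation window with each method. It only illustrates the arithmetic: the function name `aggregate` and the sample values are assumptions made for this example, not a description of how Azure Monitor or Nobl9 implement aggregation.

```python
from statistics import mean

# Map each Aggregation option from the list above to a plain Python reducer.
REDUCERS = {
    "Sum": sum,
    "Average": mean,
    "Maximum": max,
    "Minimum": min,
    "Count": len,  # looks only at how many datapoints arrived, not at their values
}


def aggregate(datapoints, method):
    """Reduce the datapoints of one aggregation window to a single value."""
    if not datapoints:
        raise ValueError("empty aggregation window")
    return REDUCERS[method](datapoints)


# Hypothetical datapoints received within one aggregation window.
window = [2.0, 4.0, 6.0, 8.0]
results = {method: aggregate(window, method) for method in REDUCERS}
print(results)
```

For this window, Average equals Sum divided by Count, which matches the note in the list above.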

Azure Monitor Logs: captures log and performance data from your monitored resources.

Specify the Resource you need to collect metrics for.
Set the path to your required logs workspace using the Subscription, Resource Group, and Workspace fields.
Configure Metric:

Select the metric type.
- A Threshold metric where a single time series is evaluated against a threshold.
- A Ratio metric that allows you to enter two time series for comparison. You can choose one of the following metric types:
  - Good metric, meaning a ratio of good requests to total requests.
  - Bad metric, meaning a ratio of bad requests to total requests.
Choose the Data Count Method for your ratio metric:

Non-incremental: counts incoming metric values one by one, so the resulting SLO graph is spike-shaped.
Incremental: counts the incoming metric values incrementally, adding each new value to the previous ones. It results in a constantly increasing SLO graph.

Enter the Query.
The query uses the Kusto Query Language and is case-sensitive.

Include n9_time and n9_value in the query.
The query must return both columns, n9_time and n9_value, to Nobl9.
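The two Data Count Methods described above differ in the shape of the series they produce. The following Python sketch shows that shape using made-up per-interval counts. It illustrates the description in this guide only; it is not Nobl9's internal processing, and the variable names are assumptions for this example.

```python
from itertools import accumulate

# Hypothetical number of good events observed in four consecutive intervals.
per_interval = [3, 0, 5, 2]

# Non-incremental: each point stands on its own, so the graph goes up and down.
non_incremental = list(per_interval)

# Incremental: each point adds the new value to all previous ones,
# so the graph never decreases.
incremental = list(accumulate(per_interval))

print("non-incremental:", non_incremental)
print("incremental:    ", incremental)
```

The incremental series ends at the total of all values, while the non-incremental series keeps each interval's value separate.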

Creating Azure Monitor SLOs with sloctl

SLI values for good and total

When choosing the query for the ratio SLI (countMetrics), keep in mind that the values resulting from that query for both good and total:

Must be positive.
We recommend integers, but fractions are also acceptable.

If you use fractions, we recommend values larger than 1e-4 (0.0001).

Shouldn't be larger than 1e+20.

Define the Time window for your SLO:
- Rolling time windows constantly move forward as time passes. This type can help track the most recent events.
- Calendar-aligned time windows suit SLOs that map to business metrics measured on a calendar-aligned basis.
Configure the Error budget calculation method and Objectives:
- Occurrences method counts good attempts against the count of total attempts.
- Time Slices method measures how many good minutes were achieved (when a system operates within defined boundaries) during a time window.
- You can define up to 12 objectives for an SLO.
Add the Display name, Name, and other settings for your SLO:
- Name identifies your SLO in Nobl9. After you save the SLO, its name becomes read-only.
  Use only lowercase letters, numbers, and dashes.
- Select No data anomaly alert to receive notifications when your SLO stops reporting data for a specified period:
  - Choose up to five supported Alert methods.
  - Specify the delay period before Nobl9 sends an alert about the missing data.
    From 5 minutes to 31 days. Default: 15 minutes
- Add alert policies, labels, and links, if required.
  Limits per SLO: 20 alert policies or links, 30 labels.
Click CREATE SLO.
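The two error budget calculation methods named in the steps above can give different results for the same data. Here is a minimal Python sketch of the difference, assuming per-minute pairs of good and total attempts. The `slice_target` parameter is a hypothetical stand-in for the "defined boundaries" that make a minute good; the function names and sample numbers are likewise assumptions for illustration.

```python
# Hypothetical (good, total) attempt counts for four consecutive minutes.
minutes = [(100, 100), (90, 100), (10, 20), (50, 50)]


def occurrences_ratio(minutes):
    """Occurrences: good attempts counted against total attempts."""
    good = sum(g for g, _ in minutes)
    total = sum(t for _, t in minutes)
    return good / total


def time_slices_ratio(minutes, slice_target):
    """Time Slices: share of minutes in which the system stayed within bounds.

    A minute is treated as good when its own good/total ratio reaches
    slice_target (an assumption made for this sketch).
    """
    good_minutes = sum(1 for g, t in minutes if t > 0 and g / t >= slice_target)
    return good_minutes / len(minutes)


print(f"occurrences: {occurrences_ratio(minutes):.4f}")
print(f"time slices: {time_slices_ratio(minutes, slice_target=0.95):.4f}")
```

In this sample, a low-traffic minute with many failures barely moves the Occurrences result but counts as one full bad minute under Time Slices.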

SLO configuration use case

Check the SLO configuration use case for a real-life SLO example.

sloctl

SLI values for good and total

When choosing the query for the ratio SLI (countMetrics), keep in mind that the values resulting from that query for both good and total:

Must be positive.
We recommend integers, but fractions are also acceptable.

If you use fractions, we recommend values larger than 1e-4 (0.0001).

Shouldn't be larger than 1e+20.
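The constraints above can be checked before you apply an SLO. The following Python sketch is one way to do that, assuming you already have sample values returned by your good and total queries. The function name `check_sli_value` and the message texts are assumptions for this example; Nobl9 does not provide this helper.

```python
LOWER_RECOMMENDED = 1e-4  # fractions are recommended to be larger than this
UPPER_LIMIT = 1e20        # values should not be larger than this


def check_sli_value(value):
    """Return a list of problems found for one good or total value."""
    problems = []
    if value <= 0:
        problems.append("must be positive")
    elif value <= LOWER_RECOMMENDED:
        problems.append("fraction should be larger than 1e-4")
    if value > UPPER_LIMIT:
        problems.append("should not be larger than 1e+20")
    return problems


# Hypothetical values taken from query results.
for sample in (42, 0.5, -3, 0.00005, 1e21):
    print(sample, check_sli_value(sample) or "ok")
```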

Azure Monitor Metrics

Threshold (rawMetric)
Ratio (countMetrics) good over total
Ratio (countMetrics) bad over total

Sample Azure Monitor threshold metrics SLO
apiVersion: n9/v1alpha
kind: SLO
metadata:
  name: api-server-slo
  displayName: API Server SLO
  project: default
  labels:
    area:
      - latency
      - slow-check
    env:
      - prod
      - dev
    region:
      - us
      - eu
    team:
      - green
      - sales
  annotations:
    area: latency
    env: prod
    region: us
    team: sales
spec:
  description: Example Azure Monitor SLO
  indicator:
    metricSource:
      name: azure-monitor
      project: default
      kind: Agent
  budgetingMethod: Occurrences
  objectives:
    - displayName: Good response (200)
      value: 200.0
      name: ok
      target: 0.95
      rawMetric:
        query:
          azureMonitor:
            dataType: metrics
            resourceId: /subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/myResourceGroup/providers/Microsoft.Compute/virtualMachines/api-server
            metricName: Percentage CPU
            aggregation: Avg
            metricNamespace: azure.applicationinsights
      op: lte
      primary: true
  service: api-server
  timeWindows:
    - unit: Month
      count: 1
      isRolling: false
      calendar:
        startTime: "2022-12-01 00:00:00"
        timeZone: UTC
  alertPolicies:
    - fast-burn-5x-for-last-10m
  attachments:
    - url: https://docs.nobl9.com
      displayName: Nobl9 Documentation
  anomalyConfig:
    noData:
      alertMethods:
        - name: slack-notification
          project: default
      alertAfter: 1h

Sample Azure Monitor good over total ratio metrics SLO
apiVersion: n9/v1alpha
kind: SLO
metadata:
  name: api-server-slo
  displayName: API Server SLO
  project: default
  labels:
    area:
      - latency
      - slow-check
    env:
      - prod
      - dev
    region:
      - us
      - eu
    team:
      - green
      - sales
  annotations:
    area: latency
    env: prod
    region: us
    team: sales
spec:
  description: Example Azure Monitor SLO
  indicator:
    metricSource:
      name: azure-monitor
      project: default
      kind: Agent
  budgetingMethod: Occurrences
  objectives:
    - displayName: Good response (200)
      value: 1.0
      name: ok
      target: 0.95
      countMetrics:
        incremental: true
        good:
          azureMonitor:
            dataType: metrics
            resourceId: /subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/myResourceGroup/providers/Microsoft.Compute/virtualMachines/api-server
            metricName: Http2xx
            aggregation: Sum
        total:
          azureMonitor:
            dataType: metrics
            resourceId: /subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/myResourceGroup/providers/Microsoft.Compute/virtualMachines/api-server
            metricName: Requests
            aggregation: Sum
      primary: true
  service: api-server
  timeWindows:
    - unit: Month
      count: 1
      isRolling: false
      calendar:
        startTime: "2022-12-01 00:00:00"
        timeZone: UTC
  alertPolicies:
    - fast-burn-5x-for-last-10m
  attachments:
    - url: https://docs.nobl9.com
      displayName: Nobl9 Documentation
  anomalyConfig:
    noData:
      alertMethods:
        - name: slack-notification
          project: default
      alertAfter: 1h

Sample Azure Monitor bad over total ratio metrics SLO
apiVersion: n9/v1alpha
kind: SLO
metadata:
  name: api-server-slo
  displayName: API Server SLO
  project: default
  labels:
    area:
      - latency
      - slow-check
    env:
      - prod
      - dev
    region:
      - us
      - eu
    team:
      - green
      - sales
  annotations:
    area: latency
    env: prod
    region: us
    team: sales
spec:
  description: Example Azure Monitor SLO
  indicator:
    metricSource:
      name: azure-monitor
      project: default
      kind: Agent
  budgetingMethod: Occurrences
  objectives:
    - displayName: Good response (200)
      value: 1.0
      name: ok
      target: 0.95
      countMetrics:
        incremental: true
        bad:
          azureMonitor:
            dataType: metrics
            resourceId: /subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/myResourceGroup/providers/Microsoft.Compute/virtualMachines/api-server
            metricName: Http4xx
            aggregation: Sum
        total:
          azureMonitor:
            dataType: metrics
            resourceId: /subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/myResourceGroup/providers/Microsoft.Compute/virtualMachines/api-server
            metricName: Requests
            aggregation: Sum
      primary: true
  service: api-server
  timeWindows:
    - unit: Month
      count: 1
      isRolling: false
      calendar:
        startTime: "2022-12-01 00:00:00"
        timeZone: UTC
  alertPolicies:
    - fast-burn-5x-for-last-10m
  attachments:
    - url: https://docs.nobl9.com
      displayName: Nobl9 Documentation
  anomalyConfig:
    noData:
      alertMethods:
        - name: slack-notification
          project: default
      alertAfter: 1h

Azure Monitor Logs

Threshold (rawMetric)
Ratio (countMetrics) good over total
Ratio (countMetrics) bad over total

Sample Azure Monitor threshold logs SLO
apiVersion: n9/v1alpha
kind: SLO
metadata:
  name: api-server-slo
  displayName: API Server SLO
  project: default
  labels:
    area:
      - latency
      - slow-check
    env:
      - prod
      - dev
    region:
      - us
      - eu
    team:
      - green
      - sales
  annotations:
    area: latency
    env: prod
    region: us
    team: sales
spec:
  description: Example Azure Monitor SLO
  indicator:
    metricSource:
      name: azure-monitor
      project: default
      kind: Agent
  budgetingMethod: Occurrences
  objectives:
    - displayName: Good response (200)
      value: 200
      name: ok
      target: 0.95
      rawMetric:
        query:
          azureMonitor:
            dataType: logs
            workspace:
              subscriptionId: 00000000-0000-0000-0000-000000000000
              resourceGroup: myResourceGroup
              workspaceId: 11111111-1111-1111-1111-111111111111
            kqlQuery: |-
              AppRequests
              | where AppRoleName == "api-server"
              | summarize n9_value = avg(DurationMs) by bin(TimeGenerated, 15s)
              | project n9_time = TimeGenerated, n9_value
      op: lte
      primary: true
  service: api-server
  timeWindows:
    - unit: Month
      count: 1
      isRolling: false
      calendar:
        startTime: '2022-12-01 00:00:00'
        timeZone: UTC
  alertPolicies:
    - fast-burn-5x-for-last-10m
  attachments:
    - url: https://docs.nobl9.com
      displayName: Nobl9 Documentation
  anomalyConfig:
    noData:
      alertMethods:
        - name: slack-notification
          project: default
      alertAfter: 1h

Sample Azure Monitor good over total ratio logs SLO
apiVersion: n9/v1alpha
kind: SLO
metadata:
  name: api-server-slo
  displayName: API Server SLO
  project: default
  labels:
    area:
      - latency
      - slow-check
    env:
      - prod
      - dev
    region:
      - us
      - eu
    team:
      - green
      - sales
  annotations:
    area: latency
    env: prod
    region: us
    team: sales
spec:
  description: Example Azure Monitor SLO
  indicator:
    metricSource:
      name: azure-monitor
      project: default
      kind: Agent
  budgetingMethod: Occurrences
  objectives:
    - displayName: Good response (200)
      value: 1
      name: ok
      target: 0.95
      countMetrics:
        incremental: true
        good:
          azureMonitor:
            dataType: logs
            workspace:
              subscriptionId: 00000000-0000-0000-0000-000000000000
              resourceGroup: myResourceGroup
              workspaceId: 11111111-1111-1111-1111-111111111111
            kqlQuery: |-
              AppRequests
              | where AppRoleName == "my-app"
              | where ResultCode >= 200 and ResultCode < 400
              | summarize n9_value = count() by bin(TimeGenerated, 15s)
              | project n9_time = TimeGenerated, n9_value
        total:
          azureMonitor:
            dataType: logs
            workspace:
              subscriptionId: 00000000-0000-0000-0000-000000000000
              resourceGroup: myResourceGroup
              workspaceId: 11111111-1111-1111-1111-111111111111
            kqlQuery: |-
              AppRequests
              | where AppRoleName == "my-app"
              | summarize n9_value = count() by bin(TimeGenerated, 15s)
              | project n9_time = TimeGenerated, n9_value
      primary: true
  service: api-server
  timeWindows:
    - unit: Month
      count: 1
      isRolling: false
      calendar:
        startTime: '2022-12-01 00:00:00'
        timeZone: UTC
  alertPolicies:
    - fast-burn-5x-for-last-10m
  attachments:
    - url: https://docs.nobl9.com
      displayName: Nobl9 Documentation
  anomalyConfig:
    noData:
      alertMethods:
        - name: slack-notification
          project: default
      alertAfter: 1h

Sample Azure Monitor bad over total ratio logs SLO
apiVersion: n9/v1alpha
kind: SLO
metadata:
  name: api-server-slo
  displayName: API Server SLO
  project: default
  labels:
    area:
      - latency
      - slow-check
    env:
      - prod
      - dev
    region:
      - us
      - eu
    team:
      - green
      - sales
  annotations:
    area: latency
    env: prod
    region: us
    team: sales
spec:
  description: Example Azure Monitor SLO
  indicator:
    metricSource:
      name: azure-monitor
      project: default
      kind: Agent
  budgetingMethod: Occurrences
  objectives:
    - displayName: Good response (200)
      value: 1
      name: ok
      target: 0.95
      countMetrics:
        incremental: true
        bad:
          azureMonitor:
            dataType: logs
            workspace:
              subscriptionId: 00000000-0000-0000-0000-000000000000
              resourceGroup: myResourceGroup
              workspaceId: 11111111-1111-1111-1111-111111111111
            kqlQuery: |-
              AppRequests
              | where AppRoleName == "my-app"
              | where ResultCode == 0 or ResultCode >= 400
              | summarize n9_value = count() by bin(TimeGenerated, 15s)
              | project n9_time = TimeGenerated, n9_value
        total:
          azureMonitor:
            dataType: logs
            workspace:
              subscriptionId: 00000000-0000-0000-0000-000000000000
              resourceGroup: myResourceGroup
              workspaceId: 11111111-1111-1111-1111-111111111111
            kqlQuery: |-
              AppRequests
              | where AppRoleName == "my-app"
              | summarize n9_value = count() by bin(TimeGenerated, 15s)
              | project n9_time = TimeGenerated, n9_value
      primary: true
  service: api-server
  timeWindows:
    - unit: Month
      count: 1
      isRolling: false
      calendar:
        startTime: '2022-12-01 00:00:00'
        timeZone: UTC
  alertPolicies:
    - fast-burn-5x-for-last-10m
  attachments:
    - url: https://docs.nobl9.com
      displayName: Nobl9 Documentation
  anomalyConfig:
    noData:
      alertMethods:
        - name: slack-notification
          project: default
      alertAfter: 1h
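Every Logs sample above ends by projecting n9_time and n9_value, as the query requirements demand. A minimal pre-flight check, sketched in Python, can catch a query that does not mention those column names at all. It is a text-only check under stated assumptions: it cannot confirm that the query actually returns the columns, and the helper `missing_columns` is hypothetical, not part of sloctl or Nobl9.

```python
import re

REQUIRED_COLUMNS = ("n9_time", "n9_value")


def missing_columns(kql_query):
    """Return the required column names that never appear in the query text.

    The match is case-sensitive, in line with Kusto Query Language.
    """
    return [
        column
        for column in REQUIRED_COLUMNS
        if not re.search(rf"\b{re.escape(column)}\b", kql_query)
    ]


# Query text in the style of the samples above.
sample_query = """AppRequests
| where AppRoleName == "my-app"
| summarize n9_value = count() by bin(TimeGenerated, 15s)
| project n9_time = TimeGenerated, n9_value"""

print(missing_columns(sample_query))
print(missing_columns("AppRequests | summarize N9_VALUE = count()"))
```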

Querying the Azure Monitor API

The Nobl9 agent uses the Azure Monitor Data Plane API to retrieve data.

Useful links

For a more in-depth look, consult additional resources:

- Adding Azure Monitor as a data source (Adding data sources)
- Creating SLOs via Terraform (Terraform)
- Azure Monitor Metrics overview (Azure docs)
- Azure Monitor Logs overview (Azure docs)
- Best practices for Kusto Query Language queries (Azure docs)
