Azure Monitor
Azure Monitor is a monitoring solution that collects and aggregates data for further interpretation and response.
Nobl9 integration with Azure Monitor supports collecting Azure Monitor metrics, including Application Insights. With it, Nobl9 users can retrieve metrics and build SLOs based on them.
Azure Monitor parameters and supported features in Nobl9
- General support:
  - Release channel: Beta
  - Connection method: Agent, Direct
  - Replay and SLI Analyzer: Historical data limit 30 days
  - Event logs: Supported
  - Query checker: Not supported
  - Query parameters retrieval: Supported
  - Timestamp cache persistence: Supported
- Query parameters:
  - Query interval: 1 min
  - Query delay: 5 min
  - Jitter: 15 sec
  - Timeout: 60 sec
- Agent details and minimum required versions for supported features:
  - Plugin name: n9azure_monitor
  - Query delay environment variable: AZURE_MONITOR_QUERY_DELAY
  - Replay and SLI Analyzer: 0.69.0-beta01
  - Query parameters retrieval: 0.71.0-beta
  - Timestamp cache persistence: 0.69.0-beta01
- Additional notes:
  - Support for Azure Monitor Metrics and Azure Monitor Logs
Creating SLOs with Azure Monitor
Nobl9 integration with Azure Monitor supports the Azure Monitor Metrics and Azure Monitor Logs data types.
For both data types, you can create threshold metrics and ratio metrics (good over total or bad over total).
These methods are available in the UI and by applying YAML via sloctl.
Nobl9 Web
Follow the instructions below to create your SLOs with Azure Monitor in the UI:
- Navigate to Service Level Objectives.
- Click the button that starts SLO creation.
- Select a Service.
  It will be the location for your SLO in Nobl9.
- Select your Azure Monitor data source.
- Modify Period for Historical Data Retrieval, when necessary.
  - This value defines how far back in the past your data will be retrieved when replaying your SLO based on Azure Monitor.
  - A longer period can extend the data loading time for your SLO.
  - It must be a positive whole number up to the maximum period value you set when adding the Azure Monitor data source.
- Select the Data Type:
  - Azure Monitor Metrics
  - Azure Monitor Logs
Azure Monitor Metrics: captures numeric data from your monitored resources.
- Specify the Resource you need to collect metrics for.
  Set the path to your required resource using the Subscription, Resource Group, and Resource fields. Make sure your selected resource holds the metrics you need to collect.
- Select the Namespace from the list of available namespaces, when required.
  A namespace is how Azure Monitor groups similar metrics together.
- Configure Metric:
  Select the metric type.
  - A Threshold metric, where a single time series is evaluated against a threshold.
  - A Ratio metric, which allows you to enter two time series for comparison. You can choose one of the following metric types:
    - Good metric, meaning a ratio of good requests and total requests.
    - Bad metric, meaning a ratio of bad requests and total requests.
  - Choose the Data Count Method for your ratio metric:
    - Non-incremental: counts incoming metric values one by one, so the resulting SLO graph is spike-shaped.
    - Incremental: counts the incoming metric values incrementally, adding every next value to the previous values. It results in a constantly increasing SLO graph.
- Select the Metric Name from the list of available metrics.
  If you cannot find the required metric name in the list, or the list displays No matching data, check whether your selected resource holds the metric you need.
- Choose the metric Aggregation to determine how the incoming data is processed.
  The following aggregations are available:
  - Sum: the sum of all datapoints in an aggregation window
  - Average: the average of datapoints per aggregation window. Usually, it's Sum/Count
  - Maximum: the greatest datapoint value in an aggregation window
  - Minimum: the lowest datapoint value in an aggregation window
  - Count: the number of datapoints in an aggregation window. This type considers only how many datapoints are received, not their values
- Select Dimensions if any are applied to your chosen metric in Azure.
  You can select the Value from the list or enter the required one. Make sure the value you enter is up to 255 characters long (ASCII only).
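To make the aggregation options above concrete, here is a minimal Python sketch that applies each of them to the datapoints of one aggregation window. The `aggregate` function and the sample values are illustrative assumptions; Azure Monitor computes these aggregations on its own side, and this is not its implementation.

```python
from statistics import mean


def aggregate(datapoints, aggregation):
    """Apply one aggregation to the datapoints of a single window (illustration only)."""
    if not datapoints:
        raise ValueError("the aggregation window has no datapoints")
    operations = {
        "Sum": sum,
        "Average": mean,   # usually the same as Sum / Count
        "Maximum": max,
        "Minimum": min,
        "Count": len,      # looks only at how many datapoints arrived
    }
    return operations[aggregation](datapoints)


# Hypothetical datapoints received within one aggregation window.
window = [2, 4, 6, 8]
results = {
    name: aggregate(window, name)
    for name in ("Sum", "Average", "Maximum", "Minimum", "Count")
}
print(results)
```

Note that Count stays the same for any four datapoints, whatever their values, which is why it suits metrics where each datapoint represents one event.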
Azure Monitor Logs: captures log and performance data from your monitored resources.
- Specify the Resource you need to collect metrics for.
  Set the path to your required logs workspace using the Subscription, Resource Group, and Workspace fields.
- Configure Metric:
  Select the metric type.
  - A Threshold metric, where a single time series is evaluated against a threshold.
  - A Ratio metric, which allows you to enter two time series for comparison. You can choose one of the following metric types:
    - Good metric, meaning a ratio of good requests and total requests.
    - Bad metric, meaning a ratio of bad requests and total requests.
  - Choose the Data Count Method for your ratio metric:
    - Non-incremental: counts incoming metric values one by one, so the resulting SLO graph is spike-shaped.
    - Incremental: counts the incoming metric values incrementally, adding every next value to the previous values. It results in a constantly increasing SLO graph.
- Enter the Query.
  The query uses the Kusto Query Language and is case-sensitive.
  - Include n9_time and n9_value in the query.
  - The query must return n9_time and n9_value to Nobl9.
  See query examples in Creating Azure Monitor SLOs with sloctl.
When choosing the query for a ratio metric (countMetrics), keep in mind that the values resulting from that query for both good and total:
- Must be positive.
- While we recommend using integers, fractions are also acceptable.
  - If using fractions, we recommend them to be larger than 1e-4 = 0.0001.
- Shouldn't be larger than 1e+20.
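The recommendations above for good and total values can be checked before a query is used in an SLO. The following is a minimal Python sketch under stated assumptions: `check_ratio_value` is a hypothetical helper, not a Nobl9 function, and it treats zero as acceptable, flagging only negative numbers as a violation of "must be positive".

```python
MIN_RECOMMENDED_FRACTION = 1e-4  # fractions are recommended to be larger than this
MAX_VALUE = 1e20                 # values shouldn't be larger than this


def check_ratio_value(value):
    """Return the list of problems found for one good/total value (empty if none)."""
    problems = []
    if value < 0:
        # Assumption: zero counts are tolerated, negative ones are not.
        problems.append("must be positive")
    elif value != int(value) and value <= MIN_RECOMMENDED_FRACTION:
        problems.append("fraction is not larger than 1e-4")
    if value > MAX_VALUE:
        problems.append("larger than 1e+20")
    return problems


for sample in (10, 0.5, -1, 0.00005, 1e21):
    print(sample, check_ratio_value(sample))
```

A check like this is most useful when run against sample query results exported from Azure, because out-of-range values are easier to spot there than on an SLO chart.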
- Define the Time window for your SLO:
  - Rolling time windows constantly move forward as time passes. This type can help track the most recent events.
  - Calendar-aligned time windows are usable for SLOs intended to map to business metrics measured on a calendar-aligned basis.
- Configure the Error budget calculation method and Objectives:
  - The Occurrences method counts good attempts against the count of total attempts.
  - The Time Slices method measures how many good minutes were achieved (when a system operates within defined boundaries) during a time window.
  - You can define up to 12 objectives for an SLO.
  Similar threshold values for objectives
  To use similar threshold values for different objectives in your SLO, we recommend differentiating them by setting varying decimal points for each objective.
  For example, if you want to use threshold value 1 for two objectives, set it to 1.0000001 for the first objective and to 1.0000002 for the second one.
  Learn more about threshold value uniqueness.
- Add the Display name, Name, and other settings for your SLO:
  - Name identifies your SLO in Nobl9. After you save the SLO, its name becomes read-only.
    Use only lowercase letters, numbers, and dashes.
  - Create composite SLO: with this option selected, you create a composite SLO 1.0. Composite SLOs 1.0 are deprecated. They're fully operable; however, we encourage you to create new composite SLOs 2.0.
    You can create composite SLOs 2.0 with sloctl using the provided template. Alternatively, you can create a composite SLO 2.0 with the Nobl9 Terraform provider.
  - Set Notifications on data. With it, Nobl9 notifies you when your SLO hasn't reported data for more than 15 minutes.
  - Add alert policies, labels, and links, if required.
    Up to 20 items of each type per SLO are allowed.
- Click CREATE SLO.
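The difference between the Occurrences and Time Slices methods described in the steps above is easier to see on numbers. The sketch below is an illustration under assumptions, not Nobl9's implementation: the input is a hypothetical list of per-minute `(good, total)` counts, and a minute is assumed to be "good" when its own good/total ratio meets a per-slice target.

```python
def occurrences_reliability(minutes):
    """Good attempts divided by total attempts across the whole time window."""
    good = sum(g for g, _ in minutes)
    total = sum(t for _, t in minutes)
    return good / total


def time_slices_reliability(minutes, slice_target):
    """Share of minutes whose own good/total ratio meets slice_target."""
    good_minutes = sum(1 for g, t in minutes if t > 0 and g / t >= slice_target)
    return good_minutes / len(minutes)


# One busy healthy minute, one quiet failing minute, two quiet healthy minutes.
minutes = [(1000, 1000), (5, 10), (10, 10), (10, 10)]

occurrences = occurrences_reliability(minutes)
time_slices = time_slices_reliability(minutes, slice_target=0.9)
print(f"Occurrences: {occurrences:.4f}")
print(f"Time Slices: {time_slices:.4f}")
```

With this data the failing minute carries little traffic, so Occurrences barely registers it, while Time Slices counts it as one bad minute out of four. Which behavior is preferable depends on whether traffic volume or elapsed time matters more for the service.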
sloctl
Azure Monitor Metrics
- Threshold (rawMetric)
- Ratio (countMetric) good over total
- Ratio (countMetric) bad over total
apiVersion: n9/v1alpha
kind: SLO
metadata:
  name: my-slo
  project: my-project
spec:
  # Enum: Occurrences | Timeslices
  budgetingMethod: Timeslices
  description: ""
  indicator:
    metricSource:
      # Your data source name identifier
      name: azure-data-source
  service: my-service
  objectives:
    - displayName: My objective 1
      name: my-objective-1
      # Enum: lte | lt | gte | gt
      op: lte
      rawMetric:
        query:
          azureMonitor:
            dataType: metrics
            resourceID: /subscriptions/<SUBSCRIPTION_ID>/resourceGroups/<RESOURCE_GROUP>/providers/<PROVIDER_NAMESPACE>/<PROVIDER_TYPE>/<APP_NAME>
            metricName: <METRIC_NAME>
            # Enum: Sum, Count, Avg, Min, Max
            aggregation: Avg
      # Number, float64
      value: 1
      target: 0.5
      timeSliceTarget: 0.5
    - displayName: My objective 2
      name: my-objective-2
      op: lte
      rawMetric:
        query:
          azureMonitor:
            dataType: metrics
            resourceID: /subscriptions/<SUBSCRIPTION_ID>/resourceGroups/<RESOURCE_GROUP>/providers/<PROVIDER_NAMESPACE>/<PROVIDER_TYPE>/<APP_NAME>
            metricName: <METRIC_NAME>
            aggregation: Avg
      value: 0.5
      target: 0.6
      timeSliceTarget: 0.6
  timeWindows:
    # Number of units in a time window, integer
    - count: 1
      # Boolean: true for rolling time windows
      isRolling: true
      # Enum: Minute | Hour | Day for rolling time windows
      unit: Hour
apiVersion: n9/v1alpha
kind: SLO
metadata:
  name: my-slo
  project: my-project
spec:
  # Enum: Occurrences | Timeslices
  budgetingMethod: Occurrences
  description: ""
  indicator:
    metricSource:
      # Your data source name identifier
      name: azure-data-source
  service: my-service
  objectives:
    # Number, float64
    - target: 0.9
      countMetrics:
        good:
          azureMonitor:
            dataType: metrics
            resourceID: /subscriptions/<SUBSCRIPTION_ID>/resourceGroups/<RESOURCE_GROUP>/providers/<PROVIDER_NAMESPACE>/<PROVIDER_TYPE>/<APP_NAME>
            metricName: <METRIC_NAME>
            # Enum: Sum, Count, Avg, Min, Max
            aggregation: Sum
        # Boolean
        incremental: false
        total:
          azureMonitor:
            dataType: metrics
            resourceID: /subscriptions/<SUBSCRIPTION_ID>/resourceGroups/<RESOURCE_GROUP>/providers/<PROVIDER_NAMESPACE>/<PROVIDER_TYPE>/<APP_NAME>
            metricName: <METRIC_NAME>
            aggregation: Sum
      displayName: ""
      # Number, float64
      value: 1
      name: my-objective
  timeWindows:
    # Number of units in a time window, integer
    - count: 1
      # Boolean: true for rolling time windows
      isRolling: true
      # Enum: Minute | Hour | Day for rolling time windows
      unit: Hour
apiVersion: n9/v1alpha
kind: SLO
metadata:
  name: my-slo
  project: my-project
spec:
  # Enum: Occurrences | Timeslices
  budgetingMethod: Occurrences
  description: ""
  indicator:
    metricSource:
      # Your data source name identifier
      name: azure-data-source
  service: my-service
  objectives:
    # Number, float64
    - target: 0.9
      countMetrics:
        bad:
          azureMonitor:
            dataType: metrics
            resourceID: /subscriptions/<SUBSCRIPTION_ID>/resourceGroups/<RESOURCE_GROUP>/providers/<PROVIDER_NAMESPACE>/<PROVIDER_TYPE>/<APP_NAME>
            metricName: <METRIC_NAME>
            # Enum: Sum, Count, Avg, Min, Max
            aggregation: Sum
        # Boolean
        incremental: false
        total:
          azureMonitor:
            dataType: metrics
            resourceID: /subscriptions/<SUBSCRIPTION_ID>/resourceGroups/<RESOURCE_GROUP>/providers/<PROVIDER_NAMESPACE>/<PROVIDER_TYPE>/<APP_NAME>
            metricName: <METRIC_NAME>
            aggregation: Sum
      displayName: ""
      # Number, float64
      value: 10
      name: my-objective
  timeWindows:
    # Number of units in a time window, integer
    - count: 1
      # Boolean: true for rolling time windows
      isRolling: true
      # Enum: Minute | Hour | Day for rolling time windows
      unit: Hour
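The `resourceID` value in the Azure Monitor Metrics examples above follows a fixed path pattern, so it can be assembled from its parts instead of being typed by hand. Below is a minimal Python sketch of that idea. The `build_resource_id` helper and the sample values (`Microsoft.Web`, `sites`, `my-app`) are assumptions made for illustration; substitute the identifiers of your own resource.

```python
def build_resource_id(subscription_id, resource_group, provider_namespace,
                      provider_type, resource_name):
    """Assemble a resourceID string in the shape used by the YAML examples."""
    parts = {
        "subscription_id": subscription_id,
        "resource_group": resource_group,
        "provider_namespace": provider_namespace,
        "provider_type": provider_type,
        "resource_name": resource_name,
    }
    # Reject empty parts early: they would produce a malformed path.
    for name, value in parts.items():
        if not value:
            raise ValueError(f"{name} must not be empty")
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/{provider_namespace}/{provider_type}/{resource_name}"
    )


# Hypothetical values, for illustration only.
resource_id = build_resource_id(
    "00000000-0000-0000-0000-000000000000",
    "myResourceGroup",
    "Microsoft.Web",
    "sites",
    "my-app",
)
print(resource_id)
```

The printed string can be pasted into the `resourceID` field of both the threshold and the ratio examples.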
Azure Monitor Logs
- Threshold (rawMetric)
- Ratio (countMetric) good over total
- Ratio (countMetric) bad over total
apiVersion: n9/v1alpha
kind: SLO
metadata:
  name: api-server-slo
  displayName: API Server SLO
  project: default
  labels:
    area:
      - latency
      - slow-check
    env:
      - prod
      - dev
    region:
      - us
      - eu
    team:
      - green
      - sales
  annotations:
    area: latency
    env: prod
    region: us
    team: sales
spec:
  description: Example Azure Monitor SLO
  indicator:
    metricSource:
      name: azure-monitor
      project: default
      kind: Agent
  budgetingMethod: Occurrences
  objectives:
    - displayName: Good response (200)
      value: 200
      name: ok
      target: 0.95
      rawMetric:
        query:
          azureMonitor:
            dataType: logs
            workspace:
              subscriptionId: 00000000-0000-0000-0000-000000000000
              resourceGroup: myResourceGroup
              workspaceId: 11111111-1111-1111-1111-111111111111
            kqlQuery: |-
              AppRequests
              | where AppRoleName == "api-server"
              | summarize n9_value = avg(DurationMs) by bin(TimeGenerated, 15s)
              | project n9_time = TimeGenerated, n9_value
      op: lte
      primary: true
  service: api-server
  timeWindows:
    - unit: Month
      count: 1
      isRolling: false
      calendar:
        startTime: 2022-12-01T00:00:00.000Z
        timeZone: UTC
  alertPolicies:
    - fast-burn-5x-for-last-10m
  attachments:
    - url: https://docs.nobl9.com
      displayName: Nobl9 Documentation
  anomalyConfig:
    noData:
      alertMethods:
        - name: slack-notification
          project: default
apiVersion: n9/v1alpha
kind: SLO
metadata:
  name: api-server-slo
  displayName: API Server SLO
  project: default
  labels:
    area:
      - latency
      - slow-check
    env:
      - prod
      - dev
    region:
      - us
      - eu
    team:
      - green
      - sales
  annotations:
    area: latency
    env: prod
    region: us
    team: sales
spec:
  description: Example Azure Monitor SLO
  indicator:
    metricSource:
      name: azure-monitor
      project: default
      kind: Agent
  budgetingMethod: Occurrences
  objectives:
    - displayName: Good response (200)
      value: 1
      name: ok
      target: 0.95
      countMetrics:
        incremental: true
        good:
          azureMonitor:
            dataType: logs
            workspace:
              subscriptionId: 00000000-0000-0000-0000-000000000000
              resourceGroup: myResourceGroup
              workspaceId: 11111111-1111-1111-1111-111111111111
            kqlQuery: |-
              AppRequests
              | where AppRoleName == "my-app"
              | where ResultCode >= 200 and ResultCode < 400
              | summarize n9_value = count() by bin(TimeGenerated, 15s)
              | project n9_time = TimeGenerated, n9_value
        total:
          azureMonitor:
            dataType: logs
            workspace:
              subscriptionId: 00000000-0000-0000-0000-000000000000
              resourceGroup: myResourceGroup
              workspaceId: 11111111-1111-1111-1111-111111111111
            kqlQuery: |-
              AppRequests
              | where AppRoleName == "my-app"
              | summarize n9_value = count() by bin(TimeGenerated, 15s)
              | project n9_time = TimeGenerated, n9_value
      primary: true
  service: api-server
  timeWindows:
    - unit: Month
      count: 1
      isRolling: false
      calendar:
        startTime: 2022-12-01T00:00:00.000Z
        timeZone: UTC
  alertPolicies:
    - fast-burn-5x-for-last-10m
  attachments:
    - url: https://docs.nobl9.com
      displayName: Nobl9 Documentation
  anomalyConfig:
    noData:
      alertMethods:
        - name: slack-notification
          project: default
apiVersion: n9/v1alpha
kind: SLO
metadata:
  name: api-server-slo
  displayName: API Server SLO
  project: default
  labels:
    area:
      - latency
      - slow-check
    env:
      - prod
      - dev
    region:
      - us
      - eu
    team:
      - green
      - sales
  annotations:
    area: latency
    env: prod
    region: us
    team: sales
spec:
  description: Example Azure Monitor SLO
  indicator:
    metricSource:
      name: azure-monitor
      project: default
      kind: Agent
  budgetingMethod: Occurrences
  objectives:
    - displayName: Good response (200)
      value: 1
      name: ok
      target: 0.95
      countMetrics:
        incremental: true
        bad:
          azureMonitor:
            dataType: logs
            workspace:
              subscriptionId: 00000000-0000-0000-0000-000000000000
              resourceGroup: myResourceGroup
              workspaceId: 11111111-1111-1111-1111-111111111111
            kqlQuery: |-
              AppRequests
              | where AppRoleName == "my-app"
              | where ResultCode == 0 or ResultCode >= 400
              | summarize n9_value = count() by bin(TimeGenerated, 15s)
              | project n9_time = TimeGenerated, n9_value
        total:
          azureMonitor:
            dataType: logs
            workspace:
              subscriptionId: 00000000-0000-0000-0000-000000000000
              resourceGroup: myResourceGroup
              workspaceId: 11111111-1111-1111-1111-111111111111
            kqlQuery: |-
              AppRequests
              | where AppRoleName == "my-app"
              | summarize n9_value = count() by bin(TimeGenerated, 15s)
              | project n9_time = TimeGenerated, n9_value
      primary: true
  service: api-server
  timeWindows:
    - unit: Month
      count: 1
      isRolling: false
      calendar:
        startTime: 2022-12-01T00:00:00.000Z
        timeZone: UTC
  alertPolicies:
    - fast-burn-5x-for-last-10m
  attachments:
    - url: https://docs.nobl9.com
      displayName: Nobl9 Documentation
  anomalyConfig:
    noData:
      alertMethods:
        - name: slack-notification
          project: default
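The ratio examples set the `incremental` field, which corresponds to the Data Count Method. As a rough illustration of the two shapes of data, the Python sketch below converts between them. It assumes that a non-incremental series carries a separate count per interval and that an incremental series carries a running total; the function names and sample values are hypothetical, and this is not how Nobl9 processes data internally.

```python
from itertools import accumulate


def to_incremental(per_interval_counts):
    """Running total: every value is added to all previous ones."""
    return list(accumulate(per_interval_counts))


def to_non_incremental(running_totals):
    """Per-interval counts recovered as differences between neighbouring totals."""
    counts = []
    previous = 0
    for total in running_totals:
        counts.append(total - previous)
        previous = total
    return counts


# Hypothetical counts of requests per query interval.
per_interval = [3, 0, 5, 2]
running = to_incremental(per_interval)
print(running)                      # [3, 3, 8, 10]
print(to_non_incremental(running))  # [3, 0, 5, 2]
```

The running totals never decrease, which matches the constantly increasing graph described for the Incremental method, while the per-interval counts rise and fall from one interval to the next.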
Querying the Azure Monitor API
The Nobl9 agent leverages the Azure Monitor Data Plane API to get data.