Azure Monitor managed service for Prometheus
Azure Monitor managed service for Prometheus is a part of Azure Monitor Metrics. It allows collecting Prometheus metrics and analyzing them with Azure Monitor tools.
Integration with Nobl9 lets you collect metrics from Azure Monitor managed service for Prometheus and create SLOs based on them.
Azure Monitor managed service for Prometheus parameters and supported features in Nobl9
- General support:
  - Release channel: Beta
  - Connection method: Agent, Direct
  - Replay and SLI Analyzer: Supported
  - Event logs: Supported
  - Query checker: Not supported
  - Query parameters retrieval: Supported
  - Timestamp cache persistence: Supported
- Query parameters:
  - Query interval: 1 min
  - Query delay: 0
  - Jitter: 15 sec
  - Timeout: 30 sec
- Agent details and minimum required versions for supported features:
  - Environment variable: PROM_QUERY_DELAY
  - Plugin name: n9prometheus
  - Replay and SLI Analyzer: 0.78.0-beta
  - Maximum historical data retrieval period: 30 days
  - Query parameters retrieval: 0.78.0-beta
  - Timestamp cache persistence: 0.78.0-beta
  - Custom HTTP headers: 0.83.0-beta
- Additional notes:
  - Support for Prometheus metrics
Authentication
To query an Azure Monitor workspace, authenticate with Microsoft Entra ID using your client_id and client_secret.
To do this:
- Register an Azure application with Microsoft Entra ID.
- Assign your application the Monitoring Data Reader role on the required Azure Monitor workspace.
This role meets the Nobl9 requirements for metric collection.
You can also use sloctl. This way, you can configure SLOs for your Azure Cloud application without the resource and metric autocompletion.
We recommend granting the Monitoring Data Reader role on the subscription or resource group level rather than a specific resource. A broader scope provides a more comprehensive choice of subscriptions, resource groups, resources, and metrics in the Nobl9 platform.
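As a rough illustration of what authenticating with client_id and client_secret involves, the sketch below builds (but does not send) an OAuth 2.0 client-credentials token request. The token endpoint and the scope value are assumptions based on the general Microsoft identity platform conventions, not details stated on this page, and build_token_request is a hypothetical helper name. Nobl9 performs this exchange for you once the credentials are configured.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumptions (not taken from this page): the Microsoft identity platform
# v2.0 token endpoint and the Azure Monitor Prometheus scope.
TOKEN_URL = "https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
PROMETHEUS_SCOPE = "https://prometheus.monitor.azure.com/.default"


def build_token_request(tenant_id: str, client_id: str, client_secret: str) -> Request:
    """Build, without sending, a client-credentials token request."""
    form = urlencode(
        {
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": client_secret,
            "scope": PROMETHEUS_SCOPE,
        }
    ).encode("ascii")
    return Request(
        TOKEN_URL.format(tenant_id=tenant_id),
        data=form,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would perform the actual exchange; the sketch stops short of that so that it runs without network access or real credentials.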
URL
The Azure Monitor managed service for Prometheus agent requests the Range queries API endpoint in the /api/v1/query_range form. For example:
GET /api/v1/query_range
POST /api/v1/query_range
Omit the /api/v1/query_range API path from the URL. Specify only the base URL for your Prometheus server.
For example, if your Prometheus server is available under <http://prometheus.example.com> and you access the API via <http://prometheus.example.com/api/v1>, then use only the <http://prometheus.example.com> part.
Other APIs or Web UIs have similar path endings. Omit them in the URL: for example, the /graph part of the path.
This integration focuses on querying the Prometheus servers, not fetching metrics directly from services. Avoid using URLs pointing to service endpoints that expose data in the Prometheus format (often under the /metrics path).
Learn about how to find a query endpoint.
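The URL rules above can be sketched as a small normalization helper. This is only an illustration of the rules as stated in this section; base_prometheus_url is a hypothetical function, and the suffix list covers just the path endings named above.

```python
from urllib.parse import urlsplit, urlunsplit

# Path endings named in this section; longer suffixes are checked first.
_SUFFIXES = ("/api/v1/query_range", "/api/v1", "/graph")


def base_prometheus_url(url: str) -> str:
    """Return the URL with known API or Web UI path endings removed."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/")
    for suffix in _SUFFIXES:
        if path.endswith(suffix):
            path = path[: -len(suffix)]
            break
    if path.endswith("/metrics"):
        # Scrape endpoints expose metrics; they are not Prometheus servers.
        raise ValueError("URL points at a /metrics endpoint, not a Prometheus server")
    # Query string and fragment are dropped on purpose.
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))
```

For instance, http://prometheus.example.com/api/v1 and http://prometheus.example.com/graph both reduce to http://prometheus.example.com, while a URL ending in /metrics is rejected.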
Adding Azure Monitor managed service for Prometheus as a data source
You can connect the Azure Monitor managed service for Prometheus data source using the direct or agent connection methods.
Direct connection method
Nobl9 Web
- Navigate to Integrations > Sources.
- Click .
- Click the required Source button.
- Choose Direct.
- Enter the URL of your required base Prometheus server.
- Enter your Azure Tenant ID.
  It is an 8-4-4-4-12-character code containing digits 0-9 and letters Aa-Ff.
- Enter your Microsoft Entra Client ID and Client Secret.
- Select a Project.
  Specifying a project is helpful when multiple users are spread across multiple teams or projects. When the Project field is left blank, Nobl9 uses the default project.
- Enter a Display Name.
  You can enter a user-friendly name with spaces in this field.
- Enter a Name.
  The name is mandatory and can only contain lowercase alphanumeric characters and dashes (for example, my-project-1). Nobl9 duplicates the display name here, transforming it into the supported format, but you can edit the result.
- Enter a Description.
  Here you can add details such as who is responsible for the integration (team/owner) and the purpose of creating it.
- Specify the Query delay to set a customized delay for queries when pulling the data from the data source.
  - The default value in Azure Monitor managed service for Prometheus integration for Query delay is 0 minutes.
  Info: Changing the Query delay may affect your SLI data. For more details, check the Query delay documentation.
- Enter a Maximum Period for Historical Data Retrieval.
  - This value defines how far back in the past your data will be retrieved when replaying your SLO based on this data source.
  - The maximum period value depends on the data source. Find the maximum value for your data source.
  - A greater period can extend the loading time when creating an SLO.
  - The value must be a positive integer.
- Enter a Default Period for Historical Data Retrieval.
  - It is used by SLOs connected to this data source.
  - The value must be a positive integer or 0.
  - By default, this value is set to 0. When you set it to >0, you will create SLOs with Replay.
- Click Add Data Source
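The format rules for the Tenant ID and Name fields can be checked before you fill in the form. The sketch below encodes only the rules as worded in the steps above; the function names are hypothetical, and suggest_name is a guess at how a display name could be turned into a valid name, not Nobl9's actual transformation.

```python
import re

# 8-4-4-4-12 groups of hexadecimal characters (digits 0-9, letters a-f, either case).
TENANT_ID_RE = re.compile(r"[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}")
# Lowercase alphanumeric characters and dashes only, as stated for the Name field.
NAME_RE = re.compile(r"[a-z0-9-]+")


def is_valid_tenant_id(value: str) -> bool:
    return TENANT_ID_RE.fullmatch(value) is not None


def is_valid_name(value: str) -> bool:
    return NAME_RE.fullmatch(value) is not None


def suggest_name(display_name: str) -> str:
    """Hypothetical: lowercase the display name and join its words with dashes."""
    dashed = re.sub(r"[^a-z0-9]+", "-", display_name.lower())
    return dashed.strip("-")
```

With these rules, my-project-1 passes the name check, while a display name such as "Azure Prometheus data source (direct connection)" would first need converting, for example to azure-prometheus-data-source-direct-connection.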
sloctl
- Create a YAML resource definition for your Azure Monitor managed service for Prometheus using the provided template.
- Run sloctl apply to proceed.
- apiVersion: n9/v1alpha
  kind: Direct
  metadata:
    # Optional
    displayName: Azure Prometheus data source (direct connection)
    name: azure-prometheus-data-source
    project: my-project
  spec:
    # Enum: stable || beta. Currently, available in beta only
    releaseChannel: beta
    azurePrometheus:
      # Replace the clientId and clientSecret placeholders with your Microsoft Entra ID credentials
      clientId: "<YOUR_CLIENT_ID>" # secret
      clientSecret: "<YOUR_CLIENT_SECRET>" # secret
      tenantId: "<YOUR_AZURE_TENANT_ID>"
      url: "<YOUR_PROMETHEUS_SERVER_URL>"
    # Replay configuration
    historicalDataRetrieval:
      maxDuration:
        # Numeric, up to 30 days for Azure Monitor managed service for Prometheus
        value: 30
        unit: Day
      defaultDuration:
        value: 15
        unit: Day
    # The default query delay value for Azure Monitor managed service for Prometheus is 0 seconds
    queryDelay:
      value: 0
      unit: Second
Field | Type | Description |
---|---|---|
queryDelay.unit mandatory | enum | Specifies the unit for the query delay. Possible values: Second | Minute. • Check query delay documentation for default unit of query delay for each source. |
queryDelay.value mandatory | numeric | Specifies the value for the query delay. • Must be a number less than 1440 minutes (24 hours). • Check query delay documentation for default unit of query delay for each source. |
releaseChannel mandatory | enum | Specifies the release channel. Accepted values: beta | stable. |
Source-specific fields | ||
azurePrometheus.clientId mandatory | string, secret | Your Microsoft Entra ID client ID. |
azurePrometheus.clientSecret mandatory | string, secret | Your Microsoft Entra ID client secret. |
azurePrometheus.tenantId mandatory | string | The identifier of your Microsoft Entra tenant. |
azurePrometheus.url mandatory | string | Base URL of the Prometheus server. See the URL section above for more details. |
Replay-related fields | ||
historicalDataRetrieval optional | n/a | Optional structure for Replay configuration. Use only with supported sources. • If omitted, Nobl9 uses the default values of value: 0 and unit: Day for maxDuration and defaultDuration. |
maxDuration.value optional | numeric | Specifies the maximum duration for historical data retrieval. Must be an integer ≥ 0. See Replay documentation for values of max duration per data source. |
maxDuration.unit optional | enum | Specifies the unit for the maximum duration of historical data retrieval. Accepted values: Minute | Hour | Day. |
defaultDuration.value optional | numeric | Specifies the default duration for historical data retrieval. Must be an integer ≥ 0 and ≤ maxDuration. |
defaultDuration.unit optional | enum | Specifies the unit for the default duration of historical data retrieval. Accepted values: Minute | Hour | Day . |
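The numeric constraints from the table can be checked before running sloctl apply. The following is a minimal sketch that encodes only the limits stated on this page (30 days maximum for Replay, query delay below 1440 minutes, defaultDuration not exceeding maxDuration); check_settings and the (value, unit) tuple shape are illustrative assumptions, and sloctl performs its own validation regardless.

```python
from typing import List, Tuple

Duration = Tuple[int, str]  # (value, unit), mirroring the value/unit pairs in the YAML

_SECONDS_PER_UNIT = {"Second": 1, "Minute": 60, "Hour": 3600, "Day": 86400}
MAX_REPLAY_SECONDS = 30 * 86400      # 30 days, the maximum stated for this source
MAX_QUERY_DELAY_SECONDS = 1440 * 60  # 1440 minutes (24 hours)


def to_seconds(duration: Duration) -> int:
    value, unit = duration
    return value * _SECONDS_PER_UNIT[unit]


def check_settings(max_duration: Duration, default_duration: Duration,
                   query_delay: Duration) -> List[str]:
    """Return a list of human-readable problems; an empty list means no issues found."""
    problems = []
    if max_duration[0] < 0 or default_duration[0] < 0:
        problems.append("durations must be integers >= 0")
    if to_seconds(max_duration) > MAX_REPLAY_SECONDS:
        problems.append("maxDuration exceeds 30 days")
    if to_seconds(default_duration) > to_seconds(max_duration):
        problems.append("defaultDuration exceeds maxDuration")
    if query_delay[1] not in ("Second", "Minute"):
        problems.append("queryDelay.unit must be Second or Minute")
    elif not 0 <= to_seconds(query_delay) < MAX_QUERY_DELAY_SECONDS:
        problems.append("queryDelay must be less than 1440 minutes")
    return problems
```

The values from the template (30 days, 15 days, and a 0-second delay) produce an empty list, whereas a default duration of 31 days against a 30-day maximum is reported.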
Agent connection method
Use the agent method to run an agent alongside your Prometheus server. Once connected, the agent will periodically connect to Nobl9 using an outbound connection.
Nobl9 Web
To connect Azure Monitor managed service for Prometheus, do the following:
- Navigate to Integrations > Sources.
- Click .
- Click the required Source button.
- Choose Agent.
- Enter the URL of your required base Prometheus server.
- Enter your Azure Tenant ID.
  It is an 8-4-4-4-12-character code containing digits 0-9 and letters Aa-Ff.
- Select a Project.
  Specifying a project is helpful when multiple users are spread across multiple teams or projects. When the Project field is left blank, Nobl9 uses the default project.
- Enter a Display Name.
  You can enter a user-friendly name with spaces in this field.
- Enter a Name.
  The name is mandatory and can only contain lowercase alphanumeric characters and dashes (for example, my-project-1). Nobl9 duplicates the display name here, transforming it into the supported format, but you can edit the result.
- Enter a Description.
  Here you can add details such as who is responsible for the integration (team/owner) and the purpose of creating it.
- Specify the Query delay to set a customized delay for queries when pulling the data from the data source.
  - The default value in Azure Monitor managed service for Prometheus integration for Query delay is 0 minutes.
  Info: Changing the Query delay may affect your SLI data. For more details, check the Query delay documentation.
- Enter a Maximum Period for Historical Data Retrieval.
  - This value defines how far back in the past your data will be retrieved when replaying your SLO based on this data source.
  - The maximum period value depends on the data source. Find the maximum value for your data source.
  - A greater period can extend the loading time when creating an SLO.
  - The value must be a positive integer.
- Enter a Default Period for Historical Data Retrieval.
  - It is used by SLOs connected to this data source.
  - The value must be a positive integer or 0.
  - By default, this value is set to 0. When you set it to >0, you will create SLOs with Replay.
- Click Add Data Source
sloctl
- Create a YAML resource definition for your Azure Monitor managed service for Prometheus using the provided template.
- Run sloctl apply to proceed.
- apiVersion: n9/v1alpha
  kind: Agent
  metadata:
    name: azure-prometheus-data-source
    # Optional
    displayName: Azure Prometheus data source (agent connection)
    project: my-project
  spec:
    # Optional
    description: My sample Azure Prometheus data source (agent connection)
    # Enum: stable || beta. Currently, available in beta only
    releaseChannel: beta
    azurePrometheus:
      url: "<YOUR_PROMETHEUS_SERVER_URL>"
      tenantId: "<YOUR_AZURE_TENANT_ID>"
    # Replay configuration
    historicalDataRetrieval:
      maxDuration:
        # Numeric, up to 30 days for Azure Monitor managed service for Prometheus
        value: 30
        unit: Day
      defaultDuration:
        value: 15
        unit: Day
    # The default query delay value for Azure Monitor managed service for Prometheus is 0 seconds
    queryDelay:
      value: 0
      unit: Second
Field | Type | Description |
---|---|---|
queryDelay.unit mandatory | enum | Specifies the unit for the query delay. Possible values: Second | Minute. • Check query delay documentation for default unit of query delay for each source. |
queryDelay.value mandatory | numeric | Specifies the value for the query delay. • Must be a number less than 1440 minutes (24 hours). • Check query delay documentation for default unit of query delay for each source. |
releaseChannel mandatory | enum | Specifies the release channel. Accepted values: beta | stable. |
Source-specific fields | ||
azurePrometheus.url mandatory | string | Base URL of the Prometheus server. See the URL section above for more details. |
azurePrometheus.tenantId mandatory | string | The identifier of your Microsoft Entra tenant. |
Replay-related fields | ||
historicalDataRetrieval optional | n/a | Optional structure for Replay configuration. Use only with supported sources. • If omitted, Nobl9 uses the default values of value: 0 and unit: Day for maxDuration and defaultDuration. |
maxDuration.value optional | numeric | Specifies the maximum duration for historical data retrieval. Must be an integer ≥ 0. See Replay documentation for values of max duration per data source. |
maxDuration.unit optional | enum | Specifies the unit for the maximum duration of historical data retrieval. Accepted values: Minute | Hour | Day. |
defaultDuration.value optional | numeric | Specifies the default duration for historical data retrieval. Must be an integer ≥ 0 and ≤ maxDuration. |
defaultDuration.unit optional | enum | Specifies the unit for the default duration of historical data retrieval. Accepted values: Minute | Hour | Day . |
Agent deployment
When you add a data source, Nobl9 automatically generates a Kubernetes configuration and a Docker command line for you to use to deploy the agent. Both of these are available on the Nobl9 Web, under the Agent Configuration section.
Swap in your credentials (for example, replace <AZURE_MONITOR_CLIENT_ID> and <AZURE_MONITOR_CLIENT_SECRET> with your client ID and client secret).
- Kubernetes
- Docker
To deploy the created agent to a Kubernetes cluster, do the following:
- Create a YAML config file using the provided template.
- Run kubectl apply with this file to proceed.
The agent lets Nobl9 import your service metrics.
# DISCLAIMER: This deployment description contains only the fields necessary for the purpose of this demo.
# It is not a ready-to-apply k8s deployment description.
apiVersion: v1
kind: Secret
metadata:
  name: azure-prometheus-data-source
  namespace: default
type: Opaque
stringData:
  azure_client_id: "<AZURE_MONITOR_CLIENT_ID>"
  azure_client_secret: "<AZURE_MONITOR_CLIENT_SECRET>"
  client_id: "<NOBL9_CLIENT_ID>"
  client_secret: "<NOBL9_CLIENT_SECRET>"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: azure-prometheus-data-source
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      nobl9-agent-name: "azure-prometheus-agent"
      nobl9-agent-project: "my-project"
      nobl9-agent-organization: "my-organization"
  template:
    metadata:
      # Pod labels must match spec.selector.matchLabels
      labels:
        nobl9-agent-name: "azure-prometheus-agent"
        nobl9-agent-project: "my-project"
        nobl9-agent-organization: "my-organization"
    spec:
      containers:
        - name: agent-container
          image: nobl9/agent:0.88.0-beta
          resources:
            requests:
              memory: "700Mi"
              cpu: "0.2"
          env:
            - name: N9_CLIENT_ID
              valueFrom:
                secretKeyRef:
                  key: client_id
                  name: azure-prometheus-data-source
            - name: N9_AZURE_MONITOR_CLIENT_ID
              valueFrom:
                secretKeyRef:
                  key: azure_client_id
                  name: azure-prometheus-data-source
            - name: N9_AZURE_MONITOR_CLIENT_SECRET
              valueFrom:
                secretKeyRef:
                  key: azure_client_secret
                  name: azure-prometheus-data-source
            - name: N9_INTAKE_URL
              value: "<YOUR_VALUE>"
            - name: N9_QUERYENGINE_URL
              value: "<YOUR_VALUE>"
            - name: N9_CLIENT_SECRET
              valueFrom:
                secretKeyRef:
                  key: client_secret
                  name: azure-prometheus-data-source
            - name: N9_AUTH_SERVER
              value: "<YOUR_VALUE>"
            - name: N9_OKTA_ORG_URL
              value: "<YOUR_VALUE>"
            - name: N9_METRICS_PORT
              value: "9090"
            - name: N9_NATS_URL
              # A wss:// URL
              value: "<YOUR_VALUE>"
Important notes:
- N9_METRICS_PORT is a variable specifying the port to which the /metrics and /health endpoints are exposed. 9090 is the default value and can be changed.
- If you don't want the metrics to be exposed, comment out or delete the N9_METRICS_PORT variable.
- N9_DIAGNOSTIC_QUERY_LOG_SAMPLE_INTERVAL_MINUTES sets the frequency of log emission for SLI Analyzer. The default value is 10. To deactivate log emission, set this value to 0.
To deploy the agent, run the provided Docker command, replacing the placeholder values with your actual values.
The agent lets Nobl9 import your service metrics.
# DISCLAIMER: This Docker command contains only the fields necessary for the purpose of this demo.
docker run -d --restart on-failure \
--name azure-prometheus-data-source \
-e N9_INTAKE_URL="<YOUR_VALUE>" \
-e N9_QUERYENGINE_URL="<YOUR_VALUE>" \
-e N9_OKTA_ORG_URL="<YOUR_VALUE>" \
-e N9_AUTH_SERVER="<YOUR_VALUE>" \
-e N9_CLIENT_SECRET="<YOUR_VALUE>" \
-e N9_METRICS_PORT=9090 \
-e N9_NATS_URL="<YOUR_VALUE>" \
-e N9_CLIENT_ID="<YOUR_VALUE>" \
-e N9_AZURE_MONITOR_CLIENT_ID="<AZURE_MONITOR_CLIENT_ID>" \
-e N9_AZURE_MONITOR_CLIENT_SECRET="<AZURE_MONITOR_CLIENT_SECRET>" \
nobl9/agent:0.88.0-beta
Important notes:
- N9_METRICS_PORT is a variable specifying the port to which the /metrics and /health endpoints are exposed. 9090 is the default value and can be changed.
- If you don't want the metrics to be exposed, comment out or delete the N9_METRICS_PORT variable.
- N9_DIAGNOSTIC_QUERY_LOG_SAMPLE_INTERVAL_MINUTES sets the frequency of log emission for SLI Analyzer. The default value is 10. To deactivate log emission, set this value to 0.
Creating SLOs with Azure Monitor managed service for Prometheus
Nobl9 integration with Azure Monitor managed service for Prometheus supports Prometheus metrics.
You can create SLOs based on Azure Monitor managed service for Prometheus on the Nobl9 Web, using the Nobl9 Terraform provider, or by applying a YAML definition with sloctl.
Nobl9 Web
Follow the instructions below to create your SLOs with Azure Monitor managed service for Prometheus on the Nobl9 Web:
- Navigate to Service Level Objectives.
- Click .
Step 1: Select the service the SLO will be associated with.
Step 2:
- Select your Azure Monitor managed service for Prometheus data source.
- Configure Replay: set the Period for historical data retrieval.
  It can be 0 or a positive integer up to 30.
- Specify Metric and enter the PromQL query:
  - Threshold metric
  - Ratio metric
The threshold metric evaluates a single time series against a threshold value you set.
- Enter the query. For example: sum(rate(prometheus_http_requests_total{code=~"^2.*"}[1h]))
With the ratio metric, you enter two time series for comparison. It also requires specifying the ratio metric type.
- Select the Data count method:
  - Non-incremental counts incoming data points one-by-one. As a result, the SLO chart is pike-shaped.
  - Incremental counts incoming data points incrementally, adding every next value to the previous values. It results in a constantly increasing SLO chart.
- Enter the queries:
  - Good query: sum(rate(prometheus_http_requests_total{code=~"^2.*"}[1h]))
  - Total query: sum(rate(prometheus_http_requests_total[1h]))
When choosing the query for the ratio SLI (countMetrics), keep in mind that the values resulting from that query for both good and total:
- Must be positive.
- While we recommend using integers, fractions are also acceptable.
- If using fractions, we recommend them to be larger than 1e-4 = 0.0001.
- Shouldn't be larger than 1e+20.
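The guidance on good and total values can be expressed as a quick sanity check on sample query results. This is a sketch of the bullet points above only; check_count_value is a hypothetical helper, and the returned strings are illustrative rather than messages produced by Nobl9.

```python
def check_count_value(value: float) -> str:
    """Classify a good or total query result against the guidance above."""
    if value <= 0:
        return "invalid: must be positive"
    if value > 1e20:
        return "warning: larger than 1e+20"
    # Fractions are acceptable, but very small ones are discouraged.
    if not float(value).is_integer() and value <= 1e-4:
        return "warning: fraction not larger than 1e-4"
    return "ok"
```

Running the check on a few sampled data points from both the good and the total query is usually enough to spot a query that returns negative or vanishingly small values.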
Step 3: Define a Time Window for your SLO.
- Rolling time windows are better for tracking the recent user experience of a service.
- Calendar-aligned windows are best suited for SLOs that are intended to map to business metrics measured on a calendar-aligned basis, such as every calendar month or every quarter.
Step 4: Specify the Error Budget Calculation Method and your Objective(s).
- Occurrences method counts good attempts against the count of total attempts.
- Time Slices method measures how many good minutes were achieved (when a system operates within defined boundaries) during a time window.
- You can define up to 12 objectives for an SLO.
See the use case example and the SLO calculations guide for more information on the error budget calculation methods.
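To make the difference between the two methods concrete, here is a deliberately simplified sketch. It assumes reliability is a plain ratio in both cases, and the function names are hypothetical; the SLO calculations guide describes the actual computation.

```python
from typing import Sequence


def occurrences_reliability(good: float, total: float) -> float:
    """Occurrences: good attempts counted against total attempts."""
    if total <= 0:
        raise ValueError("total must be positive")
    return good / total


def time_slices_reliability(minute_is_good: Sequence[bool]) -> float:
    """Time Slices: share of minutes in the window that were good."""
    if not minute_is_good:
        raise ValueError("the time window contains no minutes")
    return sum(minute_is_good) / len(minute_is_good)
```

With 950 good attempts out of 1000, the Occurrences ratio is 0.95, which would exactly meet a 0.95 target. With three good minutes out of four, the Time Slices ratio is 0.75, regardless of how many attempts fell into each minute.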
Step 5: Add the Display name, Name, and other settings for your SLO:
- Set notification on data, if this option is available for your data source.
  When activated, Nobl9 notifies you if your SLO hasn't received data or received incomplete data for more than 15 minutes.
- Add alert policies, labels, and links, if required.
  You can add up to 20 links per SLO.
Click Create SLO.
sloctl
Azure Monitor managed service for Prometheus is case-insensitive.
Refer to the YAML SLO reference for details.
- rawMetric
- countMetric good over total
- countMetric bad over total
Here's an example of Azure Monitor managed service for Prometheus Metrics using a rawMetric (the threshold metric):
# Metric type: threshold
# Budgeting method: Occurrences
# Time window type: Calendar
- apiVersion: n9/v1alpha
  kind: SLO
  metadata:
    name: my-threshold-slo
    # Optional
    displayName: My threshold SLO based on Azure Monitor managed service for Prometheus
    project: my-project
    # Optional
    labels:
      key-1:
        - value-1
        - value-2
      key-2:
        - value-1
        - value-2
    # Optional
    annotations:
      key-1: value-1
      key-2: value-1
  spec:
    # Optional
    description: My sample threshold SLO based on Azure Monitor managed service for Prometheus
    indicator:
      metricSource:
        name: azure-prometheus-data-source
        project: my-project
        kind: Direct
    # Enum: Occurrences || Timeslices
    budgetingMethod: Occurrences
    objectives:
      # Optional
      - displayName: My objective 1
        value: 200.0
        name: my-objective
        target: 0.95
        rawMetric:
          query:
            azurePrometheus:
              promql: |-
                sum((rate(container_cpu_usage_seconds_total{container!="POD",container!=""}[30m])
                - on (namespace,pod,container) group_left avg by (namespace,pod,container)(kube_pod_container_resource_requests{resource="cpu"}))
                * -1 >0)
        op: lte
        primary: true
    service: my-service
    timeWindows:
      - unit: Month
        count: 1
        isRolling: false
        calendar:
          startTime: 2022-12-01 00:00:00
          timeZone: UTC
    # Optional. Up to 20 alert policies per SLO
    alertPolicies:
      - my-alert-policy
    # Optional
    attachments:
      - url: https://my-url.com
        displayName: My URL
    # Optional, beta functionality
    anomalyConfig:
      noData:
        alertMethods:
          - name: my-alert-method
            project: my-project
Here's an example of Azure Monitor managed service for Prometheus Metrics using a countMetric (the good over total ratio):
# Metric type: good over total
# Budgeting method: Occurrences
# Time window type: Calendar
- apiVersion: n9/v1alpha
  kind: SLO
  metadata:
    name: my-ratio-slo-good-over-total
    # Optional
    displayName: My ratio SLO good over total
    project: my-project
    # Optional
    labels:
      key-1:
        - value-1
        - value-2
      key-2:
        - value-1
        - value-2
    # Optional
    annotations:
      key-1: value-1
      key-2: value-1
  spec:
    # Optional
    description: My sample good-over-total ratio SLO based on Azure Monitor managed service for Prometheus
    indicator:
      metricSource:
        name: azure-prometheus-data-source
        project: my-project
        kind: Direct
    # Enum: Occurrences || Timeslices
    budgetingMethod: Occurrences
    objectives:
      # Optional
      - displayName: My objective 1
        value: 1.0
        name: my-objective-1
        target: 0.95
        countMetrics:
          # Boolean
          incremental: true
          good:
            azurePrometheus:
              promql: sum(api_server_requests_total{code="2xx"})
          total:
            azurePrometheus:
              promql: sum(api_server_requests_total{})
        primary: true
    service: my-service
    timeWindows:
      - unit: Month
        count: 1
        isRolling: false
        calendar:
          startTime: 2022-12-01 00:00:00
          timeZone: UTC
    # Optional. Up to 20 alert policies per SLO
    alertPolicies:
      - my-alert-policy
    # Optional
    attachments:
      - url: https://my-url.com
        displayName: My URL
    # Optional, beta functionality
    anomalyConfig:
      noData:
        alertMethods:
          - name: my-alert-method
            project: my-project
Here's an example of Azure Monitor managed service for Prometheus Metrics using a countMetric (the bad over total ratio):
# Metric type: bad over total
# Budgeting method: Occurrences
# Time window type: Calendar
- apiVersion: n9/v1alpha
  kind: SLO
  metadata:
    name: my-ratio-slo-bad-over-total
    # Optional
    displayName: My ratio SLO bad over total
    project: my-project
    # Optional
    labels:
      key-1:
        - value-1
        - value-2
      key-2:
        - value-1
        - value-2
    # Optional
    annotations:
      key-1: value-1
      key-2: value-1
  spec:
    # Optional
    description: My sample bad-over-total ratio SLO based on Azure Monitor managed service for Prometheus
    indicator:
      metricSource:
        name: azure-prometheus-data-source
        project: my-project
        kind: Direct
    # Enum: Occurrences || Timeslices
    budgetingMethod: Occurrences
    objectives:
      # Optional
      - displayName: My objective 1
        value: 1.0
        name: my-objective-1
        target: 0.95
        countMetrics:
          # Boolean
          incremental: true
          # The bad query counts failed requests
          bad:
            azurePrometheus:
              promql: sum(api_server_requests_total{code="5xx"})
          total:
            azurePrometheus:
              promql: sum(api_server_requests_total{})
        primary: true
    service: my-service
    timeWindows:
      - unit: Month
        count: 1
        isRolling: false
        calendar:
          startTime: 2022-12-01 00:00:00
          timeZone: UTC
    # Optional. Up to 20 alert policies per SLO
    alertPolicies:
      - my-alert-policy
    # Optional
    attachments:
      - url: https://my-url.com
        displayName: My URL
    # Optional
    anomalyConfig:
      noData:
        alertMethods:
          - name: my-alert-method
            project: my-project
Querying the Azure Monitor managed service for Prometheus API
The Nobl9 agent leverages the Prometheus API parameters. It pulls data at a per-minute interval from the Prometheus server.
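For orientation, the sketch below assembles a Prometheus range query URL of the kind described above, using the standard query, start, end, step, and timeout parameters of the /api/v1/query_range endpoint. The one-minute window and 30-second timeout mirror the query parameters listed at the top of this page; the 60-second step and the build_query_range_url helper itself are assumptions for illustration, not the agent's actual implementation.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode


def build_query_range_url(
    base_url: str,
    promql: str,
    end: datetime,
    window: timedelta = timedelta(minutes=1),  # query interval from this page
    step_seconds: int = 60,                    # assumed resolution
    timeout_seconds: int = 30,                 # timeout from this page
) -> str:
    """Build a GET URL for the Prometheus range query API."""
    start = end - window
    params = urlencode(
        {
            "query": promql,
            "start": int(start.timestamp()),  # Unix timestamps in seconds
            "end": int(end.timestamp()),
            "step": step_seconds,
            "timeout": f"{timeout_seconds}s",
        }
    )
    return f"{base_url.rstrip('/')}/api/v1/query_range?{params}"
```

Calling the helper with the base URL only (without /api/v1) matches the URL guidance earlier on this page, because the API path is appended by the caller rather than stored in the data source configuration.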