Prometheus
Prometheus is an open-source software application used for event monitoring and alerting. It records real-time metrics in a time series database, collects them using an HTTP pull model, and provides a flexible query language and real-time alerting.
Prometheus parameters and supported features in Nobl9
- General support:
  - Release channel: Stable, Beta
  - Connection method: Agent
  - Replay and SLI Analyzer: Supported
  - Event logs: Not supported
  - Query checker: Not supported
  - Query parameters retrieval: Supported
  - Timestamp cache persistence: Supported
- Query parameters:
  - Query interval: 1 min
  - Query delay: 0
  - Jitter: 15 sec
  - Timeout: 30 sec
- Agent details and minimum required versions for supported features:
  - Environment variable: PROM_QUERY_DELAY
  - Plugin name: n9prometheus
  - Replay and SLI Analyzer: 0.65.0
  - Maximum historical data retrieval period: 30 days
  - Query parameters retrieval: 0.73.2
  - Timestamp cache persistence: 0.65.0
  - Custom HTTP headers: 0.83.0-beta
Authentication
Prometheus does not provide an authentication layer, so the Nobl9 agent only collects the URL for the Prometheus integration definition. Authentication is up to the user: operators are expected to run an authenticating reverse proxy in front of their services, such as NGINX with basic auth or an OAuth2 proxy.
URL
The Prometheus agent makes requests to the range queries API endpoint (see Range Queries in the Prometheus documentation) in the form /api/v1/query_range. For example:
GET /api/v1/query_range
POST /api/v1/query_range
Hence, do not include the above API path in the URL. Specify only the base URL of the Prometheus server. For example, if your Prometheus server is available under <http://prometheus.example.com> and you access the API via <http://prometheus.example.com/api/v1>, use only <http://prometheus.example.com>.
Other APIs or web UIs have similar path endings, which should also be omitted, for example, the /graph part of the path.
The Prometheus integration does not integrate directly with data that services expose in the Prometheus format (see Prometheus Format in the Prometheus documentation), usually under the /metrics path. Do not set the URL to metrics exposed directly from such a service.
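The URL rule above can be sketched as a small helper that trims known API or UI path endings before you paste a URL into Nobl9. This is a minimal illustration only: the function name `prometheus_base_url` and the suffix list are hypothetical and not part of Nobl9 or Prometheus.

```python
from urllib.parse import urlsplit, urlunsplit

# Path endings that belong to the Prometheus API or web UI rather than the base URL.
# Illustrative list (an assumption of this sketch), not an exhaustive one.
KNOWN_SUFFIXES = ("/api/v1/query_range", "/api/v1", "/graph")


def prometheus_base_url(url):
    """Return the URL with a known API or UI path ending removed (hypothetical helper)."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/")
    for suffix in KNOWN_SUFFIXES:
        if path.endswith(suffix):
            path = path[: -len(suffix)]
            break
    # Drop any query string or fragment; only scheme, host, and base path remain.
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


print(prometheus_base_url("http://prometheus.example.com/api/v1"))
print(prometheus_base_url("http://prometheus.example.com/graph"))
```

Both calls print the base URL only, which is the value the data source wizard expects. A base URL that legitimately contains a path prefix is left unchanged.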
Basic authentication
Since Prometheus does not provide an authentication layer, the authentication method is up to the users. Normally, Prometheus users are expected to run an authenticating reverse proxy in front of their services, such as NGINX with basic_auth.
If that's the method you use, Nobl9 agent version 0.40.0 or higher allows you to send an additional Authorization request header with the basic_auth credentials. Refer to the section below for more details.
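For orientation, the following sketch shows what a standard HTTP Basic `Authorization` header looks like, which is what an authenticating reverse proxy checks. It illustrates the general HTTP scheme, not the agent's internal code; the function name and the sample credentials are hypothetical.

```python
import base64


def basic_auth_header(username, password):
    """Build a standard HTTP Basic Authorization header (hypothetical helper)."""
    # The scheme encodes "username:password" with Base64.
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Sample credentials for illustration only.
print(basic_auth_header("user", "pass"))
```

Base64 is an encoding, not encryption, so such a proxy should be reachable over HTTPS.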
Bearer token authentication
You can also authenticate the Nobl9 Prometheus agent using a bearer token. If you choose this method, you need to specify the AUTH_METHOD and BEARER_TOKEN variables when deploying your Prometheus agent in Docker or Kubernetes. Refer to the section below for more details.
Adding Prometheus as a data source
You can add the Prometheus data source using the agent connection method.
Nobl9 Web
Follow the instructions below to create your Prometheus agent connection:
- Navigate to Integrations > Sources.
- Click .
- Click the required Source button.
- Choose Agent.
- Select one of the following Release Channels:
  - The stable channel is fully tested by the Nobl9 team. It represents the final product; however, this channel does not contain all the new features of a beta release. Use it to avoid crashes and other limitations.
  - The beta channel is under active development. Here, you can check out new features and improvements without the risk of affecting any viable SLOs. Remember that features in this channel can change.
- Add the URL to connect to your data source (mandatory).
  Refer to the Authentication section for more details.
- Select a Project.
  Specifying a project is helpful when multiple users are spread across multiple teams or projects. When the Project field is left blank, Nobl9 uses the default project.
- Enter a Display Name.
  You can enter a user-friendly name with spaces in this field.
- Enter a Name.
  The name is mandatory and can only contain lowercase, alphanumeric characters, and dashes (for example, my-project-1). Nobl9 duplicates the display name here, transforming it into the supported format, but you can edit the result.
- Enter a Description.
  Here you can add details such as who is responsible for the integration (team/owner) and the purpose of creating it.
- Specify the Query delay to set a customized delay for queries when pulling the data from the data source.
  - The default value in Prometheus integration for Query delay is 0 seconds.
  Info: Changing the Query delay may affect your SLI data. For more details, check the Query delay documentation.
- Enter a Maximum Period for Historical Data Retrieval.
  - This value defines how far back in the past your data will be retrieved when replaying your SLO based on this data source.
  - The maximum period value depends on the data source. Find the maximum value for your data source.
  - A greater period can extend the loading time when creating an SLO.
  - The value must be a positive integer.
- Enter a Default Period for Historical Data Retrieval.
  - It is used by SLOs connected to this data source.
  - The value must be a positive integer or 0.
  - By default, this value is set to 0. When you set it to >0, you will create SLOs with Replay.
- Click Add Data Source.
sloctl
The YAML for setting up an agent connection to Prometheus looks like this:
apiVersion: n9/v1alpha
kind: Agent
metadata:
  name: prometheus-agent
  displayName: Prometheus agent
  project: default
spec:
  description: Agent settings for Prometheus datasource
  sourceOf:
    - Metrics
    - Services
  releaseChannel: stable
  queryDelay:
    unit: Minute
    value: 720
  prometheus:
    url: http://prometheus.example.com
  historicalDataRetrieval:
    maxDuration:
      value: 30
      unit: Day
    defaultDuration:
      value: 0
      unit: Day
Field | Type | Description |
---|---|---|
queryDelay.unit mandatory | enum | Specifies the unit for the query delay. Possible values: Second, Minute. • Check query delay documentation for default unit of query delay for each source. |
queryDelay.value mandatory | numeric | Specifies the value for the query delay. • Must be a number less than 1440 minutes (24 hours). • Check query delay documentation for default unit of query delay for each source. |
releaseChannel mandatory | enum | Specifies the release channel. Accepted values: beta, stable. |
Source-specific fields | ||
prometheus.url mandatory | string | Base URL to Prometheus server. See authentication section above for more details. |
Replay-related fields | ||
historicalDataRetrieval optional | n/a | Optional structure for Replay configuration. • Use only with supported sources. • If omitted, Nobl9 uses the default values of value: 0 and unit: Day for maxDuration and defaultDuration. |
maxDuration.value optional | numeric | Specifies the maximum duration for historical data retrieval. Must be integer ≥ 0. See Replay documentation for values of max duration per data source. |
maxDuration.unit optional | enum | Specifies the unit for the maximum duration of historical data retrieval. Accepted values: Minute, Hour, Day. |
defaultDuration.value optional | numeric | Specifies the default duration for historical data retrieval. Must be integer ≥ 0 and ≤ maxDuration. |
defaultDuration.unit optional | enum | Specifies the unit for the default duration of historical data retrieval. Accepted values: Minute, Hour, Day. |
You can deploy only one agent in one YAML file by using the sloctl apply command.
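The constraints from the field table can be checked before running sloctl apply. The sketch below is a local pre-check under the assumption that the table is complete; the function names are hypothetical, and Nobl9 performs its own validation that may differ.

```python
# Conversion factors to minutes for the duration units listed in the table.
MINUTES_PER_UNIT = {"Minute": 1, "Hour": 60, "Day": 24 * 60}


def query_delay_problems(value, unit):
    """Check queryDelay against the table: unit Second or Minute, under 1440 minutes."""
    if unit not in ("Second", "Minute"):
        return [f"queryDelay.unit must be Second or Minute, got {unit!r}"]
    minutes = value / 60 if unit == "Second" else value
    if minutes >= 24 * 60:
        return ["queryDelay must be less than 1440 minutes (24 hours)"]
    return []


def retrieval_problems(max_duration, default_duration):
    """Check historicalDataRetrieval; each argument is a (value, unit) pair."""
    problems = []
    named = (("maxDuration", max_duration), ("defaultDuration", default_duration))
    for name, (value, unit) in named:
        if unit not in MINUTES_PER_UNIT:
            problems.append(f"{name}.unit must be Minute, Hour, or Day")
        if not isinstance(value, int) or value < 0:
            problems.append(f"{name}.value must be an integer >= 0")
    if not problems:
        # Compare both durations in minutes so that mixed units work.
        max_minutes = max_duration[0] * MINUTES_PER_UNIT[max_duration[1]]
        default_minutes = default_duration[0] * MINUTES_PER_UNIT[default_duration[1]]
        if default_minutes > max_minutes:
            problems.append("defaultDuration must not exceed maxDuration")
    return problems


# Values taken from the YAML example above.
print(query_delay_problems(720, "Minute"))
print(retrieval_problems((30, "Day"), (0, "Day")))
```

An empty list means the values satisfy the documented constraints.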
Agent deployment
When you add the data source, Nobl9 automatically generates a Kubernetes configuration and a Docker command line for you to use to deploy the agent. Both of these are available in the Nobl9 Web, under the Agent Configuration section. Be sure to swap in your credentials.
- Kubernetes
- Docker
If you use Kubernetes, you can apply the supplied YAML config file to a Kubernetes cluster to deploy the agent. It will look something like this:
# DISCLAIMER: This deployment description contains only the fields necessary for the purpose of this demo.
# It is not a ready-to-apply k8s deployment description, and the client_id and client_secret are only exemplary values.
apiVersion: v1
kind: Secret
metadata:
  name: nobl9-agent-nobl9-dev-default-name
  namespace: default
type: Opaque
stringData:
  client_id: "unique_client_id"
  client_secret: "unique_client_secret"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nobl9-agent-nobl9-dev-default-name
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      nobl9-agent-name: "prometheus-agent"
      nobl9-agent-project: "default"
      nobl9-agent-organization: "nobl9-dev"
  template:
    metadata:
      labels:
        nobl9-agent-name: "prometheus-agent"
        nobl9-agent-project: "default"
        nobl9-agent-organization: "nobl9-dev"
    spec:
      containers:
        - name: agent-container
          image: nobl9/agent:0.82.2
          resources:
            requests:
              memory: "350Mi"
              cpu: "0.1"
          env:
            # Optional environment variable
            # Use if you want Cortex to take the tenant ID from a header X-Scope-OrgID on each request of the Nobl9 Agent
            # Replace the <X-Scope-OrgID> value with your X-Scope-OrgID
            - name: PROMETHEUS_X_SCOPE_ORG_ID
              value: <X-Scope-OrgID>
            - name: N9_CLIENT_ID
              valueFrom:
                secretKeyRef:
                  key: client_id
                  name: nobl9-agent-nobl9-dev-default-name
            - name: N9_CLIENT_SECRET
              valueFrom:
                secretKeyRef:
                  key: client_secret
                  name: nobl9-agent-nobl9-dev-default-name
            # The N9_METRICS_PORT is a variable specifying the port to which the /metrics and /health endpoints are exposed.
            # The 9090 is the default value and can be changed.
            # If you don't want the metrics to be exposed, comment out or delete the N9_METRICS_PORT variable.
            - name: N9_METRICS_PORT
              value: "9090"
If you use Docker, you can run the Docker command to deploy the agent. It will look something like this:
# DISCLAIMER: This Docker command contains only the fields necessary for the purpose of this demo.
# It is not a ready-to-apply command, and you will need to replace the placeholder values with your own values.
# The N9_METRICS_PORT is a variable specifying the port to which the /metrics and /health endpoints are exposed.
# The 9090 is the default value and can be changed.
# If you don't want the metrics to be exposed, delete the N9_METRICS_PORT line.
docker run -d --restart on-failure \
  --name agent-prometheus \
  -e N9_CLIENT_ID="unique_client_id" \
  -e N9_CLIENT_SECRET="unique_client_secret" \
  -e N9_METRICS_PORT=9090 \
  nobl9/agent:0.82.2
Basic authentication
To activate basic_auth for the agent, you need to pass the following optional environment variables to the agent:
- AUTH_METHOD: basic_auth is a fixed value, but it must be passed to let the agent know that basic_auth will be used.
- USERNAME: REDACTED is the username for basic_auth.
- PASSWORD: REDACTED is the password for basic_auth.
- Kubernetes
- Docker
If you use Kubernetes, you can apply the supplied YAML config file to a Kubernetes cluster to deploy the agent using the basic_auth method. It will look something like this:
# DISCLAIMER: This deployment description contains only the fields necessary for the purpose of this demo.
# It is not a ready-to-apply k8s deployment description, and the client_id and client_secret are only exemplary values.
apiVersion: v1
kind: Secret
metadata:
  name: agent-prometheus
  namespace: default
type: Opaque
stringData:
  client_id: "REDACTED"
  client_secret: "REDACTED"
  basic_auth_username: "REDACTED"
  basic_auth_password: "REDACTED"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: agent-prometheus
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      nobl9-agent-name: "prometheus"
      nobl9-agent-project: "my-project"
      nobl9-agent-organization: "my-organization"
  template:
    metadata:
      labels:
        nobl9-agent-name: "prometheus"
        nobl9-agent-project: "my-project"
        nobl9-agent-organization: "my-organization"
    spec:
      containers:
        - name: agent-container
          image: nobl9/agent:0.82.2
          resources:
            requests:
              memory: "350Mi"
              cpu: "0.1"
          env:
            - name: N9_CLIENT_ID
              valueFrom:
                secretKeyRef:
                  key: client_id
                  name: agent-prometheus
            - name: N9_CLIENT_SECRET
              valueFrom:
                secretKeyRef:
                  key: client_secret
                  name: agent-prometheus
            - name: AUTH_METHOD
              value: "basic_auth"
            # The secret name must match the Secret defined above, which holds the basic_auth keys.
            - name: USERNAME
              valueFrom:
                secretKeyRef:
                  key: basic_auth_username
                  name: agent-prometheus
            - name: PASSWORD
              valueFrom:
                secretKeyRef:
                  key: basic_auth_password
                  name: agent-prometheus
            # The N9_METRICS_PORT is a variable specifying the port to which the /metrics and /health endpoints are exposed.
            # The 9090 is the default value and can be changed.
            # If you don't want the metrics to be exposed, comment out or delete the N9_METRICS_PORT variable.
            - name: N9_METRICS_PORT
              value: "9090"
If you use Docker, you can run the Docker command to deploy the agent with the basic_auth method. It will look something like this:
# DISCLAIMER: This Docker command contains only the fields necessary for the purpose of this demo.
# It is not a ready-to-apply command, and you will need to replace the placeholder values with your own values.
# The N9_METRICS_PORT is a variable specifying the port to which the /metrics and /health endpoints are exposed.
# The 9090 is the default value and can be changed.
# If you don't want the metrics to be exposed, delete the N9_METRICS_PORT line.
docker run -d --restart on-failure \
  --name nobl9-agent-nobl9-dev-stable-prometheus \
  -e N9_CLIENT_SECRET="REDACTED" \
  -e N9_CLIENT_ID="REDACTED" \
  -e N9_METRICS_PORT=9090 \
  -e AUTH_METHOD="basic_auth" \
  -e USERNAME="REDACTED" \
  -e PASSWORD="REDACTED" \
  nobl9/agent:0.82.2
Bearer token authentication
- Kubernetes
- Docker
If you use Kubernetes, you can apply the supplied YAML config file to a Kubernetes cluster to deploy the agent using the bearer_token method. It will look something like this:
# DISCLAIMER: This deployment description contains only the fields necessary for the purpose of this demo.
# It is not a ready-to-apply k8s deployment description, and the client_id and client_secret are only exemplary values.
apiVersion: v1
kind: Secret
metadata:
  name: agent-prometheus
  namespace: default
type: Opaque
stringData:
  client_id: "REDACTED"
  client_secret: "REDACTED"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: agent-prometheus
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      nobl9-agent-name: "prometheus"
      nobl9-agent-project: "my-project"
      nobl9-agent-organization: "my-organization"
  template:
    metadata:
      labels:
        nobl9-agent-name: "prometheus"
        nobl9-agent-project: "my-project"
        nobl9-agent-organization: "my-organization"
    spec:
      containers:
        - name: agent-container
          image: nobl9/agent:0.82.2
          resources:
            requests:
              memory: "350Mi"
              cpu: "0.1"
          env:
            - name: N9_CLIENT_ID
              valueFrom:
                secretKeyRef:
                  key: client_id
                  name: agent-prometheus
            - name: N9_CLIENT_SECRET
              valueFrom:
                secretKeyRef:
                  key: client_secret
                  name: agent-prometheus
            - name: AUTH_METHOD
              value: "bearer_token"
            - name: BEARER_TOKEN
              value: "/path/to/file"
            # The N9_METRICS_PORT is a variable specifying the port to which the /metrics and /health endpoints are exposed.
            # The 9090 is the default value and can be changed.
            # If you don't want the metrics to be exposed, comment out or delete the N9_METRICS_PORT variable.
            - name: N9_METRICS_PORT
              value: "9090"
If you use Docker, you can run the Docker command to deploy the agent with the bearer_token method. It will look something like this:
# DISCLAIMER: This Docker command contains only the fields necessary for the purpose of this demo.
# It is not a ready-to-apply command, and you will need to replace the placeholder values with your own values.
# The N9_METRICS_PORT is a variable specifying the port to which the /metrics and /health endpoints are exposed.
# The 9090 is the default value and can be changed.
# If you don't want the metrics to be exposed, delete the N9_METRICS_PORT line.
docker run -d --restart on-failure \
  --name agent-prometheus \
  -e N9_CLIENT_SECRET="REDACTED" \
  -e N9_CLIENT_ID="REDACTED" \
  -e N9_METRICS_PORT=9090 \
  -e AUTH_METHOD="bearer_token" \
  -e BEARER_TOKEN="path-to-file-with-token" \
  nobl9/agent:0.82.2
Creating SLOs with Prometheus
Nobl9 Web
Follow the instructions below to create your SLOs with Prometheus on the Nobl9 Web:
- Navigate to Service Level Objectives.
- Click .
Step 1: Select the service the SLO will be associated with.
Step 2:
- Select your Prometheus data source.
- Specify Metric and enter the query. You can choose a threshold metric or a ratio metric.
Threshold metric
The threshold metric evaluates a single time series against a threshold value you set.
3. Enter the query. For example: myapp_server_requestMsec{host="*",job="nginx"}
Ratio metric
With the ratio metric, you enter two time series for comparison. It also requires specifying the ratio metric type.
3. Select the Data count method:
- Non-incremental counts incoming data points one-by-one. As a result, the SLO chart is pike-shaped.
- Incremental counts incoming data points incrementally, adding every next value to the previous values. It results in a constantly increasing SLO chart.
4. Enter the queries:
- Good query: sum(production_http_response_time_seconds_hist_bucket{method=~"GET|POST",status=~"2..|3..",le="1"})
- Total query: sum(production_http_response_time_seconds_hist_bucket{method=~"GET|POST",le="+Inf"})
For a ratio metric (countMetrics), keep in mind that the values resulting from the queries for both good and total:
- Must be positive.
- While we recommend using integers, fractions are also acceptable.
- If using fractions, we recommend them to be larger than 1e-4 = 0.0001.
- Shouldn't be larger than 1e+20.
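The value guidelines for good and total queries can be expressed as a quick sanity check on sample query results. This is a minimal sketch with a hypothetical function name; it reads "must be positive" literally, and how Nobl9 itself treats a value of zero is not stated in this document.

```python
def count_value_notes(value):
    """Return notes for one value produced by a good or total query (hypothetical helper)."""
    # Literal reading of "must be positive"; zero handling is an assumption of this sketch.
    if value <= 0:
        return ["must be positive"]
    notes = []
    if value > 1e20:
        notes.append("should not be larger than 1e+20")
    # Fractions are acceptable, but very small ones are discouraged.
    if not float(value).is_integer() and value <= 1e-4:
        notes.append("fractions are recommended to be larger than 1e-4")
    return notes


for sample in (10, 0.5, 0.00005, -1, 1e21):
    print(sample, count_value_notes(sample))
```

An empty list means the value is within the recommended range.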
Step 3: define a Time Window for your SLO.
- Rolling time windows are better for tracking the recent user experience of a service.
- Calendar-aligned windows are best suited for SLOs that are intended to map to business metrics measured on a calendar-aligned basis, such as every calendar month or every quarter.
Step 4: specify the Error Budget Calculation Method and your Objective(s).
- Occurrences method counts good attempts against the count of total attempts.
- Time Slices method measures how many good minutes were achieved (when a system operates within defined boundaries) during a time window.
- You can define up to 12 objectives for an SLO.
See the use case example and the SLO calculations guide for more information on the error budget calculation methods.
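The difference between the two methods can be illustrated with per-minute good and total counts. This is a simplified sketch, not Nobl9's actual calculation: the function names are hypothetical, and skipping minutes without data is an assumption made here for brevity.

```python
def occurrences_reliability(good, total):
    """Good attempts divided by total attempts over the whole window."""
    return sum(good) / sum(total)


def time_slices_reliability(good, total, time_slice_target):
    """Share of good minutes: a minute is good when its ratio meets the target."""
    # Assumption of this sketch: minutes without any data are skipped.
    minutes = [(g, t) for g, t in zip(good, total) if t > 0]
    good_minutes = sum(1 for g, t in minutes if g / t >= time_slice_target)
    return good_minutes / len(minutes)


# Four one-minute samples; the second minute has many failed requests.
good = [98, 50, 100, 99]
total = [100, 100, 100, 100]

print(occurrences_reliability(good, total))        # 347 good out of 400 attempts
print(time_slices_reliability(good, total, 0.95))  # 3 good minutes out of 4
```

With the same data, Occurrences weighs every request equally, while Time Slices weighs every minute equally, so one bad minute costs a quarter of this short window.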
Step 5: add the Display name, Name, and other settings for your SLO:
- Set notification on data, if this option is available for your data source.
  When activated, Nobl9 notifies you if your SLO hasn't received data or received incomplete data for more than 15 minutes.
- Add alert policies, labels, and links, if required.
  You can add up to 20 links per SLO.
Click Create SLO.
sloctl
- rawMetric
- countMetric
Here's an example of Prometheus using a rawMetric (threshold metric):
apiVersion: n9/v1alpha
kind: SLO
metadata:
  displayName: My SLO
  name: my-slo
  project: my-project
spec:
  budgetingMethod: Occurrences
  description: ""
  indicator:
    metricSource:
      name: prometheus
  service: my-service
  objectives:
    - target: 0.8
      op: lte
      rawMetric:
        query:
          prometheus:
            promql: myapp_server_requestMsec{host="*",job="nginx"}
      displayName: average
      value: 200
    - target: 0.5
      op: lte
      rawMetric:
        query:
          prometheus:
            promql: myapp_server_requestMsec{host="*",job="nginx"}
      displayName: so-so
      value: 150
  timeWindows:
    - calendar:
        startTime: "2020-11-14 11:00:00"
        timeZone: Etc/UTC
      count: 1
      isRolling: false
      unit: Day
Here's an example of Prometheus using a countMetric (ratio metric):
apiVersion: n9/v1alpha
kind: SLO
metadata:
  displayName: My SLO
  name: my-slo
  project: my-project
spec:
  budgetingMethod: Timeslices
  description: ""
  indicator:
    metricSource:
      name: prometheus
  service: my-service
  objectives:
    - target: 0.75
      countMetrics:
        good:
          prometheus:
            promql: sum(production_http_response_time_seconds_hist_bucket{method=~"GET|POST",status=~"2..|3..",le="1"})
        incremental: true
        total:
          prometheus:
            promql: sum(production_http_response_time_seconds_hist_bucket{method=~"GET|POST",le="+Inf"})
      name: my-objective
      timeSliceTarget: 0.75
      value: 1
  timeWindows:
    - calendar:
        startTime: "2020-11-14 11:00:00"
        timeZone: Etc/UTC
      count: 1
      isRolling: false
      unit: Day
The specification for a metric from Prometheus always has one mandatory field:
- promql: a Prometheus query written in PromQL (Prometheus Query Language; see PromQL in the Prometheus documentation) that lets the user select and aggregate time series data in real time.
Querying the Prometheus server
The Nobl9 agent leverages the Prometheus API parameters. It pulls data at a per-minute interval from the Prometheus server.
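To make the request shape concrete, the sketch below builds a range query URL with the standard query_range parameters (query, start, end, step) from the Prometheus HTTP API. The function name, the one-minute window, and the 15s step are assumptions for illustration; the exact parameters the agent sends are not documented here.

```python
from urllib.parse import urlencode


def query_range_url(base_url, promql, start, end, step="15s"):
    """Build a Prometheus range query URL from a base URL (hypothetical helper)."""
    params = urlencode({"query": promql, "start": start, "end": end, "step": step})
    # The API path is appended to the base URL, which is why the base URL must not contain it.
    return f"{base_url.rstrip('/')}/api/v1/query_range?{params}"


# A one-minute window expressed as Unix timestamps (illustrative values).
print(query_range_url("http://prometheus.example.com", "up", 1700000000, 1700000060))
```

Opening a URL like this in a browser or with an HTTP client is a quick way to confirm that the base URL and the query return data before creating an SLO.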
Cortex support with Nobl9 Prometheus agent
Cortex is a database based on Prometheus with a compatible API (see the Cortex documentation). Therefore, it is possible to use Cortex with the Nobl9 Prometheus agent.
Cortex cluster setup is out of the scope of this document and is described in the Cortex documentation. Cortex deployment can be simplified with the official Helm chart.
As described in Cortex Architecture in the Cortex documentation, the Prometheus API is exposed by NGINX under the default address <http://cortex-nginx/prometheus>. This address can be used as the Prometheus URL in the agent configuration panel. The default Prometheus endpoint can be changed according to the API documentation in the Cortex documentation. In that case, the agent needs to access the /api/v1/query_range endpoint.
If you want the Nobl9 agent to support a multi-tenancy deployment mode in Cortex, use the following environment variable while deploying Nobl9 agent in Kubernetes (see section above):
env:
- name: PROMETHEUS_X_SCOPE_ORG_ID
value: <X-Scope-OrgID>
or in your Docker deployment:
docker run -d --restart on-failure \
--name nobl9-agent-nobl9-dev-stable-prometheus \
-e PROMETHEUS_X_SCOPE_ORG_ID="<X-Scope-OrgID>"
Grafana Cloud support with Nobl9 Prometheus agent
Grafana Cloud is an observability platform that leverages Prometheus by directly interacting with the Prometheus HTTP API (see the Prometheus documentation). Therefore, it is possible to use Grafana Cloud with the Nobl9 Prometheus agent.
To use Grafana Cloud with Prometheus, you must authenticate your Prometheus agent with basic_auth. Refer to the section above for more details.
As described in Analyzing metrics usage with the Prometheus API in the Grafana Cloud documentation, the Prometheus API is exposed through the /api/prom/api/v1/query_range endpoint, which is accessed by the Nobl9 agent.
To use Grafana Cloud with Nobl9, append /api/prom/ to the end of the URL when you configure your Grafana source in the Data source wizard for a regular Prometheus data integration. Thus, instead of http://HOST/, you need to enter http://HOST/api/prom/ in the Data source wizard.
For more details, check Grafana Cloud documentation.
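The reason for the /api/prom/ suffix can be seen by joining the configured URL with the path the agent requests. This is a small sketch with a hypothetical function name; HOST is the same placeholder used in the text above.

```python
def range_query_endpoint(configured_url):
    """Join the configured data source URL with the range query path (hypothetical helper)."""
    return configured_url.rstrip("/") + "/api/v1/query_range"


# With the suffix, the result matches the endpoint Grafana Cloud exposes.
print(range_query_endpoint("http://HOST/api/prom/"))
# Without it, the request misses the /api/prom prefix.
print(range_query_endpoint("http://HOST/"))
```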
Thanos direct with Nobl9 Prometheus agent
Thanos is a high-availability Prometheus setup and can be used with the Nobl9 Prometheus agent.
Thanos cluster setup is out of the scope of this document and is described in the Thanos Components documentation.
Thanos exposes the Prometheus API using Querier. The Querier address must be used as the Prometheus URL in the Nobl9 agent configuration.