Prometheus

Prometheus is an open-source software application used for event monitoring and alerting. It records real-time metrics in a time series database built using an HTTP pull model, with a flexible query language and real-time alerting.

Scope of support

Currently, the Prometheus integration supports the agent configuration only.

Authentication

Prometheus does not provide an authentication layer, so the Nobl9 agent collects only the URL for the Prometheus integration definition. Authentication is up to the user: operators are expected to run an authenticating reverse proxy in front of their services, such as NGINX with basic auth or an OAuth2 proxy.

URL

The Nobl9 Prometheus agent sends requests to the range queries API endpoint, /api/v1/query_range (see Range Queries in the Prometheus documentation). For example:

GET /api/v1/query_range
POST /api/v1/query_range

Hence, do not include the above API path in the URL. Specify only the base URL for the Prometheus server. For example, if your Prometheus server is available under <http://prometheus.example.com> and you access the API via <http://prometheus.example.com/api/v1>, use only <http://prometheus.example.com>.

Other APIs or Web UIs have similar path endings, which should also be omitted, for example, the /graph part of the path.

The Prometheus integration does not integrate directly with data that services expose in the Prometheus format (see Prometheus Format in the Prometheus documentation), usually under the /metrics path. Do not set the URL to the metrics endpoint of such a service.
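The URL rules above can be sketched as a small helper that reduces a pasted address to the base URL. This is only an illustration: the function name `base_url` and the list of suffixes are assumptions made for this example, not part of Nobl9 or Prometheus.

```python
from urllib.parse import urlsplit, urlunsplit

# Path endings that belong to a Prometheus API or the web UI, not to the base URL.
# The list is illustrative and not exhaustive.
_SUFFIXES = ("/api/v1/query_range", "/api/v1", "/graph")


def base_url(url: str) -> str:
    """Strip a known API or UI path ending and any trailing slash from url."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/")
    for suffix in _SUFFIXES:
        if path.endswith(suffix):
            path = path[: -len(suffix)]
            break
    # The query string and fragment are dropped; they are not part of a base URL.
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


if __name__ == "__main__":
    print(base_url("http://prometheus.example.com/api/v1"))
```

A URL that already is a base URL passes through unchanged apart from a trailing slash.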

Basic authentication

Since Prometheus does not provide an authentication layer, the authentication method is up to the users. Normally, Prometheus users are expected to run an authenticating reverse proxy in front of their services, such as NGINX with basic_auth.

If that's the method you use, Nobl9 agent version 0.40.0 or higher lets you send an additional Authorization request header with the basic_auth credentials. Refer to the section below for more details.
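For reference, an Authorization header for HTTP Basic authentication has a standard shape (RFC 7617): the word Basic followed by the Base64 encoding of `username:password`. The sketch below builds such a header; the helper name `basic_auth_header` is hypothetical, and how the Nobl9 agent assembles the header internally is not described here.

```python
import base64


def basic_auth_header(username: str, password: str) -> dict[str, str]:
    """Build the Authorization header for HTTP Basic authentication (RFC 7617)."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {token}"}


if __name__ == "__main__":
    # Placeholder credentials for illustration only.
    print(basic_auth_header("user", "pass"))
```

Base64 is an encoding, not encryption, so such a header should only travel over HTTPS.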

Bearer token authentication

You can also authenticate the Nobl9 Prometheus agent using a bearer token. If you choose this method, you need to specify the AUTH_METHOD and BEARER_TOKEN variables when deploying your Prometheus agent in Docker or Kubernetes. Refer to the section below for more details.

Adding Prometheus as a data source

You can add the Prometheus data source using the agent connection method. Start with these steps:

  1. Navigate to Integrations > Sources.
  2. Click .
    The Data Source wizard opens.
  3. Select Prometheus.

Nobl9 Web

Follow the instructions below to create your Prometheus agent connection:

  1. Select one of the following Release Channels:
    • The stable channel is fully tested by the Nobl9 team. It represents the final product; however, this channel does not contain all the new features of a beta release. Use it to avoid crashes and other limitations.
    • The beta channel is under active development. Here, you can check out new features and improvements without the risk of affecting any viable SLOs. Remember that features in this channel may be subject to change.
  2. Add the URL to connect to your data source (mandatory).
    Refer to the Authentication section for more details.

  3. Select a Project.
    Specifying a project is helpful when multiple users are spread across multiple teams or projects. When the Project field is left blank, Nobl9 uses the default project.
  4. Enter a Display Name.
    You can enter a user-friendly name with spaces in this field.
  5. Enter a Name.
    The name is mandatory and can contain only lowercase alphanumeric characters and dashes (for example, my-project-1). Nobl9 copies the display name here, transforming it into the supported format, but you can edit the result.
  6. Enter a Description.
    Here you can add details such as who is responsible for the integration (team/owner) and the purpose of creating it.
  7. Specify the Query delay to set a customized delay for queries when pulling the data from the data source.
    • The default value in Prometheus integration for Query delay is 0 seconds.
    Info: Changing the Query delay may affect your SLI data. For more details, check the Query delay documentation.
  8. Enter a Maximum Period for Historical Data Retrieval.
    • This value defines how far back in the past your data will be retrieved.
    • The value for the maximum period of data retrieval depends on the data source. Check the Replay documentation for details.
    • A greater period can extend the loading time when creating an SLO.
    • The value must be a positive integer.
  9. Enter a Default Period for Historical Data Retrieval.
    • It is used by SLOs connected to this data source.
    • The value must be a positive integer or 0.
    • By default, this value is set to 0. When you set it to >0, you will create SLOs with Replay.
  10. Click Add Data Source.
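The rule for the Name field in the steps above (lowercase alphanumeric characters and dashes only) can be checked with a short regular expression. The sketch below is an illustration under assumptions: `is_valid_name` and `suggest_name` are hypothetical helpers, and the exact transformation Nobl9 applies to the display name may differ from the one shown.

```python
import re

# Lowercase letters, digits, and dashes only, as described for the Name field.
_NAME_RE = re.compile(r"[a-z0-9-]+")


def is_valid_name(name: str) -> bool:
    """Return True if name contains only lowercase alphanumerics and dashes."""
    return _NAME_RE.fullmatch(name) is not None


def suggest_name(display_name: str) -> str:
    """Derive a candidate name from a display name (assumed transformation)."""
    lowered = display_name.strip().lower()
    # Replace every run of unsupported characters with a single dash.
    slug = re.sub(r"[^a-z0-9]+", "-", lowered)
    return slug.strip("-")


if __name__ == "__main__":
    print(suggest_name("Prometheus Agent"), is_valid_name("my-project-1"))
```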

sloctl

The YAML for setting up an agent connection to Prometheus looks like this:

apiVersion: n9/v1alpha
kind: Agent
metadata:
  name: prometheus-agent
  displayName: Prometheus Agent
  project: default
spec:
  description: Agent settings for Prometheus datasource
  sourceOf:
    - Metrics
    - Services
  releaseChannel: beta
  queryDelay:
    unit: Minute
    value: 720
  prometheus:
    url: http://prometheus.example.com
  historicalDataRetrieval:
    maxDuration:
      value: 30
      unit: Day
    defaultDuration:
      value: 0
      unit: Day

| Field | Type | Description |
|---|---|---|
| queryDelay.unit (mandatory) | enum | Specifies the unit for the query delay. Possible values: Second or Minute. Check the query delay documentation for the default unit of query delay for each source. |
| queryDelay.value (mandatory) | numeric | Specifies the value for the query delay. Must be a number less than 1440 minutes (24 hours). Check the query delay documentation for the default unit of query delay for each source. |
| releaseChannel (mandatory) | enum | Specifies the release channel. Accepted values: beta or stable. |
| Source-specific fields | | |
| prometheus.url (mandatory) | string | Base URL to the Prometheus server. See the authentication section above for more details. |
| Replay-related fields | | |
| historicalDataRetrieval (optional) | n/a | Optional structure holding the Replay configuration. Use only with supported sources. If omitted, Nobl9 uses the default values of value: 0 and unit: Day for maxDuration and defaultDuration. |
| maxDuration.value (optional) | numeric | Specifies the maximum duration for historical data retrieval. Must be an integer ≥ 0. See the Replay documentation for values of max duration per data source. |
| maxDuration.unit (optional) | enum | Specifies the unit for the maximum duration of historical data retrieval. Accepted values: Minute, Hour, or Day. |
| defaultDuration.value (optional) | numeric | Specifies the default duration for historical data retrieval. Must be an integer ≥ 0 and ≤ maxDuration. |
| defaultDuration.unit (optional) | enum | Specifies the unit for the default duration of historical data retrieval. Accepted values: Minute, Hour, or Day. |

Warning: You can deploy only one agent per YAML file with the sloctl apply command.
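The field constraints listed above can be expressed as a pre-flight check on the spec before running sloctl apply. The following is a minimal sketch, assuming the spec has already been loaded into a Python dictionary; `validate_agent_spec` is a hypothetical helper, and Nobl9 performs its own validation that may be stricter. The lower bound for queryDelay is not stated above, so it is not checked.

```python
_DELAY_SECONDS = {"Second": 1, "Minute": 60}
_DURATION_MINUTES = {"Minute": 1, "Hour": 60, "Day": 1440}
_DEFAULT_DURATION = {"value": 0, "unit": "Day"}


def validate_agent_spec(spec: dict) -> list[str]:
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []

    delay = spec["queryDelay"]
    if delay["unit"] not in _DELAY_SECONDS:
        problems.append("queryDelay.unit must be Second or Minute")
    elif delay["value"] * _DELAY_SECONDS[delay["unit"]] >= 1440 * 60:
        problems.append("queryDelay must be less than 1440 minutes")

    if spec.get("releaseChannel") not in ("beta", "stable"):
        problems.append("releaseChannel must be beta or stable")

    replay = spec.get("historicalDataRetrieval")
    if replay is not None:
        minutes = {}
        for key in ("maxDuration", "defaultDuration"):
            duration = replay.get(key, _DEFAULT_DURATION)
            value, unit = duration["value"], duration["unit"]
            if unit not in _DURATION_MINUTES:
                problems.append(f"{key}.unit must be Minute, Hour, or Day")
            elif not isinstance(value, int) or value < 0:
                problems.append(f"{key}.value must be an integer >= 0")
            else:
                minutes[key] = value * _DURATION_MINUTES[unit]
        # Compare in a common unit so that 1 Day and 1440 Minute are equal.
        if len(minutes) == 2 and minutes["defaultDuration"] > minutes["maxDuration"]:
            problems.append("defaultDuration must not exceed maxDuration")

    return problems
```

Applied to the example agent definition (720 minutes of query delay, 30 days maximum, 0 days default), the function returns an empty list.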

Agent deployment

When you add the data source, Nobl9 automatically generates a Kubernetes configuration and a Docker command line for you to use to deploy the agent. Both of these are available in the Nobl9 Web, under the Agent Configuration section. Be sure to swap in your credentials.

If you use Kubernetes, you can apply the supplied YAML config file to a Kubernetes cluster to deploy the agent. It will look something like this:

# DISCLAIMER: This deployment description contains only the fields necessary for the purpose of this demo.
# It is not a ready-to-apply k8s deployment description, and the client_id and client_secret are only exemplary values.

apiVersion: v1
kind: Secret
metadata:
  name: nobl9-agent-nobl9-dev-default-name
  namespace: default
type: Opaque
stringData:
  client_id: "unique_client_id"
  client_secret: "unique_client_secret"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nobl9-agent-nobl9-dev-default-name
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      nobl9-agent-name: "prometheus-agent"
      nobl9-agent-project: "default"
      nobl9-agent-organization: "nobl9-dev"
  template:
    metadata:
      labels:
        nobl9-agent-name: "prometheus-agent"
        nobl9-agent-project: "default"
        nobl9-agent-organization: "nobl9-dev"
    spec:
      containers:
        - name: agent-container
          image: nobl9/agent:0.76.0
          resources:
            requests:
              memory: "350Mi"
              cpu: "0.1"
          env:
            # Optional environment variable
            # Use if you want Cortex to take the tenant ID from a header X-Scope-OrgID on each request of the Nobl9 Agent
            # Replace the <X-Scope-OrgID> value with your X-Scope-OrgID
            - name: PROMETHEUS_X_SCOPE_ORG_ID
              value: <X-Scope-OrgID>
            - name: N9_CLIENT_ID
              valueFrom:
                secretKeyRef:
                  key: client_id
                  name: nobl9-agent-nobl9-dev-default-name
            - name: N9_CLIENT_SECRET
              valueFrom:
                secretKeyRef:
                  key: client_secret
                  name: nobl9-agent-nobl9-dev-default-name
            # The N9_METRICS_PORT is a variable specifying the port to which the /metrics and /health endpoints are exposed.
            # The 9090 is the default value and can be changed.
            # If you don't want the metrics to be exposed, comment out or delete the N9_METRICS_PORT variable.
            - name: N9_METRICS_PORT
              value: "9090"

Basic authentication

To activate basic_auth for the agent, pass the following optional environment variables to it:

  • AUTH_METHOD: basic_auth - a fixed value that must be passed to tell the agent that basic_auth will be used.
  • USERNAME: REDACTED - the username for basic_auth.
  • PASSWORD: REDACTED - the password for basic_auth.

If you use Kubernetes, you can apply the supplied YAML config file to a Kubernetes cluster to deploy the agent using the basic_auth method. It will look something like this:

# DISCLAIMER: This deployment description contains only the fields necessary for the purpose of this demo.
# It is not a ready-to-apply k8s deployment description, and the client_id and client_secret are only exemplary values.

apiVersion: v1
kind: Secret
metadata:
  name: nobl9-agent-nobl9-dev-stable-prometheus
  namespace: default
type: Opaque
stringData:
  client_id: "REDACTED"
  client_secret: "REDACTED"
  basic_auth_username: "REDACTED"
  basic_auth_password: "REDACTED"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nobl9-agent-nobl9-dev-stable-prometheus
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      nobl9-agent-name: "prometheus"
      nobl9-agent-project: "prometheus"
      nobl9-agent-organization: "nobl9-dev-stable"
  template:
    metadata:
      labels:
        nobl9-agent-name: "prometheus"
        nobl9-agent-project: "prometheus"
        nobl9-agent-organization: "nobl9-dev-stable"
    spec:
      containers:
        - name: agent-container
          image: nobl9/agent:0.76.0
          resources:
            requests:
              memory: "350Mi"
              cpu: "0.1"
          env:
            - name: N9_CLIENT_ID
              valueFrom:
                secretKeyRef:
                  key: client_id
                  name: nobl9-agent-nobl9-dev-stable-prometheus
            - name: N9_CLIENT_SECRET
              valueFrom:
                secretKeyRef:
                  key: client_secret
                  name: nobl9-agent-nobl9-dev-stable-prometheus
            - name: AUTH_METHOD
              value: "basic_auth"
            # The secret name below must match the Secret that holds the basic_auth keys.
            - name: USERNAME
              valueFrom:
                secretKeyRef:
                  key: basic_auth_username
                  name: nobl9-agent-nobl9-dev-stable-prometheus
            - name: PASSWORD
              valueFrom:
                secretKeyRef:
                  key: basic_auth_password
                  name: nobl9-agent-nobl9-dev-stable-prometheus
            # The N9_METRICS_PORT is a variable specifying the port to which the /metrics and /health endpoints are exposed.
            # The 9090 is the default value and can be changed.
            # If you don't want the metrics to be exposed, comment out or delete the N9_METRICS_PORT variable.
            - name: N9_METRICS_PORT
              value: "9090"

Bearer token authentication

If you use Kubernetes, you can apply the supplied YAML config file to a Kubernetes cluster to deploy the agent using the bearer_token method. It will look something like this:

# DISCLAIMER: This deployment description contains only the fields necessary for the purpose of this demo.
# It is not a ready-to-apply k8s deployment description, and the client_id and client_secret are only exemplary values.

apiVersion: v1
kind: Secret
metadata:
  name: nobl9-agent-nobl9-dev-stable-prometheus
  namespace: default
type: Opaque
stringData:
  client_id: "REDACTED"
  client_secret: "REDACTED"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nobl9-agent-nobl9-dev-stable-prometheus
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      nobl9-agent-name: "prometheus"
      nobl9-agent-project: "prometheus"
      nobl9-agent-organization: "nobl9-dev-stable"
  template:
    metadata:
      labels:
        nobl9-agent-name: "prometheus"
        nobl9-agent-project: "prometheus"
        nobl9-agent-organization: "nobl9-dev-stable"
    spec:
      containers:
        - name: agent-container
          image: nobl9/agent:0.76.0
          resources:
            requests:
              memory: "350Mi"
              cpu: "0.1"
          env:
            - name: N9_CLIENT_ID
              valueFrom:
                secretKeyRef:
                  key: client_id
                  name: nobl9-agent-nobl9-dev-stable-prometheus
            - name: N9_CLIENT_SECRET
              valueFrom:
                secretKeyRef:
                  key: client_secret
                  name: nobl9-agent-nobl9-dev-stable-prometheus
            - name: AUTH_METHOD
              value: "bearer_token"
            - name: BEARER_TOKEN
              value: "/path/to/file"
            # The N9_METRICS_PORT is a variable specifying the port to which the /metrics and /health endpoints are exposed.
            # The 9090 is the default value and can be changed.
            # If you don't want the metrics to be exposed, comment out or delete the N9_METRICS_PORT variable.
            - name: N9_METRICS_PORT
              value: "9090"

Creating SLOs with Prometheus

Nobl9 Web

Follow the instructions below to create your SLOs with Prometheus in the Nobl9 Web:

  1. Navigate to Service Level Objectives.

  2. Click .
  3. In step 1 of the SLO wizard, select the service the SLO will be associated with.

  4. In step 2, select Prometheus as the data source for your SLO, then specify the Metric. You can choose either a Threshold metric, where a single time series is evaluated against a threshold, or a Ratio metric, which allows you to enter two time series to compare (for example, a count of good requests and total requests).

    1. Choose the Data Count Method for your ratio metric:
      • Non-incremental: counts incoming metric values one by one, so the resulting SLO graph is spike-shaped.
      • Incremental: counts the incoming metric values incrementally, adding every next value to the previous values. This results in a constantly increasing SLO graph.
  5. Enter a Query, or Good query and Total query for the metric you selected. The following are query examples:

    • Threshold metric for Prometheus:
      Query: myapp_server_requestMsec{host="*",job="nginx"}

    • Ratio metric for Prometheus:
      Good query: sum(production_http_response_time_seconds_hist_bucket{method=~"GET|POST",status=~"2..|3..",le="1"})

      Total query: sum(production_http_response_time_seconds_hist_bucket{method=~"GET|POST",le="+Inf"})

      SLI values for good and total
      When choosing the query for the ratio SLI (countMetrics), keep in mind that the values resulting from that query for both good and total:
      • Must be positive.
      • Should preferably be integers, although fractions are also acceptable.
        • If you use fractions, we recommend values larger than 1e-4 = 0.0001.
      • Shouldn't be larger than 1e+20.
  6. In step 3, define a Time Window for the SLO.

    • Rolling time windows are better for tracking the recent user experience of a service.

    • Calendar-aligned windows are best suited for SLOs that are intended to map to business metrics measured on a calendar-aligned basis, such as every calendar month or every quarter.

  7. In step 4, specify the Error Budget Calculation Method and your Objective(s).

    • The Occurrences method counts good attempts against the count of total attempts.
    • The Time Slices method measures how many good minutes were achieved (when a system operates within defined boundaries) during a time window.
    • You can define up to 12 objectives for an SLO. For more information, check the Composite SLOs Guide.

    See the use case example and the SLO calculations guide for more information on the error budget calculation methods.

  8. In step 5, add the Display name, Name, and other settings for your SLO:

    • Create a composite SLO.
    • Set notification on data, if this option is available for your data source.
      When activated, Nobl9 notifies you if your SLO hasn't received data or received incomplete data for more than 15 minutes.
    • Add alert policies, labels, and links, if required.
      You can add up to 20 links per SLO.
  9. Click Create SLO.
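The guidance on good and total values for ratio metrics in the steps above can be turned into a quick sanity check on sample query results. This is a sketch under assumptions: `check_ratio_value` is a hypothetical helper, and because the text does not say whether zero counts as positive, the sketch only rejects negative values.

```python
def check_ratio_value(value: float) -> list[str]:
    """Return notes about a single good or total value; empty means no concerns."""
    notes = []
    if value < 0:
        notes.append("must be positive")
    elif value > 1e20:
        notes.append("should not be larger than 1e+20")
    elif value != 0 and not float(value).is_integer() and value <= 1e-4:
        # Fractions are accepted, but values at or below 1e-4 are discouraged.
        notes.append("fractions are recommended to be larger than 1e-4")
    return notes


if __name__ == "__main__":
    for sample in (100, 0.5, 0.00005, -1, 1e21):
        print(sample, check_ratio_value(sample))
```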

sloctl

Here's an example of Prometheus using a rawMetric (threshold metric):

apiVersion: n9/v1alpha
kind: SLO
metadata:
  displayName: prometheus-calendar-occurrences-threshold
  name: prometheus-calendar-occurrences-threshold
  project: my-prometheus
spec:
  budgetingMethod: Occurrences
  description: ""
  indicator:
    metricSource:
      name: prometheus
  service: my-prometheus-slo
  objectives:
    - target: 0.8
      op: lte
      rawMetric:
        query:
          prometheus:
            promql: myapp_server_requestMsec{host="*",job="nginx"}
      displayName: average
      value: 200
    - target: 0.5
      op: lte
      rawMetric:
        query:
          prometheus:
            promql: myapp_server_requestMsec{host="*",job="nginx"}
      displayName: so-so
      value: 150
  timeWindows:
    - calendar:
        startTime: "2020-11-14 11:00:00"
        timeZone: Etc/UTC
      count: 1
      isRolling: false
      unit: Day

The specification for a metric from Prometheus always has one mandatory field:

  • promql - a Prometheus query written in PromQL (Prometheus Query Language, see the Prometheus documentation), which lets the user select and aggregate time series data in real time.
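To make the threshold example above concrete, the sketch below shows one way to read an objective with op: lte, value: 200, and target: 0.8 under the Occurrences method: a data point is good when it is less than or equal to 200, and the objective is met when at least 80% of the points are good. The function name, the sample latencies, and the set of operators other than lte are assumptions for illustration; Nobl9's actual calculation is described in its SLO calculations guide.

```python
import operator

# Only "lte" appears in the example above; the other operators are assumed.
_OPS = {
    "lte": operator.le,
    "lt": operator.lt,
    "gte": operator.ge,
    "gt": operator.gt,
}


def occurrences_reliability(points: list[float], op: str, value: float) -> float:
    """Fraction of data points that satisfy `point <op> value`."""
    if not points:
        raise ValueError("at least one data point is required")
    compare = _OPS[op]
    good = sum(1 for point in points if compare(point, value))
    return good / len(points)


if __name__ == "__main__":
    latencies_ms = [120, 180, 210, 150, 199]  # hypothetical query results
    reliability = occurrences_reliability(latencies_ms, "lte", 200)
    print(reliability, reliability >= 0.8)
```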

Querying the Prometheus server

The Nobl9 agent leverages the Prometheus API parameters. It pulls data at a per-minute interval from the Prometheus server.
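As an illustration of what a range query looks like, the sketch below builds a /api/v1/query_range URL with the query, start, end, and step parameters defined by the Prometheus HTTP API, using a 60-second step to mirror the per-minute interval. The helper name and the sample timestamps are hypothetical, and the exact parameters the Nobl9 agent sends are not specified here.

```python
from urllib.parse import urlencode


def query_range_url(base_url: str, promql: str, start: int, end: int,
                    step_seconds: int = 60) -> str:
    """Build a Prometheus range query URL from a base URL and a PromQL query."""
    params = urlencode({
        "query": promql,
        "start": start,  # Unix timestamp in seconds
        "end": end,      # Unix timestamp in seconds
        "step": step_seconds,
    })
    return f"{base_url.rstrip('/')}/api/v1/query_range?{params}"


if __name__ == "__main__":
    print(query_range_url("http://prometheus.example.com", "up",
                          1700000000, 1700000600))
```

The query string is percent-encoded, so PromQL selectors with braces and quotes are safe to pass in.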

Cortex support with Nobl9 Prometheus agent

Cortex (see the Cortex documentation) is a database based on Prometheus with a compatible API. Therefore, it is possible to use Cortex with the Nobl9 Prometheus agent.

Cortex cluster setup is out of the scope of this document and is described in the Cortex documentation. Cortex deployment can be simplified with the official Helm chart.

As described in Cortex Architecture in the Cortex documentation, the Prometheus API is exposed by NGINX under the default address <http://cortex-nginx/prometheus>. This address can be used as the Prometheus URL in the agent configuration panel. The default Prometheus endpoint can be changed according to the API documentation in the Cortex documentation. In that case, the agent still needs to be able to access the /api/v1/query_range endpoint.

If you want the Nobl9 agent to support a multi-tenancy deployment mode in Cortex, use the following environment variable while deploying Nobl9 agent in Kubernetes (see section above):

env:
  - name: PROMETHEUS_X_SCOPE_ORG_ID
    value: <X-Scope-OrgID>

or in your Docker deployment:

docker run -d --restart on-failure \
--name nobl9-agent-nobl9-dev-stable-prometheus \
-e PROMETHEUS_X_SCOPE_ORG_ID="<X-Scope-OrgID>"

Grafana Cloud support with Nobl9 Prometheus agent

Grafana Cloud is an observability platform that leverages Prometheus by directly interacting with the Prometheus HTTP API (see the Prometheus documentation). Therefore, it is possible to use Grafana Cloud with the Nobl9 Prometheus agent.

To use Grafana Cloud with Prometheus, you must authenticate your Prometheus agent with basic_auth. Refer to the section above for more details.

As described in Analyzing metrics usage with the Prometheus API in the Grafana Cloud documentation, the Prometheus API is exposed through the /api/prom/api/v1/query_range endpoint, which is accessed by the Nobl9 agent.

To use Grafana Cloud with Nobl9, append /api/prom/ to the URL you enter in the Data source wizard when configuring a regular Prometheus data source. Thus, instead of http://HOST/, enter http://HOST/api/prom/ in the Data source wizard.
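The relationship between the host URL, the value entered in the wizard, and the endpoint the agent reaches can be sketched as follows. Both helper names are hypothetical, and HOST stands for your own Grafana Cloud host.

```python
def grafana_cloud_base_url(host_url: str) -> str:
    """Append /api/prom/ to the host URL, as entered in the Data source wizard."""
    return host_url.rstrip("/") + "/api/prom/"


def query_range_endpoint(base_url: str) -> str:
    """Endpoint reached once /api/v1/query_range is added to the base URL."""
    return base_url.rstrip("/") + "/api/v1/query_range"


if __name__ == "__main__":
    base = grafana_cloud_base_url("http://HOST/")
    print(base)
    print(query_range_endpoint(base))
```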

For more details, check Grafana Cloud documentation.

Thanos direct with Nobl9 Prometheus agent

Thanos is a high-availability Prometheus setup and can be used with the Nobl9 Prometheus agent.

Thanos cluster setup is out of the scope of this document and is described in the Thanos Components documentation.

Thanos exposes the Prometheus API through its Querier component. Use the Querier address as the Prometheus URL in the Nobl9 agent configuration.
