Remaining budget

Each SLO has an error budget, which defines the number of acceptable errors that can occur in a given time window:

In the occurrences method, those errors correspond to actual bad events that make the SLO burn through the error budget.
In the time slices method, those errors correspond to the number of bad minutes that make the SLO burn through the error budget.

Although the methods use different units (i.e., number of errors for occurrences, minutes for time slices) to describe the error budget allocation, the remaining error budget can be represented as a percentage of the overall allocation.

Time slices vs occurrences

Suppose you have an SLO with a 99.9% availability target, you use the occurrences method, and the total number of requests in a month is 10000. You can have up to 10 bad requests in such a month and still meet the target.

This means that the error budget is 10 bad requests. If, at any point in time, you've already experienced 5 bad requests within the time window, then the remaining error budget is 50% (5/10).

Similarly, if the SLO calculation method is time slices for the same target and a month with 43200 minutes, the error budget is 43 minutes (0.1% of 43200 is 43.2, rounded down).

This means that 43 bad minutes in such a month would still meet the target. If, at any point in time, you've already had 21 bad minutes in the time window, then the remaining error budget is approximately 50% (21 of 43 minutes burned).
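
The arithmetic in both examples can be sketched in Python. This is a minimal illustration, not Nobl9's implementation: the function names `error_budget` and `remaining_budget_pct` are hypothetical, and rounding the budget down to whole units is an assumption made to match the 43-minute figure above.

```python
import math

def error_budget(target: float, total_units: int) -> int:
    """Allowed bad units (requests or minutes) for a target, rounded down."""
    # Round first to strip floating-point noise before flooring.
    return math.floor(round((1 - target) * total_units, 6))

def remaining_budget_pct(budget: int, bad_units: int) -> float:
    """Share of the error budget that is still unspent, as a percentage."""
    return (budget - bad_units) / budget * 100

# Occurrences: 99.9% target, 10000 requests in the month, 5 bad requests so far.
occurrences_budget = error_budget(0.999, 10_000)
print(occurrences_budget, remaining_budget_pct(occurrences_budget, 5))

# Time slices: 99.9% target, 43200 minutes in the month, 21 bad minutes so far.
time_slices_budget = error_budget(0.999, 43_200)
print(time_slices_budget, round(remaining_budget_pct(time_slices_budget, 21), 1))
```

Both methods reduce to the same ratio, which is why the remaining budget can be compared across SLOs regardless of the unit used.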

tip

Using the remaining budget conditions in your alert policies is a simple yet effective way to monitor the health of your SLOs.

Consider this type of alerting more reactive than proactive, as it triggers when the error budget has already been consumed, no matter how slowly or quickly that happened.

Basic YAML configuration

The following YAML defines an alert policy with the Remaining budget condition:

apiVersion: n9/v1alpha
kind: AlertPolicy
metadata:
  name: budget-below-20
  project: default
spec:
  alertMethods: []
  conditions:
  - measurement: burnedBudget
    value: 0.8
    op: gte
  coolDown: 5m
  description: "Error Budget is nearly exhausted (20%)"
  severity: Medium
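
To make the relationship between the burned and the remaining budget explicit, here is a small Python sketch of the check this policy expresses. It assumes, based on the policy name and description, that `burnedBudget` is the consumed fraction of the budget (0.8 means 80% burned, so 20% remains). The helper names are hypothetical and this is not how Nobl9 evaluates policies internally.

```python
def burned_budget(budget: float, bad_units: float) -> float:
    """Fraction of the error budget already consumed (0.8 means 80%)."""
    return bad_units / budget

def budget_below_20(budget: float, bad_units: float) -> bool:
    """Mirrors the condition above: burnedBudget >= 0.8."""
    return burned_budget(budget, bad_units) >= 0.8

# With a budget of 10 bad requests, the 8th bad request crosses the threshold.
print(budget_below_20(10, 7))
print(budget_below_20(10, 8))
```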

note

The remaining budget calculation doesn't rely on the evaluation window, which can be configured using the alertingWindow parameter. This means you can't use alertingWindow in the burnedBudget measurement.

The value for this condition's lastsFor parameter defaults to 0. This configuration will alert you when you reach a specific budget level.

We don't recommend changing this value to >0, as such a configuration might unnecessarily delay alerts when your SLO has already reached a specific budget level.

Custom mathematical operators

You can use all available mathematical operators to define the remaining budget condition:

lte - less than or equal to (≤)
gte - greater than or equal to (≥)
lt - less than (<)
gt - greater than (>)
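
As an illustration only, the four operator keywords map directly onto ordinary comparisons. The sketch below uses Python's standard `operator` module; the `OPERATORS` table and the `condition_met` helper are hypothetical names, not part of Nobl9.

```python
import operator

# Keyword from the alert policy YAML -> comparison function.
OPERATORS = {
    "lte": operator.le,
    "gte": operator.ge,
    "lt": operator.lt,
    "gt": operator.gt,
}

def condition_met(measured: float, op: str, value: float) -> bool:
    """Return True when `measured <op> value` holds."""
    return OPERATORS[op](measured, value)

# burnedBudget of 1.0 satisfies "gte 1" (no budget left) but not "lt 1".
print(condition_met(1.0, "gte", 1), condition_met(1.0, "lt", 1))
```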

You may find it useful to combine a remaining budget condition that uses a custom operator with other measurements, such as timeToBurnBudget or averageBurnRate. This way, Nobl9 will alert you only when your SLO still has some error budget left, or only when there's no error budget left. Check the YAML samples below for more details.

Sample configurations

The following alert policy is triggered when there is no budget left, the entire budget would be exhausted in 8h, and this condition lasts for 15m.

apiVersion: n9/v1alpha
kind: AlertPolicy
metadata:
  name: entire-exhaustion-prediction-8h
  displayName: Entire budget exhaustion in 8h
  labels:
    type:
      - time-exhaustion
spec:
  description: Entire error budget allocation prediction for 99%, 30 Day Rolling.
  severity: Low
  coolDown: "15m"
  conditions:
    - measurement: timeToBurnEntireBudget
      value: "8h"
      lastsFor: "15m"
    - measurement: burnedBudget
      value: 1
      op: gte

The following alert policy is triggered when an SLO still has some budget left to burn (the burnedBudget condition with op: lt) and the remaining budget would be exhausted in 3d, with this condition lasting for 15m (the timeToBurnBudget condition).

apiVersion: n9/v1alpha
kind: AlertPolicy
metadata:
  name: remaining-exhaustion-prediction-3d
  displayName: Remaining budget exhaustion in 3d
  labels:
    type:
      - time-exhaustion
spec:
  description: Remaining error budget Allocation prediction
  severity: Medium
  coolDown: "15m"
  conditions:
    - measurement: timeToBurnBudget
      value: "72h"
      lastsFor: "15m"
    - measurement: burnedBudget
      value: 1
      op: lt
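
The combined logic of this policy can be sketched in Python as follows. It is a simplified illustration under stated assumptions: both conditions must hold at the same time, `timeToBurnBudget` is compared with "less than" when no `op` is given, and the `lastsFor` duration is left out. The function name `policy_triggers` is hypothetical.

```python
from datetime import timedelta

def policy_triggers(time_to_burn: timedelta, burned_budget: float) -> bool:
    """True when both conditions of the policy above hold at once."""
    # Assumption: timeToBurnBudget defaults to a "less than" comparison.
    exhaustion_predicted = time_to_burn < timedelta(hours=72)
    # burnedBudget < 1 means some error budget is still left.
    budget_left = burned_budget < 1
    return exhaustion_predicted and budget_left

# Budget is 60% burned and predicted to run out in two days.
print(policy_triggers(timedelta(hours=48), 0.6))
```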

tip

Check the YAML guide for the default operators used with each alerting condition.
