Budget adjustments
Beta

Error budgets are invaluable for visualizing trends and understanding your services' performance. However, real-world data isn't always clean or fully representative. Outliers and specific events, such as holidays, planned maintenance, or one-time occurrences, can distort error budget calculations and obscure meaningful insights.

This is where budget adjustments come in. With this feature, you can define exclusions to shield your error budget from events like scheduled downtime or deployments. By applying ad-hoc adjustments or scheduling cyclical events with flexible recurrence rules, you can ensure that these periods do not throw your SLOs off track.

With budget adjustments, you can:

Shield your error budget
Protect your SLOs from expected downtime, such as maintenance or deployments.
Customize your budget adjustments
Tailor exclusions to specific needs using flexible scheduling options.
Control your error budget
Exclude past or future periods where SLO compliance isn't required.

Budget adjustments refine your error budget calculations to account for practical realities, helping you maintain accurate insights, focus on long-term trends, and make informed decisions without being misled by temporary fluctuations.

beta feature

Budget adjustments are a beta feature. Currently, you can apply them only through sloctl, the adjustments API, or the Nobl9 Terraform provider.

Overview​

Budget adjustments and RBAC​

Only Organization Admins can apply, update and delete budget adjustments.

Limits for budget adjustments​

The following limits apply to budget adjustments:

  • You can add up to 30 SLOs to a single budget adjustment definition.
  • You can modify or update up to 30 unique events per SLO in one action.

How budget adjustments work​

Budget adjustments allow you to take control of your error budget by excluding specific events and ensuring more reliable metrics about the performance of your services:

Identify events
Pinpoint specific timeframes or incidents that skew your data.
Apply adjustments
Apply adjustments to the relevant periods in historical data. Use sloctl or the adjustments API to apply the budget adjustments to specific SLOs.
Recalculate budgets
Nobl9 automatically adjusts the chart's values to account for the exclusion: it recalculates the Error budget and the Reliability burn down without taking Service level indicator values from the adjusted period into account.
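
The effect of an exclusion can be illustrated with a minimal Python sketch. It is not Nobl9's actual calculation: it assumes a simplified reliability ratio (good points divided by all points), hypothetical sample data, and exclusion windows that include their start and exclude their end.

```python
from datetime import datetime, timezone


def reliability(points, exclusions):
    """Share of good points, ignoring points inside any excluded window.

    points: list of (timestamp, is_good) tuples (hypothetical SLI samples).
    exclusions: list of (start, end) windows; start inclusive, end exclusive
    (the boundary handling is an assumption of this sketch).
    """
    kept = [
        good
        for ts, good in points
        if not any(start <= ts < end for start, end in exclusions)
    ]
    if not kept:
        return None  # nothing left to evaluate
    return sum(kept) / len(kept)


def utc(hour, minute=0):
    # Helper for building sample timestamps on an arbitrary day.
    return datetime(2024, 1, 6, hour, minute, tzinfo=timezone.utc)


points = [
    (utc(0, 30), False),  # failure during maintenance
    (utc(1, 30), False),  # failure during maintenance
    (utc(2, 30), True),
    (utc(3, 30), True),
]
maintenance = [(utc(0), utc(2))]

raw = reliability(points, [])
adjusted = reliability(points, maintenance)
print(raw, adjusted)  # 0.5 1.0
```

The SLI samples themselves are untouched; only the calculation skips the ones that fall inside the adjusted period.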

Terms of the game​

There are two basic building blocks for handling budget adjustments in Nobl9:

  • Budget adjustment definition
  • Budget adjustment event

The sections below explain the difference between them and how the two terms are related.

Budget adjustment definition​

A budget adjustment definition is an object of kind: BudgetAdjustment, specified in YAML format and managed through sloctl or the Nobl9 Terraform provider. This definition establishes the parameters for how budget adjustments are applied to one or more SLOs. Specifically, it outlines:

  • Time period: Specifies when the adjustment is active, including start and end times.
  • Recurrence: Determines whether the adjustment is a one-time event or repeats over time.
  • Target SLOs: Lists the specific SLOs to which the adjustment applies (the adjustment applies to all objectives of the specified SLO).

A definition of a budget adjustment serves as the source for generating adjustment events. In other words, the budget adjustment definition passed in YAML has a one-to-many relationship with its associated events, meaning a single definition can result in multiple adjustment events.

Budget adjustment event​

An adjustment event is a singular occurrence of an adjustment triggered for a particular SLO based on the parameters defined in a budget adjustment definition.

Adjustment events are not self-standing and can't exist independently of the definition. They are always derived from a budget adjustment definition, and the defined adjustment events are applied to all objectives within one SLO during the specified periods.

Adding budget adjustments​

You can add budget adjustment events by applying a YAML definition with sloctl:

General YAML sample for kind: BudgetAdjustment
apiVersion: n9/v1alpha
kind: BudgetAdjustment
metadata:
  name: string # Mandatory
  displayName: string # Optional
spec:
  description: string # Optional
  firstEventStart: YYYY-MM-DDThh:mm:ssZ # Mandatory, defined start date-time point
  duration: 1h # Mandatory
  rrule: FREQ=DAILY;INTERVAL=1 # Optional
  filters:
    slos:
      - name: string # Mandatory
        project: string # Mandatory

Fields, with their requirement level, type, and description:

spec.firstEventStart (mandatory, string)
Scheduled start time for the first adjustment event. The value must be a string representing the date and time in RFC 3339 format. firstEventStart is equivalent to the DTSTART property in iCalendar.
Constraints:
  • firstEventStart can be at most 30 days in the past.

spec.duration (mandatory, string)
The duration of the budget adjustment event.
Constraints:
  • The expected value for this field is a string formatted as a time duration.
  • The duration must be defined with a precision of 1 minute.
  • Example: 1h10m

spec.rrule (optional, string)
The iCalendar recurrence rule for the budget adjustment event.
Constraints:
  • The expected value is a string in the iCalendar RRULE format.
  • Example: FREQ=MONTHLY;BYMONTHDAY=1
  • Use an RRULE calculator to create the desired recurrence rule.
  • rrule can't be applied to past events.

spec.filters.slos[] (mandatory, string)
A list of SLOs that will be attached to the budget adjustment event. spec.filters.slos[n].name and spec.filters.slos[n].project are mandatory for each list item.

note

If you apply a BudgetAdjustment with firstEventStart in the past and a defined rrule, sloctl returns an error:

Error: Validation failed: Cannot apply BudgetAdjustment. firstEventStart is in the past, and RRULE cannot be applied to past events.
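
The constraints above can be checked locally before applying a definition. The following Python sketch is illustrative only and is not how sloctl validates input: the function names are hypothetical, and it assumes that "precision of 1 minute" means durations are written with hours and minutes only (for example, 1h10m).

```python
import re
from datetime import datetime, timedelta, timezone

# Assumption: durations use only hour and minute units, such as "1h10m".
DURATION_RE = re.compile(r"^(?:(\d+)h)?(?:(\d+)m)?$")
MAX_PAST = timedelta(days=30)


def parse_duration(text):
    """Parse a duration such as '1h10m' into a timedelta."""
    match = DURATION_RE.match(text)
    if not text or not match or not any(match.groups()):
        raise ValueError(f"unsupported duration: {text!r}")
    hours, minutes = (int(g) if g else 0 for g in match.groups())
    return timedelta(hours=hours, minutes=minutes)


def check_adjustment(first_event_start, duration, rrule=None, now=None):
    """Return a list of problems found; an empty list means the checks passed."""
    now = now or datetime.now(timezone.utc)
    start = datetime.fromisoformat(first_event_start.replace("Z", "+00:00"))
    problems = []
    if now - start > MAX_PAST:
        problems.append("firstEventStart is more than 30 days in the past")
    if rrule and start < now:
        problems.append("rrule cannot be combined with a past firstEventStart")
    try:
        parse_duration(duration)
    except ValueError as err:
        problems.append(str(err))
    return problems


now = datetime(2024, 1, 10, tzinfo=timezone.utc)
print(check_adjustment("2024-01-04T10:00:00Z", "4h", now=now))
print(check_adjustment("2024-01-04T10:00:00Z", "4h", "FREQ=DAILY", now=now))
```

The first call passes because a one-time adjustment may start up to 30 days in the past; the second reports a problem because a recurrence rule is combined with a past start.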

Types of budget adjustment events​

There are three types of budget adjustment events, determined by the timestamp of the last data point the SLO received.

Let's assume that the last data point was received around Apr 24, 09:25:

Past adjustment events
These are budget adjustment events that were completed before Apr 24, 09:25 (a and b in the image below). Any changes to these events must be handled individually through the adjustments API.
Ongoing adjustment events
These are budget adjustment events that started before Apr 24, 09:25 but haven't finished yet (c in the image below).

Any changes made to their definition may affect their duration, either by shortening or extending it. However, the minimum time to which they'll be shortened will be the time already processed for the budget adjustment event. The start time of these events will remain unaffected by any changes made in their definition.

Future adjustment events
These budget adjustment events will start after Apr 24, 09:25 (d in the image below). All changes made in the budget adjustment definition will affect them.
Image 4: An SLO with past (a, b), ongoing (c), and future (d) adjustment events
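
The three categories can be expressed as a small classification function. This Python sketch is only an illustration: the boundary handling (an event ending exactly at the last data point counts as past), the sample event times, and the year are assumptions, and `effective_end` assumes that "time already processed" coincides with the last data point.

```python
from datetime import datetime, timezone


def classify(event_start, event_end, last_data_point):
    """Label an adjustment event relative to the SLO's last received data point."""
    if event_end <= last_data_point:
        return "past"
    if event_start <= last_data_point:
        return "ongoing"
    return "future"


def effective_end(requested_end, last_data_point):
    """An ongoing event cannot be shortened below the time already processed."""
    return max(requested_end, last_data_point)


def at(day, hour, minute=0):
    # The year is arbitrary; the text only gives "Apr 24, 09:25".
    return datetime(2024, 4, day, hour, minute, tzinfo=timezone.utc)


last_point = at(24, 9, 25)
print(classify(at(23, 10), at(23, 12), last_point))  # past
print(classify(at(24, 9), at(24, 11), last_point))   # ongoing
print(classify(at(25, 2), at(25, 4), last_point))    # future

# Shortening the ongoing event to 09:10 is capped at the processed time, 09:25.
print(effective_end(at(24, 9, 10), last_point))
```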

Actions applicable to budget adjustments​

The table below lists the actions for creating budget adjustment events and definitions:

Create

Period | As a Nobl9 user I want to | Type | sloctl | API | Terraform | Additional info
Future | Exclude a future adjustment event. | Definition | - | - | - |
Future | Exclude multiple recurring future adjustment events. | Definition | - | - | - |
Past | Exclude a past event. | Definition | - | - | - |
Past | Exclude recurring past adjustment events (e.g., previous two weekends). | Event | | | | Workaround: To exclude a recurring past period, e.g. the previous two weekends, create two separate budget adjustment definitions, one for each weekend.

Editing non-recurring past adjustment events

When working with non-recurring past adjustment definitions, keep in mind the following:

Non-recurring adjustments trigger historical events with defined start and end dates. If an adjustment event's start and end dates are fully in the past, editing the associated adjustment definition is restricted to avoid unintentional modifications to historical data.

To modify or delete such events, use these sloctl commands:

  • Update the event: sloctl budgetadjustments events update
  • Delete the event: sloctl budgetadjustments events delete

Recurrence rule format​

With spec.rrule, you can define a rule for predictable events that happen regularly; omit it to create a one-time adjustment for ad hoc needs. The spec.rrule field follows the iCalendar specification.

The format of the rrule field consists of key-value pairs separated by semicolons (;). Each key-value pair specifies a parameter of the recurrence rule. Nobl9 supports all iCalendar rules outlined in the iCalendar documentation.

Example:

The budget adjustment repeats every three days for a total of 10 occurrences:
apiVersion: n9/v1alpha
kind: BudgetAdjustment
...
spec:
  rrule: FREQ=DAILY;INTERVAL=3;COUNT=10

tip

Use the rrule generator to create a recurrence rule suited to your needs.

Deeper dive
Want to know more?

The FREQ value in the rrule definition specifies the frequency of the adjustment event. The value can be one of the following:

  • HOURLY
  • DAILY
  • WEEKLY
  • MONTHLY
  • YEARLY

The INTERVAL value specifies the interval between each recurrence. The value is an integer representing the number of units of the frequency type. For example, if FREQ=DAILY and INTERVAL=2, the event occurs every two days.

You can also include additional parameters, such as BYDAY, BYMONTH, and BYSETPOS, for more complex recurrence patterns.
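
To make the key-value structure concrete, here is a minimal Python sketch that parses a rule and expands it into start times. It is deliberately limited and is not a full iCalendar implementation: it assumes only FREQ, INTERVAL, and COUNT are present and handles only fixed-length frequencies. The function names and the start date are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Fixed-length frequencies only; MONTHLY and YEARLY need calendar arithmetic.
STEP = {
    "HOURLY": timedelta(hours=1),
    "DAILY": timedelta(days=1),
    "WEEKLY": timedelta(weeks=1),
}


def parse_rrule(rule):
    """Split 'KEY=VALUE;KEY=VALUE' into a dictionary."""
    return dict(part.split("=", 1) for part in rule.split(";") if part)


def occurrences(first_start, rule):
    """Expand a simple rule into a list of event start times."""
    params = parse_rrule(rule)
    extra = set(params) - {"FREQ", "INTERVAL", "COUNT"}
    if extra or params.get("FREQ") not in STEP or "COUNT" not in params:
        raise ValueError("this sketch handles only FREQ, INTERVAL, and COUNT")
    step = STEP[params["FREQ"]] * int(params.get("INTERVAL", 1))
    return [first_start + i * step for i in range(int(params["COUNT"]))]


first = datetime(2024, 1, 1, tzinfo=timezone.utc)
starts = occurrences(first, "FREQ=DAILY;INTERVAL=3;COUNT=10")
print(len(starts), starts[1].date(), starts[-1].date())  # 10 2024-01-04 2024-01-28
```

For real schedules, a dedicated recurrence library or an rrule generator is a safer choice than hand-rolled expansion.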

User experience​

Impact on SLI data​

Budget adjustment events don't affect SLI data. When the budget adjustment is active, Nobl9 collects data points and displays them on the SLI chart.

You can see the budget adjustment event on the SLI chart, marked by an annotation icon. When you hover over the budget adjustment area, you can see the collected data points:

Image 1: SLI chart with a marked budget adjustment event

When you hover over the Reliability burn down and the Error budget burn rate charts, you can see data gaps in the budget adjustment event's area:

Image 2: Reliability burn down chart marked with adjustment event

Image 3: Burn rate chart marked with adjustment event

Adjustments and replay playlists​

When working with adjustments and replay processes, note that only one calculation can be performed per SLO at a time. Any new requests involving the same SLO will be queued and executed sequentially:

  1. Replay and adjustment conflict:

    • If a replay is running for SLO X and the user creates an adjustment for the same SLO X, the adjustment will be queued and will only begin once the replay is complete.
  2. Adjustment and replay conflict:

    • If an adjustment is running for SLO X and the user initiates a replay for the same SLO X, the replay will be queued and will only start after the adjustment is complete.
  3. Multiple adjustments conflict:

    • If an adjustment is already running for SLO X and another adjustment is created for the same SLO X, the second adjustment will be queued and will only begin after the first adjustment is finished.

This ensures calculations are processed in the correct order without conflicts or data inconsistencies.
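
The queueing behavior described above can be modeled as one FIFO queue per SLO. The Python sketch below is a conceptual model only, not Nobl9's implementation; the class and job names are hypothetical.

```python
from collections import defaultdict, deque


class SloCalculationQueue:
    """Allows one running calculation per SLO; later requests wait in order."""

    def __init__(self):
        self._pending = defaultdict(deque)
        self._running = {}

    def submit(self, slo, job):
        """Start the job if the SLO is idle; otherwise queue it."""
        if slo in self._running:
            self._pending[slo].append(job)
            return "queued"
        self._running[slo] = job
        return "running"

    def finish(self, slo):
        """Mark the running job done and start the next queued one, if any."""
        self._running.pop(slo, None)
        if self._pending[slo]:
            self._running[slo] = self._pending[slo].popleft()
        return self._running.get(slo)


queue = SloCalculationQueue()
print(queue.submit("slo-x", "replay"))        # running
print(queue.submit("slo-x", "adjustment-1"))  # queued
print(queue.submit("slo-x", "adjustment-2"))  # queued
print(queue.submit("slo-y", "replay"))        # running
print(queue.finish("slo-x"))                  # adjustment-1
```

Requests for different SLOs do not block each other; only requests for the same SLO are serialized.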

Managing adjustments for SLOs​

To maintain accurate SLO tracking, you may need to exclude certain events or recurring time windows from error budget calculations. These short use cases show how to set up recurring adjustment definitions, manage past adjustment events, and create adjustments for historical events.

Setting up recurring adjustments for known downtime periods​

In cases where downtime is predictable, such as routine maintenance or regular inactive hours, you can define a recurring adjustment to automatically exclude these periods from error budget calculations. This feature allows users to set up a schedule that repeats weekly, monthly, or at custom intervals, preventing the need to create new adjustments manually each time.

Let's say that the service undergoes routine maintenance every Saturday from midnight to 2 a.m. UTC, during which it is temporarily taken offline. The adjustment definition for SLOs monitoring this service could look like this:

apiVersion: n9/v1alpha
kind: BudgetAdjustment
metadata:
  name: maintenance-budget-adjustment
  displayName: Maintenance budget adjustment
spec:
  description: Budget adjustment event happening weekly on the Saturday for 2 hours.
  firstEventStart: 2024-01-01T00:00:00Z
  duration: 2h
  rrule: FREQ=WEEKLY;INTERVAL=1;BYDAY=SA
  filters:
    slos:
      - name: latency-slo
        project: project-alpha
      - name: uptime-slo
        project: project-alpha
      - name: throughput-slo
        project: project-alpha
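
To preview which windows such a definition produces, the weekly rule can be expanded by hand. The Python sketch below is an illustration rather than Nobl9's scheduler. It assumes that occurrences keep the time of day from firstEventStart and that the first occurrence falls on the first matching weekday on or after firstEventStart. The function name and the end of the preview window are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def weekly_events(first_event_start, weekday, duration, window_end):
    """Yield (start, end) for each weekly occurrence on `weekday` (Mon=0 ... Sun=6)."""
    days_ahead = (weekday - first_event_start.weekday()) % 7
    start = first_event_start + timedelta(days=days_ahead)
    while start <= window_end:
        yield start, start + duration
        start += timedelta(weeks=1)


SATURDAY = 5
first_event_start = datetime(2024, 1, 1, tzinfo=timezone.utc)  # a Monday
window_end = datetime(2024, 1, 31, 23, 59, 59, tzinfo=timezone.utc)

events = list(weekly_events(first_event_start, SATURDAY, timedelta(hours=2), window_end))
for start, end in events:
    print(start.isoformat(), "->", end.isoformat())
```

Under these assumptions, January 2024 contains four events, starting at 00:00 UTC on January 6, 13, 20, and 27 and each lasting two hours.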

Reviewing and modifying past adjustments for accuracy​

Sometimes, historical adjustment events may need to be modified because of an error in the original exclusion setup or a change in the actual downtime that should have been recorded. This scenario includes two common actions:

  1. Updating a specific past adjustment event
  2. Deleting an incorrect adjustment event

Reviewing and updating past adjustment events using the adjustments API

During a routine maintenance window on Saturday, a regional outage extended the downtime beyond the scheduled period. Although the initial adjustment covered the planned maintenance time, the unexpected outage led to additional unplanned downtime. After reviewing the historical adjustment event for that date, the team realized they needed to adjust the exclusion to capture the entire downtime period.

Access adjustment events history

The team sends the following GET request to the adjustments API:

curl -X GET \
  -H 'Organization: <organization_name>' \
  -H 'Authorization: Bearer <token>' \
  'https://app.nobl9.com/api/budgetadjustments/v1/maintenance-budget-adjustment/events?from=2024-01-01T00:00:00Z&to=2024-01-31T23:59:59Z'

The API returns the following response:

[
  {
    "eventStart": "2024-01-06T00:00:00Z",
    "eventEnd": "2024-01-06T02:00:00Z",
    "slos": [
      {
        "project": "project-alpha",
        "name": "latency-slo"
      },
      {
        "project": "project-alpha",
        "name": "uptime-slo"
      }
    ]
  },
  {
    "eventStart": "2024-01-13T00:00:00Z",
    "eventEnd": "2024-01-13T02:00:00Z",
    "slos": [
      {
        "project": "project-alpha",
        "name": "latency-slo"
      },
      {
        "project": "project-alpha",
        "name": "uptime-slo"
      }
    ]
  },
  {
    "eventStart": "2024-01-20T00:00:00Z",
    "eventEnd": "2024-01-20T02:00:00Z",
    "slos": [
      {
        "project": "project-alpha",
        "name": "latency-slo"
      },
      {
        "project": "project-alpha",
        "name": "uptime-slo"
      }
    ]
  },
  {
    "eventStart": "2024-01-27T00:00:00Z",
    "eventEnd": "2024-01-27T02:00:00Z",
    "slos": [
      {
        "project": "project-alpha",
        "name": "latency-slo"
      },
      {
        "project": "project-alpha",
        "name": "uptime-slo"
      }
    ]
  }
]

Identify the adjustment and update it

Having identified the event that needs to be updated, the team sends the following PUT request to the adjustments API:

curl -X PUT \
  -H 'Organization: <organization_name>' \
  -H 'Authorization: Bearer <token>' \
  -H 'Content-Type: application/json' \
  -d '[
    {
      "eventStart": "2024-01-20T00:00:00Z",
      "eventEnd": "2024-01-20T02:00:00Z",
      "slos": [
        {
          "project": "project-alpha",
          "name": "latency-slo"
        },
        {
          "project": "project-alpha",
          "name": "uptime-slo"
        }
      ],
      "update": {
        "eventStart": "2024-01-20T00:00:00Z",
        "eventEnd": "2024-01-20T03:00:00Z"
      }
    }
  ]' \
  'https://app.nobl9.com/api/budgetadjustments/v1/maintenance-budget-adjustment/events/update'

tip

You can also update past adjustment events using the sloctl budgetadjustments events command. To do so:

  1. Run the sloctl budgetadjustments events get --adjustment-name=maintenance-budget-adjustment command to retrieve a list of events for the specified adjustment and its related SLOs.
  2. Identify the event that needs to be updated.
  3. Run sloctl budgetadjustments events update, providing the updated values in a YAML file.

See the Adjustments use case for real-life examples managed through sloctl.
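
The update request shown earlier can also be assembled programmatically. The Python sketch below only builds the request and does not send it. The endpoint path, headers, and body shape are copied from the curl example above; the helper name `build_update_request` is hypothetical, and nothing here is an official client.

```python
import json
import urllib.request

BASE_URL = "https://app.nobl9.com/api/budgetadjustments/v1"


def build_update_request(adjustment, event, new_end, organization, token):
    """Build (but do not send) a PUT request that changes an event's end time."""
    body = [
        {
            **event,
            "update": {"eventStart": event["eventStart"], "eventEnd": new_end},
        }
    ]
    return urllib.request.Request(
        url=f"{BASE_URL}/{adjustment}/events/update",
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={
            "Organization": organization,
            "Authorization": f"Bearer {token}",
            "Content-type": "application/json",
        },
    )


event = {
    "eventStart": "2024-01-20T00:00:00Z",
    "eventEnd": "2024-01-20T02:00:00Z",
    "slos": [
        {"project": "project-alpha", "name": "latency-slo"},
        {"project": "project-alpha", "name": "uptime-slo"},
    ],
}
request = build_update_request(
    "maintenance-budget-adjustment",
    event,
    "2024-01-20T03:00:00Z",
    organization="<organization_name>",
    token="<token>",
)
print(request.get_method(), request.full_url)
print(json.dumps(json.loads(request.data), indent=2))
```

Passing the request to urllib.request.urlopen would send it; that step is left out so the sketch stays free of side effects.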

Delete an incorrect adjustment event using sloctl​

On another Saturday, the maintenance was canceled because the service needed to remain fully operational due to high demand. Despite the cancellation, the adjustment was still applied as usual, which prevented the service degradation that occurred during this period from impacting the error budget. Reviewing the historical adjustment event for that date, the team realized they needed to remove the adjustment event.

Access adjustment events history

The team accesses the adjustment events history by running the following sloctl command:

sloctl budgetadjustments events get --adjustment-name=maintenance-budget-adjustment --from=2024-01-01T00:00:00Z --to=2024-01-31T23:59:59Z

The command returns the following response:

- eventStart: 2024-01-06T00:00:00Z
  eventEnd: 2024-01-06T02:00:00Z
  slos:
    - project: project-alpha
      name: latency-slo
    - project: project-alpha
      name: uptime-slo
- eventStart: 2024-01-13T00:00:00Z
  eventEnd: 2024-01-13T02:00:00Z
  slos:
    - project: project-alpha
      name: latency-slo
    - project: project-alpha
      name: uptime-slo
- eventStart: 2024-01-20T00:00:00Z
  eventEnd: 2024-01-20T02:00:00Z
  slos:
    - project: project-alpha
      name: latency-slo
    - project: project-alpha
      name: uptime-slo
- eventStart: 2024-01-27T00:00:00Z
  eventEnd: 2024-01-27T02:00:00Z
  slos:
    - project: project-alpha
      name: latency-slo
    - project: project-alpha
      name: uptime-slo

Identify the event and delete it

The team identifies that the following adjustment event must be deleted:

maintenance-event-to-be-deleted.yaml
- eventStart: 2024-01-27T00:00:00Z
  eventEnd: 2024-01-27T02:00:00Z
  slos:
    - project: project-alpha
      name: latency-slo
    - project: project-alpha
      name: uptime-slo

The team then runs the following sloctl command to delete it:

sloctl budgetadjustments events delete --adjustment-name=maintenance-budget-adjustment -f ./maintenance-event-to-be-deleted.yaml

tip

You can also delete a past budget adjustment event using the adjustments API.

Retrospectively excluding a historical event​

There may be cases where an event in the past should be excluded, but no adjustment definition was initially created. For example, an unanticipated maintenance period or a non-service-related incident (e.g., a regional outage) impacted the SLO. This feature allows users to create an adjustment definition after the event has occurred.

The team discovered that a misconfigured monitoring metric caused the SLO to appear broken, even though the service was functioning correctly. Recognizing that the issue was due to inaccurate data beyond their control, they decided to retroactively exclude this event from the error budget:

apiVersion: n9/v1alpha
kind: BudgetAdjustment
metadata:
  name: outage-budget-adjustment
  displayName: Outage budget adjustment
spec:
  description: Budget exclusion due to external outage incident.
  firstEventStart: 2024-01-04T10:00:00Z
  duration: 4h
  filters:
    slos:
      - name: latency-slo
        project: project-alpha

For a more in-depth look, consult additional resources: