Service level objectives inputs and outputs
Service level objective (SLO) is a broad term that covers several components of a complex system. This document clarifies the inputs and outputs of SLO-based reliability approaches and how Nobl9 presents these values. Outputs can be compared meaningfully only when they are calculated from the same inputs.
To make sure that an SLO works correctly, we need to define the following characteristics:
Service Level Indicator (SLI): A metric or set of metrics that can be used to determine if a system is currently performing in an acceptable state (or not).
Service Level Objective target: The percentage of events or time for which the SLI must report a good state.
Error budget time window: The period over which we evaluate compliance with the SLO target.
Calculation method: How SLO status is determined, either by counting good events or by measuring the time the system functions correctly. In both cases, the result is evaluated over the error budget time window.
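The two calculation methods described above can be sketched as follows. This is a minimal illustration, not Nobl9's implementation: the function names, the per-minute granularity of the time-based method, and the `>=` comparison against the target are all assumptions made for the example.

```python
from typing import Sequence


def good_ratio_by_events(good_events: int, total_events: int) -> float:
    """Event-based method (hypothetical helper): share of events that were good."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    return good_events / total_events


def good_ratio_by_time(minute_was_good: Sequence[bool]) -> float:
    """Time-based method (hypothetical helper): share of minutes in a good state.

    Assumes the SLI has already been reduced to one good/bad flag per minute.
    """
    if not minute_was_good:
        raise ValueError("need at least one minute of data")
    return sum(minute_was_good) / len(minute_was_good)


def meets_target(good_ratio: float, target: float) -> bool:
    """Compare a good ratio (0.0 to 1.0) with the SLO target (assumed >=)."""
    return good_ratio >= target


# Same service and window, two calculation methods.
by_events = good_ratio_by_events(good_events=990, total_events=1000)
by_time = good_ratio_by_time([True] * 57 + [False] * 3)
print(meets_target(by_events, 0.98), meets_target(by_time, 0.98))
```

The two methods can disagree for the same service and window, which is why outputs are only comparable when the calculation method, like every other input, is the same.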
Defining every input accurately is what makes the following outputs meaningful and easy to understand:
Reliability burn down: The percentage of time a system has been in a good state, evaluated over the error budget time window.
Remaining error budget: An alternate representation of reliability burn down that uses the space between 100% and the SLO target as its field of operation. For example, if your SLO target is 90% and your reliability burn down is 95%, your remaining error budget would be 50%, because half of the 10 percentage points between the target and 100% is still unspent.
Burn rate: The number of observed bad events divided by the number of allowed bad events, as defined by your error budget time window. When your burn rate is 1, you are burning through your budget at an acceptable rate. Below 1, you are on track to retain the excess budget, and above 1, you are on track to exhaust your entire error budget.
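The three outputs above can be derived from the same counts, as the following sketch shows. It assumes an event-based calculation in which the counts cover the whole error budget time window; `SloOutputs` and `compute_outputs` are hypothetical names introduced for illustration and are not part of any Nobl9 API.

```python
from dataclasses import dataclass


@dataclass
class SloOutputs:
    reliability: float       # fraction of good events, 0.0 to 1.0
    remaining_budget: float  # fraction of the error budget still unspent
    burn_rate: float         # observed bad events / allowed bad events


def compute_outputs(good_events: int, total_events: int, target: float) -> SloOutputs:
    """Derive the three outputs from event counts and an SLO target.

    Assumes the counts cover the full error budget time window and that
    the target is a fraction strictly between 0 and 1 (for example 0.90).
    """
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    if not 0.0 < target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")

    bad_events = total_events - good_events
    # Number of bad events the target tolerates for this many events.
    allowed_bad_events = (1.0 - target) * total_events

    reliability = good_events / total_events
    # Position of the reliability within the span between target and 100%.
    remaining_budget = (reliability - target) / (1.0 - target)
    burn_rate = bad_events / allowed_bad_events
    return SloOutputs(reliability, remaining_budget, burn_rate)


# The example from the text: target 90%, reliability 95%.
outputs = compute_outputs(good_events=95, total_events=100, target=0.90)
print(outputs)
```

For the 90% target and 95% reliability example, both the remaining budget and the burn rate come out at roughly 0.5 (up to floating-point rounding). A negative `remaining_budget` would mean the budget has been overspent.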
These outputs describe your system's performance, but you can compare them with one another only if they are calculated from the same inputs.
On the grid view, Nobl9 displays:
- The reliability burn down as a line chart
- The remaining error budget as both time and a percentage, and
- The burn rate as a number
In the details view, Nobl9 displays the incoming SLI data as a line chart of raw data. You can hover over this line to see the exact value at any point in time.
In the details view, Nobl9 displays the reliability burn down as a line chart that you can hover over to see the exact value at any point in time. The SLO target is represented on the line chart as a dotted line (- - -) so you can see how the burn down compares with the target over the configured time window.
≥) operators, SLO target is represented as a solid line on the SLI charts:
In the SLO details view, Nobl9 displays the burn rate as a line chart that you can hover over to see the exact value at any point in time. The value 1 is represented on the line chart as a dotted line (- - -) so you can quickly see whether you’ve been burning too much of your budget.
When the burn rate is calculated over your error budget time window, it shows when you experienced periods of unreliability and how severe they were.
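As a rough illustration of reading unreliable periods from a burn rate chart, the sketch below computes a burn rate per interval and flags the intervals above 1. The per-interval bucketing, the function names, and the choice to treat empty intervals as a burn rate of 0 are assumptions made for this example, not a description of how Nobl9 computes its charts.

```python
from typing import List, Tuple


def burn_rate_series(buckets: List[Tuple[int, int]], target: float) -> List[float]:
    """Burn rate for each (good_events, total_events) interval.

    Each value is the interval's bad-event fraction divided by the
    bad-event fraction the target allows (1 - target).
    """
    if not 0.0 < target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")
    allowed_fraction = 1.0 - target
    rates = []
    for good, total in buckets:
        if total == 0:
            # Assumption: no traffic means no budget was burned.
            rates.append(0.0)
            continue
        bad_fraction = (total - good) / total
        rates.append(bad_fraction / allowed_fraction)
    return rates


def unreliable_periods(rates: List[float], threshold: float = 1.0) -> List[int]:
    """Indices of intervals whose burn rate exceeds the threshold."""
    return [i for i, rate in enumerate(rates) if rate > threshold]


# Three intervals against a 95% target; only the middle one burns too fast.
rates = burn_rate_series([(100, 100), (90, 100), (99, 100)], target=0.95)
print(unreliable_periods(rates))
```

The index of each flagged interval tells you when the unreliability occurred, and the size of its burn rate indicates how severe it was.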