Nobl9 alert conditions
Nobl9 offers multiple ways to set up alerts. These methods are all centered around the error budget, which determines the maximum duration a system can malfunction without consequences.
This approach helps you protect your error budget and ensures that it doesn't get burned or used up entirely. Nobl9 alerting logic is flexible, allowing you to create a model that suits your needs.
Nobl9 alerting logic rests on two key features:
- Feature 1:
Nobl9 fires alerts immediately only if an SLO is in a specific state, so you can act on it here and now.
- Feature 2:
Nobl9 fires alerts if an SLO could enter a particular state at a given point in time, so you can prevent it.
In Nobl9, alert conditions are the building blocks of alert policies, and they serve both purposes: detecting or preventing specific error budget burn states. Each alert condition relies on specific measurements to perform its calculations, so define each condition accurately to ensure it detects (or prevents) the desired error budget burn state.
This way, Nobl9 allows you to configure your alerts according to the following:
-
Measurement that's used for alerting logic. Each measurement is defined as a function of time (exhaustion conditions), burn rate (error budget burn rate conditions), or error budget (remaining error budget conditions). Each measurement also corresponds to a specific condition type.
-
Observation model that's used under the hood to evaluate burn rate and other measurements. You specify the observation model by selecting one of the following:
-
The error budget's burnout characteristic. The most popular types of alert conditions are:
You can use alert presets to quickly set up tested and commonly used alert conditions such as fast burn and slow burn. Learn more about alert presets.
Overview of available alert conditions
Depending on the alert policy configuration, Nobl9 can notify you when:
-
Remaining error budget would be exhausted in the near or distant future. In this condition, exhaustion time prediction becomes more sensitive as your remaining budget decreases. Once your SLO has no error budget left, even the slightest amount of burn will trigger an alert.
-
Entire error budget would be exhausted in the near or distant future. This prediction is based on the allocation of your entire error budget and depends only on the current burn rate. Use it to define alerts based on time rather than the burn rate function and avoid the remaining budget value impacting the prediction.
Note: The Entire/Remaining error budget would be exhausted conditions are triggered when Nobl9 predicts the burn of the entire/remaining budget allocation based on the current burn rate during the configured period.
Remaining error budget would be exhausted uses the timeToBurnBudget measurement when verifying alert conditions, while Entire error budget would be exhausted uses the timeToBurnEntireBudget measurement.
-
The average error budget burn rate is greater than or equal to the threshold and lasts for some period. This alert condition helps detect burn rate spikes independently of the burned budget.
-
The remaining error budget is below the threshold. This is the most straightforward configuration: it alerts you when you reach a specific level of error budget, regardless of how quickly or slowly you reach it.
-
The budget drop condition measures a relative drop in the error budget, expressed as a percentage, that can be observed on the Remaining error budget chart. You can use it as an alternative to the average burn rate condition.
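The relationship between burn rate and the two exhaustion predictions can be sketched in a few lines of Python. This is an illustration built on the common SRE definitions (burn rate as the observed error rate divided by the error rate the target allows), not Nobl9's actual implementation. The function names and the `remaining_fraction` parameter are hypothetical, and the exact formulas Nobl9 uses for timeToBurnBudget and timeToBurnEntireBudget may differ.

```python
from datetime import timedelta
from typing import Optional


def burn_rate(error_rate: float, target: float) -> float:
    """Observed error rate relative to the error rate the SLO target allows.

    A value of 1.0 means the budget would run out exactly at the end of
    the SLO time window. (Assumed definition, not taken from Nobl9.)
    """
    if not 0.0 <= target < 1.0:
        raise ValueError("target must be in the range [0, 1)")
    return error_rate / (1.0 - target)


def time_to_burn_entire_budget(window: timedelta, rate: float) -> Optional[timedelta]:
    """Predicted time to burn a full budget allocation at the given burn rate.

    Depends only on the burn rate, not on how much budget is left.
    Returns None when nothing is being burned.
    """
    if rate <= 0:
        return None
    return window / rate


def time_to_burn_remaining_budget(
    window: timedelta, rate: float, remaining_fraction: float
) -> Optional[timedelta]:
    """Predicted time to burn whatever budget is left at the given burn rate.

    remaining_fraction is the share of the budget still available (0.0 to 1.0).
    With no budget left, any burn yields a prediction of zero.
    """
    if rate <= 0:
        return None
    return window * max(remaining_fraction, 0.0) / rate


# Example: a 30-day window burning at 10x the sustainable rate.
window = timedelta(days=30)
print(time_to_burn_entire_budget(window, 10.0))          # 3 days
print(time_to_burn_remaining_budget(window, 10.0, 0.5))  # 1.5 days
```

In this sketch, the remaining-budget prediction shrinks as the budget is consumed, which mirrors why that condition becomes more sensitive over time, while the entire-budget prediction stays constant for a constant burn rate.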
Slow burn and fast burn conditions
To measure slow burn and fast burn scenarios, you can use the Entire/Remaining error budget would be exhausted or the Average error budget burn rate conditions. For example:
-
Error budget would be exhausted in 3 days and lasts for 10m - fast burn
To detect short but significant spikes in burn rate over a brief timeframe
-
Error budget would be exhausted in 3 days and lasts for 1h - slow burn
To detect a gradual budget burn over a prolonged timeframe
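To make the "lasts for" part of these examples concrete, here is a minimal Python sketch. It assumes the condition fires once the predicted exhaustion time has stayed below the threshold continuously for the configured duration. That interpretation, the `condition_fires` helper, and the sample layout are all assumptions made for illustration and are not Nobl9's evaluation logic.

```python
from datetime import datetime, timedelta
from typing import Iterable, Tuple


def condition_fires(
    samples: Iterable[Tuple[datetime, timedelta]],
    threshold: timedelta,
    lasts_for: timedelta,
) -> bool:
    """Return True if the predicted time to exhaustion stays below
    `threshold` for at least `lasts_for` without interruption.

    `samples` are (timestamp, predicted_time_to_exhaustion) pairs,
    sorted by timestamp. (Hypothetical data shape.)
    """
    breach_start = None
    for timestamp, time_to_exhaustion in samples:
        if time_to_exhaustion < threshold:
            if breach_start is None:
                breach_start = timestamp  # the breach begins here
            if timestamp - breach_start >= lasts_for:
                return True
        else:
            breach_start = None  # recovery resets the timer
    return False


# Fifteen minutes of one-minute samples, each predicting exhaustion in 2 days.
start = datetime(2024, 1, 1)
samples = [(start + timedelta(minutes=i), timedelta(days=2)) for i in range(16)]

fast = condition_fires(samples, timedelta(days=3), timedelta(minutes=10))
slow = condition_fires(samples, timedelta(days=3), timedelta(hours=1))
print(fast, slow)
```

With the same threshold, the shorter duration reacts to a brief spike, while the longer one only fires if the burn persists.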
You can also configure fast and slow burn policies based on the budget drop condition. See how both conditions correlate with each other.
Also, check the Fast and slow burn guide to learn more.
Overview
This section of the Nobl9 documentation takes a closer look at Nobl9 error budget calculations and how they are tied to alert policies and alert methods. Check the specific guides to learn more about the inner workings of Nobl9 alerting: