SLOs
Copy SLOs
You can use sloctl to retrieve the SLO definition, then copy and paste it into your YAML configuration.
Read more in the sloctl user guide.
Restore deleted SLOs
Deletion is permanent. The SLO is removed from Nobl9 and must be re-created.
SLO not reporting data
The "No data" message appears when no data is collected for an SLO objective during the selected time window.
You may encounter situations where the tiles display "No data" while the charts have information. It's also possible for both the tiles and the charts to be empty.
The charts are designed to provide a broad picture of the objective's status over the time window, while the tiles focus on the most recent data. This helps in assessing whether the objective is collecting data with sufficient granularity.
Here's a breakdown of the time ranges used:
- Tiles: data from the last seven days of the selected time window
- Charts: data from the entire time window chosen
For a more comprehensive understanding, let's consider the following example scenarios with different time windows. The examples assume that now is July 10, 2024, at 12:00.
Time window | Tiles time range | Charts time range |
---|---|---|
Rolling longer than 7 days, or calendar-aligned that started more than seven days ago and ends in the future | July 4 12:00 – July 10 12:00 | The entire time window (elapsed) |
Past rolling or calendar-aligned, May 10 12:00 – June 10 12:00 | June 4 12:00 – June 10 12:00 | The entire time window |
Rolling shorter than 7 days, or current calendar-aligned that started less than seven days ago | The entire time window (elapsed) | The entire time window (elapsed) |
The following can delay incoming data:
Cause | How to diagnose | Solution |
---|---|---|
Query parameters | Both tiles and charts are empty | |
Data source | A mismatch between tiles and charts, or no data in both | Refer to the data source troubleshooting guide. |
Query | A mismatch between tiles and charts, or no data in both | |
Replay | A mismatch between tiles and charts, or no data in both | Data appears upon Replay completion. If it doesn't, check the data source or query for any issues. |
The reliability tile color mismatches its value
When you change the reliability target for your SLO, Nobl9 recalculates the remaining error budget and reliability values only when the next data arrives after the target change.
Until the next data arrives, the following happens:
- The reliability target displays the actual (newly changed) value.
- The color of the Reliability tile depends on the target you set.
- The Error budget remaining and Reliability tiles display values based on the data already collected.
So, when you increase the reliability target, the Reliability tile can become red while its value is high and the error budget remaining is sufficient. Or when you decrease the target, the Reliability tile can turn green even with a small value and too little error budget remaining.
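The mismatch can be pictured with a minimal sketch. It assumes a simplified coloring rule (green when the last calculated reliability meets the current target, red otherwise). The function name `tile_color` and the rule itself are illustrative assumptions, not Nobl9's actual implementation.

```python
def tile_color(reliability: float, target: float) -> str:
    """Hypothetical rule: compare the last calculated reliability with the current target."""
    return "green" if reliability >= target else "red"


# Reliability calculated from the data collected before the target change.
stored_reliability = 0.995

# With the old 99% target the tile is green. After raising the target to 99.9%,
# the value is still high, but the tile turns red until new data arrives.
print(tile_color(stored_reliability, target=0.99))   # green
print(tile_color(stored_reliability, target=0.999))  # red
```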
The mismatch disappears once Nobl9 collects new data. However, it's saved in the SLO history, so you will still see it when rewinding the time window to the period of the target modification, unless you change the target again or replay the SLO.¹
¹ The per-data-source limit on the maximum period for historical data retrieval applies.
Dashboard sort and filtering
You can sort the dashboard by project in ascending and descending order. You can also filter the dashboard results using labels.
Project-specific dashboards
Although individual project dashboards aren't available, you can create project-focused views by filtering the existing dashboard by labels.
The filtered results have a dedicated link for sharing and bookmarking.
Multiple-project agent
An agent can only be associated with one project when it's created. However, when you create an SLO, you can use an agent from any project.
Read more about Nobl9 agent.
Common client credentials for multiple agents
Client IDs and secrets must be unique for each agent.
Historical metrics retrieval
Yes, you can retrieve your historical metrics when creating a new SLO with Replay activated.
Update SLOs
Our objects are managed in a declarative manner, similar to Kubernetes. As such, making changes to certain specifications creates a new SLO-type object.
If you update an SLO using the sloctl binary and try to change the project with the sloctl apply command, the SLO is duplicated in the new project.
You can use sloctl to move SLOs to another service as long as they belong to the same project.
Moving SLOs between projects
You can't move your existing SLOs between projects without losing your SLO's historical data.
Read more about editing SLOs.
Editing SLOs
Updating any of the following SLO settings results in losing historical metrics data:
- Target
- Error budget calculation method
- Time window
Any change to these settings resets the error budget of your SLO and removes the budget history.
Read more about editing SLOs.
SLO sample queries
You can find sample queries in the Sources documentation.
Select the required data source from the list and go to the Create SLOs section.
An SLO fails to receive data
First, try to restart your agent. If this doesn't work, contact Nobl9 support.
Limit SLOs per project, team, or user
To set limits on SLOs, assign user roles with specific permissions regulating access to them.
Read more about role-based access control.
Sharing SLOs
To share an SLO, do the following:
- Go to Service Level Objectives.
- Click the required SLO to open its details.
- Copy the deep link (the URL) to the SLO from the address bar and share it as needed.
Error budget units of measurement
In Nobl9, error budgets are measured in percentages. Additionally, we display them as time units to make them easier to comprehend.
Compare:
- We can sustain another 15 minutes of complete downtime this month, and
- 33% of a 0.1% error allowance over 28 days remaining.
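As a rough illustration of how a percentage can be turned into time units, the sketch below converts the remaining share of an error budget into minutes. It assumes the budget is read as minutes of complete downtime within the time window. The function name `remaining_budget_minutes` and its parameters are hypothetical and don't represent how Nobl9 computes the displayed value.

```python
def remaining_budget_minutes(error_allowance: float, window_days: int,
                             remaining_fraction: float) -> float:
    """Convert the remaining share of an error budget into minutes of full downtime."""
    window_minutes = window_days * 24 * 60
    total_budget_minutes = window_minutes * error_allowance
    return total_budget_minutes * remaining_fraction


# 33% remaining of a 0.1% error allowance over a 28-day window.
minutes = remaining_budget_minutes(error_allowance=0.001, window_days=28,
                                   remaining_fraction=0.33)
print(f"{minutes:.1f} minutes of complete downtime remaining")  # 13.3 minutes
```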
Burn rate measurement
Nobl9 measures burn rates in a standard way:
- A 1x burn rate means you will burn through (but not exceed) your error budget during your defined time window
- A burn rate below 1x over an entire time window means you will have an error budget remaining at the end
- A burn rate above 1x over an entire time window means you will exceed your error budget
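The list above can be sketched with the commonly used definition of burn rate: the observed error rate divided by the error rate that the target allows. The function names and sample numbers below are illustrative assumptions, not Nobl9's internal calculation.

```python
def burn_rate(bad_events: int, total_events: int, target: float) -> float:
    """Observed error rate divided by the error rate the target allows."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    allowed_error_rate = 1.0 - target
    observed_error_rate = bad_events / total_events
    return observed_error_rate / allowed_error_rate


def window_fraction_until_exhausted(rate: float) -> float:
    """Share of the time window after which the budget runs out at a constant burn rate."""
    if rate <= 0:
        return float("inf")  # nothing is being burned
    return 1.0 / rate


# Target of 99.9%: the allowed error rate is 0.1%.
for bad in (1, 2):
    rate = burn_rate(bad_events=bad, total_events=1000, target=0.999)
    share = window_fraction_until_exhausted(rate)
    print(f"{rate:.2f}x burn rate, budget lasts {share:.2f} of the window")
```

At a constant 2x burn rate, the budget is gone halfway through the time window. At 1x, it lasts exactly until the window ends.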
SLO shows a negative error budget burn rate
If your SLO shows a negative error budget, check if your query is correct or contact Nobl9 Support.
Error budget is more than 100%
If your error budget is above 100%, check if your query is correct or contact Nobl9 Support.
Alert policy removal from an SLO
Set alert_policies = [] to remove the alert policies from the SLO.
An SLI chart shows different values for the same time at different time scales
This can happen when the sum aggregation is set for a non-incremental ratio SLI.
For this SLI type, Nobl9 adds each new data point to the previous ones, and the SLI chart displays the sum of these time series.
Zooming in on the chart narrows the timespan of the displayed data, so it covers less data, reducing the values you see.
In contrast, when you zoom the chart out, the timespan widens (and captures more data), so the displayed values grow.
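One way to picture the effect is to sum raw data points into the intervals a chart displays. The sketch below assumes that each chart point is the sum of all raw points in its interval and that intervals are fixed groups of consecutive points. Both the helper `sum_per_bucket` and the sample data are hypothetical and don't reflect Nobl9's actual charting logic.

```python
from typing import List


def sum_per_bucket(points: List[int], bucket_size: int) -> List[int]:
    """Sum consecutive data points into buckets of `bucket_size` points each."""
    return [sum(points[i:i + bucket_size])
            for i in range(0, len(points), bucket_size)]


# One hypothetical non-incremental data point per minute.
per_minute = [10, 12, 9, 11, 10, 8, 12, 10]

zoomed_in = sum_per_bucket(per_minute, bucket_size=1)   # each chart point covers 1 minute
zoomed_out = sum_per_bucket(per_minute, bucket_size=4)  # each chart point covers 4 minutes

print(zoomed_in)   # [10, 12, 9, 11, 10, 8, 12, 10]
print(zoomed_out)  # [42, 40]
```

The underlying data is identical in both cases. Only the amount of data summed into each displayed point differs.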
Learn more about:
- SLO calculations
- SLI aggregations
- Discrepancy between Nobl9 SLI charts and the values from a data source
- SLO inputs and outputs
Querying for data
By default, Nobl9 queries the data source every minute for the last minute of data. However, the exact interval depends on the data source and query configuration.
Read more about your required data source.
Reliability increases after a bad event with a high burn rate
This behavior is natural for rolling time windows.
As the time window moves forward, old data points expire and new ones appear on a first-in, first-out basis. Once the time window advances enough for the bad event to fall out and be replaced with good data points, the reliability improves.
An old error may expire exactly when a new error arrives. In this case, the reliability change depends on the weight of these two errors:
- The old error = the new error: reliability doesn't change
- The old error < the new error: reliability decreases
- The old error > the new error: reliability increases
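The first-in, first-out behavior can be sketched with a fixed-size queue. This is a simplified model that treats reliability as the share of good points among the last `window_size` points. The function name and the sample events are illustrative assumptions.

```python
from collections import deque
from typing import Iterable, Iterator


def rolling_reliability(events: Iterable[bool], window_size: int) -> Iterator[float]:
    """Yield the share of good points among the last `window_size` points."""
    window = deque(maxlen=window_size)  # the oldest point drops out first (FIFO)
    for is_good in events:
        window.append(is_good)
        yield sum(window) / len(window)


# True is a good point, False is a bad one; the window holds 4 points.
events = [True, True, False, True, True, True, True]
print([round(r, 2) for r in rolling_reliability(events, window_size=4)])
# [1.0, 1.0, 0.67, 0.75, 0.75, 0.75, 1.0]
```

The last value returns to 1.0 because the bad point has left the window and a good point has taken its place.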
Another scenario that explains this behavior is an SLO with the Occurrences budgeting method, with any time window type. This method considers the total number of data points that arrived during the period. The more data points arrive, the less a single data point weighs. Therefore, the good points can eventually outweigh the bad ones even when some bad points are still being registered. As a result, reliability improves.
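The dilution effect can be shown with a short calculation. It assumes reliability under the Occurrences method is the number of good points divided by the total number of points. The counts are made up for illustration.

```python
def occurrences_reliability(good: int, total: int) -> float:
    """Good data points divided by all data points in the period."""
    return good / total


# After an incident: 100 bad points out of 1,000.
good, total = 900, 1000
before = occurrences_reliability(good, total)

# Later: 10,000 more points arrive, and 10 of them are still bad.
new_points, new_bad = 10_000, 10
good += new_points - new_bad
total += new_points
after = occurrences_reliability(good, total)

print(f"{before:.4f} -> {after:.4f}")  # 0.9000 -> 0.9900
```

Reliability rises even though bad points keep arriving, because each point weighs less as the total grows.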
Useful links
Check the Nobl9 features that can help you troubleshoot:
- Event logs
- Metrics health notifier
- Query checker
- Nobl9 agent logging