
System Health Review report


The System Health Review report provides a simplified way to monitor and report on your system's reliability and performance. It's specifically designed for recurring reliability check-in meetings, giving you an efficient overview of your system health over time.

System Health Review is targeted at:

  • System administrators monitoring day-to-day operations
  • Technical managers tracking team performance
  • Upper management making strategic decisions

The report presents SLO data in an easy-to-understand format that supports informed decision-making across all organizational levels.

Report overview

System Health Review supports recurring reliability check-ins by grouping your Nobl9 SLOs by the projects, services, and labels you choose, and by presenting their remaining error budget in a table.

The report gives both aggregated and detailed views of the overall system health for SLOs residing in selected projects or services. The four color-coded categories reflect the available error budget of the SLOs:

Healthy SLOs
This icon and color indicate SLOs with enough error budget left within their time window.
Exhausted SLOs
This icon and color indicate SLOs whose error budget is fully exhausted within their time window.
SLOs at risk
This icon and color indicate SLOs whose error budget is at risk of being fully exhausted within their time window.
No data SLOs
This icon and color indicate SLOs that haven't gathered data.
System Health Review report

The report aggregates SLO health information to provide quick glimpses into the performance of your system.

Depending on the grouping option set for the report, the identifier column contains project or service names, or label values. The first cell of the header row displays the grouping indicator:

The Project grouping report configuration

For the Label value rows configuration, the report displays your selected label key as the grouping indicator.

You can sort the table by the SLOs' remaining error budget. To do so, click the required column header:

Sorting by SLOs' remaining error budget

SLOs in the columns must match the following criteria:

  • Reside in projects or services selected in Step 2 Filter resources, or have the label values as specified for the row grouping
  • Be tagged with labels selected for each column

Label filtering logic for columns: When multiple labels are assigned to a column, Nobl9 combines them as follows:

  • Same key, different values: SLOs must match ANY of the values (OR logic)
  • Different keys: SLOs must match ALL keys (AND logic)

Example: A column has the labels environment: staging, environment: production, and team: operations. SLOs must carry both label keys (environment and team), but they can match either environment value.

That is, SLOs appear in this column if they have:

  • environment: staging AND team: operations, OR
  • environment: production AND team: operations
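The combination logic above can be sketched in a few lines of Python. This is only an illustration of the OR-within-a-key, AND-across-keys rule, not Nobl9's implementation. The function name `matches_column` and the data shapes (a dict of label sets for the SLO, a list of key-value pairs for the column) are assumptions made for the example.

```python
from collections import defaultdict


def matches_column(slo_labels, column_labels):
    """Return True if an SLO's labels satisfy a column's label filter.

    slo_labels: dict mapping a label key to the set of values on the SLO
                (hypothetical shape, chosen for this sketch).
    column_labels: list of (key, value) pairs assigned to the column.
    """
    # Collect the column's values per key: values under one key are alternatives (OR).
    allowed = defaultdict(set)
    for key, value in column_labels:
        allowed[key].add(value)
    # Every key must be satisfied (AND) by at least one of its allowed values.
    return all(slo_labels.get(key, set()) & values for key, values in allowed.items())


column = [
    ("environment", "staging"),
    ("environment", "production"),
    ("team", "operations"),
]
print(matches_column({"environment": {"staging"}, "team": {"operations"}}, column))  # True
print(matches_column({"environment": {"staging"}}, column))  # False: the team key is missing
```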
Grouping in Label value rows configuration

SLOs with the specified labels are displayed in the report, where:

  • Columns are populated by selected full labels (key-value pairs)
  • Rows are populated by a selected key, where all of its available values become the row headers
  • The Other row appears when your report includes SLOs that match report filters and have column labels, but miss the label selected for row grouping
  • The cells at the intersection of a column and a row contain the SLOs that match both the column's full label and the row's key value
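As a rough illustration of the row logic described above, the following Python sketch places SLOs into rows by the values of a chosen label key and falls back to the Other row when the key is missing. It assumes the input has already been narrowed to SLOs that match the report filters and column labels. The function name `group_into_rows` and the data shape are hypothetical, and listing an SLO under every value it has for the row key is an assumption of this sketch.

```python
from collections import defaultdict


def group_into_rows(slos, row_key):
    """Group SLO names into rows keyed by the values of `row_key`.

    slos: dict mapping an SLO name to {label key: set of values}
          (hypothetical shape for this sketch).
    """
    rows = defaultdict(list)
    for name, labels in slos.items():
        values = labels.get(row_key)
        if not values:
            # The SLO lacks the row-grouping label, so it lands in "Other".
            rows["Other"].append(name)
            continue
        # Assumption: an SLO appears in the row of each value it carries.
        for value in sorted(values):
            rows[value].append(name)
    return dict(rows)


slos = {
    "api-latency": {"team": {"operations"}, "environment": {"staging"}},
    "db-errors": {"environment": {"staging"}},
}
print(group_into_rows(slos, "team"))
```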
Column groupings

You can view the labels assigned to each column. To do so, hover the cursor over the required column header:

Viewing labels

Depending on the reporting time frame, the report can be:

Real-time
Real-time reports reflect the state of selected SLOs at the latest data point received by Nobl9 within the last hour. Choose this option to review the most up-to-date state of your system.
Retrospective
Retrospective reports show the historical state of selected SLOs. You can additionally define a recurrence rule (rrule) to update the historical data in the report regularly [1]. Retrospective reports also display SLO health trends [2].
System Health Review report for the past timepoint with added recurrence rule (no. 1) and SLO health trends (no. 2)
SLO health trends

SLO health trends represent the remaining error budget metric trends for the latest reporting period.

For example, in reports generated every Monday (available with rrule), the trend will reflect the comparison between the previous and current Mondays.

Aggregated columns and rows

The report shows SLO health as percentages in each cell, with color-coded categories you define when creating the report:

  • Healthy SLOs have sufficient error budget remaining
  • At-risk SLOs have an error budget running low
  • Exhausted SLOs have an error budget fully consumed
  • No data SLOs have insufficient data for calculation

Four levels of aggregation:

  • Overall (top-left cell) shows the health distribution across all SLOs in the report
  • By row (right column) shows health distribution for all SLOs in a particular project / service / label
  • By column (top row) shows health distribution for all SLOs matching that column's labels
  • By cell (intersections) shows health distribution for all SLOs that fall into particular cells
View the example of aggregation levels
  • In the entire report: 13% of all SLOs are exhausted, 26.1% are at risk, and 60.9% are healthy
  • Team Operations on all environments: 12.5% exhausted SLOs, 23% at risk, and 62.5% healthy SLOs
  • All teams on the staging environment: 50% exhausted SLOs, 0% at risk, and 50% healthy SLOs
  • Team Operations on staging: 33% exhausted SLOs and 25% healthy SLOs
SLO aggregations
SLO aggregation in columns and rows
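The percentages at each aggregation level can be thought of as a simple share of SLOs per category. The Python sketch below shows the arithmetic under two assumptions: every SLO in scope counts toward the total (including No data SLOs), and values are rounded to one decimal place. The function name and category identifiers are hypothetical. The sample of 23 SLOs is invented so that the result matches the overall figures quoted in the example above.

```python
from collections import Counter

CATEGORIES = ("healthy", "at_risk", "exhausted", "no_data")


def health_distribution(categories):
    """Return the percentage of SLOs per category, rounded to one decimal."""
    counts = Counter(categories)
    total = sum(counts.values())
    if total == 0:
        # No SLOs in scope: nothing to aggregate.
        return {}
    return {c: round(100 * counts[c] / total, 1) for c in CATEGORIES}


# Hypothetical report scope: 3 exhausted, 6 at-risk, and 14 healthy SLOs.
sample = ["exhausted"] * 3 + ["at_risk"] * 6 + ["healthy"] * 14
print(health_distribution(sample))
# {'healthy': 60.9, 'at_risk': 26.1, 'exhausted': 13.0, 'no_data': 0.0}
```

The same function applies to any level: pass the categories of all SLOs for the overall cell, or only those in a given row, column, or cell.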

Key points to consider

  • Rows are sorted alphabetically by default
  • Cells display no matching SLOs when no SLOs in the selected project or service have the labels assigned to that column
  • When you add or remove labels from SLOs, reports containing those labels update dynamically
  • For sloctl users: When defining resources for filters, you must also specify the resource that contains them. For example, to add a service, you must define the project containing that service under filters.project

System limits and capabilities:

Aspect | Limit/Behavior | Description
Maximum columns | 30 | Maximum number of columns allowed per report
Historical data retention | 2 years | Maximum historical data available for retrospective reports
Minimum report frequency | Daily | Most frequent schedule allowed for recurring reports
Empty content handling | Not displayed | Projects or services containing no SLOs are automatically excluded from display
Query delay impact

If any of your SLOs use a data source with an extended query delay, Real-time ("latest") reports lag behind your system's actual state by the configured query delay.

Create a System Health Review report

You can create the System Health Review report in the Nobl9 Web or by applying a YAML configuration with sloctl.

  1. On the Nobl9 Web, go to Reports.
  2. Click .

Step 1: Name report and choose its type

  1. Enter the display name for your report.
    You can edit it at any time.
  2. Select the System Health Review report type.

Step 2: Filter resources

The resources you select define the scope of your report.

  1. Select at least one project, service, or service level objective to be included in your report.
    These fields are interdependent: selecting a project defines the list of available services and SLOs, and selected SLOs narrow down the list of available services and projects.
  2. Optional: Select labels to add more resources to your report.

Step 3: Define report layout

Your report layout depends on the row grouping you choose and the columns you configure.

Preview table

The table below shows a preview of the report structure. Your actual data will appear once you complete the configuration.

  1. Set Row grouping to determine what appears in the header rows:
    • Project rows: display project names in header rows
    • Service rows: display service names in header rows
    • Label value rows: display the values of the selected label key in header rows
  2. Configure table columns:
    • Enter the required column names and select labels for each column
    • To add columns, click + on the right side of the table (up to 30 columns per report)
    • To remove columns, hover over the column and click above it
      Adding and removing columns

Step 4: Configure thresholds

Thresholds define report categories: Exhausted, At risk, and Healthy.

  1. Specify how much remaining error budget defines exhausted and healthy SLOs.
    • SLOs with the error budget remaining between these values fall into the At risk category.
    • You can reset the thresholds to their default values set by your organization admin for the Service Health Dashboard.
  2. Set the visibility of the At risk category and SLOs that report no data.
    • Hide At risk and set the same values for Exhausted and Healthy, so your report includes only two categories.
    • Deselect SLOs without data to have these SLOs in your report.
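To make the threshold logic concrete, here is a minimal Python sketch of how an SLO could be assigned to a category from its remaining error budget. The function name `classify`, the default threshold values, and the treatment of values exactly on a threshold (inclusive) are assumptions of this sketch. The source only states that SLOs between the two thresholds are At risk.

```python
def classify(remaining_budget, exhausted_at=0.0, healthy_at=0.2):
    """Classify an SLO by remaining error budget (a fraction, e.g. 0.15 = 15%).

    exhausted_at / healthy_at are hypothetical defaults, not Nobl9's.
    """
    if remaining_budget is None:
        # The SLO has not gathered data yet.
        return "No data"
    if remaining_budget <= exhausted_at:
        return "Exhausted"
    if remaining_budget >= healthy_at:
        return "Healthy"
    # Anything strictly between the two thresholds is At risk.
    return "At risk"


print(classify(0.5))             # Healthy
print(classify(0.1))             # At risk
print(classify(-0.05))           # Exhausted
print(classify(0.1, 0.1, 0.1))   # Exhausted: equal thresholds leave no At risk band
```

Setting both thresholds to the same value, as the last call shows, leaves no room for the At risk category, which matches the two-category setup described above.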

Step 5: Select reporting time

You can create a one-time report based on either the latest data or a past point in time (retrospective). Retrospective reports can be one-time or recurring.

To create the report based on the latest data, select Real-time and specify the time zone.

For Retrospective, do the following:

  1. Set the date, time, and time zone: your report will show the state of your selected SLOs as of the moment you specified.
  2. Specify the Repeat rule for your report:
    • With any option except for Don't repeat, Nobl9 will update the report as frequently as you select.
    • Select Custom when no repeat option fits your needs:
      • Enter your custom recurrence rule in the iCalendar format or use the rrule generator.
      • Omit the date, time, and time zone from the rrule, because you have already set them.
      • The minimum recurrence frequency is DAILY.

RRULE

Using spec.rrule, you can define a recurrence rule that regenerates the System Health Review report on a regular schedule. The spec.rrule field follows the iCalendar specification.

The format of the rrule field consists of key-value pairs separated by semicolons (;). Each key-value pair specifies a parameter of the recurrence rule. Nobl9 supports all iCalendar rules outlined in the iCalendar documentation.

Example:

The System Health Review report will repeat every week on Monday at 10:00:00 AM.
apiVersion: n9/v1alpha
kind: Report
...
spec:
  serviceHealthReport:
    timeFrame:
      snapshot:
        point: past
        rrule: FREQ=WEEKLY;BYDAY=MO;BYHOUR=10;BYMINUTE=0;BYSECOND=0
tip

Use the rrule generator to create a recurrence rule suited to your needs.
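Before applying a custom rule, it can help to sanity-check it locally. The following Python sketch splits an rrule string into its key-value pairs and rejects frequencies finer than DAILY, in line with the minimum recurrence frequency stated above. It is a simplified check written for illustration, not a full iCalendar parser and not part of sloctl. The function name `parse_rrule` is hypothetical, and the sketch inspects only the FREQ part.

```python
# iCalendar FREQ values that are DAILY or coarser.
ALLOWED_FREQ = {"DAILY", "WEEKLY", "MONTHLY", "YEARLY"}


def parse_rrule(rule):
    """Split an rrule string into a dict and check its FREQ part."""
    parts = {}
    for pair in rule.split(";"):
        if not pair:
            # Tolerate a trailing semicolon.
            continue
        key, sep, value = pair.partition("=")
        if not sep or not key or not value:
            raise ValueError(f"malformed rrule part: {pair!r}")
        parts[key.upper()] = value
    freq = parts.get("FREQ", "").upper()
    if freq not in ALLOWED_FREQ:
        raise ValueError(f"FREQ must be DAILY or less frequent, got {freq!r}")
    return parts


print(parse_rrule("FREQ=WEEKLY;BYDAY=MO;BYHOUR=10;BYMINUTE=0;BYSECOND=0"))
# {'FREQ': 'WEEKLY', 'BYDAY': 'MO', 'BYHOUR': '10', 'BYMINUTE': '0', 'BYSECOND': '0'}
```

Note that a rule can still fire more than once a day through parts such as BYHOUR with several values. This sketch does not detect that case.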
