Disruptions & impact

A disruption is any unplanned event that prevents components from operating as expected. Disruptions can be triggered in two ways:

  • Manually: an Organization admin registers the disruption.
  • Auto-detected: the number of issues received by the status page exceeds the configured thresholds.

Disruption access permissions

Any organization user can view disruptions, but only Organization admins can manually register, update, clear, and delete them.

Active disruptions, whether auto-detected or registered manually, appear on the Status page tab in the Impacting state.

One disruption per component

A component can have only one impacting disruption at a time.

Every disruption is associated with a particular component. A disruption's severity is translated into the component's status. If a component has several subcomponents with impacting disruptions of different severities, the most critical severity determines the parent component's status. When a component is no longer impacted by a disruption, its status returns to Operational.

Disruption auto-detection

The system automatically switches components to Degraded performance when specific issue thresholds are met.

These thresholds are based on issues from three sources:

  • User flags
  • Nobl9 SLO alerts
  • External telemetry signals

The default thresholds are:

  • Five issues in 5 minutes for user flags or external telemetry.
  • Two issues in 5 minutes for Nobl9 SLO alerts.

Customization options

The default issue thresholds prioritize early detection. They can be customized so that detection sensitivity matches your organization's specific patterns and operational requirements. To request a threshold change, contact Nobl9 support.

The Major outage status can only be applied manually by an Organization admin.
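The default thresholds above can be read as a sliding-window count per issue source. The following Python sketch is illustrative only and is not Nobl9's implementation. The `IssueWindow` class and the source keys are hypothetical names, and the sketch assumes that issues are counted separately per source, that they arrive in time order, and that a threshold is met once the count reaches the configured value.

```python
from collections import deque
from datetime import datetime, timedelta

# Default values quoted from the text; the dictionary keys are hypothetical.
DEFAULT_THRESHOLDS = {"user_flag": 5, "external_telemetry": 5, "slo_alert": 2}
WINDOW = timedelta(minutes=5)


class IssueWindow:
    """Counts recent issues per source for a single component."""

    def __init__(self, thresholds=None, window=WINDOW):
        self.thresholds = dict(thresholds or DEFAULT_THRESHOLDS)
        self.window = window
        self._seen = {source: deque() for source in self.thresholds}

    def record(self, source, at):
        """Record one issue; return True if the source's threshold is now met.

        Assumption: a True result is what would switch the component to
        Degraded performance. Timestamps must be passed in ascending order.
        """
        times = self._seen[source]
        times.append(at)
        # Drop issues that have fallen out of the window.
        while times and at - times[0] >= self.window:
            times.popleft()
        return len(times) >= self.thresholds[source]


start = datetime(2024, 1, 1, 12, 0)
monitor = IssueWindow()
first_alert = monitor.record("slo_alert", start)
second_alert = monitor.record("slo_alert", start + timedelta(minutes=1))
```

In this sketch, `second_alert` is the first call that reports the SLO alert threshold as met. A customized threshold would be passed in through the `thresholds` argument.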

Disruption management
Admin access

Organization admins can register, update, clear, and delete disruptions manually.

Registering a disruption forces a component status change immediately, without waiting for the threshold to be exceeded. To register a disruption, follow these steps:

  1. Go to the Status page or the details of the required component.
  2. Click Register disruption.
  3. Fill in the disruption form with the necessary details:
    1. Select the impacted component.
    2. Set disruption severity—Degraded performance or Major outage.
    3. Specify the disruption start date and time.
    4. Provide context for this disruption—your observations, assumptions, loss, etc.
  4. Click Register disruption.

[Figure: Disruption registration form]

Once a disruption is registered, the following options become available:

| Button | Action | Result |
|---|---|---|
| Update | Change the severity or context of the disruption. | Changes appear in the Event timeline table. Any severity change updates the status of impacted components. |
| Clear disruption | Conclude the disruption's impact on the component once the disruption is addressed. | The change appears in the Event timeline table. The disruption's record remains in the Disruption registry. The component returns to the Operational status. This action is final and cannot be reverted. |
| Delete | Permanently remove the disruption from the system. | The disruption and all contributing issues (if any) are permanently removed with no trace kept in the Disruption registry. The component returns to the Operational status. This action is final and cannot be reverted. |
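The difference between clearing and deleting, together with the one-impacting-disruption-per-component rule, can be modeled with a small in-memory registry. This is a sketch under stated assumptions, not the product's data model: `DisruptionRegistry`, its method names, and the "Cleared" state label are hypothetical. Only the "Impacting" state name comes from the text.

```python
import itertools


class DisruptionRegistry:
    """Toy in-memory registry used only to illustrate the three actions."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.records = {}  # disruption id -> record dict

    def register(self, component, severity, context=""):
        # A component can have only one impacting disruption.
        for record in self.records.values():
            if record["component"] == component and record["state"] == "Impacting":
                raise ValueError(f"{component} already has an impacting disruption")
        disruption_id = next(self._ids)
        self.records[disruption_id] = {
            "component": component,
            "severity": severity,
            "context": context,
            "state": "Impacting",
        }
        return disruption_id

    def update(self, disruption_id, severity=None, context=None):
        # Update changes the severity and/or the context of the disruption.
        record = self.records[disruption_id]
        if severity is not None:
            record["severity"] = severity
        if context is not None:
            record["context"] = context

    def clear(self, disruption_id):
        # The record stays in the registry; "Cleared" is a hypothetical label.
        self.records[disruption_id]["state"] = "Cleared"

    def delete(self, disruption_id):
        # The record is removed entirely, leaving no trace.
        del self.records[disruption_id]
```

After `clear`, the same component can receive a new disruption while the old record remains available for review. After `delete`, nothing about the disruption can be looked up.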

Impact on component status

Nested components automatically pass their status upstream. If a component has children, it inherits the most severe status of any individual child. In the following example:

Nested component structure
Platform infrastructure
├── API gateway
└── Servers
    ├── EU cluster
    └── US cluster

If API gateway is at Degraded performance, the parent Platform infrastructure immediately reflects the same status.

The following sections explain status propagation in detail and describe what happens when a disruption is cleared or deleted.

Status propagation

Component statuses are determined by upstream propagation, where the most severe status of a child component defines the status of its parent.

For example, if API gateway and EU cluster are at Degraded performance, and US cluster is at Major outage:

  • Servers inherits the Major outage status.
  • Platform infrastructure also inherits the Major outage status, regardless of the degraded API gateway.

[Figure: Status propagation]
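The propagation rule amounts to taking the maximum severity over a component and all of its descendants. The sketch below reproduces the example with a recursive lookup. The `Status` enum, the `CHILDREN` map, and `effective_status` are hypothetical names introduced for illustration and are not part of any Nobl9 API.

```python
from enum import IntEnum


class Status(IntEnum):
    # Ordered so that a larger value means a more severe status.
    OPERATIONAL = 0
    DEGRADED_PERFORMANCE = 1
    MAJOR_OUTAGE = 2


# Hypothetical parent -> children map for the example hierarchy.
CHILDREN = {
    "Platform infrastructure": ["API gateway", "Servers"],
    "Servers": ["EU cluster", "US cluster"],
}


def effective_status(component, own, children=CHILDREN):
    """Return the most severe status among a component and its descendants.

    `own` maps a component name to the status set by its own disruption;
    components missing from `own` are treated as Operational.
    """
    status = own.get(component, Status.OPERATIONAL)
    for child in children.get(component, []):
        status = max(status, effective_status(child, own, children))
    return status


own_statuses = {
    "API gateway": Status.DEGRADED_PERFORMANCE,
    "EU cluster": Status.DEGRADED_PERFORMANCE,
    "US cluster": Status.MAJOR_OUTAGE,
}
servers = effective_status("Servers", own_statuses)
platform = effective_status("Platform infrastructure", own_statuses)
```

Both `servers` and `platform` evaluate to `Status.MAJOR_OUTAGE`, matching the two bullets above.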

Disruption clearance or deletion

When a disruption is cleared or deleted, it no longer impacts the component.

  • Individual components: The component immediately returns to Operational.
  • Nested components: Parent components recalculate their status depending on whether any other subcomponents are still disrupted.

Starting from the example above (API gateway and EU cluster at Degraded performance, US cluster at Major outage), restoring components has the following results:

| Restored component | Resulting statuses |
|---|---|
| US cluster (Operational) | Servers: Degraded performance. Platform infrastructure: Degraded performance. |
| EU cluster (Operational) | Servers: Major outage. Platform infrastructure: Major outage. |
| US cluster and EU cluster (both Operational) | Servers: Operational. Platform infrastructure: Degraded performance. |
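The recalculation after a clearance or deletion can be sketched by recomputing every status from the disruptions that are still impacting. The example below walks upward from each disrupted component to its ancestors. All names (`RANK`, `PARENT`, `recalculate`, `clear`) are hypothetical and chosen only to mirror the example hierarchy.

```python
# Severity ranking used for comparison; a higher number is more severe.
RANK = {"Operational": 0, "Degraded performance": 1, "Major outage": 2}

# Hypothetical child -> parent map for the example hierarchy.
PARENT = {
    "API gateway": "Platform infrastructure",
    "Servers": "Platform infrastructure",
    "EU cluster": "Servers",
    "US cluster": "Servers",
}


def recalculate(impacting):
    """Recompute all statuses from {component: severity} of impacting disruptions."""
    statuses = {name: "Operational" for name in set(PARENT) | set(PARENT.values())}
    for component, severity in impacting.items():
        node = component
        # Push the severity to the component and every ancestor above it.
        while node is not None:
            if RANK[severity] > RANK[statuses[node]]:
                statuses[node] = severity
            node = PARENT.get(node)
    return statuses


def clear(impacting, *components):
    """Return the disruptions that remain after clearing or deleting some."""
    return {c: s for c, s in impacting.items() if c not in components}


impacting = {
    "API gateway": "Degraded performance",
    "EU cluster": "Degraded performance",
    "US cluster": "Major outage",
}
after_us = recalculate(clear(impacting, "US cluster"))
after_eu = recalculate(clear(impacting, "EU cluster"))
after_both = recalculate(clear(impacting, "US cluster", "EU cluster"))
```

In `after_both`, Servers is Operational while Platform infrastructure stays at Degraded performance, because the API gateway disruption is still impacting.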

Impact monitor
Admin access

The Impact monitor tab provides an aggregated view of all issues and components impacted by issues or disruptions within your organization.

[Figure: Impact monitor]

  • Summary tiles display the total count of impacted components and time-based issue volume breakdowns.
  • The Impact scope table lists impacted components and their associated issues (if any).
  • The Latest issues table provides detailed, chronological information on every issue, with the newest entries at the top.

Organization admins can permanently delete issues.

  • No backward impact: Deleting an issue that came from an SLO alert or external telemetry does not affect its source.
  • Threshold impact: The deleted issue no longer counts toward the issue threshold.
  • No auto-clearance: Deleting a contributing issue does not clear or delete the disruption.
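The last two effects can be illustrated with a minimal model in which issues and disruptions are stored separately. This is only a sketch: the `Disruption` dataclass, `delete_issue`, and `count_for_source` are hypothetical names, and the issue store is assumed to be a plain mapping from issue ID to source.

```python
from dataclasses import dataclass, field


@dataclass
class Disruption:
    component: str
    state: str = "Impacting"
    contributing_issue_ids: set = field(default_factory=set)


def delete_issue(issue_id, issues, disruption):
    """Permanently delete an issue without touching the disruption's state."""
    issues.pop(issue_id, None)
    disruption.contributing_issue_ids.discard(issue_id)


def count_for_source(issues, source):
    """Count the remaining issues from one source (what a threshold would see)."""
    return sum(1 for issue_source in issues.values() if issue_source == source)


# Hypothetical data: two SLO alert issues that triggered a disruption.
issues = {"issue-1": "slo_alert", "issue-2": "slo_alert"}
disruption = Disruption("API gateway", contributing_issue_ids={"issue-1", "issue-2"})

delete_issue("issue-1", issues, disruption)
remaining = count_for_source(issues, "slo_alert")
```

After the deletion, `remaining` is 1, so the issue no longer counts toward the threshold, while `disruption.state` is still "Impacting" until an admin clears or deletes the disruption.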