Replay restrictions

Limitations and RBAC

  • Two slots are allocated for every organization. This means you can run two Replays at a time.
    Track job progress in the Job Status widget.
  • To replay SLOs, queue Replays, and remove them from the queue, your role must be Organization admin, Project owner, or Project editor.
    Learn more about RBAC.
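
The slot limit and role requirement above can be pictured with a small model. This is only an illustrative sketch: `ReplayQueue`, `can_manage_replay`, and the role strings are hypothetical names chosen for this example, not part of the product's API.

```python
from collections import deque
from dataclasses import dataclass, field

# Two slots per organization, as stated above.
MAX_CONCURRENT_REPLAYS = 2
# Roles allowed to replay, queue, and dequeue (hypothetical string labels).
REPLAY_ROLES = {"Organization admin", "Project owner", "Project editor"}


def can_manage_replay(role: str) -> bool:
    """Return True if the role may replay SLOs and manage the queue."""
    return role in REPLAY_ROLES


@dataclass
class ReplayQueue:
    """Toy model of one organization's Replay slots and waiting queue."""
    running: list = field(default_factory=list)
    queued: deque = field(default_factory=deque)

    def submit(self, slo_name: str, role: str) -> str:
        if not can_manage_replay(role):
            raise PermissionError(f"role {role!r} cannot run Replay")
        if len(self.running) < MAX_CONCURRENT_REPLAYS:
            self.running.append(slo_name)
            return "running"
        self.queued.append(slo_name)
        return "queued"

    def finish(self, slo_name: str) -> None:
        """Free a slot and start the next queued Replay, if any."""
        self.running.remove(slo_name)
        if self.queued:
            self.running.append(self.queued.popleft())
```

In this model, a third Replay submitted while two are running waits in the queue and starts as soon as one of the running Replays finishes.
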

Data sources can alter data from the past

Metric-gathering systems usually downsample older data, either with aggregate functions such as mean or sum or by simply dropping data points. Downsampling saves storage space, but it can change the result of a query made against a time range in the past. Refer to the documentation of your data source for details.
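
To see why downsampling matters, consider the following minimal sketch. The values, the threshold of 250, and the helper names `downsample` and `good_ratio` are assumptions made for illustration; real data sources apply their own retention and rollup rules.

```python
from statistics import mean


def downsample(points, bucket_size, aggregate):
    """Collapse consecutive points into buckets using the given aggregate."""
    return [
        aggregate(points[i:i + bucket_size])
        for i in range(0, len(points), bucket_size)
    ]


def good_ratio(points, threshold):
    """Share of points at or below the threshold (a simple threshold SLI)."""
    good = sum(1 for p in points if p <= threshold)
    return good / len(points)


# Illustrative one-minute measurements (for example, latency in ms).
raw = [100, 230, 270, 220, 130, 280]
threshold = 250

raw_ratio = good_ratio(raw, threshold)                        # full resolution
mean_ratio = good_ratio(downsample(raw, 2, mean), threshold)  # 2-minute means
sum_ratio = good_ratio(downsample(raw, 2, sum), threshold)    # 2-minute sums

print(raw_ratio, mean_ratio, sum_ratio)
```

The same threshold query gives three different ratios here: averaging hides the two spikes above the threshold, while summing pushes every bucket above it.
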

Replaying a single SLO may take up to an hour depending on:

  • The length of the replayed period
  • The number of objectives in your SLO
  • The number of unique queries used in your SLO

Replay impact on connected resources

Running Replay for an existing SLO has important consequences for its SLI data, alerts, and any composite SLOs it belongs to.

Impact on SLI data

  • Live data is still gathered while Replay is in progress, but it isn't included in the SLO's error budget calculations until the process is complete.

  • Replay queries your data source once again for the entire selected historical period.
    It can completely replace SLI data already gathered for the same period.

  • Data resolution can be lowered due to the downsampling of historical data. It depends on the data source.
    As a result, the SLI chart can look different upon replaying with the same query.

  • Replay won't always fill in missing data points. If there are gaps in data, Replay instead marks these gaps as shown in the examples below.
    This happens when the data source doesn't keep data for as long as you're trying to retrieve, for example, according to the data retention policy.
    To avoid this, always set the Maximum Period for Historical Data Retrieval to a value less than or equal to the data source's retention period.

Example of missing SLI data before and after replaying:

Original input SLI data
2023-01-01 01:20:00 = 100
2023-01-01 01:21:00 = 230
2023-01-01 01:22:00 = 270
2023-01-01 01:24:00 = 220
2023-01-01 01:25:00 = 130
2023-01-01 01:26:00 = 280
2023-01-01 01:27:00 = 200

Retrieved SLI data
2023-01-01 01:20:00 = 100
2023-01-01 01:21:00 = 230
[...] # Gap in the data stream
2023-01-01 01:28:00 = 90
2023-01-01 01:29:00 = 220
2023-01-01 01:30:00 = 270
2023-01-01 01:31:00 = 190

Replayed SLI data
2023-01-01 01:20:00 = 100
2023-01-01 01:21:00 = 230
[...] # Gap in the data stream
2023-01-01 01:28:00 = 90
2023-01-01 01:29:00 = 220
2023-01-01 01:30:00 = 270
2023-01-01 01:31:00 = 190
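
A gap like the one above can be located programmatically once the replayed points are exported. The sketch below assumes one data point per minute; `find_gaps` is a hypothetical helper written for this example, not a product feature.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, step=timedelta(minutes=1)):
    """Return (first_missing, last_missing) pairs for each gap.

    Assumes timestamps are sorted and nominally spaced by `step`.
    """
    gaps = []
    for prev, cur in zip(timestamps, timestamps[1:]):
        if cur - prev > step:
            gaps.append((prev + step, cur - step))
    return gaps


# Timestamps of the replayed SLI data from the example above.
replayed = [
    datetime(2023, 1, 1, 1, 20),
    datetime(2023, 1, 1, 1, 21),
    datetime(2023, 1, 1, 1, 28),
    datetime(2023, 1, 1, 1, 29),
    datetime(2023, 1, 1, 1, 30),
    datetime(2023, 1, 1, 1, 31),
]

for first, last in find_gaps(replayed):
    print(f"missing data from {first} to {last}")
```

For the example data, the helper reports a single gap covering 01:22 through 01:27.
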

Impact on alerts

  • Alerting is suspended for the entire Replay duration and resumed once Replay is complete.
  • Once Replay is complete:
    • Alerts you already received for the recalculated historical period aren't sent again.
    • You receive the alerts you missed, that is, alerts triggered while Replay was running.
      These alerts are triggered based on the recalculated data.
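
The alert behavior after Replay completes can be sketched as a filter over the recalculated alerts. Everything here is hypothetical: the `Alert` type, the `alerts_to_deliver` function, and in particular the choice to drop historical alerts that were never sent before, which the text above does not specify.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Alert:
    policy: str
    triggered_at: datetime


def alerts_to_deliver(recalculated, already_sent, replay_start, replay_end):
    """Pick the alerts to send once Replay is complete.

    recalculated: alerts derived from the recalculated data
    already_sent: set of alerts delivered before Replay started
    """
    deliver = []
    for alert in recalculated:
        if alert in already_sent:
            continue  # already received, never sent twice
        if replay_start <= alert.triggered_at <= replay_end:
            deliver.append(alert)  # missed while alerting was suspended
        # Assumption: other historical alerts are not delivered.
    return deliver
```

In this model, only alerts whose trigger time falls inside the Replay window are delivered after alerting resumes.
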

Impact on composite SLOs

Currently, you can't replay a composite SLO itself, only its components.

Replaying a component of a composite causes no retroactive changes to the composite's data. The replayed component stops reporting data until the process is complete. In the meantime, it is handled according to its maxDelay setting and, if Replay takes longer than maxDelay, its whenDelayed setting. The overall composite error budget calculations depend on the duration of the Replay process, the component's maxDelay setting, and whether the composite contains components without a delay.

Non-delayed components? | Replay vs. maxDelay | Result
Yes | Replay < maxDelay | The composite pauses for the duration of Replay. Component's data collected after replaying is considered in calculations as usual.
Yes | Replay > maxDelay | Component's data is considered in calculations according to whenDelayed. Data delayed for the time surplus (once maxDelay ends) is calculated as usual.
No | Any ratio | The composite pauses for the duration of Replay. Upon replaying, the component's data fills the no-data gap.
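
The three cases in the table above can be expressed as a small decision function. This is a sketch only: the enum and function names are hypothetical, and the case where the Replay duration exactly equals maxDelay is not covered by the table, so grouping it with the first row is an assumption.

```python
from datetime import timedelta
from enum import Enum


class CompositeBehavior(Enum):
    PAUSE_THEN_RESUME = "composite pauses; later component data counts as usual"
    APPLY_WHEN_DELAYED = "component handled according to whenDelayed"
    PAUSE_THEN_BACKFILL = "composite pauses; component data fills the gap"


def composite_behavior(has_non_delayed_components: bool,
                       replay_duration: timedelta,
                       max_delay: timedelta) -> CompositeBehavior:
    """Map the table's inputs to the described result."""
    if not has_non_delayed_components:
        return CompositeBehavior.PAUSE_THEN_BACKFILL  # "No / Any ratio" row
    if replay_duration > max_delay:
        return CompositeBehavior.APPLY_WHEN_DELAYED   # "Replay > maxDelay" row
    # "Replay < maxDelay" row; equality is grouped here as an assumption.
    return CompositeBehavior.PAUSE_THEN_RESUME
```

For example, a 45-minute Replay of a component with a 15-minute maxDelay, in a composite that has non-delayed components, falls into the whenDelayed case.
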

No support for composite SLOs 1.0

Turning an existing SLO into a composite 1.0 while that SLO is being replayed results in the following:

  • Replay continues for the original objectives of this SLO
  • Historical data won't be considered in calculating the composite SLO 1.0 error budget. Instead, the budget is calculated without Replay, starting from the moment the composite 1.0 objective is created.