Replay Beta
Replay provides the ability to retrieve SLI data collected by your data source in the past.
You can use Replay when you need to recover missing or corrupted SLI data from your data source. It also allows you to create Service Level Objectives (SLOs) by retrieving historical data from the start, eliminating the need to wait for data to accumulate naturally over time. Additionally, Replay can help you backfill your SLO reporting. If you have a backlog of SLI data from the past few days or weeks, you can retrieve it using Replay and recalculate your remaining error budget.
How far back in time Replay can retrieve SLI data depends on the maximum period for historical data retrieval initially allowed by your data source.
Since using Replay can be resource-intensive, you might want to further limit this maximum duration at the integration level to avoid overloading. To do this, specify your preferred Maximum Period for Historical Data Retrieval when adding a data source to your Nobl9 organization.
The value for the maximum period for historical data retrieval must be less than or equal to the value initially allowed by your data source.
When you need to replay SLOs on creation, configure this option for your data source first. Then, set the required period for historical data retrieval when creating the SLO.
Replay always retrieves SLI data up to the time it was launched: you select only how far back the data will be retrieved.
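As a rough illustration of the rule above, the following sketch computes the period a Replay would cover. The function name `replay_window` and the `days_back` parameter are hypothetical and are not part of any Nobl9 tool; the sketch only assumes that the end of the window is the launch time and that the start is configurable.

```python
from datetime import datetime, timedelta, timezone


def replay_window(launched_at: datetime, days_back: int) -> tuple[datetime, datetime]:
    """Return the (start, end) of the period a Replay would cover.

    Hypothetical helper: the end is always the launch time,
    and only the start can be chosen.
    """
    if days_back < 0:
        raise ValueError("days_back must be zero or a positive whole number")
    return launched_at - timedelta(days=days_back), launched_at


# Example: a Replay launched on 31 May 2024 at 12:00 UTC, going 14 days back.
launched = datetime(2024, 5, 31, 12, 0, tzinfo=timezone.utc)
start, end = replay_window(launched, 14)
print(start.isoformat(), "->", end.isoformat())
```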
Every organization in Nobl9 is limited to two concurrent Replays.
Subsequent Replays are queued and processed in order as Replay slots open up.
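The slot behavior can be pictured with a small model. This is only a sketch of the rules stated in this document (two concurrent Replays per organization, first-in-first-out queueing, cancellation possible for queued Replays only); the `ReplayScheduler` class and its method names are hypothetical and do not reflect how Nobl9 implements it.

```python
from collections import deque

MAX_CONCURRENT_REPLAYS = 2  # per-organization limit stated in this document


class ReplayScheduler:
    """Hypothetical model of Replay slots for one organization."""

    def __init__(self, limit: int = MAX_CONCURRENT_REPLAYS):
        self.limit = limit
        self.running: list[str] = []
        self.queued: deque[str] = deque()

    def submit(self, slo_name: str) -> str:
        """Start the Replay if a slot is free; otherwise queue it."""
        if len(self.running) < self.limit:
            self.running.append(slo_name)
            return "running"
        self.queued.append(slo_name)
        return "queued"

    def complete(self, slo_name: str) -> None:
        """Free a slot and promote the oldest queued Replay, if any."""
        self.running.remove(slo_name)
        if self.queued:
            self.running.append(self.queued.popleft())

    def cancel(self, slo_name: str) -> bool:
        """Only queued Replays can be cancelled; ongoing ones cannot."""
        if slo_name in self.queued:
            self.queued.remove(slo_name)
            return True
        return False


scheduler = ReplayScheduler()
for name in ["slo-a", "slo-b", "slo-c"]:
    print(name, scheduler.submit(name))
scheduler.complete("slo-a")  # slo-c takes the freed slot
print(scheduler.running, list(scheduler.queued))
```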
Replay at a glance
- With Replay, you can access historical data for new and existing SLOs.
- Replay fetches historical data while your SLO collects new data in real-time. The historical and current data are merged, producing an error budget calculated for the entire period.
- Data sources limit the data retrieval period. This period can be limited even more for your data source in Nobl9.
- Replay always retrieves data until now.¹
- You can run two Replays simultaneously; further Replays are queued.
- You can track Replays in the Job Status widget.
- You can cancel queued Replays in the Job Status widget. However, cancelling ongoing Replays isn't possible.
- SLOs with ongoing Replays display no data in the tiles and charts over the retrieved period until Replay is complete. Queued Replays don't affect the SLO's tiles and charts.
¹ "Now" means the moment Replay is triggered.
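To make the merging of historical and current data more concrete, here is a minimal sketch. It rests on several assumptions that this document does not state: data points are modeled as good and total event counts per timestamp, live data takes precedence when timestamps overlap, and the budget is computed as a simple ratio of bad events to allowed bad events. The function names are hypothetical, and Nobl9's actual calculation may differ.

```python
def merge_points(historical: dict, live: dict) -> dict:
    """Merge two {timestamp: (good, total)} mappings.

    Assumption: when both contain the same timestamp, the live point wins.
    """
    merged = dict(historical)
    merged.update(live)
    return dict(sorted(merged.items()))


def remaining_error_budget(points: dict, target: float) -> float:
    """Return the fraction of the error budget left over the whole period."""
    good = sum(g for g, _ in points.values())
    total = sum(t for _, t in points.values())
    if total == 0:
        return 1.0  # no events, so nothing has been consumed
    allowed_bad = (1 - target) * total
    actual_bad = total - good
    return 1 - actual_bad / allowed_bad


# Two replayed points plus one point collected in real time.
historical = {1: (99, 100), 2: (100, 100)}
live = {3: (98, 100)}
merged = merge_points(historical, live)
print(round(remaining_error_budget(merged, target=0.95), 3))
```

The point of the sketch is that the budget is computed once over the merged series rather than separately for the replayed and the live portions.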
Scope of support
Supported data sources for Replay, their minimum required agent versions, and the maximum period for historical data retrieval:
Data source name | Nobl9 agent minimum version | Maximum period for historical data retrieval |
---|---|---|
Amazon CloudWatch | 0.65.0 | 15 days |
Amazon Prometheus | 0.65.0 | 30 days |
AppDynamics | 0.68.0 | 30 days |
Azure Monitor | 0.69.0-beta01 | 30 days |
Azure Monitor managed service for Prometheus | 0.78.0-beta | 30 days |
Datadog | 0.65.0 | 30 days |
Dynatrace | 0.66.0 | 28 days |
Elasticsearch | 0.85.0-beta | 30 days |
Google Cloud Monitoring | 0.79.0-beta | 30 days |
Graphite | 0.65.0 | 30 days |
LogicMonitor | 0.81.0-beta | 30 days |
New Relic | 0.65.0 | 30 days |
Prometheus | 0.65.0 | 30 days |
ServiceNow Cloud Observability | 0.65.0 | 30 days |
Splunk | 0.65.0 | 30 days |
Replay configuration
You can configure Replay in Nobl9 Web or with sloctl.
In the Data source wizard, define the Advanced settings > Historical data retrieval:
- The Maximum period for historical data retrieval value sets the limit for how far back Nobl9 can query data from your data source.
  The maximum period must be less than or equal to the period initially allowed by your data source.
- The Default period for historical data retrieval defines the period applied to replay SLOs based on your data source. This value is suggested by default as the Period for historical data retrieval for SLOs created based on this data source. It must be less than or equal to the Maximum period for historical data retrieval set for this data source.
Requirements:
- The values must be whole positive numbers or zero.
- Default period for historical data retrieval must not exceed the maximum period set for this data source. You can override this value when creating SLOs.
- Maximum period for historical data retrieval must not exceed the maximum period initially allowed by your data source.
- To replay SLOs based on this data source immediately on creation, set Maximum period for historical data retrieval to a non-zero value.
In your data source's YAML definition for agent or direct, define the values for historicalDataRetrieval:
- maxDuration sets the limit for how far back Nobl9 can query data from your data source.
  The maximum period must be less than or equal to the period initially allowed by your data source.
- defaultDuration defines the period applied to replay SLOs based on your data source.
- apiVersion: n9/v1alpha
  # Oneof: Agent, Direct
  kind: Direct
  metadata:
    displayName: My data source with replay
    name: my-data-source
    project: my-project
  spec:
    # Your data source-specific fields
    # ...
    historicalDataRetrieval:
      defaultDuration:
        unit: Day
        value: 0
      maxDuration:
        unit: Day
        value: 30
    # Query parameters configuration: query interval, query delay, jitter, and timeout
    interval:
      unit: Minute
      value: 1
    queryDelay:
      unit: Minute
      value: 5
    jitter:
      unit: Second
      value: 45
    timeout:
      unit: Second
      value: 50
    # Event logs available for direct connections only
    logCollectionEnabled: true
    # Oneof: beta, stable. Use beta for Replay
    releaseChannel: beta
Run sloctl apply to proceed.
Learn more about YAML configuration for data sources connected using the agent or direct methods.
Requirements:
- The values must be whole positive numbers or zero.
- The value for defaultDuration must not exceed the maximum period set for this data source.
- The maxDuration value must not exceed the maximum period initially allowed by your data source.
- To replay SLOs based on this data source immediately on creation, set maxDuration to a non-zero value.
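The requirements above can be checked before running sloctl apply. The sketch below validates a historicalDataRetrieval block that has already been loaded into a Python dictionary. The function names are hypothetical, and the set of accepted units (Minute, Hour, Day) is an assumption; this document only shows Day for these fields.

```python
# Assumed unit names; only "Day" appears for these fields in this document.
UNIT_TO_MINUTES = {"Minute": 1, "Hour": 60, "Day": 24 * 60}


def to_minutes(duration: dict) -> int:
    """Convert a {"unit": ..., "value": ...} mapping to minutes."""
    value = duration["value"]
    if isinstance(value, bool) or not isinstance(value, int) or value < 0:
        raise ValueError("value must be a whole positive number or zero")
    return value * UNIT_TO_MINUTES[duration["unit"]]


def validate_historical_data_retrieval(hdr: dict, source_limit_days: int) -> list[str]:
    """Return a list of violated requirements (empty when the block is valid)."""
    default = to_minutes(hdr["defaultDuration"])
    maximum = to_minutes(hdr["maxDuration"])
    errors = []
    if default > maximum:
        errors.append("defaultDuration exceeds maxDuration")
    if maximum > source_limit_days * UNIT_TO_MINUTES["Day"]:
        errors.append("maxDuration exceeds the period allowed by the data source")
    return errors


# The values from the YAML sample: valid for a source that allows 30 days.
sample = {
    "defaultDuration": {"unit": "Day", "value": 0},
    "maxDuration": {"unit": "Day", "value": 30},
}
print(validate_historical_data_retrieval(sample, source_limit_days=30))
```

Note that a maxDuration of zero passes these checks, but per the last requirement it would not allow replaying SLOs on creation.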
You can also set up replaying SLOs immediately on creation with the Nobl9 Terraform provider.
For this, specify the value for retrieve_historical_data_from in your SLO definition.