OpenTSDB
OpenTSDB is a distributed, scalable Time Series Database (TSDB). OpenTSDB stores, indexes, and serves metrics collected from computer systems at a large scale, and makes this data easily accessible and suitable for graphing.
OpenTSDB parameters and supported features in Nobl9
- General support:
- Release channel: Stable, Beta
- Connection method: Agent
- Replay and SLI Analyzer: Not supported
- Event logs: Not supported
- Query checker: Not supported
- Query parameters retrieval: Not supported
- Timestamp cache persistence: Supported
- Query parameters:
- Query interval: 1 min
- Query delay: 1 min
- Jitter: 15 sec
- Timeout: 30 sec
- Agent details and minimum required versions for supported features:
- Plugin name: n9opentsdb
- Query delay environment variable: OPENTSDB_QUERY_DELAY
- Timestamp cache persistence: 0.65.0
Creating SLOs with OpenTSDB
Nobl9 Web
Follow the instructions below to create your SLOs with OpenTSDB in the Nobl9 Web:
- Navigate to Service Level Objectives.
- Click .
- In step 1 of the SLO wizard, select the service the SLO will be associated with.
- In step 2, select OpenTSDB as the Data Source for your SLO, then specify the Metric. You can choose either a Threshold Metric, where a single time series is evaluated against a threshold, or a Ratio Metric, which allows you to enter two time series to compare (for example, a count of good requests and total requests).
- Choose the Data Count Method for your ratio metric:
- Non-incremental: counts incoming metric values one by one, so the resulting SLO graph is spike-shaped.
- Incremental: counts incoming metric values incrementally, adding each new value to the previous ones, which results in a constantly increasing SLO graph.
- Enter a Metric Selector, or a Metric selector for good counter and a Metric selector for total counter, for the metric you selected. The following are query examples:
- Threshold metric for OpenTSDB:
Metric Selector: m=none:{{.N9RESOLUTION}}-avg-zero:transaction.duration{host=host.01}
- Ratio metric for OpenTSDB:
Metric selector for good counter: m=none:{{.N9RESOLUTION}}-count-zero:http.code{code=2xx}
Metric selector for total counter: m=none:{{.N9RESOLUTION}}-count-zero:http.code{type=http.status_code}
SLI values for good and total
When choosing the query for the ratio SLI (countMetrics), keep in mind that the values resulting from that query for both good and total:
- Must be positive.
- Should preferably be integers, although fractions are also acceptable.
- If fractions are used, should be larger than 1e-4 (0.0001).
- Shouldn't be larger than 1e+20.
- In step 3, define a Time Window for the SLO.
- Rolling time windows are better for tracking the recent user experience of a service.
- Calendar-aligned windows are best suited for SLOs that are intended to map to business metrics measured on a calendar-aligned basis, such as every calendar month or every quarter.
- In step 4, specify the Error Budget Calculation Method and your Objective(s).
- Occurrences method counts good attempts against the count of total attempts.
- Time Slices method measures how many good minutes were achieved (when a system operates within defined boundaries) during a time window.
- You can define up to 12 objectives for an SLO.
See the use case example and the SLO calculations guide for more information on the error budget calculation methods.
- In step 5, add the Display name, Name, and other settings for your SLO:
- Create a composite SLO
- Set notification on data, if this option is available for your data source.
When activated, Nobl9 notifies you if your SLO hasn't received data or received incomplete data for more than 15 minutes.
- Add alert policies, labels, and links, if required.
You can add up to 20 links per SLO.
- Click Create SLO.
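The two Data Count Methods and the limits on good and total values from step 2 can be illustrated with a small Python sketch. The per-minute counts are invented, `to_incremental` and `check_sli_value` are hypothetical helper names, and none of this reflects how Nobl9 computes SLIs internally; it only shows the shape of the data each method expects.

```python
from itertools import accumulate

# Hypothetical per-minute counts, invented purely for illustration.
good_per_minute = [98, 100, 95, 99]
total_per_minute = [100, 100, 100, 100]


def to_incremental(values):
    """Turn point-by-point values into a running total (never decreasing
    when the input values are non-negative)."""
    return list(accumulate(values))


def check_sli_value(value):
    """Return warnings for one good/total value, based on the limits in step 2."""
    warnings = []
    if value <= 0:
        warnings.append("must be positive")
    elif not float(value).is_integer() and value <= 1e-4:
        warnings.append("fractions should be larger than 1e-4")
    if value > 1e20:
        warnings.append("should not be larger than 1e+20")
    return warnings


# Non-incremental: each point stands alone, so the series moves up and down.
# Incremental: each point includes all previous ones, so the series only grows.
good_incremental = to_incremental(good_per_minute)
total_incremental = to_incremental(total_per_minute)

# Both representations describe the same good/total ratio for the window,
# assuming the incremental counters start from zero.
ratio_from_points = sum(good_per_minute) / sum(total_per_minute)
ratio_from_counters = good_incremental[-1] / total_incremental[-1]

print(good_incremental, total_incremental)
print(ratio_from_points, ratio_from_counters)
print(check_sli_value(0.00005))
```

Choosing the wrong method for a series (for example, marking a per-minute count as incremental) would make the same data mean something different, which is why the wizard asks for it explicitly.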
sloctl
Threshold (rawMetric):
apiVersion: n9/v1alpha
kind: SLO
metadata:
  name: api-server-slo
  displayName: API Server SLO
  project: default
  labels:
    area:
    - latency
    - slow-check
    env:
    - prod
    - dev
    region:
    - us
    - eu
    team:
    - green
    - sales
  annotations:
    area: latency
    env: prod
    region: us
    team: sales
spec:
  description: Example OpenTSDB SLO
  indicator:
    metricSource:
      name: open-t-s-d-b
      project: default
      kind: Agent
  budgetingMethod: Occurrences
  objectives:
  - displayName: Good response (200)
    value: 200
    name: ok
    target: 0.95
    rawMetric:
      query:
        opentsdb:
          query: >-
            start={{.BeginTime}}&end={{.EndTime}}&ms=true&m=none:{{.Resolution}}-avg-zero:transaction.duration{host=host.01}
    op: lte
    primary: true
  service: api-server
  timeWindows:
  - unit: Month
    count: 1
    isRolling: false
    calendar:
      startTime: 2022-12-01T00:00:00.000Z
      timeZone: UTC
  alertPolicies:
  - fast-burn-5x-for-last-10m
  attachments:
  - url: https://docs.nobl9.com
    displayName: Nobl9 Documentation
  anomalyConfig:
    noData:
      alertMethods:
      - name: slack-notification
        project: default

Ratio (countMetrics):
apiVersion: n9/v1alpha
kind: SLO
metadata:
  name: api-server-slo
  displayName: API Server SLO
  project: default
  labels:
    area:
    - latency
    - slow-check
    env:
    - prod
    - dev
    region:
    - us
    - eu
    team:
    - green
    - sales
  annotations:
    area: latency
    env: prod
    region: us
    team: sales
spec:
  description: Example OpenTSDB SLO
  indicator:
    metricSource:
      name: open-t-s-d-b
      project: default
      kind: Agent
  budgetingMethod: Occurrences
  objectives:
  - displayName: Good response (200)
    value: 1
    name: ok
    target: 0.95
    countMetrics:
      incremental: true
      good:
        opentsdb:
          query: >-
            start={{.BeginTime}}&end={{.EndTime}}&ms=true&m=none:{{.Resolution}}-count-zero:http.code{code=2xx}
      total:
        opentsdb:
          query: >-
            start={{.BeginTime}}&end={{.EndTime}}&ms=true&m=none:{{.Resolution}}-count-zero:http.code{type=http.status_code}
    primary: true
  service: api-server
  timeWindows:
  - unit: Month
    count: 1
    isRolling: false
    calendar:
      startTime: 2022-12-01T00:00:00.000Z
      timeZone: UTC
  alertPolicies:
  - fast-burn-5x-for-last-10m
  attachments:
  - url: https://docs.nobl9.com
    displayName: Nobl9 Documentation
  anomalyConfig:
    noData:
      alertMethods:
      - name: slack-notification
        project: default
Important notes:
The Nobl9 agent must have control over the queried time range, so the query must be limited to a specific time range. The Nobl9 agent must also have control over the time series resolution:
m=none:{{.N9RESOLUTION}}-p75:metric.name{tag.name_1=tag.tag_1}
In this query:
- {{.N9RESOLUTION}} is the mandatory placeholder replaced by the Nobl9 agent with the correct value
- p75 is the aggregation function that will be used (e.g., count, 99th percentile)
- metric.name is the target metric name
- {tag.name_1=tag.tag_1} is an optional key-value set parameter for additional filtering (e.g., host=cluster01)
Nobl9 also supports a list of TSUIDs that share a common metric instead of a query. For more details, refer to TSUIDs and UIDs.
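Because the resolution placeholder is mandatory, it can be useful to check a metric selector before pasting it into an SLO. The following Python sketch splits a selector into the parts described above and rejects one that lacks `{{.N9RESOLUTION}}`. The regular expression is a deliberately loose assumption about the selector shape, and `parse_selector` is a hypothetical helper, not part of Nobl9 or OpenTSDB.

```python
import re

# Loose, assumed pattern:
#   m=<aggregator>:{{.N9RESOLUTION}}-<function>[-<fill policy>]:<metric>[{tags}]
SELECTOR_RE = re.compile(
    r"^m=(?P<aggregator>[^:]+):"
    r"\{\{\.N9RESOLUTION\}\}-(?P<function>[a-z0-9]+)(?:-(?P<fill>[a-z]+))?:"
    r"(?P<metric>[^{]+)"
    r"(?:\{(?P<tags>[^}]*)\})?$"
)


def parse_selector(selector):
    """Split a metric selector into named parts.

    Raises ValueError when the selector does not match the assumed shape,
    which includes the case of a missing {{.N9RESOLUTION}} placeholder.
    """
    match = SELECTOR_RE.match(selector)
    if match is None:
        raise ValueError(f"selector does not match the expected shape: {selector!r}")
    return match.groupdict()


parts = parse_selector("m=none:{{.N9RESOLUTION}}-p75:metric.name{tag.name_1=tag.tag_1}")
print(parts["function"], parts["metric"], parts["tags"])
```

A selector with a hard-coded interval, such as `m=none:1m-avg:metric.name`, raises `ValueError` here because the agent would have nothing to substitute.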
Querying the OpenTSDB server
Nobl9 queries the OpenTSDB API once per minute and requests a resolution of 4, which yields 4 data points per minute. The start and end times, along with the specified query and resolution value, are passed into the API call.
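As a rough illustration of such a call, the Python sketch below fills the resolution placeholder and assembles a request URL for OpenTSDB's `/api/query` HTTP endpoint. It assumes that a resolution of 4 points per minute corresponds to a 15-second downsampling interval (`15s`), and the host name, timestamps, and helper names are hypothetical. The agent's real substitution logic is not described in this document.

```python
from urllib.parse import urlencode

PLACEHOLDER = "{{.N9RESOLUTION}}"


def downsample_interval(points_per_minute):
    """Assumed mapping: N points per minute -> an interval of 60/N seconds."""
    if points_per_minute <= 0 or 60 % points_per_minute != 0:
        raise ValueError("points_per_minute must divide 60 evenly")
    return f"{60 // points_per_minute}s"


def build_query_url(base_url, selector, start_ms, end_ms, points_per_minute=4):
    """Fill the placeholder and build an /api/query URL (sketch only)."""
    metric_query = selector.replace(PLACEHOLDER, downsample_interval(points_per_minute))
    # The selector is written as "m=<query>"; urlencode adds the key itself.
    if metric_query.startswith("m="):
        metric_query = metric_query[2:]
    params = {"start": start_ms, "end": end_ms, "ms": "true", "m": metric_query}
    return f"{base_url.rstrip('/')}/api/query?{urlencode(params)}"


# Hypothetical server address and a one-minute window in epoch milliseconds.
url = build_query_url(
    "http://opentsdb.example:4242",
    "m=none:{{.N9RESOLUTION}}-avg-zero:transaction.duration{host=host.01}",
    start_ms=1669852800000,
    end_ms=1669852860000,
)
print(url)
```

Keeping the interval out of the stored selector and filling it in at request time is what lets the caller, rather than the query author, decide the resolution.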