Splunk Observability On demand
Splunk Observability allows users to search, monitor, and analyze machine-generated big data. Splunk Observability facilitates collecting and monitoring metrics, logs, and traces from common data sources. Data collection and monitoring in one place ensure full-stack, end-to-end observability of the entire infrastructure.
Splunk Observability is different from Splunk Core, which powers Splunk Cloud / Enterprise and is Splunk's traditional log management solution. Nobl9 also integrates with Splunk Core through a different set of APIs.
The Splunk Observability integration with Nobl9 is available on demand. Fill in the request form to access it.
Authentication
Splunk Observability is a SaaS product, but you need to provide the URL that indicates your realm (region). For more details, refer to Realms in Endpoints | Splunk Observability documentation.
When deploying the Nobl9 agent for Splunk Observability, you must provide SPLUNK_OBSERVABILITY_ACCESS_TOKEN as an environment variable for authentication with the organization API Access Token (see Create an Access Token | Splunk Observability documentation). There is a placeholder for that value in the configuration obtained from the installation instructions on the Nobl9 Web (refer to the Agent configuration on the Nobl9 Web section).
Adding Splunk Observability Realm
Splunk Observability connection also requires entering your organization's Realm. Follow the instructions below to get the API endpoint for your Realm in Splunk:
- In your Splunk account, go to Settings > Profile.
- Go to the Endpoints section.
- Choose the URL from the API field.
- Access tokens are valid for 30 days.
- Customers can use Org tokens, which are valid for 5 years. Org tokens can also be used to generate session tokens.
- Sample access token for Splunk Observability: t4QJpMY1XLcECzm1c5Jb0A
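As a rough illustration of how the Realm relates to the API URL copied from the Endpoints section, the sketch below pulls the realm out of such a URL. It assumes the host follows the layout `api.<realm>.<domain>` (for example `https://api.us1.signalfx.com`); that layout and the helper name `realm_from_api_url` are assumptions made for this example, so check the value shown in your own profile.

```python
from urllib.parse import urlparse


def realm_from_api_url(api_url: str) -> str:
    """Return the realm segment of an API URL, assuming the host is api.<realm>.<domain>."""
    host = urlparse(api_url).hostname or ""
    parts = host.split(".")
    # Assumed layout: the first label is "api" and the second label is the realm.
    if len(parts) < 3 or parts[0] != "api" or not parts[1]:
        raise ValueError(f"cannot find a realm in {api_url!r}")
    return parts[1]


print(realm_from_api_url("https://api.us1.signalfx.com"))
```

The returned value (`us1` for the sample URL) is what goes into the Realm field or the `realm` key of the YAML configuration.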
Adding Splunk Observability as a data source
You can add the Splunk Observability data source using the direct or agent connection methods.
Direct connection method
Direct connection to Splunk Observability requires users to enter their credentials, which Nobl9 stores safely.
Nobl9 Web
To set up this type of connection:
- Navigate to Integrations > Sources.
- Click .
- Click the required Source icon.
- Choose Direct.
- Select one of the following Release Channels:
  - The stable channel is fully tested by the Nobl9 team. It represents the final product; however, this channel does not contain all the new features of a beta release. Use it to avoid crashes and other limitations.
  - The beta channel is under active development. Here, you can check out new features and improvements without the risk of affecting any viable SLOs. Remember that features in this channel can change.
- Enter your organization's Realm to connect your data source.
  Refer to the Authentication section above for more details.
- Enter the Access Token environment variable for authentication with the organization API Access Token.
  Refer to the Authentication section above for more details.
- Select a Project.
  Specifying a project is helpful when multiple users are spread across multiple teams or projects. When the Project field is left blank, Nobl9 uses the default project.
- Enter a Display Name.
  You can enter a user-friendly name with spaces in this field.
- Enter a Name.
  The name is mandatory and can only contain lowercase alphanumeric characters and dashes (for example, my-project-1). Nobl9 duplicates the display name here, transforming it into the supported format, but you can edit the result.
- Enter a Description.
  Here you can add details such as who is responsible for the integration (team/owner) and the purpose of creating it.
- Specify the Query delay to set a customized delay for queries when pulling the data from the data source.
  - The default value in Splunk Observability integration for Query delay is 5 minutes.
  info: Changing the Query delay may affect your SLI data. For more details, check the Query delay documentation.
- Click Add Data Source.
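The Name rule from the steps above (lowercase alphanumeric characters and dashes only) can be checked before you submit the form or apply YAML. The sketch below is an approximation: the text only states the allowed characters, so `suggest_name` is a guess at how a display name might be converted, not Nobl9's actual transformation, and both function names are hypothetical.

```python
import re


def is_valid_name(name: str) -> bool:
    """True if the name uses only lowercase letters, digits, and dashes."""
    return re.fullmatch(r"[a-z0-9-]+", name) is not None


def suggest_name(display_name: str) -> str:
    """Lowercase the display name and collapse every other character run into one dash."""
    dashed = re.sub(r"[^a-z0-9]+", "-", display_name.lower())
    return dashed.strip("-")


print(suggest_name("Splunk Observability direct"))
print(is_valid_name("my-project-1"))
```

Nobl9 may enforce further constraints (such as length limits) that are not covered here.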
sloctl
The YAML for setting up a direct connection to Splunk Observability looks like this:
apiVersion: n9/v1alpha
kind: Direct
metadata:
  name: splunk-observability-direct
  displayName: Splunk Observability direct
  project: splunk-observability-direct
spec:
  description: Direct integration with Splunk Observability
  sourceOf:
    - Metrics
    - Services
  releaseChannel: beta
  queryDelay:
    unit: Minute
    value: 720
  splunkObservability:
    realm: us1
    accessToken: example-access-token
Field | Type | Description |
---|---|---|
queryDelay.unit mandatory | enum | Specifies the unit for the query delay. Possible values: Second | Minute . • Check query delay documentation for default unit of query delay for each source. |
queryDelay.value mandatory | numeric | Specifies the value for the query delay. • Must be a number less than 1440 minutes (24 hours). • Check query delay documentation for default unit of query delay for each source. |
releaseChannel mandatory | enum | Specifies the release channel. Accepted values: beta | stable . |
Source-specific fields | ||
splunkObservability.realm mandatory | string | See realms in endpoints | Splunk Observability documentation for more details. |
splunkObservability.accessToken mandatory | string, secret | Environment variable used for authentication with the organization API Access Token. See authentication section above for more details. |
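The queryDelay constraints in the table (unit is Second or Minute, value is less than 1440 minutes) can be expressed as a small pre-flight check. This is only a sketch of the documented rules; the function name is hypothetical, and Nobl9 may apply additional validation (for instance a minimum delay) that the table does not mention.

```python
UNIT_SECONDS = {"Second": 1, "Minute": 60}
MAX_DELAY_SECONDS = 1440 * 60  # 1440 minutes, that is 24 hours


def validate_query_delay(unit: str, value: int) -> int:
    """Return the delay in seconds, or raise ValueError if it breaks a documented rule."""
    if unit not in UNIT_SECONDS:
        raise ValueError(f"unit must be one of {sorted(UNIT_SECONDS)}, got {unit!r}")
    seconds = value * UNIT_SECONDS[unit]
    if seconds >= MAX_DELAY_SECONDS:
        raise ValueError("query delay must be less than 1440 minutes (24 hours)")
    return seconds


# The sample YAML uses unit: Minute and value: 720.
print(validate_query_delay("Minute", 720))
```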
Agent connection method
Nobl9 Web
Follow the instructions below to configure your Splunk Observability agent.
- Navigate to Integrations > Sources.
- Click .
- Click the required Source icon.
- Choose Agent.
- Select one of the following Release Channels:
  - The stable channel is fully tested by the Nobl9 team. It represents the final product; however, this channel does not contain all the new features of a beta release. Use it to avoid crashes and other limitations.
  - The beta channel is under active development. Here, you can check out new features and improvements without the risk of affecting any viable SLOs. Remember that features in this channel can change.
- Enter your organization's Realm to connect your data source.
- Select a Project.
  Specifying a project is helpful when multiple users are spread across multiple teams or projects. When the Project field is left blank, Nobl9 uses the default project.
- Enter a Display Name.
  You can enter a user-friendly name with spaces in this field.
- Enter a Name.
  The name is mandatory and can only contain lowercase alphanumeric characters and dashes (for example, my-project-1). Nobl9 duplicates the display name here, transforming it into the supported format, but you can edit the result.
- Enter a Description.
  Here you can add details such as who is responsible for the integration (team/owner) and the purpose of creating it.
- Specify the Query delay to set a customized delay for queries when pulling the data from the data source.
  - The default value in Splunk Observability integration for Query delay is 5 minutes.
  info: Changing the Query delay may affect your SLI data. For more details, check the Query delay documentation.
- Click Add Data Source.
sloctl
The YAML for setting up an agent connection to Splunk Observability looks like this:
apiVersion: n9/v1alpha
kind: Agent
metadata:
  name: splunk-observability
  displayName: Splunk Observability
  project: splunk-observability
spec:
  description: Agent settings for Splunk Observability
  sourceOf:
    - Metrics
    - Services
  releaseChannel: beta
  queryDelay:
    unit: Minute
    value: 720
  splunkObservability:
    realm: us1
Field | Type | Description |
---|---|---|
queryDelay.unit mandatory | enum | Specifies the unit for the query delay. Possible values: Second | Minute . • Check query delay documentation for default unit of query delay for each source. |
queryDelay.value mandatory | numeric | Specifies the value for the query delay. • Must be a number less than 1440 minutes (24 hours). • Check query delay documentation for default unit of query delay for each source. |
logCollectionEnabled optional | boolean | Optional. Defaults to false . Set to true if you'd like your direct to collect event logs. Beta functionality available only through direct release channel. Reach out to support@nobl9.com to activate it. |
releaseChannel mandatory | enum | Specifies the release channel. Accepted values: beta | stable . |
Source-specific fields | ||
splunkObservability.realm mandatory | string | See realms in endpoints | Splunk Observability documentation for more details. |
You can deploy only one agent in one YAML file by using the sloctl apply command.
Agent deployment
When you add the data source, Nobl9 automatically generates a Kubernetes configuration and a Docker command line for you to use to deploy the agent. Both of these are available in the web UI, under the Agent Configuration section. Be sure to swap in your credentials (e.g., replace the <SPLUNK_OBSERVABILITY_ACCESS_TOKEN> with your organization key).
- Kubernetes
- Docker
If you use Kubernetes, you can apply the supplied YAML config file to a Kubernetes cluster to deploy the agent. It will look something like this:
# DISCLAIMER: This Deployment description contains only the fields necessary for the purpose of this demo.
# It is not a ready-to-apply k8s deployment description, and the client_id and client_secret are only example values.
apiVersion: v1
kind: Secret
metadata:
  name: nobl9-agent-nobl9-dev-dwq-ble
  namespace: default
type: Opaque
stringData:
  splunk_observability_access_token: "<SPLUNK_OBSERVABILITY_ACCESS_TOKEN>"
  client_id: "unique_client_id"
  client_secret: "unique_client_secret"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nobl9-agent-nobl9-dev-splunkobs-deployment
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      nobl9-agent-name: "splunkobs"
      nobl9-agent-project: "deployment"
      nobl9-agent-organization: "nobl9-dev"
  template:
    metadata:
      labels:
        nobl9-agent-name: "splunkobs"
        nobl9-agent-project: "deployment"
        nobl9-agent-organization: "nobl9-dev"
    spec:
      containers:
        - name: agent-container
          image: nobl9/agent:0.80.0
          resources:
            requests:
              memory: "350Mi"
              cpu: "0.1"
          env:
            # All three secretKeyRef entries point at the Secret defined above.
            - name: N9_CLIENT_ID
              valueFrom:
                secretKeyRef:
                  key: client_id
                  name: nobl9-agent-nobl9-dev-dwq-ble
            - name: N9_CLIENT_SECRET
              valueFrom:
                secretKeyRef:
                  key: client_secret
                  name: nobl9-agent-nobl9-dev-dwq-ble
            - name: SPLUNK_OBSERVABILITY_ACCESS_TOKEN
              valueFrom:
                secretKeyRef:
                  key: splunk_observability_access_token
                  name: nobl9-agent-nobl9-dev-dwq-ble
            # The N9_METRICS_PORT is a variable specifying the port to which the /metrics and /health endpoints are exposed.
            # The 9090 is the default value and can be changed.
            # If you don't want the metrics to be exposed, comment out or delete the N9_METRICS_PORT variable.
            - name: N9_METRICS_PORT
              value: "9090"
If you use Docker, you can run the Docker command to deploy the agent. It will look something like this:
# DISCLAIMER: This Docker command contains only the fields necessary for the purpose of this demo.
# It is not a ready-to-apply command, and you will need to replace the placeholder values with your own values.
# The N9_METRICS_PORT is a variable specifying the port to which the /metrics and /health endpoints are exposed.
# The 9090 is the default value and can be changed.
# If you don't want the metrics to be exposed, delete the N9_METRICS_PORT line.
docker run -d --restart on-failure \
  --name nobl9-agent-nobl9-dev-splunkobs_deployment \
  -e N9_CLIENT_ID="unique_client_id" \
  -e N9_CLIENT_SECRET="unique_client_secret" \
  -e N9_METRICS_PORT=9090 \
  -e SPLUNK_OBSERVABILITY_ACCESS_TOKEN="<SPLUNK_OBSERVABILITY_ACCESS_TOKEN>" \
  nobl9/agent:0.80.0
Log sampling for the Splunk Observability agent
The Splunk Observability agent features a log sampling mechanism to handle bursty log loads. It applies only to log messages about dropped redundant data points. Other messages are logged normally.
You can decide whether to use log sampling by setting the SPLUNK_OBSERVABILITY_DATA_POINT_LOG_SAMPLING_CONFIG environment variable. It's a JSON object with the following fields:
- JSON
- Kubernetes - logs
- Alternative config
{
  "burst": int,    // how many messages?
  "period": int,   // how often? (in seconds)
  "enabled": bool
}
Here's an example of Kubernetes deployment YAML with activated burst logs:
# DISCLAIMER: This Deployment description contains only the fields necessary for the purpose of this demo.
# It is not a ready-to-apply k8s deployment description, and the client_id and client_secret are only example values.
apiVersion: v1
kind: Secret
metadata:
  name: nobl9-agent-nobl9-dev-dwq-ble
  namespace: default
type: Opaque
stringData:
  splunk_observability_access_token: "<SPLUNK_OBSERVABILITY_ACCESS_TOKEN>"
  client_id: "unique_client_id"
  client_secret: "unique_client_secret"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nobl9-agent-nobl9-dev-splunkobs-deployment
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      nobl9-agent-name: "splunkobs"
      nobl9-agent-project: "deployment"
      nobl9-agent-organization: "nobl9-dev"
  template:
    metadata:
      labels:
        nobl9-agent-name: "splunkobs"
        nobl9-agent-project: "deployment"
        nobl9-agent-organization: "nobl9-dev"
    spec:
      containers:
        - name: agent-container
          image: nobl9/agent:0.80.0
          resources:
            requests:
              memory: "350Mi"
              cpu: "0.1"
          env:
            # All three secretKeyRef entries point at the Secret defined above.
            - name: N9_CLIENT_ID
              valueFrom:
                secretKeyRef:
                  key: client_id
                  name: nobl9-agent-nobl9-dev-dwq-ble
            - name: N9_CLIENT_SECRET
              valueFrom:
                secretKeyRef:
                  key: client_secret
                  name: nobl9-agent-nobl9-dev-dwq-ble
            - name: SPLUNK_OBSERVABILITY_ACCESS_TOKEN
              valueFrom:
                secretKeyRef:
                  key: splunk_observability_access_token
                  name: nobl9-agent-nobl9-dev-dwq-ble
            - name: N9_METRICS_PORT
              value: "9090"
            # Single-quoted YAML keeps backslashes literally, so the JSON is written without escapes.
            - name: SPLUNK_OBSERVABILITY_DATA_POINT_LOG_SAMPLING_CONFIG
              value: '{"burst": 3, "period": 120, "enabled": true}'
Here's an alternative way to activate burst logs via a YAML config:
- name: SPLUNK_OBSERVABILITY_DATA_POINT_LOG_SAMPLING_CONFIG
  value: '{"enabled":true}'
By default, .enabled is false, so agents don't use log sampling unless you turn it on.
If only .enabled is set to true, .burst defaults to 1 and .period defaults to 900, which is equivalent to logging 1 message every 15 minutes per organization.
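The defaults described above can be captured in a small helper that builds the JSON value for SPLUNK_OBSERVABILITY_DATA_POINT_LOG_SAMPLING_CONFIG. This is an illustrative sketch: the function name `sampling_config` is hypothetical, and the default arguments simply mirror the documented defaults (disabled, burst 1, period 900 seconds).

```python
import json


def sampling_config(enabled: bool = False, burst: int = 1, period: int = 900) -> str:
    """Serialize the log sampling settings as compact JSON for the environment variable."""
    settings = {"burst": burst, "period": period, "enabled": enabled}
    return json.dumps(settings, separators=(",", ":"))


# Only `enabled` is overridden, so burst and period keep their defaults:
# 1 message per 900 seconds (15 minutes).
print(sampling_config(enabled=True))
```

Generating the string with `json.dumps` avoids hand-written escaping mistakes when the value is pasted into a Kubernetes manifest or a Docker command.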
Here's an example of a configuration that logs up to 3 messages per 120 seconds per organization:
"{ \"burst\": 3, \"period\": 120, \"enabled\": true}"
Creating SLOs with Splunk Observability
Nobl9 Web
Follow the instructions below to create your SLOs with Splunk Observability in the UI:
- Navigate to Service Level Objectives.
- Click .
- In step 2, select Splunk Observability as the Data Source for your SLO, then specify the Metric. You can choose either a Threshold Metric, where a single time series is evaluated against a threshold, or a Ratio Metric, which allows you to enter two time series to compare (for example, a count of good requests and total requests).
  - Choose the Data Count Method for your ratio metric:
    - Non-incremental: counts incoming metric values one-by-one. So the resulting SLO graph is pike-shaped.
    - Incremental: counts the incoming metric values incrementally, adding every next value to previous values. It results in a constantly increasing SLO graph.
- Enter a Program (for the Threshold metric), or a Program for good counter and a Program for total counter (for the Ratio metric). The following are program examples:
  - Threshold metric for Splunk Observability:
    A = data('demo.trans.count', filter=filter('demo_datacenter', 'Tokyo'), rollup='rate').mean().publish(label='A', enable=False);
    B = data('demo.trans.count', filter=filter('demo_datacenter', 'Tokyo'), rollup='rate').stddev().publish(label='B', enable=False);
    C = (B/A).publish(label='C');
  - Ratio metric for Splunk Observability:
    Program for good counter:
    data('demo.trans.count', filter=filter('demo_datacenter', 'Tokyo'), rollup='rate').stddev().publish()
    Program for total counter:
    data('demo.trans.count', filter=filter('demo_datacenter', 'Tokyo'), rollup='rate').mean().publish()
  SLI values for good and total
  When choosing the query for the ratio SLI (countMetrics), keep in mind that the values resulting from that query for both good and total:
  - Must be positive.
  - While we recommend using integers, fractions are also acceptable.
  - If using fractions, we recommend them to be larger than 1e-4 = 0.0001.
  - Shouldn't be larger than 1e+20.
- In step 3, define a Time Window for the SLO.
  - Rolling time windows are better for tracking the recent user experience of a service.
  - Calendar-aligned windows are best suited for SLOs that are intended to map to business metrics measured on a calendar-aligned basis, such as every calendar month or every quarter.
- In step 4, specify the Error Budget Calculation Method and your Objective(s).
  - Occurrences method counts good attempts against the count of total attempts.
  - Time Slices method measures how many good minutes were achieved (when a system operates within defined boundaries) during a time window.
  - You can define up to 12 objectives for an SLO.
  See the use case example and the SLO calculations guide for more information on the error budget calculation methods.
- In step 5, add the Display name, Name, and other settings for your SLO:
  - Create a composite SLO
  - Set notification on data, if this option is available for your data source.
    When activated, Nobl9 notifies you if your SLO hasn't received data or received incomplete data for more than 15 minutes.
  - Add alert policies, labels, and links, if required.
    You can add up to 20 links per SLO.
- Click Create SLO.
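To make the two Data Count Method options from step 2 more concrete, the sketch below shows the same made-up series in both forms: per-interval values (non-incremental) and a running total (incremental). It only illustrates the shape of the data described above and is not a description of how Nobl9 processes it internally; the function names are hypothetical.

```python
from itertools import accumulate


def to_incremental(per_interval):
    """Running total: each value is added to the sum of all previous values."""
    return list(accumulate(per_interval))


def to_non_incremental(running_total):
    """Inverse operation: recover per-interval values from a running total."""
    if not running_total:
        return []
    deltas = [b - a for a, b in zip(running_total, running_total[1:])]
    return [running_total[0]] + deltas


per_interval = [3, 5, 2, 4]  # made-up counts per reporting interval
print(to_incremental(per_interval))  # [3, 8, 10, 14]
```

If your SignalFlow program already emits an ever-growing counter, Incremental is the matching choice; if it emits a fresh value per interval, Non-incremental matches.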
sloctl
- rawMetric
- countMetric
Here's an example of Splunk Observability using a rawMetric (threshold metric):
- apiVersion: n9/v1alpha
  kind: SLO
  metadata:
    name: tokyo-server-4-latency
    displayName: Server4 Latency [Tokyo]
    project: splunk-observability
  spec:
    description: Latency of Server4 in Tokyo region
    service: splunk-observability-demo-service
    indicator:
      metricSource:
        name: splunk-observability
    timeWindows:
      - unit: Day
        count: 1
        calendar:
          startTime: 2020-01-21 12:30:00
          timeZone: America/New_York
    budgetingMethod: Occurrences
    objectives:
      # The program is wrapped in double quotes because it contains single quotes.
      - displayName: Excellent
        op: lte
        rawMetric:
          query:
            splunkObservability:
              program: "data('demo.trans.count', filter=filter('demo_datacenter', 'Tokyo'), rollup='rate').mean().publish()"
        value: 200
        target: 0.8
      - displayName: Good
        op: lte
        rawMetric:
          query:
            splunkObservability:
              program: "data('demo.trans.count', filter=filter('demo_datacenter', 'Tokyo'), rollup='rate').mean().publish()"
        value: 250
        target: 0.9
      - displayName: Poor
        op: lte
        rawMetric:
          query:
            splunkObservability:
              program: "data('demo.trans.count', filter=filter('demo_datacenter', 'Tokyo'), rollup='rate').mean().publish()"
        value: 300
        target: 0.99
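As a simplified picture of how the three objectives in this example relate, the sketch below applies the `lte` operator and the Occurrences idea (good points divided by total points) to a handful of made-up values. It is an assumption-laden approximation for intuition only; Nobl9's real calculation is described in the SLO calculations guide, and the function name is hypothetical.

```python
def occurrences_reliability(points, threshold):
    """Share of data points that satisfy `value <= threshold` (the lte operator)."""
    if not points:
        raise ValueError("no data points to evaluate")
    good = sum(1 for value in points if value <= threshold)
    return good / len(points)


# Made-up sample values, not real Splunk Observability data.
samples = [120, 180, 210, 190, 260, 150, 170, 230, 140, 160]

# (displayName, threshold value, target) as in the example objectives.
objectives = [("Excellent", 200, 0.8), ("Good", 250, 0.9), ("Poor", 300, 0.99)]

for name, threshold, target in objectives:
    reliability = occurrences_reliability(samples, threshold)
    status = "met" if reliability >= target else "missed"
    print(f"{name}: {reliability:.2f} against target {target} -> {status}")
```

With these samples, the strictest threshold (200) is satisfied by 7 of 10 points, which falls short of its 0.8 target, while the looser thresholds meet theirs.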
Here's an example of Splunk Observability using a countMetric (ratio metric):
- apiVersion: n9/v1alpha
  kind: SLO
  metadata:
    displayName: Splunk Observability demo
    name: splunk-obs-demo
    project: splunk-observability
  spec:
    budgetingMethod: Occurrences
    indicator:
      metricSource:
        kind: Agent
        name: splunk-observability
        project: splunk-observability
    objectives:
      - countMetrics:
          incremental: false
          good:
            splunkObservability:
              program: data('demo.trans.count', filter=filter('demo_datacenter', 'Tokyo'),rollup='rate').stddev().publish()
          total:
            splunkObservability:
              program: data('demo.trans.count', filter=filter('demo_datacenter', 'Tokyo'), rollup='rate').mean().publish()
        displayName: Enough
        target: 0.5
        value: 1
    service: splunk-observability-demo-service
    timeWindows:
      - count: 1
        isRolling: true
        period:
          begin: "2021-05-05T10:39:55Z"
          end: "2021-05-05T11:39:55Z"
        unit: Hour
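The guidance on good and total values for ratio SLIs (positive, fractions preferably larger than 1e-4, nothing larger than 1e+20) can be turned into a quick sanity check for sample query results. The sketch below is illustrative; `check_sli_value` is a hypothetical helper, and it reports recommendations and hard limits alike as plain messages.

```python
def check_sli_value(value: float) -> list:
    """Return a list of problems found for one good or total value (empty if none)."""
    problems = []
    if value <= 0:
        problems.append("must be positive")
    elif value <= 1e-4:
        # Any positive value this small is necessarily a fraction.
        problems.append("fractions are recommended to be larger than 1e-4")
    if value > 1e20:
        problems.append("should not be larger than 1e+20")
    return problems


for sample in (250, 0.5, 0.00001, -3, 1e21):
    print(sample, check_sli_value(sample) or "ok")
```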
Important notes:
Metric specification from Splunk Observability has one field:
- program is a mandatory string that refers to a SignalFlow analytics program. Its search criteria must return exactly one time series, that is, only one key in the data map.
For query examples, check Signalflow: sample queries under the Useful links section.
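The "exactly one time series" requirement can be pictured as a check on the result of a program. The sketch below assumes, purely for illustration, that a result is a dictionary keyed by time series ID with a list of data points as each value; that shape and the name `single_series` are assumptions, not the actual Splunk Observability or Nobl9 data structures.

```python
def single_series(result: dict) -> list:
    """Return the only series in the result, or raise if there are zero or several."""
    if len(result) != 1:
        raise ValueError(
            f"program must return exactly one time series, got {len(result)}"
        )
    (points,) = result.values()
    return points


# Hypothetical result with one key: accepted.
print(single_series({"series-a": [1.0, 2.0, 3.0]}))
```

A program that publishes several streams, or one whose filter matches several dimensions without aggregation, would produce more than one key and fail this check.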
Querying the Splunk Observability server
Nobl9 queries Splunk Observability for 4 data points every minute, resulting in a 15-second resolution.
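The arithmetic behind that resolution is simple: 60 seconds divided by 4 points gives one point every 15 seconds. The sketch below spells it out; the exact alignment of the points within the minute (starting at second 0) is an assumption made for the example.

```python
def point_offsets(points_per_minute: int = 4) -> list:
    """Second offsets within a minute for evenly spaced data points."""
    resolution_seconds = 60 // points_per_minute
    return [i * resolution_seconds for i in range(points_per_minute)]


print(point_offsets())  # [0, 15, 30, 45]
```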
Splunk Observability API rate limits
You can control your resource usage using org token (Access Tokens) limits. For more information, refer to the Org token limits | Splunk Observability documentation and the System limits for Splunk Infrastructure Monitoring | Splunk Observability documentation.