Splunk
Splunk provides software for searching, monitoring, and analyzing machine-generated data via a Web-style interface. Splunk-Nobl9 integration allows users to enter their metrics using the Splunk Processing Language (SPL).
Scope of support
- Query parameters retrieval with `sloctl`
- SLI Analyzer
- Replay
- Event logs for direct connection method
Nobl9 does not support Splunk Enterprise instances that use self-signed certificates. If Splunk Enterprise is configured to use TLS, the Nobl9 agent requires it to pass certificate validation, which self-signed certificates do not.
Requirements
Splunk API Endpoint URL
To connect to the required Splunk instance, both the direct and agent connection methods require the API Endpoint URL to contain the following:
- `SPLUNK_BASE_URL`: the base URL configured during the deployment of Splunk software, for Splunk Enterprise.
- `PORT_NUMBER`: `:8089`, if the API is using the default port.
Ask your Splunk administrator for the API Token and correct URL for connecting.
This URL must point to the base API URL of the Splunk Search app.
Usually, the format is `{SPLUNK_BASE_URL}:{PORT_NUMBER}/services/`.
So, for example, your resulting API Endpoint URL can be `https://splunk.my-instance.com:8089/services/`.
Here's a quick checklist to avoid request failures:
- Splunk base URL: confirm it's correct with your Splunk administrator
- Port: `8089` by default, or your specific port
- `/services/`: ensure it's exactly like this
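The checklist above can be automated before you paste the URL into Nobl9. The following is a minimal sketch, not part of Nobl9 or Splunk tooling: the function names `build_api_endpoint_url` and `check_api_endpoint_url` are hypothetical, and the checks only mirror the format described here (scheme and host, an explicit port, and a path of exactly `/services/`).

```python
from urllib.parse import urlsplit

DEFAULT_SPLUNK_API_PORT = 8089  # default API port named in the checklist


def build_api_endpoint_url(base_url: str, port: int = DEFAULT_SPLUNK_API_PORT) -> str:
    """Compose {SPLUNK_BASE_URL}:{PORT_NUMBER}/services/ from a scheme://host base URL."""
    parts = urlsplit(base_url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError(f"expected a base URL such as https://host, got {base_url!r}")
    return f"{parts.scheme}://{parts.hostname}:{port}/services/"


def check_api_endpoint_url(url: str) -> list:
    """Return a list of checklist problems; an empty list means the format looks right."""
    problems = []
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        problems.append("URL needs an http(s) scheme and a host")
    try:
        port = parts.port  # raises ValueError when the port is not numeric
    except ValueError:
        port = None
    if port is None:
        problems.append("no valid explicit port (8089 is the default)")
    if parts.path != "/services/":
        problems.append("path must be exactly /services/")
    return problems
```

This only checks the shape of the URL. Whether the host and port are the right ones still has to be confirmed with your Splunk administrator.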
Authentication
Splunk agent deployment requires authentication. You can authenticate in any of the following ways:
- With the Splunk Search App REST API, using `SAML`.
  For this, pass your Splunk App Token with the `SPLUNK_APP_TOKEN` environment variable.
- By passing your token with a local config file under the `n9splunk` section.
  For example, create the `cfg.toml` file and specify your token as the `n9splunk` value:
  [n9splunk]
  application_token="YOUR_TOKEN"
  Likewise, you can use your username and password with the `app_user` and `app_password` keys.
- Using the `basic` authentication method.
  This requires passing your user credentials with the `SPLUNK_USER` and `SPLUNK_PASSWORD` environment variables at the agent startup.
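If you want to test credentials against the Splunk REST API yourself before deploying the agent, the sketch below shows how a client could turn the same environment variables into an `Authorization` header. It is an illustration only: `splunk_auth_header` is a hypothetical helper, the token-before-password precedence is an assumption, and none of this describes how the Nobl9 agent resolves credentials internally.

```python
import base64
import os
from typing import Mapping


def splunk_auth_header(env: Mapping[str, str] = os.environ) -> dict:
    """Build an Authorization header from the variables described above.

    Assumption: a token, when present, takes precedence over user/password.
    """
    token = env.get("SPLUNK_APP_TOKEN")
    if token:
        # Splunk authentication tokens are sent as a Bearer token.
        return {"Authorization": f"Bearer {token}"}
    user, password = env.get("SPLUNK_USER"), env.get("SPLUNK_PASSWORD")
    if user and password:
        # Standard HTTP basic authentication: base64("user:password").
        encoded = base64.b64encode(f"{user}:{password}".encode()).decode()
        return {"Authorization": f"Basic {encoded}"}
    raise KeyError("set SPLUNK_APP_TOKEN, or both SPLUNK_USER and SPLUNK_PASSWORD")
```

Passing a plain dictionary instead of `os.environ` makes the helper easy to exercise without touching real credentials.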
Minimum required permissions
Ensure the following permissions are set for the Nobl9 agent:
- The `search` capability
- Access to index
Alternatively, you can use a wildcard:
Adding Splunk as a data source
To ensure data transmission between Nobl9 and your data source,
it may be necessary to list Nobl9 IP addresses as trusted.
- 18.159.114.21
- 18.158.132.186
- 3.64.154.26
You can add the Splunk data source using the direct or agent connection methods.
Direct connection method
Direct configuration for Splunk requires users to enter their credentials, which Nobl9 stores safely.
Nobl9 Web
Follow these steps to set up a direct configuration:
- Navigate to Integrations > Sources.
- Click .
- Click the required Source icon.
- Choose Direct.
- Select one of the following Release Channels:
  - The `stable` channel is fully tested by the Nobl9 team. It represents the final product; however, this channel does not contain all the new features of a `beta` release. Use it to avoid crashes and other limitations.
  - The `beta` channel is under active development. Here, you can check out new features and improvements without the risk of affecting any viable SLOs. Remember that features in this channel can change.
- Specify the API Endpoint URL to connect to your required Splunk instance.
  Example URL: `https://splunk.example.com:8089/services/`. Make sure it doesn't contain any typos.
- Enter the Access Token generated from your Splunk instance (mandatory).
- Select a Project.
  Specifying a project is helpful when multiple users are spread across multiple teams or projects. When the Project field is left blank, Nobl9 uses the `default` project.
- Enter a Display Name.
  You can enter a user-friendly name with spaces in this field.
- Enter a Name.
  The name is mandatory and can only contain lowercase alphanumeric characters and dashes (for example, `my-project-1`). Nobl9 duplicates the display name here, transforming it into the supported format, but you can edit the result.
- Enter a Description.
  Here you can add details such as who is responsible for the integration (team/owner) and the purpose of creating it.
- Specify the Query delay to set a customized delay for queries when pulling the data from the data source.
  - The default value in Splunk integration for Query delay is `5 minutes`.
  Info: Changing the Query delay may affect your SLI data. For more details, check the Query delay documentation.
- Enter a Maximum Period for Historical Data Retrieval.
  - This value defines how far back in the past your data will be retrieved when replaying your SLO based on this data source.
  - The maximum period value depends on the data source.
    Find the maximum value for your data source.
  - A greater period can extend the loading time when creating an SLO.
  - The value must be a positive integer.
- Enter a Default Period for Historical Data Retrieval.
  - It is used by SLOs connected to this data source.
  - The value must be a positive integer or `0`.
  - By default, this value is set to 0. When you set it to `>0`, you will create SLOs with Replay.
- Click Add Data Source.
sloctl
The YAML for setting up a direct connection to Splunk looks like this:
apiVersion: n9/v1alpha
kind: Direct
metadata:
  name: splunk-direct
  displayName: Splunk direct
  project: splunk-direct
spec:
  description: Direct integration with Splunk
  sourceOf:
    - Metrics
    - Services
  releaseChannel: beta
  queryDelay:
    unit: Minute
    value: 720
  splunk:
    accessToken: ""
    url: "https://splunk.example.com"
  historicalDataRetrieval:
    maxDuration:
      value: 30
      unit: Day
    defaultDuration:
      value: 0
      unit: Day
Field | Type | Description |
---|---|---|
`queryDelay.unit` (mandatory) | enum | Specifies the unit for the query delay. Possible values: `Second`, `Minute`. • Check query delay documentation for default unit of query delay for each source. |
`queryDelay.value` (mandatory) | numeric | Specifies the value for the query delay. • Must be a number less than 1440 minutes (24 hours). • Check query delay documentation for default unit of query delay for each source. |
`logCollectionEnabled` (optional) | boolean | Optional. Defaults to `false`. Set to `true` if you'd like your direct to collect event logs. Beta functionality available only through direct release channel. Reach out to support@nobl9.com to activate it. |
`releaseChannel` (mandatory) | enum | Specifies the release channel. Accepted values: `beta`, `stable`. |
Source-specific fields | ||
`splunk.accessToken` (mandatory) | string, secret | Environment variable used for authentication with the Splunk Search App REST API. See authentication for more details. |
`splunk.url` (mandatory) | string | Base API URL of the Splunk Search app. See authentication for more details. |
Replay-related fields | ||
`historicalDataRetrieval` (optional) | n/a | Optional structure holding the Replay configuration. • Use only with supported sources. • If omitted, Nobl9 uses the default values of `value: 0` and `unit: Day` for `maxDuration` and `defaultDuration`. |
`maxDuration.value` (optional) | numeric | Specifies the maximum duration for historical data retrieval. Must be an integer ≥ `0`. See Replay documentation for values of max duration per data source. |
`maxDuration.unit` (optional) | enum | Specifies the unit for the maximum duration of historical data retrieval. Accepted values: `Minute`, `Hour`, `Day`. |
`defaultDuration.value` (optional) | numeric | Specifies the default duration for historical data retrieval. Must be an integer ≥ `0` and ≤ `maxDuration`. |
`defaultDuration.unit` (optional) | enum | Specifies the unit for the default duration of historical data retrieval. Accepted values: `Minute`, `Hour`, `Day`. |
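The constraints in the table can be checked locally before running `sloctl apply`. The sketch below is an illustration under stated assumptions, not Nobl9's own validation: `validate_source_spec` is a hypothetical helper that takes the `spec` section as an already-parsed dictionary, and comparing `defaultDuration` with `maxDuration` after converting both to minutes is an assumption about how mixed units would be handled.

```python
QUERY_DELAY_UNIT_SECONDS = {"Second": 1, "Minute": 60}
DURATION_UNIT_MINUTES = {"Minute": 1, "Hour": 60, "Day": 1440}
MAX_QUERY_DELAY_SECONDS = 1440 * 60  # query delay must stay below 24 hours


def validate_source_spec(spec: dict) -> list:
    """Return a list of rule violations for a Direct or Agent spec dictionary."""
    errors = []

    delay = spec.get("queryDelay", {})
    unit, value = delay.get("unit"), delay.get("value")
    if unit not in QUERY_DELAY_UNIT_SECONDS or not isinstance(value, (int, float)):
        errors.append("queryDelay needs a numeric value and unit Second or Minute")
    elif value * QUERY_DELAY_UNIT_SECONDS[unit] >= MAX_QUERY_DELAY_SECONDS:
        errors.append("queryDelay must be less than 1440 minutes")

    if spec.get("releaseChannel") not in ("beta", "stable"):
        errors.append("releaseChannel must be beta or stable")

    retrieval = spec.get("historicalDataRetrieval")
    if retrieval is not None:
        minutes = {}
        for key in ("maxDuration", "defaultDuration"):
            # Omitted durations fall back to value 0 and unit Day, as in the table.
            duration = retrieval.get(key, {"value": 0, "unit": "Day"})
            d_value, d_unit = duration.get("value"), duration.get("unit")
            if not isinstance(d_value, int) or d_value < 0 or d_unit not in DURATION_UNIT_MINUTES:
                errors.append(f"{key} needs an integer >= 0 and unit Minute, Hour, or Day")
            else:
                minutes[key] = d_value * DURATION_UNIT_MINUTES[d_unit]
        if len(minutes) == 2 and minutes["defaultDuration"] > minutes["maxDuration"]:
            errors.append("defaultDuration must not exceed maxDuration")

    return errors
```

An empty list means the values satisfy the rules listed in the table. The per-source maximum for `maxDuration` is not checked here because it is documented separately.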
Agent connection method
Nobl9 Web
Follow the instructions below to configure your Splunk agent.
- Navigate to Integrations > Sources.
- Click .
- Click the required Source icon.
- Choose Agent.
- Select one of the following Release Channels:
  - The `stable` channel is fully tested by the Nobl9 team. It represents the final product; however, this channel does not contain all the new features of a `beta` release. Use it to avoid crashes and other limitations.
  - The `beta` channel is under active development. Here, you can check out new features and improvements without the risk of affecting any viable SLOs. Remember that features in this channel can change.
- Specify the API Endpoint URL to connect to your required Splunk instance.
  Example URL: `https://splunk.example.com:8089/services/`. Make sure it doesn't contain any typos.
- Select a Project.
  Specifying a project is helpful when multiple users are spread across multiple teams or projects. When the Project field is left blank, Nobl9 uses the `default` project.
- Enter a Display Name.
  You can enter a user-friendly name with spaces in this field.
- Enter a Name.
  The name is mandatory and can only contain lowercase alphanumeric characters and dashes (for example, `my-project-1`). Nobl9 duplicates the display name here, transforming it into the supported format, but you can edit the result.
- Enter a Description.
  Here you can add details such as who is responsible for the integration (team/owner) and the purpose of creating it.
- Specify the Query delay to set a customized delay for queries when pulling the data from the data source.
  - The default value in Splunk integration for Query delay is `5 minutes`.
  Info: Changing the Query delay may affect your SLI data. For more details, check the Query delay documentation.
- Enter a Maximum Period for Historical Data Retrieval.
  - This value defines how far back in the past your data will be retrieved when replaying your SLO based on this data source.
  - The maximum period value depends on the data source.
    Find the maximum value for your data source.
  - A greater period can extend the loading time when creating an SLO.
  - The value must be a positive integer.
- Enter a Default Period for Historical Data Retrieval.
  - It is used by SLOs connected to this data source.
  - The value must be a positive integer or `0`.
  - By default, this value is set to 0. When you set it to `>0`, you will create SLOs with Replay.
- Click Add Data Source.
sloctl
The YAML for setting up an agent connection to Splunk looks like this:
apiVersion: n9/v1alpha
kind: Agent
metadata:
  name: splunk
  project: splunk
spec:
  sourceOf:
    - Metrics
    - Services
  releaseChannel: beta
  queryDelay:
    unit: Minute
    value: 720
  splunk:
    url: https://splunk.example.com
  historicalDataRetrieval:
    maxDuration:
      value: 30
      unit: Day
    defaultDuration:
      value: 0
      unit: Day
Field | Type | Description |
---|---|---|
`queryDelay.unit` (mandatory) | enum | Specifies the unit for the query delay. Possible values: `Second`, `Minute`. • Check query delay documentation for default unit of query delay for each source. |
`queryDelay.value` (mandatory) | numeric | Specifies the value for the query delay. • Must be a number less than 1440 minutes (24 hours). • Check query delay documentation for default unit of query delay for each source. |
`releaseChannel` (mandatory) | enum | Specifies the release channel. Accepted values: `beta`, `stable`. |
Source-specific fields | ||
`splunk.url` (mandatory) | string | Base API URL of the Splunk Search app. See authentication section above for more details. |
Replay-related fields | ||
`historicalDataRetrieval` (optional) | n/a | Optional structure holding the Replay configuration. • Use only with supported sources. • If omitted, Nobl9 uses the default values of `value: 0` and `unit: Day` for `maxDuration` and `defaultDuration`. |
`maxDuration.value` (optional) | numeric | Specifies the maximum duration for historical data retrieval. Must be an integer ≥ `0`. See Replay documentation for values of max duration per data source. |
`maxDuration.unit` (optional) | enum | Specifies the unit for the maximum duration of historical data retrieval. Accepted values: `Minute`, `Hour`, `Day`. |
`defaultDuration.value` (optional) | numeric | Specifies the default duration for historical data retrieval. Must be an integer ≥ `0` and ≤ `maxDuration`. |
`defaultDuration.unit` (optional) | enum | Specifies the unit for the default duration of historical data retrieval. Accepted values: `Minute`, `Hour`, `Day`. |
You can deploy only one agent in one YAML file by using the `sloctl apply` command.
Agent deployment
When you add the data source, Nobl9 automatically generates a Kubernetes configuration and a Docker command line for you to use to deploy the agent. Both of these are available in the web UI, under the Agent Configuration section. Be sure to swap in your credentials (e.g., replace `<SPLUNK_APP_TOKEN>` with your organization key).
- Kubernetes
- Docker
If you use Kubernetes, you can apply the supplied YAML config file to a Kubernetes cluster to deploy the agent. It will look something like this:
# DISCLAIMER: This deployment description contains only the fields necessary for the purpose of this demo.
# It is not a ready-to-apply k8s deployment description, and the client_id and client_secret are only exemplary values.
apiVersion: v1
kind: Secret
metadata:
  name: nobl9-agent-nobl9-dev-default-splunkagent
  namespace: default
type: Opaque
stringData:
  splunk_app_token: "<SPLUNK_APP_TOKEN>"
  client_id: "unique_client_id"
  client_secret: "unique_client_secret"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nobl9-agent-nobl9-dev-default-splunkagent
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      nobl9-agent-name: "splunkagent"
      nobl9-agent-project: "default"
      nobl9-agent-organization: "nobl9-dev"
  template:
    metadata:
      labels:
        nobl9-agent-name: "splunkagent"
        nobl9-agent-project: "default"
        nobl9-agent-organization: "nobl9-dev"
    spec:
      containers:
        - name: agent-container
          image: nobl9/agent:0.80.0
          resources:
            requests:
              memory: "350Mi"
              cpu: "0.1"
          env:
            - name: N9_CLIENT_ID
              valueFrom:
                secretKeyRef:
                  key: client_id
                  name: nobl9-agent-nobl9-dev-default-splunkagent
            - name: N9_CLIENT_SECRET
              valueFrom:
                secretKeyRef:
                  key: client_secret
                  name: nobl9-agent-nobl9-dev-default-splunkagent
            - name: SPLUNK_APP_TOKEN
              valueFrom:
                secretKeyRef:
                  key: splunk_app_token
                  name: nobl9-agent-nobl9-dev-default-splunkagent
            - name: SPLUNK_USER
              valueFrom:
                secretKeyRef:
                  key: splunk_user
                  name: nobl9-agent-nobl9-dev-default-splunkagent
            - name: SPLUNK_PASSWORD
              valueFrom:
                secretKeyRef:
                  key: splunk_password
                  name: nobl9-agent-nobl9-dev-default-splunkagent
            # The N9_METRICS_PORT is a variable specifying the port to which the /metrics and /health endpoints are exposed.
            # The 9090 is the default value and can be changed.
            # If you don't want the metrics to be exposed, comment out or delete the N9_METRICS_PORT variable.
            - name: N9_METRICS_PORT
              value: "9090"
If you use Docker, you can run the Docker command to deploy the agent. It will look something like this:
# DISCLAIMER: This Docker command contains only the fields necessary for the purpose of this demo.
# It is not a ready-to-apply command, and you will need to replace the placeholder values with your own values.
# The N9_METRICS_PORT is a variable specifying the port to which the /metrics and /health endpoints are exposed.
# The 9090 is the default value and can be changed.
# If you don't want the metrics to be exposed, delete the N9_METRICS_PORT line.
# (Comments are kept above the command because a comment line inside a backslash-continued command would cut it short.)
docker run -d --restart on-failure \
  --name nobl9-agent-nobl9-dev-default-splunkagent \
  -e N9_CLIENT_ID="unique_client_id" \
  -e N9_CLIENT_SECRET="unique_client_secret" \
  -e N9_METRICS_PORT=9090 \
  -e SPLUNK_APP_TOKEN="<SPLUNK_APP_TOKEN>" \
  -e SPLUNK_USER="<SPLUNK_USERNAME>" \
  -e SPLUNK_PASSWORD="<SPLUNK_PASSWORD>" \
  nobl9/agent:0.80.0
With agent version `0.65.3`, we've introduced support for the following environment variables in the Splunk integration:
- name: N9_SPLUNK_COLLECTION_JITTER
  value: "15s" # Default: 15s
- name: N9_SPLUNK_QUERY_INTERVAL
  value: "1m" # Default: 1m
- name: N9_SPLUNK_HTTP_CLIENT_TIMEOUT
  value: "15s" # Default: 15s
Creating SLOs with Splunk
Nobl9 Web
- Navigate to Service Level Objectives.
- Click .
- In step 1 of the SLO wizard, select the Service the SLO will be associated with.
- In step 2, select Splunk as the Data Source for your SLO, then specify the Metric. You can choose either a Threshold Metric, where a single time series is evaluated against a threshold, or a Ratio Metric, which allows you to enter two time series to compare (for example, a count of good requests and total requests).
  - Choose the Data Count Method for your ratio metric:
    - Non-incremental: counts incoming metric values one by one, so the resulting SLO graph is spike-shaped.
    - Incremental: counts the incoming metric values incrementally, adding every next value to the previous values. It results in a constantly increasing SLO graph.
- Enter a Query, Good query, or Total query for the metric you selected:
  - Every query must return `n9time` and `n9value` fields.
    - The `n9time` field must be a Unix timestamp, and the `n9value` field must be a float value.
  - Nobl9 validates the query provided by the user against 3 rules:
    - The query contains `index=` with a value.
    - The query contains the `n9time` value.
    - The query contains an `n9value` value.
  - Every time range of the dataset is segmented into 15-second chunks and aggregated. The aggregation is as follows:
    - Raw metric: calculates the average.
    - Count metric incremental: takes the max value.
    - Count metric non-incremental: the sum of values.
  SLI values for good and total
  When choosing the query for the ratio SLI (`countMetrics`), keep in mind that the values resulting from that query for both good and total:
  - Must be positive.
  - While we recommend using integers, fractions are also acceptable.
  - If using fractions, we recommend them to be larger than `1e-4` = `0.0001`.
  - Shouldn't be larger than `1e+20`.
- In step 3, define a Time Window for the SLO.
  - Rolling time windows are better for tracking the recent user experience of a service.
  - Calendar-aligned windows are best suited for SLOs that are intended to map to business metrics measured on a calendar-aligned basis, such as every calendar month or every quarter.
- In step 4, specify the Error Budget Calculation Method and your Objective(s).
  - The Occurrences method counts good attempts against the count of total attempts.
  - The Time Slices method measures how many good minutes were achieved (when a system operates within defined boundaries) during a time window.
  - You can define up to 12 objectives for an SLO.
  See the use case example and the SLO calculations guide for more information on the error budget calculation methods.
- In step 5, add the Display name, Name, and other settings for your SLO:
  - Create a composite SLO.
  - Set notification on data, if this option is available for your data source.
    When activated, Nobl9 notifies you if your SLO hasn't received data or received incomplete data for more than 15 minutes.
  - Add alert policies, labels, and links, if required.
    You can add up to 20 links per SLO.
- Click Create SLO.
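To make the 15-second aggregation from step 2 concrete, here is a minimal sketch of the three rules (average for raw metrics, max for incremental counts, sum for non-incremental counts). It is an illustration only: the `aggregate` function and the kind names are hypothetical, and aligning chunks to multiples of 15 seconds since the Unix epoch is an assumption, since the text does not say where chunk boundaries fall.

```python
from collections import defaultdict
from statistics import mean

CHUNK_SECONDS = 15

# One reducer per metric kind, following the aggregation rules in step 2.
AGGREGATORS = {
    "raw": mean,                   # raw metric: average
    "count_incremental": max,      # incremental count metric: max value
    "count_non_incremental": sum,  # non-incremental count metric: sum of values
}


def aggregate(points, kind):
    """Group (n9time, n9value) pairs into 15-second chunks and reduce each chunk."""
    buckets = defaultdict(list)
    for n9time, n9value in points:
        chunk_start = int(n9time // CHUNK_SECONDS) * CHUNK_SECONDS
        buckets[chunk_start].append(float(n9value))
    reduce_chunk = AGGREGATORS[kind]
    return [(start, reduce_chunk(values)) for start, values in sorted(buckets.items())]


points = [(0, 1.0), (5, 3.0), (15, 10.0), (29, 20.0)]
print(aggregate(points, "raw"))                    # [(0, 2.0), (15, 15.0)]
print(aggregate(points, "count_incremental"))      # [(0, 3.0), (15, 20.0)]
print(aggregate(points, "count_non_incremental"))  # [(0, 4.0), (15, 30.0)]
```

The same four points produce three different series, which is why the Data Count Method you pick for a ratio metric has to match how your query reports counts.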
sloctl
- rawMetric
- countMetric
Here's an example of Splunk using a `rawMetric` (threshold metric):
- apiVersion: n9/v1alpha
  kind: SLO
  metadata:
    name: splunk-raw-rolling
    project: splunk
  spec:
    service: splunk-service
    indicator:
      metricSource:
        name: splunk
    timeWindows:
      - unit: Day
        count: 7
        isRolling: true
    budgetingMethod: Occurrences
    objectives:
      - displayName: Good
        op: lte
        rawMetric:
          query:
            splunk:
              query: index=* source=udp:5072 sourcetype=syslog status<400 | bucket _time span=1m | stats avg(response_time) as n9value by _time | rename _time as n9time | fields n9time n9value
        value: 0.25
        target: 0.50
      - displayName: Moderate
        op: lte
        rawMetric:
          query:
            splunk:
              query: index=* source=udp:5072 sourcetype=syslog status<400 | bucket _time span=1m | stats avg(response_time) as n9value by _time | rename _time as n9time | fields n9time n9value
        value: 0.5
        target: 0.75
      - displayName: Annoying
        op: lte
        rawMetric:
          query:
            splunk:
              query: index=* source=udp:5072 sourcetype=syslog status<400 | bucket _time span=1m | stats avg(response_time) as n9value by _time | rename _time as n9time | fields n9time n9value
        value: 1.0
        target: 0.95
---
- apiVersion: n9/v1alpha
  kind: SLO
  metadata:
    name: splunk-raw-calendar
    project: splunk
  spec:
    service: splunk-service
    indicator:
      metricSource:
        name: splunk
    timeWindows:
      - unit: Day
        count: 7
        calendar:
          startTime: 2020-03-09 00:00:00
          timeZone: Europe/Warsaw
    budgetingMethod: Occurrences
    objectives:
      - displayName: Good
        op: lte
        rawMetric:
          query:
            splunk:
              query: index=* source=udp:5072 sourcetype=syslog status<400 | bucket _time span=1m | stats avg(response_time) as n9value by _time | rename _time as n9time | fields n9time n9value
        value: 0.25
        target: 0.50
      - displayName: Moderate
        op: lte
        rawMetric:
          query:
            splunk:
              query: index=* source=udp:5072 sourcetype=syslog status<400 | bucket _time span=1m | stats avg(response_time) as n9value by _time | rename _time as n9time | fields n9time n9value
        value: 0.5
        target: 0.75
      - displayName: Annoying
        op: lte
        rawMetric:
          query:
            splunk:
              query: index=* source=udp:5072 sourcetype=syslog status<400 | bucket _time span=1m | stats avg(response_time) as n9value by _time | rename _time as n9time | fields n9time n9value
        value: 1.0
        target: 0.95
Here's an example of Splunk using a `countMetric` (ratio metric):
- apiVersion: n9/v1alpha
  kind: SLO
  metadata:
    name: splunk-counts-rolling
    project: splunk
  spec:
    service: splunk-service
    indicator:
      metricSource:
        kind: Agent
        name: splunk
        project: splunk
    timeWindows:
      - unit: Hour
        count: 1
        isRolling: true
    budgetingMethod: Occurrences
    objectives:
      - displayName: Poor
        target: 0.50
        countMetrics:
          incremental: false
          good:
            splunk:
              query: index=* source=udp:5072 sourcetype=syslog status<400 | bucket _time span=1m | stats count as n9value by _time | rename _time as n9time | fields n9time n9value
          total:
            splunk:
              query: index=* source=udp:5072 sourcetype=syslog | bucket _time span=1m | stats count as n9value by _time | rename _time as n9time | fields n9time n9value
- apiVersion: n9/v1alpha
  kind: SLO
  metadata:
    name: splunk-counts-calendar
    project: splunk
  spec:
    service: splunk-service
    indicator:
      metricSource:
        kind: Agent
        name: splunk
        project: splunk
    timeWindows:
      - unit: Day
        count: 1
        calendar:
          startTime: 2021-04-09 00:00:00
          timeZone: Europe/Warsaw
    budgetingMethod: Occurrences
    objectives:
      - displayName: So so
        target: 0.80
        countMetrics:
          incremental: false
          good:
            splunk:
              query: index=* source=udp:5072 sourcetype=syslog status<400 | bucket _time span=1m | stats count as n9value by _time | rename _time as n9time | fields n9time n9value
          total:
            splunk:
              query: index=* source=udp:5072 sourcetype=syslog | bucket _time span=1m | stats count as n9value by _time | rename _time as n9time | fields n9time n9value
Splunk queries require:
- Defining an `index` attribute (`"index=index_name"`) to avoid long-running queries.
  - The query can retrieve data from both the Events and Metrics indexes.
  - You can retrieve Metrics data by using the `| mstats` command.
  - To retrieve data from the Events and Metrics indexes, you must enter an SPL query and select a proper index: `index=_metrics` or `index=_events`, where `_metrics` is the name of the metrics index, and `_events` is the name of the events index. For more information on the SPL query, refer to the About the Search Language | Splunk documentation.
  - Query example for the Events index:
    search index=_events sourcetype=syslog status<400
    | bucket _time span=1m
    | stats count as n9value by _time
    | rename _time as n9time
    | fields n9time n9value
  - Query example for the Metrics index:
    | mstats avg("my.metric") as n9value WHERE index=_metrics span=15s
    | rename _time as n9time
    | fields n9time n9value
- A return value for `n9time` and `n9value`.
  Use Splunk field extractions to return values using those exact names. The `n9time` is the actual time, and the `n9value` is the metric value. The `n9time` must be a Unix timestamp and the `n9value` must be a float value.
  - Example:
    index=myserver-events source=udp:5072 sourcetype=syslog response_time>0
    | rename _time as n9time, response_time as n9value
    | fields n9time n9value
    Typically, you will rename `_time` to `n9time` and then rename the field containing the metric value (`response_time` in the previous example) to `n9value`. The following is the appendage to your normal query that handles this:
    | rename _time as n9time, response_time as n9value
    | fields n9time n9value
  - The Splunk query will be executed once every minute, returning the values found in the fields `n9time` and `n9value`. Ensure your hardware can support the query frequency.
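The query requirements above (an `index=` with a value, plus `n9time` and `n9value`) can be approximated with a local pre-check before you save an SLO. This is a rough sketch, not Nobl9's validator: `validate_query` is a hypothetical helper, and the regular expressions only look for the tokens in the query text without parsing SPL.

```python
import re

# "index=" followed by a non-empty value such as *, main, or "main".
INDEX_RE = re.compile(r'\bindex\s*=\s*"?[^\s"|]+')


def validate_query(query: str) -> list:
    """Return the requirements a query appears to miss; an empty list means none."""
    problems = []
    if not INDEX_RE.search(query):
        problems.append("query must contain index= with a value")
    for field in ("n9time", "n9value"):
        if not re.search(rf"\b{field}\b", query):
            problems.append(f"query must contain {field}")
    return problems


good = (
    "index=* source=udp:5072 sourcetype=syslog status<400 "
    "| bucket _time span=1m | stats count as n9value by _time "
    "| rename _time as n9time | fields n9time n9value"
)
print(validate_query(good))  # []
```

Because it is purely textual, the check can pass a query that still fails in Splunk, for example when `n9value` is named but never assigned a numeric value.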
Querying Splunk server
The Nobl9 agent leverages Splunk Enterprise API parameters. It pulls data at a per-minute interval from the Splunk server.
API rate limits for the Nobl9 agent
Splunk Enterprise API rate limits are configured by its administrators.
Rate limits must be high enough to accommodate searches from the Nobl9 agent.
The Nobl9 agent makes one query per minute per unique query.
Read more in Maximum and actual search concurrency calculations | Splunk community.
For the best results, the number of concurrent searches must be about the same as the number of SLIs you have for this data source.
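As a back-of-the-envelope aid for sizing rate limits, the sketch below counts how many searches per minute a set of SLI queries would generate under the one-query-per-minute-per-unique-query rule. `searches_per_minute` is a hypothetical helper, and treating queries that differ only in whitespace as the same query is an assumption, not documented agent behavior.

```python
def searches_per_minute(queries) -> int:
    """Estimate searches per minute: each unique query runs once per minute."""
    # Assumption: queries differing only in whitespace count as one query.
    return len({" ".join(query.split()) for query in queries})


sli_queries = [
    "index=web status<400 | stats count as n9value by _time | rename _time as n9time",
    "index=web | stats count as n9value by _time | rename _time as n9time",
    "index=web | stats count as n9value by _time | rename _time as n9time",
]
print(searches_per_minute(sli_queries))  # 2
```

Compare the result with the concurrency limits your Splunk administrator has configured for the user the agent runs as.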
Number of events returned from Splunk queries
The supported `search` SPL command searches within indexed events. The total number of events can be large, and a query without specific conditions, such as `search sourcetype=*`, returns all indexed events. A large number of data points sent to Nobl9 could disrupt the system's performance. Therefore, there is a hard limit of 4 events per minute.
File-based queries and Splunk disk quota
If you're using file-based queries (the `inputlookup` function) instead of index-based queries, your query might not work as expected. Due to the difference in jitter configuration between Splunk and Nobl9, you might need to increase your Splunk disk quota for the `inputlookup` function to work properly.
To determine the appropriate disk quota size for your Splunk account, we recommend the following steps:
- Go to the Splunk UI and navigate to Activity > Jobs.
- Filter the logs by the user you currently use in Nobl9 App.
- Execute requests at 29-second intervals to gather all logs from the corresponding cycle.
- Sum the sizes of all requests from the list to determine the minimum disk quota. It is important to add a buffer to this number for safety.
- Once you have created more SLOs, adjust the disk quota accordingly.
We suggest increasing the quota to 2GB to resolve the issue. However, it's important to note that the final disk quota size will depend on the data being queried.
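The sizing steps above boil down to summing the job sizes for one cycle and adding a safety buffer. A minimal sketch of that arithmetic follows. `minimum_disk_quota_mb` is a hypothetical helper, the 20% default buffer is an assumption rather than a Nobl9 or Splunk recommendation, and the job sizes are made-up sample values.

```python
def minimum_disk_quota_mb(job_sizes_mb, buffer_ratio: float = 0.2) -> float:
    """Sum the job sizes from Activity > Jobs and add a safety buffer."""
    if buffer_ratio < 0:
        raise ValueError("buffer_ratio must not be negative")
    return sum(job_sizes_mb) * (1 + buffer_ratio)


# Sample sizes (in MB) of the jobs listed for the Nobl9 user in one cycle.
print(minimum_disk_quota_mb([100, 250, 150], buffer_ratio=0.5))  # 750.0
```

Re-run the estimate after adding SLOs, since every new query contributes its own jobs to the cycle.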
Known limitations
Query limitations:
- Within the `search` command, the Time Range Modifiers `earliest` and `latest` are not allowed (see Time Range Modifiers | Splunk documentation).
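A simple text scan can catch the disallowed modifiers before a query reaches Nobl9. The sketch below is an approximation under stated assumptions: `find_time_range_modifiers` is a hypothetical helper, and because it matches text rather than parsing SPL, it can also flag `earliest=` or `latest=` appearing inside a quoted string.

```python
import re

# Matches "earliest=" or "latest=" used as standalone modifiers.
TIME_MODIFIER_RE = re.compile(r"\b(earliest|latest)\s*=", re.IGNORECASE)


def find_time_range_modifiers(query: str) -> list:
    """Return the disallowed time range modifiers found in the query text."""
    return sorted({match.group(1).lower() for match in TIME_MODIFIER_RE.finditer(query)})


print(find_time_range_modifiers("index=main earliest=-5m latest=now | stats count as n9value"))
# ['earliest', 'latest']
```

An empty list means neither modifier was found, so the query does not trip this particular limitation.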