
Splunk


Splunk provides software for searching, monitoring, and analyzing machine-generated data via a web-style interface. The Splunk integration with Nobl9 allows users to enter their metrics using the Search Processing Language (SPL).

Scope of support

Nobl9 does not support self-signed certificates for Splunk Enterprise. If Splunk Enterprise is configured to use TLS, the Nobl9 agent requires the endpoint to pass certificate validation, which self-signed certificates cannot do.
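To check up front whether your Splunk endpoint presents a certificate the agent can validate, you can attempt a verified TLS handshake. The sketch below uses Python's standard library; `splunk.example.com` and the port are placeholders for your own deployment:

```python
import socket
import ssl

def tls_cert_validates(host: str, port: int = 8089, timeout: float = 5.0) -> bool:
    """Return True if the host presents a certificate that passes standard
    CA validation -- self-signed certificates fail this check."""
    # create_default_context() enables CERT_REQUIRED and hostname checking,
    # which mirrors the validation the agent performs for TLS endpoints.
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host):
                return True
    except ssl.SSLCertVerificationError:
        return False

# Example (placeholder host):
# tls_cert_validates("splunk.example.com")  # False if the cert is self-signed
```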

Authentication

The agent configuration for Splunk accepts only a single parameter: url. The url must point to the base API URL of the Splunk Search app. It usually takes the form {SPLUNK_BASE_URL}:{PORT_NUMBER}/services, where:

  • SPLUNK_BASE_URL - for Splunk Enterprise, the base URL is configured during the deployment of Splunk software.

  • PORT_NUMBER - the API uses port 8089 by default. We recommend contacting your Splunk Admin to get your API token and to verify the correct URL to connect to.
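For illustration, the url value can be assembled from those two pieces. This is a trivial sketch, with splunk.example.com standing in for your own base URL:

```python
def splunk_api_url(base_url: str, port: int = 8089) -> str:
    """Build the base API URL of the Splunk Search app:
    {SPLUNK_BASE_URL}:{PORT_NUMBER}/services"""
    return f"{base_url.rstrip('/')}:{port}/services"

print(splunk_api_url("https://splunk.example.com"))
# https://splunk.example.com:8089/services
```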

When deploying the agent for Splunk, you can use one of the following authentication methods:

  • SAML: provide SPLUNK_APP_TOKEN as an environment variable for authentication with the Splunk Search App REST API.
  • basic: the agent requires SPLUNK_USER and SPLUNK_PASSWORD passed as environment variables during agent startup.

For more details, refer to the How to Obtain Value for SPLUNK_APP_TOKEN | Splunk documentation.

tip

Alternatively, you can pass the token in a local config file using the key application_token under the n9splunk section. You can provide your username and password the same way, using the keys app_user and app_password.
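For illustration, a config file carrying these keys could look like the sketch below. The exact file format and location depend on your agent setup, so treat this fragment as an assumption and check the agent's configuration reference:

```toml
# Hypothetical sketch of the n9splunk section -- verify the exact
# format against the Nobl9 agent configuration reference.
[n9splunk]
application_token = "<SPLUNK_APP_TOKEN>"
# or, for basic auth:
app_user = "<SPLUNK_USER>"
app_password = "<SPLUNK_PASSWORD>"
```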

Adding Splunk as a data source

To add Splunk as a data source in Nobl9 using the agent or direct connection method, follow these steps:

  1. Navigate to Integrations > Sources.
  2. Click the add button.
  3. Click the relevant Source icon.
  4. Choose a relevant connection method (Agent or Direct), then configure the source as described below.

Splunk direct

Direct configuration in the UI

Direct configuration for Splunk requires users to enter their credentials, which Nobl9 stores safely. Follow these steps to set up a direct configuration:

caution

To access your Splunk Cloud Platform deployment using the Splunk REST API, you must submit a case requesting access through the Splunk Support Portal. Splunk direct opens port 8089 for REST access. You can specify a range of IP addresses to control who can access the REST API.

You will need to allow the following IPs:

  • 18.159.114.21
  • 18.158.132.186
  • 3.64.154.26

  1. Select one of the following Release Channels:
    • The stable channel is fully tested by the Nobl9 team. It represents the final product; however, this channel does not contain all the new features of a beta release. Use it to avoid crashes and other limitations.
    • The beta channel is under active development. Here, you can check out new features and improvements without the risk of affecting any viable SLOs. Remember that features in this channel may be subject to change.
  2. Enter an API Endpoint URL to connect to your data source (mandatory).
    Refer to the Authentication section above for more details.

  3. Enter the Access Token generated from your Splunk instance (mandatory).
    Refer to the Authentication section above for more details.

  4. Select a Project.
    Specifying a project is helpful when multiple users are spread across multiple teams or projects. When the Project field is left blank, the object is assigned to the default project.
  5. Enter a Display Name.
    You can enter a friendly name with spaces in this field.
  6. Enter a Name.
    The name is mandatory and can contain only lowercase alphanumeric characters and dashes (for example, my-project-name). This field is populated automatically when you enter a display name, but you can edit the result.
  7. Enter a Description.
    Here you can add details such as who is responsible for the integration (team/owner) and the purpose of creating it.
  8. Specify the Query delay to set a customized delay for queries when pulling data from the data source.
    • The default Query delay for the Splunk integration is 5 minutes.
    info
    Changing the Query delay may affect your SLI data. For more details, check the Query delay documentation.
  9. Enter a Maximum Period for Historical Data Retrieval.
    • This value defines how far back in the past your data will be retrieved.
    • The maximum period of data retrieval depends on the data source. Check the Replay documentation for details.
    • A longer period can extend the loading time when creating an SLO.
    • The value must be a positive integer.
  10. Enter a Default Period for Historical Data Retrieval.
    • It is used by SLOs connected to this data source.
    • The value must be a positive integer or 0.
    • By default, this value is set to 0. When you set it to a value greater than 0, you will create SLOs with Replay.
  11. Click Add Data Source.

Direct using CLI - YAML

The YAML for setting up a direct connection to Splunk looks like this:

apiVersion: n9/v1alpha
kind: Direct
metadata:
  name: splunk-direct
  displayName: Splunk direct
  project: splunk-direct
spec:
  description: Direct integration with Splunk
  sourceOf:
    - Metrics
    - Services
  releaseChannel: beta # string, one of: beta || stable
  queryDelay:
    unit: Minute # string, one of: Second || Minute
    value: 720 # numeric, must be a number less than 1440 minutes (24 hours)
  splunk:
    accessToken: "" # secret
    url: "https://splunk.example.com"
  historicalDataRetrieval:
    maxDuration:
      value: 30 # integer greater than or equal to 0
      unit: Day # accepted values: Minute, Hour, Day
    defaultDuration: # value must be less than or equal to value of maxDuration
      value: 0 # integer greater than or equal to 0
      unit: Day # accepted values: Minute, Hour, Day

Important notes:

  • accessToken - required; used for authentication with the Splunk Search App REST API. For more details, refer to the Authentication section above.
  • url - base API URL of the Splunk Search app. For more details, refer to the Authentication section above.
  • spec[n].historicalDataRetrieval - refer to Replay for more details.

Splunk agent

Agent configuration in the UI

Follow the instructions below to configure your Splunk agent. Refer to the section above for the description of the fields.

  1. Select one of the following Release Channels:
    • The stable channel is fully tested by the Nobl9 team. It represents the final product; however, this channel does not contain all the new features of a beta release. Use it to avoid crashes and other limitations.
    • The beta channel is under active development. Here, you can check out new features and improvements without the risk of affecting any viable SLOs. Remember that features in this channel may be subject to change.
  2. Add the URL to connect to your data source.
    Example URL: https://splunk.example.com:8089/services. We recommend contacting your Splunk Admin to get your API token and to verify the correct URL to connect to.

  3. Enter a Project.
  4. Enter a Name.
  5. Create a Description.
  6. In the Advanced Settings you can:
    1. Enter a Maximum Period for Historical Data Retrieval.
    2. Enter a Default Period for Historical Data Retrieval.
  7. Click Add Data Source.

Agent using CLI - YAML

The YAML for setting up an agent connection to Splunk looks like this:

apiVersion: n9/v1alpha
kind: Agent
metadata:
  name: splunk
  project: splunk
spec:
  sourceOf:
    - Metrics
    - Services
  releaseChannel: beta # string, one of: beta || stable
  queryDelay:
    unit: Minute # string, one of: Second || Minute
    value: 720 # numeric, must be a number less than 1440 minutes (24 hours)
  splunk:
    url: https://splunk.example.com
  historicalDataRetrieval:
    maxDuration:
      value: 30 # integer greater than or equal to 0
      unit: Day # accepted values: Minute, Hour, Day
    defaultDuration: # value must be less than or equal to value of maxDuration
      value: 0 # integer greater than or equal to 0
      unit: Day # accepted values: Minute, Hour, Day

Important notes:

  • url - base API URL of the Splunk Search app. For more details, refer to the Authentication section above.
  • spec[n].historicalDataRetrieval - refer to Replay documentation for more details.
warning

You can deploy only one agent per YAML file when using the sloctl apply command.

Deploying Splunk agent

When you add the data source, Nobl9 automatically generates a Kubernetes configuration and a Docker command line for you to use to deploy the agent. Both of these are available in the web UI, under the Agent Configuration section. Be sure to swap in your credentials (for example, replace <SPLUNK_APP_TOKEN> with your Splunk app token).

If you use Kubernetes, you can apply the supplied YAML config file to a Kubernetes cluster to deploy the agent. It will look something like this:

# DISCLAIMER: This deployment description contains only the fields necessary for the purpose of this demo.
# It is not a ready-to-apply k8s deployment description, and the client_id and client_secret are only exemplary values.

apiVersion: v1
kind: Secret
metadata:
  name: nobl9-agent-nobl9-dev-default-splunkagent
  namespace: default
type: Opaque
stringData:
  splunk_app_token: "<SPLUNK_APP_TOKEN>"
  client_id: "unique_client_id"
  client_secret: "unique_client_secret"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nobl9-agent-nobl9-dev-default-splunkagent
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      nobl9-agent-name: "splunkagent"
      nobl9-agent-project: "default"
      nobl9-agent-organization: "nobl9-dev"
  template:
    metadata:
      labels:
        nobl9-agent-name: "splunkagent"
        nobl9-agent-project: "default"
        nobl9-agent-organization: "nobl9-dev"
    spec:
      containers:
        - name: agent-container
          image: nobl9/agent:latest
          resources:
            requests:
              memory: "350Mi"
              cpu: "0.1"
          env:
            - name: N9_CLIENT_ID
              valueFrom:
                secretKeyRef:
                  key: client_id
                  name: nobl9-agent-nobl9-dev-default-splunkagent
            - name: N9_CLIENT_SECRET
              valueFrom:
                secretKeyRef:
                  key: client_secret
                  name: nobl9-agent-nobl9-dev-default-splunkagent
            - name: SPLUNK_APP_TOKEN
              valueFrom:
                secretKeyRef:
                  key: splunk_app_token
                  name: nobl9-agent-nobl9-dev-default-splunkagent
            - name: SPLUNK_USER
              valueFrom:
                secretKeyRef:
                  key: splunk_user
                  name: nobl9-agent-nobl9-dev-default-splunkagent
            - name: SPLUNK_PASSWORD
              valueFrom:
                secretKeyRef:
                  key: splunk_password
                  name: nobl9-agent-nobl9-dev-default-splunkagent
            # N9_METRICS_PORT specifies the port to which the /metrics and /health endpoints are exposed.
            # 9090 is the default value and can be changed.
            # If you don't want the metrics to be exposed, comment out or delete the N9_METRICS_PORT variable.
            - name: N9_METRICS_PORT
              value: "9090"

With agent version 0.65.3, we introduced support for the following environment variables in the Splunk integration:

- name: N9_SPLUNK_COLLECTION_JITTER
  value: "15s" # Default: 15s
- name: N9_SPLUNK_QUERY_INTERVAL
  value: "1m" # Default: 1m
- name: N9_SPLUNK_HTTP_CLIENT_TIMEOUT
  value: "15s" # Default: 15s

Creating SLOs with Splunk

Creating SLOs in the UI

  1. Navigate to Service Level Objectives.

  2. Click the add button.
  3. In step 1 of the SLO wizard, select the Service the SLO will be associated with.

  4. In step 2, select Splunk as the Data Source for your SLO, then specify the Metric. You can choose either a Threshold Metric, where a single time series is evaluated against a threshold, or a Ratio Metric, which allows you to enter two time series to compare (for example, a count of good requests and total requests).

    1. Choose the Data Count Method for your ratio metric:
    • Non-incremental: counts incoming metric values one by one, so the resulting SLO graph is pike-shaped.
    • Incremental: counts the incoming metric values incrementally, adding each new value to the previous ones, which results in a constantly increasing SLO graph.
  5. Enter a Query, Good query, or Total query for the metric you selected:

    • Every query must return n9time and n9value fields.

      • The n9time field must be a Unix timestamp, and n9value field must be a float value.
    • Nobl9 validates the query provided by the user against three rules:

      • The query contains index= with a value.

      • The query contains the n9time field.

      • The query contains the n9value field.

    • Every time range of the dataset is segmented into 15-second chunks and aggregated. The aggregation is as follows:

      • Raw metric: calculates the average.

      • Count metric incremental: takes the max value.

      • Count metric non-incremental: the sum of values.

  6. In step 3, define a Time Window for the SLO.

  7. In step 4, specify the Error Budget Calculation Method and your Objective(s).

  8. In step 5, add a Name, Description, and other details about your SLO. You can also select Alert policies and Labels on this screen.

  9. When you're done, click Create SLO.
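The query validation rules and the 15-second aggregation described in step 5 can be approximated in a short sketch. This is illustrative only; the function names are invented and this is not Nobl9's actual implementation:

```python
def validate_n9_query(query: str) -> list[str]:
    """Check an SPL query against the three rules described above.
    Returns a list of problems; an empty list means the query passes."""
    problems = []
    if "index=" not in query:
        problems.append("query must contain index= with a value")
    if "n9time" not in query:
        problems.append("query must reference n9time")
    if "n9value" not in query:
        problems.append("query must reference n9value")
    return problems

def aggregate_chunk(values: list[float], method: str) -> float:
    """Aggregate one 15-second chunk of n9value samples per metric type."""
    if method == "raw":                   # raw metric: average
        return sum(values) / len(values)
    if method == "count_incremental":     # incremental count metric: max
        return max(values)
    if method == "count_nonincremental":  # non-incremental count metric: sum
        return sum(values)
    raise ValueError(f"unknown method: {method}")

good = ("search index=my_index | rename _time as n9time, latency as n9value "
        "| fields n9time n9value")
assert validate_n9_query(good) == []
assert aggregate_chunk([1.0, 2.0, 3.0], "raw") == 2.0
```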

SLOs using Splunk - YAML samples

Here's an example of Splunk using a rawMetric (threshold metric):

- apiVersion: n9/v1alpha
  kind: SLO
  metadata:
    name: splunk-raw-rolling
    project: splunk
  spec:
    service: splunk-service
    indicator:
      metricSource:
        name: splunk
    timeWindows:
      - unit: Day
        count: 7
        isRolling: true
    budgetingMethod: Occurrences
    objectives:
      - displayName: Good
        op: lte
        rawMetric:
          query:
            splunk:
              query: index=* source=udp:5072 sourcetype=syslog status<400 | bucket _time span=1m | stats avg(response_time) as n9value by _time | rename _time as n9time | fields n9time n9value
        value: 0.25
        target: 0.50
      - displayName: Moderate
        op: lte
        rawMetric:
          query:
            splunk:
              query: index=* source=udp:5072 sourcetype=syslog status<400 | bucket _time span=1m | stats avg(response_time) as n9value by _time | rename _time as n9time | fields n9time n9value
        value: 0.5
        target: 0.75
      - displayName: Annoying
        op: lte
        rawMetric:
          query:
            splunk:
              query: index=* source=udp:5072 sourcetype=syslog status<400 | bucket _time span=1m | stats avg(response_time) as n9value by _time | rename _time as n9time | fields n9time n9value
        value: 1.0
        target: 0.95
---
- apiVersion: n9/v1alpha
  kind: SLO
  metadata:
    name: splunk-raw-calendar
    project: splunk
  spec:
    service: splunk-service
    indicator:
      metricSource:
        name: splunk
    timeWindows:
      - unit: Day
        count: 7
        calendar:
          startTime: 2020-03-09 00:00:00
          timeZone: Europe/Warsaw
    budgetingMethod: Occurrences
    objectives:
      - displayName: Good
        op: lte
        rawMetric:
          query:
            splunk:
              query: index=* source=udp:5072 sourcetype=syslog status<400 | bucket _time span=1m | stats avg(response_time) as n9value by _time | rename _time as n9time | fields n9time n9value
        value: 0.25
        target: 0.50
      - displayName: Moderate
        op: lte
        rawMetric:
          query:
            splunk:
              query: index=* source=udp:5072 sourcetype=syslog status<400 | bucket _time span=1m | stats avg(response_time) as n9value by _time | rename _time as n9time | fields n9time n9value
        value: 0.5
        target: 0.75
      - displayName: Annoying
        op: lte
        rawMetric:
          query:
            splunk:
              query: index=* source=udp:5072 sourcetype=syslog status<400 | bucket _time span=1m | stats avg(response_time) as n9value by _time | rename _time as n9time | fields n9time n9value
        value: 1.0
        target: 0.95

Splunk queries require:

  • Defining an index attribute ("index=index_name") to avoid long-running queries.

    • The query can retrieve data from both the Events and Metrics indexes.

    • You can retrieve Metrics data by using the | mstats command.

    • To retrieve data from the Events or Metrics indexes, you must enter an SPL query and select the proper index: index=_metrics or index=_events, where _metrics is the name of the metrics index and _events is the name of the events index. For more information on SPL queries, refer to the About the Search Language | Splunk documentation.

    • Query example for Events index:

      search index=_events sourcetype=syslog status<400
      | bucket _time span=1m
      | stats count as n9value by _time
      | rename _time as n9time
      | fields n9time n9value
    • Query example for Metrics index:

      | mstats avg("my.metric") as n9value WHERE index=_metrics span=15s
      | rename _time as n9time
      | fields n9time n9value
  • A return value for n9time and n9value.
    Use Splunk field extractions to return values using those exact names. The n9time is the actual time, and the n9value is the metric value. The n9time must be a Unix timestamp and the n9value must be a float value.

    • Example:

      index=myserver-events source=udp:5072 sourcetype=syslog response_time>0
      | rename _time as n9time, response_time as n9value
      | fields n9time n9value

      Typically, you will rename _time to n9time and then rename the field containing the metric value (response_time in the previous example) to n9value. The following appendage to your normal query handles this.

      | rename _time as n9time, response_time as n9value
      | fields n9time n9value
      • The Splunk query is executed once every minute, returning the values found in the n9time and n9value fields. Ensure your hardware can support this query frequency.

Querying the Splunk server

The Nobl9 agent leverages Splunk Enterprise API parameters. It pulls data at a per-minute interval from the Splunk server.

API rate limits for the Nobl9 agent

Splunk Enterprise API rate limits are configured by its administrators. Rate limits must be high enough to accommodate searches from the Nobl9 agent. The Nobl9 agent makes one query per minute per unique query.

Read more in Maximum and actual search concurrency calculations | Splunk community.

Concurrent searches

For the best results, the number of concurrent searches must be about the same as the number of SLIs you have for this data source.

Number of events returned from Splunk queries

The supported search SPL command searches within indexed events. The total number of events can be large, and a query without specific conditions, such as search sourcetype=*, returns all indexed events. A large number of data points sent to Nobl9 could disrupt the system's performance. Therefore, there is a hard limit of 4 events per minute.

File-based queries and Splunk disk quota

If you're using file-based queries (the inputlookup function) instead of index-based queries, your query might not work as expected. Due to the difference in jitter configuration between Splunk and Nobl9, you might need to increase your Splunk disk quota for the inputlookup function to work properly.

To determine the appropriate disk quota size for your Splunk account, we recommend the following steps:

  1. Go to the Splunk UI and navigate to Activity > Jobs.
  2. Filter the jobs by the user you currently use in the Nobl9 app.
  3. Execute requests at 29-second intervals to gather all logs from the corresponding cycle.
  4. Sum the sizes of all requests from the list to determine the minimum disk quota. It is important to add a buffer to this number for safety.
  5. Once you have created more SLOs, adjust the disk quota accordingly.

We suggest increasing the quota to 2GB to resolve the issue. However, it's important to note that the final disk quota size will depend on the data being queried.
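The sizing procedure above boils down to summing the observed job sizes and adding a safety buffer. A hypothetical sketch (the 1.5 buffer factor is an invented illustration, not a Nobl9 recommendation):

```python
def minimum_disk_quota(job_sizes_bytes: list[int], buffer_factor: float = 1.5) -> int:
    """Estimate a minimum Splunk disk quota from the sizes of the search
    jobs observed in Activity > Jobs, plus a safety buffer."""
    return int(sum(job_sizes_bytes) * buffer_factor)

# e.g., three observed jobs of 200 MB each:
mb = 1024 * 1024
print(minimum_disk_quota([200 * mb] * 3) / mb)  # 900.0
```

As you create more SLOs, re-run the estimate and adjust the quota accordingly.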

Known limitations

Query limitations:

  • REST Access | Splunk documentation
  • Create Auth Tokens | Splunk documentation
