Composites use case
The following document describes a configured example of Composites 2.0 emulating a real-life scenario.
User journey
The chart below illustrates a user journey defined for an online store:
First, the user interacts with a website and selects products they want to buy. When finalizing the purchase, they pay via an external payment provider integrated with the website.
After an order is placed, several things happen. An order with a request to dispatch a package is sent to the warehouse. The warehouse uses its own software, which is hosted outside the shop's IT infrastructure and integrated via API. An invoice is issued and processed by accounting. An email with the order confirmation, delivery details from the warehouse, and an attached invoice is sent back to the user.
The entire scenario is divided into two distinct phases:
- Before the user places an order
- After the user places an order
System architecture
Both of these phases mix steps that are fulfilled by software hosted in the store's IT infrastructure (store website, email server, invoicing software) and by external providers, such as payment and warehouse services.
The following chart illustrates the store's IT architecture:
The company uses Prometheus to monitor all its self-hosted services. Due to the limitations imposed by external service providers, metrics regarding warehouse operations and payment services are only available via Datadog integration.
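The two data sources described above could be declared roughly as follows. This is a minimal sketch, not the store's actual configuration: the object names (`prometheus`, `datadog`) and projects are taken from the SLO definitions later in this document, while the Prometheus URL, the Datadog site, and the descriptions are hypothetical placeholders. The exact set of supported fields should be checked against the current Nobl9 YAML reference.

```yaml
# Assumed sketch of the Prometheus Agent used for self-hosted services.
apiVersion: n9/v1alpha
kind: Agent
metadata:
  name: prometheus
  displayName: Prometheus
  project: e-commerce
spec:
  description: Agent collecting metrics from self-hosted services.
  prometheus:
    url: http://prometheus.example.internal:9090  # hypothetical address
---
# Assumed sketch of the Datadog Direct integration used for external services.
apiVersion: n9/v1alpha
kind: Direct
metadata:
  name: datadog
  displayName: Datadog
  project: external-services
spec:
  description: Direct integration for payment and warehouse metrics.
  datadog:
    site: com                  # hypothetical Datadog site
    apiKey: ""                 # placeholder, supply your own secret
    applicationKey: ""         # placeholder, supply your own secret
```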
Nobl9 SLO configuration
The company has already configured a set of SLOs for each of the services using both available data sources. Because different teams are responsible for self-hosted applications and external integrations, these configurations were organized into separate Nobl9 projects. The reliability of most services is measured with two SLOs: one SLO measuring availability and one SLO measuring the latency of the service.
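The split into separate projects could look like the sketch below. The project names `e-commerce` and `external-services` come from the SLO definitions in this document; the display names and descriptions are assumptions added for illustration.

```yaml
# Project owned by the team responsible for self-hosted applications.
apiVersion: n9/v1alpha
kind: Project
metadata:
  name: e-commerce
  displayName: E-commerce            # assumed display name
spec:
  description: Self-hosted store services monitored with Prometheus.
---
# Project owned by the team responsible for external integrations.
apiVersion: n9/v1alpha
kind: Project
metadata:
  name: external-services
  displayName: External services     # assumed display name
spec:
  description: Payment and warehouse integrations monitored with Datadog.
```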
The company is now interested in measuring the overall reliability of services that contribute to the core user flow of placing orders and making purchases. The following observations were made when defining reliability requirements:
Based on these criteria, the company introduces the following SLOs:
- A pre-purchase user experience SLO
- A post-purchase user experience SLO
- A purchase user journey SLO that further aggregates the above two
The entire Nobl9 configuration looks like this:
SLO configuration
The following section includes the configuration of all SLOs defined for the online store, along with the services connected to them.
Component SLOs
- Store website
- Payments
- Invoices
- Emailing
- Warehouse
apiVersion: n9/v1alpha
kind: Service
metadata:
  name: store-web
  displayName: Store Website
  project: e-commerce
spec:
  description: User facing store website.
---
apiVersion: n9/v1alpha
kind: SLO
metadata:
  name: store-web-availability
  displayName: Store Website Availability
  project: e-commerce
spec:
  budgetingMethod: Occurrences
  indicator:
    metricSource:
      kind: Agent
      name: prometheus
      project: e-commerce
  objectives:
    - displayName: Availability
      name: availability
      op: lt
      primary: true
      rawMetric:
        query:
          prometheus:
            promql: (time()*1+10)%(minute(vector(time()*1+10*60))+1)
      target: 0.95
      value: 50
  service: store-web
  timeWindows:
    - count: 28
      isRolling: true
      unit: Day
---
apiVersion: n9/v1alpha
kind: SLO
metadata:
  name: store-web-latency
  displayName: Store Website Latency
  project: e-commerce
spec:
  budgetingMethod: Occurrences
  indicator:
    metricSource:
      kind: Agent
      name: prometheus
      project: e-commerce
  objectives:
    - displayName: Latency
      name: latency
      op: lt
      primary: true
      rawMetric:
        query:
          prometheus:
            promql: (time()*1+17)%(minute(vector(time()*1+17*60))+1)
      target: 0.9
      value: 40
  service: store-web
  timeWindows:
    - count: 14
      isRolling: true
      unit: Day
---
apiVersion: n9/v1alpha
kind: Service
metadata:
  name: payments
  displayName: Payments integration
  project: external-services
spec:
  description: Module mediating between store website and external payments provider.
---
apiVersion: n9/v1alpha
kind: SLO
metadata:
  name: payments-availability
  displayName: Payments integration availability
  project: external-services
spec:
  budgetingMethod: Occurrences
  indicator:
    metricSource:
      kind: Direct
      name: datadog
      project: external-services
  objectives:
    - displayName: Availability
      name: availability
      op: lt
      primary: true
      rawMetric:
        query:
          datadog:
            query: avg:system.cpu.user{*}
      target: 0.95
      value: 16
  service: payments
  timeWindows:
    - count: 28
      isRolling: true
      unit: Day
---
apiVersion: n9/v1alpha
kind: SLO
metadata:
  name: payments-latency
  displayName: Payments integration latency
  project: external-services
spec:
  budgetingMethod: Occurrences
  indicator:
    metricSource:
      kind: Direct
      name: datadog
      project: external-services
  objectives:
    - displayName: Latency
      name: latency
      op: lt
      primary: true
      rawMetric:
        query:
          datadog:
            query: avg:system.cpu.user{*}
      target: 0.9
      value: 15.9
  service: payments
  timeWindows:
    - count: 14
      isRolling: true
      unit: Day
---
apiVersion: n9/v1alpha
kind: Service
metadata:
  name: invoices
  displayName: Invoicing system
  project: e-commerce
spec:
  description: Service responsible for generating invoices.
---
apiVersion: n9/v1alpha
kind: SLO
metadata:
  name: invoices-availability
  displayName: Invoices system availability
  project: e-commerce
spec:
  budgetingMethod: Occurrences
  indicator:
    metricSource:
      kind: Agent
      name: prometheus
      project: e-commerce
  objectives:
    - displayName: Availability
      name: availability
      op: lt
      primary: true
      rawMetric:
        query:
          prometheus:
            promql: (time()*2+40)%(minute(vector(time()*2+40*60))+1)
      target: 0.95
      value: 50
  service: invoices
  timeWindows:
    - count: 28
      isRolling: true
      unit: Day
---
apiVersion: n9/v1alpha
kind: Service
metadata:
  name: emailing
  displayName: Emailing system
  project: e-commerce
spec:
  description: Service responsible for delivering e-mail confirmations to users.
---
apiVersion: n9/v1alpha
kind: SLO
metadata:
  name: emailing-availability
  displayName: Emailing system availability
  project: e-commerce
spec:
  budgetingMethod: Occurrences
  indicator:
    metricSource:
      kind: Agent
      name: prometheus
      project: e-commerce
  objectives:
    - displayName: Availability
      name: availability
      op: lt
      primary: true
      rawMetric:
        query:
          prometheus:
            promql: (time()*1+5)%(minute(vector(time()*1+5*60))+1)
      target: 0.95
      value: 50
  service: emailing
  timeWindows:
    - count: 28
      isRolling: true
      unit: Day
---
apiVersion: n9/v1alpha
kind: SLO
metadata:
  name: emailing-latency
  displayName: Emailing system latency
  project: e-commerce
spec:
  budgetingMethod: Occurrences
  indicator:
    metricSource:
      kind: Agent
      name: prometheus
      project: e-commerce
  objectives:
    - displayName: Latency
      name: latency
      op: lt
      primary: true
      rawMetric:
        query:
          prometheus:
            promql: (time()*2+0)%(minute(vector(time()*1+0*60))+1)
      target: 0.9
      value: 40
  service: emailing
  timeWindows:
    - count: 14
      isRolling: true
      unit: Day
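The post-purchase composite later in this document references a `warehouse-availability` SLO with an `availability` objective in the `external-services` project, but its definition does not appear here. The sketch below shows what such an SLO might look like if it mirrored the payments integration. Only the SLO name, objective name, and project come from this document; the service name, descriptions, query, threshold, target, and time window are all assumptions made for illustration.

```yaml
# Hypothetical sketch: this definition is not part of the original configuration.
apiVersion: n9/v1alpha
kind: Service
metadata:
  name: warehouse                      # assumed service name
  displayName: Warehouse integration   # assumed display name
  project: external-services
spec:
  description: Module mediating between the store and the external warehouse software.
---
apiVersion: n9/v1alpha
kind: SLO
metadata:
  name: warehouse-availability         # name referenced by the post-purchase composite
  displayName: Warehouse integration availability
  project: external-services
spec:
  budgetingMethod: Occurrences
  indicator:
    metricSource:
      kind: Direct
      name: datadog
      project: external-services
  objectives:
    - displayName: Availability
      name: availability               # objective name referenced by the composite
      op: lt
      primary: true
      rawMetric:
        query:
          datadog:
            query: avg:system.cpu.user{*}   # placeholder query, as in the payments SLOs
      target: 0.95                     # assumed target
      value: 16                        # assumed threshold
  service: warehouse
  timeWindows:
    - count: 28                        # assumed window
      isRolling: true
      unit: Day
```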
Composite SLOs
- Pre-purchase UX
- Post-purchase UX
- Purchase user journey
apiVersion: n9/v1alpha
kind: Service
metadata:
  name: user-experience
  displayName: User experience
  project: e-commerce
spec:
  description: Service for grouping all user experience based SLOs.
---
apiVersion: n9/v1alpha
kind: SLO
metadata:
  name: pre-purchase-user-experience
  displayName: Pre-purchase user experience
  project: e-commerce
spec:
  description: Measures experience before user makes a purchase.
  budgetingMethod: Occurrences
  objectives:
    - displayName: Pre-purchase user experience
      name: pre-purchase-user-experience
      target: 0.9
      composite:
        maxDelay: 45m
        components:
          objectives:
            - project: e-commerce
              slo: store-web-availability
              objective: availability
              weight: 4
              whenDelayed: CountAsBad
            - project: e-commerce
              slo: store-web-latency
              objective: latency
              weight: 1
              whenDelayed: CountAsGood
            - project: external-services
              slo: payments-availability
              objective: availability
              weight: 3
              whenDelayed: CountAsBad
            - project: external-services
              slo: payments-latency
              objective: latency
              weight: 1
              whenDelayed: CountAsGood
  service: user-experience
  timeWindows:
    - unit: Day
      count: 28
      isRolling: true
  alertPolicies:
    - slow-budget-drop
    - slow-burn
    - fast-budget-drop
---
apiVersion: n9/v1alpha
kind: SLO
metadata:
  name: post-purchase-user-experience
  displayName: Post-purchase user experience
  project: e-commerce
spec:
  description: Measures experience after the user has made a purchase.
  budgetingMethod: Occurrences
  objectives:
    - displayName: Post-purchase user experience
      name: post-purchase-user-experience
      target: 0.9
      composite:
        maxDelay: 45m
        components:
          objectives:
            - project: e-commerce
              slo: invoices-availability
              objective: availability
              weight: 4
              whenDelayed: CountAsBad
            - project: e-commerce
              slo: emailing-availability
              objective: availability
              weight: 3
              whenDelayed: CountAsBad
            - project: e-commerce
              slo: emailing-latency
              objective: latency
              weight: 1
              whenDelayed: CountAsGood
            - project: external-services
              slo: warehouse-availability
              objective: availability
              weight: 4
              whenDelayed: CountAsBad
  service: user-experience
  timeWindows:
    - unit: Day
      count: 14
      isRolling: true
  alertPolicies:
    - slow-budget-drop
    - slow-burn
    - fast-budget-drop
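The purchase user journey SLO is described earlier as aggregating the pre-purchase and post-purchase composites, but its definition is not shown in this section. One way it might be expressed is sketched below. The component SLO and objective names are taken from the two composites above; the SLO name, equal weights, delay handling, target, and time window are assumptions chosen for illustration and are not the store's actual settings.

```yaml
# Hypothetical sketch: this definition is not part of the original configuration.
apiVersion: n9/v1alpha
kind: SLO
metadata:
  name: purchase-user-journey          # assumed name
  displayName: Purchase user journey
  project: e-commerce
spec:
  description: Measures the overall experience of the purchase user journey.
  budgetingMethod: Occurrences
  objectives:
    - displayName: Purchase user journey
      name: purchase-user-journey      # assumed objective name
      target: 0.9                      # assumed target
      composite:
        maxDelay: 45m                  # assumed, same as the composites above
        components:
          objectives:
            - project: e-commerce
              slo: pre-purchase-user-experience
              objective: pre-purchase-user-experience
              weight: 1                # assumed equal weighting
              whenDelayed: CountAsBad  # assumed delay handling
            - project: e-commerce
              slo: post-purchase-user-experience
              objective: post-purchase-user-experience
              weight: 1                # assumed equal weighting
              whenDelayed: CountAsBad  # assumed delay handling
  service: user-experience
  timeWindows:
    - unit: Day
      count: 28                        # assumed window
      isRolling: true
```

With equal weights, each phase would contribute half of the journey-level result; raising one weight would make that phase dominate the composite's error budget.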
The queries in all SLOs are entirely hypothetical. The data doesn't reflect the actual availability or latency of any real system.