The Time Slices method of error budget calculation measures how many good minutes were achieved during a time window, where a good minute is one in which the system operated within defined boundaries. Time slices are currently hard-coded to 1-minute evaluation intervals. A minute is good when the Time Slice Allowance, described later in this document, is not violated.
No matter what budgeting method you choose, Target is used to calculate your error budget. If you want something to be reliable 95% of the time (or in other words, have at most a 5% failure rate over the defined time window), then your target should be set at 95%. This is the reliability target for the Service Level Objective (SLO).
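As a rough illustration of how a Target translates into an error budget under the Time Slices method, the sketch below expresses the budget as a number of minutes. The function name error_budget_minutes and its parameters are hypothetical and are not part of any product API; it simply assumes the budget is the share of 1-minute slices left over after the Target is met.

```python
def error_budget_minutes(target_percent: float, window_minutes: int) -> float:
    """Minutes in the window that may be bad before the Target is missed.

    Hypothetical helper: assumes the budget is (100 - target)% of the
    1-minute slices in the window.
    """
    if not 0 < target_percent <= 100:
        raise ValueError("target_percent must be in (0, 100]")
    return window_minutes * (100 - target_percent) / 100


# A 95% Target over a 24-hour window (1,440 one-minute slices).
budget = error_budget_minutes(95, 24 * 60)
print(budget)  # 72.0
```

With a 95% Target, up to 72 of the 1,440 minutes in a day can be bad before the objective is missed.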
Time Slice Allowance
The Time Slice Allowance is the threshold applied to each time slice and can be thought of as a micro-objective. It determines whether a time slice counts as good or bad, and this evaluation is separate from the error budget. Each time slice is evaluated independently: if it falls within the defined allowance, it is a good minute; if not, it is a bad minute, and some of the error budget is burned.
If you decide that a good minute is one with a 90% success rate, then your Time Slice Allowance should be set at 90%. Your Target will then be for 95% of minutes to have at minimum a 90% success rate.
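The two-level evaluation described above (an allowance per minute, then a Target across minutes) might be sketched as follows. This is an illustration only: the function names and the (good_events, total_events) input shape are assumptions, and skipping minutes that contain no data is a choice made for this sketch, not documented behavior.

```python
from typing import Iterable, Tuple


def count_good_minutes(
    slices: Iterable[Tuple[int, int]], allowance_percent: float
) -> Tuple[int, int]:
    """Return (good_minutes, evaluated_minutes).

    Each slice is a hypothetical (good_events, total_events) pair for one minute.
    """
    good_minutes = 0
    evaluated = 0
    for good_events, total_events in slices:
        if total_events == 0:
            continue  # assumption: minutes with no data are not evaluated
        evaluated += 1
        # Good minute: success rate is at least the Time Slice Allowance.
        if good_events * 100 >= allowance_percent * total_events:
            good_minutes += 1
    return good_minutes, evaluated


def meets_target(good_minutes: int, evaluated: int, target_percent: float) -> bool:
    """True when the share of good minutes is at least the Target."""
    return evaluated > 0 and good_minutes * 100 >= target_percent * evaluated


# Four minutes of data: the third has an 89% success rate, so it is a bad minute.
minutes = [(95, 100), (90, 100), (89, 100), (100, 100)]
good, total = count_good_minutes(minutes, allowance_percent=90)
print(good, total)                    # 3 4
print(meets_target(good, total, 95))  # False
```

Here 3 of 4 minutes (75%) satisfy the 90% allowance, which falls short of a 95% Target.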
Use Case Example
Let’s say you are told that, over a 24-hour time period, a given SLI should have no more than 10% slow responses (over 750 ms), 95% of the time. Another way of viewing this is that at least 90% of the data points must be good (in this case, with a value less than 750 ms), 95% of the time. You’ll need to use the Time Slices error budget calculation method for this SLO: you have been given a Time Slice Allowance of 90% (a maximum 10% error/failure rate per 1-minute time slice), and this must be achieved in 95% of the time slices in the 24-hour time period.
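To make the arithmetic of this use case concrete, here is a small sketch that evaluates a synthetic 24-hour day. The constants, the is_good_minute helper, and the latency data are all invented for illustration; the only values taken from the example are the 750 ms threshold, the 90% allowance, the 95% Target, and the 24-hour window.

```python
THRESHOLD_MS = 750        # responses below this value count as good
ALLOWANCE_PERCENT = 90    # Time Slice Allowance
TARGET_PERCENT = 95       # Target
WINDOW_MINUTES = 24 * 60  # 1,440 one-minute time slices


def is_good_minute(latencies_ms):
    """True when at least 90% of the minute's data points are under 750 ms."""
    if not latencies_ms:
        raise ValueError("this sketch assumes every minute has data")
    fast = sum(1 for value in latencies_ms if value < THRESHOLD_MS)
    return fast * 100 >= ALLOWANCE_PERCENT * len(latencies_ms)


# Synthetic day: 1,380 healthy minutes (10% slow) and 60 degraded minutes (20% slow).
healthy = [200] * 9 + [900]
degraded = [200] * 8 + [900] * 2
day = [healthy] * 1380 + [degraded] * 60

bad_minutes = sum(1 for minute in day if not is_good_minute(minute))
allowed_bad_minutes = WINDOW_MINUTES * (100 - TARGET_PERCENT) // 100
slo_met = bad_minutes <= allowed_bad_minutes

print(bad_minutes)                        # 60
print(allowed_bad_minutes)                # 72
print(allowed_bad_minutes - bad_minutes)  # 12
print(slo_met)                            # True
```

In this synthetic day, 60 bad minutes burn most of the 72-minute budget, leaving 12 minutes before the 95% Target would be missed.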