Cribl - Docs

Getting started with Cribl LogStream


Dynamic Sampling

Description


The Dynamic Sampling function filters out events based on an expression and a sampling rate.

Usage


Filter: Filter expression (JS) that selects data to be fed through the function. Defaults to empty - all events will be evaluated.

Description: Simple description about this function. Defaults to empty.

Final: If true, stops data from being fed to the downstream functions. Defaults to No.

Sample Mode: Defines how the sample rate will be derived. Two modes are supported: Logarithmic, log(previousPeriodCount), and Square Root, sqrt(previousPeriodCount). Defaults to Logarithmic.

Sample Group Key: Expression used to derive the sample group key. For example: `${domain}:${httpCode}`. Each sample group will have its own sampling rate, derived from that group's volume. Defaults to `${host}`. (All events without a host field that pass through the function will be associated with the same group and sampled at the same rate.)

Advanced Settings:

  • Sample Period Sec: How often (in seconds) sample rates will be adjusted. Defaults to 30.
  • Minimum Events: Minimum number of events that must be received in the previous sample period for the sample mode to be applied to the current period. If the number of events received for a sample group is less than this minimum, a sample rate of 1:1 is used. Defaults to 30.
  • Max Sampling Rate: Maximum sampling rate. If the computed sampling rate is above this value, it will be clamped down to it.
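
The way Minimum Events and Max Sampling Rate bound the derived rate can be sketched as follows. This is a minimal illustration, not LogStream's implementation: the function name `deriveSampleRate` and its parameter names are hypothetical, and treating an unset maximum as "no clamp" is an assumption.

```javascript
// Hypothetical helper: derive N for an N:1 sample rate for one sample group.
// Assumption: an unset Max Sampling Rate means no clamping (Infinity).
function deriveSampleRate(mode, previousPeriodCount, minEvents = 30, maxRate = Infinity) {
  // Too few events in the previous period: keep everything (1:1).
  if (previousPeriodCount < minEvents) return 1;

  const raw = mode === 'sqrt'
    ? Math.sqrt(previousPeriodCount)
    : Math.log(previousPeriodCount); // natural log

  // Round up, never go below 1:1, and clamp to the configured maximum.
  return Math.min(Math.max(Math.ceil(raw), 1), maxRate);
}

console.log(deriveSampleRate('log', 20));            // 1  (below Minimum Events)
console.log(deriveSampleRate('log', 1000));          // 7
console.log(deriveSampleRate('sqrt', 1000, 30, 20)); // 20 (clamped down from 32)
```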

How Dynamic Sampling works


Compared to static sampling, where users must select a sample rate a priori, Dynamic Sampling automatically adjusts sampling rates based on incoming data volume per sample group. Users set only the aggressiveness/coarseness of this adjustment: the Square Root mode is more aggressive than the Logarithmic mode.

As an event passes through the function, it's evaluated against the Sample Group Key expression to determine the sample group it will be associated with. For example, given an event with the fields ...ip=1.2.3.42, port=1234... and a Sample Group Key of `${ip}:${port}`, it will be associated with the 1.2.3.42:1234 sample group.

Note: If Sample Group Key is left at its default, `${host}`, all events without a host field will be associated with the same group and sampled at the same rate.
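
A rough sketch of how a key expression maps events to sample groups is shown below. The helper `sampleGroupKey` is hypothetical, and rendering a missing field as the string "undefined" is an assumption made for illustration; the actual expression evaluation in LogStream may differ.

```javascript
// Hypothetical helper: expand ${field} placeholders against an event's fields.
// Assumption: a missing field renders as the string "undefined".
function sampleGroupKey(template, event) {
  return template.replace(/\$\{(\w+)\}/g, (_, field) => String(event[field]));
}

const event = { ip: '1.2.3.42', port: 1234 };

console.log(sampleGroupKey('${ip}:${port}', event)); // 1.2.3.42:1234

// With the default key, every event lacking a host field yields the same key,
// so all such events land in one shared sample group.
console.log(sampleGroupKey('${host}', event));            // undefined
console.log(sampleGroupKey('${host}', { ip: '10.0.0.7' })); // undefined
```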

When a sample group is new, it will initially have a sample rate of 1:1 for Sample Period seconds (this defaults to 30 seconds). Once Sample Period seconds have elapsed, a sample rate will be derived based on the configured Sample Mode, using the sample group's event volume during the previous sample period.
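
The per-group lifecycle described here can be sketched with a small stateful class. Everything in it is hypothetical (the class `DynamicSampler`, its methods, and the group name `web01`), and it assumes an N:1 rate keeps the first event of every run of N; the source does not say how the kept events are chosen.

```javascript
// Sketch only: tracks, per sample group, the rate in force and events seen.
class DynamicSampler {
  constructor(deriveRate) {
    this.deriveRate = deriveRate; // e.g. count => Math.ceil(Math.log(count))
    this.groups = new Map();
  }

  // Returns true if the event should be kept.
  accept(groupKey) {
    let group = this.groups.get(groupKey);
    if (!group) {
      group = { rate: 1, seen: 0 }; // new groups start at 1:1
      this.groups.set(groupKey, group);
    }
    group.seen += 1;
    // Assumption: keep the first event of every run of `rate` events.
    return (group.seen - 1) % group.rate === 0;
  }

  // Call once per sample period, for example from a 30-second timer.
  rollPeriod() {
    for (const group of this.groups.values()) {
      group.rate = Math.max(1, this.deriveRate(group.seen));
      group.seen = 0;
    }
  }
}

const sampler = new DynamicSampler(count => Math.ceil(Math.log(count)));

let kept = 0;
for (let i = 0; i < 1000; i++) if (sampler.accept('web01')) kept++;
console.log(kept); // 1000, the first period runs at 1:1

sampler.rollPeriod(); // rate becomes Math.ceil(Math.log(1000)) = 7
kept = 0;
for (let i = 0; i < 4000; i++) if (sampler.accept('web01')) kept++;
console.log(kept); // 572
```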

For example, assume a Logarithmic Sample Mode:

Period 0 (first 30s) -- Number of events in sample group: 1000, Sample Rate: 1:1, Events allowed: ALL
Sample Rate calculation for next period: Math.ceil(Math.log(1000)) = 7

Period 1 (next 30s) -- Number of events in sample group: 4000, Sample Rate: 7:1, Events allowed: 572
Sample Rate calculation for next period: Math.ceil(Math.log(4000)) = 9

Period 2 (next 30s) -- Number of events in sample group: 12000, Sample Rate: 9:1, Events allowed: 1334
Sample Rate calculation for next period: Math.ceil(Math.log(12000)) = 10

Period 3 (next 30s) -- Number of events in sample group: 2000, Sample Rate: 10:1, Events allowed: 200
Sample Rate calculation for next period: Math.ceil(Math.log(2000)) = 8
...
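
The arithmetic in the example above can be replayed with a few lines of JavaScript. The event counts come from the example; the variable names are illustrative, and rounding "events allowed" up with Math.ceil is an assumption that happens to reproduce the figures shown.

```javascript
// Event counts per 30-second period, taken from the example above.
const volumes = [1000, 4000, 12000, 2000];

let rate = 1; // a new sample group starts at 1:1
const rows = volumes.map((count, period) => {
  // Assumption: events allowed = count / rate, rounded up.
  const allowed = Math.ceil(count / rate);
  const row = { period, count, rate: `${rate}:1`, allowed };
  // Logarithmic mode: rate applied during the NEXT period.
  rate = Math.ceil(Math.log(count));
  return row;
});

console.table(rows);
// period 0: 1:1,  allowed 1000
// period 1: 7:1,  allowed 572
// period 2: 9:1,  allowed 1334
// period 3: 10:1, allowed 200
```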

Sample Modes:

  1. Logarithmic - The sample rate is derived for each sample group using Math.ceil(Math.log(lastPeriodVolume)) (natural log). This mode is less aggressive and drops fewer events.
  2. Square Root - The sample rate is derived for each sample group using Math.ceil(Math.sqrt(lastPeriodVolume)). This mode is more aggressive and drops more events.
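
To get a feel for how differently the two modes scale, the two formulas can be evaluated side by side. This is only an illustration: the function name `compareModes` and the sample volumes are arbitrary, and Minimum Events and Max Sampling Rate are ignored.

```javascript
// Illustrative helper: N in the N:1 rate each mode would derive for a volume.
function compareModes(lastPeriodVolume) {
  return {
    logarithmic: Math.ceil(Math.log(lastPeriodVolume)), // natural log
    squareRoot: Math.ceil(Math.sqrt(lastPeriodVolume)),
  };
}

for (const volume of [100, 1000, 10000, 100000]) {
  const { logarithmic, squareRoot } = compareModes(volume);
  console.log(`${volume} events: Logarithmic ${logarithmic}:1, Square Root ${squareRoot}:1`);
}
// 100 events: Logarithmic 5:1, Square Root 10:1
// 1000 events: Logarithmic 7:1, Square Root 32:1
// 10000 events: Logarithmic 10:1, Square Root 100:1
// 100000 events: Logarithmic 12:1, Square Root 317:1
```

The gap widens quickly with volume, which is why Square Root suits very high-volume groups where only a small fraction of events needs to be retained.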
