These docs are for Cribl Stream 4.7 and are no longer actively maintained.
See the latest version (4.14).
SentinelOne DataSet Destination
Cribl Stream can send log events to the SentinelOne/Scalyr DataSet platform via the DataSet API. This Destination sends batches of events, as JSON, to that API’s addEvents endpoint.
Type: Streaming | TLS Support: Yes | PQ Support: Yes
Configuring Cribl Stream to Output to DataSet
From the top nav, click Manage, then select a Worker Group to configure. Next, you have two options:
To configure via the graphical QuickConnect UI, click Routing > QuickConnect (Stream) or Collect (Edge). Next, click Add Destination at right. From the resulting drawer’s tiles, select DataSet. Next, click either Add Destination or (if displayed) Select Existing. The resulting drawer will provide the options below.
Or, to configure via the Routing UI, click Data > Destinations (Stream) or More > Destinations (Edge). From the resulting page’s tiles or the Destinations left nav, select DataSet. Next, click Add Destination to open a New Destination modal that provides the options below.
General Settings
Output ID: Enter a unique name to identify this Destination definition.
Authentication Settings
Use the Authentication method buttons to select one of these options:
- Manual: Displays a field for you to enter an API key that is available in your DataSet profile. 
- Secret: This option exposes an API key (text secret) drop-down to select a stored secret that references an API key. A Create link is available to store a new, reusable secret. 
API key: Enter your DataSet API key that has Log Write Access.
Optional Settings
DataSet site: Select the US (default), Europe, or Custom region. If you select Custom, enter your custom endpoint URL.
Message field: Name of the event field that contains the message to send. If not specified, Cribl Stream sends all non-internal fields of events passing through the Destination. If specified, we follow this logic:
- If an event does not contain the specified field, send the whole event (except internal fields).
- If an event has the specified field, and the field’s value is a non-object, send the event in the format:{ message: <value from event> }.
- If an event has the specified field, and the field’s value is an object, send the event in the format:{ <all fields from the object> }.
Exclude fields: Fields to exclude from the event if the Message field either is unspecified or refers to an object. Ignored if the Message field is a string, number, or boolean. If empty, Cribl Stream sends all non-internal fields.
Default exclude fields are sev, _time, ts, and thread. We automatically send these fields as metadata of the event, in DataSet’s required format. This is to avoid charges for field bytes – metadata bytes do not count toward ingestion.
Server/host field: Name of the event field that contains the server or host that generated the event. Cribl Stream groups events by the value of this field, and gives them a unique session token to conform to the DataSet API. Each group is sent out as a separate batch; therefore, Cribl recommends specifying a field with a low cardinality, to avoid queuing up many different batches at the Destination. If not specified, or not a string, the implicit default value is cribl_<outputId>. For more details, see Rate Limiting and Session Management.
Timestamp field: Name of the event field that contains the event timestamp. Cribl Stream sends this value as part of each event’s metadata, not as an attribute field on the event.
Timestamps are automatically converted to a nanosecond-precision string. If an event does not contain the field specified Timestamp field, or if the value cannot be converted into a nanosecond-precision string, Cribl Stream assigns a timestamp using the first valid output returned from
ts,_time, orDate.now(), in that order.
Severity: Use the drop-down to assign a default value to the severity field (which is sent as event metadata, not as an attribute field). Cribl Stream falls back to this value when an event contains no valid sev or __severity field. DataSet’s severity model ranges from 0 least-severe (finest) to 6 most-severe (fatal).
- Where an event’s sevfield contains an integer in this range, Cribl Stream passes it through as the severity.
- Where the sevfield contains a string matching DataSet’s enum (finest,finer,fine,info,warning,error,fatal), Cribl Stream converts it to the corresponding integer.
Backpressure behavior: Specify whether to block, drop, or queue events when all receivers are exerting backpressure. Defaults to Block.
Tags: Optionally, add tags that you can use to filter and group Destinations in Cribl Stream’s Manage Destinations page. These tags aren’t added to processed events. Use a tab or hard return between (arbitrary) tag names.
Persistent Queue Settings
This section is displayed when the Backpressure behavior is set to Persistent Queue.
Max file size: The maximum data volume to store in each queue file before closing it. Enter a numeral with units of KB, MB, etc. Defaults to 1 MB.
Max queue size: The maximum amount of disk space the queue is allowed to consume. Once this limit is reached, Cribl Stream stops queueing and applies the fallback Queue‑full behavior. Enter a numeral with units of KB, MB, etc.
Queue file path: The location for the persistent queue files. Defaults to $CRIBL_HOME/state/queues. To this value, Cribl Stream will append /<worker‑id>/<output‑id>.
Compression: Codec to use to compress the persisted data, once a file is closed. Defaults to None. Select Gzip to enable compression.
Queue-full behavior: Whether to block or drop events when the queue is exerting backpressure (because disk is low or at full capacity). Block is the same behavior as non-PQ blocking, corresponding to the Block option on the Backpressure behavior drop-down. Drop new data throws away incoming data, while leaving the contents of the PQ unchanged.
Clear persistent queue: Click this button if you want to flush out files that are currently queued for delivery to this Destination. A confirmation modal will appear. (Appears only after Output ID has been defined.)
Processing Settings
Post‑Processing
Pipeline: Pipeline to process data before sending the data out using this output.
System fields: A list of fields to automatically add to events that use this output. By default, includes cribl_pipe (identifying the Cribl Stream Pipeline that processed the event). Supports wildcards. Other options include:
- cribl_host– Cribl Stream Node that processed the event.
- cribl_input– Cribl Stream Source that processed the event.
- cribl_output– Cribl Stream Destination that processed the event.
- cribl_route– Cribl Stream Route (or QuickConnect) that processed the event.
- cribl_wp– Cribl Stream Worker Process that processed the event.
Retries
Honor Retry-After header: Whether to honor a Retry-After header, provided that the header specifies a delay no longer than 180 seconds. Cribl Stream limits the delay to 180 seconds even if the Retry-After header specifies a longer delay. When enabled, any Retry-After header received takes precedence over all other options configured in the Retries section. When disabled, all Retry-After headers are ignored.
Settings for failed HTTP requests: When you want to automatically retry requests that receive particular HTTP response status codes, use these settings to list those response codes.
For any HTTP response status codes that are not explicitly configured for retries, Cribl Stream applies the following rules:
| Status Code | Action | 
|---|---|
| Any in the 1xx,3xx, or4xxseries | Drop the request | 
| Any in the 5xxseries | Retry the request | 
Upon receiving a response code that’s on the list, Cribl Stream first waits for a set time interval called the Pre-backoff interval and then begins retrying the request. Time between retries increases based on an exponential backoff algorithm whose base is the Backoff multiplier, until the backoff multiplier reaches the Backoff limit (ms). At that point, Cribl Stream continues retrying the request without increasing the time between retries any further.
If the sender (which manages the connection to the Destination) is at capacity, it will not accept any incoming events. These incoming events originate internally from a previous stage of the data flow when Destinations send outbound requests to their respective external services, and they include retry requests and new requests. Any events that were already in transit when the sender reached capacity will continue to be processed downstream.
Sender capacity is freed up when an outgoing request succeeds or encounters a non-retryable error. When the sender has available capacity again, it will resume accepting incoming events. This capacity management is influenced by the number of active connections and configured limits, such as concurrency and buffer sizes. If a Pipeline sends events faster than the Destination can process, the buffers may fill up, leading to backpressure and Sender at capacity warnings. This backpressure prevents the sender from accepting additional requests until capacity is restored.
By default, 429 (Too Many Requests) is the only response code configured for automatic retries. For each response code you want to add to the list, click Add Setting and configure the following settings:
- HTTP status code: A response code that indicates a failed request, for example 429 (Too Many Requests)or503 (Service Unavailable).
- Pre-backoff interval (ms): The amount of time to wait before beginning retries, in milliseconds. Defaults to 1000(one second).
- Backoff multiplier: The base for the exponential backoff algorithm. A value of 2(the default) means that Cribl Stream will retry after 2 seconds, then 4 seconds, then 8 seconds, and so on.
- Backoff limit (ms): The maximum backoff interval Cribl Stream should apply for its final retry, in milliseconds. Default (and minimum) is 10,000(10 seconds); maximum is180,000(180 seconds, or 3 minutes).
Retry timed-out HTTP requests: When you want to automatically retry requests that have timed out, toggle this control on to display the following settings for configuring retry behavior:
- Pre-backoff interval (ms): The amount of time to wait before beginning retries, in milliseconds. Defaults to 1000(one second).
- Backoff multiplier: The base for the exponential backoff algorithm. A value of 2(the default) means that Cribl Stream will retry after 2 seconds, then 4 seconds, then 8 seconds, and so on.
- Backoff limit (ms): The maximum backoff interval Cribl Stream should apply for its final retry, in milliseconds. Default (and minimum) is 10,000(10 seconds); maximum is180,000(180 seconds, or 3 minutes).
Advanced Settings
Validate server certs: Defaults to Yes to reject certificates that are not authorized by a CA in the CA certificate path, nor by another trusted CA (e.g., the system’s CA).
Round-robin DNS: Toggle on to enable round-robin DNS lookup across multiple IP addresses, IPv4 and IPv6. When a DNS server resolves a Fully Qualified Domain Name (FQDN) to multiple IP addresses, Cribl Stream will sequentially use each address in the order they are returned by the DNS server for subsequent connection attempts.
Compress: Compresses the payload body before sending. Defaults to Yes (recommended).
Request timeout: Amount of time (in seconds) to wait for a request to complete before aborting it. Defaults to 30.
Request concurrency: Maximum number of concurrent requests before blocking. This is set per Worker Process. Defaults to 5.
Max body size (KB): Maximum size of the request body before compression. Defaults to 4096 KB. The actual request body size might exceed the specified value because the Destination adds bytes when it writes to the downstream receiver. Cribl recommends that you experiment with the Max body size value until downstream receivers reliably accept all events.
Buffer memory limit (KB): Total amount of memory used to buffer outgoing requests waiting to be sent. If left blank, defaults to 5 times the max body size (if set). If 0, no limit is enforced. This provides granular control over the memory allocated for buffering outgoing batched requests. Increasing the limit allows batches to grow larger before being flushed, improving efficiency for data with high cardinality (a large number of unique batches). Finding the optimal balance between efficient data transfer and memory usage involves adjusting both the Buffer memory limit and Max Body Size settings.
Max events per request: Maximum number of events to include in the request body. The 0 default allows unlimited events.
Flush period (sec): Maximum time between requests. Low values could cause the payload size to be smaller than its configured maximum. Defaults to 1.
Extra HTTP headers: Click Add header to define Name/Value pairs to pass as additional HTTP headers.
Failed request logging mode: Use this drop-down to determine which data should be logged when a request fails. Select among None (the default), Payload, or Payload + Headers. With this last option, Cribl Stream will redact all headers, except non-sensitive headers that you declare below in Safe headers.
Safe headers: Add headers here to declare them as safe to log in plaintext. (Sensitive headers like authorization will always be redacted, even if listed here.) Use a tab or hard return to separate header names.
Environment: If you’re using GitOps, optionally use this field to specify a single Git branch on which to enable this configuration. If empty, the config will be enabled everywhere.
Rate Limiting and Session Management
DataSet rate limits incoming data per session, not based on API key or IP address. The rate limit is 10 MB/sec per session, but the recommended limit is 2.5 MB/sec. To exceed this limit, split events into multiple sessions.
In Cribl Stream, the session contains:
- A unique session ID
- An optional serverHostdescribing the data source, defaulting tocribl_${this.outId}
- Optional properties like logfileandparser(not currently configurable)
Events are grouped into batches based on serverHost, with each batch having its own session ID and rate limit.
Configuration Tips
- Default Configuration: If Server/host field is left empty, all events will have serverHost: cribl_${this.outId}, sharing the same session ID and rate limit.
- Increasing Sessions: Set Server/host field to an event field with higher cardinality to create multiple sessions and increase the combined rate limit.
- Practical Solutions:- Original Source Server/Host: Set Server/host field to a field describing the original source server/host name.
- Worker Process ID: In a Pipeline, set a field like worker_idtoC.env.CRIBL_WORKER_LABELand configure Server/host field toworker_id.
- Random Distribution: Use an Eval function to set a field to Math.floor(Math.random() * 8)and configure Server/host field to this field to divide events into multiple sessions.
 
Internal Fields
The __severity field is included in the severity assignment order, after the sev field. The order is sev, __severity, then the configured default Severity.
Proxying Requests
If you need to proxy HTTP/S requests, see System Proxy Configuration.
Troubleshooting
The Destination’s configuration modal has helpful tabs for troubleshooting:
Live Data: Try capturing live data to see real-time events as they flow through the Destination. On the Live Data tab, click Start Capture to begin viewing real-time data.
Logs: Review and search the logs that provide detailed information about the delivery process, including any errors or warnings that may have occurred.
Test: Ensures that the Destination is correctly set up and reachable. Verify that sample events are sent correctly by clicking Run Test.
You can also view the Monitoring page that provides a comprehensive overview of data volume and rate, helping you identify delivery issues. Analyze the graphs showing events and bytes in/out over time.