DataSet

Cribl Edge can send log events to the SentinelOne/Scalyr DataSet platform via the DataSet API. This Destination sends batches of events, as JSON, to that API’s addEvents endpoint.

Type: Streaming | TLS Support: Yes | PQ Support: Yes

Configuring Cribl Edge to Output to DataSet

From the top nav, click Manage, then select a Fleet to configure. Next, you have two options:

To configure via the graphical QuickConnect UI, click Routing > QuickConnect (Stream) or Collect (Edge). Next, click Add Destination at right. From the resulting drawer’s tiles, select DataSet. Next, click either Add Destination or (if displayed) Select Existing. The resulting drawer will provide the options below.

Or, to configure via the Routing UI, click Data > Destinations (Stream) or More > Destinations (Edge). From the resulting page’s tiles or the Destinations left nav, select DataSet. Next, click Add Destination to open a New Destination modal that provides the options below.

General Settings

Output ID: Enter a unique name to identify this Destination definition.

Authentication Settings

Use the Authentication method buttons to select one of these options:

  • Manual: Displays a field for you to enter an API key that is available in your DataSet profile.

  • Secret: This option exposes an API key (text secret) drop-down to select a stored secret that references an API key. A Create link is available to store a new, reusable secret.

API key: Enter your DataSet API key that has Log Write Access.

Optional Settings

DataSet site: Select the US (default), Europe, or Custom region. If you select Custom, enter your custom endpoint URL.

Message field: Name of the event field that contains the message to send. If not specified, Cribl Edge sends all non-internal fields of events passing through the Destination. If specified, we follow this logic:

  • If an event does not contain the specified field, send the whole event (except internal fields).
  • If an event has the specified field, and the field’s value is a non-object, send the event in the format:
    { message: <value from event> }.
  • If an event has the specified field, and the field’s value is an object, send the event in the format:
    { <all fields from the object> }.
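The Message field logic above can be sketched in Python as follows. This is a minimal illustration, not Cribl's implementation: the function name build_payload is hypothetical, and the sketch assumes that internal fields are those whose names begin with a double underscore.

```python
from typing import Any, Dict, Optional


def build_payload(event: Dict[str, Any],
                  message_field: Optional[str] = None) -> Dict[str, Any]:
    """Return the fields that would be sent for one event (illustrative only)."""
    # Assumption: internal fields are identified by a "__" name prefix.
    whole_event = {k: v for k, v in event.items() if not k.startswith("__")}

    # No Message field configured, or the event lacks it: send the whole event.
    if not message_field or message_field not in event:
        return whole_event

    value = event[message_field]
    if isinstance(value, dict):
        # Object value: send the object's own fields.
        return dict(value)
    # Non-object value (string, number, boolean): wrap it as "message".
    return {"message": value}
```

For example, build_payload({"msg": "disk full", "host": "web1"}, "msg") yields only the wrapped message, while the same call with a Message field the event lacks yields both fields.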

Exclude fields: Fields to exclude from the event if the Message field either is unspecified or refers to an object. Ignored if the Message field is a string, number, or boolean. If empty, Cribl Edge sends all non-internal fields.

Default exclude fields are sev, _time, ts, and thread. Cribl Edge automatically sends these fields as metadata of the event, in DataSet’s required format. This avoids charges for field bytes, because metadata bytes do not count toward ingestion.
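As a rough illustration of how Exclude fields shapes the outgoing attributes, the sketch below removes internal fields and any excluded names from a set of fields. The helper name apply_exclude is hypothetical, and the sketch assumes exact name matching and a double-underscore prefix for internal fields. It does not model how the excluded values are converted into DataSet metadata.

```python
from typing import Any, Dict, Iterable

# Defaults listed in the Exclude fields description.
DEFAULT_EXCLUDE = ("sev", "_time", "ts", "thread")


def apply_exclude(fields: Dict[str, Any],
                  exclude: Iterable[str] = DEFAULT_EXCLUDE) -> Dict[str, Any]:
    """Drop internal and excluded fields (illustrative only)."""
    excluded = set(exclude)
    return {
        name: value
        for name, value in fields.items()
        # Assumption: internal fields start with "__".
        if not name.startswith("__") and name not in excluded
    }
```

Passing an empty exclude list leaves every non-internal field in place, which matches the "If empty" behavior described above.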

Server/host field: Name of the event field that contains the server or host that generated the event. Cribl Edge groups events by the value of this field, and gives them a unique session token to conform to the DataSet API. Each group is sent out as a separate batch; therefore, Cribl recommends specifying a field with a low cardinality, to avoid queuing up many different batches at the Destination. If not specified, or not a string, the implicit default value is cribl_<outputId>.
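The grouping behavior can be pictured with the following sketch. It is an assumption-laden illustration: the function names are hypothetical, and a random UUID stands in for the session token, whose real format is not described here.

```python
import uuid
from collections import defaultdict
from typing import Any, Dict, List, Optional


def group_by_host(events: List[Dict[str, Any]],
                  host_field: Optional[str],
                  output_id: str) -> Dict[str, List[Dict[str, Any]]]:
    """Group events by the Server/host field value (illustrative only)."""
    default_host = f"cribl_{output_id}"  # implicit default from the text above
    groups: Dict[str, List[Dict[str, Any]]] = defaultdict(list)
    for event in events:
        host = event.get(host_field) if host_field else None
        if not isinstance(host, str):
            # Field not specified, missing, or not a string: use the default.
            host = default_host
        groups[host].append(event)
    return dict(groups)


def assign_sessions(groups: Dict[str, list]) -> Dict[str, str]:
    """Give each group its own token. Assumption: a UUID is a stand-in."""
    return {host: str(uuid.uuid4()) for host in groups}
```

Because each distinct value produces its own batch, a high-cardinality field (for example, a request ID) would create one group per event, which is why a low-cardinality field is recommended.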

Timestamp field: Name of the event field that contains the event timestamp. Cribl Edge sends this value as part of each event’s metadata, not as an attribute field on the event.

Timestamps are automatically converted to a nanosecond-precision string. If an event does not contain the field specified in Timestamp field, or if the value cannot be converted into a nanosecond-precision string, Cribl Edge assigns a timestamp using the first valid output returned from ts, _time, or Date.now(), in that order.
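The fallback order can be sketched as below. This is not Cribl's code: the function names are hypothetical, the sketch assumes incoming values are epoch seconds (possibly fractional), and Python's time.time_ns() stands in for Date.now().

```python
import time
from decimal import Decimal, InvalidOperation
from typing import Any, Callable, Dict, Optional


def to_nanos(value: Any) -> Optional[str]:
    """Convert epoch seconds to a nanosecond string, or None if invalid."""
    if value is None or isinstance(value, bool):
        return None
    try:
        # Decimal avoids float rounding when scaling to nanoseconds.
        seconds = Decimal(str(value))
    except InvalidOperation:
        return None
    if not seconds.is_finite() or seconds < 0:
        return None
    return str(int(seconds * 1_000_000_000))


def resolve_timestamp(event: Dict[str, Any],
                      timestamp_field: Optional[str] = None,
                      now_ns: Callable[[], int] = time.time_ns) -> str:
    """Try the configured field, then ts, then _time, then the current time."""
    candidates = []
    if timestamp_field:
        candidates.append(event.get(timestamp_field))
    candidates += [event.get("ts"), event.get("_time")]
    for candidate in candidates:
        nanos = to_nanos(candidate)
        if nanos is not None:
            return nanos
    return str(now_ns())
```

The now_ns parameter exists only so the clock can be replaced in tests.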

Severity: Use the drop-down to assign a default value to the severity field (which is sent as event metadata, not as an attribute field). Cribl Edge falls back to this value when an event contains no valid sev or __severity field. DataSet’s severity model ranges from 0 (finest, least severe) to 6 (fatal, most severe).

  • Where an event’s sev field contains an integer in this range, Cribl Edge passes it through as the severity.
  • Where the sev field contains a string matching DataSet’s enum (finest, finer, fine, info, warning, error, fatal), Cribl Edge converts it to the corresponding integer.
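A minimal sketch of this severity resolution follows. The function names are hypothetical, and two points are assumptions rather than documented behavior: that string matching is exact and lowercase, and that the __severity field is parsed the same way as sev.

```python
from typing import Any, Dict, Optional

# DataSet's severity names, ordered so that each index is the integer value.
SEVERITY_NAMES = ["finest", "finer", "fine", "info", "warning", "error", "fatal"]


def parse_severity(value: Any) -> Optional[int]:
    """Return a severity in 0..6, or None if the value is not valid."""
    if isinstance(value, bool):
        return None  # bool is an int subclass in Python; do not accept it
    if isinstance(value, int) and 0 <= value <= 6:
        return value
    if isinstance(value, str) and value in SEVERITY_NAMES:
        return SEVERITY_NAMES.index(value)
    return None


def resolve_severity(event: Dict[str, Any], default: int) -> int:
    """Check sev, then __severity, then fall back to the configured default."""
    for field in ("sev", "__severity"):
        severity = parse_severity(event.get(field))
        if severity is not None:
            return severity
    return default
```

With a default of 3, an event carrying sev: "error" resolves to 5, and an event with an out-of-range sev and no __severity resolves to 3.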

Backpressure behavior: Specify whether to block, drop, or queue events when all receivers are exerting backpressure. Defaults to Block.

Tags: Optionally, add tags that you can use for filtering and grouping at the final destination. Use a tab or hard return between (arbitrary) tag names.

Persistent Queue Settings

This section is displayed when the Backpressure behavior is set to Persistent Queue.

Max file size: The maximum data volume to store in each queue file before closing it. Enter a numeral with units of KB, MB, etc. Defaults to 1 MB.

Max queue size: The maximum amount of disk space the queue is allowed to consume. Once this limit is reached, Cribl Edge stops queueing and applies the fallback Queue‑full behavior. Enter a numeral with units of KB, MB, etc.

Queue file path: The location for the persistent queue files. Defaults to $CRIBL_HOME/state/queues. To this value, Cribl Edge will append /<worker‑id>/<output‑id>.

Compression: Codec to use to compress the persisted data, once a file is closed. Defaults to None. Select Gzip to enable compression.

Queue-full behavior: Whether to block or drop events when the queue is exerting backpressure (because disk is low or at full capacity). Block is the same behavior as non-PQ blocking, corresponding to the Block option on the Backpressure behavior drop-down. Drop new data throws away incoming data, while leaving the contents of the PQ unchanged.

Clear persistent queue: Click this button if you want to flush out files that are currently queued for delivery to this Destination. A confirmation modal will appear. (Appears only after Output ID has been defined.)

Processing Settings

Post‑Processing

Pipeline: Pipeline to process data before sending the data out using this output.

System fields: A list of fields to automatically add to events that use this output. By default, includes cribl_pipe (identifying the Cribl Edge Pipeline that processed the event). Supports wildcards. Other options include:

  • cribl_host – Cribl Edge Node that processed the event.
  • cribl_input – Cribl Edge Source that processed the event.
  • cribl_output – Cribl Edge Destination that processed the event.
  • cribl_route – Cribl Edge Route (or QuickConnect) that processed the event.
  • cribl_wp – Cribl Edge Worker Process that processed the event.

Advanced Settings

Validate server certs: Defaults to Yes, which rejects certificates that are not authorized by a CA in the CA certificate path or by another trusted CA (e.g., the system’s CA).

Round-robin DNS: Toggle on to enable round-robin DNS lookup across multiple IP addresses, IPv4 and IPv6. When a DNS server resolves a Fully Qualified Domain Name (FQDN) to multiple IP addresses, Cribl Edge will sequentially use each address in the order they are returned by the DNS server for subsequent connection attempts.
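The rotation described above amounts to cycling through the resolved addresses in the order returned. The sketch below shows that idea only; the function name is hypothetical, the addresses are documentation-range placeholders, and no actual DNS lookup or connection handling is modeled.

```python
from itertools import cycle
from typing import Iterator, List


def round_robin(addresses: List[str]) -> Iterator[str]:
    """Yield addresses endlessly, in the order the resolver returned them."""
    if not addresses:
        raise ValueError("DNS lookup returned no addresses")
    return cycle(addresses)


# Placeholder addresses, standing in for a resolver's answer for one FQDN.
picker = round_robin(["192.0.2.10", "192.0.2.11", "2001:db8::1"])
first_four = [next(picker) for _ in range(4)]
```

After the last address is used, the next connection attempt wraps around to the first.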

Compress: Defaults to Yes, to compress log events’ payload body before sending.

Request timeout: Amount of time (in seconds) to wait for a request to complete before aborting it. Defaults to 30.

Request concurrency: Maximum number of concurrent requests before blocking. This is set per Worker Process. Defaults to 5.

Max body size (KB): Maximum size of the request body before compression. Defaults to 4096 KB. The actual request body size might exceed the specified value because the Destination adds bytes when it writes to the downstream receiver. Cribl recommends that you experiment with the Max body size value until downstream receivers reliably accept all events.

Max events per request: Maximum number of events to include in the request body. The 0 default allows unlimited events.

Flush period (sec): Maximum time between requests. Low values could cause the payload size to be smaller than its configured maximum. Defaults to 1.
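To show how the size and count limits above interact, here is a simplified batching sketch. It is an illustration under stated assumptions: the function name is hypothetical, event size is approximated by the length of each event's JSON encoding, and the time-based Flush period is not modeled.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List


def batch_events(events: Iterable[Dict[str, Any]],
                 max_body_kb: int = 4096,
                 max_events: int = 0) -> Iterator[List[Dict[str, Any]]]:
    """Split events into batches by uncompressed size and event count."""
    max_bytes = max_body_kb * 1024
    batch: List[Dict[str, Any]] = []
    size = 0
    for event in events:
        encoded_len = len(json.dumps(event).encode("utf-8"))
        # max_events == 0 means no limit on the number of events.
        full_by_count = max_events > 0 and len(batch) >= max_events
        full_by_size = bool(batch) and size + encoded_len > max_bytes
        if full_by_count or full_by_size:
            yield batch
            batch, size = [], 0
        batch.append(event)
        size += encoded_len
    if batch:
        yield batch
```

Note that a single event larger than the limit still forms its own batch here, and that envelope bytes added around the events are ignored, which is one reason a real request body can exceed the configured value.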

Extra HTTP headers: Click Add header to define Name/Value pairs to pass as additional HTTP headers.

Failed request logging mode: Use this drop-down to determine which data should be logged when a request fails. Select among None (the default), Payload, or Payload + Headers. With this last option, Cribl Edge will redact all headers, except non-sensitive headers that you declare below in Safe headers.

Safe headers: Add headers here to declare them as safe to log in plaintext. (Sensitive headers like authorization will always be redacted, even if listed here.) Use a tab or hard return to separate header names.
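The interaction between Payload + Headers logging and Safe headers can be sketched as follows. The function name, the redaction placeholder, and the contents of the always-redacted set are assumptions for illustration; only authorization is named above as an example of a sensitive header.

```python
from typing import Dict, Iterable

# Assumption: illustrative set; the full list of sensitive headers is not given.
ALWAYS_REDACTED = {"authorization"}


def redact_headers(headers: Dict[str, str],
                   safe_headers: Iterable[str] = ()) -> Dict[str, str]:
    """Redact every header unless it is declared safe and not sensitive."""
    # Header names are compared case-insensitively.
    safe = {name.lower() for name in safe_headers} - ALWAYS_REDACTED
    return {
        name: value if name.lower() in safe else "[REDACTED]"
        for name, value in headers.items()
    }
```

Listing authorization as a safe header has no effect in this sketch, mirroring the rule that sensitive headers are redacted even if declared safe.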

Environment: If you’re using GitOps, optionally use this field to specify a single Git branch on which to enable this configuration. If empty, the config will be enabled everywhere.

Internal Fields

When assigning severity, Cribl Edge checks the sev field first, then the internal __severity field, and finally falls back to the configured default Severity.

Proxying Requests

If you need to proxy HTTP/S requests, see System Proxy Configuration.