Datadog Destination
Cribl Edge can send log and metric events to Datadog. Datadog supports metrics of type gauge, counter, rate, and distribution via its REST API.
Cribl Edge sends data via two primary Datadog endpoints, which must be accessible from your Cribl environment:
- Logs: Data is routed through http-intake.logs.{domain}
- Metrics: Data is sent via api.{domain}
The {domain} placeholder in the endpoints represents the specific Datadog region you’re using. By default, this is set to datadoghq.com for the US region. You can configure an alternative region with the Datadog site setting under Optional Settings below.
Supported Domains:
- US: datadoghq.com (default)
- US3: us3.datadoghq.com
- US5: us5.datadoghq.com
- Europe: datadoghq.eu
- US1-FED: ddog-gov.com
- AP1: ap1.datadoghq.com
- Custom: custom.datadoghq.com
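When you prepare firewall or proxy allowlists, it can help to expand the {domain} placeholder into concrete hostnames. The following is a minimal sketch, not part of Cribl Edge: the `SITE_DOMAINS` table and the `datadog_hosts` helper are hypothetical names, and the mapping simply mirrors the list above (the Custom option is omitted because its domain is user-supplied).

```python
# Illustrative only: expand the {domain} placeholder for a given Datadog site.
# SITE_DOMAINS and datadog_hosts are hypothetical helpers, not a Cribl API.
SITE_DOMAINS = {
    "US": "datadoghq.com",
    "US3": "us3.datadoghq.com",
    "US5": "us5.datadoghq.com",
    "Europe": "datadoghq.eu",
    "US1-FED": "ddog-gov.com",
    "AP1": "ap1.datadoghq.com",
}


def datadog_hosts(site: str = "US") -> dict:
    """Return the log and metric hostnames that must be reachable for a site."""
    domain = SITE_DOMAINS[site]
    return {
        "logs": f"http-intake.logs.{domain}",
        "metrics": f"api.{domain}",
    }


print(datadog_hosts("Europe"))
```

For the Europe site, this yields http-intake.logs.datadoghq.eu for logs and api.datadoghq.eu for metrics.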
Type: Streaming | TLS Support: Yes | PQ Support: Yes
Configure Cribl Edge to Output to Datadog
- On the top bar, select Products, and then select Cribl Edge. Under Fleets, select a Fleet. Next, you have two options:
  - To configure via QuickConnect, navigate to Routing > QuickConnect (Stream) or Collect (Edge). Select Add Destination and select the Destination you want from the list, choosing either Select Existing or Add New.
  - To configure via the Routes, select Data > Destinations (Stream) or More > Destinations (Edge). Select the Destination you want. Next, select Add Destination.
- In the New Destination modal, configure the following under General Settings:
  - Output ID: Enter a unique name to identify this Destination definition.
  - Description: Optionally, enter a description.
- Under Authentication, select an Authentication method from the dropdown:
  - Manual: Displays a field for you to enter an API key that is available in your Datadog profile.
  - Secret: This option exposes an API key (text secret) drop-down, in which you can select a stored secret that references the API access token described above. A Create link is available to store a new, reusable secret.
  - API key: Enter your Datadog organization’s API key.
- Next, you can configure the following Optional Settings:
  - Datadog site: Select the Datadog region you are sending to. Defaults to US; the other options are US3, US5, Europe, US1-FED, AP1, and Custom. Select Custom to manually enter your Datadog region’s URL if it’s not listed.
  - Send logs as: Specify the content type to use when sending logs. Defaults to application/json, where each log message is represented by a JSON object. The alternative text/plain option sends one message per line, with newline (\n) delimiters.
  - Message field: Name of the event field that contains the message to send. If not specified, Cribl Edge sends a JSON representation of the whole event (regardless of whether Send logs as is set to JSON or plain text).
  - Source: Name of the source to send with logs. If you’re sending logs as JSON objects (that is, you’ve selected Send logs as: application/json), the event’s source field (if set) will override this value.
  - Host: Name of the host to send with logs. If you’re sending logs as JSON objects, the event’s host field (if set) will override this value.
  - Service: Name of the service to send with logs. If you’re sending logs as JSON objects, the event’s __service field (if set) will override this value.
  - Datadog tags: List of tags to send with logs, such as env:prod,env_staging:east. These tags enhance search and filtering in Datadog. For log events, these tags are used in the default batching behavior controlled by the Batch by tags advanced setting. High tag cardinality can affect performance.
  - Severity: Default value for message severity. If you’re sending logs as JSON objects, the event’s __severity field (if set) will override this value.
  Datadog uses the above five fields (source, host, __service, tags, and __severity) to enhance searches.
  - Allow API key from events: If toggled on, any API key in the __agent_api_key internal field will override the API key field’s value. This option is useful if events originate from multiple Datadog Agent Sources, each configured with a different API key. (For further details, see Managing API Keys.)
  - Backpressure behavior: Specify whether to block, drop, or queue events when all receivers are exerting backpressure. Defaults to Block.
  - Tags: Optionally, add tags that you can use to filter and group Destinations on the Destinations page. These tags aren’t added to processed events. Use a tab or hard return between (arbitrary) tag names.
- Optionally, you can adjust the Persistent Queue, Processing, Retries, and Advanced settings outlined in the sections below. 
- Select Save, then Commit & Deploy. 
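The override rules for Source, Host, Service, and Severity can be easier to follow as code. The sketch below is an illustration of the precedence described in the Optional Settings, not Cribl Edge's actual implementation. The `resolve_log_attributes` function and the `defaults` dictionary are hypothetical, and the behavior for text/plain (configured values are used unchanged) is an assumption, since the settings describe overrides only for JSON.

```python
# Illustrative only: which value wins for each log attribute.
# Keys are Destination settings; values are the event fields that can override them.
OVERRIDE_FIELDS = {
    "source": "source",
    "host": "host",
    "service": "__service",
    "severity": "__severity",
}


def resolve_log_attributes(event: dict, defaults: dict, send_as_json: bool = True) -> dict:
    """Return the attributes to send with one log event."""
    resolved = dict(defaults)
    if send_as_json:
        for setting, field in OVERRIDE_FIELDS.items():
            # An event field overrides the configured default only if it is set.
            if event.get(field) is not None:
                resolved[setting] = event[field]
    # Assumption: with text/plain, the configured defaults apply as-is.
    return resolved


defaults = {"source": "cribl", "host": "edge-01", "service": "web", "severity": "info"}
event = {"host": "node-7", "__severity": "error", "_raw": "disk full"}
print(resolve_log_attributes(event, defaults))
```

Here the event supplies host and __severity, so those two values replace the configured defaults while source and service keep the Destination's settings.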
Persistent Queue Settings
The Persistent Queue Settings tab displays when the Backpressure behavior option in General settings is set to Persistent Queue. Persistent queue buffers and preserves incoming events when a downstream Destination has an outage or experiences backpressure.
Before enabling persistent queue, learn more about persistent queue behavior and how to optimize it with your system:
- About Persistent Queues
- Optimize Destination Persistent Queues (dPQ)
- Destination Backpressure Triggers
On Cribl-managed Cloud Workers (with an Enterprise plan), this tab exposes only the destructive Clear Persistent Queue button (described at the end of this section). A maximum queue size of 1 GB disk space is automatically allocated per PQ‑enabled Destination, per Worker Process. The 1 GB limit is on outbound uncompressed data, and no compression is applied to the queue.
This limit is not configurable. If the queue fills up, Cribl Stream/Edge will block outbound data. To configure the queue size, compression, queue-full fallback behavior, and other options below, use a hybrid Group.
Mode: Use this menu to select when Cribl Stream/Edge engages the persistent queue in response to backpressure events from this Destination. The options are:
| Mode | Description | 
|---|---|
| Error | Queues and stores data on a disk when the Destination is unavailable or in an error state. | 
| Backpressure | Queues and stores data to a disk when it detects backpressure from the Destination until the backpressure event resolves. | 
| Always On | Cribl Stream or Edge immediately queues and stores all data on a disk for all events, even when there is no backpressure. | 
If a Worker/Edge Node starts with an invalid Mode setting, it automatically switches to Error mode. This might happen if the Worker/Edge Node is running a version that does not support other modes (older than 4.9.0), or if it encounters a nonexistent value in YAML configuration files.
File size limit: The maximum data volume to store in each queue file before closing it. Enter a numeral with units of KB, MB, etc. Defaults to 1 MB.
Queue size limit: The maximum amount of disk space that the queue can consume on each Worker Process. When the queue reaches this limit, the Destination stops queueing data and applies the Queue‑full behavior. Defaults to 5 GB. This field accepts positive numbers with units of KB, MB, GB, and so on. You can set it as high as 1 TB, unless you’ve configured a different Worker Process PQ size limit on the Worker Group/Fleet Settings page.
Queue file path: The location for the persistent queue files. Defaults to $CRIBL_HOME/state/queues. Cribl Stream/Edge will append /<worker‑id>/<output‑id> to this value.
Compression: Set the codec to use when compressing the persisted data after closing a file. Defaults to None. Gzip is also available.
Queue-full behavior: Whether to block or drop events when the queue begins to exert backpressure. A queue begins to exert backpressure when the disk is low or at full capacity. This setting has two options:
- Block: The output will refuse to accept new data until the receiver is ready. The system will return block signals back to the sender.
- Drop new data: Discard all new events until the backpressure event has resolved and the receiver is ready.
Backpressure duration limit: When Mode is set to Backpressure, this setting controls how long to wait during network slowdowns before activating queues. A shorter duration reduces the risk of losing critical data, while a longer duration helps avoid unnecessary queue transitions in environments with frequent, brief network fluctuations. The default value is 30 seconds.
Strict ordering: Toggle on (default) to enable FIFO (first in, first out) event forwarding, ensuring Cribl Stream/Edge sends earlier queued events first when receivers recover. The persistent queue flushes every 10 seconds in this mode. Toggle off to prioritize new events over queued events, configure a custom drain rate for the queue, and display this option:
- Drain rate limit (EPS): Optionally, set a throttling rate (in events per second) on writing from the queue to receivers. (The default 0 value disables throttling.) Throttling the queue drain rate can boost the throughput of new and active connections by reserving more resources for them. You can further optimize Worker startup connections and CPU load in the Worker Processes settings.
Clear Persistent Queue: For Cloud Enterprise only, click this button if you want to delete the files that are currently queued for delivery to this Destination. If you click this button, a confirmation modal appears. Clearing the queue frees up disk space by permanently deleting the queued data, without delivering it to downstream receivers. This button only appears after you define the Output ID.
Use the Clear Persistent Queue button with caution to avoid data loss. See Steps to Safely Disable and Clear Persistent Queues for more information.
Processing Settings
Post‑Processing
Pipeline: Pipeline or Pack to process data before sending the data out using this output.
System fields: A list of fields to automatically add to events that use this output. By default, includes cribl_pipe (identifying the Cribl Edge Pipeline that processed the event). Supports wildcards. Other options include:
- cribl_host – Cribl Edge Node that processed the event.
- cribl_input – Cribl Edge Source that processed the event.
- cribl_output – Cribl Edge Destination that processed the event.
- cribl_route – Cribl Edge Route (or QuickConnect) that processed the event.
- cribl_wp – Cribl Edge Worker Process that processed the event.
Retries
Honor Retry-After header: Toggle on to honor a Retry-After header, provided that the header specifies a delay no longer than 180 seconds. Cribl Stream/Edge limits the delay to 180 seconds even if the Retry-After header specifies a longer delay. Any Retry-After header received takes precedence over all other options configured in the Retries section. Toggle off to ignore all Retry-After headers.
Settings for failed HTTP requests: When you want to automatically retry requests that receive particular HTTP response status codes, use these settings to list those response codes.
For any HTTP response status codes that are not explicitly configured for retries, Cribl Stream/Edge applies the following rules:
| Status Code | Action | 
|---|---|
| Any in the 1xx, 3xx, or 4xx series | Drop the request | 
| Any in the 5xx series | Retry the request | 
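The table above, combined with an explicit retry list, amounts to a small decision rule. The following sketch is one way to express it. The `action_for_status` function is hypothetical, and treating 2xx responses as success is an assumption, because the table does not cover them.

```python
# Illustrative only: decide what happens to a request based on its HTTP status.
def action_for_status(status: int, configured_retry_codes=frozenset()) -> str:
    """Return "retry", "drop", or "success" for an HTTP response status."""
    if status in configured_retry_codes:
        return "retry"      # explicitly listed under Settings for failed HTTP requests
    if 500 <= status <= 599:
        return "retry"      # default rule for the 5xx series
    if 100 <= status <= 199 or 300 <= status <= 499:
        return "drop"       # default rule for the 1xx, 3xx, and 4xx series
    return "success"        # assumption: 2xx means the request was accepted


print(action_for_status(429))                      # drop (not configured for retries)
print(action_for_status(429, frozenset({429})))    # retry (explicitly configured)
print(action_for_status(503))                      # retry (5xx default)
```

This shows why a 429 (Too Many Requests) response is dropped unless you add 429 to the retry list.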
Upon receiving a response code that’s on the list, Cribl Stream/Edge first waits for a set time interval called the Pre-backoff interval and then begins retrying the request. Time between retries increases based on an exponential backoff algorithm whose base is the Backoff multiplier, until the time between retries reaches the Backoff limit (ms). At that point, Cribl Stream/Edge continues retrying the request without increasing the time between retries any further.
If the sender (which manages the connection to the Destination) is at capacity, it will not accept any incoming events. These incoming events originate internally from a previous stage of the data flow when Destinations send outbound requests to their respective external services, and they include retry requests and new requests. Any events that were already in transit when the sender reached capacity will continue to be processed downstream.
Sender capacity is freed up when an outgoing request succeeds or encounters a non-retryable error. When the sender has available capacity again, it will resume accepting incoming events. This capacity management is influenced by the number of active connections and configured limits, such as concurrency and buffer sizes. If a Pipeline sends events faster than the Destination can process, the buffers may fill up, leading to backpressure and Sender at capacity warnings. This backpressure prevents the sender from accepting additional requests until capacity is restored.
By default, this Destination has no response codes configured for automatic retries. For each response code you want to add to the list, select Add Setting and configure the following settings:
- HTTP status code: A response code that indicates a failed request, for example 429 (Too Many Requests) or 503 (Service Unavailable).
- Pre-backoff interval (ms): The amount of time to wait before beginning retries, in milliseconds. Defaults to 1000 (one second).
- Backoff multiplier: The base for the exponential backoff algorithm. A value of 2 (the default) means that Cribl Stream/Edge will retry after 2 seconds, then 4 seconds, then 8 seconds, and so on.
- Backoff limit (ms): The maximum backoff interval Cribl Stream/Edge should apply for its final retry, in milliseconds. Default (and minimum) is 10,000 (10 seconds); maximum is 180,000 (180 seconds, or 3 minutes).
Retry timed-out HTTP requests: Toggle on to automatically retry requests that have timed out and display the following settings for configuring retry behavior:
- Pre-backoff interval (ms): The amount of time to wait before beginning retries, in milliseconds. Defaults to 1000 (one second).
- Backoff multiplier: The base for the exponential backoff algorithm. A value of 2 (the default) means that Cribl Stream/Edge will retry after 2 seconds, then 4 seconds, then 8 seconds, and so on.
- Backoff limit (ms): The maximum backoff interval Cribl Stream/Edge should apply for its final retry, in milliseconds. Default (and minimum) is 10,000 (10 seconds); maximum is 180,000 (180 seconds, or 3 minutes).
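To see how the three retry settings interact, you can compute the resulting wait times. The sketch below is one reading of the behavior described in this section, not the product's code: the first wait equals the Pre-backoff interval, each later wait is multiplied by the Backoff multiplier, and every wait is capped at the Backoff limit. The function names are hypothetical, and the Retry-After handling follows the 180-second cap stated above.

```python
# Illustrative only: model the retry waits described in the Retries section.
from typing import List, Optional

MAX_RETRY_AFTER_S = 180  # cap applied to Retry-After headers, per this section


def backoff_schedule_ms(retries: int, pre_backoff_ms: int = 1000,
                        multiplier: float = 2, limit_ms: int = 10_000) -> List[int]:
    """Waits before each retry: pre-backoff first, then exponential growth up to the limit."""
    return [int(min(pre_backoff_ms * multiplier ** n, limit_ms)) for n in range(retries)]


def wait_before_retry_ms(attempt: int, retry_after_s: Optional[float] = None,
                         honor_retry_after: bool = True) -> int:
    """Wait before retry number `attempt` (0-based), using default backoff settings."""
    if honor_retry_after and retry_after_s is not None:
        # A Retry-After header takes precedence, but is limited to 180 seconds.
        return int(min(retry_after_s, MAX_RETRY_AFTER_S) * 1000)
    return backoff_schedule_ms(attempt + 1)[-1]


print(backoff_schedule_ms(6))  # [1000, 2000, 4000, 8000, 10000, 10000]
```

With the defaults, the wait stops growing at the fifth retry, where the uncapped value of 16,000 ms exceeds the 10,000 ms limit.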
Advanced Settings
Validate server certs: Toggle on to reject certificates that are not authorized by a CA in the CA certificate path, nor by another trusted CA (for example, the system’s CA).
Round-robin DNS: Toggle on to enable round-robin DNS lookup across multiple IP addresses, IPv4 and IPv6. When a DNS server resolves a Fully Qualified Domain Name (FQDN) to multiple IP addresses, Cribl Edge will sequentially use each address in the order they are returned by the DNS server for subsequent connection attempts.
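As a rough illustration of the round-robin behavior, the sketch below cycles through a list of resolved addresses in the order received. The `connection_order` helper is hypothetical, and the addresses are reserved documentation values rather than real Datadog IPs. It performs no DNS lookup.

```python
# Illustrative only: rotate through resolved addresses for successive connections.
from itertools import cycle, islice


def connection_order(resolved_addresses, attempts: int):
    """Return the address used for each of the next `attempts` connections."""
    return list(islice(cycle(resolved_addresses), attempts))


resolved = ["192.0.2.10", "192.0.2.11", "2001:db8::1"]
print(connection_order(resolved, 4))
```

The fourth connection attempt wraps around to the first address in the list.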
Compress: Toggle on (default; recommended) to compress the payload body before sending.
Send counter metrics as ‘count’: Unless toggled on, Datadog might transform counter metrics to gauge. Learn more about Datadog metrics types.
Batch by tags: When toggled on (default), log events are batched using both the API key and Datadog tags to optimize logical grouping. However, high cardinality in Datadog tags (many unique tag values) can result in excessive concurrent requests and potential 408 errors from Datadog. Toggling this setting off changes the batching behavior to group log events solely by API key, reducing batch count and improving performance.
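The effect of tag cardinality on batch count can be sketched as follows. This is a simplified model of the grouping described above, not Cribl Edge's batching code. The `batch_logs` function and the `api_key` event field are hypothetical, and `ddtags` stands in for the event's Datadog tags.

```python
# Illustrative only: compare batch counts with Batch by tags on and off.
from collections import defaultdict


def batch_logs(events, batch_by_tags: bool = True) -> dict:
    """Group log events by (API key, tags) or by API key alone."""
    batches = defaultdict(list)
    for event in events:
        if batch_by_tags:
            key = (event["api_key"], event.get("ddtags", ""))
        else:
            key = (event["api_key"],)
        batches[key].append(event)
    return dict(batches)


events = [
    {"api_key": "key-a", "ddtags": "env:prod,pod:1", "_raw": "a"},
    {"api_key": "key-a", "ddtags": "env:prod,pod:2", "_raw": "b"},
    {"api_key": "key-a", "ddtags": "env:prod,pod:3", "_raw": "c"},
]
print(len(batch_logs(events, batch_by_tags=True)))   # 3 batches, one per unique tag set
print(len(batch_logs(events, batch_by_tags=False)))  # 1 batch for the single API key
```

A per-pod tag turns one batch into three in this example. At real-world cardinality, the same effect multiplies the number of concurrent requests.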
Request timeout: Amount of time (in seconds) to wait for a request to complete before aborting it. Defaults to 30.
Request concurrency: Maximum number of concurrent requests per Worker Process. When Cribl Edge hits this limit, it begins throttling traffic to the downstream service. Defaults to 5. Minimum: 1. Maximum: 32.
Body size limit (KB): Maximum size of the request body before compression. Defaults to 4096 KB. The actual request body size might exceed the specified value because the Destination adds bytes when it writes to the downstream receiver. Cribl recommends that you experiment with the Body size limit value until downstream receivers reliably accept all events.
Buffer memory limit (KB): Total amount of memory used to buffer outgoing requests waiting to be sent. If left blank, defaults to 5 times the max body size (if set). If 0, no limit is enforced. This provides granular control over the memory allocated for buffering outgoing batched requests. Increasing the limit allows batches to grow larger before being flushed, improving efficiency for data with high cardinality (a large number of unique batches). Finding the optimal balance between efficient data transfer and memory usage involves adjusting both the Buffer memory limit and Body size limit settings.
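The default for Buffer memory limit follows simple arithmetic: 5 times the Body size limit. The sketch below works through the cases described above. The `effective_buffer_limit_kb` function is hypothetical, and returning `None` to mean "no limit" (including when neither value is set) is an assumption made for illustration.

```python
# Illustrative only: resolve the effective buffer memory limit in KB.
from typing import Optional


def effective_buffer_limit_kb(buffer_limit_kb: Optional[int],
                              body_size_limit_kb: Optional[int] = 4096) -> Optional[int]:
    """Return the buffer limit in KB, or None when no limit is enforced."""
    if buffer_limit_kb is None:
        # Left blank: default to 5 times the body size limit, if that is set.
        return 5 * body_size_limit_kb if body_size_limit_kb is not None else None
    if buffer_limit_kb == 0:
        return None  # 0 means no limit is enforced
    return buffer_limit_kb


print(effective_buffer_limit_kb(None))  # 20480 (5 x 4096 KB)
```

With the default Body size limit of 4096 KB, a blank Buffer memory limit works out to 20,480 KB (20 MB).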
Events-per-request limit: Maximum number of events to include in the request body. The 0 default allows unlimited events.
Flush period (s): Maximum time between requests. Low values could cause the payload size to be smaller than its configured maximum. Defaults to 1.
Extra HTTP headers: Name-value pairs to pass as additional HTTP headers. Values will be sent encrypted.
Failed request logging mode: Use this drop-down to determine which data should be logged when a request fails. Select among None (the default), Payload, or Payload + Headers. With this last option, Cribl Edge will redact all headers, except non-sensitive headers that you declare below in Safe headers.
Safe headers: Add headers to declare them as safe to log in plaintext. (Sensitive headers such as authorization will always be redacted, even if listed here.) Use a tab or hard return to separate header names.
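The interaction between Payload + Headers logging and Safe headers can be modeled as a filter. The sketch below is illustrative only. The `redact_headers` function, the `[REDACTED]` placeholder, and the contents of `ALWAYS_REDACTED` are assumptions, apart from authorization, which the text above names as always redacted.

```python
# Illustrative only: redact every header except those declared safe.
ALWAYS_REDACTED = {"authorization"}  # sensitive headers stay redacted even if listed as safe


def redact_headers(headers: dict, safe_headers=()) -> dict:
    """Return a copy of headers in which only safe, non-sensitive values are visible."""
    safe = {name.lower() for name in safe_headers} - ALWAYS_REDACTED
    return {
        name: (value if name.lower() in safe else "[REDACTED]")
        for name, value in headers.items()
    }


headers = {
    "Authorization": "Bearer abc123",
    "Content-Type": "application/json",
    "X-Request-Id": "42",
}
print(redact_headers(headers, safe_headers=["content-type", "authorization"]))
```

Listing authorization as safe has no effect in this sketch, and any header not listed (here, X-Request-Id) is redacted by default.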
Environment: If you’re using GitOps, optionally use this field to specify a single Git branch on which to enable this configuration. If empty, the config will be enabled everywhere.
Internal Fields
Cribl Edge uses a set of internal fields to assist in forwarding data to a Destination.
If an event contains the internal field __criblMetrics, Cribl Edge will send it to Datadog as a metric event. Otherwise, Cribl Edge will send it as a log event.
You can use these fields to override outbound event values for log events:
- ddtags
- __service
- __severity
No internal fields are supported for metric events.
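The routing rule for internal fields can be summarized in a few lines. The sketch below is a simplified illustration. The `datadog_event_kind` and `log_overrides` helpers are hypothetical, and checking only for the presence of the __criblMetrics key is an assumption about what "contains" means.

```python
# Illustrative only: decide whether an event is sent as a metric or a log.
LOG_OVERRIDE_FIELDS = ("ddtags", "__service", "__severity")


def datadog_event_kind(event: dict) -> str:
    """Events carrying __criblMetrics go out as metrics; all others go out as logs."""
    return "metric" if "__criblMetrics" in event else "log"


def log_overrides(event: dict) -> dict:
    """Collect the internal fields that override outbound values for a log event."""
    if datadog_event_kind(event) != "log":
        return {}  # no internal fields are supported for metric events
    return {name: event[name] for name in LOG_OVERRIDE_FIELDS if name in event}


print(datadog_event_kind({"_raw": "login failed", "__severity": "warn"}))  # log
print(log_overrides({"_raw": "login failed", "__severity": "warn"}))       # {'__severity': 'warn'}
```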
Proxying Requests
If you need to proxy HTTP/S requests, see System Proxy Configuration.
Troubleshooting
The Destination’s configuration modal has helpful tabs for troubleshooting:
Live Data: Try capturing live data to see real-time events as they flow through the Destination. On the Live Data tab, click Start Capture to begin viewing real-time data.
Logs: Review and search the logs that provide detailed information about the delivery process, including any errors or warnings that may have occurred.
Test: Ensures that the Destination is correctly set up and reachable. Verify that sample events are sent correctly by clicking Run Test.
You can also view the Monitoring page that provides a comprehensive overview of data volume and rate, helping you identify delivery issues. Analyze the graphs showing events and bytes in/out over time.