About Destination Backpressure Triggers
This page describes the conditions that can trigger backpressure in Cribl Edge Destinations. Backpressure exists when a destination receives more data than it can currently send. It typically results from network connectivity problems or when a Destination is receiving data faster than it can forward that data to the downstream service.
When a Destination signals backpressure, it notifies the Sources generating data that its buffers are full and that the incoming data flow should pause until there is room for more data.
You can configure your desired behavior through a Destination’s Backpressure Behavior drop-down. Where other options are not displayed, Cribl Edge’s default behavior is Block. For details about all the above behaviors and options, see Persistent Queues.
HTTP-Based Destinations
HTTP-based Destinations trigger backpressure when they encounter errors while connecting or when they receive certain HTTP response codes. If the HTTP call returns a 500 series (500–599) HTTP code, the Destination adds the events in the buffer back to its internal buffer and tries to send them again. The retry attempt can also trigger backpressure, depending on the frequency at which this happens.
If the Destinaton receives a 400 series response code, it will drop (not retry) the events, because these response codes indicate an error in payload or formatting (for example). See this Mozilla page on the 400 Bad Request code for more information.
Finally, HTTP-based Destinations trigger backpressure when their processing loads exceed the specified limit. HTTP-based Destinations handle requests by creating a connection, sending the request payload, and receiving and parsing a response. Because this process can take time, HTTP Destinations allow more than one request to be in progress at a time. When the number of requests in flight exceeds the value set in Advanced Settings > Request concurrency, the Destination will trigger backpressure.
The following conditions will trigger backpressure:
- Connection failure.
- Connection or request timeouts.
- Data overload, such as when the Source sends more data than the Destination will accept.
- Reaching the limit on concurrent (in-flight) requests.
- HTTP 500series responses.
You can set the number of in-flight or concurrent requests in Advanced Settings > Request concurrency.
See Available Destinations for information about which Destinations are HTTP-based.
Filesystem-Based Destinations
Filesystem-based Destinations that send data to cloud-based services trigger backpressure when Cribl Edge can’t move a file or upload it to the cloud.
Filesystem-based Destinations queue events in files on disk in a staging directory. When a batch of events is ready for transmission, Cribl Edge closes the file, optionally compresses it, and transmits the file to the downstream service.
For the Filesystem Destination, the batch simply moves from the staging directory to the output directory.
For other filesystem-based Destinations, Cribl Edge uploads the file to a cloud service, such as AWS S3, Azure Blob Storage, or other. For these Destinations, a failure to upload the file will trigger backpressure.
These scenarios include:
- Permissions problem writing to the staging directory, such as when the user running the Cribl Edge process does not have permission to write to the staging directory. 
- The filesystem for the staging or output directory is full (impacts Filesystem Destination only). 
- The credentials provided for the cloud service are invalid. 
- The Destination doesn’t have the permissions required to upload to the cloud service. 
- Cloud service permissions issue, such as when an upload fails because the cloud user (identified in credentials) does not have write permissions. 
- Network connectivity – The Destination can’t communicate with the cloud service due to network connectivity issues. 
See Available Destinations for information about which Destinations are filesystem-based.
TCP-Based Destinations
TCP Sources establish a persistent connection with a Destination and send data. These Sources maintain connection to the Destination and will re-establish that connection if they detect a close or write timeout.
If the connection fails, Cribl Edge uses a backoff algorithm to avoid spiking the CPU. This backoff logic makes the first retry at 2 seconds, then gradually increases the time between retries up to a maximum of 60 seconds.
The following conditions can trigger backpressure from a TCP-based Destination:
- Connection failure. 
- Authentication failure. 
- Write timeout, where a Source sends data but doesn’t receive an - ackat the TCP layer. In this case, Cribl Edge closes the connection to the Destination and tries to re-establish it.
- The Destination receives data from the Source faster than it can send data over the network to the downstream service, causing internal buffers to fill up. This condition can occur due to a slow network connection or slow processing on the downstream service.When space in the internal buffers becomes available, backpressure is relieved, allowing the Source to send more data. 
See Available Destinations for information about which Destinations are TCP-based. Some Destinations use either TCP or UDP to send messages, but they trigger backpressure and use the backoff algorithm described above only when used with the TCP protocol.
Kafka
Kafka is a binary protocol that runs over TCP. The protocol defines three roles – producers, consumers, and brokers.
Producers send data to a Kafka topic (think of it as a message queue). Kafka consumers read data from Kafka topics. Brokers receive data from topic producers and send data to topic consumers, meaning they serve as go-betweens for producers and consumers.
Cribl Edge’s Kafka-based Destinations are producers. They send data to Kafka brokers. Since the protocol is TCP-based, the conditions that trigger backpressure are similar to those of Cribl Edge’s TCP Destinations.
The specifics for connection and request retries are defined by the Kafka JS library in conjunction with the configuration parameters defined for each Destination. In all cases, the Destinations have internal buffers that are sent to the Broker when they’re considered full (on a size and time basis).
The following conditions can trigger backpressure for Kafka-based Sources:
- Failure to connect to the broker, lack of network connectivity, or the broker is down.
- Invalid credentials – Destination connects successfully, but the connection is rejected due to a failure to authenticate.
- Invalid permissions – The credentials assigned do not have proper permissions to send events to the topic.
- The Destination is attempting to send data faster than the broker can receive it. Backpressure will be relieved when internal buffers have room for more events.
Kafka Retries and PQ
In some situations, Kafka Destinations may take a long time to engage the PQ (1 to 6 minutes by default).
This delay can result from the default settings used with the Kafka Destination. The relevant settings are all in the Advanced Settings UI:
- Connection timeout (ms): Defaults to 10000,
- Request timeout (ms): Defaults to 60000.
- Retry limit: Defaults to 5.
These default settings mean that Cribl Edge will retry five times after a failed connection, with a 10-second timeout between each attempt. It will also retry five files after a failed request, with a timeout of 60 seconds between each attempt.
For faster PQ engagement, update these values. You can use smaller values for Connection timeout and Request timeout to decrease the interval between retries. You can also use smaller values for Retry limit or even set it to 0 to eliminate retries altogether.
UDP
UDP is a connectionless protocol, so there is no concept of backpressure at the network layer. As a result, UDP Destinations will never trigger backpressure.
See Available Destinations for information about which Destinations support UDP. Except for SNMP Trap, these Destinations also support TCP, but the UDP versions do not trigger backpressure.
SQS
The SQS (Simple Queue Service) Destination uses the AWS SQS library to send events to the Destination’s configured queue. Like Kafka, this Destination acts as a producer that publishes messages to the provisioned SQS queue.
This Destination has five buffers with 10 events per buffer for a total of 50 events that can be queued before backpressure is enabled. The reason that backpressure might be generated for this Destination include:
- Invalid credentials.
- Invalid permissions.
- Queue does not exist and Create Queue is disabled.
- Lack of network connectivity.
Amazon Kinesis
Kinesis runs over HTTPS and behaves like Pub/Sub Destinations.
This Destination serializes and publishes events in a newline-delimited JSON format. In the context of the Kinesis Destination, it behaves as a producer where a Kinesis source is considered a consumer. Since the protocol is HTTP-based, the conditions that trigger backpressure are similar to those of other Cribl Edge HTTP Destinations.
The Kinesis library defines the specifics for connection and request retries in conjunction with the configuration parameters defined for each Destination. In all cases, the Destinations have internal buffers that are sent to the Broker when they’re considered full (on a size and time basis).
The following conditions can trigger backpressure for Kinesis-based Destinations:
- Failure to connect to the Kinesis service, network connectivity, or the service is down.
- Invalid credentials – Connects successfully but connection is rejected due to a failure to authenticate.
- Invalid permissions – The credentials assigned do not have proper permissions to send events to the Kinesis stream.
- The Destination is attempting to send data faster than the service can receive it. Backpressure will be relieved when internal buffers have room for more events.
Google Cloud Pub/Sub
Google Cloud Pub/Sub is similar to Kafka and Kinesis in that there are producers that send data and consumers that receive data. The Cribl Edge Google Cloud Pub/Sub Destination is a producer that sends data to the Google Cloud Pub/Sub service. Google Cloud Pub/Sub leverages gRCP, however, which uses protocol buffers as both its interface definition language (IDL) and as its underlying message interchange format.
The conditions that trigger backpressure for a Google Cloud Pub/Sub Destination are:
- Failure to connect to the Google Pub/Sub service, network connectivity, or the service is down.
- Invalid credentials – Connects successfully but connection is rejected due to a failure to authenticate.
- Invalid permissions – The credentials assigned do not have proper permissions to send events to the topic.
- The Destination is attempting to send data faster than the service can receive it. Backpressure will be relieved when internal buffers have room for more events.
OpenTelemetry
OpenTelemetry supports gRPC or HTTP to send messages. For the gRPC protocol, the same rules described for other TCP Destinations generally apply to OpenTelemetry. The key difference is that this Destination uses the OpenTelemetry library, which handles write timeouts and reconnection logic.
For the HTTP protocol, the same rules apply to our HTTP Destinations for retries based on connectivity or 500 series response codes.
Load-Balanced Destinations
HTTP-Based Load-Balanced Destinations
Some HTTP-based Destinations support two modes – single host and multiple load-balanced hosts.
With a load-balanced version of an HTTP Destination, Cribl Edge spreads events across all configured Destinations, using a weighted algorithm. Under normal circumstances, this algorithm spreads the load across the configured Destinations according to the weight assigned to the Destination.
When one of the load-balanced Destinations is out of service (due to a write
timeout or 500 series response code - buffers full), Cribl Edge removes that Destination from the
list of hosts and moves any in-flight buffers to the next available host.
Cribl Edge will try to resend events to the downed host in the background, and
it will resume sending data to that host once the connection is restored.
See Available Destinations for information about which HTTP-based Destinations support load balancing.
TCP Load-Balanced Destinations
Some TCP-based Destinations support two modes – single host and multiple load-balanced hosts.
With a load-balanced TCP Destination, Cribl Edge spreads events across all configured Destinations using a weighted algorithm. Under normal circumstances, this algorithm spreads the load across the configured Destinations according to the weight assigned to each Destination.
When one of the load-balanced Destinations is out of service (due to a write timeout or connection failure), Cribl Edge removes that Destination from the list of hosts and moves any in-flight buffers to the next available host. Cribl Edge will try to reconnect to the downed host and will resume sending data to that host once the connection is restored.
TCP load-balanced Destinations can be configured with one or multiple receivers. If one or more receivers go down, Cribl Edge will continue sending data to any healthy receivers.
See Available Destinations for information about which TCP Destinations support load balancing.
Output Router
The Output Router is a wrapper destination that groups together 1..N Destinations. It’s used in conjunction with event filters to route data to a selected Destination. The Output Router Destination emits backpressure only when one of its downstream Destinations is blocking.