Backpressure Impacts on Sources
When a Destination experiences backpressure, Cribl Stream must manage the incoming data flow to prevent overwhelming the system. If persistent queue (PQ) is not enabled, backpressure propagates back through the data Pipeline to the Sources, which can impact data flow in different ways depending on the Source type.
This page explains how backpressure affects Sources when PQ is not in use and the Backpressure behavior is set to Block. For information about backpressure behavior options, see Data Delivery to Unreachable Destinations.
How Backpressure Propagates Without PQ
When a Destination signals backpressure and PQ is not enabled, the Cribl Stream event processors stop accepting new data. This creates a chain reaction that propagates back through the Pipeline:
- The Destination's internal buffers fill up.
- The event processors stop reading from Source buffers.
- Source buffers fill up.
- The Source signals backpressure to upstream senders (when supported).
The speed at which this propagation occurs depends on buffer sizes throughout the Pipeline. A brief backpressure event might not immediately affect all Sources, but sustained backpressure will eventually impact all connected senders. For Sources that support backpressure signaling, senders are blocked and data queues on the sender side. For Sources that do not support backpressure signaling (such as UDP-based Sources), data is dropped.
The Destination does not directly notify individual Sources about backpressure. Instead, for Sources that support backpressure signaling, backpressure propagates indirectly when the Cribl Stream event processors stop reading from Source buffers, causing those buffers to fill. The specific mechanism varies by Source type. See the sections below for details.
The following sections describe how this backpressure propagation affects different Source types when PQ is not enabled.
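The buffer-filling chain described above can be sketched with bounded queues. This is an illustrative model, not Cribl Stream code: each stage is a small in-memory buffer, and once the Destination stops draining, failed hand-offs propagate stage by stage back to the Source.

```python
import queue

# Illustrative model (not Cribl Stream internals): each Pipeline stage is a
# bounded buffer. When the Destination stops draining, the event processor
# can no longer forward events, and backpressure reaches the Source.
dest_buffer = queue.Queue(maxsize=2)    # Destination's internal buffer
source_buffer = queue.Queue(maxsize=2)  # Source-side buffer

def process_one():
    """Event processor: move one event from the Source buffer to the Destination buffer."""
    event = source_buffer.get_nowait()
    try:
        dest_buffer.put_nowait(event)  # raises queue.Full under backpressure
    except queue.Full:
        source_buffer.put(event)  # can't forward: stop reading from the Source
        raise

# Destination stops draining: its buffer fills.
for i in range(2):
    dest_buffer.put_nowait(f"stuck-{i}")

# The Source keeps receiving events until its buffer is full.
for i in range(2):
    source_buffer.put_nowait(f"event-{i}")

# The event processor can no longer forward events downstream...
try:
    process_one()
except queue.Full:
    pass  # backpressure: Destination buffer is full

# ...so the Source buffer stays full, and new sends are rejected or blocked.
blocked = False
try:
    source_buffer.put_nowait("new-event")
except queue.Full:
    blocked = True
print(blocked)  # True: backpressure has propagated to the Source
```

In the real system the buffers are much larger, which is why a brief backpressure event may never reach the Source at all.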
TCP-Based Sources
For TCP-based Sources (such as Splunk TCP, Syslog over TCP, TCP JSON, and TCP Raw), backpressure propagates through the following mechanism:
- When the Destination buffers fill, the Cribl Stream event processor stops reading from the TCP socket buffer.
- When the TCP socket buffer fills, the TCP/IP stack notifies the sender to stop sending data.
- The sender pauses transmission until the socket buffer has capacity again.
This mechanism relies on TCP flow control, which is built into the protocol. When Cribl Stream stops acknowledging received data, the TCP stack on the sender automatically slows or pauses transmission. During this time, the sender’s outbound data queues in its local TCP buffer. When Cribl Stream resumes reading from the socket (because the Destination has capacity again), the TCP stack automatically resumes transmission and the sender’s buffer drains.
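You can observe this flow control directly with two loopback sockets. The sketch below is not Cribl Stream code: the receiver stands in for a Cribl Stream Source that has stopped reading, and the sender is made non-blocking so that a full buffer raises an error instead of hanging. Buffer sizes and the safety cap are arbitrary illustrative values.

```python
import socket

# Illustrative sketch of TCP flow control on loopback (not Cribl Stream code).
# The receiver accepts a connection but never reads, so the kernel socket
# buffers fill and the sender's send() eventually signals backpressure.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 4096)  # small buffer
listener.bind(("127.0.0.1", 0))
listener.listen(1)

sender = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sender.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, 4096)  # small buffer
sender.connect(listener.getsockname())
sender.setblocking(False)  # a full buffer raises instead of blocking the thread

conn, _ = listener.accept()  # receiver never calls conn.recv()

sent = 0
backpressured = False
chunk = b"x" * 4096
try:
    while sent < 64 * 1024 * 1024:  # safety cap; we expect to stop long before this
        sent += sender.send(chunk)
except BlockingIOError:
    backpressured = True  # the kernel told the sender to stop: TCP flow control

print(backpressured, sent > 0)

sender.close()
conn.close()
listener.close()
```

With a blocking socket (the usual case for log senders), the same condition simply pauses the sending thread until Cribl Stream reads from the socket again.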
Per-Connection Behavior
When the TCP socket buffer fills, the TCP/IP stack signals the sender to stop sending data. Because this occurs at the individual socket level, some senders may experience blocking sooner than others, even if they are sending data to the same Destination.
This means:
- Different senders connected to the same Source may experience backpressure at different times.
- Senders transmitting higher volumes of data will fill their socket buffers faster and experience backpressure sooner.
- Senders with slower network connections may have smaller effective buffer capacities and experience backpressure sooner.
Only TCP-based Sources support this flow control mechanism. UDP-based Sources (such as UDP Raw and SNMP Trap) do not have built-in backpressure support because UDP is a connectionless protocol. For UDP Sources, Cribl Stream drops incoming data when buffers are full.
Pull-Based Sources
For pull-based Sources, Cribl Stream detects backpressure internally and pauses data retrieval. While backpressure persists, pull-based Sources stop or slow their data collection activities.
The following pull-based Sources are affected:
| Source Type | Backpressure Behavior |
|---|---|
| S3, Azure Blob Storage, and other object storage Sources | File downloads pause when backpressure is detected. |
| Database Collectors | Results streaming pauses during backpressure. |
| REST Collectors | GET requests pause during backpressure. |
| Kafka, Confluent Cloud, Amazon MSK | Polling from Kafka-based Sources stops during backpressure. |
| Amazon Kinesis Data Streams | Stream reads pause during backpressure. |
| Amazon SQS, Google Cloud Pub/Sub | Message retrieval pauses during backpressure. |
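The pause-on-backpressure pattern shared by the Sources in the table can be sketched as a polling loop that only fetches when the downstream Pipeline has capacity. This is a hypothetical model, not Cribl Stream internals; `fetch_batch` stands in for whatever retrieval call the Source makes (an S3 download, a database result page, a Kafka poll).

```python
import queue

# Illustrative model (hypothetical, not Cribl Stream internals): a pull-based
# collector fetches the next batch only when the downstream buffer has room,
# pausing retrieval instead of buffering unboundedly.
pipeline = queue.Queue(maxsize=3)  # bounded downstream buffer

def fetch_batch(n):
    """Stand-in for an S3 download, a DB result page, or a Kafka poll."""
    return [f"record-{i}" for i in range(n)]

pulled, paused = 0, 0
for _ in range(10):  # the collector's poll loop
    if pipeline.full():
        paused += 1       # backpressure detected: skip this poll cycle
        pipeline.get()    # simulate the downstream slowly draining one event
        continue
    for record in fetch_batch(1):
        pipeline.put_nowait(record)
        pulled += 1

print(pulled, paused)
```

The key property is that the external system (object store, database, broker) holds the unretrieved data while the collector is paused, which is also the origin of the side effects described below.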
Side Effects of Pausing Pull-Based Sources
While pausing data collection prevents overwhelming the Pipeline, it can introduce side effects.
File Download Timeouts
Backpressure causes file downloads from object storage (such as S3 or Azure Blob Storage) to pause. If backpressure persists for too long, the download may time out. This can result in:
- Duplicate data: If Cribl Stream retries the download after a timeout, you may ingest the same data twice.
- Data loss: If the source data is purged or expires before Cribl Stream can recollect it, that data is lost.
To mitigate this risk, consider increasing the Socket timeout setting in Advanced Settings for the Source, or enable persistent queue to prevent prolonged backpressure events.
Database Resource Consumption
When Database Collectors pause results streaming due to backpressure, the database query remains active on the database server. This means:
- The database server continues to consume resources (CPU, memory, connections) for the paused query.
- Database connections remain held, which can exhaust the connection pool and block other database operations.
- The Database Collector runs on only one Worker Process at a time, so if that process is blocked, the entire Database Collector job run is affected.
For long-running database queries, monitor your database server resources and connection pool usage during backpressure events.
Kafka Consumer Group Rebalancing
When Kafka-based Sources pause polling for extended periods, the Kafka broker may consider the consumer inactive and trigger a consumer group rebalance. This can cause:
- Temporary disruption to other consumers in the same consumer group.
- Potential duplicate processing when the consumer rejoins, because partitions are reassigned and events after the last committed offset are re-read.
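One common mitigation on the Kafka side is to give consumers more headroom between polls. The properties below are standard Kafka consumer settings (the values are illustrative, and whether they are exposed depends on your Source's configuration options):

```properties
# Standard Kafka consumer properties (values are illustrative).
# Allow longer gaps between polls before the broker evicts the consumer
# and triggers a consumer group rebalance.
max.poll.interval.ms=600000   # default is 300000 (5 minutes)
# Heartbeats are sent from a background thread; session.timeout.ms governs
# liveness detection independently of poll frequency.
session.timeout.ms=45000
```

Raising `max.poll.interval.ms` trades faster failure detection for tolerance of longer backpressure pauses.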
Impact on Multi-Destination Routing
When a sender is blocked due to backpressure from one Destination, it affects data delivery to all Destinations receiving data from that sender, even Destinations that are not experiencing backpressure.
This occurs because event cloning for multi-Destination routing happens after data passes through the Routes. If a sender is blocked:
- Data from that sender cannot flow through the Routes.
- The Clone Function or Output Router cannot create copies for other Destinations.
- All Destinations expecting data from that sender stop receiving data.
Example Scenario
Consider a configuration where:
- A Syslog Source sends data to two Destinations via an Output Router.
- Destination A experiences backpressure.
- Destination B is healthy and can accept data.
When Destination A blocks and the sender is affected:
- No new data from that sender reaches Destination B either.
- This occurs even though Destination B has capacity.
- The block persists until backpressure on Destination A resolves.
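The scenario above can be sketched with two bounded buffers. This is an illustrative model, not Cribl Stream code: cloning happens in one step, so delivery is all-or-nothing, and a full buffer for Destination A starves a healthy Destination B.

```python
import queue

# Illustrative model (not Cribl Stream internals): cloning for two
# Destinations happens after the Routes, so a full buffer for Destination A
# also stops deliveries to a healthy Destination B.
dest_a = queue.Queue(maxsize=1)    # backpressured: full and not draining
dest_b = queue.Queue(maxsize=100)  # healthy: plenty of capacity

dest_a.put_nowait("stuck-event")  # Destination A stops accepting data

def route(event):
    """Clone the event to every Destination; block if any buffer is full."""
    if any(d.full() for d in (dest_a, dest_b)):
        raise queue.Full  # can't clone until every Destination has room
    for d in (dest_a, dest_b):
        d.put_nowait(event)

try:
    route("syslog-event")
except queue.Full:
    pass  # the sender is blocked; B receives nothing new either

print(dest_b.qsize())  # 0: the healthy Destination B is starved too
```

With the `Drop` behavior (or PQ) configured on Destination A, the clone for Destination B would be delivered instead of being held back.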
To prevent one Destination from blocking all data flow, set the Backpressure behavior of non-critical Destinations to Drop, or enable persistent queue.