LogStream's persistent queuing (PQ) feature helps minimize data loss if a downstream receiver is unreachable. PQ provides durability by writing data to disk for the duration of the outage, and forwarding it upon recovery.
Each LogStream output has an in-memory queue that helps it absorb temporary imbalances between inbound and outbound data rates. For example, during an inbound burst of data, the output stores events in the queue and then sends them at a rate the receiver can keep up with (as opposed to blocking or dropping them). Only when this queue is full will the output impose backpressure upstream.
Backpressure behavior can be configured to one of Block, Drop Events, or (on Destinations that support it) Persistent Queue. In Block mode, the output will refuse to accept new data until the receiver is ready.
The system will propagate block "signals" all the way back to the sender (assuming that the sender supports backpressure, too). In general, TCP-based senders support backpressure, but this is not guaranteed: each upstream application's developer is responsible for ensuring that the application stops sending data once LogStream stops sending TCP acknowledgments back to it.
In Drop mode, the Destination will discard new events until the receiver is ready. In some environments, the in-memory queues and their block/drop behavior are acceptable.
Persistent queues serve environments where more durability is required (e.g., outages last longer than memory queues can sustain), or where upstream senders do not support backpressure (e.g., ephemeral/network senders).
Engaging persistent queues in these scenarios can help minimize data loss. Once the in-memory queue is full, the LogStream Destination will write its data to disk. Then, when the receiver is ready, the output will start draining the queues in FIFO (first in, first out) fashion.
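The three backpressure behaviors described above can be illustrated with a toy model. This is a minimal sketch, not LogStream's implementation: the names `Backpressure`, `OutputSketch`, `accept`, and `drain` are hypothetical, and the "disk" is simulated with a second in-memory deque purely to show the ordering.

```python
from collections import deque
from enum import Enum


class Backpressure(Enum):
    BLOCK = "block"
    DROP = "drop"
    PERSISTENT_QUEUE = "pq"


class OutputSketch:
    """Toy model of one output (hypothetical, for illustration only)."""

    def __init__(self, mode, memory_capacity):
        self.mode = mode
        self.memory_capacity = memory_capacity
        self.memory = deque()  # in-memory queue
        self.disk = deque()    # stands in for on-disk queue files
        self.dropped = 0

    def accept(self, event):
        """Return what happened to the event."""
        # Once spilling to disk has started, keep appending to disk so
        # that newer events never overtake older ones.
        if not self.disk and len(self.memory) < self.memory_capacity:
            self.memory.append(event)
            return "queued"
        if self.mode is Backpressure.BLOCK:
            return "blocked"       # sender must wait (backpressure)
        if self.mode is Backpressure.DROP:
            self.dropped += 1      # new event is discarded
            return "dropped"
        self.disk.append(event)    # persistent queue absorbs the overflow
        return "persisted"

    def drain(self, n):
        """Receiver is ready: deliver up to n events, oldest first (FIFO)."""
        delivered = []
        while len(delivered) < n and (self.memory or self.disk):
            # Everything in memory is older than anything on disk.
            source = self.memory if self.memory else self.disk
            delivered.append(source.popleft())
        return delivered
```

In this sketch, an output in `PERSISTENT_QUEUE` mode never returns `"blocked"` or `"dropped"`; the real feature is additionally bounded by disk limits, which the toy model ignores.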
Persistent queues are:
- Available on the output side (i.e., after processing).
- Engaged only when all of that output's receivers exert blocking.
- Drained when at least one receiver can accept data.
- Not infinite in size. I.e., if data cannot be delivered, you might run out of disk space.
- Not able to fully protect in cases of application failure. E.g., in-memory data might get lost if a crash occurs.
- Not able to protect in cases of hardware failure. E.g., disk failure, corruption, or machine/host loss.
- TLS-encrypted only for data in flight, and only on Destinations where TLS is supported and enabled. To encrypt data at rest, including disk writes/reads, you must configure encryption on the underlying storage volume(s).
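The engage and drain conditions in the list above amount to an "all" versus "any" test over an output's receivers. The following is a hedged sketch of that logic only; the function names and the list-of-booleans representation are assumptions made for illustration.

```python
def pq_engaged(receiver_blocked):
    """True only when every receiver of the output is exerting blocking.

    receiver_blocked: list of booleans, one per receiver (hypothetical shape).
    """
    return len(receiver_blocked) > 0 and all(receiver_blocked)


def pq_can_drain(receiver_blocked):
    """True when at least one receiver can accept data."""
    return any(not blocked for blocked in receiver_blocked)
```

For a load-balanced output with two receivers, `pq_engaged([True, False])` is `False`: one healthy receiver is enough to keep the queue from engaging.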
The following LogStream Destinations support Persistent Queuing:
- Splunk Single Instance
- Splunk Load Balanced
- Splunk HEC
- CloudWatch Logs
- Azure Monitor Logs
- Azure Event Hubs
- StatsD Extended
- TCP JSON
Persistent queuing is configured individually for each output that supports it. To enable persistent queuing, go to the output's (Destination's) configuration page and set the Backpressure Behavior control to Persistent Queueing. This exposes the following additional controls:
Max file size: The maximum size to store in each queue file before closing it. Enter a numeral with units of KB, MB, etc. Defaults to
Max queue size: The maximum amount of disk space that the queue is allowed to consume, on each Worker Process. Once this limit is reached, queueing is stopped, and data blocking is applied. Enter a numeral with units of KB, MB, etc.
Queue file path: The location for the persistent queue files. This will be of the form:
your/path/here/<worker-id>/<output-id>. Defaults to
Compression: Codec to use to compress the persisted data, once a file is closed. Defaults to
Gzip is also available.
Queue-full behavior: Determines whether to block or drop events when the queue is exerting backpressure (because disk is low or at full capacity). Block is the same behavior as non-PQ blocking, corresponding to the Block option on the Backpressure behavior drop-down. Drop new data throws away incoming data, while leaving the contents of the PQ unchanged.
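To make the interaction between Max queue size and Queue-full behavior concrete, here is a minimal sketch. The helpers `parse_size` and `on_new_event` are hypothetical, the returned strings are illustrative labels rather than LogStream configuration values, and treating 1 KB as 1024 bytes is an assumption; the source does not say whether units are binary or decimal.

```python
import re

# Assumption: binary multiples (1 KB = 1024 bytes).
_UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}


def parse_size(text):
    """Convert a string such as '500KB' or '1 MB' to a byte count."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([KMGT]?B)\s*", text, re.IGNORECASE)
    if not match:
        raise ValueError(f"unrecognized size: {text!r}")
    number, unit = match.groups()
    return int(float(number) * _UNITS[unit.upper()])


def on_new_event(event_bytes, queue_bytes, max_queue_size, queue_full_behavior):
    """Decide what happens to a new event once the PQ is engaged.

    queue_full_behavior: 'block' or 'drop' (illustrative values).
    """
    if queue_bytes + event_bytes <= parse_size(max_queue_size):
        return "write_to_queue"
    # The queue is full: fall back to the configured behavior.
    return "block" if queue_full_behavior == "block" else "drop_new_data"
```

For example, `on_new_event(100, 1000, "1KB", "drop")` returns `"drop_new_data"`, because writing the event would push the queue past 1024 bytes; the events already in the queue are left untouched.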
Minimum Free Disk Space
For queuing to operate properly, you must provide sufficient disk space. You configure the minimum free disk space in global ⚙️ Settings (lower left) > General Settings > Limits > Min Free Disk Space. If available disk space falls below this threshold, LogStream will stop maintaining persistent queues, and data loss will begin. The default minimum is 5 GB. In distributed mode, be sure to set this on your Worker Nodes (rather than on the Leader Node).
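If you want to monitor a Worker Node's headroom against this threshold from outside LogStream, a check along the following lines is one option. This is a sketch under stated assumptions: `has_enough_free_space` is a hypothetical helper, the path you pass should be whatever volume holds your queue files, and interpreting 5 GB as 5 × 1024³ bytes is an assumption.

```python
import shutil

# The 5 GB default mentioned above; binary gigabytes are an assumption.
MIN_FREE_BYTES = 5 * 1024**3


def has_enough_free_space(path, min_free_bytes=MIN_FREE_BYTES):
    """Return True if the volume containing `path` has at least
    `min_free_bytes` of free space."""
    return shutil.disk_usage(path).free >= min_free_bytes
```

Calling `has_enough_free_space("/path/to/queue/volume")` on a schedule and alerting on `False` gives early warning before queuing stops.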