Home / Stream/ Sources/ Internal·Cribl Stream (Deprecated)

Cribl Stream (Deprecated)

This Source was deprecated as of Cribl Stream 3.5, and has been removed as of v.4.0. Please instead use the Cribl TCP or the Cribl HTTP Source to enable Worker Nodes to send data to peer Nodes.

The Cribl Stream internal Source is available only in distributed deployments. It is provided to facilitate sending data between Worker Nodes that are connected to the same Leader. You’ll find this Source especially valuable in a hybrid Cloud deployment.

This Source can receive data only from a TCP JSON Destination, on a Worker Node that is connected to the same Leader. The following instructions cover configuring such a TCP JSON Destination.

Type: Internal | TLS Support: YES | Event Breaker Support: No

Use New Sources Instead

The replacement Sources, Cribl TCP and the Cribl HTTP, don’t rely on IP filtering, for the following reasons:

  • Load balancers and/or proxies between the Cribl Destination and Cribl Source can change the IP address, resulting in a bad match and rejected ingest.

  • A Lookup table of all IP addresses needed to be sent to each Worker Node/Edge Node from the Leader, which is not scalable.

  • The Lookup table of IP addresses required constant communication between the Worker Node/Edge Nodes and the Leader, making this fragile and placing an arbitrary reliance on the Leader that shouldn’t be there.

How It Works

You can use the Cribl Stream Source to send data between Workers. In a hybrid Cribl.Cloud deployment, using this Source ensures that you’re billed for ingress only once – when the data is originally received. All other data transferred into Workers via the Cribl Stream Source is not charged.

This use case is common in deployments where a customer-managed (on-prem) Worker sends data to a Worker in Cribl Cloud, for additional processing and then routing to Destinations.

As one usage example, assume that you want to send data from one Worker Node/Edge Node deployed on-prem, to another that is deployed in Cribl Cloud. You could do the following:

  • Create an on-prem File System Collector (or whatever Collector or Source is suitable) for the data you want to send to Cribl Cloud.
  • Create an on-prem TCP JSON Destination.
  • Create a “Cribl Stream” Source, on the target Worker Group/Fleet in Cribl Cloud.
  • Configure these to send data from the File System Collector to the TCP JSON Destination, and from there to the “Cribl Stream” Source in Cribl Cloud.

The key points about configuring this architecture are:

  • The TCP JSON Destination must be on a Worker Node/Edge Node that is connected to the same Leader as the “Cribl Stream” Source.
  • The TCP JSON Destination’s Address and Port must match the “Cribl Stream” Source’s Address and Port.

Configuring the Cribl Stream Source

In the QuickConnect UI: Click New Source or Add Source. From the resulting drawer’s tiles, select [System and Internal >] Cribl Stream. Next, click either Add Destination or (if displayed) Select Existing. The drawer will now provide the following options and fields.

Or, in the Data Routes UI: From the top nav of a Cribl Stream instance or Group, select Data > Sources. From the resulting page’s tiles or the Sources left nav, select [System and Internal >] Cribl Stream. Next, click New Source to open a Cribl Stream > New Source modal that provides the following options and fields.

General Settings

Input ID: Enter a unique name to identify this Cribl Stream Source definition.

Address: Enter hostname/IP to listen for TCP JSON data. E.g., localhost or 0.0.0.0.

Port: Enter the port number to listen on.

Authentication Settings

Use the Authentication method buttons to select one of these options:

  • Manual: Use this default option to enter the shared secret that clients must provide in the authToken header field. Exposes an Auth token field for this purpose. (If left blank, unauthenticated access will be permitted.) A Generate link is available if you need a new secret.

  • Secret: This option exposes an Auth token (text secret) drop-down, in which you can select a stored secret that references the authToken header field value described above. The secret can reside in Cribl Stream’s internal secrets manager or (if enabled) in an external KMS. A Create link is available if you need a new secret.

Optional Settings

Tags: Optionally, add tags that you can use for filtering and grouping in the Cribl Stream UI. Use a tab or hard return between (arbitrary) tag names. These tags aren’t added to processed events.

TLS Settings (Server Side)

Enabled defaults to No. When toggled to Yes:

Certificate name: Name of the predefined certificate.

Private key path: Server path containing the private key (in PEM format) to use. Path can reference $ENV_VARS.

Passphrase: Passphrase to use to decrypt private key.

Certificate path: Server path containing certificates (in PEM format) to use. Path can reference $ENV_VARS.

CA certificate path: Server path containing CA certificates (in PEM format) to use. Path can reference $ENV_VARS.

Authenticate client (mutual auth): Require clients to present their certificates. Used to perform mutual authentication using SSL certs. Defaults to No. When toggled to Yes:

  • Validate client certs: Reject certificates that are not authorized by a CA in the CA certificate path, or by another trusted CA (e.g., the system’s CA). Defaults to No.

  • Common name: Regex matching subject common names in peer certificates allowed to connect. Defaults to .*. Matches on the substring after CN=. As needed, escape regex tokens to match literal characters. E.g., to match the subject CN=worker.cribl.local, you would enter: worker\.cribl\.local.

Minimum TLS version: Optionally, select the minimum TLS version to accept from connections.

Maximum TLS version: Optionally, select the maximum TLS version to accept from connections.

Persistent Queue Settings

In this section, you can optionally specify persistent queue storage, using the following controls. This will buffer and preserve incoming events when a downstream Destination is down, or exhibiting backpressure.

On Cribl-managed Cribl.Cloud Workers (with an Enterprise plan), this tab exposes only the Enable Persistent Queue toggle. If enabled, PQ is automatically configured in Always On mode, with a maximum queue size of 1 GB disk space allocated per PQ‑enabled Source, per Worker Process.

The 1 GB limit is on uncompressed inbound data, and no compression is applied to the queue. This limit is not configurable. For configurable queue size, compression, mode, and other options below, use a hybrid Group.

Enable Persistent Queue: Defaults to No. When toggled to Yes:

Mode: Select a condition for engaging persistent queues.

  • Always On: This default option will always write events to the persistent queue, before forwarding them to Cribl Stream’s data processing engine.
  • Smart: This option will engage PQ only when the Source detects backpressure from Cribl Stream’s data processing engine.

Max buffer size: The maximum number of events to hold in memory before reporting backpressure to the sender and writing the queue to disk. Defaults to 1000. (This buffer is per connection, not just per Worker Process – and this can dramatically expand memory usage.)

Commit frequency: The number of events to send downstream before committing that Stream has read them. Defaults to 42.

Max file size: The maximum data volume to store in each queue file before closing it and (optionally) applying the configured Compression. Enter a numeral with units of KB, MB, etc. If not specified, Cribl Stream applies the default 1 MB.

Max queue size: The maximum amount of disk space that the queue is allowed to consume on each Worker Process. Once this limit is reached, this Source will stop queueing data and block incoming data. Required, and defaults to 5 GB. Accepts positive numbers with units of KB, MB, GB, etc. Can be set as high as 1 TB, unless you’ve configured a different Max PQ size per Worker Process in Group Settings.

Queue file path: The location for the persistent queue files. Defaults to $CRIBL_HOME/state/queues. To this field’s specified path, Cribl Stream will append /<worker-id>/inputs/<input-id>.

Compression: Optional codec to compress the persisted data after a file is closed. Defaults to None; Gzip is also available.

In Cribl Stream 4.1 and later, Source-side PQ’s default Mode is Always on, to best ensure events’ delivery. For details on optimizing this selection, see Always On versus Smart Mode.

You can optimize Workers’ startup connections and CPU load at Group Settings > Worker Processes.

Processing Settings

Fields

In this section, you can add Fields to each event, using Eval-like functionality.

Name: Field name.

Value: JavaScript expression to compute field’s value, enclosed in quotes or backticks. (Can evaluate to a constant.)

Pre-Processing

In this section’s Pipeline drop-down list, you can select a single existing Pipeline to process data from this input before the data is sent through the Routes.

Advanced Settings

Enable Proxy Protocol: Toggle to Yes if the connection is proxied by a device that supports Proxy Protocol v1 or v2.

IP Allowlist Regex: Regex matching IP addresses that are allowed to establish a connection. Although it appears in the Advanced Settings UI, this is not a user-configurable setting. Its value is updated automatically to match the IP addresses used in the deployment, and cannot be edited.

Max Active Connections: Maximum number of active connections allowed per Worker Process. Use 0 for unlimited.

Environment: If you’re using GitOps, optionally use this field to specify a single Git branch on which to enable this configuration. If empty, the config will be enabled everywhere.

Connected Destinations

Select Send to Routes to enable conditional routing, filtering, and cloning of this Source’s data via the Routing table.

Select QuickConnect to send this Source’s data to one or more Destinations via independent, direct connections.

Internal Fields

Cribl Stream uses a set of internal fields to assist in handling of data. These “meta” fields are not part of an event, but they are accessible, and Functions can use them to make processing decisions.

Field for this Source:

  • __inputId
  • __srcIpPort