Azure Blob Storage Destination
Cribl Stream supports sending data to (and replaying specific events from) both Azure Blob Storage and Azure Data Lake Storage Gen2, which implements a hierarchical namespace over blob data. This Destination can deliver data to Azure whether Cribl Stream is running on Azure, another cloud platform, or on-prem.
Type: Non-Streaming | TLS Support: Yes | PQ Support: No
For configuration examples, see Resources below.
Configure Cribl Stream to Output to Azure Blob Storage
- On the top bar, select Products, and then select Cribl Stream. Under Worker Groups, select a Worker Group. Next, you have two options:
- To configure via QuickConnect, navigate to Routing > QuickConnect (Stream) or Collect (Edge). Select Add Destination and select the Destination you want from the list, choosing either Select Existing or Add New.
- To configure via Routes, select Data > Destinations or More > Destinations (Edge). Select the Destination you want. Next, select Add Destination.
- In the New Destination modal, configure the following under General Settings:
- Output ID: Enter a unique name to identify this Destination definition.
- Description: Optionally, enter a description.
- Container name: Specify the Azure Blob Storage container name. Containers organize blobs, similar to directories in a file system. Container names can include only lowercase letters, numbers, and hyphens.
- For dynamic container names, provide a JavaScript expression within quotes or backticks, for example, `myContainer-${C.env["CRIBL_WORKER_ID"]}`. This expression is evaluated at initialization, resolving environment or runtime variables. Only simple concatenations and references to global variables are supported.
- Blob prefix: Root directory to prepend to the path before uploading.
- Staging location: Local filesystem location in which to buffer files before compressing and moving them to the final destination. Cribl recommends that this location be stable and high-performance.
In Cribl Stream, the Staging location field is not displayed or available on Cribl.Cloud-managed Worker Groups.
- Data format: The output data format defaults to `JSON`. `Raw` and `Parquet` are also available. Selecting `Parquet` (supported only on Linux, not Windows) exposes a Parquet Settings left tab, where you must configure certain options in order to export data in Parquet format.
- Under Authentication, select an Authentication method from the dropdown:
- Manual: Use this default option to enter your Azure Storage connection string directly. Exposes a Connection string field for this purpose. (If left blank, Cribl Stream will fall back to `env.AZURE_STORAGE_CONNECTION_STRING`.)
- Secret: This option exposes a Connection string (text secret) drop-down, in which you can select a stored secret that references an Azure Storage connection string. A Create link is available to store a new, reusable secret. For details, see Connection String Format.
- Client secret: This option allows you to use the Azure service principal’s client secret to authenticate, and exposes Azure service principal settings.
- Certificate: This option allows you to use a certificate registered with the Azure service principal to authenticate, and exposes Azure service principal settings.
- Next, you can configure the following Optional Settings:
- Create container: Toggle on to create the configured container in Azure Blob Storage if one does not already exist.
- Partitioning expression: JavaScript expression that defines how files are partitioned and organized. Default is date-based. If blank, Cribl Stream will fall back to the event’s `__partition` field value (if present), or otherwise to the root directory of the Output Location and Staging Location. (Example expressions appear after this procedure.)
- Compress: Data compression format used before moving to the final destination. Defaults to `gzip` (recommended). This setting is not available when Data format is set to `Parquet`.
- File name prefix expression: The output file name prefix. Must be a JavaScript expression (which can evaluate to a constant), enclosed in quotes or backticks. Defaults to `CriblOut`.
- File name suffix expression: The output file name suffix. Must be a JavaScript expression (which can evaluate to a constant), enclosed in quotes or backticks. Defaults to `.${C.env["CRIBL_WORKER_ID"]}.${__format}${__compression === "gzip" ? ".gz" : ""}`, where `__format` can be `json` or `raw`, and `__compression` can be `none` or `gzip`.
  To prevent files from being overwritten, Cribl appends a random sequence of 6 characters to the end of their names. This also applies to prefix and suffix expressions in file names. For example, if you set the File name prefix expression to `CriblExec` and the File name suffix expression to `.csv`, the file name might display as `CriblExec-adPRWM.csv`, with `adPRWM` appended.
- Backpressure behavior: Whether to block or drop events when all receivers are exerting backpressure. (Causes might include an accumulation of too many files needing to be closed.) Defaults to `Block`.
- Tags: Optionally, add tags that you can use to filter and group Destinations on the Destinations page. These tags aren’t added to processed events. Use a tab or hard return between (arbitrary) tag names.
- Optionally, you can adjust the Processing, Parquet Settings, and Advanced settings outlined in the sections below.
- Select Save, then Commit & Deploy.
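The Container name, Partitioning expression, and file name prefix/suffix fields above all accept JavaScript expressions. The sketch below shows a few illustrative values, assuming you want date-based partitioning with a per-host subdirectory; the `host` and `sourcetype` fields referenced here are examples and must actually exist on your events.

```js
// Container name – evaluated once at initialization; only simple concatenation
// and global variables (such as C.env) are supported here:
`myContainer-${C.env["CRIBL_WORKER_ID"]}`

// Partitioning expression – evaluated per event; this example builds
// date-based paths with a per-host subdirectory (assumes a `host` field):
`${C.Time.strftime(_time ? _time : Date.now() / 1000, '%Y/%m/%d')}/${host}`

// File name prefix expression – a constant is also a valid expression:
`CriblOut-${sourcetype}`
```

Remember that Cribl appends a random 6-character sequence to file names, so the prefix and suffix you configure will never be the complete file name.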
Connection String Format
Manual and Secret authentication methods use an Azure Storage connection string in this format:
DefaultEndpointsProtocol=[http|https];AccountName=<your-account-name>;AccountKey=<your-account-key>;EndpointSuffix=<your-endpoint-suffix>
A fictitious example, using Microsoft’s recommended HTTPS option, is:
DefaultEndpointsProtocol=https;AccountName=storagesample;AccountKey=12345678...32;EndpointSuffix=core.windows.net
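If you want to sanity-check a connection string before pasting it into the Destination, a minimal Node.js sketch using the `@azure/storage-blob` SDK (one option among several; not required by Cribl Stream) could look like the following. The container name is hypothetical.

```js
// Minimal connection-string check: lists containers and verifies that the
// target container exists. Assumes @azure/storage-blob is installed and the
// connection string is in the AZURE_STORAGE_CONNECTION_STRING environment variable.
const { BlobServiceClient } = require("@azure/storage-blob");

async function checkConnection() {
  const service = BlobServiceClient.fromConnectionString(
    process.env.AZURE_STORAGE_CONNECTION_STRING
  );
  for await (const container of service.listContainers()) {
    console.log("container:", container.name);
  }
  // Hypothetical container name – replace with the one configured in General Settings.
  const exists = await service.getContainerClient("my-cribl-container").exists();
  console.log("target container exists:", exists);
}

checkConnection().catch(console.error);
```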
Service Principal Settings
Client secret and Certificate authentication methods expose the following additional settings:
- Storage account name: Enter the name of your Azure Storage Account.
- Tenant ID: Enter the service principal’s tenant ID.
- Client ID: Enter the service principal’s client ID.
- Client secret (text secret) (with Client secret method selected): Select a text secret containing the client secret, or create a new one.
- Certificate (with Certificate method selected): Select the certificate you registered as credentials for your app in the Azure portal, or create a new one.
- Endpoint suffix: (Optional) Enable connectivity to Azure Blob Storage in different regions, for instance, Azure China with `core.chinacloudapi.cn`. Defaults to `core.windows.net`.
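For reference, these service principal values map onto the Azure SDK roughly as shown below. This is a sketch using `@azure/identity` and `@azure/storage-blob` to illustrate how the settings relate, not something Cribl Stream requires you to run; all values are placeholders.

```js
// Sketch: authenticate to Blob Storage with a service principal client secret.
const { ClientSecretCredential } = require("@azure/identity");
const { BlobServiceClient } = require("@azure/storage-blob");

const accountName = "storagesample";        // Storage account name
const endpointSuffix = "core.windows.net";  // Endpoint suffix (default)

const credential = new ClientSecretCredential(
  "<tenant-id>",     // Tenant ID
  "<client-id>",     // Client ID
  "<client-secret>"  // Client secret
);

const service = new BlobServiceClient(
  `https://${accountName}.blob.${endpointSuffix}`,
  credential
);
```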
Processing Settings
Post‑Processing
Pipeline: Pipeline or Pack to process data before sending the data out using this output.
System fields: A list of fields to automatically add to events that use this output. By default, includes `cribl_pipe` (identifying the Cribl Stream Pipeline that processed the event). Supports wildcards. Other options include:
- `cribl_host` – Cribl Stream Node that processed the event.
- `cribl_input` – Cribl Stream Source that processed the event.
- `cribl_output` – Cribl Stream Destination that processed the event.
- `cribl_route` – Cribl Stream Route (or QuickConnect) that processed the event.
- `cribl_wp` – Cribl Stream Worker Process that processed the event.
Parquet Settings
To write out Parquet files, note that:
- On Linux, you can use the Cribl Stream CLI’s `parquet` command to view a Parquet file, its metadata, or its schema.
- Cribl Edge Workers support Parquet only when running on Linux, not on Windows.
- See Working with Parquet for pointers on how to avoid problems such as data mismatches.
Automatic schema: Toggle on to automatically generate a Parquet schema based on the events of each Parquet file that Cribl Stream writes. When toggled off (the default), exposes the following additional field:
- Parquet schema: Select a schema from the drop-down.
If you need to modify a schema or add a new one, follow the instructions in our Parquet Schemas topic. These steps will propagate the freshest schema back to this drop-down.
Parquet version: Determines which data types are supported, and how they are represented. Defaults to `2.6`; `2.4` and `1.0` are also available.
Data page version: Serialization format for data pages. Defaults to `V2`. If your toolchain includes a Parquet reader that does not support `V2`, use `V1`.
Group row limit: The number of rows that every group will contain. The final group can contain a smaller number of rows. Defaults to `10000`.
Page size: Set the target memory size for page segments. Generally, set lower values to improve reading speed, or set higher values to improve compression. Value must be a positive integer smaller than the Row group size value, with appropriate units. Defaults to `1 MB`.
Log invalid rows: Toggle on to output up to 20 unique rows that were skipped due to data format mismatch. Log level must be set to `debug` for output to be visible.
Write statistics: Leave toggled on (the default) if you have Parquet tools configured to view statistics – these profile an entire file in terms of minimum/maximum values within data, numbers of nulls, and so on.
Write page indexes: Leave toggled on (the default) if your Parquet reader uses statistics from Page Indexes to enable page skipping. One Page Index contains statistics for one data page.
Write page checksum: Toggle on if you have configured Parquet tools to verify data integrity using the checksums of Parquet pages.
Metadata (optional): The metadata of files the Destination writes will include the properties you add here as key-value pairs. For example, one way to tag events as belonging to the OCSF category for security findings would be to set Key to `OCSF Event Class` and Value to `2001`.
Advanced Settings
Max file size (MB): Maximum uncompressed output file size. Files reaching this size will be closed and moved to the final output location. Defaults to `32`.
Max file open time (sec): Maximum amount of time to write to a file. Files open for longer than this limit will be closed and moved to the final output location. Defaults to `300`.
Max file idle time (sec): Maximum amount of time to keep inactive files open. Files open for longer than this limit will be closed and moved to the final output location. Default: `30`.
Max open files: Maximum number of files to keep open concurrently. When exceeded, the oldest open files will be closed and moved to the final output location. Default: `100`.
Cribl Stream will close files when either the Max file size (MB) or the Max file open time (sec) condition is met.
Max concurrent file parts: Maximum number of parts to upload in parallel per file. A value of `1` tells the Destination to send one part at a time – that is, to upload the file’s contents sequentially. Defaults to `1`; the highest allowed value is `10`.
Blob access tier: Select the access tier for storing your data in Azure Blob Storage. Defaults to `Default account access tier`. Options include:
- `Hot tier` for frequently accessed data.
- `Cool tier` for infrequent access.
- `Cold tier` for rarely accessed data.
- `Archive tier` for data that can tolerate retrieval delays.
Disk space protection: Specifies whether to `Block` (default) or `Drop` incoming events when the disk space falls below the globally defined Min free disk space amount.
Add Output ID: When toggled on (default), adds the Output ID field’s value to the staging location’s file path. This ensures that each Destination’s logs will write to its own bucket.
For a Destination originally configured in a Cribl Stream version below 2.4.0, the Add Output ID behavior will be switched off on the backend, regardless of this toggle’s state. This is to avoid losing any files pending in the original staging directory, upon Cribl Stream upgrade and restart. To enable this option for such Destinations, Cribl’s recommended migration path is:
- Clone the Destination.
- Redirect the Routes referencing the original Destination to instead reference the new, cloned Destination.
This way, the original Destination will process pending files (after an idle timeout), and the new, cloned Destination will process newly arriving events with Add Output ID enabled.
Remove staging dirs: When toggled on (the default), Cribl Stream deletes empty staging directories after moving files. This prevents the proliferation of orphaned empty directories. When enabled, exposes this additional option:
- Staging cleanup period: How often (in seconds) to delete empty directories when Remove staging dirs is enabled. Defaults to `300` seconds (every 5 minutes). The minimum configurable interval is `10` seconds; the maximum is `86400` seconds (every 24 hours).
Enable dead-lettering: Toggle this on to set a maximum number of retries, and to move files to a designated directory when write failures exceed that limit. This prevents data flow blockage and excessive error logging due to undeliverable files. When enabled, exposes two additional fields:
- Dead-letter location: Specify the storage location for undeliverable files. Defaults to `$CRIBL_HOME/state/outputs/dead-letter`.
- Maximum retry limit: Configure the retry limit for failed file deliveries. This setting defines how many times the system will attempt to move a file to its intended location before it is deemed undeliverable and placed in the dead-letter directory. Defaults to `20`.
Environment: If you’re using GitOps, optionally use this field to specify a single Git branch on which to enable this configuration. If empty, the config will be enabled everywhere.
Internal Fields
Cribl Stream uses a set of internal fields to assist in forwarding data to a Destination.
Field for this Destination: `__partition`
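Because the Destination falls back to `__partition` when the Partitioning expression is blank, you can set that field upstream in a Pipeline. A minimal sketch, assuming you use an Eval Function and that your events carry `index` and `host` fields (both assumptions), would look like this:

```js
// Eval Function – Evaluate Fields:
//   Name:  __partition
//   Value expression (illustrative; the `index` and `host` fields are assumptions):
`${index}/${host}`
```

Events missing the referenced fields would produce literal `undefined` path segments, so guard the expression (for example, `${index || 'default'}`) if those fields aren’t guaranteed.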
Proxying Requests
If you need to proxy HTTP/S requests, see System Proxy Configuration.
Resources
For examples of configuring Cribl Stream to interoperate with Azure services, see these guides:
Troubleshooting
The Destination’s configuration modal has helpful tabs for troubleshooting:
Live Data: Try capturing live data to see real-time events as they flow through the Destination. On the Live Data tab, click Start Capture to begin viewing real-time data.
Logs: Review and search the logs that provide detailed information about the delivery process, including any errors or warnings that may have occurred.
Test: Ensures that the Destination is correctly set up and reachable. Verify that sample events are sent correctly by clicking Run Test.
You can also view the Monitoring page that provides a comprehensive overview of data volume and rate, helping you identify delivery issues. Analyze the graphs showing events and bytes in/out over time.