/ / / / ·

Azure Data Explorer Destination

Cribl Edge supports sending data to the Azure Data Explorer (ADX) managed data analytics service; you can then run Kusto queries against the data. This Destination can deliver data to Azure whether Cribl Edge is running on Azure, another cloud platform, or on-prem.

ADX stores log data in databases, which in turn contain one or more of the tables defined in the Azure namespace. You must create a separate Cribl Edge ADX Destination for each table that will store your data.

Type: Non-Streaming or Streaming (configurable) | TLS Support: Yes | PQ Support: No
For configuration examples, see Resources below.

Prerequisites

Before you configure the Destination, you need to collect some information from your Azure deployment. Start from Home (portal.azure.com), then navigate to the locations described below and collect the items specified.

Information About Your ADX Cluster

From Home, navigate to:

Azure Data Explorer Clusters > your_Azure_cluster_name > Overview > Essentials

Copy the URI. In Cribl Edge, this will be the value for General Settings > Cluster base URI. Make a copy of the URI with /.default appended - this will be the value of Authentication Settings > Scope.
Copy the Data Ingestion URI. In Cribl Edge, this will be the value for General Settings > Ingestion service URI.

Information About Your Azure Database

From Home, navigate to:

Azure Data Explorer Clusters > your_Azure_cluster_name | Databases > database_name | Query

Note the name of your desired target table - this table name will be the value of General Settings > Table name.

Information About Your Azure App

First, from Home, navigate to:

App registrations > your_Azure_app_name | Overview | Essentials

Copy the Application (client) ID - this will be the value of Authentication Settings > Client ID.
Copy the Directory (tenant) ID - this will be the value of Authentication Settings > Tenant ID.

Nex, from Home, navigate to:

App registrations > your_Azure_app_name | Certificates & secrets

Now what you do depends on which method for Authentication you plan to use in Cribl Edge.

To use the Client secret method

In the Client secrets tab, click New client secret and copy the Value of the new secret. In Cribl Edge, this will be the value of Client secret.

To use the Client secret (text secret) method

In the Client secrets tab, click New client secret and copy the Value of the new secret. In Cribl Edge, this will be the value of Create new secret > Value.

To use the Certificate method

After you have generated a certificate and keys (if you do this later on in the configuration process, return to this section then):

In the Certificates tab, click Upload Certificate. In the resulting drawer, follow the prompts to upload your certificate, and click Add to close the drawer.

The chosen certificate will then appear in the Certificates tab’s list.

Data Mapping Prerequisites

Finally, you might need to note or copy data mapping information to obtain values for some of the General Settings.

ADX uses a data mapping to map fields from incoming data to fields in the target table. To see the data mapping, you’ll run a Kusto query against your target table.

Here’s the query syntax:

.show table EntityName ingestion MappingKind mapping MappingName

Here’s an example query:

.show table SyslogTable ingestion json mapping "SyslogMapping"

The Kusto query that originally created this mapping would look like this:

.create table SyslogTable ingestion json mapping "SyslogMapping"

In Cribl Edge, you can use the names of mappings like the above example in the General Settings > Data mapping field. Alternatively, you can create a mapping without naming it, then copy and paste it (as a JSON object) into the Cribl Edge UI. To do this, toggle General Settings > Mapping object on. This hides the default Data mapping text field and opens a text window that expects JSON.

Configure Cribl Edge to Output to Azure Data Explorer

On the top bar, select Products, and then select Cribl Edge. Under Fleets, select a Fleet. Next, you have two options:
- To configure via QuickConnect, navigate to Routing > QuickConnect (Stream) or Collect (Edge). Select Add Destination and select the Destination you want from the list, choosing either Select Existing or Add New.
- To configure via the Routes, select Data > Destinations or More > Destinations (Edge). Select the Destination you want. Next, select Add Destination.
In the New Destination modal, configure the following under General Settings:
- Output ID: Enter a unique name to identify this Destination definition.
- Description: Optionally, enter a description.
- Ingestion mode: Use the buttons to select Batching mode (the default) or Streaming mode. For details, see Ingestion Mode.
- Cluster base URI: Enter the base URI for your ADX cluster.
- Ingestion service URI: Displayed only in Batching mode. Enter the Ingestion Service URI for your ADX cluster.
- Database name: Enter the name of the ADX database where the target table resides.
- Table name: Enter the name of the target table.
- Validate database settings: When you save or start the Destination, validates the database name and credentials that you have entered in these settings. Also validates the table name, except when Add mapping object (below) is on. Defaults to toggled on. Toggle off if your Azure app does not have both the Database Viewer and the Table Viewer role.
- Add mapping object: Displayed only in Batching mode, this control is toggled off by default. Toggle on when you want to paste in a data mapping as a JSON object, instead of using a named data mapping. See Data Mapping Prerequisites above.
- Data mapping: If Add mapping object is off, enter the name of the desired data mapping. If Add mapping object is on, enter the desired data mapping as a JSON object.
- Compression: By default, this Destination compresses data using gzip; otherwise it sends the data uncompressed. This setting is not available when Data format is set to Parquet.
- Data format: The output data format defaults to JSON. Raw and Parquet are also available. Selecting Parquet (available only in Batching mode, and supported only on Linux, not Windows) exposes a Parquet Settings tab, where you select the Parquet schema.
- Backpressure behavior: Whether to block or drop events when all receivers are exerting backpressure. (Causes might include an accumulation of too many files needing to be closed.) Defaults to Block. When Ingestion mode is set to Streaming mode, the Persistent Queue option becomes available. See Persistent Queue Settings below.
- Tags: Optionally, add tags that you can use to filter and group Destinations on the Destinations page. These tags aren’t added to processed events. Use a tab or hard return between (arbitrary) tag names.
Optionally, you can adjust the Authentication, Persistent Queue , Processing, Parquet, and Advanced settings outlined in the sections below.
Select Save, then Commit & Deploy.

Ingestion Mode

This Destination supports two ingestion modes:

Batching (the default) stages data in a storage container from which ADX pulls batches of data, then writes them to the target ADX table. Batching works well when the Destination needs to ingest large amounts of data in short periods of time.
Streaming sends data as HTTP request payloads, directly to the target ADX table, without staging them in a container first. Streaming can achieve lower latency than batching, as long as the data that the Destination ingests arrives in relatively small amounts. When in this mode, the Retries section (left tab) becomes available.

Authentication Settings

Microsoft Entra ID authentication endpoint: From the drop-down, select the Microsoft Entra ID endpoint that provides authentication tokens. To understand the available choices, see the Microsoft Entra documentation.

Tenant ID: Directory ID (tenant identifier) in Microsoft Entra ID.

Client ID: client_id to pass in the OAuth request parameter.

Scope: Scope to pass in the OAuth request parameter.

Use the Authentication method buttons to select one of these options:

Client secret: Use this default option to enter the client secret that you generated for your app in the Azure portal.
Client secret (text secret): Use this option to select or create a stored text secret.
Certificate: Use this option to select the certificate whose public key you register (or will register) for your app in the Azure portal. Or, create a new one.

Persistent Queue Settings

The Persistent Queue Settings tab displays when the Backpressure behavior option in General settings is set to Persistent Queue. Persistent queue buffers and preserves incoming events when a downstream Destination has an outage or experiences backpressure.

Before enabling persistent queue, learn more about persistent queue behavior and how to optimize it with your system:

On Cribl-managed Cloud Workers (with an Enterprise plan), this tab exposes only the destructive Clear Persistent Queue button (described at the end of this section). A maximum queue size of 1 GB disk space is automatically allocated per PQ-enabled Destination, per Worker Process. The 1 GB limit is on outbound uncompressed data, and no compression is applied to the queue.
This limit is not configurable. If the queue fills up, Cribl Stream/Edge will block outbound data. To configure the queue size, compression, queue-full fallback behavior, and other options below, use a hybrid Group.

Mode: Use this menu to select when Cribl Stream/Edge engages the persistent queue in response to backpressure events from this Destination. The options are:

Mode	Description
Error	Queues and stores data on a disk when the Destination is unavailable or in an error state.
Backpressure	Queues and stores data to a disk when it detects backpressure from the Destination until the backpressure event resolves.
Always On	Cribl Stream or Edge immediately queues and stores all data on a disk for all events, even when there is no backpressure.

If a Worker/Edge Node starts with an invalid Mode setting, it automatically switches to Error mode. This might happen if the Worker/Edge Node is running a version that does not support other modes (older than 4.9.0), or if it encounters a nonexistent value in YAML configuration files.

File size limit: The maximum data volume to store in each queue file before closing it. Enter a numeral with units of KB, MB, etc. Defaults to 1 MB.

Queue size limit: The maximum amount of disk space that the queue can consume on each Worker Process. When the queue reaches this limit, the Destination stops queueing data and applies the Queue-full behavior. Defaults to 5 GB. This field accepts positive numbers with units of KB, MB, GB, and so on. You can set it as high as 1 TB, unless you’ve configured a different Worker Process PQ size limit on the Group Settings/Fleet Settings page.

Queue file path: The location for the persistent queue files. Defaults to $CRIBL_HOME/state/queues. Cribl Stream/Edge will append /<worker-id>/<output-id> to this value.

Compression: Set the codec to use when compressing the persisted data after closing a file. Defaults to None. Gzip is also available.

Queue-full behavior: Whether to block or drop events when the queue begins to exert backpressure. A queue begins to exert backpressure when the disk is low or at full capacity. This setting has two options:

Block: The output will refuse to accept new data until the receiver is ready. The system will return block signals back to the sender.
Drop new data: Discard all new events until the backpressure event has resolved and the receiver is ready.

Backpressure duration Limit: When Mode is set to Backpressure, this setting controls how long to wait during network slowdowns before activating queues. A shorter duration enhances critical data loss prevention, while a longer duration helps avoid unnecessary queue transitions in environments with frequent, brief network fluctuations. The default value is 30 seconds.

Strict ordering: Toggle on (default) to enable FIFO (first in, first out) event forwarding, ensuring Cribl Stream/Edge sends earlier queued events first when receivers recover. The persistent queue flushes every 10 seconds in this mode. Toggle off to prioritize new events over queued events, configure a custom drain rate for the queue, and display this option:

Drain rate limit (EPS): Optionally, set a throttling rate (in events per second) on writing from the queue to receivers. (The default 0 value disables throttling.) Throttling the queue drain rate can boost the throughput of new and active connections by reserving more resources for them. You can further optimize Worker startup connections and CPU load in the Worker Processes settings.

Clear Persistent Queue: For Cloud Enterprise only, click this button if you want to delete the files that are currently queued for delivery to this Destination. If you click this button, a confirmation modal appears. Clearing the queue frees up disk space by permanently deleting the queued data, without delivering it to downstream receivers. This button only appears after you define the Output ID.

Use the Clear Persistent Queue button with caution to avoid data loss. See Steps to Safely Disable and Clear Persistent Queues for more information.

Processing Settings

Post-Processing

Pipeline: Pipeline or Pack to process data before sending the data out using this output.

System fields: A list of fields to automatically add to events that use this output. By default, includes cribl_pipe (identifying the Cribl Edge Pipeline that processed the event). Supports wildcards. Other options include:

cribl_host - Cribl Edge Node that processed the event.
cribl_input - Cribl Edge Source that processed the event.
cribl_output - Cribl Edge Destination that processed the event.
cribl_route - Cribl Edge Route (or QuickConnect) that processed the event.
cribl_wp - Cribl Edge Worker Process that processed the event.

Parquet Settings

To write out Parquet files, note that:

On Linux, you can use the Cribl Edge CLI’s parquet command to view a Parquet file, its metadata, or its schema.
Cribl Edge Workers support Parquet only when running on Linux, not on Windows.
See Working with Parquet for pointers on how to avoid problems such as data mismatches.

Automatic schema: Toggle on to automatically generate a Parquet schema based on the events of each Parquet file that Cribl Edge writes. When toggled off (the default), exposes the following additional field:

Parquet schema: Select a schema from the drop-down.

If you need to modify a schema or add a new one, follow the instructions in our Parquet Schemas topic. These steps will propagate the freshest schema back to this drop-down.

Parquet version: Determines which data types are supported and how they are represented. Defaults to 2.6; 2.4 and 1.0 are also available.

Data page version: Serialization format for data pages. Defaults to V2. If your toolchain includes a Parquet reader that does not support V2, use V1.

Group row limit: The number of rows that every group will contain. The final group can contain a smaller number of rows. Defaults to 10000.

Page size: The target memory size for page segments. Generally, lower values improve reading speed, while higher values improve compression. Must be a positive integer smaller than the Row group size value, with appropriate units. Defaults to 1 MB.

Log invalid rows: Toggle on to output up to 20 unique rows that were skipped due to data format mismatch. Log level must be set to debug for output to be visible.

Write statistics: Leave toggled on (the default) if you have Parquet tools configured to view statistics - these profile an entire file in terms of minimum/maximum values within data, numbers of nulls, etc.

Write page indexes: Leave toggled on (the default) if your Parquet reader uses statistics from Page Indexes to enable page skipping. One Page Index contains statistics for one data page.

Write page checksum: Toggle on if you have configured Parquet tools to verify data integrity using the checksums of Parquet pages.

Metadata (optional): The metadata of files the Destination writes will include the properties you add here as key-value pairs. For example, one way to tag events as belonging to the OCSF category for security findings would be to set Key to OCSF Event Class and Value to 2001.

Retries

This tab is displayed when General Settings > Ingestion mode is set to Streaming mode.

Honor Retry-After header: Whether to honor a Retry-After header, provided that the header specifies a delay no longer than 180 seconds. Cribl Edge limits the delay to 180 seconds even if the Retry-After header specifies a longer delay. When enabled, any Retry-After header received takes precedence over all other options configured in the Retries section. When disabled, all Retry-After headers are ignored.

Settings for failed HTTP requests: When you want to automatically retry requests that receive particular HTTP response status codes, use these settings to list those response codes.

For any HTTP response status codes that are not explicitly configured for retries, Cribl Edge applies the following rules:

Status Code	Action
`408`, `429`, `503`, `504`, `520`.	Retry the request.
Any other status code.	Drop the request.

Upon receiving a response code that’s on the list, Cribl Edge first waits for a set time interval called the Pre-backoff interval and then begins retrying the request. Time between retries increases based on an exponential backoff algorithm whose base is the Backoff multiplier, until the backoff multiplier reaches the Backoff limit (ms). At that point, Cribl Edge continues retrying the request without increasing the time between retries any further.

By default, 429 (Too Many Requests) is the only response code configured for automatic retries. For each response code you want to add to the list, click Add Setting and configure the following settings:

HTTP status code: A response code that indicates a failed request, for example 429 (Too Many Requests) or 503 (Service Unavailable).
Pre-backoff interval (ms): The amount of time to wait before beginning retries, in milliseconds. Defaults to 1000 (one second).
Backoff multiplier: The base for the exponential backoff algorithm. A value of 2 (the default) means that Cribl Edge will retry after 2 seconds, then 4 seconds, then 8 seconds, and so on.
Backoff limit (ms): The maximum backoff interval Cribl Edge should apply for its final retry, in milliseconds. Default (and minimum) is 10,000 (10 seconds); maximum is 180,000 (180 seconds, or 3 minutes).

Retry timed-out HTTP requests: When you want to automatically retry requests that have timed out, toggle this control on to display the following settings for configuring retry behavior:

Pre-backoff interval (ms): The amount of time to wait before beginning retries, in milliseconds. Defaults to 1000 (one second).
Backoff multiplier: The base for the exponential backoff algorithm. A value of 2 (the default) means that Cribl Edge will retry after 2 seconds, then 4 seconds, then 8 seconds, and so on.
Backoff limit (ms): The maximum backoff interval Cribl Edge should apply for its final retry, in milliseconds. Default (and minimum) is 10,000 (10 seconds); maximum is 180,000 (180 seconds, or 3 minutes).

Advanced Settings

The controls on this tab vary depending on the Ingestion mode selected on the General Settings tab. Below you’ll find one dedicated section for the Batching, and another for the Streaming controls.

Only the Environment control, at the bottom of the tab, is common to both:

Environment: If you’re using GitOps, optionally use this field to specify a single Git branch on which to enable this configuration. If empty, the config will be enabled everywhere.

Advanced Settings for Batching Mode

Flush immediately: Toggle on to bypass the data management service’s aggregation mechanism. See the ADX documentation.

Retain blob on success: By default, Cribl Edge deletes each data blob once ingestion is complete. Toggle this on if you prefer to retain blobs instead.

Extent tags: Optionally, add strings or tags to partitions of the target table - Azure calls these extents or data shards.

Enforce uniqueness via tag values: Use the Add value button to specify a list of ingest-by values. Cribl Edge will then check the ingest-by tags of incoming extents and discard those whose values match a listed value. This mechanism for avoiding ingesting duplicate extents uses the ingestIfNotExists property, as described in the ADX documentation.

Report level: Level for ingestion status reporting. Options are DoNotReport, FailuresOnly (the default), and FailuresAndSuccesses.

Report method: Target for ingestion status reporting. Options are Queue (the default), Table, and QueueAndTable.

Additional fields: Optionally, enter additional configuration properties to send to the ingestion service.

Staging location: Local filesystem location in which to buffer files before compressing and moving them to the final destination. Cribl recommends that this location be stable and high-performance. Defaults to /tmp.

File name suffix expression: The output filename suffix. Must be a JavaScript expression (which can evaluate to a constant), enclosed in quotes or backticks.

Defaults to `.${C.env["CRIBL_WORKER_ID"]}.${__format}${__compression === "gzip" ? ".gz" : ""}`, where __format can be json or raw, and __compression can be none or gzip.

To prevent files from being overwritten, Cribl appends a random sequence of 6 characters to the end of their names. This also applies to prefix and suffix expressions in file names.
For example, if you set the File name prefix expression to CriblExec and set the File name suffix expression to .csv, the file name might display as CriblExec-adPRWM.csv with adPRWM appended.

File size limit (MB): Maximum uncompressed output file size. Files reaching this size will be closed and moved to the storage container. Defaults to 32.

File open time limit (sec): Maximum amount of time to write to a file. Files open for longer than this limit will be closed and moved to the storage container. Defaults to 300 seconds (5 minutes).

Idle time limit (sec): Maximum amount of time to keep inactive files open. Files open for longer than this limit will be closed and moved to the storage container. Default: 30.

Open file limit: Maximum number of files to keep open concurrently. When exceeded, the oldest open files will be closed and moved to the storage container. Default: 100.

Concurrent file parts limit: Maximum number of parts to upload in parallel per file. A value of 1 tells the Destination to send one part at a time - that is, to upload the file’s contents sequentially. Defaults to 1; highest allowed value is 10.

Disk space protection: Specifies whether to Block (default) or Drop incoming events when the disk space falls below the globally defined Min free disk space amount.

Add output ID: Toggle on if you want Cribl Edge to append the Destination name to staging directory pathnames. This can make it easier to organize and troubleshoot when you have multiple Destinations populating staging directories.

Remove empty staging directories: When toggled on (the default), Cribl Edge deletes empty staging directories after moving files. This prevents the proliferation of orphaned empty directories. When enabled, exposes this additional option:

Staging cleanup period: How often (in seconds) to delete empty directories when Remove staging dirs is enabled. Defaults to 300 seconds (every 5 minutes). Minimum configurable interval is 10 seconds; maximum is 86400 seconds (every 24 hours).

Enable dead-lettering: Toggle this on to set a maximum number of retries, and to move files to a designated directory when write failures exceed that limit. This prevents data flow blockage and excessive error logging due to undeliverable files. When enabled, exposes two additional fields:

Dead-letter location: Specify the storage location for undeliverable files. Defaults to $CRIBL_HOME/state/outputs/dead-letter.
Maximum retry limit: Configure the retry limit for failed file deliveries. This setting defines how many times the system will attempt to move a file to its intended location before it is deemed undeliverable and placed in the dead-letter directory. Defaults to 20.

Advanced Settings for Streaming Mode

Request timeout: Amount of time (in seconds) to wait for a request to complete before aborting it. Defaults to 30.

Request concurrency: Maximum number of concurrent requests per Worker Process. When Cribl Edge hits this limit, it begins throttling traffic to the downstream service. Defaults to 5. Minimum: 1. Maximum: 32.

Body size limit (KB): Maximum size of the request body before compression. Defaults to 4096 KB. The actual request body size might exceed the specified value because the Destination adds bytes when it writes to the downstream receiver. Cribl recommends that you experiment with the Body size limit value until downstream receivers reliably accept all events.

Events-per-request limit: Maximum number of events to include in the request body. The 0 default allows unlimited events.

Flush period (sec): Maximum time between requests. Low settings could cause the payload size to be smaller than its configured maximum. Defaults to 1.

Validate server certs: When toggled on (default) Cribl Edge will reject certificates that are not authorized by a CA in the CA certificate path, nor by another trusted CA (such as the system’s CA, for example).

Round-robin DNS: Toggle on to enable round-robin DNS lookup across multiple IP addresses, IPv4 and IPv6. When a DNS server resolves a Fully Qualified Domain Name (FQDN) to multiple IP addresses, Cribl Edge will sequentially use each address in the order they are returned by the DNS server for subsequent connection attempts.

Keep alive: By default, Cribl Edge sends Keep-Alive headers to the remote server and preserves the connection from the client side up to a maximum of 120 seconds. Toggle this off if you want Cribl Edge to close the connection immediately after sending a request.

Internal Fields

Cribl Edge uses a set of internal fields to assist in forwarding data to a Destination.

Field for this Destination:

__partition

Proxying Requests

If you need to proxy HTTP/S requests, see System Proxy Configuration.

Resources

For examples of configuring Cribl Edge to interoperate with Azure services, see these guides:

Troubleshooting

The Destination’s configuration modal has helpful tabs for troubleshooting:

Live Data: Try capturing live data to see real-time events as they flow through the Destination. On the Live Data tab, click Start Capture to begin viewing real-time data.

Logs: Review and search the logs that provide detailed information about the delivery process, including any errors or warnings that may have occurred.

Test: Ensures that the Destination is correctly set up and reachable. Verify that sample events are sent correctly by clicking Run Test.

You can also view the Monitoring page that provides a comprehensive overview of data volume and rate, helping you identify delivery issues. Analyze the graphs showing events and bytes in/out over time.

Azure Data Explorer Destination

Prerequisites​

Information About Your ADX Cluster​

Information About Your Azure Database​

Information About Your Azure App​

To use the Client secret method​

To use the Client secret (text secret) method​

To use the Certificate method​

Data Mapping Prerequisites​

Configure Cribl Edge to Output to Azure Data Explorer​

Ingestion Mode​

Authentication Settings​

Persistent Queue Settings​

Processing Settings​

Post-Processing​

Parquet Settings​

Retries​

Advanced Settings​

Advanced Settings for Batching Mode​

Advanced Settings for Streaming Mode​

Internal Fields​

Proxying Requests​

Resources​

Troubleshooting​

Common Resources