These docs are for Cribl Stream 4.0 and are no longer actively maintained.
See the latest version (4.13).
Database
You can configure Database Collectors to pull events from database management systems (DBMSs). This enables you to combine such structured data with unstructured machine data, and then route combined data to downstream systems of analysis to gain new insights.
As of this release, Cribl Stream supports integrating this Collector with MySQL and SQL Server databases.
How the Collector Pulls Data
Like other Cribl Stream Collectors, a Database Collector can perform Preview, Discovery, and Full Run operations in ad hoc or scheduled runs.
Database Collectors rely on Cribl Stream Database Connections Knowledge objects. Before you configure a Collector, configure a Database Connection to negotiate authenticated communication with the appropriate database type.
Configuring a Database Collector
From the top nav, click Manage, then select a Worker Group to configure. Next, select Data > Sources, then select Collectors > Database from the Manage Sources page’s tiles or left nav. Click New Collector to open a modal that provides the following options and fields.
The sections described below are spread across several tabs. Click the tab links at left to navigate among tabs. Click Save when you’ve configured your Collector.
Collector Sources currently cannot be selected or enabled in the QuickConnect UI.
Collector Settings
The Collector Settings determine how data is collected before processing.
Collector ID: Unique ID for this Collector. E.g., sh2GetStuff.
Connection: Use the drop-down to select a Database Connection already configured on your Cribl Stream installation.
SQL query: Enter a JavaScript expression that resolves to a query string for selecting data from the database. Supports only a single statement, and only SELECT. Has access to the special ${earliest} and ${latest} variables, which will resolve to the Collector run’s start and end time.
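As a rough illustration of the SQL query field, the sketch below shows how a JavaScript template literal could resolve to a single SELECT statement bounded by the run’s time range. The table and column names (app_events, created_at) are hypothetical. The earliest and latest constants are stand-ins for the values Cribl Stream supplies at run time, and treating them as epoch seconds is an assumption to verify in your own deployment.

```javascript
// Stand-ins for the special variables. In a real Collector run, Cribl Stream
// supplies these; epoch seconds is an assumption made for this sketch.
const earliest = 1700000000;
const latest = 1700003600;

// The kind of expression one might enter in the SQL query field:
// a template literal that resolves to one SELECT statement.
// FROM_UNIXTIME is a MySQL function; app_events and created_at are hypothetical.
const query = `SELECT id, message, created_at
FROM app_events
WHERE created_at >= FROM_UNIXTIME(${earliest})
  AND created_at < FROM_UNIXTIME(${latest})`;

console.log(query);
```

Bounding the query on both ends keeps each run limited to its own time window, so consecutive scheduled runs are less likely to pull the same rows twice.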
Optional Settings
Tags: Optionally, add tags that you can use for filtering and grouping in Cribl Stream. Use a tab or hard return between (arbitrary) tag names.
Result Settings
The Result Settings determine how Cribl Stream transforms and routes the collected data.
Custom Command
In this section, you can pass the data from this Collector to an external command for processing, before the data continues downstream.
Enabled: Defaults to No. Toggle to Yes to enable the custom command.
Command: Enter the command that will consume the data (via stdin) and emit its processed output (via stdout).
Arguments: Click + Add Argument to add each argument to the command. You can drag arguments vertically to resequence them.
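To make the stdin/stdout contract concrete, here is a minimal sketch of a script that could serve as a custom command when run with Node.js. It assumes the data arrives as newline-delimited text, which this page does not specify. The function names (transform, runAsCommand) are hypothetical.

```javascript
// Pure transformation: trim each line and drop empty ones.
function transform(text) {
  return text
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .join('\n');
}

// Wiring for real use: consume stdin, write transformed lines to stdout.
// Not invoked here, so the demonstration below runs without piped input.
function runAsCommand() {
  let buffer = '';
  process.stdin.setEncoding('utf8');
  process.stdin.on('data', (chunk) => {
    buffer += chunk;
    const lastNewline = buffer.lastIndexOf('\n');
    if (lastNewline === -1) return; // no complete line yet
    const out = transform(buffer.slice(0, lastNewline));
    buffer = buffer.slice(lastNewline + 1);
    if (out) process.stdout.write(out + '\n');
  });
  process.stdin.on('end', () => {
    const out = transform(buffer); // flush any trailing partial line
    if (out) process.stdout.write(out + '\n');
  });
}

// Demonstration on sample input.
const sample = '  row one  \n\n row two\n';
console.log(transform(sample));
```

When using a script like this as the actual command, you would call runAsCommand() at the end of the file instead of the demonstration lines.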
Event Breakers
In this section, you can apply event breaking rules to convert data streams to discrete events.
Event Breaker rulesets: A list of event breaking rulesets that will be applied, in order, to the input data stream. Defaults to System Default Rule.
Event Breaker buffer timeout: How long (in milliseconds) the Event Breaker will wait for new data to be sent to a specific channel, before flushing out the data stream, as-is, to the Routes. Minimum 10 ms, default 10000 (10 sec), maximum 43200000 (12 hours).
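The documented bounds for the buffer timeout can be expressed as a small range check. This is only an illustrative sketch, not how Cribl Stream validates the field. The helper name checkBufferTimeout is hypothetical.

```javascript
// Bounds and default as documented for the Event Breaker buffer timeout (ms).
const MIN_TIMEOUT_MS = 10;
const DEFAULT_TIMEOUT_MS = 10000;
const MAX_TIMEOUT_MS = 43200000;

// Hypothetical helper: returns the default when no value is given, and
// throws when the value falls outside the documented range.
function checkBufferTimeout(ms) {
  if (ms === undefined || ms === null) return DEFAULT_TIMEOUT_MS;
  if (!Number.isFinite(ms) || ms < MIN_TIMEOUT_MS || ms > MAX_TIMEOUT_MS) {
    throw new RangeError(
      `Buffer timeout must be between ${MIN_TIMEOUT_MS} and ${MAX_TIMEOUT_MS} ms`
    );
  }
  return ms;
}

console.log(checkBufferTimeout());      // falls back to the default
console.log(MAX_TIMEOUT_MS / 3600000);  // the maximum, expressed in hours
```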
Fields
In this section, you can add Fields to each event, using Eval-like functionality.
Name: Field name.
Value: JavaScript expression to compute the field’s value (can be a constant).
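The Name/Value pairs above can be pictured as expressions evaluated against each event. The sketch below imitates that idea in plain JavaScript; it is not Cribl Stream’s evaluator. The helper addFields, the sample event, and the field names (dbms, raw_length) are all hypothetical, and it assumes that expressions can reference existing event fields by name.

```javascript
// Hypothetical helper: evaluates each Value expression with the event's
// current fields in scope, then stores the result under Name.
function addFields(event, fields) {
  const out = { ...event };
  for (const { name, value } of fields) {
    const keys = Object.keys(out);
    // Build a function whose parameters are the event's field names.
    const fn = new Function(...keys, `return (${value});`);
    out[name] = fn(...keys.map((k) => out[k]));
  }
  return out;
}

// Sample event and field definitions (illustrative only).
const event = { _raw: 'id=7 status=ok', host: 'db01' };
const fields = [
  { name: 'dbms', value: "'mysql'" },           // a constant
  { name: 'raw_length', value: '_raw.length' }, // computed from the event
];

const enriched = addFields(event, fields);
console.log(enriched);
```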
Result Routing
Send to Routes: If set to Yes (the default), Cribl Stream will send events to normal routing and event processing. Toggle to No to select a specific Pipeline/Destination combination to receive the events. The No setting exposes these two fields:
- Pipeline: Select a Pipeline to process results.
- Destination: Select a Destination to receive results.
The default Yes setting instead exposes this field:
- Pre-processing Pipeline: Optionally, use the drop-down to select an existing Pipeline to process results before sending them to Routes.
This final field is always exposed:
- Throttling: Rate (in bytes per second) to throttle while writing to an output. Also takes values with multiple-byte units, such as KB, MB, GB, etc. (Example: 42 MB.) Default value of 0 indicates no throttling.
You might disable Send to Routes when you’re configuring a Collector that will connect data from a specific Source to a specific Pipeline and Destination. This keeps the Collector’s configuration self‑contained and separate from Cribl Stream’s routing table for live data – potentially simplifying the Routes structure.
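For the Throttling field described above, the following sketch shows one way a value such as 42 MB could be converted to bytes per second. It assumes binary multiples (1 KB = 1024 bytes), which this page does not confirm, and it handles only the B, KB, MB, and GB units. The helper name throttleToBytesPerSec is hypothetical.

```javascript
// Assumption: binary multiples (1 KB = 1024 bytes).
const UNIT_BYTES = { B: 1, KB: 1024, MB: 1024 ** 2, GB: 1024 ** 3 };

// Hypothetical helper: converts "42 MB", "10KB", or a bare number of bytes
// into a bytes-per-second rate. A result of 0 means no throttling.
function throttleToBytesPerSec(value) {
  const match = /^\s*(\d+(?:\.\d+)?)\s*([KMG]?B)?\s*$/i.exec(String(value));
  if (!match) throw new Error(`Unrecognized throttling value: ${value}`);
  const unit = (match[2] || 'B').toUpperCase();
  return Math.round(Number(match[1]) * UNIT_BYTES[unit]);
}

console.log(throttleToBytesPerSec('42 MB'));
console.log(throttleToBytesPerSec(0));
```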
Advanced Settings
Advanced Settings enable you to customize post-processing and administrative options.
Environment: If you’re using GitOps, optionally use this field to specify a single Git branch on which to enable this configuration. If empty, the config will be enabled everywhere.
Time to live: How long to keep the job’s artifacts on disk after job completion. This also affects how long a job is listed in Job Inspector. Defaults to 4h.
Remove Discover fields: List of fields to remove from the Discover results. This is useful when discovery returns sensitive fields that should not be exposed in the Jobs user interface. You can specify wildcards (such as aws*).
Resume job on boot: Toggle to Yes to resume ad hoc collection jobs if Cribl Stream restarts during the jobs’ execution.
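To illustrate the wildcard behavior of Remove Discover fields, the sketch below strips every field whose name matches a pattern such as aws*. It is an approximation for explanatory purposes, not Cribl Stream’s implementation. The helper names and the sample Discover result are hypothetical.

```javascript
// Convert a wildcard pattern such as "aws*" into an anchored RegExp.
function wildcardToRegExp(pattern) {
  // Escape regex metacharacters except "*", which becomes ".*".
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, '\\$&');
  return new RegExp('^' + escaped.replace(/\*/g, '.*') + '$');
}

// Return a copy of the result without any field matching the patterns.
function removeFields(result, patterns) {
  const matchers = patterns.map(wildcardToRegExp);
  return Object.fromEntries(
    Object.entries(result).filter(
      ([key]) => !matchers.some((re) => re.test(key))
    )
  );
}

// Sample Discover result (illustrative only).
const discovered = { host: 'db01', awsAccessKey: 'EXAMPLE', awsRegion: 'us-east-1' };
const cleaned = removeFields(discovered, ['aws*']);
console.log(cleaned);
```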