These docs are for Cribl Stream 4.0 and are no longer actively maintained.
See the latest version (4.13).
Filesystem/NFS
Filesystem is a non-streaming Destination type that Cribl Stream can use to output files to a local file system or a network-attached file system (NFS).
Type: Non-Streaming | TLS Support: N/A | PQ Support: N/A
Configuring Cribl Stream to Output to Filesystem Destinations
From the top nav, click Manage, then select a Worker Group to configure. Next, you have two options:
To configure via the graphical QuickConnect UI, click Routing > QuickConnect (Stream) or Collect (Edge). Next, click + Add Destination at right. From the resulting drawer’s tiles, select Filesystem. Next, click either + Add Destination or (if displayed) Select Existing. The resulting drawer will provide the options below.
Or, to configure via the Routing UI, click Data > Destinations (Stream) or More > Destinations (Edge). From the resulting page’s tiles or the Destinations left nav, select Filesystem. Next, click New Destination to open a New Destination modal that provides the options below.
General Settings
Output ID: Enter a unique name to identify this Filesystem definition.
Output location: Final destination for the output files.
Data format: The output data format defaults to JSON
. Raw
and Parquet
are also available. Selecting Parquet
(supported only on Linux, not Windows) exposes a Parquet Settings left tab, where you must configure certain options in order to export data in Parquet format.
Optional Settings
Staging location: Local filesystem location in which to buffer files before compressing and moving them to the final destination. Cribl recommends that this location be stable and high-performance. (This field is not displayed or available on Cribl.Cloud-managed Worker Nodes.)
Partitioning expression: JavaScript expression that defines how files are partitioned and organized. Default is date-based. If blank, Cribl Stream will fall back to the event’s __partition
field value (if present); or otherwise to the root directory of the Output Location and Staging Location.
Compress: Data compression format used before moving to final destination. Defaults to none
. Cribl recommends setting this to gzip
. This setting is not available when Data format is set to Parquet
.
File name prefix expression: The output filename prefix. Must be a JavaScript expression (which can evaluate to a constant), enclosed in quotes or backticks. Defaults to CriblOut
.
File name suffix expression: The output filename suffix. Must be a JavaScript expression (which can evaluate to a constant), enclosed in quotes or backticks. Defaults to `.${C.env["CRIBL_WORKER_ID"]}.${__format}${__compression === "gzip" ? ".gz" : ""}`
, where __format
can be json
or raw
, and __compression
can be none
or gzip
.
To prevent files from being overwritten, Cribl appends a random sequence of 6 characters to the end of their names. File name prefix and suffix expressions do not bypass this behavior.
For example, if you set the File name prefix expression to
CriblExec
and set the File name suffix expression to.csv
, the file name might display asCriblExec-adPRWM.csv
withadPRWM
appended.
Backpressure Behavior: Select whether to block or drop events when all receivers are exerting backpressure. (Causes might include an accumulation of too many files needing to be closed.) Defaults to Block
.
Tags: Optionally, add tags that you can use for filtering and grouping at the final destination. Use a tab or hard return between (arbitrary) tag names.
Processing Settings
Post‑Processing
Pipeline: Pipeline to process data before sending the data out using this output.
System fields: A list of fields to automatically add to events that use this output. By default, includes cribl_pipe
(identifying the Cribl Stream Pipeline that processed the event). Supports wildcards. Other options include:
cribl_host
– Cribl Stream Node that processed the event.cribl_wp
– Cribl Stream Worker Process that processed the event.cribl_input
– Cribl Stream Source that processed the event.cribl_output
– Cribl Stream Destination that processed the event.
Parquet Settings
To write out Parquet files, note that:
- Cribl Edge Workers support Parquet only when running on Linux, not on Windows.
- See Working with Parquet for pointers on how to avoid problems such as data mismatches.
Parquet schema: Select a schema from the drop-down. To modify a schema or create a new one, follow the instructions below.
Row group size: Set the target memory size for row group segments. Modify this value to optimize memory use when writing. Value must be a positive integer smaller than the File size value, with appropriate units. Defaults to 16 MB
.
Page size: Set the target memory size for page segments. Generally, set lower values to improve reading speed, or set higher values to improve compression. Value must be a positive integer smaller than the Row group size value, with appropriate units. Defaults to 1 MB
.
Log invalid rows: Toggle to Yes
to output up to 20 unique rows that were skipped due to data format mismatch. Log level must be set to debug
for output to be visible.
Advanced Settings
Max file size (MB): Maximum uncompressed output file size. Files of this size will be closed and moved to final output location. Defaults to 32
.
Max file open time (sec): Maximum amount of time to write to a file. Files open for longer than this will be closed and moved to final output location. Defaults to 300
.
Max file idle time (sec): Maximum amount of time to keep inactive files open. Files open for longer than this will be closed and moved to final output location. Defaults to 30
.
Max open files: Maximum number of files to keep open concurrently. When exceeded, the oldest open files will be closed and moved to final output location. Defaults to 100
.
Cribl Stream will close files when either of the
Max file size (MB)
or theMax file open time (sec)
conditions are met.
Add Output ID: When set to Yes
(the default), adds the Output ID field’s value to the staging location’s file path. This ensures that each Destination’s logs will write to its own bucket.
For a Destination originally configured in a Cribl Stream version below 2.4.0, the Add Output ID behavior will be switched off on the backend, regardless of this slider’s state. This is so that upon Cribl Stream upgrade and restart, any files pending in the original staging directory will not be lost. To enable this option for such Destinations, Cribl’s recommended migration path is:
- Clone the Destination.
- Where Routes reference the original Destination, redirect them to instead reference the new, cloned Destination.
This way, the original Destination will process pending files (after an idle timeout), and the new, cloned Destination will process newly arriving events with Add output ID enabled.
Remove staging dirs: Toggle to Yes
to delete empty staging directories after moving files. This prevents the proliferation of orphaned empty directories. When enabled, exposes this additional option:
- Staging cleanup period: How often (in seconds) to delete empty directories when Remove staging dirs is enabled. Defaults to
300
seconds (every 5 minutes). Minimum configurable interval is10
seconds; maximum is86400
seconds (every 24 hours).
Environment: If you’re using GitOps, optionally use this field to specify a single Git branch on which to enable this configuration. If empty, the config will be enabled everywhere.
Creating and Modifying Parquet Schemas
When you need to add a new Parquet schema or modify an existing one:
Navigate to the Parquet Schema Library, located at Processing > Knowledge > Parquet Schemas.
Create or modify the schema, as described here.
Cribl strongly recommends that to modify an existing schema, you first clone it and give the clone its own distinct name.
Return to this Destination, and select the new or modified schema in the Parquet schema drop-down.
Click Save.
Cloning First Is Safer than Just Modifying
Modifying an existing schema in the Schema Library does not propagate your modifications to the Destination’s copy of the schema.
- Cloning and renaming schemas is the safest approach, because in Step 3 above, it removes any doubt that your Destination will now use the newly-modified version of your schema.
- If you do not clone and rename the schema (i.e., you leave the schema name unchanged), you still must re-select the schema in the Destination’s Parquet schema drop-down, and click Save, to bring the modified version into the Destination.
Internal Fields
Cribl Stream uses a set of internal fields to assist in forwarding data to a Destination.
Field for this Destination:
__partition
To export events from an intermediate stage within a Pipeline to a file, see the Tee Function.