
Datasets

Datasets are a way of organizing different types of data stored in Cribl Lake.

Cribl Lake comes with ready-to-use default Datasets and also lets you create your own.

The data is stored as gzip-compressed JSON files.
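As a rough illustration of what gzip-compressed JSON looks like in practice, the sketch below round-trips a few events through Python's standard `gzip` and `json` modules. The one-object-per-line layout and the helper names `write_events` and `read_events` are assumptions made for this example; this page does not specify the exact file layout Cribl Lake uses.

```python
import gzip
import io
import json


def write_events(events):
    """Serialize events as newline-delimited JSON and gzip the result.

    The newline-delimited layout is an assumption for illustration only.
    """
    buf = io.BytesIO()
    with gzip.GzipFile(fileobj=buf, mode="wb") as gz:
        for event in events:
            gz.write((json.dumps(event) + "\n").encode("utf-8"))
    return buf.getvalue()


def read_events(blob):
    """Decompress a gzip blob and parse one JSON object per non-empty line."""
    with gzip.GzipFile(fileobj=io.BytesIO(blob), mode="rb") as gz:
        text = gz.read().decode("utf-8")
    return [json.loads(line) for line in text.splitlines() if line.strip()]


sample = [{"level": "info", "msg": "started"}, {"level": "warn", "msg": "slow"}]
blob = write_events(sample)
restored = read_events(blob)
```

Here `restored` holds the same two dictionaries as `sample`, and `blob` starts with the standard gzip magic bytes.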

Cribl Lake Retention

Data in all your Datasets is stored for a defined retention period.

For built-in Datasets, the retention period is fixed at 10 to 30 days, depending on the Dataset.

For your own Datasets, you can configure the retention period depending on your needs.

Calculating Retention

The retention period is based on the date on which the data was uploaded and saved, not on the dates of the individual stored events.

This distinction matters when you upload older events in a batch: retention is then calculated from the date of the upload.

For example, if on August 1, 2024 you upload a batch of data that includes events dated June 1, 2024, and set the retention period to 1 year, the events will be deleted after August 1, 2025 (based on the upload date), not in June 2025.
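The calculation in this example can be sketched in a few lines of Python. This is only an illustration: the function name `retention_expiry` is hypothetical, and treating "1 year" as 365 days is an assumption, since this page does not say how Cribl Lake counts a year internally.

```python
from datetime import date, timedelta


def retention_expiry(upload_date: date, retention_days: int) -> date:
    """Return the date after which data becomes eligible for deletion.

    Retention is counted from the upload date, never from event dates.
    """
    return upload_date + timedelta(days=retention_days)


upload = date(2024, 8, 1)      # when the batch was uploaded
event_date = date(2024, 6, 1)  # timestamp on the older events in the batch

# Assumption: a 1-year retention period is modeled as 365 days.
expiry = retention_expiry(upload, 365)
# What the result would be if event dates were used (they are not).
not_used = retention_expiry(event_date, 365)

print(expiry)    # 2025-08-01
print(not_used)  # 2025-06-01
```

The two printed dates differ by the gap between the event date and the upload date, which is why backfilled events stay in the Dataset longer than their own timestamps would suggest.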

Built-in Datasets

The following Datasets are available by default in Cribl Lake:

| Dataset | Contains | Retention Period (Days) | Notes |
|---|---|---|---|
| default_logs | Logs from multiple sources. | 30 | |
| default_metrics | Metrics from multiple sources. | 15 | |
| default_spans | Distributed trace spans from multiple sources. | 10 | |
| default_events | Events from sources such as Kubernetes or a third-party API. | 30 | |
| cribl_metrics | Metrics from the Cloud Leader. Data is stored for 30 days free of charge. | 30 | Cannot be targeted by a Destination. |
| cribl_logs | Logs from the Cloud Leader. Data is stored for 30 days free of charge. | 30 | Cannot be targeted by a Destination. |

Audit and Access Logs

Cribl Lake automatically collects audit logs, access logs, and metrics from the Cloud Leader Node (the node that manages the whole Cribl deployment). This data is stored in the cribl_logs and cribl_metrics Datasets.

Because these Datasets are internal, you can’t use them as a target in the Cribl Lake Destination.

You can’t edit or delete built-in Datasets.
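The rules for built-in Datasets can be summarized as a small lookup table. The sketch below is not a Cribl API: the dictionary `BUILT_IN_DATASETS` and the helpers `can_target` and `is_editable` are hypothetical names, and the retention values are copied from the list of built-in Datasets on this page. It also assumes that any Dataset you create yourself can be targeted and edited.

```python
# Built-in Datasets as described on this page: retention in days, and
# whether a Cribl Lake Destination may write to the Dataset.
BUILT_IN_DATASETS = {
    "default_logs": {"retention_days": 30, "targetable": True},
    "default_metrics": {"retention_days": 15, "targetable": True},
    "default_spans": {"retention_days": 10, "targetable": True},
    "default_events": {"retention_days": 30, "targetable": True},
    "cribl_metrics": {"retention_days": 30, "targetable": False},
    "cribl_logs": {"retention_days": 30, "targetable": False},
}


def can_target(dataset_id: str) -> bool:
    """Return True if a Destination may write to the Dataset.

    Assumption: Datasets not listed above are user-created and targetable.
    """
    info = BUILT_IN_DATASETS.get(dataset_id)
    return True if info is None else info["targetable"]


def is_editable(dataset_id: str) -> bool:
    """Built-in Datasets cannot be edited or deleted; others are assumed editable."""
    return dataset_id not in BUILT_IN_DATASETS
```

For instance, `can_target("cribl_logs")` is `False` because that Dataset is internal, while `can_target("default_logs")` is `True`.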