
Federated Search v2

Use v2 Datatypes and Datasets to speed up your federated searches. See what’s currently supported.

Highlights

v2 Datatypes are gradually replacing v1 Datatypes.
Federated Datasets currently support v2 Datatypes for NDJSON and delimited text (such as CSV) on Amazon S3 and Azure Blob Storage.
Certain limitations apply.

What Are v2 Datatypes?

Next-generation Datatypes in Cribl Search. They power high-speed lakehouse engines, and let you use the new, high-performance architecture for federated queries into object storage like Amazon S3 or Azure Blob.

v2 Datatypes work with lakehouse engines out of the box. Support for federated Datasets, by contrast, is rolling out gradually and requires some reconfiguration on your part.

For basic info on v2 Datatypes, see v2 Datatypes in Cribl Search.

For info on Datatypes in general, see Datatypes in Cribl Search.

No Action Required For Now

You don’t have to migrate your existing federated Datasets at this point. However, we recommend using v2 Datatypes for new Datasets where supported.

Why Switch from v1 to v2

Use v2 Datatypes and Datasets to get:

Faster queries: Filtering and parsing happen closer to the data, replacing the older rule-chain model.
Clearer mapping: Each path and glob pattern maps directly to a Datatype, so you always know which parsing applies.
Future-proofing: Switching now aligns you with where federated search is going, as well as with high-speed lakehouse engines.

What’s Supported Today

As of Cribl Search 4.17.0, you can use v2 Datatypes with Amazon S3 and Azure Blob Storage Datasets. Supported data formats are limited to NDJSON and delimited text.

For full details on what’s supported, see Current Limitations.

Switch a Federated Dataset from v1 to v2

You can’t directly migrate from v1 to v2, but you can clone your existing v1 Dataset and modify it.

1. Check What’s Supported

Start by reviewing the limitations to understand current support.

2. Clone and Reconfigure Your Dataset

Clone your existing v1 Dataset.
Switch the cloned Dataset’s Type from v1 to v2.
Configure the bucket/container path(s) and other settings.
For Amazon S3, see v2 Dataset Configuration for S3.
For Azure Blob, see v2 Dataset Configuration for Azure Blob.
Confirm with Save.

3. Verify by Running a Search

Run a test search against your new v2 Dataset. If the results include the correct datatype field, the switch worked.

Dataset configuration may be cached briefly after saving, so if results look off at first, wait a moment and re-run.

Current Limitations of Federated Search v2

Support for v2 federated Datasets is expanding gradually. Read on to see the current status.

Only NDJSON and Delimited Text Are Supported

You can currently use only JSON Newline Delimited and Delimited Text data formats. Other formats, such as Parquet, are not supported.

Cribl Lake Datasets Don’t Support v2

You can’t currently use v2 Datatypes with Cribl Lake Datasets, even for JSON formats.

Some v2 Datatype Options Are Not Available for Federated Search

If you configure a v2 Datatype with the following options, you won’t be able to apply that Datatype to federated Datasets:

When you’re creating a federated Dataset, the Cribl Search UI hides any v2 Datatypes that use these configurations.

One Bucket/Container Path per Dataset

Each v2 federated Dataset supports only one bucket/container path. Creating multiple paths via the API will cause search errors.

You can add multiple filters to that path to handle different file types (e.g., JSON and CSV). Filters are applied in order, so place specific patterns before general ones.
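To see why ordering matters, here is a minimal Python sketch of first-match-wins filter evaluation. It is an illustration only, not Cribl's implementation: the filter list, the Datatype names, and the `pick_datatype` helper are hypothetical, and Python's `fnmatchcase` stands in for the real glob matcher (its `*` also matches `/`, which may differ from Cribl Search).

```python
from fnmatch import fnmatchcase

# Hypothetical ordered filter list: (glob pattern, Datatype name).
# The Datatype names are illustrative, not real Cribl identifiers.
FILTERS = [
    ("*/audit/*.json", "audit_ndjson"),  # specific: JSON under an audit folder
    ("*.json", "generic_ndjson"),        # general: any other JSON
    ("*.csv", "generic_csv"),
]


def pick_datatype(object_path, filters=FILTERS):
    """Return the Datatype of the first filter that matches, else None."""
    for pattern, datatype in filters:
        # Assumption: evaluation stops at the first matching filter.
        if fnmatchcase(object_path, pattern):
            return datatype
    return None


# Specific pattern listed first: the audit file gets its own Datatype.
print(pick_datatype("my-bucket/logs/audit/a.json"))
print(pick_datatype("my-bucket/logs/app/b.json"))

# General pattern listed first: it shadows the specific one.
shadowed = [FILTERS[1], FILTERS[0]]
print(pick_datatype("my-bucket/logs/audit/a.json", shadowed))
```

With the general `*.json` filter placed first, the audit file never reaches the more specific filter, which is the situation the ordering advice is meant to avoid.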

Glob Patterns Match the Full Object Path

Glob patterns match the entire path (e.g., bucket/prefix/folder/file.csv), not just the filename. Use recursive patterns like **/*.csv to include files in subfolders.

Cribl Search doesn’t validate patterns on save, so test with a small data subset first.
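Because patterns are not validated on save, it can help to dry-run them locally against a handful of object paths first. The Python sketch below assumes common glob semantics (`**/` spans zero or more folders, while `*` and `?` stay within one path segment). Cribl Search's exact glob dialect may differ, and the `glob_to_regex` and `matches` helpers are hypothetical.

```python
import re


def glob_to_regex(pattern):
    """Translate a glob into a regex (assumed semantics, see lead-in)."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            parts.append("(?:.*/)?")  # zero or more whole folders
            i += 3
        elif pattern.startswith("**", i):
            parts.append(".*")        # anything, including slashes
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")     # stays within one path segment
            i += 1
        elif pattern[i] == "?":
            parts.append("[^/]")      # one character, not a slash
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


def matches(pattern, object_path):
    """True only if the pattern matches the entire object path."""
    return glob_to_regex(pattern).fullmatch(object_path) is not None


# A small, made-up sample of object paths to test against.
SAMPLE = [
    "bucket/prefix/folder/file.csv",
    "bucket/prefix/file.csv",
    "bucket/prefix/folder/file.json",
]

for pattern in ("*.csv", "bucket/prefix/*.csv", "**/*.csv"):
    hits = [path for path in SAMPLE if matches(pattern, path)]
    print(pattern, "->", hits)
```

Under these assumptions, `*.csv` matches nothing in the sample because it is compared against the whole path rather than the filename, `bucket/prefix/*.csv` matches only the file directly under the prefix, and `**/*.csv` matches both CSV files.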

CSV Headers Are Parsed as Events

There’s no setting to mark a header row. If your CSV files have one, it will appear as the first data event instead of being used as column names.
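The effect can be reproduced outside Cribl Search with a short Python sketch. The sample data, the configured column names, and both helper functions are hypothetical. The sketch only illustrates what "header parsed as an event" looks like and one possible way to spot such events, not how Cribl Search parses CSV internally.

```python
import csv
import io

# Made-up CSV content whose first line is a header row.
SAMPLE_CSV = (
    "timestamp,host,status\n"
    "2024-01-01T00:00:00Z,web-1,200\n"
    "2024-01-01T00:00:05Z,web-2,500\n"
)

# Hypothetical column names supplied by configuration, since the
# file's own header row is not used to name fields.
COLUMNS = ["ts", "host", "status"]
HEADER_VALUES = ["timestamp", "host", "status"]


def parse_rows_as_events(text, columns):
    """Treat every line, including the header, as a data event."""
    reader = csv.reader(io.StringIO(text))
    return [dict(zip(columns, row)) for row in reader]


def is_header_event(event):
    """Flag an event whose values are exactly the header labels."""
    return list(event.values()) == HEADER_VALUES


events = parse_rows_as_events(SAMPLE_CSV, COLUMNS)
print(events[0])  # the header row, surfaced as an ordinary event

data_events = [e for e in events if not is_header_event(e)]
print(len(events), len(data_events))
```

In this sketch, the first event carries the header labels as field values, so anything that counts or aggregates events would include it unless it is filtered out.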
