Lakehouse Search Differences

Identify Cribl Search operators, functions, and data types that behave differently when searching Lakehouses.


When you run Cribl Search queries against a Lakehouse-assigned Dataset, some behavior and results differ from a corresponding search executed against a Cribl Lake Dataset without Lakehouse caching. This page identifies operators, functions, data types, and other details to keep in mind when searching Lakehouses.

Unlimited Events with sort

With Lakehouse, the sort operator’s MaxNoOfOutputEvents parameter defaults to unlimited, rather than to 10000 events as in a regular search. You can still set this parameter explicitly to cap the number of output events.

Strict Boolean Comparisons

Boolean comparisons in Lakehouse searches are strongly typed. Assume a query condition like field1==field2, in which one field is a boolean, while the other is a boolean-compatible value (such as "yes", "y", "t", "true", or "1" for true; or "no", "f", "false", or "0" for false).

In a mixed comparison – for example, where field1’s value is true and field2’s value is "yes" – note that field1==field2 will not match. (When searching the same data without Lakehouse caching, looser typing allows these values to match.)
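The difference between the two comparison modes can be sketched in Python. This is only an illustrative model, not Cribl Search syntax or Cribl's implementation: the helper names (to_bool_loose, equals_loose, equals_strict) are hypothetical, and the sets of boolean-compatible strings are copied from the list above.

```python
# Illustrative model only; all function names here are hypothetical.
TRUE_STRINGS = {"yes", "y", "t", "true", "1"}
FALSE_STRINGS = {"no", "f", "false", "0"}


def to_bool_loose(value):
    """Map a boolean or boolean-compatible string to bool; otherwise return None."""
    if isinstance(value, bool):
        return value
    if isinstance(value, str):
        if value in TRUE_STRINGS:
            return True
        if value in FALSE_STRINGS:
            return False
    return None


def equals_loose(a, b):
    """Model of looser typing (searching without Lakehouse caching)."""
    la, lb = to_bool_loose(a), to_bool_loose(b)
    if la is None or lb is None:
        return a == b  # Not boolean-compatible: fall back to plain equality.
    return la == lb


def equals_strict(a, b):
    """Model of strongly typed comparison (Lakehouse): types must match."""
    return type(a) is type(b) and a == b


field1, field2 = True, "yes"
print(equals_loose(field1, field2))   # Loose typing lets the values match.
print(equals_strict(field1, field2))  # Strict typing does not.
```

If a query depends on the looser behavior, one option is to normalize both fields to the same type before comparing them.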

Statistical Aggregations with Booleans

A statistical aggregation that aggregates mathematically on a boolean value will return a null in Lakehouse searches. (Running the same aggregation without Lakehouse caching will return a numeric value – either 1 for true or 0 for false.)

This null return value applies to the following functions when their Expression argument takes a boolean value: avg, percentile, stdev, stdevp, sum, variance, and variancep. (A boolean value in these functions’ Predicate argument does not cause a null return value.)

The null return value also occurs with a boolean value in the following functions’ sole Expression argument: avgif, stdevif, sumif, and varianceif.
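As a rough illustration of the behavior described above, the following Python sketch models a sum aggregation in both modes. It is an assumption-laden model rather than Cribl's implementation: the function names are hypothetical, and None stands in for the null return value.

```python
# Illustrative model only; function names are hypothetical and None stands in for null.

def sum_without_lakehouse(values):
    """Booleans are treated as numbers: 1 for true, 0 for false."""
    return sum(int(v) if isinstance(v, bool) else v for v in values)


def sum_with_lakehouse(values):
    """Aggregating mathematically on boolean values yields null."""
    if any(isinstance(v, bool) for v in values):
        return None
    return sum(values)


flags = [True, False, True]
print(sum_without_lakehouse(flags))  # Numeric result from 1/0 values.
print(sum_with_lakehouse(flags))     # None, modeling the null return.
```

The same pattern would apply to the other listed functions; only sum is modeled here to keep the sketch short.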

dcount Excludes null Values

When applying the dcount aggregation function, Lakehouse searches do not count null values toward the total, whereas searches without Lakehouse caching do.

This means that if the field contains any null values, the dcount result from a Lakehouse search will be lower by 1 than the result of running the same query against the same data without Lakehouse caching (or in a different data store).
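The difference of 1 can be seen in a small Python model of a distinct count. This is a sketch under stated assumptions, not Cribl's implementation: the function names are hypothetical, and None stands in for null.

```python
# Illustrative model only; function names are hypothetical and None stands in for null.

def dcount_without_lakehouse(values):
    """null counts as one distinct value."""
    return len(set(values))


def dcount_with_lakehouse(values):
    """null values are excluded before counting distinct values."""
    return len({v for v in values if v is not None})


hosts = ["a", "b", None, "a", None]
print(dcount_without_lakehouse(hosts))  # Distinct values: "a", "b", and null.
print(dcount_with_lakehouse(hosts))     # Distinct values: "a" and "b" only.
```

However many null values the data contains, they collapse into a single distinct value, so the two results differ by exactly 1 when at least one null is present and match otherwise.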

Extra null Column with timestats

Applying the timestats aggregation function to a search against a Lakehouse-assigned Dataset might return a null column when only empty values are found. Searches without Lakehouse caching do not return this column.

Higher dedup and eventstats Result Counts

Applying the dedup or eventstats operator to a search against a Lakehouse-assigned Dataset can return a higher result count, compared to searching the same data without Lakehouse caching. This is because dedup and eventstats in Lakehouse mode can process unlimited events, compared to the 50,000-event limit applied to dedup and eventstats in searches without caching.
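To show why an event limit can lower the result count, here is a minimal Python model of deduplication with an optional cap on the events processed. It assumes the limit means that only the first N input events are considered; the source does not spell out how the limit is applied, so treat that as an assumption. The dedup function and its limit parameter are hypothetical names, not Cribl Search syntax.

```python
# Illustrative model only. Assumption: the limit truncates the input to the
# first `limit` events. The function and parameter names are hypothetical.

def dedup(events, key, limit=None):
    """Keep the first event seen for each distinct value of `key`."""
    considered = events if limit is None else events[:limit]
    seen = set()
    result = []
    for event in considered:
        value = event[key]
        if value not in seen:
            seen.add(value)
            result.append(event)
    return result


events = [{"id": i} for i in range(60_000)]  # 60,000 events with unique ids.
print(len(dedup(events, "id", limit=50_000)))  # Capped, as without caching.
print(len(dedup(events, "id")))                # Unlimited, as in Lakehouse mode.
```

Under this model, the two modes only diverge when the input exceeds the limit, so smaller result sets should be unaffected.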