Lakehouse Engines in Cribl Search

Ingest your data into Cribl Search to get the best search performance and make the most of AI investigations.


What are Lakehouse Engines?

Lakehouse engines are storage and compute units that ingest, store, and accelerate your data inside Cribl Search. They make queries faster and give you deeper results from features like AI investigations or pre-search data exploration.

Why Use Lakehouse Engines

Lakehouse engines enable Cribl Search’s most powerful features:

  • Easy data onboarding: Add Sources, auto-apply Datatypes, and land into Datasets in one flow.
  • High-speed search: Run queries much faster than federated search.
  • Data explorer: Profile Datasets instantly using pre-computed metadata.
  • AI workflows: Run investigations with deep context derived from full schema discovery.

You don’t have to use Cribl Stream, Edge, or Lake. You can ingest your data directly into Cribl Search.

How a Lakehouse Engine Works

A lakehouse engine handles most of the work automatically, with a human-in-the-loop approach:

  1. Ingests data from one or more supported Sources.
  2. Recognizes, categorizes, and structures the data (we call it “Datatyping”).
  3. Organizes your data into Datasets.
  4. Drops expired data when its retention period is up.

The two main knobs you have are:

  • Lakehouse engine size (per lakehouse engine): Maximum ingest rate per day.
  • Retention period (per Dataset): How long to keep the data.

What Lakehouse Engine Size to Choose

Estimate how much uncompressed data you plan to send per day. Choose a size whose daily capacity covers that ingest, leaving some headroom for spikes.

If your ingest rate changes, or you experience ingest or search latency, you can resize your lakehouse engine. If the available sizes are not enough, you can add more lakehouse engines to distribute the workload.

Lakehouse Engine Sizes Available

Size                           Capacity (per day)
X-Small                        300 GB
Small                          600 GB
Medium                         1,200 GB
Large                          2,400 GB
X-Large                        4,800 GB
2X-Large                       9,600 GB
3X-Large (Contact Support)     14 TB
4X-Large (Contact Support)     19 TB
5X-Large (Contact Support)     24 TB
6X-Large (Contact Support)     28 TB
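
As a rough illustration of the sizing guidance, the following Python sketch picks the smallest size whose daily capacity covers an expected ingest volume plus headroom. The capacities come from the table above. The function name pick_size, the 20% default headroom, and the 1 TB = 1,000 GB conversion are assumptions made for this example, not Cribl recommendations or part of any Cribl API.

```python
# Daily capacities in GB, copied from the size table.
# Assumption: sizes quoted in TB are converted at 1 TB = 1,000 GB.
SIZES_GB = [
    ("X-Small", 300),
    ("Small", 600),
    ("Medium", 1_200),
    ("Large", 2_400),
    ("X-Large", 4_800),
    ("2X-Large", 9_600),
    ("3X-Large", 14_000),  # 3X-Large and above: Contact Support
    ("4X-Large", 19_000),
    ("5X-Large", 24_000),
    ("6X-Large", 28_000),
]


def pick_size(daily_ingest_gb, headroom=0.2):
    """Return the smallest size covering ingest plus headroom, or None.

    daily_ingest_gb: expected uncompressed ingest per day, in GB.
    headroom: fraction added for spikes (0.2 = 20%, an illustrative default).
    """
    if daily_ingest_gb < 0 or headroom < 0:
        raise ValueError("daily_ingest_gb and headroom must be non-negative")
    required_gb = daily_ingest_gb * (1 + headroom)
    # SIZES_GB is ordered smallest to largest, so the first match is the smallest fit.
    for name, capacity_gb in SIZES_GB:
        if capacity_gb >= required_gb:
            return name
    # No single size is large enough; consider spreading ingest across more engines.
    return None
```

For example, pick_size(450) returns "Small", because 450 GB plus 20% headroom is about 540 GB. A return value of None means that no single size covers the load, which is the case where adding more lakehouse engines to distribute the workload applies.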

Add a New Lakehouse Engine

Search Admins and above can add lakehouse engines from the Cribl Search Engines tab.

  1. On the Cribl.Cloud top bar, select Products > Search > Data.
  2. Select the Engines tab, then Add Engine.
  3. Give your engine an ID (for example, palo_alto_logs) that is unique across your Workspace. You won’t be able to change it later.

    The main ID is reserved.

  4. Set the Lakehouse engine Size. You can resize it later if needed.

    See What Lakehouse Engine Size to Choose.

  5. Confirm with Save.

When the lakehouse engine status is Ready, you can start connecting your Sources.

Check Lakehouse Engine Status

Status          Meaning
Provisioning    Lakehouse engine is being created or resized.
Ready           Lakehouse engine is healthy, fully operational, and able to process data.
Recovering      Lakehouse engine is down and trying to recover.
Delayed         Provisioning has timed out, but is still retrying.
Deleting        Lakehouse engine is being deleted.

Resize a Lakehouse Engine

Search Admins and above can resize lakehouse engines from the Cribl Search Engines tab.

  1. On the Cribl.Cloud top bar, select Products > Search > Data > Engines.
  2. Select the lakehouse engine you want to resize.
  3. Set the new lakehouse engine Size. See What Lakehouse Engine Size to Choose.
  4. Confirm with Save.

Wait until the lakehouse engine status changes from Provisioning to Ready again.
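
If you script around this wait, the logic can be sketched as a small polling loop over the statuses listed under Check Lakehouse Engine Status. This page does not describe an API for reading the status, so get_status below is a hypothetical callable that you would supply yourself. The function name, timeout, and polling interval are likewise assumptions for illustration.

```python
import time


def wait_until_ready(get_status, timeout_s=1800.0, poll_s=15.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll until the engine reports Ready.

    get_status: hypothetical zero-argument callable returning one of
        "Provisioning", "Ready", "Recovering", "Delayed", "Deleting".
    timeout_s, poll_s: illustrative defaults, not documented values.
    """
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        if status == "Ready":
            return status
        if status == "Deleting":
            # An engine that is being deleted will not become Ready.
            raise RuntimeError("lakehouse engine is being deleted")
        # Provisioning, Delayed, and Recovering are treated as "keep waiting".
        if clock() >= deadline:
            raise TimeoutError(f"engine still {status} after {timeout_s} s")
        sleep(poll_s)
```

Treating Delayed as a reason to keep waiting follows the status table, which says provisioning is still retrying in that state. How long to wait before giving up is a judgment call for your environment.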