Lakehouse Engines in Cribl Search
Ingest your data into Cribl Search to get the best search performance and make the most of AI investigations.
What are Lakehouse Engines?
Storage and compute units that ingest, store, and accelerate your data inside Cribl Search, making queries faster and giving you deeper results from features like AI investigations or pre-search data exploration.
Why Use Lakehouse Engines
Lakehouse engines enable Cribl Search’s most powerful features:
- Easy data onboarding: Add Sources, auto-apply Datatypes, and land into Datasets in one flow.
- High-speed search: Run queries much faster than federated search.
- Data explorer: Profile Datasets instantly using pre-computed metadata.
- AI workflows: Run investigations with deep context derived from full schema discovery.
You don’t have to use Cribl Stream, Edge, or Lake. You can ingest your data directly into Cribl Search.
How a Lakehouse Engine Works
A lakehouse engine handles most of the work automatically, with a human-in-the-loop approach:
- Ingests data from one or more supported Sources.
- Recognizes, categorizes, and structures the data (we call it “Datatyping”).
- Organizes your data into Datasets.
- Drops expired data when its retention period is up.
The two main knobs you have are:
- Lakehouse engine size (per lakehouse engine): Maximum ingest rate per day.
- Retention period (per Dataset): How long to keep the data.
What Lakehouse Engine Size to Choose
Estimate how much uncompressed data you plan to send per day. Choose a size that covers that ingest, leaving some headroom for spikes.
If your ingest rate changes, or you experience ingest or search latency, you can resize your lakehouse engine. If the available sizes are not enough, you can add more lakehouse engines to distribute the workload.
Lakehouse Engine Sizes Available
| Size | Capacity (per day) |
|---|---|
| X-Small | 300 GB |
| Small | 600 GB |
| Medium | 1,200 GB |
| Large | 2,400 GB |
| X-Large | 4,800 GB |
| 2X-Large | 9,600 GB |
| 3X-Large (Contact Support) | 14 TB |
| 4X-Large (Contact Support) | 19 TB |
| 5X-Large (Contact Support) | 24 TB |
| 6X-Large (Contact Support) | 28 TB |
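
As a rough illustration of the sizing advice, the sketch below picks the smallest size whose daily capacity covers a planned ingest plus headroom. It is not part of Cribl Search: the `pick_size` helper, the 20% default headroom, and the 1 TB = 1,000 GB conversion are all assumptions made for this example. Capacities are copied from the table above.

```python
# Daily capacity per size, in GB, copied from the sizes table.
# Assumption: the TB rows are converted as 1 TB = 1,000 GB.
SIZES_GB = [
    ("X-Small", 300),
    ("Small", 600),
    ("Medium", 1_200),
    ("Large", 2_400),
    ("X-Large", 4_800),
    ("2X-Large", 9_600),
    ("3X-Large", 14_000),  # 3X-Large and above require contacting Support
    ("4X-Large", 19_000),
    ("5X-Large", 24_000),
    ("6X-Large", 28_000),
]


def pick_size(daily_ingest_gb, headroom=0.2):
    """Return the smallest size covering the ingest plus headroom.

    `headroom` is a fraction (0.2 means 20% spare capacity); the default
    is an illustrative choice, not a documented recommendation.
    Returns None when no single size is large enough.
    """
    if daily_ingest_gb < 0 or headroom < 0:
        raise ValueError("ingest and headroom must be non-negative")
    needed_gb = daily_ingest_gb * (1 + headroom)
    for name, capacity_gb in SIZES_GB:
        if capacity_gb >= needed_gb:
            return name
    return None


if __name__ == "__main__":
    print(pick_size(400))     # 480 GB needed with headroom
    print(pick_size(30_000))  # exceeds every single size
```

A `None` result corresponds to the case where the available sizes are not enough and the workload would be spread across more than one lakehouse engine.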
Add a New Lakehouse Engine
Search Admins and above can add lakehouse engines from the Cribl Search Engines tab.
- On the Cribl.Cloud top bar, select Products > Search > Data.
- Select the Engines tab, then Add Engine.
- Give your engine an ID (for example, palo_alto_logs) unique across your Workspace. You won’t be able to change it later. The main ID is reserved.
- Set the lakehouse engine Size. You can resize it later if needed.
- Confirm with Save.
When the lakehouse engine status is Ready, you can start connecting your Sources.
Check Lakehouse Engine Status
| Status | Meaning |
|---|---|
| Provisioning | Lakehouse engine is being created or resized. |
| Ready | Lakehouse engine is healthy, fully operational, and able to process data. |
| Recovering | Lakehouse engine is down and trying to recover. |
| Delayed | Provisioning has timed out, but is still retrying. |
| Deleting | Lakehouse engine is being deleted. |
Resize a Lakehouse Engine
Search Admins and above can resize lakehouse engines from the Cribl Search Engines tab.
- On the Cribl.Cloud top bar, select Products > Search > Data > Engines.
- Select the lakehouse engine you want to resize.
- Set the new lakehouse engine Size. See What Lakehouse Engine Size to Choose.
- Confirm with Save.
Wait until the lakehouse engine status changes from Provisioning to Ready again.
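
If you script around this wait, the logic can be sketched as a simple polling loop. The example below assumes no particular Cribl API: `get_status` is a hypothetical callable that you would supply, returning one of the status names from the status table, and the timeout and polling interval are illustrative values only.

```python
import time


def wait_until_ready(get_status, timeout_s=1800, poll_s=15, sleep=time.sleep):
    """Poll until the engine reports Ready.

    `get_status` is a hypothetical, caller-supplied function returning a
    status string such as "Provisioning", "Delayed", "Recovering",
    "Deleting", or "Ready". Returns True once Ready is seen, or False if
    `timeout_s` seconds of waiting elapse first.
    """
    waited_s = 0
    while True:
        status = get_status()
        if status == "Ready":
            return True
        if status == "Deleting":
            # An engine that is being deleted will not become Ready.
            raise RuntimeError("engine is being deleted")
        if waited_s >= timeout_s:
            return False
        # Provisioning, Delayed, and Recovering are treated as "keep waiting".
        sleep(poll_s)
        waited_s += poll_s


if __name__ == "__main__":
    # Simulated status sequence standing in for a real status lookup.
    statuses = iter(["Provisioning", "Provisioning", "Ready"])
    print(wait_until_ready(lambda: next(statuses), sleep=lambda s: None))
```

Passing `sleep` as a parameter keeps the loop easy to exercise without real delays, as the simulated run at the bottom does.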