Home / Search/ Search Your Data/Dataset Acceleration (Deprecated)

Dataset Acceleration (Deprecated)

Automatically prescan selected portions of your data to improve future search performance.


Deprecated Feature

Cribl has deprecated this feature and removed it from the product. We invite you to explore Cribl Lakehouses as a new option to speed up searches on designated Datasets.

Please continue to report Dataset Acceleration issues through normal Cribl support channels, but assistance for this deprecated feature might be limited.

Although you can always point Cribl Search at any source and start searching immediately, you can also automatically prescan selected Datasets and fields beforehand, using Dataset Acceleration. This process lets Cribl Search generate relevant metadata to further optimize search performance, especially for very large Datasets.

You can use Dataset Acceleration with the following Dataset types:

As an alternative to the UI-based techniques outlined below, you can run ad hoc prescans, using the .generate metadata command.

Dataset Acceleration Costs

Dataset Acceleration runs periodic background searches that consume your organization’s credits. However, it boosts performance and saves time, which can actually help you reduce long-term costs. To make an informed decision about each of your Datasets, check the performance gain that Dataset Acceleration provides in your case. The larger the Dataset, the more improvement you should see.

After the first prescan, Cribl Search will scan only the data modified since the last scan. This happens every five minutes (a frequency that might change in the future).

To keep Dataset Acceleration costs under control, you can:

  • Choose which Datasets to accelerate.
  • Specify the time range that you’re interested in.
  • Select the exact fields that you want to prescan, using the Detailed scan mode.
  • Measure the performance gain that the Acceleration provides.
  • Disable Dataset Acceleration at any time.

Dataset Acceleration Scan Modes

You can decide how much metadata to generate for each Dataset, by selecting one of the two scan modes.

The Quick mode is a lightweight scan that lists the objects included in the Dataset, their paths, and the existing metadata.

The Detailed mode is a more thorough scan that also scans the objects themselves, collecting min/max statistics about the specific fields that you’re interested in. This scan takes longer.

What Cribl Search doesQuick modeDetailed mode
Lists all objects included in the Dataset, along with their paths.
Lists the existing metadata of all objects.
Scans selected objects to generate min/max statistics about the specified fields.

Find Out Which Datasets Are Accelerated

When you go to Data > Datasets, accelerated Datasets are marked with the lightning bolt icon Dataset Acceleration icon .

When you’re running a search, look for the lightning bolt icon Dataset Acceleration icon below the query box:

  • Acceleration icon means that at least one of the queried Datasets is accelerated.
  • Non-acceleration icon means that none of the queried Datasets are accelerated.

Enable Dataset Acceleration

To enable Dataset Acceleration for a Dataset:

  1. In Cribl Search, select Data, then Datasets.
  2. Select the Dataset you’re interested in, and then Acceleration.
  3. Toggle Enable Acceleration to Yes.
  4. In Time Range, set a rolling time window that indicates how far back in time the metadata should be maintained. Acceleration won’t apply to data generated before the start of this time range.
    If you select All Time, Cribl Search will scan the data all the way back to 1 January 1970 (the Unix epoch), but most probably it’ll only happen once, because all subsequent scans will cover only the data modified since the last scan.
  5. Select the scan mode that you want to use.
    • In the Detailed scan mode, you need to enter the exact fields that you want to generate min/max statistics for. Focus on the fields that you’ll search the most frequently. Note that _time is included by default.
  6. Select Save. Cribl Search will start scanning the data in the background. Depending on the Dataset size, the first prescan may take a while, so for very large Datasets, consider adjusting the Running time limit.

From now on, Cribl Search will periodically scan recently modified data in the background, to keep the metadata up to date.

To test Acceleration on the cribl_search_sample Dataset, you need to clone it first. Select Data, then Datasets, then cribl_search_sample, then Clone Dataset.

Disable Dataset Acceleration

To disable Dataset Acceleration for a Dataset:

  1. In Cribl Search, select Data, then Datasets.
  2. Select the Dataset you’re interested in, and then Acceleration.
  3. Toggle Enable Acceleration to No.
  4. Select Save. Cribl Search will delete the metadata associated with the Dataset, similarly to running the .drop all metadata command.

Check Dataset Acceleration Performance

To check the exact performance gain from using Dataset Acceleration:

  1. Run a search on a Dataset that’s not accelerated yet. The lightning bolt icon should display as Non-acceleration icon .
  2. Once the search starts running, select Details, then Metrics. Here, you can view the search’s performance metrics. Under Scanned and Events returned, you should see the Accelerate your Datasets link that takes you to the Dataset’s settings. This means that the queried Dataset is not accelerated.
  3. Close the window by selecting Done.
  4. Now, enable Acceleration for the Datasets you just searched. Make sure to select fields and time range that match the search you just ran.
  5. After Cribl Search finishes scanning the data, run the same query as before. The lightning bolt icon should now display as Acceleration icon and the search should run faster than before.
  6. Compare the metrics with the previous search (if you still see the Accelerate your Datasets links, Cribl Search didn’t finish prescanning the data yet).
    • Under Scanned, the Accelerated metric shows what percentage of the scanned data is affected by Dataset Acceleration.
    • Under Events returned, the Accelerated metric shows what percentage of the returned events contain accelerated fields.
Dataset Acceleration metrics
Dataset Acceleration metrics