Optimize Searches

Get strategies and tips for optimizing your searches.

The practices on this page help you make searches more efficient and faster.

Constrain Initial Query with Limit, Count, or Time Window

Until you’re certain how large your Dataset is, add a limit, a count(), or a time window to your initial query.

Instead of this:

dataset="cribl_search_sample"

Do this:

dataset="cribl_search_sample" | limit 1000

Or this:

dataset="cribl_search_sample" dataSource="access_combined"
 | limit 1000
 | summarize count() by host, clientip

Or this, to constrain results to a 2-month window:

dataset="cribl_search_sample" earliest=-2mon
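These constraints can also be combined. A minimal sketch, assuming the earliest modifier can sit in the first clause next to a field filter, as each does separately in the examples above:

```text
dataset="cribl_search_sample" dataSource="access_combined" earliest=-2mon
 | limit 1000
 | summarize count() by host, clientip
```

Here the time window and the filter narrow the scan, and the limit caps how many events reach summarize.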

Unstructured Searches Are Fastest

Unstructured term searches will return faster than structured searches. Consider using a broad search (with a limit) in initial investigations.

Instead of this:

dataset="my_dataset" email="jim@company.com" | limit 1000

Do this:

dataset="my_dataset" "jim@company.com" | limit 1000

Place Filters Early in Queries

Try to push filtering expressions to the left-most side of your query wherever possible.

Instead of this:

dataset="<my-datasource>" 
| where tenantId == "foo-bar-12345" 
| where proc == "bash" 
| where data_source == "stdout"

Do this:

dataset="<my-datasource>" tenantId="foo-bar-12345" proc="bash" data_source="stdout"

Place Intensive Operations Last

To minimize costs, try to move functions like lookup and sort to the right-most part of a query. First, filter or aggregate your Dataset to reduce the data volume sent to these functions.

Instead of this:

dataset="<my-datasource>" dataSource="VPC Flow Logs" | lookup service_names on dst_port | summarize count() by service_name

Do this:

dataset="<my-datasource>" dataSource="VPC Flow Logs" | summarize count() by dst_port | lookup service_names on dst_port

Summarize to Search Faster

A search that returns a large number of raw events to the UI is slower than one that returns a summarized result set.

Instead of this:

dataset="my_data"

Do this:

dataset="my_data" | summarize count() by cid

Summarize Before Join

When using joins with aggregation operators, it’s more efficient to summarize-then-join rather than join-then-summarize.

Instead of this:

let URLMethods=dataset="access_common_data"; 

dataset="my-datasource" | join URLMethods on URL | summarize count() by method

Do this:

let URLMethods=dataset="access_common_data" | summarize count() by URL, method;

dataset="my-datasource" | summarize count() by URL, method | join URLMethods on URL 

Search Faster with Comma Syntax

Many operators accept several comma-separated expressions in a single call. Where they do, combine the expressions with commas instead of chaining one pipe per expression.

Instead of this:

... | extend field1="foo" | extend field2="bar" | extend field3="pike"

Do this:

... | extend field1="foo", field2="bar", field3="pike"
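The same pattern may carry over to other operators. As a sketch only, assuming summarize accepts several comma-separated aggregations the way extend accepts several assignments (the bytes and host fields are hypothetical):

```text
... | summarize count(), sum(bytes) by host
```

If that assumption holds, both aggregations are computed in one summarize call over the same grouping.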

Move Partition Tokens to Queries

Token-based partitions in your Dataset can drastically increase your search time if the directory paths are very broad. If you instead specify the tokens as part of your search, the search span shrinks to only the matching sub-trees, as in the example below. Ideally, keep time in the left-most portion of your path, and keep tokens to the right wherever possible.

If this is your Dataset definition:

data/${dataSource}/${_time:%Y}/${_time:%m}/${_time:%d}/${_time:%H}

Then instead of this:

dataset="myDataset" | summarize count() by destination | where dataSource="cisco"

Do this:

dataset="myDataset" dataSource="cisco" | summarize count() by destination

Send Exclusively to Cribl Stream to Speed Up Large Result Sets

When search results exceed several thousand events, sending them exclusively to Cribl Stream with the send operator is faster than returning events to the Cribl Search UI.

Instead of this:

dataset="my_data" | send tee=true

Do this:

dataset="my_data" | send

Optimize Parquet with `project`

When searching Parquet files, use project as the second clause of your query to narrow the subsequent expressions to only the fields you need.

Instead of this:

dataset="a-parquet-datasource" | summarize sum(bytes) by customer, account

Do this:

dataset="a-parquet-datasource" | project bytes, customer, account | summarize sum(bytes) by customer, account
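This tip can be combined with early filtering. A minimal sketch, assuming a field filter in the first clause works on a Parquet Dataset as it does in the earlier examples (the customer value "acme" is hypothetical):

```text
dataset="a-parquet-datasource" customer="acme"
 | project bytes, customer, account
 | summarize sum(bytes) by customer, account
```

The filter stays left-most, project trims the fields next, and the aggregation runs last on the reduced data.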
