Cribl LogStream – Docs

Version 2.4.4

Lookups Library

What Are Lookups?

Lookups are data tables that can be used in Cribl LogStream to enrich events as they're processed by the Lookup Function. You can access the Lookups library under Knowledge > Lookups, which provides a management interface for all lookups.

This library is searchable, and each lookup can be tagged as necessary. There's full support for .csv files. Compressed files are supported, but must be in gzip format (.gz extension). You can add files in MaxMind database (.mmdb) binary format, but you cannot edit these binary files through LogStream's UI.
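To make the idea of enrichment concrete, here is a minimal Python sketch of a table lookup against a gzip-compressed .csv file. It is not LogStream's implementation of the Lookup Function. The column names (`host`, `owner`), the sample data, and the `load_lookup` and `enrich` helpers are all hypothetical, chosen only for illustration.

```python
import csv
import gzip
import io

# Hypothetical lookup table: maps a host name to an owning team.
SAMPLE_CSV = "host,owner\nweb01,frontend\ndb01,storage\n"

def load_lookup(gz_bytes, key_field):
    """Read a gzip-compressed CSV into a dict keyed by key_field."""
    with gzip.open(io.BytesIO(gz_bytes), mode="rt", newline="") as fh:
        return {row[key_field]: row for row in csv.DictReader(fh)}

def enrich(event, table, event_field, key_field):
    """Return a copy of event with the matching row's other columns added."""
    row = table.get(event.get(event_field))
    if row is None:
        return dict(event)  # no match: leave the event unchanged
    extra = {k: v for k, v in row.items() if k != key_field}
    return {**event, **extra}

# Build the compressed table in memory so the sketch needs no files on disk.
table = load_lookup(gzip.compress(SAMPLE_CSV.encode("utf-8")), "host")
enriched = enrich({"host": "web01", "status": 200}, table, "host", "host")
```

In this sketch, an event whose `host` matches a row gains that row's `owner` value, while an event with no matching row passes through unchanged.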


How Does the Library Work?

In single-instance deployments, all files handled by the interface are stored in $CRIBL_HOME/data/lookups. In distributed deployments, the storage path on the Master Node is $CRIBL_HOME/groups/<groupname>/data/lookups/ for each Worker Group.
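The two storage paths above can be sketched as a small Python helper. The `lookups_dir` function, the `/opt/cribl` fallback, and the `default` group name are assumptions for illustration; only the directory layout itself comes from the text above.

```python
import os
from pathlib import PurePosixPath

def lookups_dir(cribl_home, group=None):
    """Return the lookups storage directory described above.

    group=None models a single-instance deployment; a group name models
    the Master Node's per-Worker-Group path in a distributed deployment.
    """
    home = PurePosixPath(cribl_home)
    if group is None:
        return home / "data" / "lookups"
    return home / "groups" / group / "data" / "lookups"

# "/opt/cribl" is only a fallback for illustration, not a documented default.
cribl_home = os.environ.get("CRIBL_HOME", "/opt/cribl")
single = lookups_dir(cribl_home)
# "default" is a hypothetical Worker Group name.
distributed = lookups_dir(cribl_home, group="default")
```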

Note: For large and/or frequently replicated lookup files, you might want to bypass the Lookups Library UI and instead manually place the files in a different location. This can both reduce deploy traffic and prevent errors with LogStream's default Git integration. For details, see Managing Large Lookups.

Adding Lookup Files

To upload or create a new lookup file, click + Add New, then click the appropriate option from the drop-down.

Adding a lookup file

Modifying Lookup Files

To re-upload, expand, edit, or delete an existing .csv or .gz lookup file, click its row on the Lookups page. (No editing option is available for .mmdb files; you can only re-upload or delete these.)

In the resulting modal, you can edit files in Table or Text mode. However, Text mode is disabled for files larger than 1 MB.
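The editing rules above can be summarized in a short Python sketch. This is not LogStream code: the `ui_options` helper is hypothetical, and it assumes that "1 MB" means 1024 * 1024 bytes and that the limit applies to the file size as stored, neither of which the text above specifies.

```python
from pathlib import PurePath

# Assumption: "1 MB" is read here as 1024 * 1024 bytes.
TEXT_MODE_LIMIT_BYTES = 1024 * 1024

def ui_options(filename, size_bytes):
    """Summarize which UI actions the rules above allow for a lookup file."""
    suffix = PurePath(filename).suffix.lower()
    if suffix not in {".csv", ".gz", ".mmdb"}:
        raise ValueError(f"unsupported lookup file type: {suffix!r}")
    editable = suffix in {".csv", ".gz"}  # .mmdb files are binary, not editable
    return {
        "reupload": True,
        "delete": True,
        "table_mode": editable,
        "text_mode": editable and size_bytes <= TEXT_MODE_LIMIT_BYTES,
    }

small_csv = ui_options("hosts.csv", 200_000)
large_csv = ui_options("hosts.csv", 5 * 1024 * 1024)
geo_db = ui_options("geo.mmdb", 200_000)
```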

Editing in table mode

Editing in text mode

Memory Sizing for Large Lookups

For large lookup files, you'll need to provide extra memory beyond basic requirements for LogStream and the OS. To determine how much extra memory to add to a Worker Node for a lookup, use this formula:

Lookup file's uncompressed size (MB) * multiplier (2.25 to 2.75) * numWorkerProcesses = Extra RAM required for lookup (MB)

For example, if you have a lookup file whose uncompressed size is 1 GB (1,000 MB), and three Worker Processes, you could use the midpoint of the range, 2.50, as the multiplier:

1,000 * 2.50 * 3 = 7,500

In this case, the Node's server or VM would need an extra 7,500 MB (7.5 GB) of RAM to accommodate the lookup file across all three Worker Processes.
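The formula and worked example above can be expressed as a small Python function. The `extra_ram_mb` name and its argument checks are assumptions for illustration; the arithmetic is the formula as stated.

```python
def extra_ram_mb(uncompressed_mb, worker_processes, multiplier=2.50):
    """Estimate the extra RAM (in MB) a Worker Node needs for one lookup file."""
    if not 2.25 <= multiplier <= 2.75:
        raise ValueError("multiplier is outside the documented 2.25 to 2.75 range")
    if uncompressed_mb < 0 or worker_processes < 1:
        raise ValueError("size must be >= 0 and worker_processes must be >= 1")
    return uncompressed_mb * multiplier * worker_processes

# The worked example: a 1,000 MB lookup, three Worker Processes, multiplier 2.50.
estimate = extra_ram_mb(1000, 3)
```

Because the result scales linearly with the number of Worker Processes, the same file on a Node with six Worker Processes would need twice the extra memory.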

What's with That Multiplier?

We've cited a squishy range of 2.25–2.75 for the multiplier, because we've found that it varies inversely with the number of columns in the lookup file:

  • The fewer columns, the higher the extra memory overhead (2.75 multiplier).
  • The more columns, the lower the overhead (2.25 multiplier).

In Cribl's testing:

  • 5 columns required a multiplier of 2.75.
  • 10 columns required a multiplier of only 2.25.

These are general (not exact) guidelines, and this multiplier depends only on the lookup table's number of columns. The memory overhead imposed by each additional row appears to decline when more columns are present in the data.
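Since the multiplier is a range rather than a single value, it can help to size for both ends of it. The Python sketch below computes a low and a high estimate from the two multipliers cited above. The `extra_ram_range_mb` helper and constant names are hypothetical, and the sketch deliberately does not interpolate for other column counts, because the text gives only two tested data points.

```python
# Multipliers observed in Cribl's testing, per the text above.
FEW_COLUMNS_MULTIPLIER = 2.75   # observed with 5 columns
MANY_COLUMNS_MULTIPLIER = 2.25  # observed with 10 columns

def extra_ram_range_mb(uncompressed_mb, worker_processes):
    """Return (low, high) extra-RAM estimates in MB across the multiplier range."""
    base = uncompressed_mb * worker_processes
    return (base * MANY_COLUMNS_MULTIPLIER, base * FEW_COLUMNS_MULTIPLIER)

# A 1,000 MB lookup file with three Worker Processes.
low, high = extra_ram_range_mb(1000, 3)
```

For that 1,000 MB file and three Worker Processes, the estimates span 6,750 MB to 8,250 MB, which brackets the 7,500 MB midpoint figure.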

