These docs are for Cribl Edge 4.7 and are no longer actively maintained.
See the latest version (4.14).
Lookup
The Lookup Function enriches events with external fields, using lookup table files in CSV, compressed .csv.gz, or binary .mmdb format.
CSV files are text-based and untyped. This means all data within the file is treated as strings, regardless of its intended type (such as numbers or dates).
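To make the untyped point concrete, here is a minimal sketch using Python's standard csv module. It illustrates generic CSV behavior, not Cribl Edge's internals, and the table contents and column names are hypothetical.

```python
import csv
import io

# Hypothetical lookup table; the column names are made up for illustration.
LOOKUP_CSV = "port,service,first_seen\n443,https,2018-09-20\n8088,hec,2018-09-21\n"

rows = list(csv.DictReader(io.StringIO(LOOKUP_CSV)))

# Every value comes back as a str, including the numeric-looking port and the date.
value_types = {type(v).__name__ for row in rows for v in row.values()}

# A numeric value never equals the string read from the file...
event_port = 443
matches_raw = [r for r in rows if r["port"] == event_port]
# ...until one side is converted explicitly.
matches_converted = [r for r in rows if r["port"] == str(event_port)]
```

In this sketch, comparing the integer 443 against the table finds nothing, while comparing the string "443" finds the https row.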
Usage
Filter: Filter expression (JS) that selects data to feed through the Function. Defaults to true, meaning it evaluates all events.
Description: Simple description about this Function. Defaults to empty.
Final: If toggled to Yes, stops feeding data to the downstream Functions. Defaults to No.
Lookup file path (.csv, .csv.gz): Path to the lookup file. Select an existing file that you’ve uploaded via Cribl Edge’s UI at Knowledge > Lookups Library, or specify the path. You can reference environment variables via $, e.g.: $CRIBL_HOME/file.csv.
When you configure this field via a distributed deployment’s Leader Node, Cribl Edge will swap $CRIBL_HOME/groups/<groupname>/ for $CRIBL_HOME when validating whether the file exists. In this case, the default upload path changes from $CRIBL_HOME/data/lookups (single-instance deployments) to $CRIBL_HOME/groups/<groupname>/data/lookups/ (distributed deployments).
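The path swap described above can be sketched as plain string substitution. This is only an illustration of the rule as stated here. The helper name, the install directory, and the group name are hypothetical, and Cribl Edge's actual validation logic may differ.

```python
def leader_validation_path(configured_path: str, cribl_home: str, group: str) -> str:
    """Expand $CRIBL_HOME to the group-scoped directory (hypothetical helper)."""
    return configured_path.replace("$CRIBL_HOME", f"{cribl_home}/groups/{group}")


def single_instance_path(configured_path: str, cribl_home: str) -> str:
    """Expand $CRIBL_HOME directly, as on a single-instance deployment."""
    return configured_path.replace("$CRIBL_HOME", cribl_home)


# "/opt/cribl" and "default" are assumed values, used only for illustration.
single = single_instance_path("$CRIBL_HOME/data/lookups/file.csv", "/opt/cribl")
distributed = leader_validation_path("$CRIBL_HOME/data/lookups/file.csv", "/opt/cribl", "default")
```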
Match mode: Defines the format of the lookup file, and indicates the matching logic that will be performed. Defaults to Exact.
Match type: For CIDR and Regex Match modes, this attribute refines how to resolve multiple matches. First match will return the first matching entry. Most specific will scan all entries, finding the most specific match. All will return all matches in the output, as arrays. (Defaults to First match. Not displayed for Exact Match mode.)
Lookup fields (.csv): Field(s) that should be used to key into the lookup table.
- Lookup field name in event: Exact field name as it appears in events. Nested addressing supported.
- Corresponding field name in lookup: The field name as it appears in the lookup file. Defaults to the Lookup field name in event value. This input is optional.
Case-Sensitive / Multiple Matches
Lookups are case-sensitive by default. (See the Ignore case option below.)
If the lookup file contains duplicate key names with different values, all Match modes of this Function will use only the value in the key’s final instance, ignoring all preceding instances.
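The two behaviors above (case-sensitive keys, and the last duplicate winning) can be modeled with a dictionary keyed on the lookup field. This is a sketch under assumptions: the table contents, the function names, and the ignore_case parameter are hypothetical, and lowercasing both sides is only one plausible way to ignore case.

```python
import csv
import io

# Hypothetical table with a duplicate key ("web01").
LOOKUP_CSV = "host,owner\nweb01,alice\ndb01,bob\nweb01,carol\n"


def build_table(text, key_field, ignore_case=False):
    table = {}
    for row in csv.DictReader(io.StringIO(text)):
        key = row[key_field].lower() if ignore_case else row[key_field]
        table[key] = row  # a later duplicate key overwrites the earlier row
    return table


def exact_lookup(table, value, ignore_case=False):
    return table.get(value.lower() if ignore_case else value)


strict = build_table(LOOKUP_CSV, "host")
relaxed = build_table(LOOKUP_CSV, "host", ignore_case=True)
```

Here, looking up "web01" in the strict table yields the row owned by carol (the final instance), and "WEB01" finds nothing unless case is ignored.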
Output field(s): Field(s) to add to events after matching the lookup table. Defaults to all if not specified.
- Output field name from lookup: Field name, as it appears in the lookup file.
- Lookup field name in event: Field name to add to event. Defaults to the lookup field name. This input is optional. Nested addressing is supported.
- Default value: Optional string value to assign to the field when the lookup entry is not found.
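The output-field settings above (source field, optional renamed target, optional default) can be sketched as follows. The function name and the dictionary keys are hypothetical stand-ins for the UI inputs. Nested addressing and the "all fields when unspecified" default are not modeled.

```python
def apply_outputs(event, match, outputs):
    """Copy selected lookup fields onto a copy of the event.

    `match` is the matched lookup row, or None when no entry was found.
    """
    enriched = dict(event)
    for out in outputs:
        # The target name falls back to the lookup field name when not given.
        target = out.get("event_field") or out["lookup_field"]
        if match is not None and out["lookup_field"] in match:
            enriched[target] = match[out["lookup_field"]]
        elif out.get("default") is not None:
            # The default is only used when the lookup entry is missing.
            enriched[target] = out["default"]
    return enriched


outputs = [
    {"lookup_field": "owner", "event_field": "host_owner", "default": "unknown"},
    {"lookup_field": "site"},
]
hit = apply_outputs({"host": "web01"}, {"host": "web01", "owner": "carol", "site": "sfo"}, outputs)
miss = apply_outputs({"host": "nope"}, None, outputs)
```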
Advanced Settings
Reload period (sec): To periodically check the underlying file for mod-time changes, and reload the file if necessary, change the default -1 value (disabled) to a positive integer representing the check interval in seconds.
In distributed deployments, enabling a Reload period can generate conflicts with configuration updates, causing Pipelines to skip executing some Lookup Functions. Cribl recommends that you enable it only for lookup files not managed by Cribl Edge’s UI, or lookup files that can be updated by an external process. (E.g., a threat list that you update via a cron job.)
For lookup files that are managed by Cribl Edge’s UI, a distributed Cribl Edge deployment will override this setting as necessary – skipping checks to prevent conflicts that could trigger skipped lookups. These restrictions do not apply to single-instance deployments.
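As a rough model of the Reload period setting, the sketch below checks a file's modification time at most once per interval and reloads only when it has changed. The class and method names are hypothetical, and the distributed-deployment override described above is not modeled.

```python
import csv
import os


class ReloadingLookup:
    """Reload a CSV when its mod-time changes, checked every N seconds."""

    def __init__(self, path, reload_period_sec=-1):
        self.path = path
        self.reload_period_sec = reload_period_sec  # -1 means checks are disabled
        self._last_check = 0.0
        self._mtime = None
        self.rows = []
        self.load()

    def load(self):
        self._mtime = os.path.getmtime(self.path)
        with open(self.path, newline="") as f:
            self.rows = list(csv.DictReader(f))

    def maybe_reload(self, now):
        """Return True if the file was reloaded at time `now` (in seconds)."""
        if self.reload_period_sec <= 0:
            return False
        if now - self._last_check < self.reload_period_sec:
            return False  # not yet time for the next check
        self._last_check = now
        if os.path.getmtime(self.path) == self._mtime:
            return False  # file unchanged since the last load
        self.load()
        return True
```

A caller would invoke maybe_reload with a monotonic timestamp, for example from time.monotonic(), before each batch of lookups.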
Ignore case: Ignore case when performing Match mode: Exact lookups. Defaults to No.
Add to raw event: Whether to append the looked-up values to the _raw field, as key=value pairs. Defaults to No.
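A minimal sketch of what Add to raw event implies: the looked-up values are added as fields and also appended to _raw as key=value pairs. The function name is hypothetical, and the single-space separator is an assumption, since the exact formatting is not specified here.

```python
def add_to_raw(event, looked_up):
    """Return a copy of the event with looked-up fields also appended to _raw."""
    out = dict(event)
    out.update(looked_up)
    # Assumption: pairs are space-separated; the real separator may differ.
    pairs = " ".join(f"{key}={value}" for key, value in looked_up.items())
    if pairs:
        out["_raw"] = f"{event.get('_raw', '')} {pairs}".strip()
    return out


enriched = add_to_raw({"_raw": "conn closed"}, {"location": "US"})
untouched = add_to_raw({"_raw": "conn closed"}, {})
```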
Examples
Example 1: Regex Lookups
Assign a sourcetype field to events if their _raw field matches a particular regex.
regex,sourcetype
"^[^,]+,[^,]+,[^,]+,THREAT",pan:threat
"^[^,]+,[^,]+,[^,]+,TRAFFIC",pan:traffic
"^[^,]+,[^,]+,[^,]+,SYSTEM",pan:system
Match mode: Regex
Match type: First match
When using the Lookup Function with Regex and First match, ensure that your lookup file contains no empty lines, not even at the bottom. Any empty row will cause the Function to always return true.
Lookup field name in event: _raw
Corresponding field name in lookup: regex
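The configuration above can be approximated with Python's re module: scan the table top to bottom and return the first row whose pattern matches _raw. This is a sketch, not Cribl Edge's implementation. The function names are hypothetical, the event is shortened, and Python's regex dialect is assumed to be close enough for these patterns.

```python
import csv
import io
import re

LOOKUP_CSV = '''regex,sourcetype
"^[^,]+,[^,]+,[^,]+,THREAT",pan:threat
"^[^,]+,[^,]+,[^,]+,TRAFFIC",pan:traffic
"^[^,]+,[^,]+,[^,]+,SYSTEM",pan:system
'''


def load_patterns(text):
    reader = csv.DictReader(io.StringIO(text))
    return [(re.compile(row["regex"]), row["sourcetype"]) for row in reader]


def first_match(raw, patterns):
    for pattern, sourcetype in patterns:
        if pattern.search(raw):
            return sourcetype  # stop at the first matching row
    return None


patterns = load_patterns(LOOKUP_CSV)
raw = "Sep 20 13:03:55 PA-VM 1,2018/09/20 13:03:58,FOOBAR,TRAFFIC,end,2049"
sourcetype = first_match(raw, patterns)

# An empty pattern matches any string. Assuming an empty row is loaded as an
# empty pattern, this would explain why a blank line always matches.
empty_pattern_matches = re.compile("").search(raw) is not None
```

Note that Python's csv module silently skips blank lines, so the empty-row case is shown here with an explicit empty pattern rather than a blank line in the table.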
### BEFORE:
{"_raw": "Sep 20 13:03:55 PA-VM 1,2018/09/20 13:03:58,FOOBAR,TRAFFIC,end,2049,2018/09/20 13:03:58,34.217.108.226,10.0.0.102,34.217.108.226,10.0.2.65,splunk,,,incomplete,vsys1,untrusted,trusted,ethernet1/3,ethernet1/2,log-forwarding-default,2018/09/20 13:03:58,574326,1,53722,8088,53722,8088,0x400064,tcp,allow,296,296,0,4,2018/09/20 13:03:45,7,any,0,730277,0x0,United States,10.0.0.0-10.255.255.255,0,4,0,aged-out,0,0,0,0,,PA-VM,from-policy,,,0,,0,,N/A,0,0,0,0"}
{"_raw": "Sep 20 13:03:55 PA-VM 1,2018/09/20 13:03:58,FOOBAR,THREAT,end,2049,2018/09/20 13:03:58,34.217.108.226,10.0.0.102,34.217.108.226,10.0.2.65,splunk,,,incomplete,vsys1,untrusted,trusted,ethernet1/3,ethernet1/2,log-forwarding-default,2018/09/20 13:03:58,574326,1,53722,8088,53722,8088,0x400064,tcp,allow,296,296,0,4,2018/09/20 13:03:45,7,any,0,730277,0x0,United States,10.0.0.0-10.255.255.255,0,4,0,aged-out,0,0,0,0,,PA-VM,from-policy,,,0,,0,,N/A,0,0,0,0"}
### AFTER:
{"_raw": "Sep 20 13:03:55 PA-VM 1,2018/09/20 13:03:58,FOOBAR,TRAFFIC,end,2049,2018/09/20 13:03:58,34.217.108.226,10.0.0.102,34.217.108.226,10.0.2.65,splunk,,,incomplete,vsys1,untrusted,trusted,ethernet1/3,ethernet1/2,log-forwarding-default,2018/09/20 13:03:58,574326,1,53722,8088,53722,8088,0x400064,tcp,allow,296,296,0,4,2018/09/20 13:03:45,7,any,0,730277,0x0,United States,10.0.0.0-10.255.255.255,0,4,0,aged-out,0,0,0,0,,PA-VM,from-policy,,,0,,0,,N/A,0,0,0,0",
  "sourcetype": "pan:traffic"
  }
{"_raw": "Sep 20 13:03:55 PA-VM 1,2018/09/20 13:03:58,FOOBAR,THREAT,end,2049,2018/09/20 13:03:58,34.217.108.226,10.0.0.102,34.217.108.226,10.0.2.65,splunk,,,incomplete,vsys1,untrusted,trusted,ethernet1/3,ethernet1/2,log-forwarding-default,2018/09/20 13:03:58,574326,1,53722,8088,53722,8088,0x400064,tcp,allow,296,296,0,4,2018/09/20 13:03:45,7,any,0,730277,0x0,United States,10.0.0.0-10.255.255.255,0,4,0,aged-out,0,0,0,0,,PA-VM,from-policy,,,0,,0,,N/A,0,0,0,0",
  "sourcetype": "pan:threat"
  }
Example 2: CIDR Lookups
Assign a location field to events if their destination_ip field matches a particular CIDR range.
range,location
10.0.0.0/24,San Francisco
10.0.0.0/16,California
10.0.0.0/8,US
Match mode: CIDR
Match type: See options below
Lookup field name in event: destination_ip
Corresponding field name in lookup: range
In Match mode: CIDR with Match type: Most specific, the lookup will implicitly search for matches from most specific to least specific. There is no need to pre-sort data.
Note that Match mode: CIDR with Match type: First Match is likely the most performant with large lookups. This can be used as an alternative to Most specific, if the file is sorted with the most specific/relevant entries first. This mode still performs a table scan, top to bottom.
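The three Match type options can be sketched with Python's standard ipaddress module, using the table from this example. The function name and the match_type strings are hypothetical, and this models the described behavior rather than Cribl Edge's actual implementation.

```python
import ipaddress

# Rows in file order, as in the lookup table above.
RANGES = [
    ("10.0.0.0/24", "San Francisco"),
    ("10.0.0.0/16", "California"),
    ("10.0.0.0/8", "US"),
]


def cidr_lookup(ip, table, match_type="first"):
    addr = ipaddress.ip_address(ip)
    hits = []
    for cidr, location in table:  # top-to-bottom table scan
        network = ipaddress.ip_network(cidr)
        if addr in network:
            if match_type == "first":
                return location  # file order decides the result
            hits.append((network.prefixlen, location))
    if not hits:
        return None
    if match_type == "most_specific":
        # The longest prefix wins, regardless of file order.
        return max(hits, key=lambda hit: hit[0])[1]
    if match_type == "all":
        return [location for _, location in hits]
    raise ValueError(f"unknown match_type: {match_type}")


reversed_table = list(reversed(RANGES))
```

With the table reversed, First match on 10.0.0.102 returns "US" while Most specific still returns "San Francisco", which is why First match only substitutes for Most specific when the file is sorted with the most specific entries first.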
### BEFORE:
{"_raw": "Sep 20 13:03:55 PA-VM 1, 2018/09/20 13:03:58,FOOBAR,TRAFFIC,end,2049,2018/09/20 13:03:58,34.217.108.226,10.0.0.102,34.217.108.226,10.0.2.65,splunk,,,incomplete,vsys1,untrusted,trusted,ethernet1/3,ethernet1/2,log-forwarding-default,2018/09/20 13:03:58,574326,1,53722,8088,53722,8088,0x400064,tcp,allow,296,296,0,4,2018/09/20 13:03:45,7,any,0,730277,0x0,United States,10.0.0.0-10.255.255.255,0,4,0,aged-out,0,0,0,0,,PA-VM,from-policy,,,0,,0,,N/A,0,0,0,0",
  "destination_ip": "10.0.0.102"
  }
  
### AFTER with Match Type: First Match
 
{"_raw": "Sep 20 13:03:55 PA-VM 1, 2018/09/20 13:03:58,FOOBAR,TRAFFIC,end,2049,2018/09/20 13:03:58,34.217.108.226,10.0.0.102,34.217.108.226,10.0.2.65,splunk,,,incomplete,vsys1,untrusted,trusted,ethernet1/3,ethernet1/2,log-forwarding-default,2018/09/20 13:03:58,574326,1,53722,8088,53722,8088,0x400064,tcp,allow,296,296,0,4,2018/09/20 13:03:45,7,any,0,730277,0x0,United States,10.0.0.0-10.255.255.255,0,4,0,aged-out,0,0,0,0,,PA-VM,from-policy,,,0,,0,,N/A,0,0,0,0",
  "destination_ip": "10.0.0.102",
  "location": "San Francisco"
  }
  
### AFTER with Match Type: Most Specific
 
{"_raw": "Sep 20 13:03:55 PA-VM 1, 2018/09/20 13:03:58,FOOBAR,TRAFFIC,end,2049,2018/09/20 13:03:58,34.217.108.226,10.0.0.102,34.217.108.226,10.0.2.65,splunk,,,incomplete,vsys1,untrusted,trusted,ethernet1/3,ethernet1/2,log-forwarding-default,2018/09/20 13:03:58,574326,1,53722,8088,53722,8088,0x400064,tcp,allow,296,296,0,4,2018/09/20 13:03:45,7,any,0,730277,0x0,United States,10.0.0.0-10.255.255.255,0,4,0,aged-out,0,0,0,0,,PA-VM,from-policy,,,0,,0,,N/A,0,0,0,0",
  "destination_ip": "10.0.0.102",
  "location": "San Francisco"
  }
### AFTER with Match Type: All
 
{"_raw": "Sep 20 13:03:55 PA-VM 1, 2018/09/20 13:03:58,FOOBAR,TRAFFIC,end,2049,2018/09/20 13:03:58,34.217.108.226,10.0.0.102,34.217.108.226,10.0.2.65,splunk,,,incomplete,vsys1,untrusted,trusted,ethernet1/3,ethernet1/2,log-forwarding-default,2018/09/20 13:03:58,574326,1,53722,8088,53722,8088,0x400064,tcp,allow,296,296,0,4,2018/09/20 13:03:45,7,any,0,730277,0x0,United States,10.0.0.0-10.255.255.255,0,4,0,aged-out,0,0,0,0,,PA-VM,from-policy,,,0,,0,,N/A,0,0,0,0",
  "destination_ip": "10.0.0.102",
  "location": [
    "San Francisco",
    "California",
    "US"
  ]}
More Examples and Scenarios
See also:
- Managing Large Lookups to optimize file locations for large lookup files.
- Redis Function for faster lookups using a Redis integration.