File Header Event Breaker

The File Header Event Breaker ingests highly structured log formats in which the field names are explicitly defined in a header section at the top of the file. It uses a combination of regular expressions to first identify the header and then apply the captured field names when parsing all subsequent data lines.

Use this Event Breaker:

  • For logs that use a standard file header structure, such as Bro, IIS, or Apache access logs.
  • Instead of a simple Regex Breaker when the field names are complex but consistent across the log file.

See Event Breakers for general information about event breakers.

Settings

The File Header Event Breaker requires you to define the structure of the log using a set of coordinated regular expressions:

  • Header line: The regex used to match and identify a line as part of the header section. Any line that matches this pattern will be treated as metadata and not as an event. For example, this expression matches any line starting with a hash symbol: ^#.

  • Field delimiter: The regex used to split the text captured by the Field regex into individual field names. The same delimiter is also used to split each subsequent data line into values. For example, this expression matches one or more whitespace characters, common in space-separated log formats (such as Bro): \s+.

  • Field regex: The most critical regex. It must contain exactly one capturing group, which isolates the list of field names from the rest of the header text. The Field delimiter then splits this captured group into individual names. For example, this expression matches a line starting with #fields or #Fields, optionally followed by a colon, and captures everything after the label into a single group: ^#[Ff]ields[:]?\s+(.*).

  • Null value: The string that represents a null or empty value in the data lines. Fields with this value are not added to the event object, which keeps the event clean. The setting is blank by default. If needed, enter a placeholder such as -, which is common in firewall and security logs.

  • Clean fields: When toggled on, this setting sanitizes the extracted field names by replacing each non-alphanumeric character with an underscore (_). Enabling it is recommended for compatibility with downstream systems such as Splunk or Elasticsearch. For example, a field named id.orig_h becomes id_orig_h.
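
How these settings cooperate can be sketched in Python. This is a minimal illustration, not the product's implementation: the function name break_file_header_events and its parameters are hypothetical, Python's re module stands in for the product's regex engine, and the sketch assumes that the most recent header line matching the Field regex defines the field names for the data lines that follow.

```python
import re


def break_file_header_events(lines, header_line=r"^#", field_delimiter=r"\s+",
                             field_regex=r"^#[Ff]ields[:]?\s+(.*)",
                             null_value="", clean_fields=False):
    """Hypothetical sketch of header-driven parsing (illustration only)."""
    field_names = []
    events = []
    for line in lines:
        line = line.rstrip("\n")
        if re.search(header_line, line):
            # Header lines are metadata and are never emitted as events.
            match = re.search(field_regex, line)
            if match:
                # The single capture group holds the list of field names.
                names = re.split(field_delimiter, match.group(1).strip())
                if clean_fields:
                    # Replace each non-alphanumeric character with "_".
                    names = [re.sub(r"[^a-zA-Z0-9]", "_", n) for n in names]
                field_names = names
            continue
        if not line.strip():
            continue  # ignore blank lines
        # The same delimiter splits data lines into values.
        values = re.split(field_delimiter, line.strip())
        event = {"_raw": line}
        for name, value in zip(field_names, values):
            # A blank null_value disables null filtering (the default).
            if null_value and value == null_value:
                continue
            event[name] = value
        events.append(event)
    return events
```

Calling this function with null_value="-" and clean_fields=True on a file whose #fields line names ts, uid, and id.orig_h yields events keyed by ts and id_orig_h, with uid omitted wherever its value is -.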

Configuration Example

The following is an example of input data before the File Header Event Breaker processes it. The header section is typically at the top of the file and can span one or more lines:

Example raw input - File Header format
#fields ts      uid     id.orig_h       id.orig_p       id.resp_h       id.resp_p       proto
#types  time    string  addr    port    addr    port    enum
1331904608.080000       -     192.168.204.59  137     192.168.204.255 137     udp
1331904609.190000       -     192.168.202.83  48516   192.168.207.4   53      udp

Example Settings

Setting          Value                    Purpose
Header Line      ^#                       Matches all header lines.
Field Delimiter  \s+                      Separates fields by whitespace.
Field Regex      ^#[Ff]ields[:]?\s+(.*)   Captures the field names.
Clean Fields     On                       Converts id.orig_h to id_orig_h.
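
As a quick check of these values, the following Python sketch applies the Field Regex, Field Delimiter, and Clean Fields steps to the #fields line from the sample input. Python's re module is used here as a stand-in for the product's regex engine, which is an assumption, and the variable names are illustrative only.

```python
import re

# The #fields header line from the sample input.
header = "#fields ts      uid     id.orig_h       id.orig_p       id.resp_h       id.resp_p       proto"

# Field Regex: the single capture group isolates the field-name list.
captured = re.search(r"^#[Ff]ields[:]?\s+(.*)", header).group(1)

# Field Delimiter: split the captured text on runs of whitespace.
raw_names = re.split(r"\s+", captured.strip())

# Clean Fields: replace each non-alphanumeric character with an underscore.
clean_names = [re.sub(r"[^a-zA-Z0-9]", "_", name) for name in raw_names]

print(clean_names)
# ['ts', 'uid', 'id_orig_h', 'id_orig_p', 'id_resp_h', 'id_resp_p', 'proto']
```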

Output

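In the output, both header lines (the #fields line and the #types line) match the Header Line regex and are not emitted as events. The field names that the Field Regex captures from the #fields line become the keys for the values in the output events. The uid field is absent from both events, which is the behavior expected when Null Value is set to -.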

From the example raw data, the File Header Event Breaker would generate two output events:

Example Output as JSON
{
  "_raw": "1331904608.080000       -     192.168.204.59  137     192.168.204.255 137     udp",
  "ts": "1331904608.080000",
  "id_orig_h": "192.168.204.59",
  "id_orig_p": "137",
  "id_resp_h": "192.168.204.255",
  "id_resp_p": "137",
  "proto": "udp",
  "_time": 1331904608.08
}
{
  "_raw": "1331904609.190000       -     192.168.202.83  48516   192.168.207.4   53      udp",
  "ts": "1331904609.190000",
  "id_orig_h": "192.168.202.83",
  "id_orig_p": "48516",
  "id_resp_h": "192.168.207.4",
  "id_resp_p": "53",
  "proto": "udp",
  "_time": 1331904609.19
}