JSON Array Event Breaker
The JSON Array Event Breaker handles log formats where multiple distinct records are bundled together within a single top-level JSON object. It takes one large event and performs a targeted unrolling operation: it identifies an array nested within that parent JSON object and splits each element of that array into a separate, individual event.
Use this Event Breaker for:
- Logs where a single log entry contains an array of event records.
- Sources like AWS CloudTrail, EKS CloudWatch, or Google Cloud Audit Logs.
If a Source supports Event Breakers (such as AWS Sources), it is much more efficient to unroll JSON in an Event Breaker than in a Pipeline Function.
See Event Breakers for general information about event breakers.
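As a rough illustration of the unrolling described above, the following Python sketch splits one parent JSON object into one event per array element. It is not Cribl Stream's implementation: the function name unroll_array and its dotted-path handling are assumptions made only for this example.

```python
import json

def unroll_array(raw_text, array_field):
    """Split one parent JSON object into one raw event per array element.

    array_field is a dotted path such as "Records" or "myObject.anotherArray".
    (Hypothetical helper for illustration only.)
    """
    node = json.loads(raw_text)
    # Walk the dotted path down to the array to unroll.
    for key in array_field.split("."):
        node = node[key]
    if not isinstance(node, list):
        raise ValueError(f"{array_field!r} does not point to an array")
    # Each element becomes its own event, serialized back to text as _raw.
    return [{"_raw": json.dumps(element)} for element in node]

parent = '{"Records": [{"eventVersion": "1.0"}, {"eventVersion": "2.0"}]}'
events = unroll_array(parent, "Records")
for event in events:
    print(event["_raw"])
```

Run against the two-record parent above, the sketch yields two events, one per element of Records.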
Settings
The JSON Array Event Breaker has these settings:
Array field: Optional path to the array in a JSON event that contains the records to extract. For example, Records.
Parent fields to copy: This setting allows you to retain key metadata from the original parent event and include it with each of the extracted child events. By default, only the extracted events from the array are passed down the Pipeline. For example, you can use this to copy host, source, region, or other contextual information that is not contained within the array itself.
To specify which fields to copy, enter a comma-separated list of top-level field names. You can also use * as a wildcard to copy all top-level fields. Be aware that:
- This setting only supports copying fields from the top level of the JSON document. Nested fields are not supported.
- The Array field you specify to be unrolled is always excluded from the copied fields. This prevents redundant data.
- Enclose field names containing special characters (such as my-field) in single or double quotes.
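The selection rules for Parent fields to copy can be approximated with the short Python sketch below. It is an illustration under stated assumptions, not the product's logic: copy_parent_fields is a hypothetical helper, wildcard matching is borrowed from Python's fnmatch module (whose pattern syntax may differ from Cribl Stream's), and quoting of field names with special characters is not modeled.

```python
from fnmatch import fnmatchcase

def copy_parent_fields(parent, patterns, array_field):
    """Return the top-level parent fields selected by patterns.

    patterns is a list such as ["*"], ["someField", "myObject"], or ["some*"].
    The top-level key that holds the unrolled array is always left out.
    (Hypothetical helper for illustration only.)
    """
    excluded = array_field.split(".")[0]
    return {
        name: value
        for name, value in parent.items()
        if name != excluded and any(fnmatchcase(name, p) for p in patterns)
    }

parent = {
    "someField": 42,
    "someArray": [1, 2, 3],
    "Records": [{"eventVersion": "1.0"}],
    "myObject": {"anotherArray": [{"id": 1}]},
}

prefix_copy = copy_parent_fields(parent, ["some*"], "Records")
nested_copy = copy_parent_fields(parent, ["*"], "myObject.anotherArray")
print(prefix_copy)
print(nested_copy)
```

The first call keeps only someField and someArray. The second call uses a nested array path, so the whole myObject parent is dropped while Records is copied like any other top-level field.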
JSON extract fields: This toggle controls whether Cribl Stream parses the JSON array and automatically extracts key-value pairs from each event:
- When enabled, Cribl Stream fully parses each JSON object and extracts all fields (such as user_id, status, event_type). This is the ideal setting for sending data to Destinations that need structured data for searching, filtering, and dashboard-building, such as Splunk or Elasticsearch. Enabling this toggle also exposes the Timestamp field setting.
- When disabled, Cribl Stream does not extract any fields from the JSON objects. Only _raw (the full event text) and _time will be available. Choose this option if your primary goal is to send raw data for archival purposes with the highest possible throughput.
The JSON extract fields toggle affects the Timestamp format setting in the Timestamp Settings. If the toggle is off, do not select the Manual format option. If the toggle is on, you can select any option.
Timestamp field: This optional setting appears when JSON extract fields is toggled on. It specifies a path to the timestamp field within each extracted event. For example, eventTime or level1.level2.eventTime. Cribl Stream uses this value to set the event's _time field.
Max event bytes: The maximum size (in number of bytes) that an event can be before being flushed to the Pipelines.
Event byte limit: The maximum size (in number of bytes) that an event can be before being flushed to the Pipelines.
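To make the effect of the JSON extract fields toggle and the Timestamp field path more concrete, here is a minimal Python sketch. It is not how Cribl Stream is implemented: build_event is a hypothetical helper, the sample field names are invented, the timestamp is assumed to be ISO 8601, and falling back to the current time when no timestamp is read is an assumption made for this example.

```python
import json
from datetime import datetime, timezone

def build_event(raw_text, extract_fields, timestamp_field=None):
    """Build one event from the raw text of a single array element.

    (Hypothetical helper; the fallback to the current time is an assumption.)
    """
    event = {"_raw": raw_text, "_time": datetime.now(timezone.utc).timestamp()}
    if not extract_fields:
        # Toggle off: only _raw and _time are available.
        return event
    fields = json.loads(raw_text)
    event.update(fields)
    if timestamp_field:
        # Follow a dotted path such as "level1.level2.eventTime".
        node = fields
        for key in timestamp_field.split("."):
            node = node[key]
        # Assumes ISO 8601; a trailing "Z" is rewritten for older Python versions.
        parsed = datetime.fromisoformat(node.replace("Z", "+00:00"))
        event["_time"] = parsed.timestamp()
    return event

raw = '{"eventTime": "2024-01-01T00:00:00Z", "status": "ok"}'
parsed_event = build_event(raw, extract_fields=True, timestamp_field="eventTime")
raw_only_event = build_event(raw, extract_fields=False)
print(sorted(parsed_event))
print(sorted(raw_only_event))
```

With extraction on, the event carries eventTime and status as fields and _time comes from eventTime. With extraction off, only _raw and _time are present.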
Configuration Examples
The following is an example of data input before the JSON Array Event Breaker processes it:
{
  "someField": 42,
  "someArray": [1, 2, 3],
  "myObjectArray": [
    {"one": 1, "two": 2},
    {"three": 3, "four": 4}
  ],
  "Records": [
    { "eventVersion": "1.0" },
    { "eventVersion": "2.0" }
  ],
  "myObject": {
    "someField": {
      "test1": 1,
      "test2": "two"
    },
    "anotherArray": [
      {"id": 1},
      {"id": 2}
    ]
  }
}
Example 1 - Copy All Top-Level Fields
This first example shows how to use the wildcard * to copy all top-level fields from the original JSON event to each of the new, extracted events.
{
  "parentFieldsToCopy": ["*"],
  "jsonArrayField": "Records"
}
- Parent fields to copy: *
- Array field: Records
With this configuration, the sample input above produces the following events:
[
  {
    "_raw": "{ \"eventVersion\": \"1.0\" }",
    "someField": 42,
    "someArray": [1, 2, 3],
    "myObjectArray": [
      {"one": 1, "two": 2},
      {"three": 3, "four": 4}
    ],
    "myObject": {
      "someField": {
        "test1": 1,
        "test2": "two"
      },
      "anotherArray": [
        {"id": 1},
        {"id": 2}
      ]
    }
  },
  {
    "_raw": "{ \"eventVersion\": \"2.0\" }",
    "someField": 42,
    "someArray": [1, 2, 3],
    "myObjectArray": [
      {"one": 1, "two": 2},
      {"three": 3, "four": 4}
    ],
    "myObject": {
      "someField": {
        "test1": 1,
        "test2": "two"
      },
      "anotherArray": [
        {"id": 1},
        {"id": 2}
      ]
    }
  }
]
Example 2 - Copy Specific Fields
This example shows how to select only specific top-level fields to copy. By providing a comma-separated list of field names, you can control which metadata carries over to the extracted events, resulting in smaller events and reduced data volume.
{
  "parentFieldsToCopy": ["someField", "myObject"],
  "jsonArrayField": "Records"
}
- Parent fields to copy: "someField","myObject"
- Array field: Records
With this configuration, the sample input above produces the following events:
[
  {
    "_raw": "{ \"eventVersion\": \"1.0\" }",
    "someField": 42,
    "myObject": {
      "someField": {
        "test1": 1,
        "test2": "two"
      },
      "anotherArray": [
        {"id": 1},
        {"id": 2}
      ]
    }
  },
  {
    "_raw": "{ \"eventVersion\": \"2.0\" }",
    "someField": 42,
    "myObject": {
      "someField": {
        "test1": 1,
        "test2": "two"
      },
      "anotherArray": [
        {"id": 1},
        {"id": 2}
      ]
    }
  }
]
Example 3 - Use a Wildcard with a Prefix
You can also use a wildcard with a prefix to copy fields that share a common name. This example uses some* to copy any top-level fields that start with “some,” such as someField and someArray. This is useful when you have a consistent naming convention for related fields.
{
  "parentFieldsToCopy": ["some*"],
  "jsonArrayField": "Records"
}
- Parent fields to copy: "some*"
- Array field: Records
With this configuration, the sample input above produces the following events:
[
  {
    "_raw": "{ \"eventVersion\": \"1.0\" }",
    "someField": 42,
    "someArray": [1, 2, 3]
  },
  {
    "_raw": "{ \"eventVersion\": \"2.0\" }",
    "someField": 42,
    "someArray": [1, 2, 3]
  }
]
Example 4 - Unroll a Nested Array
This example demonstrates how to extract events from an array nested inside another object, as defined by the jsonArrayField path myObject.anotherArray. In this scenario, Cribl Stream excludes the entire top-level parent object (myObject) from the extracted events. The other top-level fields are copied to the new events.
{
  "parentFieldsToCopy": ["*"],
  "jsonArrayField": "myObject.anotherArray"
}
- Parent fields to copy: *
- Array field: myObject.anotherArray
With this configuration, the sample input above produces the following events:
[
  {
    "_raw": "{\"id\": 1}",
    "someField": 42,
    "someArray": [1, 2, 3],
    "myObjectArray": [
      {"one": 1, "two": 2},
      {"three": 3, "four": 4}
    ],
    "Records": [
      { "eventVersion": "1.0" },
      { "eventVersion": "2.0" }
    ]
  },
  {
    "_raw": "{\"id\": 2}",
    "someField": 42,
    "someArray": [1, 2, 3],
    "myObjectArray": [
      {"one": 1, "two": 2},
      {"three": 3, "four": 4}
    ],
    "Records": [
      { "eventVersion": "1.0" },
      { "eventVersion": "2.0" }
    ]
  }
]