Data Insights

Data Insights provides an interactive topology view of your Cribl Stream data flows, showing how data moves from Sources through Pre-Processing Pipelines, Routes/Quick Connect, and Post-Processing Pipelines to Destinations, with metrics for volume, freshness, and shape. It is designed for quick troubleshooting and validation of end-to-end flows across Worker Groups and time ranges.

Why Use Data Insights

  • Locate issues fast: Spot partial failures, bottlenecks, drops, and misconfigurations by visualizing how Sources, Pipelines, and Destinations connect. Use the map to see when a Source is not attached to the expected Route, identify stale or inactive Sources, and trace events/bytes and freshness along the flow path.
  • Validate changes: Compare metrics across time windows to confirm the impact of configuration or deployment changes.
  • Understand flow shape: See where volume changes (reduction/enrichment) occur and where the Shape metric (field count per event) grows or shrinks across nodes. Use this to spot Sources with excessive fields, Pipelines/Packs that expand or reduce fields, and unexpected field-count variance that may indicate data-quality issues.

How It Works

Topology map: Nodes (the boxes) represent Sources, Pre-Processing Pipelines, Routes/Quick Connect, Post-Processing Pipelines, and Destinations. The arrows represent flow. The following columns are displayed (left-to-right):

  • Source: Ingest points that report Events, Bytes In, and Freshness In. When enabled, they also report Shape, which reflects the number of fields per event.
  • Pre-Processing Pipeline: The first processing stage, where you typically parse, normalize, and filter data before routing. Expect changes in volume and Shape here as fields are extracted or dropped.
  • Routes / Quick Connect: Data routing/branching layer that directs events to Pipelines and Destinations. Use this column to verify that Sources connect to the expected paths.
  • Post-Processing Pipeline: Where transformations, enrichment, and reduction occur. You can expect divergence between Events and Bytes In vs. Out, and Shape changes as Pipelines or Packs add, modify, or remove fields.
  • Destination: Egress targets showing Events, Bytes Out, and related Freshness at the boundary.

Drilldown: Select any node to open a details pane with time-series views for events, bytes, and freshness, aligned to your time range and filters.

When Shape metrics are enabled, nodes can also show the minimum and maximum number of fields per event over the selected time range. This helps you see how schema complexity changes through the flow, identify Sources with excessive fields, and find Pipelines/Packs that expand or reduce fields unexpectedly.
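
As a rough illustration of what a min/max field count means, the sketch below computes it over a handful of made-up events. The event contents and the `shape_range` helper are hypothetical, and it assumes only top-level fields are counted; how Data Insights counts nested or internal fields is not specified here.

```python
# Hypothetical sample events; field names are illustrative only.
events = [
    {"_time": 1700000000, "host": "web-1", "status": 200},
    {"_time": 1700000001, "host": "web-1", "status": 500,
     "error": "timeout", "retries": 3},
    {"_time": 1700000002, "host": "web-2"},
]


def shape_range(events):
    """Return (min, max) number of top-level fields per event.

    Assumption: only top-level keys count toward the field total.
    Returns None when there are no events to measure.
    """
    counts = [len(event) for event in events]
    if not counts:
        return None
    return min(counts), max(counts)


shape_min, shape_max = shape_range(events)
print(f"fields per event: min={shape_min}, max={shape_max}")
```

A wide gap between the minimum and maximum, as in this sample, is the kind of field-count variance worth checking at the node where it first appears.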

Select the action menu (upper-right of the node) to configure the object the node represents, filter the view to it, or copy its IDs.

Caveats

  • Source and Destination volumes differ: Bytes In (Source) and Bytes Out (Destination) don’t match because Data Insights accounts for compression, formatting, and protocol overhead at the Destination. Both values are correct relative to their specific measurement points.
  • Aggregation Functions affect Source filtering: When you filter the map by Source, the Destination still displays the full aggregated volume if an Aggregation Function is used in the Pipeline. If a Function combines events and removes attribution fields, it prevents Data Insights from isolating that specific Source’s contribution.
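
The Aggregation caveat can be sketched with plain dictionaries. This is not how Cribl Stream implements aggregation; the `source_id` field and the `aggregate_by_status` helper are hypothetical stand-ins that show why per-Source attribution cannot be recovered once events are combined.

```python
from collections import Counter

# Hypothetical events from two Sources. "source_id" stands in for
# whatever attribution field identifies the originating Source.
events = [
    {"source_id": "syslog", "status": 200, "bytes": 100},
    {"source_id": "syslog", "status": 500, "bytes": 50},
    {"source_id": "http", "status": 200, "bytes": 300},
]


def aggregate_by_status(events):
    """Sum bytes per status code, dropping the source_id field."""
    totals = Counter()
    for event in events:
        totals[event["status"]] += event["bytes"]
    return [{"status": status, "bytes": total}
            for status, total in sorted(totals.items())]


aggregated = aggregate_by_status(events)

# After aggregation, no output event carries a source_id, so the
# 400 bytes for status 200 cannot be split back into syslog vs. http.
has_attribution = any("source_id" in event for event in aggregated)
print(aggregated, has_attribution)
```

In this sketch, filtering on `syslog` after aggregation has nothing to match against, which mirrors why the Destination continues to show the full aggregated volume.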

Map Settings

Use these controls (top bar and sidebar) to refine what you see.

Refresh: Refreshes the map and the details pane metrics using the current time range and filters.

Worker Group: Select the Worker Group to display. The map is limited to one Worker Group so that you can focus on a subset of your infrastructure. Changing the Worker Group updates the nodes and metrics displayed.

Time period: Select a time range (for example, last 15 minutes, 1 hour, or 1 day). The map and details pane re-render at an aggregation resolution appropriate to the range.

  • Use Compare to see how the current time range's metrics differ from a window of the same length at an earlier point in time. This helps you validate recent changes, confirm regressions, or distinguish new issues from normal historical patterns. Select a Comparison period to control how far back the earlier window starts. Each node card then shows the current metric value with a percentage change versus the comparison period, and both periods are plotted on the same charts so you can quickly see increases, decreases, or pattern changes.
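
The arithmetic behind a comparison can be sketched as follows. The helper names are hypothetical, and the percentage formula is the conventional one; the exact calculation Data Insights uses is an assumption here.

```python
from datetime import datetime, timedelta


def comparison_window(start, end, offset):
    """Shift the current window back by `offset`, keeping its length."""
    return start - offset, end - offset


def percent_change(current, comparison):
    """Percentage change of `current` versus `comparison`.

    Returns None when the comparison value is zero, because the
    change is undefined in that case.
    """
    if comparison == 0:
        return None
    return (current - comparison) * 100.0 / comparison


# Current window: the last hour. Comparison period: one day earlier.
end = datetime(2024, 1, 2, 12, 0)
start = end - timedelta(hours=1)
prev_start, prev_end = comparison_window(start, end, timedelta(days=1))

# Hypothetical event totals for the two windows.
change = percent_change(current=1200, comparison=1000)
print(prev_start, prev_end, change)
```

A node that had no traffic in the comparison window has no meaningful percentage, which is why the sketch returns `None` rather than dividing by zero.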

Filter: Narrow the map to specific components. You can filter by Source, Routes/Quick Connect, Destination, and Metric Value (available when a Metric type is selected).

Metric Controls

Below the filters, use the metric controls to choose what each node card and sparkline shows.

  • Metric: Select the type of metric to display on node cards:
    • None: Hide per-node metric values and show only the topology.
    • Volume: Show volume metrics, such as events and bytes in and out.
    • Freshness: Show data freshness metrics (age of events) instead of volume.
    • Shape: Show schema metrics based on field count, such as the minimum and maximum number of fields per event. Use this to spot Sources with excessive fields and Pipelines or Packs that unexpectedly expand or reduce the number of fields.
  • Metric Display: Choose which values to show for metrics that have In/Out pairs (Volume and Freshness):
    • Max In/Max Out: Display the maximum value seen at input and output over the selected time range.
      • For Volume, this highlights peak events/bytes.
      • For Freshness, this highlights the stalest data.
    • Min In/Min Out: Display the minimum value seen at input and output.
      • For Volume, this shows the lowest observed rates.
      • For Freshness, this shows the freshest (lowest-latency) data.
  • Sparkline: Choose which metric the small per-node sparkline represents (for example, Max Freshness In/Out or Min Freshness In/Out). This helps you see trends at a glance without opening the details pane.
  • Display active data only: When toggled on, hides nodes and edges with no activity in the selected time range. Use this to declutter the map and focus on components that actually processed data.
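
To make the Max and Min readings of Freshness concrete, the sketch below treats freshness as the age of an event at the moment it is observed. That definition, the epoch-second timestamps, and the `freshness_range` helper are assumptions for illustration only.

```python
def freshness_range(event_times, observed_at):
    """Return (min_age, max_age) in seconds for a set of events.

    Assumption: an event's age is the observation time minus the
    event's own timestamp, both in epoch seconds.
    """
    ages = [observed_at - event_time for event_time in event_times]
    return min(ages), max(ages)


# Hypothetical event timestamps, all observed at t=1060.
event_times = [1000, 1040, 1055]
min_age, max_age = freshness_range(event_times, observed_at=1060)

# min_age is the freshest event (what Min In/Min Out highlights);
# max_age is the stalest event (what Max In/Max Out highlights).
print(min_age, max_age)
```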

Details Pane (node drilldown)

Select any node to open a right-hand details pane with the following tabs:

Events: Time-series of event metrics for the selected component, such as Events In/Out, totals, and maximum values, as appropriate to the overlay and component. Use this to verify that In and Out are symmetrical and to detect gaps or spikes.

Bytes: Time-series of bytes metrics (Bytes In/Out) for bandwidth and reduction checks. Use this to confirm compression/reduction effects and identify downstream throttling symptoms.

Freshness: Time-series of minimum and maximum freshness (age) of events across the time range. Use this to detect stale or delayed data and align freshness anomalies with component behavior.

Shape: Time-series of minimum and maximum field counts per event. Use this to see where schemas expand or contract, identify Sources with excessive fields, and spot unexpected field-count changes that may indicate data-quality or parsing issues.

What to expect:

  • The details pane respects your Worker Group, time period, and filter selections.
  • The Freshness tab exposes min/max variants (for example, min in, max out) that correspond to where the metric is measured in the flow.

Use the top-right controls to filter the map to the selected object (filter icon) or copy its identifier (copy icon). In the bottom bar, select Configure to open the object’s configuration.

Workflows

Detect partial failures: Set Metric to Volume, filter to a specific Route or Destination, and scan for divergence between In and Out across nodes. Drill into nodes with anomalies and review the details pane to localize the interruption.
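
The same scan can be expressed as a small check over per-node counts. The node IDs, counts, and 5% tolerance below are hypothetical values typed in by hand; this sketch does not call any Cribl API.

```python
# Hypothetical per-node event counts for one time range.
nodes = [
    {"id": "route:main", "events_in": 10_000, "events_out": 10_000},
    {"id": "pipeline:enrich", "events_in": 10_000, "events_out": 6_200},
    {"id": "dest:s3", "events_in": 6_200, "events_out": 6_150},
]


def diverging_nodes(nodes, tolerance=0.05):
    """Return (id, loss_fraction) for nodes whose In and Out diverge.

    loss_fraction is (in - out) / in. Nodes with no input are skipped,
    and `tolerance` is an arbitrary threshold chosen for illustration.
    """
    flagged = []
    for node in nodes:
        if node["events_in"] == 0:
            continue
        loss = (node["events_in"] - node["events_out"]) / node["events_in"]
        if abs(loss) > tolerance:
            flagged.append((node["id"], round(loss, 3)))
    return flagged


flagged = diverging_nodes(nodes)
print(flagged)
```

A flagged node is only a starting point: a Pipeline that intentionally filters or samples events also shows divergence, so the details pane is still needed to tell expected reduction from an interruption.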

Validate transformations: With Metric set to Volume, compare In vs. Out across Post-Processing Pipelines to confirm expected reduction or enrichment. Then pivot to the Events, Bytes, or Freshness tabs in the details pane to make sure behavior is stable under load and over time.

Investigate staleness: Set Metric to Freshness, use Metric Display and Sparkline to focus on Max or Min In/Out, and drill into upstream components with high maximum freshness to identify where delays enter the flow.