Data Insights
Data Insights provides an interactive topology view of your Cribl Stream data flows, showing how data moves from Sources through Pre-Processing Pipelines, Routes/QuickConnect, and Post-Processing Pipelines to Destinations, with metrics for volume, freshness, and shape. It is designed for quick troubleshooting and validation of end-to-end flows across Worker Groups and time ranges.
Why Use Data Insights
- Locate issues fast: Spot partial failures, bottlenecks, drops, and misconfigurations by visualizing how Sources, Pipelines, and Destinations connect. Use the map to see when a Source is not attached to the expected Route, identify stale or inactive Sources, and trace events/bytes and freshness along the flow path.
- Validate changes: Compare metrics across time windows to confirm the impact of configuration or deployment changes.
- Understand flow shape: See where volume changes (reduction/enrichment) occur and where the Shape metric (field count per event) grows or shrinks across map cards. Use this to spot Sources with excessive fields, Pipelines/Packs that expand or reduce fields, and unexpected field-count variance that may indicate data-quality issues.
How It Works
Topology map: Map cards (the boxes) represent Sources, Pre-Processing Pipelines, Routes/QuickConnect, Post-Processing Pipelines, and Destinations. The arrows represent flow. The following columns are displayed (left-to-right):
- Source: Ingest points that report Events, Bytes In, and Freshness In, plus (when enabled) Shape, which reflects the number of fields per event.
- Pre-Processing Pipeline: The first processing stage, where you typically parse, normalize, and filter data before routing. Expect changes in volume and Shape here as fields are extracted or dropped.
- Routes / QuickConnect: Data routing/branching layer that directs events to Pipelines and Destinations. Use this column to verify that Sources connect to the expected paths.
- Post-Processing Pipeline: Where transformations, enrichment, and reduction occur. You can expect divergence between Events and Bytes In vs. Out, and Shape changes as Pipelines or Packs add, modify, or remove fields.
- Destination: Egress targets showing Events, Bytes Out, and related Freshness at the boundary.
Drilldown: Select any map card to open a details pane with time-series views for events, bytes, and freshness, aligned to your time range and filters.
When Shape metrics are enabled, map cards can also show the minimum and maximum number of fields per event over the selected time range. This helps you see how schema complexity changes through the flow, identify Sources with excessive fields, and find Pipelines/Packs that expand or reduce fields unexpectedly.
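As a rough illustration of what the Shape metric summarizes, the sketch below computes the minimum and maximum field count over a batch of events. It assumes that a "field" is a top-level key of an event; how Data Insights counts nested fields is not stated here, and the event data and the `shape_range` helper are hypothetical.

```python
from typing import Iterable, Mapping, Tuple


def shape_range(events: Iterable[Mapping[str, object]]) -> Tuple[int, int]:
    """Return (min_fields, max_fields) across the given events.

    Assumption: only top-level keys are counted as fields.
    """
    counts = [len(event) for event in events]
    if not counts:
        raise ValueError("no events in the selected time range")
    return min(counts), max(counts)


# Hypothetical events as seen at a Source card.
source_events = [
    {"_time": 1, "host": "web-1", "_raw": "GET /"},
    {"_time": 2, "host": "web-2", "_raw": "GET /a", "status": 200, "bytes": 512},
]

# The same events after a hypothetical Pipeline that drops fields.
pipeline_out = [
    {"_time": 1, "host": "web-1"},
    {"_time": 2, "host": "web-2", "status": 200},
]

print(shape_range(source_events))  # (3, 5)
print(shape_range(pipeline_out))   # (2, 3)
```

A shrinking range between two adjacent cards, as in this example, is the pattern you would expect from a Pipeline that removes fields.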
Select the action menu (upper-right of the map card) to configure the object the map card represents, filter the view to it, or copy its IDs.
Caveats
- Source and Destination volumes differ: Bytes In (Source) and Bytes Out (Destination) don’t match because Data Insights accounts for compression, formatting, and protocol overhead at the Destination. Both values are correct relative to their specific measurement points.
- Aggregation Functions affect Source filtering: When you filter the map by Source, the Destination still displays the full aggregated volume if an Aggregation Function is used in the Pipeline. If a Function combines events and removes attribution fields, it prevents Data Insights from isolating that specific Source’s contribution.
- Shared Pipelines and QuickConnect: If a Pipeline or QuickConnect is used by multiple Sources or Destinations, it appears as a distinct map card for each connection. This ensures that metrics remain specific to the traffic of that particular path, rather than being aggregated across all connections sharing that name.
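The per-connection behavior of shared Pipelines can be pictured as keying metrics by the whole path rather than by the Pipeline name alone. The following is a minimal sketch under that assumption; the sample records, field names, and the `events_per_card` helper are hypothetical and do not reflect how Data Insights stores metrics internally.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping, Tuple

PathKey = Tuple[str, str, str]  # (source, pipeline, destination)

# Hypothetical metric samples; one Pipeline ("mask_pii") serves two paths.
samples = [
    {"source": "syslog_in", "pipeline": "mask_pii", "destination": "splunk_out", "events_out": 900},
    {"source": "http_in", "pipeline": "mask_pii", "destination": "s3_out", "events_out": 100},
    {"source": "syslog_in", "pipeline": "mask_pii", "destination": "splunk_out", "events_out": 50},
]


def events_per_card(records: Iterable[Mapping[str, object]]) -> Dict[PathKey, int]:
    """Sum events per path, so each connection gets its own total."""
    totals: Dict[PathKey, int] = defaultdict(int)
    for record in records:
        key = (str(record["source"]), str(record["pipeline"]), str(record["destination"]))
        totals[key] += int(record["events_out"])
    return dict(totals)


for path, total in events_per_card(samples).items():
    print(path, total)
```

Keying by pipeline name alone would report a single total of 1050 events, which hides the fact that one path carries most of the traffic.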
Map Settings
Use these controls (top bar and sidebar) to refine what you see.
Refresh: Refreshes the map and the details pane metrics using the current time range and filters.
Worker Group: Limits the map to a specific Worker Group so you can focus on a subset of infrastructure. Changing the Worker Group updates the map cards and metrics displayed.
Time period: Select a time range (for example, last 15 minutes, 1 hour, 1 day). The map and details pane re-render using aggregated resolution appropriate to the range.
- Use Compare to see how the current time range’s metrics differ from the same-length window at an earlier point in time. This is helpful to validate recent changes, confirm regressions, or distinguish new issues from normal historical patterns. Select a Comparison period to see the current metric value on each map card, with a percentage change versus the comparison period. Both periods are plotted on the same charts so you can quickly see increases, decreases, or pattern changes. The Comparison period controls how far back the earlier window starts.
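The arithmetic behind Compare can be sketched as follows: shift the current window back by the Comparison period to get a same-length earlier window, then express the difference as a percentage of the earlier value. This is an illustrative sketch, not the product's implementation. The function names are hypothetical, and returning `None` for a zero baseline is an assumption, since the behavior in that case is not described here.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple


def comparison_window(
    start: datetime, end: datetime, offset: timedelta
) -> Tuple[datetime, datetime]:
    """Shift the current window back by `offset`, keeping its length."""
    return start - offset, end - offset


def percent_change(current: float, previous: float) -> Optional[float]:
    """Percentage change of `current` relative to `previous`.

    Assumption: an empty baseline (0) yields None rather than a percentage.
    """
    if previous == 0:
        return None
    return (current - previous) * 100.0 / previous


# Current window: 12:00-13:00 on Jan 2; Comparison period: 1 day.
prev_start, prev_end = comparison_window(
    datetime(2024, 1, 2, 12), datetime(2024, 1, 2, 13), timedelta(days=1)
)
print(prev_start, prev_end)        # 2024-01-01 12:00:00 2024-01-01 13:00:00
print(percent_change(1200, 1000))  # 20.0
print(percent_change(750, 1000))   # -25.0
```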
Filter: Focus the map to specific components. You can filter by: Source, Routes/QuickConnect, Destination, and Metric Value (when a Metric type is selected).
- When you select a Source, Route/QuickConnect, or Destination, the map highlights that card and renders its directly connected cards to preserve context, rather than hiding all other cards that do not strictly match the filter.
- If you select multiple cards, the map shows the union of their focused subgraphs (everything connected to any selected card), not the intersection. This avoids empty or misleading graphs when selected cards are not connected to each other, while still narrowing the view to the most relevant parts of your topology.
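The union behavior described above can be sketched with a small graph example. It assumes each selected card contributes itself plus its directly connected neighbors; whether the focused view extends further along the path is not specified here. The card names and helper functions are hypothetical.

```python
from typing import Iterable, List, Set, Tuple

Edge = Tuple[str, str]  # (upstream card, downstream card)


def focused_subgraph(edges: Iterable[Edge], selected: str) -> Set[str]:
    """Return the selected card plus its directly connected cards."""
    cards = {selected}
    for upstream, downstream in edges:
        if upstream == selected:
            cards.add(downstream)
        elif downstream == selected:
            cards.add(upstream)
    return cards


def filtered_view(edges: Iterable[Edge], selection: Iterable[str]) -> Set[str]:
    """Union of the focused subgraphs of every selected card."""
    edge_list: List[Edge] = list(edges)
    visible: Set[str] = set()
    for card in selection:
        visible |= focused_subgraph(edge_list, card)
    return visible


# Two hypothetical flows that do not touch each other.
edges = [
    ("syslog_in", "route_security"),
    ("route_security", "splunk_out"),
    ("http_in", "route_metrics"),
    ("route_metrics", "s3_out"),
]

view = filtered_view(edges, ["syslog_in", "s3_out"])
print(sorted(view))
# ['route_metrics', 'route_security', 's3_out', 'syslog_in']
```

With an intersection instead of a union, this selection would produce an empty map, because the two selected cards share no neighbors.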
Metric Controls
Below the filters, use the metric controls to choose what each map card and sparkline shows.
- Metric: Select the type of metric to display on map cards:
  - None: Hide per-card metric values. Show only topology.
  - Volume: Show volume metrics. For example, events and bytes in and out.
  - Freshness: Show data freshness metrics (age of events) instead of volume.
  - Shape: Show schema metrics based on field count. For example, minimum and maximum number of fields per event. Use this to spot Sources with excessive fields and Pipelines or Packs that unexpectedly expand or reduce the number of fields.
- Metric Display: Choose which values to show for metrics that have In/Out pairs (Volume and Freshness):
  - Max In/Max Out: Display the maximum value seen at input and output over the selected time range.
    - For Volume, this highlights peak events/bytes.
    - For Freshness, this highlights the stalest data.
  - Min In/Min Out: Display the minimum value seen at input and output.
    - For Volume, this shows the lowest observed rates.
    - For Freshness, this shows the freshest (lowest-latency) data.
- Sparkline: Choose which metric the small per-card sparkline represents. For example, Max Freshness In/Out or Min Freshness In/Out. This helps you see freshness trends at a glance without opening the details pane.
- Display active data only: When toggled on, hides cards and connections with no activity in the selected time range. Use this to declutter the map and focus on components that actually processed data.
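To make the Metric Display and Display active data only options concrete, the sketch below reduces a per-card time series to a single displayed value and drops cards that saw no events. It is a simplified model built on assumptions: the sample values, card names, and both helper functions are hypothetical, and "activity" is taken to mean a nonzero event count in the selected time range.

```python
from typing import Dict, List, Optional


def display_value(samples: List[float], display: str) -> Optional[float]:
    """Reduce a time series to the value shown on a card ("max" or "min")."""
    if not samples:
        return None
    if display == "max":
        return max(samples)
    if display == "min":
        return min(samples)
    raise ValueError(f"unknown display mode: {display!r}")


def active_only(events_by_card: Dict[str, List[int]]) -> Dict[str, List[int]]:
    """Keep only cards with at least one event in the time range."""
    return {card: counts for card, counts in events_by_card.items() if sum(counts) > 0}


# Hypothetical freshness samples (event age in seconds) for one card.
freshness_in = [2.0, 14.5, 3.1]
print(display_value(freshness_in, "max"))  # 14.5 (stalest data)
print(display_value(freshness_in, "min"))  # 2.0 (freshest data)

# Hypothetical per-card event counts over the same time range.
events_by_card = {"syslog_in": [120, 80, 95], "old_http_in": [0, 0, 0]}
print(list(active_only(events_by_card)))   # ['syslog_in']
```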
Details Pane (card drilldown)
Select any map card to open a right-hand details pane with four tabs:
Events: Time-series of events metrics for the selected component. For example, Events In/Out, totals, and maximum values appropriate to the overlay and component. Use this to verify symmetry and detect gaps or spikes.
Bytes: Time-series of bytes metrics (Bytes In/Out) for bandwidth and reduction checks. Use this to confirm compression/reduction effects and identify downstream throttling symptoms.
Freshness: Time-series of minimum and maximum freshness (age) of events across the time range. Use this to detect stale or delayed data and align freshness anomalies with component behavior.
Shape: Time-series of minimum and maximum field counts per event. Use this to see where schemas expand or contract, identify Sources with excessive fields, and spot unexpected field-count changes that may indicate data-quality or parsing issues.
What to expect:
- The details pane respects your Worker Group, time period, and filter selections.
- The Freshness tab exposes min/max variants (for example, min in, max out) that correspond to where the metric is measured in the flow.
Use the top-right controls to filter the map to the selected object (filter icon) or copy its identifier (copy icon). In the bottom bar, select Configure to open the object’s configuration.
Workflows
Detect partial failures: Set Metric to Volume, filter to a specific Route or Destination, and scan for divergence between In and Out across map cards. Drill into cards with anomalies and review the details pane to localize the interruption.
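The In/Out scan in this workflow amounts to flagging cards whose output diverges from their input by more than you expect. Below is a minimal sketch of that check, assuming you have copied Events In and Events Out per card from the map. The 5% tolerance, the card names, and the `divergent_cards` helper are hypothetical, and a Pipeline that intentionally filters or aggregates will legitimately exceed any small tolerance.

```python
from typing import Dict, List, Tuple


def divergent_cards(
    cards: Dict[str, Tuple[int, int]], tolerance: float = 0.05
) -> List[Tuple[str, float]]:
    """Return (card, loss_ratio) for cards whose In/Out differ beyond tolerance.

    loss_ratio is (events_in - events_out) / events_in; negative values
    mean the card emitted more events than it received.
    """
    flagged = []
    for name, (events_in, events_out) in cards.items():
        if events_in == 0:
            continue  # nothing received, so no ratio to compute
        loss = (events_in - events_out) / events_in
        if abs(loss) > tolerance:
            flagged.append((name, loss))
    return flagged


# Hypothetical (Events In, Events Out) values read from map cards.
cards = {
    "route_security": (1000, 1000),
    "mask_pii": (1000, 990),
    "splunk_out": (990, 600),
}

for name, loss in divergent_cards(cards):
    print(f"{name}: {loss:.1%} of events missing")
```

In this example only `splunk_out` is flagged, which points the drilldown at the Destination rather than at the upstream Pipeline.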
Validate transformations: With Metric set to Volume, compare In vs. Out across Post-Processing Pipelines to confirm expected reduction or enrichment. Then pivot to the Events, Bytes, or Freshness tabs in the details pane to confirm that behavior stays stable under load and over time.
Investigate staleness: Set Metric to Freshness, use Metric Display and Sparkline to focus on Max or Min In/Out, and drill into upstream components with high maximum freshness to identify where delays enter the flow.