Functional Split Overlay

This overlay partitions Worker Groups by workload type (or role) to achieve isolation and focused optimization.

The goal is to align each Worker Group to a specific set of constraints (for example, latency, security, resource usage), partitioning the data plane by role, not geography.

Common Functional Examples

The primary use cases and examples of this overlay are:

  • Push Worker Group: Handles live, high-volume, push-based Sources (syslog, HEC, OTLP). This group is critically tuned for low-latency ingest, handling a large number of concurrent connections, and managing backpressure effectively to prevent dropped data.

  • Pull and Collector Worker Group: Handles Collections, S3, SQS, database pulls, and other batch or on-demand jobs. This group is tuned for longer-running tasks and has a higher tolerance for the CPU and memory usage associated with bursty, scheduled throughput.

  • Replay Worker Group: Dedicated Workers for on-demand or bulk replay from object storage (for example, Cribl Lake, S3). This group is treated as an elastic service, tuned for high-throughput reads from storage and built for elastic scale-up and scale-down based on investigation periods and job volume.

  • Additional Groups: Functional segregation can extend to specialized groups like a Security Observability Worker Group for sensitive Pipelines (PII masking, DLP controls) or a Metrics-only Worker Group to isolate high-cardinality flows.
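
The split described above amounts to a lookup from workload type to Worker Group. The following is a minimal sketch of that idea, not Cribl configuration or a Cribl API: the `WorkerGroup` enum, the source-type keys, and the `assign_group` helper are all hypothetical names chosen for illustration.

```python
from enum import Enum


class WorkerGroup(Enum):
    """Hypothetical labels for the functional groups described above."""
    PUSH = "push"
    PULL_COLLECTOR = "pull_collector"
    REPLAY = "replay"


# Assumed mapping from Source type to functional group (illustrative keys only).
SOURCE_TO_GROUP = {
    "syslog": WorkerGroup.PUSH,
    "hec": WorkerGroup.PUSH,
    "otlp": WorkerGroup.PUSH,
    "s3": WorkerGroup.PULL_COLLECTOR,
    "sqs": WorkerGroup.PULL_COLLECTOR,
    "database": WorkerGroup.PULL_COLLECTOR,
    "collector": WorkerGroup.PULL_COLLECTOR,
}


def assign_group(source_type: str, is_replay: bool = False) -> WorkerGroup:
    """Pick a group by role: replay jobs first, then by Source type."""
    if is_replay:
        # Replay is its own function regardless of where the data is read from.
        return WorkerGroup.REPLAY
    try:
        return SOURCE_TO_GROUP[source_type.lower()]
    except KeyError:
        raise ValueError(f"no functional group defined for source type {source_type!r}")
```

For example, `assign_group("syslog")` resolves to the push group, while `assign_group("s3", is_replay=True)` resolves to the replay group even though S3 is otherwise a pull Source. Raising on an unknown type is a deliberate choice in this sketch: it forces every new workload to be assigned a role explicitly rather than landing in a default group.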

Benefits

  • Isolated failure domains: Replay, Collections, or resource-heavy Pipelines cannot degrade the quality or service level of continuous, live ingest streams.

  • Focused optimization: Each Worker Group can be tuned for its dedicated purpose, with Worker Process counts and persistent queue (PQ) strategy explicitly aligned with its workload.

  • Independent scaling: Capacity planning and scaling for each function (such as scaling only Replay Workers during an investigation) are independent of the other functions.

  • Clear SLOs: Align Worker Groups to roles, making capacity planning and Service Level Objectives (SLOs) explicit (for example, “Push Worker Group SLO: ingest latency < 1 second”).
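
A per-group SLO such as the latency example above can be written down as data and checked against measurements. The sketch below is one possible shape for that check. It assumes the SLO is evaluated at the 95th percentile using the nearest-rank method, which the source does not specify; `LatencySlo`, `nearest_rank`, and `meets_slo` are hypothetical names.

```python
import math
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class LatencySlo:
    group: str
    max_latency_s: float
    percentile: float = 95.0  # assumption: the source only says "< 1 second"


def nearest_rank(samples: Sequence[float], percentile: float) -> float:
    """Return the nearest-rank percentile of the samples."""
    if not samples:
        raise ValueError("at least one latency sample is required")
    ordered = sorted(samples)
    rank = max(1, math.ceil(percentile * len(ordered) / 100))
    return ordered[rank - 1]


def meets_slo(slo: LatencySlo, samples: Sequence[float]) -> bool:
    """True when the chosen percentile is strictly below the SLO threshold."""
    return nearest_rank(samples, slo.percentile) < slo.max_latency_s


push_slo = LatencySlo(group="push", max_latency_s=1.0)
```

Because each Worker Group carries its own `LatencySlo`, a replay or collection group can be given a looser threshold without affecting how the push group is judged.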

Design Notes

  • Shared Data Schema: Implement a shared normalization layer (using common Packs and Pipelines) to ensure that all processed data, regardless of which Worker Group handled it (for example, live ingest vs. Replay), maintains an identical event shape. This guarantees consistent data quality for all downstream tools.

  • Ownership: Define operational responsibility by mapping each Worker Group to a specific owning team. This mapping must include the associated SLOs and the responsible on-call rotation/staffing for seamless operations.
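
The shared-schema note above can be illustrated with a small check that events from two different groups end up with the same shape after normalization. This is only a sketch of the idea in plain Python, not a Pack or Pipeline definition: the canonical field list, the sample events, and the `normalize` and `same_shape` helpers are assumptions made for illustration.

```python
from typing import Any, Dict, Mapping

# Hypothetical canonical fields; a real deployment would define its own list.
CANONICAL_FIELDS = ("_time", "host", "source", "message")


def normalize(event: Mapping[str, Any]) -> Dict[str, Any]:
    """Project an event onto the canonical fields, filling gaps with None."""
    return {field: event.get(field) for field in CANONICAL_FIELDS}


def same_shape(a: Mapping[str, Any], b: Mapping[str, Any]) -> bool:
    """Two events share a shape when they expose exactly the same field names."""
    return a.keys() == b.keys()


# Illustrative events: one from live ingest, one replayed from object storage.
live = normalize({"_time": 1700000000, "host": "web-1", "message": "ok", "extra": 1})
replayed = normalize({"_time": 1690000000, "source": "archive", "message": "ok"})
```

Applying the same `normalize` step in every group is what keeps `same_shape(live, replayed)` true. If each group carried its own copy of the logic, the shapes could drift apart over time.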