Assumptions and Terminology
Before diving into specific architectures, it helps to understand the terms used in these designs. These core concepts apply to every deployment, whether you are running a small setup or a global enterprise environment.
Architecture Components: Topologies and Overlays
CVA designs are built using two distinct layers:
- Topologies (The Foundation): A topology defines the structural relationship between the Leader and Worker Groups/Fleets. It answers the question: “Which Cribl deployment am I implementing, and how are the components connected?” You must select a topology (Single-Instance, Distributed: Single-Worker Group, or Distributed: Multi-Worker Group/Fleet) as your starting point.
- Overlays (The Logic): An overlay is a strategic pattern layered onto a Distributed: Multi-Worker Group/Fleet topology. It defines the rules for how data flows between groups to meet specific goals like regional compliance, workload isolation, or secure bridging across trust boundaries.
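To make the topology/overlay relationship concrete, here is a minimal sketch that models the two layers as a small data structure. The enum values and the validation rule simply restate the definitions above; they are illustrative names, not any Cribl configuration schema.

```python
from dataclasses import dataclass, field
from enum import Enum


class Topology(Enum):
    """Structural foundations: select exactly one as the starting point."""
    SINGLE_INSTANCE = "single-instance"
    DISTRIBUTED_SINGLE_GROUP = "distributed-single-worker-group"
    DISTRIBUTED_MULTI_GROUP = "distributed-multi-worker-group-fleet"


class Overlay(Enum):
    """Strategic patterns layered onto the multi-group topology."""
    FUNCTIONAL_SPLIT = "functional-split"
    REGIONAL_SPLIT = "regional-split"


@dataclass
class Design:
    topology: Topology
    overlays: list[Overlay] = field(default_factory=list)

    def validate(self) -> None:
        # Overlays only apply to the Distributed: Multi-Worker Group/Fleet topology.
        if self.overlays and self.topology is not Topology.DISTRIBUTED_MULTI_GROUP:
            raise ValueError("Overlays require the Distributed: Multi-Worker Group/Fleet topology")


# Example: a Regional Split overlay on the multi-group topology passes validation.
Design(Topology.DISTRIBUTED_MULTI_GROUP, [Overlay.REGIONAL_SPLIT]).validate()
```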
Global Assumptions
These reference topologies assume the standard Leader-Worker-Edge (Distributed) model used across all Cribl deployments.
- Control plane vs. data plane: The Leader runs the control plane (UI, API, configuration, orchestration) and never processes production data. Workers and Edge Nodes run the data plane, handling all data processing and movement.
- Stateless Workers: Worker Nodes are treated as stateless, horizontally scalable units. Configuration and state required for processing are externalized (for example, Packs, object storage, KV stores), ensuring that Worker Nodes can be replaced without data loss or interruption.
- N+1 sizing: Production Worker Groups are provisioned with at least one more Worker Node than peak traffic requires. This means the environment is designed to sustain normal traffic even when one Worker Node is down for maintenance or an unexpected outage (see the sizing sketch after this list). For details, see Best Practice: Scale for N+1 Redundancy.
- Placement close to data: A key design guideline is to place Worker Groups and Edge Fleets as close as practical to the data they handle. This minimizes latency, reduces network egress costs, and simplifies firewall rules.
- Environment-independent architectures: These topologies are functionally identical regardless of whether they are hosted on Cribl.Cloud or in your self-managed environment. The distinction lies only in operational responsibility and the physical location of the Worker Nodes.
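As a back-of-the-envelope illustration of the N+1 sizing assumption, the sketch below derives a Worker Group's node count from peak throughput and per-node capacity, then adds one spare node. The throughput figures are placeholders, not Cribl sizing guidance; use your own benchmarks or Cribl's published sizing guidelines for real capacity planning.

```python
import math


def workers_needed(peak_throughput_gb_per_day: float,
                   per_node_capacity_gb_per_day: float,
                   spares: int = 1) -> int:
    """Size a Worker Group for N+1 redundancy.

    Returns the nodes required to carry peak traffic, plus `spares` extra
    nodes so the group still absorbs normal traffic with one node down.
    """
    base = math.ceil(peak_throughput_gb_per_day / per_node_capacity_gb_per_day)
    return base + spares


# Example: 4 TB/day peak against a hypothetical 1 TB/day per node -> 4 + 1 = 5 nodes.
print(workers_needed(peak_throughput_gb_per_day=4000, per_node_capacity_gb_per_day=1000))
```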
Key Terminology
| Term | Definition |
|---|---|
| Core Worker Group | A centrally located Worker Group that performs heavy or shared duties such as global normalization, enrichment, data governance, and broad fan-out to many Destinations. This is typically found in hub-and-spoke overlays. |
| Edge Node | A Cribl Edge instance deployed on a host or endpoint. It performs local collection, light processing, and forwarding of data to Cribl Stream or other Destinations. |
| Edge Fleet/Subfleet | A logical grouping of Edge Nodes that share configuration. This is the Cribl Edge equivalent of a Worker Group in Cribl Stream. |
| Leader | The Cribl control-plane component. It manages configuration, hosts the UI/API, coordinates Worker Groups and Edge Fleets, and enforces licenses. Leaders never process production data. |
| Overlays | Strategic logic patterns (such as Functional Split or Regional Split) layered onto a Multi-Worker Group/Fleet topology to manage specific data flow, compliance, or scaling requirements. |
| Placement rule | A design guideline to place Worker Groups and Edge Fleets close to the data they process and aligned to logical or physical boundaries (such as region, data center, security zone, or business domain). For details, see Worker Group and Fleet Placement. |
| Spoke Worker Group | A Worker Group located close to data Sources (such as a regional or DMZ Group). It handles initial ingest, filtering, and shaping before optionally forwarding the data to a Core Worker Group or object storage. |
| Topology | The structural foundation of a deployment (such as Single-Instance or Distributed) that defines Leader placement and Worker Group/Fleet composition. |
| Worker Node | A host (VM, bare metal, or container) running Cribl Stream in Worker mode. It runs one or more Worker Processes that handle the actual ingest, processing, and delivery of data. |
| Worker Group | A logical grouping of one or more Worker Nodes that share a common configuration bundle and operational profile. This is the primary data-plane scaling unit in Stream. |
| Worker Group→Worker Group bridging | A data movement pattern between separate Worker Groups or Leaders, typically using Cribl HTTP or TCP Source/Destination pairs. It’s used to cross trust boundaries (such as DMZ to Core, or regional to central). |
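The terms above map directly onto what you can observe from the Leader's control-plane API. The following is a minimal sketch, assuming a self-managed Leader with local authentication; the `/api/v1/auth/login`, `/api/v1/master/groups`, and `/api/v1/master/workers` paths, the response shapes, and the Leader URL are assumptions based on common Cribl REST API usage, so verify them against the API reference for your version (Cribl.Cloud uses a different authentication flow).

```python
import requests

LEADER = "https://leader.example.com:9000"  # hypothetical Leader URL (9000 is the default UI/API port)


def get_token(username: str, password: str) -> str:
    # Local authentication against a self-managed Leader (assumed endpoint and response shape).
    resp = requests.post(f"{LEADER}/api/v1/auth/login",
                         json={"username": username, "password": password},
                         timeout=10)
    resp.raise_for_status()
    return resp.json()["token"]


def show_control_plane_view(token: str) -> None:
    headers = {"Authorization": f"Bearer {token}"}

    # Worker Groups: the data-plane scaling units the Leader manages (assumed endpoint).
    groups = requests.get(f"{LEADER}/api/v1/master/groups", headers=headers, timeout=10)
    groups.raise_for_status()
    for group in groups.json().get("items", []):
        print("Worker Group:", group.get("id"))

    # Worker Nodes: the hosts that actually ingest, process, and deliver data (assumed endpoint).
    workers = requests.get(f"{LEADER}/api/v1/master/workers", headers=headers, timeout=10)
    workers.raise_for_status()
    for worker in workers.json().get("items", []):
        print("Worker Node:", worker.get("id"), "in group", worker.get("group"))


if __name__ == "__main__":
    show_control_plane_view(get_token("admin", "changeme"))
```

Note that the script only reads control-plane state from the Leader; consistent with the assumptions above, no production data passes through it.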