
Assumptions and Terminology

Before diving into specific architectures, it helps to understand the terms used in these designs. These core concepts apply to every deployment, whether you are running a small setup or a global enterprise environment.

Architecture Components: Topologies and Overlays

CVA designs are built using two distinct layers:

  • Topologies (The Foundation): A topology defines the structural relationship between the Leader and Worker Groups/Fleets. It answers the question: “Which Cribl deployment am I implementing, and how are the components connected?” You must select a topology (Single-Instance, Distributed: Single-Worker Group, or Distributed: Multi-Worker Group/Fleet) as your starting point.
  • Overlays (The Logic): An overlay is a strategic pattern layered onto a Distributed: Multi-Worker Group/Fleet topology. It defines the rules for how data flows between groups to meet specific goals like regional compliance, workload isolation, or secure bridging across trust boundaries.

Global Assumptions

These reference topologies assume the standard Leader-Worker-Edge (Distributed) model used across all Cribl deployments.

  • Control plane vs. data plane: The Leader runs the control plane (UI, API, configuration, orchestration) and never processes production data. Workers and Edge Nodes run the data plane, handling all data processing and movement.
  • Stateless Workers: Worker Nodes are treated as stateless, horizontally scalable units. Configuration and state required for processing are externalized (for example, Packs, object storage, KV stores), ensuring that Worker Nodes can be replaced without data loss or interruption.
  • N+1 sizing: Production Worker Groups are sized with N+1 capacity, meaning the environment can sustain normal traffic even when one Worker Node is down for maintenance or an unexpected outage (a back-of-the-envelope sizing sketch follows this list). For details, see Best Practice: Scale for N+1 Redundancy.
  • Placement close to data: A key design guideline is to place Worker Groups and Edge Fleets as close as practical to the data they handle. This minimizes latency, reduces network egress costs, and simplifies firewall rules.
  • Environment-independent architectures: These topologies are functionally identical regardless of whether they are hosted on Cribl.Cloud or in your self-managed environment. The distinction lies only in operational responsibility and the physical location of the Worker Nodes.
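
The N+1 guideline above is simple arithmetic: size for peak load, then add one spare node. The sketch below is a hypothetical helper, not a Cribl tool; the per-node capacity and throughput figures are illustrative assumptions only, so substitute numbers from your own benchmarking.

```python
import math

def workers_needed(peak_gb_per_day: float, node_capacity_gb_per_day: float) -> int:
    """Nodes required to carry peak load, plus one spare (N+1 redundancy)."""
    base = math.ceil(peak_gb_per_day / node_capacity_gb_per_day)
    return base + 1  # survive one node down without dropping below peak capacity

# Example (assumed figures): 12 TB/day peak, nodes rated at 3 TB/day each
print(workers_needed(12_000, 3_000))  # -> 5 (4 to carry the load, 1 spare)
```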

Key Terminology

  • Core Worker Group: A centrally located Worker Group that performs heavy or shared duties such as global normalization, enrichment, data governance, and broad fan-out to many Destinations. Typically found in hub-and-spoke overlays.
  • Edge Node: A Cribl Edge instance deployed on a host or endpoint. It performs local collection, light processing, and forwarding of data to Stream or other destinations.
  • Edge Fleet/Subfleet: A logical grouping of Edge Nodes that share configuration. This is the Cribl Edge equivalent of a Worker Group in Cribl Stream.
  • Leader: The Cribl control-plane component. It manages configuration, hosts the UI/API, coordinates Worker Groups and Edge Fleets, and enforces licenses. Leaders never process production data.
  • Overlays: Strategic logic patterns (such as Functional Split or Regional Split) layered onto a Multi-Worker Group/Fleet topology to manage specific data flow, compliance, or scaling requirements.
  • Placement rule: A design guideline to place Worker Groups and Edge Fleets close to the data they process, aligned to logical or physical boundaries (such as region, data center, security zone, or business domain). For details, see Worker Group and Fleet Placement.
  • Spoke Worker Group: A Worker Group located close to data Sources (such as a regional or DMZ Group). It handles initial ingest, filtering, and shaping before optionally forwarding the data to a Core Worker Group or object storage.
  • Topology: The structural foundation of a deployment (such as Single-Instance or Distributed) that defines Leader placement and Worker Group/Fleet composition.
  • Worker Node: A host (VM, bare metal, or container) running Cribl Stream in Worker mode. It runs one or more Worker Processes that handle the actual ingest, processing, and delivery of data.
  • Worker Group: A logical grouping of one or more Worker Nodes that share a common configuration bundle and operational profile. This is the primary data-plane scaling unit in Stream.
  • Worker Group→Worker Group bridging: A data movement pattern between separate Worker Groups or Leaders, typically using Cribl HTTP or TCP Source/Destination pairs. It’s used to cross trust boundaries (such as DMZ to Core, or regional to central).