Design Principles by Tier and Plane
CVA design patterns are built on the Cribl Three-Plane Model to ensure a clean separation of concerns. While components may interact across layers, each tier is anchored to a primary “home” plane to maintain stability at scale.
In the CVA context, the Leader Node anchors the control plane, Worker Nodes anchor the central data plane, and Edge Nodes anchor the data plane at the Source. For a deep dive into these foundational concepts, review the Reference Architecture Overview.
Control Plane: Leader
To maintain high availability and operational integrity, the Leader must be managed as a specialized administrative service rather than a data processing engine.
- Keep Leaders off the “hot” data path: Leaders manage configuration and coordinate Worker Groups and Edge Fleets. They must not run production data Pipelines, Sources, or Destinations. All data processing is strictly delegated to the data plane.
- Implement Leader High Availability (HA): In self-hosted production environments, run an active Leader with at least one standby Leader that shares a common configuration store (for example, shared file storage or equivalent). Optionally front them with a load balancer or virtual address to provide a single stable endpoint for users and data-plane components. In lower-criticality environments, a single Leader can be acceptable if configuration is regularly exported or backed up so it can be quickly recreated. For detailed patterns, see High Availability Architecture.
- Treat configuration as externalized state: The Leader is a control surface, not the long-term source of truth. Pipelines, Routes, Packs, Destinations, Edge Fleet/Worker Group settings, and related configuration should live in an external system with clear promotion paths from non-production to production. In a failure scenario, you stand up a new Leader and rehydrate it from this external configuration source. For details, see Configuration Management.
- Harden the administrative surface: Protect the Leader with RBAC and follow the best practices in Securing On-Prem and Hybrid Deployments. Place it in a restricted network segment and treat it as a privileged interface with full audit logging enabled.
- Decouple scaling from throughput: Leader sizing is driven by the number of managed components and concurrent admins, not event throughput. Review the Architectural Considerations guide to keep the control plane stable as your data volume grows.
- Establish continuous monitoring: Proactively track the health of the control plane by monitoring system metrics, API latency, health checks, and alerts; a minimal polling sketch follows this list. For details, see Monitor Leader Health and Logs.
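As a minimal illustration of the monitoring principle above, the following Python sketch polls a Leader health endpoint and flags failures or slow responses. The endpoint path, latency threshold, poll interval, and alert hook are assumptions for this example rather than a prescribed integration; adapt them to your own monitoring stack.

```python
# Minimal sketch: poll an assumed Leader health endpoint and flag failures
# or slow responses. The URL, threshold, and interval are illustrative
# assumptions; wire alert() into your own paging or alerting tooling.
import time
import urllib.request

LEADER_HEALTH_URL = "https://leader.example.internal:9000/api/v1/health"  # assumed URL and path
LATENCY_THRESHOLD_S = 2.0   # flag responses slower than this
POLL_INTERVAL_S = 30


def alert(message: str) -> None:
    # Placeholder: forward to whatever alerting system watches your control plane.
    print(f"[leader-health] {message}")


def check_leader(url: str) -> None:
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            elapsed = time.monotonic() - start
            if resp.status != 200:
                alert(f"Leader returned HTTP {resp.status}")
            elif elapsed > LATENCY_THRESHOLD_S:
                alert(f"Leader healthy but slow: {elapsed:.2f}s")
    except OSError as exc:  # covers connection errors and timeouts
        alert(f"Leader health check failed: {exc}")


if __name__ == "__main__":
    while True:
        check_leader(LEADER_HEALTH_URL)
        time.sleep(POLL_INTERVAL_S)
```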
Central Data Plane: Worker Nodes/Groups
To maintain high performance and reliable data delivery, the Worker tier must be architected to handle varying load types and downstream unavailability.
- Scale Worker Nodes horizontally: Worker Nodes form the core of the data plane. As volume or processing complexity increases, scale out by adding Worker Nodes or Worker Groups instead of driving a small number of nodes to very high CPU and memory utilization. Design with headroom to absorb bursts, maintenance, and failures. For details, see Sizing Your Deployment.
- Tune processes by workload: Different workloads behave differently. High-throughput streaming Sources benefit from tuned process counts and buffer sizing, while pull-based Collectors (cloud storage, APIs, databases) often need carefully adjusted concurrency and timeouts. Build Worker Groups so that similar workloads live together and can be tuned as a unit. For details, see Worker Process counts.
- Segment by trust and function: Group Worker Nodes in ways that mirror how your environment is segmented: by function (security, observability, compliance/archive, FinOps), by region or data center, or by network/regulatory zone. This keeps data-plane flows clear, reduces cross-boundary traffic, and makes it easier to enforce different retention, security, or routing rules. For details, see Worker Group and Fleet Placement.
- Align Worker Groups with security and compliance boundaries: Where specific data classes or tenants have unique requirements, dedicate Worker Groups (or clear routing segments) to those flows. This keeps sensitive data within well-defined data-plane segments and simplifies auditing and access control.
- Use DNS hostnames or virtual endpoints for Destinations: Configure Destinations with DNS names, virtual IPs, or load-balanced endpoints rather than static IP addresses. This allows you to scale or migrate downstream platforms (such as SIEMs, data lakes, and observability tools) without editing every Route or Pipeline that targets them.
- Prefer modular, composable Pipelines over monoliths: Treat Pipelines as small, focused building blocks. Create separate stages for parsing, normalization, enrichment, reduction, and routing, then compose them with Routes and shared content. This simplifies debugging, allows independent evolution of each stage, and keeps the data-plane behavior predictable as CVA patterns grow. Explore Pack-Based Configuration Management for modular design strategies, and see the composition sketch after this list.
- Design for data resilience and workload stability: Implement robust backpressure handling and queuing strategies to maintain data integrity during downstream outages or resource contention. For detailed implementation patterns, see Data Resilience and Workload Architecture; a simple backpressure sketch also follows this list.
- Design Worker Nodes for replacement and automation: Treat Worker Nodes as replaceable instances. Use patterns like auto-scaling groups, container orchestration, or the equivalent so Worker Nodes can be added or removed automatically, while the Leader and configuration system remain the control-plane source of truth.
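To make the modular-Pipeline principle concrete in language-neutral terms, the sketch below composes small, single-purpose stages into one flow. The stage names, event shape, and compose helper are hypothetical; in Cribl itself this composition is expressed with Pipelines, Routes, and Packs rather than application code.

```python
# Illustrative sketch of composing small, single-purpose stages.
# The stages and event shape are hypothetical; in Cribl this kind of
# composition is expressed with Pipelines, Routes, and Packs.
from typing import Callable, Optional

Event = dict
Stage = Callable[[Event], Optional[Event]]  # a stage returns None to drop the event


def parse(event: Event) -> Optional[Event]:
    # Split a raw syslog-style line into fields (simplified).
    raw = event.get("_raw", "")
    host, _, message = raw.partition(" ")
    return {**event, "host": host, "message": message}


def filter_noise(event: Event) -> Optional[Event]:
    # Drop low-value events early instead of carrying them downstream.
    return None if "healthcheck" in event.get("message", "") else event


def enrich(event: Event) -> Optional[Event]:
    # Tag events so routing rules can send them to the right Destination.
    return {**event, "env": "prod" if event.get("host", "").startswith("prd-") else "nonprod"}


def compose(*stages: Stage) -> Stage:
    def pipeline(event: Event) -> Optional[Event]:
        for stage in stages:
            if event is None:
                return None
            event = stage(event)
        return event
    return pipeline


# Each stage stays small and independently testable; the composed pipeline
# is the unit you route traffic to.
security_pipeline = compose(parse, filter_noise, enrich)
print(security_pipeline({"_raw": "prd-web01 user login failed"}))
```

Keeping each stage small enough to test and tune on its own is the property that keeps larger CVA patterns maintainable.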
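The backpressure principle above reduces to one mechanism: a bounded buffer between receipt and delivery. The sketch below is illustrative only, with hypothetical sizes and retry timing; Cribl provides persistent queues and backpressure settings for this purpose, so treat the code as a model of the behavior, not a replacement for those features.

```python
# Minimal sketch of backpressure via a bounded queue: when the downstream
# Destination stalls, the queue fills and the producer blocks instead of
# dropping data or exhausting memory. Sizes and sleep values are illustrative.
import queue
import threading
import time

buffer: "queue.Queue[str]" = queue.Queue(maxsize=1000)  # bounded: this is the backpressure point


def receive(event: str) -> None:
    # put() blocks when the buffer is full, pushing backpressure toward the Source.
    buffer.put(event, block=True)


def send_downstream(event: str) -> bool:
    # Placeholder for a real Destination call; return False on failure.
    print(f"sent: {event}")
    return True


def forward_to_destination() -> None:
    while True:
        event = buffer.get()
        while not send_downstream(event):
            time.sleep(1.0)  # downstream unavailable: retry while the event stays buffered
        buffer.task_done()


threading.Thread(target=forward_to_destination, daemon=True).start()
for i in range(5):
    receive(f"event-{i}")
buffer.join()
```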
Source Data Plane: Edge Nodes/Fleets
To optimize collection at the origin while minimizing host impact, the Edge Node/Fleet tier functions as a distributed, lightweight entry point for the data plane.
- Deploy for proximity: Edge Nodes live near the data they collect (such as servers, endpoints, remote sites, branch offices, or isolated networks) and act as the data-plane entry point. They collect locally and forward to dedicated Worker Groups (for further processing) or other Destinations according to centrally defined policies.
- Maintain “thin” logic: In the plane model, Edge is still part of the data plane, but it should behave like a “thin” layer. Use it for local discovery, basic parsing, early filtering, and the tagging needed for routing or policy enforcement. Avoid turning Edge Nodes/Fleets into complex processing engines that run large lookups, expensive regular expressions, or multi-step transformations. Reserve heavier data-plane work for dedicated Worker Groups.
- Respect CPU, memory, and disk constraints on Edge Nodes: Edge Nodes share host resources with application workloads or run on constrained devices. Monitor host impact carefully and tune Collectors, Pipeline complexity, and local buffering so Cribl Edge remains a low-friction data-plane component instead of a source of resource contention.
- Use Fleets/Subfleets to push control-plane intent to Edge: Fleets and Subfleets are how you express control-plane decisions to large groups of Edge Nodes. Group by environment, platform, location, or role, and manage configuration at the Fleet level. This prevents configuration drift, makes rollouts predictable, and keeps local data-plane behavior aligned with global policies. For details, see Design Fleet Hierarchy.
- Design Edge data paths for intermittent connectivity: Edge Nodes frequently operate where network links are unreliable, so configure local buffering, retries, and reasonable backoff behavior; a simple buffering sketch follows this list. Document expectations around delay and potential loss so the wider architecture explicitly acknowledges real-world conditions at the edge of the network.
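As an illustration of the intermittent-connectivity principle above, the sketch below buffers events in a local spool file and drains it with jittered exponential backoff once the upstream link returns. The paths, limits, and single-threaded flow are simplifying assumptions; Cribl Edge's own persistent queues and retry settings cover this behavior in production.

```python
# Sketch of an Edge-style forwarder for unreliable links: events are appended
# to a local spool file, and a sender drains the spool with exponential backoff
# when the upstream Worker Group is unreachable. Paths and limits are illustrative,
# and concurrency concerns are intentionally ignored for brevity.
import json
import os
import random
import time

SPOOL_PATH = "/var/tmp/edge-spool.jsonl"   # local buffer that survives restarts
MAX_BACKOFF_S = 300                        # cap the retry delay


def send_batch(events: list) -> bool:
    # Placeholder for delivery to the upstream Worker Group; return False on failure.
    print(f"forwarded {len(events)} buffered events")
    return True


def buffer_locally(event: dict) -> None:
    with open(SPOOL_PATH, "a", encoding="utf-8") as spool:
        spool.write(json.dumps(event) + "\n")


def drain_spool() -> None:
    backoff = 1.0
    while os.path.exists(SPOOL_PATH) and os.path.getsize(SPOOL_PATH) > 0:
        with open(SPOOL_PATH, "r", encoding="utf-8") as spool:
            events = [json.loads(line) for line in spool if line.strip()]
        if send_batch(events):
            os.remove(SPOOL_PATH)   # delivered: clear the local buffer
            backoff = 1.0
        else:
            # Link is down: wait with jittered exponential backoff, then retry.
            time.sleep(backoff + random.uniform(0, 1))
            backoff = min(backoff * 2, MAX_BACKOFF_S)


buffer_locally({"host": "branch-pos-01", "message": "transaction complete"})
drain_spool()
```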
For more considerations, refer to Cribl Edge-Specific Considerations.