A Cribl LogStream installation can be scaled up within a single instance and/or scaled out across multiple instances. Scaling allows for:
- Increased data volumes of any size.
- Increased processing complexity.
- Increased deployment availability.
- Increased number of destinations.
A LogStream installation can be configured to scale up and utilize as many resources on the host as required. In a single-instance deployment, you govern resource allocation through the General Settings > Worker Processes Settings section.
In a distributed deployment, you allocate resources per Worker Group. Navigate to Worker Groups > Group Name > System Settings > Worker Processes.
Either way, these controls are available:
- Process count: Indicates the number of Worker Processes to spawn. Positive numbers specify an absolute number of Workers. Negative numbers specify the number of Workers relative to the number of CPUs in the system, like this: {number of CPUs available minus the absolute value of this setting}. The default is […]. A 0 setting is interpreted as 1 Worker Process. (LogStream corrects for excessive negative offsets by guaranteeing at least 1 Worker Process.)
- Memory (MB): Amount of memory available to each Worker Process, in MB. Defaults to 2048. (See Estimating Memory Requirements below.)
Throughout these guidelines, we assume that 1 physical core is equivalent to 2 virtual/hyperthreaded CPUs (vCPUs). Each LogStream instance requires the following resources to run, beyond those reserved for the vCPU's operating system:
- +4 physical cores, +8GB RAM
- 5GB free disk space (more if persistent queuing is enabled)
For example, assuming a Cribl LogStream system with 6 physical cores (12 vCPUs):
- If Process count is set to 4, then the system will spawn 4 processes, using up to 4 vCPUs, leaving 8 free.
- If Process count is set to -2, then the system will spawn 10 processes (12-2), using up to 10 vCPUs. This will leave 2 vCPUs free.
LogStream incorporates guardrails that prevent spawning more processes than available vCPUs.
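The Process count rules above can be sketched as a small function. This is only an illustration of the described behavior, not LogStream's actual implementation: the name `resolve_worker_count` is hypothetical, and the floor of 1 Worker Process for excessive negative offsets is an assumption based on how a 0 setting is handled.

```python
def resolve_worker_count(process_count: int, vcpus: int) -> int:
    """Translate a Process count setting into a number of Worker Processes.

    Hypothetical helper: it mirrors the rules described in the text,
    not LogStream's real code.
    """
    if vcpus < 1:
        raise ValueError("vcpus must be at least 1")
    if process_count > 0:
        requested = process_count          # absolute number of Workers
    elif process_count < 0:
        requested = vcpus + process_count  # relative: 12 + (-2) = 10
    else:
        requested = 1                      # a 0 setting is treated as 1
    # Assumed guardrails: never fewer than 1 process,
    # never more processes than available vCPUs.
    return max(1, min(requested, vcpus))


if __name__ == "__main__":
    for setting in (4, -2, 0, -20, 16):
        print(setting, "->", resolve_worker_count(setting, vcpus=12))
```

On the 12-vCPU system from the example, a setting of 4 resolves to 4 processes and a setting of -2 resolves to 10.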
It's important to understand that Worker Processes operate in parallel, i.e., independently of each other. This means that:
- Data coming in on a single connection will be handled by a single Worker Process. To get the full benefits of multiple Worker Processes, data should come in over multiple connections. E.g., it's better to have 5 connections to TCP 514, each bringing in 200GB/day, than one at 1TB/day.
- Each Worker Process will maintain and manage its own outputs. E.g., if an instance with 2 Worker Processes is configured with a Splunk output, then the Splunk destination will see 2 inbound connections.
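To make the effect of connection count concrete, here is a rough sketch that compares the load on the busiest Worker Process for one large connection versus several smaller ones. The round-robin assignment of connections to processes is purely an assumption for illustration (the text does not say how LogStream distributes connections), and both function names are hypothetical.

```python
def busiest_worker_load(connection_gb_per_day, worker_processes):
    """Return the daily volume handled by the most loaded Worker Process.

    Each connection is pinned to exactly one process. Round-robin
    assignment is an assumption made for this illustration only.
    """
    loads = [0.0] * worker_processes
    for i, volume in enumerate(connection_gb_per_day):
        loads[i % worker_processes] += volume
    return max(loads)


def outbound_connections(worker_processes, destinations=1):
    """Each Worker Process manages its own outputs, one per destination."""
    return worker_processes * destinations


if __name__ == "__main__":
    # One 1TB/day connection: a single process carries everything.
    print(busiest_worker_load([1000], worker_processes=5))
    # Five 200GB/day connections: the same volume spreads across processes.
    print(busiest_worker_load([200] * 5, worker_processes=5))
    # Two Worker Processes with one Splunk output.
    print(outbound_connections(2))
```

With 5 Worker Processes, the single 1TB/day connection leaves one process handling all 1000GB/day, while five 200GB/day connections leave each process with 200GB/day.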
As with most data processing applications, Cribl LogStream's expected resource utilization will be commensurate with the type of processing that is occurring. For instance, a function that adds a static field to an event will likely perform faster than one that applies a regex to find and replace a string. At the time of this writing:
- A Worker Process will use up to 1 physical core, or 2 vCPUs.
- Processing performance is proportional to CPU clock speed.
- All processing happens in-memory.
- Processing does not require significant disk allocation.
Current guidance for capacity planning is: Allocate 1 physical core for each 400GB/day of IN+OUT throughput. So, to estimate the number of cores needed: Sum your expected input and output volume, then divide by 400GB.
- Example 1: 100GB IN -> 100GB OUT to each of 3 destinations = 400GB total = 1 physical core.
- Example 2: 3TB IN -> 1TB OUT = 4TB total = 10 physical cores.
- Example 3: 4TB IN -> full 4TB to Destination A, plus 2TB to Destination B = 10TB total = 25 physical cores.
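The capacity arithmetic above can be expressed as a short calculation. This is a minimal sketch: the function name is hypothetical, 1TB is treated as 1000GB (which matches the examples), and rounding up to a whole core is an assumption for volumes that do not divide evenly.

```python
import math

# Guidance from the text: 1 physical core per 400GB/day of IN+OUT throughput.
GB_PER_CORE_PER_DAY = 400


def estimate_physical_cores(in_gb, out_gb_per_destination):
    """Estimate physical cores from daily input and per-destination output (GB)."""
    total_gb = in_gb + sum(out_gb_per_destination)
    # Rounding up to a whole core is an assumption, not stated guidance.
    return math.ceil(total_gb / GB_PER_CORE_PER_DAY)


if __name__ == "__main__":
    print(estimate_physical_cores(100, [100, 100, 100]))  # Example 1
    print(estimate_physical_cores(3000, [1000]))          # Example 2
    print(estimate_physical_cores(4000, [4000, 2000]))    # Example 3
```

The three calls reproduce the examples above: 1, 10, and 25 physical cores.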
Estimating Memory Requirements
The general guideline for memory allocation is to start with the default 2048 MB (2 GB) per Worker Process, and then add more memory as you find that you're hitting limits.
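As a starting point for sizing, the per-process setting can be multiplied out to a host-level budget. This is a rough sketch with a hypothetical function name. It covers only the memory made available to Worker Processes and excludes the operating system and anything else running on the host.

```python
DEFAULT_MEMORY_MB = 2048  # default Memory (MB) per Worker Process


def worker_memory_budget_mb(worker_processes, memory_mb=DEFAULT_MEMORY_MB):
    """Upper bound on memory made available to all Worker Processes, in MB."""
    return worker_processes * memory_mb


if __name__ == "__main__":
    # 10 Worker Processes at the default setting.
    print(worker_memory_budget_mb(10))
    # 4 Worker Processes after raising the setting to 4096 MB.
    print(worker_memory_budget_mb(4, memory_mb=4096))
```

For example, 10 Worker Processes at the default setting can use up to 20480 MB (20 GB) in total.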
Memory is consumed per component, per Worker Process, as follows:
- Lookups are loaded into memory.
- Memory is allocated to in-memory buffers to hold data to be delivered to downstream services.
- Stateful Functions (Aggregations and Suppress) consume memory proportional to the rate of data throughput.
- The Aggregations Function's memory consumption further increases with the number of Group by's.
- The Suppress Function's memory use further increases with the cardinality of events matching the Key expression. A higher rate of distinct event values will consume more memory.
You could meet the requirement above with multiples of the following instances:
AWS – Compute Optimized Instances
- c5d.2xlarge (4 physical cores, 8 vCPUs)
- c5d.4xlarge or higher (8 physical cores, 16 vCPUs)
Azure – Compute Optimized Instances
- Standard_F8s_v2 (4 physical cores, 8 vCPUs)
- Standard_F16s_v2 or higher (8 physical cores, 16 vCPUs)
GCP – Compute Optimized Instances
- c2-standard-8 (4 physical cores, 8 vCPUs)
- c2-standard-16 or higher (8 physical cores, 16 vCPUs)
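When a core estimate exceeds what one instance provides, the number of instances is simple division. The sketch below is only illustrative: the function and the `reserved_cores_per_instance` parameter are hypothetical, and how many cores to hold back for the operating system is left to you.

```python
import math


def instances_needed(required_physical_cores, cores_per_instance,
                     reserved_cores_per_instance=0):
    """Number of identical instances needed to supply the required cores.

    reserved_cores_per_instance is a hypothetical knob for cores held back
    on each instance (for example, for the operating system).
    """
    usable = cores_per_instance - reserved_cores_per_instance
    if usable < 1:
        raise ValueError("each instance must contribute at least 1 usable core")
    return math.ceil(required_physical_cores / usable)


if __name__ == "__main__":
    # 25 physical cores on 8-core instances (e.g. c5d.4xlarge).
    print(instances_needed(25, 8))
    # The same requirement on 4-core instances (e.g. c5d.2xlarge).
    print(instances_needed(25, 4))
```

Under these assumptions, a 25-core requirement maps to 4 instances with 8 physical cores each, or 7 instances with 4 physical cores each.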
When data volume, processing needs, or other requirements exceed what a single instance can sustain, a Cribl LogStream deployment can span multiple nodes. This is known as a distributed deployment, and it can be configured and managed centrally by a single master instance. See Distributed Deployment for more details.