Version: 3.2

Sizing and Scaling

A Cribl LogStream installation can be scaled up within a single instance and/or scaled out across multiple instances. Scaling allows for:

  • Increased data volumes of any size.
  • Increased processing complexity.
  • Increased deployment availability.
  • Increased number of destinations.

Scale Up

A LogStream installation can be configured to scale up and utilize as many resources on the host as required. In a single-instance deployment, you govern resource allocation through the global ⚙️ Settings (lower left) > System > Worker Processes section.

In a distributed deployment, you allocate resources per Worker Group. Navigate to Groups > group-name > Settings (upper right) > Worker Processes.

Either way, these controls are available:

  • Process count: Indicates the number of Worker Processes to spawn. Positive numbers specify an absolute number of Workers. Negative numbers specify a number of Workers relative to the number of CPUs in the system, calculated as: {<number of CPUs available> minus <this setting's absolute value>}. The default is -2.

    You can enter and save 0 or 1, but LogStream will interpret either value as 2, and will attempt to spawn 2 Worker Processes. LogStream similarly corrects for excessive negative offsets by guaranteeing at least 2 Processes.

  • Minimum process count: Indicates the minimum number of Worker Processes to spawn. Overrides the Process count's effective result, and always enforces at least 2 Processes. (So here again, a 0 or 1 setting is interpreted as 2 Processes.)

  • Memory (MB): Amount of memory available to each Worker Process, in MB. Defaults to 2048. (See Estimating Memory Requirements below.)
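
The interaction between these settings can be sketched as follows. This is an assumption-laden illustration of the rules described above, not Cribl's actual implementation:

```python
def effective_worker_count(process_count: int, min_count: int, vcpus: int) -> int:
    """Approximate the effective Worker Process count (a sketch, not Cribl code).

    Positive values are absolute; negative values mean vCPUs minus the
    setting's absolute value; the floor is always 2 Processes.
    """
    if process_count >= 0:
        count = process_count
    else:
        count = vcpus + process_count  # e.g., 12 vCPUs with -2 -> 10
    count = max(count, min_count, 2)   # enforce the 2-Process minimum
    return min(count, vcpus)           # guardrail: never exceed available vCPUs

# With the default of -2 on a 12-vCPU host:
print(effective_worker_count(-2, 2, 12))  # -> 10
```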

For changes in any of the above controls to take effect, you must click Save on the Manage Processes page, and then restart the LogStream server via global ⚙️ Settings (lower left) > System > Controls > Restart. In a distributed deployment, also deploy your changes to the Groups.

Worker Processes' vCPU Requirements

Throughout these guidelines, we assume that 1 physical core is equivalent to:

  • 2 virtual/hyperthreaded CPUs (vCPUs) on Intel/Xeon or AMD processors.
  • 1 (higher-throughput) vCPU on Graviton2/ARM64 processors.

For example, assuming a Cribl LogStream system running on Intel or AMD processors with 6 physical cores (12 vCPUs):

  • If Process count is set to 4, then the system will spawn exactly 4 processes.
  • If Process count is set to -2, then the system will spawn 10 processes (12-2).

For CPU utilization, see Capacity and Performance Considerations below.

LogStream incorporates guardrails that prevent spawning more processes than available vCPUs.

Workers' Independence

It's important to understand that Worker Processes operate in parallel, i.e., independently of each other. This means that:

  1. Data coming in on a single connection will be handled by a single Worker Process. To get the full benefits of multiple Worker Processes, data should come in over multiple connections.
    E.g., it's better to have 5 connections to TCP 514, each bringing in 200GB/day, than one at 1TB/day.

  2. Each Worker Process will maintain and manage its own outputs. E.g., if an instance with 2 Worker Processes is configured with a Splunk output, then the Splunk destination will see 2 inbound connections.
    For further details about Workers' independence, see Shared-Nothing Architecture.
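
To make the connection math concrete, here is a hypothetical sketch. The per-process ceiling of 400 GB/day is an assumption borrowed from the Intel/AMD throughput guidance below; the point is that a connection pinned to one Worker Process can never exceed what that single process can sustain:

```python
# Hypothetical illustration: each inbound connection is handled end-to-end
# by a single Worker Process, so per-connection throughput is capped by
# the assumed single-process ceiling.
PER_PROCESS_GB_PER_DAY = 400  # assumption, from the Intel/AMD guidance below

def achievable_gb_per_day(connection_rates_gb):
    # Each connection runs on its own Worker Process, capped at the ceiling.
    return sum(min(rate, PER_PROCESS_GB_PER_DAY) for rate in connection_rates_gb)

print(achievable_gb_per_day([1000]))      # one 1 TB/day connection  -> 400
print(achievable_gb_per_day([200] * 5))   # five 200 GB/day connections -> 1000
```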

Capacity and Performance Considerations

As with most data processing applications, Cribl LogStream's expected resource utilization will depend on the type of processing that is occurring. For instance, a Function that adds a static field to an event will likely perform faster than one that applies a regex to find and replace a string. Currently:

  • A Worker Process will utilize up to 1 physical core (encompassing either 1 or 2 vCPUs, depending on the processor type).
  • Processing performance is proportional to CPU clock speed.
  • All processing happens in-memory.
  • Processing does not require significant disk allocation.

Estimating Core Requirements

Our current guidance for capacity planning depends on the processor type of your bare-metal or VM instance.

Intel/Xeon and AMD Processors

Allocate 1 physical core (2 vCPUs) for each 400GB/day of IN+OUT throughput. So, to estimate the number of cores needed: Sum your expected input and output volume, then divide by 400GB.

  • Example 1: 100GB IN -> 100GB out to each of 3 destinations = 400GB total = 1 physical core.
  • Example 2: 3TB IN -> 1TB out = 4TB total = 10 physical cores.
  • Example 3: 4 TB IN -> full 4TB to Destination A, plus 2 TB to Destination B = 10TB total = 25 physical cores.

Graviton2/ARM64 Processors

Here, 1 physical core = 1 vCPU, but overall throughput is ~20% higher than a corresponding Intel or AMD vCPU. So:

Allocate 1 physical core (1 vCPU) for each 240GB/day of IN+OUT throughput. To estimate the number of cores needed: Sum your expected input and output volume, then divide by 240GB.

  • Example 1: 100GB IN -> 100GB out to each of 3 destinations = 400GB total = 2 physical cores.
  • Example 2: 3TB IN -> 1TB out = 4TB total = 17 physical cores.
  • Example 3: 4 TB IN -> full 4TB to Destination A, plus 2 TB to Destination B = 10TB total = 42 physical cores.

Estimating Memory Requirements

The general guideline for memory allocation is to start with the default 2048 MB (2 GB) per Worker Process, and then add more memory as you find that you're hitting limits.

Memory use is consumed per component, per Worker Process, as follows:

  1. Lookups are loaded into memory.
  2. Memory is allocated to in-memory buffers to hold data to be delivered to downstream services.
  3. Stateful Functions (Aggregations and Suppress) consume memory proportional to the rate of data throughput.
  4. The Aggregations Function's memory consumption further increases with the number of Group by's.
  5. The Suppress Function's memory use further increases with the cardinality of events matching the Key expression. A higher rate of distinct event values will consume more memory.
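
As a starting point, the guideline above can be turned into a rough per-instance estimate. This is an assumption-laden sketch, not a Cribl formula; `lookup_mb` is a hypothetical figure for your total lookup-file size, which each Worker Process loads into its own memory:

```python
def baseline_memory_mb(worker_processes: int, lookup_mb: int = 0) -> int:
    """Starting-point memory estimate: the default 2048 MB per Worker
    Process, plus lookups, which every process loads independently."""
    return worker_processes * (2048 + lookup_mb)

# Hypothetical: 10 Worker Processes, 512 MB of lookup tables
print(baseline_memory_mb(10, lookup_mb=512))  # -> 25600 MB
```

Stateful Functions (Aggregations, Suppress) add throughput-dependent overhead on top of this baseline, so treat the result as a floor, not a budget.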

Measuring CPU Load

You can profile CPU usage on individual Worker Processes.

Single-Instance Deployment

Go to global ⚙️ Settings (lower left) > System > Worker Processes, and click Profile on the desired row.

Worker CPU profiling (single-instance)

Distributed Deployment

This requires a few more steps:

  1. Enable Worker UI Access if you haven't already.
  2. Select Workers in the left nav.
  3. Click on the GUID link of the Worker Node you want to profile. (You will now see that GUID in a Worker drop-down at the top left, above an orange header that confirms that you've tunneled through to the Worker Node's UI.)
  4. Select Settings from that Worker Node's top nav.
  5. Select System > Worker Processes from the resulting side nav.
  6. Click Profile on the desired Worker Process.

Worker CPU profiling (distributed)

Generating a CPU Profile

In either a single-instance or distributed deployment, you will now see a Worker Process Profiler modal.

The default Duration (sec) of 10 seconds is typically enough to profile continuous issues, but you might need to adjust this – up to several minutes – to profile intermittent issues. (Longer durations can dramatically increase the lag before LogStream formats and displays the profile data.)

Click Start to begin profiling. After the duration you've chosen (watch the progress bar), plus a lag to generate the display, you'll see a profile something like this:

Worker CPU profile

Below the graph, tabs enable you to select among Summary, Bottom‑Up, Call Tree, and Event Log table views.

To save the profile to a JSON file, click the tiny Save profile (⬇︎) button we've highlighted at the modal's upper left.

Whether you've saved or not, when you close the modal, you'll be prompted to confirm discarding the in-memory profile data.

Recommended AWS, Azure, and GCP Instance Types

You could meet the requirements above with multiples of the following instance types:

AWS – Intel processors, Compute Optimized Instances. For other options, see AWS' EC2 instance types documentation.

c5d.2xlarge (4 physical cores, 8 vCPUs)
c5.2xlarge (4 physical cores, 8 vCPUs)
c5d.4xlarge or higher (8 physical cores, 16 vCPUs)
c5.4xlarge or higher (8 physical cores, 16 vCPUs)

AWS – Graviton2/ARM64 processors, Compute Optimized Instances. For other options, see AWS' EC2 instance types documentation.

c6g.2xlarge (8 physical cores, 8 vCPUs)
c6gd.2xlarge (8 physical cores, 8 vCPUs)
c6g.4xlarge or higher (16 physical cores, 16 vCPUs)
c6gd.4xlarge or higher (16 physical cores, 16 vCPUs)

Azure – Compute Optimized Instances

Standard_F8s_v2 (4 physical cores, 8 vCPUs)
Standard_F16s_v2 or higher (8 physical cores, 16 vCPUs)

GCP – Compute Optimized Instances

c2-standard-8 (4 physical cores, 8 vCPUs)
n2-standard-8 (4 physical cores, 8 vCPUs)
c2-standard-16 or higher (8 physical cores, 16 vCPUs)
n2-standard-16 or higher (8 physical cores, 16 vCPUs)

In all cases, reserve at least 5GB disk storage per instance, and more if persistent queuing is enabled.
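
Mapping a core requirement onto these instance types is simple ceiling division. A hypothetical sizing pass, using Example 3's Intel/AMD requirement of 25 physical cores and the core counts from the lists above:

```python
import math

def instance_count(cores_needed: int, cores_per_instance: int) -> int:
    """How many instances of a given size cover the core requirement."""
    return math.ceil(cores_needed / cores_per_instance)

# 25 physical cores needed (Example 3, Intel/AMD):
print(instance_count(25, 4))  # c5.2xlarge (4 physical cores) -> 7 instances
print(instance_count(25, 8))  # c5.4xlarge (8 physical cores) -> 4 instances
```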

Scale Out

When data volume, processing needs, or other requirements exceed what a single instance can sustain, a Cribl LogStream deployment can span multiple nodes. This is known as a distributed deployment, and it can be configured and managed centrally by a single master instance. See Distributed Deployment for more details.