Cribl LogStream – Docs

Getting started with Cribl LogStream

Download manual as PDF - v2.2.0

Distributed Deployment

Getting started with Cribl LogStream on a distributed deployment

To sustain higher incoming data volumes, and/or increased processing, you can scale from a single instance up to a multi-instance, distributed deployment. Instances in the deployment serve all inputs, process events, and send to outputs independently. The instances are managed centrally by a single Master Node, which is responsible for keeping configurations in sync, and for tracking and monitoring their activity metrics.


Single Instance – a normal Cribl LogStream instance, running by itself.

Master Node – a Cribl LogStream instance running in master mode, used to centrally author configurations and monitor a distributed deployment.

Worker Node – a Cribl LogStream instance running as a managed worker, whose configuration is fully managed by a Master Node.

Worker Group – a collection of Worker Nodes that share the same configuration.

Worker Process – a process within a Single Instance or Worker Node that handles data inputs, processing, and outputs.

Mapping Ruleset – an ordered list of Filters, used to map Workers to Worker Groups.


A Worker Node's local running config can be manually overridden/changed, but changes won't persist on the filesystem.


Master Node Requirements

  • OS:
    • Linux: Red Hat, CentOS, Ubuntu, AWS Linux, SUSE (64-bit)
    • macOS 10.13 or 10.14
  • System:
    • 4+ physical cores, 8+ GB RAM
    • 5 GB free disk space
  • Git: git must be available on the Master Node. See details below.
  • Browser Support: Firefox 65+, Chrome 70+, Safari 12+, Microsoft Edge


We assume that 1 physical core is equivalent to 2 virtual/hyperthreaded CPUs (vCPUs). All quantities listed above are minimum requirements.

Worker Node Requirements

See Single-Instance Deployment for requirements and Sizing and Scaling for capacity planning details.

Network Ports – Master Node

In a distributed deployment, Workers communicate with the Master Node on these ports. Ensure that the Master is reachable on those ports from all Workers.

Component             Default Port
UI / API              9000
Heartbeat/Metrics     4200

Network Ports – Worker Nodes

By default, all LogStream Worker instances listen on the following ports:

Component       Default Port
HTTP In         10080
User options    + Other data ports as required.

Installing on Linux/Mac

See Single-Instance Deployment, as the installation procedures are identical.

Version Control with git

LogStream requires git (version or higher) to be available locally on the host where the Master Node will run. Configuration changes must be committed to git before they're deployed.

If you don't have git installed, check here for details on how to get started.

The Master Node uses git to:

  • Manage configuration versions across Worker Groups.
  • Provide users with an audit trail of all configuration changes.
  • Allow users to display diffs between current and previous config versions.
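
Because configuration changes cannot be deployed without git, it can be worth checking for it before switching an instance to Master mode. The following is a minimal Python sketch and is not part of LogStream: the function name git_version is hypothetical, and it only confirms that a git executable is on the PATH and reports what git --version prints.

```python
import shutil
import subprocess


def git_version(which=shutil.which, run=subprocess.run):
    """Return the output of `git --version`, or None if git is not on PATH.

    `which` and `run` are injectable so the check can be exercised without
    a real git installation; by default the standard library calls are used.
    """
    path = which("git")
    if path is None:
        return None
    result = run([path, "--version"], capture_output=True, text=True, check=True)
    return result.stdout.strip()


if __name__ == "__main__":
    version = git_version()
    print(version if version else "git not found on PATH")
```

Compare the reported version against the minimum that your LogStream release requires.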

Setting up Master and Worker Nodes

1. Configuring a Master Node

Using the UI:

In Settings > Distributed Management, select Mode Master. Supply the required Master settings (Address and Port). Customize the optional settings if desired. Then click Save to restart.

Or, through instance.yml:

In $CRIBL_HOME/local/_system/instance.yml, under the distributed section, set mode to master:

distributed:
  mode: master
  master:
    host: <IP or>
    port: 4200                        # heartbeat/metrics port
    tls:
      disabled: true
    ipWhitelistRegex: /.*/
    authToken: <auth token>
    enabledWorkerRemoteAccess: false  # Worker UI access
    compression: none
    connectionTimeout: 5000
    writeTimeout: 10000


Worker UI Access

If you enable the Worker UI access option (enabledWorkerRemoteAccess key), you will be able to click through from the Master's Manage Worker Nodes screen to an authenticated view of each Worker's UI. An orange header labeled Viewing Worker: <host/GUID> will appear to confirm that you are remotely viewing a Worker's UI.

2. Configuring a Worker Node

Using the UI:

In Settings > Distributed Management, select Mode Worker. Supply the required Master settings (Address and Port). Customize the optional settings if desired. Then click Save to restart.

Or, through instance.yml:

In $CRIBL_HOME/local/_system/instance.yml, under the distributed section, set mode to worker:

distributed:
  mode: worker
  envRegex: /^CRIBL_/
  master:
    host: <master address>
    port: 4200
    authToken: <token here>
    compression: none
    tls:
      disabled: true
    connectionTimeout: 5000
    writeTimeout: 10000
  tags:
    - tag1
    - tag2
    - tag42
  group: teamsters

Alternatively, you can start Worker Nodes with environment variables. For example:

CRIBL_DIST_MASTER_URL=tcp://<authToken>@<master host>:4203 ./cribl start

See the Environment Variables section for more details.

How Do Workers and Master Work Together

The Master Node has two primary roles:

  1. Serves as a central location for Workers' operational metrics. The Master ships with a monitoring console that has a number of dashboards covering almost every operational aspect of the deployment.

  2. Serves as a central location for authoring, validating, deploying, and synchronizing configurations across Worker Groups.

Workers will periodically send a heartbeat to the Master which includes information about themselves and a set of current system metrics. The heartbeat payload includes facts – such as hostname, IP address, GUID, tags, environment variables, current software/configuration version, etc. – that the Master tracks with the connection.

When a Worker Node checks in with the Master:

  • The Worker sends a heartbeat to the Master.
  • The Master uses the Worker’s facts and Mapping Rules to map it to a Worker Group.
  • The Worker Node pulls its Group's updated configuration bundle, if necessary.
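
The check-in sequence above can be sketched in Python as follows. This is an illustration only, not LogStream's implementation: the Heartbeat class, its field names (apart from conn_ip, cpus, and env, which appear in the Filter example on this page), and the handle_check_in function are hypothetical. The sketch also assumes that "if necessary" means the Worker's reported configuration version differs from the version last deployed to its Group.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Heartbeat:
    """Facts a Worker reports on check-in (field names are assumptions)."""
    guid: str
    hostname: str
    conn_ip: str
    cpus: int
    config_version: str
    tags: List[str] = field(default_factory=list)
    env: Dict[str, str] = field(default_factory=dict)
    group: Optional[str] = None


def handle_check_in(
    heartbeat: Heartbeat,
    map_to_group: Callable[[Heartbeat], str],
    deployed_versions: Dict[str, str],
) -> Tuple[str, bool]:
    """Map the Worker to a Group and decide whether it should pull a bundle."""
    group = map_to_group(heartbeat)
    deployed = deployed_versions.get(group)
    # Assumption: a pull is needed only when a version has been deployed to
    # the Group and it differs from what the Worker is currently running.
    needs_pull = deployed is not None and deployed != heartbeat.config_version
    return group, needs_pull
```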

Config Bundle Management

Config bundles are compressed archives of all config files and associated data that a Worker needs to operate. The Master creates bundles upon Deploy, and manages them as follows:

  • Bundles are wiped clean on startup.
  • While running, at most 5 bundles per group are kept.
  • Bundle cleanup is invoked when a new bundle is created.

The Worker pulls bundles from the Master and manages them as follows:

  • Last 5 bundles and backup files are kept.
  • At any point in time, all files created in the last 10 minutes are kept.
  • Bundle cleanup is invoked after a reconfigure.
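
As a rough illustration of the Worker-side retention rules above, the Python sketch below selects which bundles a cleanup pass would delete. It is not LogStream's code: the function name bundles_to_delete and the (name, created_at) tuple layout are hypothetical, and it assumes the two rules combine as a union, so a bundle is kept if it is among the 5 newest or was created within the last 10 minutes.

```python
def bundles_to_delete(bundles, now, keep_last=5, grace_seconds=600):
    """Return the names of bundles that cleanup would remove, newest first.

    bundles: iterable of (name, created_at) pairs, created_at in epoch seconds.
    now:     current time in epoch seconds.
    """
    newest_first = sorted(bundles, key=lambda b: b[1], reverse=True)
    # Rule 1: always keep the most recent `keep_last` bundles.
    keep = {name for name, _ in newest_first[:keep_last]}
    # Rule 2: always keep anything created within the grace period.
    keep |= {name for name, created in newest_first if now - created <= grace_seconds}
    return [name for name, _ in newest_first if name not in keep]
```

For example, with eight bundles created two minutes apart, the six newest fall inside the 10-minute window, so only the two oldest are selected for deletion.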

Network Port Requirements (Defaults)

  • UI access to Master Node: TCP 9000.
  • Worker Node to Master Node: TCP 9000 (API access).
  • Worker Node to Master Node: TCP 4200 (Heartbeat/Metrics).

Worker Groups

Worker Groups facilitate authoring and management of configuration settings for a particular set of Workers. To create a new Worker Group, go to the Worker Groups top-level menu and click + Add New.

Configuring a Worker Group

Clicking on the newly created group will present you with an interface for authoring and validating its configuration. You can configure everything for this Group as if it were a single Cribl LogStream instance – using exactly the same visual interface for Routes, Pipelines, Sources, Destinations and System Settings.


To explicitly set passwords for Worker Groups, see User Authentication.

Mapping Workers to Worker Groups

Mapping Rulesets are used to map Workers to Worker Groups. Only one Mapping Ruleset can be active at any one time. A ruleset is a list of rules that evaluate Filter expressions on the information that Workers send to the Master.

The ruleset behavior is similar to Routes, where the order matters and the Filter section supports full JS expressions. The ruleset matching strategy is first-match, and one Worker can belong to only one Worker Group. At least one Worker Group should be defined and present in the system.


Define a rule that maps to Group420 all hosts that satisfy either of these conditions:

  • IP address starts with 10.10.42. AND the host has more than 6 CPUs, OR
  • CRIBL_HOME environment variable contains w0.

Rule Configuration

  • Rule Name: myFirstRule
  • Filter: (conn_ip.startsWith('10.10.42.') && cpus > 6) || env.CRIBL_HOME.match('w0')
  • Group: Group420

Creating a Mapping Ruleset

To create a Mapping Ruleset, start on the Mappings top-level menu, then click + Add New.


The Mappings top-level menu appears only when you have started LogStream with the DISTRIBUTED MANAGEMENT > Mode set to Master.

Click on the newly created item, and start adding rules by clicking on + Add Rule. While working with or tuning rules, the Preview in the right pane will show which currently reporting and tracked workers map to which Worker Groups.

A ruleset must be activated before it can be used by the Master. To activate it, go to Mappings and click Activate on the required ruleset. You can also Clone a ruleset if you'd like to work on it offline, test different filters, etc.

Although not required, Workers can be configured to send a group with their payload. See below how this ranks in mapping priority.

When an instance runs as Master, the following are created automatically:

  • A default Worker Group.
  • A default Mapping Ruleset,
    • with a default Rule matching all (true).

Mapping Order of Priority

Priority for mapping to a group is as follows: Mapping Rules > Group sent by Worker > default Group.

  • If a Filter matches, use that Group.
  • Else, if a Worker has a Group defined, use that.
  • Else, map to the default Group.
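
The priority order above, combined with the first-match strategy, can be sketched in Python like this. The function name resolve_group and the rule and Worker dictionary layouts are hypothetical, and the sketch is not LogStream's implementation.

```python
def resolve_group(worker, rules, default_group="default"):
    """Resolve a Worker's Group: Mapping Rules > Group sent by Worker > default.

    rules:  ordered list of {"filter": callable, "group": str}; first match wins.
    worker: dict of Worker facts, optionally containing a "group" key.
    """
    for rule in rules:
        if rule["filter"](worker):
            return rule["group"]
    if worker.get("group"):
        return worker["group"]
    return default_group
```

With the automatically created default Rule that matches all (true) still in the active ruleset, the first branch always returns, so a Group sent by the Worker is consulted only when no rule matches.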

Deploying Configurations

The typical workflow for deploying configurations is the following:

  1. Work on configs.
  2. Commit (and optionally push).
  3. Deploy.

Deployment is the last step after configuration changes have been saved and committed. Deploying here means propagating updated configs to Workers. Deploying new configurations is done at the Group level. To deploy, locate your desired Group and click on Deploy. Workers that belong to the group will start pulling updated configurations on their next check-in.


When a Worker Node pulls its first configs, the admin password will be randomized, unless specifically changed. I.e., users won't be able to log in on the Worker Node with default credentials.

Configuration Files

On the Master, a group's configuration lives under: $CRIBL_HOME/groups/<groupName>/local/cribl/.
On the managed Worker, after configs have been pulled, they're extracted under: $CRIBL_HOME/local/cribl/.

Lookup Files

On the Master, a group's lookup files live under: $CRIBL_HOME/groups/<groupName>/data/lookups.

On the managed Worker, after configs have been pulled, lookups are extracted under: $CRIBL_HOME/data/lookups. When deployed via the Master, lookup files are distributed to Workers as part of a configuration deployment.

If you want your lookup files to be part of the LogStream configuration's version control process, we recommend deploying using the Master Node. Otherwise, you can update your lookup file out-of-band on the individual Workers. The latter is especially useful for larger lookup files (> 10 MB, for example), for lookup files maintained using some other mechanism, or for lookup files that are updated frequently.


Some configuration changes will require restarts, while many others require only reloads. See here for details. Restarts/reloads of each worker process are handled automatically by the Worker.

Worker Process Rolling Restart

During a restart, to minimize ingestion disruption and increase availability of network ports, worker processes on a Worker Node are restarted in a rolling fashion. 20% of running processes – with a minimum of one process – are restarted at a time. A worker process must come up and report as started before the next one is restarted. This rolling restart continues until all processes have restarted. If a worker process fails to restart, configurations will be rolled back.
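
A short Python sketch of the batching arithmetic may help when estimating how long a rolling restart will take. The names restart_batches and rolling_restart and the restart and rollback callbacks are hypothetical, and the sketch assumes that the 20% figure is rounded down before the minimum of one is applied; the actual rounding is not specified here. For simplicity it restarts the processes in a batch one after another.

```python
def restart_batches(process_ids, percent=20):
    """Split processes into batches of `percent` of the total, minimum one each."""
    batch_size = max(1, (len(process_ids) * percent) // 100)
    return [process_ids[i:i + batch_size] for i in range(0, len(process_ids), batch_size)]


def rolling_restart(process_ids, restart, rollback, percent=20):
    """Restart batch by batch; roll back and stop if any process fails to start.

    restart(pid) must return True once the process reports as started.
    rollback()   is called once if a restart fails.
    """
    for batch in restart_batches(process_ids, percent):
        for pid in batch:
            if not restart(pid):
                rollback()
                return False
    return True
```

Under these assumptions, a Worker Node with 10 worker processes restarts them in 5 batches of 2, while a node with 3 processes restarts them one at a time.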

Auto-Scaling Workers and Load-Balancing Incoming Data

If data flows in via Load Balancers, make sure to register all instances. Each Cribl LogStream node exposes a health endpoint that your Load Balancer can check to make a data/connection routing decision.

Health Check Endpoint                          Healthy Response
curl http://<host>:<port>/api/v1/health        {"status":"healthy"}
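
If your load balancer or auto-scaling hook is scripted, the same check can be made from Python. The sketch below uses only the standard library and relies on the endpoint path and healthy response shown above; the function name is_healthy, the timeout value, and the injectable opener parameter are assumptions added for illustration.

```python
import json
from urllib.request import urlopen


def is_healthy(host, port, timeout=2.0, opener=urlopen):
    """Return True only if /api/v1/health answers with {"status": "healthy"}."""
    url = f"http://{host}:{port}/api/v1/health"
    try:
        with opener(url, timeout=timeout) as response:
            body = json.loads(response.read().decode("utf-8"))
    except (OSError, ValueError):
        # OSError covers connection failures and timeouts (URLError is a
        # subclass); ValueError covers invalid JSON or undecodable bytes.
        return False
    return isinstance(body, dict) and body.get("status") == "healthy"
```

Treating every failure mode as unhealthy keeps the routing decision conservative: a node that cannot answer is simply taken out of rotation.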

Environment Variables

  • CRIBL_DIST_MASTER_URL – URL of the Master Node. Format: <tls|tcp>://<authToken>@host:port?group=defaultGroup&tag=tag1&tag=tag2&tls.<tls-settings below>.
    • tls.privKeyPath – Private Key Path.
    • tls.passphrase – Key Passphrase.
    • tls.caPath – CA Certificate Path.
    • tls.certPath – Certificate Path.
    • tls.rejectUnauthorized – Validate Client Certs. Boolean, defaults to false.
    • tls.requestCert – Authenticate Client (mutual auth). Boolean, defaults to false.
    • tls.commonNameRegex – Regex matching peer certificate > subject > common names allowed to connect. Used only if tls.requestCert is set to true.
  • CRIBL_DIST_MODE – worker | master. Defaults to worker if and only if CRIBL_DIST_MASTER_URL is present.
  • CRIBL_HOME – Auto setup on startup. Defaults to parent of bin directory.
  • CRIBL_CONF_DIR – Auto setup on startup. Defaults to parent of bin directory.
  • CRIBL_NOAUTH – Disables authentication. Careful here!!
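
When Workers are launched by automation, it can help to assemble CRIBL_DIST_MASTER_URL programmatically rather than by hand. The Python sketch below follows the format given above. The function name build_master_url and its parameters are hypothetical, and percent-encoding of the token and query values is an assumption about what LogStream accepts, so verify the result against a test Worker before relying on it.

```python
from urllib.parse import quote, urlencode


def build_master_url(host, port, auth_token, *, tls=False, group=None,
                     tags=(), tls_settings=None):
    """Build <tls|tcp>://<authToken>@host:port?group=...&tag=...&tls.<setting>=..."""
    scheme = "tls" if tls else "tcp"
    params = []
    if group:
        params.append(("group", group))
    # Each tag becomes its own repeated tag= parameter.
    params.extend(("tag", tag) for tag in tags)
    for key, value in (tls_settings or {}).items():
        # Booleans are rendered as lowercase true/false.
        text = str(value).lower() if isinstance(value, bool) else str(value)
        params.append((f"tls.{key}", text))
    url = f"{scheme}://{quote(auth_token, safe='')}@{host}:{port}"
    if params:
        url += "?" + urlencode(params)
    return url
```

The resulting string can be exported as CRIBL_DIST_MASTER_URL in the environment of the process that runs ./cribl start.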

Workers GUID

When you install and first run the software, a GUID is generated and stored in a .dat file located in $CRIBL_HOME/bin/, e.g.:

# cat $CRIBL_HOME/bin/676f6174733432.dat

When deploying Cribl LogStream as part of a host image or VM, be sure to remove this file, so that you don't end up with duplicate GUIDs. The file will be regenerated on next run.
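
An image-build step that removes the GUID file could look like the Python sketch below. It is an illustration under a notable assumption: that the GUID file is the only .dat file in the bin directory. The function name remove_guid_files is hypothetical, and the function defaults to a dry run so that you can inspect the matches before deleting anything.

```python
from pathlib import Path


def remove_guid_files(cribl_home, dry_run=True):
    """List (and, unless dry_run, delete) .dat files under <cribl_home>/bin.

    Returns the names of the matching files. Assumes the GUID file is the
    only .dat file in that directory; review the dry-run output first.
    """
    bin_dir = Path(cribl_home) / "bin"
    candidates = sorted(p for p in bin_dir.glob("*.dat") if p.is_file())
    if not dry_run:
        for path in candidates:
            path.unlink()
    return [p.name for p in candidates]
```

Run it once with the default dry_run=True as the last step before the image is captured, confirm that only the GUID file is listed, then run it again with dry_run=False.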

Updated 17 days ago
