Cribl LogStream – Docs

Cribl LogStream Documentation

Download entire manual as PDF – v.3.1.1

Single-Instance Deployment

Getting started with Cribl LogStream on a single instance

For small-volume or light processing environments – or for test or evaluation use cases – a single instance of Cribl LogStream might be sufficient to serve all inputs, event processing, and outputs. This page outlines how to implement a single-instance deployment.

Architecture

Requirements

  • OS (Intel Processors):

    • Linux 64-bit kernel >= 3.10 and glibc >= 2.17
    • Examples: Ubuntu 16.04, Debian 9, RHEL 7, CentOS 7, SUSE Linux Enterprise Server 12+, Amazon Linux 2014.03+
  • OS (ARM64 Processors):

    • Linux 64-bit
    • Tested so far on Ubuntu (14.04, 16.04, 18.04, and 20.04), CentOS 7.9, and Amazon Linux 2
  • System:

ℹ️

We assume that 1 physical core is equivalent to 2 virtual/​hyperthreaded CPUs (vCPUs) on Intel/Xeon or AMD processors; and to 1 (higher-throughput) vCPU on Graviton2/ARM64 processors.

  • Browser Support: Firefox 65+, Chrome 70+, Safari 12+, Microsoft Edge

All quantities listed above are minimum requirements. To fulfill these requirements using cloud-based virtual machines, see Recommended AWS, Azure, and GCP Instance Types.
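
The kernel and glibc minimums above can be checked from a shell before installing. The following is a minimal sketch, not an official Cribl tool. The helper name version_ge is hypothetical, and the glibc lookup assumes that getconf GNU_LIBC_VERSION is available, which holds on glibc-based distributions but not necessarily elsewhere.

```shell
# Hypothetical helper (not part of Cribl): succeeds when dotted numeric
# version $1 is greater than or equal to $2, comparing field by field.
version_ge() {
  a=$1
  b=$2
  while [ -n "$a" ] || [ -n "$b" ]; do
    x=${a%%.*}
    y=${b%%.*}
    if [ "${x:-0}" -gt "${y:-0}" ]; then return 0; fi
    if [ "${x:-0}" -lt "${y:-0}" ]; then return 1; fi
    case $a in *.*) a=${a#*.} ;; *) a= ;; esac
    case $b in *.*) b=${b#*.} ;; *) b= ;; esac
  done
  return 0
}

# Keep only the leading digits and dots, e.g. "5.15.0-91-generic" -> "5.15.0".
kernel=$(uname -r | sed 's/[^0-9.].*$//')
if version_ge "$kernel" 3.10; then
  echo "kernel $kernel: OK"
else
  echo "kernel $kernel: older than 3.10"
fi

# Yields e.g. "2.31" on glibc systems, and an empty string elsewhere.
glibc=$(getconf GNU_LIBC_VERSION 2>/dev/null | sed 's/^glibc //')
if [ -z "$glibc" ]; then
  echo "glibc version not detected"
elif version_ge "$glibc" 2.17; then
  echo "glibc $glibc: OK"
else
  echo "glibc $glibc: older than 2.17"
fi
```

The comparison is numeric per field, so 3.2 correctly sorts below 3.10, which a plain string comparison would get wrong.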

Network Ports

By default, LogStream listens on the following ports:

Component                                                Default Port
UI                                                       9000
HTTP In                                                  10080
Splunk to Cribl LogStream data port                      localhost:10000 (Cribl App for Splunk)
| criblstream Splunk search command to Cribl LogStream   localhost:10420 (Cribl App for Splunk)
User options                                             Other data ports as required.

Overriding Default Ports

The above ports can be overridden in the following configuration files:

  • Cribl UI port (9000): Default definitions for host, port, and other settings are set in $CRIBL_HOME/default/cribl/cribl.yml, and can be overridden by defining alternatives in $CRIBL_HOME/local/cribl/cribl.yml.

  • Data ports: HTTP In (10080), TCPJSON In (10420), and Splunk to Cribl (10000). Default definitions for host, port, and other settings are set in $CRIBL_HOME/default/cribl/inputs.yml, and can be overridden by defining alternatives in $CRIBL_HOME/local/cribl/inputs.yml.
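
As an illustration of the override mechanism, the sketch below writes a UI port override into a local/cribl/cribl.yml file. The helper name write_ui_port_override and the port 9443 are hypothetical, and the api/host/port key layout is an assumption; confirm it against your own $CRIBL_HOME/default/cribl/cribl.yml before relying on it. The demo writes into a scratch directory so that no real configuration is touched.

```shell
# Hypothetical helper: write a UI port override under the given CRIBL_HOME.
# Note that this replaces any existing local/cribl/cribl.yml in that location.
write_ui_port_override() {
  home=$1
  port=$2
  mkdir -p "$home/local/cribl"
  # Assumed key layout; verify against default/cribl/cribl.yml.
  cat > "$home/local/cribl/cribl.yml" <<EOF
api:
  host: 0.0.0.0
  port: $port
EOF
}

# Demo against a scratch directory rather than a real installation.
demo_home=$(mktemp -d)
write_ui_port_override "$demo_home" 9443
cat "$demo_home/local/cribl/cribl.yml"
```

On a real instance, it is safer to edit the existing local file by hand than to overwrite it, because it may already carry other settings.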

Installing on Linux

  • Install the package on your instance of choice. Download it here.
  • Ensure that required ports are available (see Network Ports).
  • Un-tar in a directory of choice, e.g., /opt/:
    • tar xvzf cribl-<version>-<build>-<arch>.tgz
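
One way to check port availability before starting is to look for existing listeners. This is a rough sketch that assumes the ss utility (from iproute2) is installed; the helper name port_in_use is hypothetical. It reads ss -ltn style output on stdin, so that the parsing can be exercised without a live listener.

```shell
# Hypothetical helper: reads `ss -ltn`-style lines on stdin and succeeds
# if any local address (4th column) ends in the given port number.
port_in_use() {
  awk -v p="$1" '
    NR > 1 { n = split($4, parts, ":"); if (parts[n] == p) found = 1 }
    END { exit (found ? 0 : 1) }
  '
}

# Check the default UI and HTTP In ports listed under Network Ports.
for port in 9000 10080; do
  if ! command -v ss >/dev/null 2>&1; then
    echo "port $port: ss not available, skipping check"
  elif ss -ltn | port_in_use "$port"; then
    echo "port $port: already in use"
  else
    echo "port $port: appears free"
  fi
done
```

Add any other data ports you plan to configure to the loop.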

Running

Go to the $CRIBL_HOME/bin directory, under the location where the package was extracted (e.g., /opt/cribl/bin). Here, you can use ./cribl to:

  • Start: ./cribl start
  • Stop: ./cribl stop
  • Reload: ./cribl reload
  • Restart: ./cribl restart
  • Get status: ./cribl status
  • Switch a distributed deployment to single-instance mode:
    ./cribl mode-single (uses the default address:port 0.0.0.0:9000)

📘

Executing the restart or stop command cancels any currently running collection jobs. For other available commands, see CLI Reference.

Next, go to http://<hostname>:9000 and log in with default credentials (admin:admin). You can now start configuring Cribl LogStream with Sources and Destinations, or start creating Routes and Pipelines.

📘

In the case of an API port conflict, the process will retry binding for 10 minutes before exiting.
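
Because the process may take some time to bind its port, a script that starts LogStream may need to wait before the UI responds. The sketch below shows a generic retry loop; wait_until is a hypothetical helper, and the curl probe in the comment is only one assumed way to test the UI port.

```shell
# Hypothetical helper: run a command up to $1 times, sleeping $2 seconds
# between attempts; succeeds as soon as the command succeeds.
wait_until() {
  attempts=$1
  delay=$2
  shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if "$@"; then return 0; fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Trivial demo with a command that always succeeds.
if wait_until 3 0 true; then
  echo "ready"
fi

# Assumed real-world use, probing the default UI port with curl:
#   wait_until 30 2 curl -fsS -o /dev/null "http://localhost:9000/"
```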

Shutdown and Restart Sequence

When a Worker Process receives an explicit shutdown command, it follows this sequence:

  1. Shuts down internal system communications: stops receiving any commands from the API Process or distributed Leader.
  2. Shuts down the input Sources.
  3. When the input stream ends, receives a signal event from LogStream's event processor to flush out any stateful Pipeline Functions (such as Aggregations, Sampling, Dynamic Sampling, and Suppress).
  4. Waits for 10 seconds, to allow data to finish flowing through the streams processing engine. This wait is designed to allow all Destinations to flush out remaining data. However, any data not flushed within this interval – e.g., because of an error on downstream receivers – will be lost.
  5. Exits.

Shutdown/Restart with PQ

Enabling Persistent Queues, on Destinations that support it, generally helps ensure data delivery to your downstream systems. However, note that when a Worker Process restarts, there is a potential for duplicate events to be sent through such Destinations.

This is because PQ doesn't mark events as safe to discard until they've been handed off to the host OS to send out. So if the Worker Process exits at the final step above before all events have flushed, the last few events will not have been marked as committed and removed. Upon restart, LogStream will still see them, and will resend them.

Enabling Start on Boot

Cribl LogStream ships with a CLI utility that can update your system's configuration to start LogStream at system boot time. The basic format to invoke this utility is:

[sudo] $CRIBL_HOME/bin/cribl boot-start [enable|disable] [options] [args]

📘

You will need to run this command as root, or with sudo. For options and arguments, see the CLI Reference.

Most popular Linux distributions now use systemd to start processes at boot, while older or more obscure distributions might still use initd. If you aren't sure which method your systems use, check with your Linux distribution vendor to determine which of the procedures below to follow.
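
As a quick first check, the presence of the /run/systemd/system directory is the conventional sign that a host was booted with systemd. The sketch below uses that test. The helper name detect_boot_method is hypothetical, and treating every non-systemd host as initd is a simplifying assumption that will not hold for every distribution.

```shell
# Hypothetical helper: prints "systemd" when the host was booted with
# systemd, and otherwise falls back to "initd" (an assumption).
detect_boot_method() {
  if [ -d /run/systemd/system ]; then
    echo systemd
  else
    echo initd
  fi
}

method=$(detect_boot_method)
echo "Detected boot method: $method"
echo "Candidate command: sudo \$CRIBL_HOME/bin/cribl boot-start enable -m $method -u cribl"
```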

Using systemd

To enable Cribl LogStream to start at boot time with systemd, you need to run the boot‑start command. Make sure you first create any user you want to specify to run LogStream. E.g., to run LogStream on boot as existing user cribl, you'd use:

sudo $CRIBL_HOME/bin/cribl boot-start enable -m systemd -u cribl

This will install a unit file (as below) and start Cribl LogStream at boot time as user cribl. A ‑configDir option can be used to specify where to install the unit file. If not specified, this location defaults to /etc/systemd/system.

If necessary, change ownership for the Cribl LogStream installation:

[sudo] chown -R cribl $CRIBL_HOME

Next, use the enable command to ensure that the service starts on system boot:

[sudo] systemctl enable cribl

To disable starting at boot time, run the following command:

sudo $CRIBL_HOME/bin/cribl boot-start disable

Note the unit file's default hard limit of 65536 on maximum open file descriptors (the LimitNOFILE setting, also known as a ulimit); 65536 is also the recommended minimum. Linux tracks this limit per user account. You can view the current soft ulimit for max open file descriptors by running ulimit -n while logged in as the same user that runs the cribl binary.

[Unit]
Description=Systemd service file for Cribl LogStream.
After=network.target

[Service]
Type=forking
User=cribl
Restart=on-failure
RestartSec=5
LimitNOFILE=65536
PIDFile=/install/path/to/cribl/pid/cribl.pid
ExecStart=/install/path/to/cribl/bin/cribl start
ExecStop=/install/path/to/cribl/bin/cribl stop
ExecStopPost='/bin/rm -f /install/path/to/cribl/pid/cribl.pid'
ExecReload=/install/path/to/cribl/bin/cribl reload
TimeoutSec=60

[Install]
WantedBy=multi-user.target
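
To compare the current user's limits against the 65536 recommendation, something like the following could be run as the user that runs the cribl binary. This is a sketch: nofile_ok is a hypothetical helper, and it treats a reported value of "unlimited" as sufficient.

```shell
# Hypothetical helper: succeeds when limit $1 is at least $2.
# "unlimited" counts as sufficient; non-numeric input returns 2.
nofile_ok() {
  case $1 in
    unlimited) return 0 ;;
    '' | *[!0-9]*) return 2 ;;
  esac
  [ "$1" -ge "$2" ]
}

recommended=65536
soft=$(ulimit -S -n)
hard=$(ulimit -H -n)

for kind in soft hard; do
  if [ "$kind" = soft ]; then value=$soft; else value=$hard; fi
  if nofile_ok "$value" "$recommended"; then
    echo "$kind open-files limit $value: meets $recommended"
  else
    echo "$kind open-files limit $value: below $recommended"
  fi
done
```

Limits seen in an interactive shell can differ from those that systemd applies to the service through LimitNOFILE, so treat the output as a check on the user account only.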

🚧

Do NOT Run LogStream as Root!

If LogStream is required to listen on ports 1–1024, it will need privileged access. You can enable this on systemd by adding this configuration key:

[Service]
AmbientCapabilities=CAP_NET_BIND_SERVICE
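
One possible way to add this key without editing the generated unit file is a systemd drop-in file. Drop-ins are a standard systemd feature, but using one here is a suggestion rather than a documented Cribl procedure. The helper name write_bind_dropin and the file name override.conf are hypothetical, and the unit name cribl.service is assumed from the systemctl enable cribl command above. The demo writes into a scratch directory.

```shell
# Hypothetical helper: create a drop-in for cribl.service under the given
# systemd unit directory, granting the capability to bind low ports.
write_bind_dropin() {
  unit_dir=$1
  mkdir -p "$unit_dir/cribl.service.d"
  cat > "$unit_dir/cribl.service.d/override.conf" <<'EOF'
[Service]
AmbientCapabilities=CAP_NET_BIND_SERVICE
EOF
}

# Demo against a scratch directory rather than /etc/systemd/system.
demo_dir=$(mktemp -d)
write_bind_dropin "$demo_dir"
cat "$demo_dir/cribl.service.d/override.conf"
```

On a real host the target would be the directory holding the unit file (/etc/systemd/system by default), followed by sudo systemctl daemon-reload and a restart of the service.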

Using initd

To enable Cribl LogStream to start at boot time with initd, you need to run the boot-start command. If the user that you want to run LogStream as does not exist, create it before running the command. E.g., to run LogStream on boot as user cribl:

sudo $CRIBL_HOME/bin/cribl boot-start enable -m initd -u cribl

This will install an init.d script in /etc/init.d/cribl.init.d, and start Cribl LogStream at boot time as user cribl. A ‑configDir option can be used to specify where to install the script. If not specified, this location defaults to /etc/init.d.

If necessary, change ownership for the Cribl LogStream installation:

[sudo] chown -R cribl $CRIBL_HOME

To disable starting at boot time, run the following command:

sudo $CRIBL_HOME/bin/cribl boot-start disable

🚧

Do NOT Run LogStream as Root!

If LogStream is required to listen on ports 1–1024, it will need privileged access. On a Linux system with POSIX capabilities, you can achieve this by adding the CAP_NET_BIND_SERVICE capability. For example: # setcap cap_net_bind_service=+ep $CRIBL_HOME/bin/cribl

On some OS versions (such as CentOS), you must add an -i switch to the setcap command. For example: # setcap -i cap_net_bind_service=+ep $CRIBL_HOME/bin/cribl

Upon starting the LogStream server, a bind EACCES 0.0.0.0:<port> error in the API/worker logs (depending on the service) might indicate that setcap did not successfully execute.
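
To confirm whether setcap took effect, the getcap utility (shipped alongside setcap in the libcap tools) can list a file's capabilities. The sketch below only checks that the capability name appears in getcap-style output, because the exact output format varies between libcap versions. The helper name has_bind_capability and the sample line are illustrative assumptions.

```shell
# Hypothetical helper: reads getcap-style output on stdin and succeeds
# if the CAP_NET_BIND_SERVICE capability is mentioned.
has_bind_capability() {
  grep -qi 'cap_net_bind_service'
}

# Demo with an illustrative sample line (not captured from a real host).
sample='/opt/cribl/bin/cribl cap_net_bind_service=ep'
if printf '%s\n' "$sample" | has_bind_capability; then
  echo "capability present in sample"
fi

# Assumed real-world use:
#   getcap "$CRIBL_HOME/bin/cribl" | has_bind_capability
```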

System Proxy Configuration

For details on configuring LogStream to send and receive data through proxy servers, see our System Proxy Configuration topic.

Scaling Up

A single-instance installation can be configured to scale up and utilize as many resources on the host as required. See Sizing and Scaling for details.

Anti-Virus Exceptions

If you are running anti-virus software on a LogStream instance's host OS, here are general guidelines for minimizing accidental blockage of LogStream's normal operation.

Your overall goals are to prevent the anti-virus software from locking any files while LogStream needs to write to them, and from triggering any changes that LogStream would detect as needing to be committed.

First, if Persistent Queues are enabled on any Destinations, exclude any directories that these Destinations write to. This is especially relevant if you're writing queues to any custom locations outside of $CRIBL_HOME.

Next, for any non-streaming Destinations that you've configured, exclude their staging paths.

Next, exclude these subdirectories of $CRIBL_HOME:

  • state/
  • log/
  • .git/ (usually only exists on Leader Nodes)
  • groups/ (on Leader Nodes)
  • local/ (on Workers or Leader)

Finally, avoid scanning any processes. Except for the queueing/staging directories already listed above, LogStream runs everything in memory, so scanning process memory will slow down LogStream's processing and reduce throughput.
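
The $CRIBL_HOME subdirectories listed above can be expanded into absolute paths for pasting into an anti-virus exclusion list. This is a small sketch: av_exclusions is a hypothetical helper, and /opt/cribl is used only as a fallback when CRIBL_HOME is unset. It does not cover Persistent Queue or staging directories, which depend on your Destination settings.

```shell
# Hypothetical helper: print the $CRIBL_HOME subdirectories to exclude,
# one absolute path per line. .git/ and groups/ matter on Leader Nodes.
av_exclusions() {
  home=${1%/}
  for sub in state log .git groups local; do
    echo "$home/$sub/"
  done
}

av_exclusions "${CRIBL_HOME:-/opt/cribl}"
```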

