Cribl LogStream – Docs

Cribl LogStream Documentation

Download entire manual as PDF - v2.4.4

Version Control

Tracking, backing up, and restoring configuration changes for single-instance and distributed deployments

Cribl LogStream integrates with Git clients and remote repositories to provide version control of LogStream's configuration. This integration offers backup and rollback for single-instance and distributed deployments.

These options are separate from the Git repo responsible for version control of Worker configurations, located on the Master Node in distributed deployments. We cover all these options and requirements below.

Git Installation (Local or Standalone/Single-Instance)

To verify that git is available, run:

git --version

The minimum version that LogStream requires is: 1.8.3.1. If you don't have git installed, see the installation links here.
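As a minimal sketch of the check above, the script below compares the installed git version against the stated minimum of 1.8.3.1. The function names `version_ge` and `check_git` are hypothetical helpers, not part of LogStream, and the comparison assumes a `sort` that supports `-V` (version sort), as GNU coreutils does.

```shell
#!/bin/sh
# Minimum git version stated in this section.
MIN_GIT_VERSION="1.8.3.1"

# version_ge A B: succeeds when version A is greater than or equal to version B.
# Assumes `sort -V` (version sort) is available, as in GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_git: reports whether the installed git meets the minimum version.
check_git() {
    if ! command -v git >/dev/null 2>&1; then
        echo "git is not installed"
        return 1
    fi
    # `git --version` prints e.g. "git version 2.39.2"; field 3 is the number.
    installed=$(git --version | awk '{print $3}')
    if version_ge "$installed" "$MIN_GIT_VERSION"; then
        echo "git $installed meets the minimum ($MIN_GIT_VERSION)"
    else
        echo "git $installed is older than the minimum ($MIN_GIT_VERSION)"
        return 1
    fi
}

check_git || echo "Install or upgrade git before continuing."
```

Run it as the same user that runs LogStream, so that the `git` found on the PATH is the one LogStream would use.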

Git Required for Distributed Deployments

For distributed deployments, git must be installed and available locally on the host running the Master Node.

All configuration changes must be committed before they are deployed. The Master notifies Workers that a new configuration is available, and Workers pull the new configuration from the Master Node.

Committing Changes

Once Git is installed, you can commit configuration changes using the git CLI. You can also commit changes interactively, using LogStream's UI.
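For the CLI route, a minimal sketch might look like the following. It assumes that the repository lives in `$CRIBL_HOME` (as the troubleshooting section's reference to `$CRIBL_HOME/.git/config` suggests) and that a git user name and email are already configured. `commit_config` is a hypothetical helper name.

```shell
#!/bin/sh
# commit_config DIR MESSAGE: stage and commit every pending change in DIR.
# Hypothetical helper; assumes DIR is an existing git repository.
commit_config() {
    (
        cd "$1" || exit 1
        # `git status --porcelain` prints nothing when the tree is clean.
        if [ -z "$(git status --porcelain)" ]; then
            echo "No pending changes"
            exit 0
        fi
        git add -A &&
        git commit -q -m "$2"
    )
}

# Example (assumes CRIBL_HOME is set in your environment):
# commit_config "$CRIBL_HOME" "Update pipeline settings"
```

Unlike the UI's Commit Changes modal, this sketch commits everything at once. To leave a file out, stage paths individually with `git add <path>` instead of `git add -A`.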

Pending commits have a red dot indicator, as shown below. Click Commit to proceed.

Next, in the resulting Commit Changes modal, you can verify the diff'ed configuration changes. Other options here include clearing individual files' check boxes to exclude them from the commit (as shown below), and clicking Undo to reverse the changes instead of committing them.

Reviewing a pending commit

When you're ready to commit to your commit, click Commit. Look for a Commit successful confirmation banner.

Reverting Commits

Once Git is installed, you can revert to a previous commit using the git CLI. You can also restore a Worker Group's previous commit using LogStream's UI:

Select the commit from the Config Version drop-down, as shown below.

Then, in the resulting Commit modal, verify the diff'ed configuration changes and click Revert.

Undoing earlier commits

Finally, confirm permission for LogStream to restart.
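For the git CLI route mentioned at the start of this section, one possible sketch uses `git revert`, which adds a new commit that undoes the changes of a single earlier commit. Whether the UI's Revert button performs the same operation is an assumption here, not something this page confirms. `revert_commit` is a hypothetical helper name.

```shell
#!/bin/sh
# revert_commit DIR COMMIT: add a new commit in DIR that undoes COMMIT's changes.
# Hypothetical helper; history is preserved rather than rewritten.
revert_commit() {
    (
        cd "$1" || exit 1
        # --no-edit keeps git's default "Revert ..." commit message.
        git revert --no-edit "$2"
    )
}

# Example (assumes CRIBL_HOME is set): inspect recent commits, then undo the latest.
# (cd "$CRIBL_HOME" && git log --oneline -n 5)
# revert_commit "$CRIBL_HOME" HEAD
```

After a CLI revert, LogStream presumably still needs to pick up the restored files, which matches the restart prompt in the UI flow above.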

Support For Remote Repositories

Git remote repositories are supported – but not required – for version control of all configuration changes. You can configure a Standalone Master Node with Git remote push capabilities through the LogStream CLI, or through the LogStream UI (via Settings > Distributed Settings > Git Settings).

To create a repo, see these tutorials:

📘

Currently, LogStream supports push and pull only against the master branch on each remote repo.

Several tutorial links and examples on this page point to GitHub, based on its wide adoption. The basic principles are the same for other Git repo providers, including private Git servers. GitHub's own UI and documentation periodically change, and linked tutorials' screenshots might differ from GitHub's current UI.

Remote Formats Supported

Remote URI schema patterns should match this regex:
(?:git|ssh|ftps?|file|https?|git@[-\w.]+):(\/\/)?(.*?)(\.git\/?)?$

You can find a list of supported formats here.

For example:

  • GitHub or other providers: <protocol>://git@<host>/<username>/<reponame>.git
  • Local Git servers: git://<host.xyz>:<port>/<user>/path/to/repo.git
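To try a candidate URL before entering it, a rough check can be sketched with `grep -E`. The pattern below is a translation of the documented regex into POSIX extended syntax, and both the translation and the `git@host` alternative are assumptions about the pattern's intent. `is_supported_remote` is a hypothetical helper, and the sample URLs are placeholders.

```shell
#!/bin/sh
# Assumed POSIX ERE translation of the documented pattern: non-capturing
# groups and the lazy quantifier are dropped, \w becomes [[:alnum:]_].
REMOTE_URL_PATTERN='(git|ssh|ftps?|file|https?|git@[-[:alnum:]_.]+):(//)?(.*)(\.git/?)?$'

# is_supported_remote URL: succeeds when URL matches the pattern.
is_supported_remote() {
    printf '%s\n' "$1" | grep -Eq "$REMOTE_URL_PATTERN"
}

# Placeholder URLs for illustration only.
for url in \
    "https://github.com/example-user/example-repo.git" \
    "git@github.com:example-user/example-repo.git" \
    "github.com/example-user/example-repo"
do
    if is_supported_remote "$url"; then
        echo "ok:       $url"
    else
        echo "rejected: $url"
    fi
done
```

A match only says the URL has a supported shape. It does not confirm that the repo exists or that your credentials work.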

Securing Remote Repos

❗️

Some files that are used by LogStream (both Master and Worker Groups) contain sensitive keys; examples are cribl.secret and ...auth/ssh/git.key. These will be pushed to the remote repo as part of the entire directory structure under version control. Ensure that this repo is secured appropriately.

Connecting to a Remote with a Personal Access Token over HTTPS (Recommended)

Cribl recommends connecting to a remote repo over HTTPS. The example below shows a token-based HTTPS connection to GitHub.

Example: Connecting to GitHub over HTTPS

  1. Create a new GitHub repository.

    For best results, create a new empty repo, with no readme file and no commit history. This will prevent git push errors.

    Note the user name and email associated with your login to the repo provider.

  2. Create a personal access token with repo scope.

  3. Copy the token to your clipboard.

  4. In Cribl LogStream, go to Settings > Distributed Settings > Git Settings.

  5. Fill in the Remote URL field with your repo name. Use the format below:

    https://<accesstoken>@github.com/<reponame>.git

For additional details, see GitHub's Creating a Personal Access Token tutorial.

🚧

For GitHub repos specifically, use only personal access tokens in the Remote URL field. GitHub has announced that it will end support for plaintext passwords as of August 13, 2021.

Connecting to a Remote with SSH

You can set up SSH keys from the CLI, or upload keys via the UI. If you have a passphrase set, this functionality is available only through the CLI – see Encryption: Configuring Keys with the CLI. The example below outlines the UI steps.

Example: Connecting to GitHub with SSH

  1. Create a new GitHub repository.

    For best results, create a new empty repo, with no readme file and no commit history. This will prevent git push errors.

    Note the user name and email associated with your login to the repo provider.

  2. Add an SSH public key to your GitHub account.

  3. In Cribl LogStream, go to Settings > Distributed Settings > Git Settings.

  4. Fill in the remote repo URL and the SSH private key. In the example format below, replace <username> with your user name on the repo provider:

    Remote URL: <protocol>://git@<host>:<username>/<reponame>.git
    SSH private key: <ssh-private-key>

For GitHub specifically, the URL/protocol format must be:

Remote URL: git@github.com:<user>/<reponame>.git

For example:

Remote URL: git@github.com:taylorswift/leadsheets.git

  5. As the user running LogStream, run this command to add the GitHub keys to known_hosts:
    ssh-keyscan -H github.com >> ~/.ssh/known_hosts

For additional details, see GitHub's Connecting to GitHub with SSH tutorial.

LogStream's Git settings

Additional Git Settings

On the Git Settings > General tab, you can change the Authentication Type from its SSH default to Basic authentication. This displays two additional fields:

  • User: Username on the repo.

  • Password: Authentication password (e.g., a GitHub personal access token).

Git Authentication Type settings

On the Git Settings > Scheduled Actions tab, you can schedule a Commit, Push, or Commit & Push action to occur on a predefined interval.

Git Scheduled Actions selection

For the selected action type, you can define a cron schedule, and a commit message distinct from the General tab's Default Commit Message. Then click Save.

Saving a Git Scheduled Action

You can schedule only one type of action. To swap to a different type, select it from the Scheduled global actions drop-down, and resave. To turn off scheduled Git commands, select None from the drop-down, and resave.

Pushing to a Remote Repo

Once you've configured a remote, a Git Push button appears in the Version Control overlay.

Git Push button

If you enabled the Git Settings > Collapse Actions option, you will instead see a combined Commit & Push button (or, for changes made on individual Worker Groups, a combined Commit & Deploy button) in the overlay.

Git combined actions button

Git combined actions button for a Worker Group

Troubleshooting Push Errors

This section anticipates common errors you might see in LogStream's UI, or in the git CLI, when pushing a commit.

Failed to Push Some Refs

Your first push to a remote repo might fail with one of several failed to push some refs errors.

As a first step in debugging these errors, edit the $CRIBL_HOME/.git/config file to make sure that its name and email key values match the credentials you've set on your repo provider or git server.

Also make sure that the remote "origin" key value matches the remote you set when you connected to the remote repo. This example shows all three keys, with placeholder values:

[user]
    name = <your-login-name>
    email = <your-email-address>
[remote "origin"]
    url = https://<user-name>:<token>@github.com/<username>/<repo-name>

Next, verify the remote repo from the command line, as follows:

cd $CRIBL_HOME/.git
git remote -v 

In response, git should echo your configured remote twice – once for fetch and once for push operations.
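The same three values can also be read without opening the config file. The sketch below uses `git config --get`, and `show_git_identity` is a hypothetical helper name. Note that `git config --get` reports the effective value, so a name or email set in your global git config will appear here even if the repo's own config file lacks it.

```shell
#!/bin/sh
# show_git_identity DIR: print the user and remote settings that git will
# use in DIR. Hypothetical helper; DIR is assumed to be a git repository.
show_git_identity() {
    (
        cd "$1" || exit 1
        for key in user.name user.email remote.origin.url; do
            # `git config --get` exits non-zero when the key is unset.
            value=$(git config --get "$key") || value="(not set)"
            echo "$key = $value"
        done
    )
}

# Example (assumes CRIBL_HOME is set in your environment):
# show_git_identity "$CRIBL_HOME"
# Caution: the remote URL may embed an access token, so avoid sharing the output.
```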

If all of the above settings are correct, the push is most likely being rejected because the remote repo already has some commit history, or was simply created with a readme.md file. For command-line instructions to remedy this – by syncing your local repo to its remote – see GitHub's Dealing with Non-Fast-Forward Errors topic.

Large Files Detected

A push command might also trigger "large file" warnings or, more seriously, errors of this form (CLI/GitHub example):

remote: warning: File data/lookups/geo.mmdb is 60.12 MB; this is larger than GitHub's recommended maximum file size of 50.00 MB
remote: error: GH001: Large files detected. You may want to try Git Large File Storage - https://git-lfs.github.com.
remote: error: Trace: [################################################################]
remote: error: See http://git.io/iEPt8g for more information.
remote: error: File groups/default/data/lookups/largelookup.csv is 313.91 MB; this exceeds GitHub's file size limit of 100.00 MB

Cribl recommends adding such large files to .gitignore, to exclude them from subsequent push commands. As the above examples show, typical culprits are large .csv or .mmdb lookup files. A simple option is to place these files in a $CRIBL_HOME subdirectory that's already listed in .gitignore – for details, see Managing Large Lookups.

Other available workarounds include staging such files outside $CRIBL_HOME, or using plugins to accommodate the large files. For GitHub-specific options, see Working with Large Files.
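To find candidates before a push fails, a sketch like the following lists files above a size threshold. The 50 MB default mirrors the GitHub warning quoted above. `list_large_files` is a hypothetical helper, and the output paths are relative to the directory so they can be pasted into .gitignore.

```shell
#!/bin/sh
# list_large_files DIR [SIZE_MB]: print files under DIR larger than SIZE_MB
# (default 50), relative to DIR, skipping git's own .git directory.
list_large_files() {
    size_mb=${2:-50}
    (
        cd "$1" || exit 1
        # `-size +NM` matches files above N mebibytes; sed strips the "./" prefix.
        find . -type f -size +"${size_mb}M" ! -path './.git/*' | sed 's|^\./||'
    )
}

# Example (assumes CRIBL_HOME is set in your environment):
# list_large_files "$CRIBL_HOME"
# list_large_files "$CRIBL_HOME" 100
```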

Restoring Master from a Remote Repo

If a remote repo is configured and has the latest known good Master configuration, this section outlines the general steps to restore the config from that repo.

Restoring from remote repo

Let's assume that the entire $CRIBL_HOME directory of the Master is corrupted, or you're starting from scratch. Let's also assume that the remote repo has the form:
git@github.com:<username>/<reponame>.git.

  1. Important: In a directory of choice, untar the same Cribl LogStream version that you're trying to restore, but do not start it.

  2. If you are using SSH key authentication, specify the key by prefixing git commands that contact the remote with GIT_SSH_COMMAND, as in the following command:

GIT_SSH_COMMAND='ssh -i .key -o IdentitiesOnly=yes' git fetch origin
  3. Ensure that you have proper access to the remote repo:
# git ls-remote git@github.com:<username>/<reponame>.git
5631fabb4822eaec4ca0ffd008d6e9974c1e419f	HEAD
5631fabb4822eaec4ca0ffd008d6e9974c1e419f	refs/heads/master
  4. Change directory into $CRIBL_HOME and initialize git:
    # git init

  5. Next, add/configure the remote:
    # git remote add origin git@github.com:<username>/<reponame>.git

  6. Now set up your local branch to exactly match the remote branch:
    # git fetch origin
    # git reset --hard origin/master

  7. Finally, to confirm that the commits match, run this command while in $CRIBL_HOME. Note the commit hash:

# git show --abbrev-commit
commit 5631fab (HEAD -> master, origin/master)
Author: First Last 
Date:   Fri Jan 31 10:16:07 2020 -0500
admin: Last commit before failure/crash

......

The git fetch and git reset commands above pull in all the latest configs from the remote repo, so you should be able to start the Master as normal. Once it is up and running, Workers should start checking in after about 60 seconds.
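The final comparison can also be scripted rather than read by eye. This sketch assumes the remote is named origin and the branch is master, as in the steps above. `verify_restore` is a hypothetical helper name. It uses `cd` rather than `git -C`, because `git -C` is not available in git 1.8.3.1.

```shell
#!/bin/sh
# verify_restore DIR: succeed when DIR's HEAD is the same commit as
# origin/master. Hypothetical helper; run it after the fetch and reset.
verify_restore() {
    (
        cd "$1" || exit 1
        local_head=$(git rev-parse HEAD) || exit 1
        remote_head=$(git rev-parse origin/master) || exit 1
        if [ "$local_head" = "$remote_head" ]; then
            echo "HEAD matches origin/master at $local_head"
        else
            echo "Mismatch: HEAD=$local_head origin/master=$remote_head"
            exit 1
        fi
    )
}

# Example (assumes CRIBL_HOME is set in your environment):
# verify_restore "$CRIBL_HOME" && echo "Safe to start the Master"
```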

🚧

Verify cribl.secret

The cribl.secret file – located at $CRIBL_HOME/local/cribl/auth/cribl.secret – contains the secret key that is used to encrypt sensitive settings on configuration files (e.g., AWS Secret Access Key, etc.). Make sure this file is properly restored on the new Master, because it is required to make encrypted conf file settings usable again.

.gitignore File

A .gitignore file specifies files that git should ignore when tracking changes. Each line specifies a pattern, which should match a file path to be ignored. Cribl LogStream ships with a .gitignore file containing a number of patterns/rules, under a section of the file labeled CRIBL SECTION.

# Do NOT REMOVE CRIBL and CUSTOM header lines!
# DO NOT REMOVE rules under the CRIBL section as they may be reintroduced on update.
# You can ONLY comment out rules in the CRIBL section.
# You can add new rules in the CUSTOM section.
### CRIBL SECTION -- DO NOT REMOVE ###
default/ui/**
default/data/ui/**
bin/**
log/**
pid/**
data/uploads/**
diag/**
**/state/**
### CUSTOM SECTION -- DO NOT REMOVE ###

<User defined patterns/rules go here>

CRIBL Section

🚧

Do Not Remove CRIBL SECTION or CUSTOM SECTION Headers

The CRIBL SECTION is used by Cribl LogStream to define default patterns/rules that ship with every version. Do not add or remove any of the lines here, because Chuck Norris will easily find you!

Maslow's theory of higher needs does not apply to Chuck Norris. He has only two needs: killing people and finding people to kill. Seriously, do not remove them, as they will be overwritten on the next update. The only modifications that will survive updates are commented lines.

CUSTOM Section

User-defined, custom patterns/rules can be safely defined under the CUSTOM SECTION.
Cribl LogStream will not modify the contents of CUSTOM SECTION.

Good candidates to add here include large lookup files – especially large binary database files. See Troubleshooting: Large Files Detected, above.
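As a sketch of adding such a rule, the helper below appends a pattern to the end of .gitignore and then asks git whether a given path is now ignored. It assumes that the CUSTOM SECTION is the last section of the file (as in the listing above) and that the file ends with a newline, so an appended line lands in that section. `add_custom_ignore` and `is_ignored` are hypothetical names, and the pattern in the example is only an illustration.

```shell
#!/bin/sh
# add_custom_ignore DIR PATTERN: append PATTERN to DIR/.gitignore unless an
# identical line already exists. Assumes CUSTOM SECTION is the file's last section.
add_custom_ignore() {
    ignore_file="$1/.gitignore"
    # -x matches whole lines only; -F treats the pattern as a fixed string.
    if ! grep -qxF -- "$2" "$ignore_file" 2>/dev/null; then
        printf '%s\n' "$2" >> "$ignore_file"
    fi
}

# is_ignored DIR PATH: succeed when git would ignore PATH (relative to DIR).
is_ignored() {
    ( cd "$1" && git check-ignore -q "$2" )
}

# Example (assumes CRIBL_HOME is set; the pattern is illustrative):
# add_custom_ignore "$CRIBL_HOME" 'data/lookups/*.mmdb'
# is_ignored "$CRIBL_HOME" data/lookups/geo.mmdb && echo "ignored"
```

Note that .gitignore only affects untracked files. A file that was already committed stays tracked until you remove it from the index with `git rm --cached <path>`.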

Files skipped with .gitignore

If you have files that are skipped with .gitignore, you will need to back them up and restore them via means other than Git. For example, you can periodically copy/rsync them to a backup destination, and then restore them to their original locations after you complete the steps above.
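To see which files fall into this category, you can ask git for the untracked files that match an ignore rule. The sketch below wraps `git ls-files` for that purpose. `list_ignored_files` is a hypothetical helper, and the rsync line is an untested illustration with a placeholder destination.

```shell
#!/bin/sh
# list_ignored_files DIR: print untracked files in DIR that match an ignore
# rule, relative to DIR. Hypothetical helper; DIR must be a git repository.
list_ignored_files() {
    (
        cd "$1" || exit 1
        # --others: untracked files; --ignored: only those matching an ignore rule;
        # --exclude-standard: apply .gitignore and git's other standard excludes.
        git ls-files --others --ignored --exclude-standard
    )
}

# Example (assumes CRIBL_HOME is set; /backup/cribl is a placeholder path):
# list_ignored_files "$CRIBL_HOME" |
#     rsync -a --files-from=- "$CRIBL_HOME/" /backup/cribl/
```

The list will likely include files that do not need a backup, such as binaries and logs matched by the CRIBL SECTION rules, so filter it down to the lookups and other data you actually need to restore.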

Updated 23 days ago
