Manage Config Bundles
Cribl Stream uses config bundles, compressed archives containing configuration files and data essential for Worker operation.
Bundle Management in Cribl.Cloud
In Cribl.Cloud, bundle storage and delivery are fully managed. You do not need to configure Amazon S3 buckets or remote URLs for Cribl-managed Worker Groups and hybrid Worker Groups.
- Primary path (S3 over HTTPS): Cribl-managed Workers first attempt to download the new config bundle from a Cribl-managed Amazon S3 bucket over HTTPS (port 443).
- Automatic fallback (Leader over HTTPS): If the Worker cannot reach the S3 endpoint (for example, because of firewall or proxy rules), it automatically falls back to downloading the same bundle from the Leader over HTTPS (port 4200).

For details on ports, see Ports and Required Ports in Cribl.Cloud.
For details on preventing deployment delays by allowlisting the required S3 and Leader endpoints, see Network Access for Config Bundles.
Bundle Management in Customer-Managed Deployments
For Workers you manage, you control bundle storage:
- Default: The Leader serves bundles directly from its local storage.
- Optional S3: You can offload bundle storage to your own S3 bucket to reduce the load on your Leader Node. For details, see Store Bundles Remotely.
Bundle Lifecycle and Retention
Regardless of the delivery method, the Leader and Worker Nodes automatically manage the lifecycle of config bundle archives.
Leader Node Bundles
The Leader Node actively manages bundles, performing the following tasks:
- Cleanup on startup: Upon startup, the Leader Node clears all bundles.
- Running state: The Leader Node maintains a maximum of five bundles per Worker Group while running.
- Automatic cleanup: Creating a new bundle triggers the deletion of older ones.
Worker Node Bundles
The Worker Node pulls bundles from the Leader Node and manages them as follows:
- Cache: Worker Nodes cache the latest five bundles and their backups.
- Recent file retention: Worker Nodes retain any files created within the last ten minutes.
- Reconfigure cleanup: Reconfigure events trigger bundle cleanup.
Store Bundles Remotely
For customer-managed Leaders, you can optionally reduce Leader load by storing bundles in an Amazon S3 bucket. This offloads distribution, allowing Worker Nodes and Edge Nodes to pull bundles directly from S3.
Configure S3 Storage
You can configure S3 bundle storage in any of the following ways:
- UI: Configure the Leader settings on the Leader Node. In the sidebar, select Settings. On the Global tab, select System > Distributed Settings > Leader Settings. In the S3 Bundle Bucket URL field, specify an S3 bucket for remote bundle storage. Format: s3://${bucket}.
- YAML: Define the S3 bucket URL in the master.configBundles.remoteUrl property of your instance.yml configuration file (see the sketch after this list).
- Environment variable: Set the CRIBL_DIST_LEADER_BUNDLE_URL environment variable.
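For the YAML option, the following is a minimal instance.yml sketch. The nesting shown for master.configBundles.remoteUrl is inferred from the property name and may differ in your version, and the bucket name is a placeholder; the UI and environment-variable methods above achieve the same result.

# instance.yml on the Leader Node -- nesting assumed from the property name; verify against your deployment
master:
  configBundles:
    remoteUrl: s3://your-bucket-name   # placeholder bucket; same value you would set in CRIBL_DIST_LEADER_BUNDLE_URL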
Optimized Bundle Delivery
Cribl Stream prioritizes efficient bundle downloads for Worker Nodes. Here’s how:
- Direct S3 downloads (preferred): When possible, Cribl prioritizes direct downloads from Amazon S3 for optimal performance.
- Private network fallback: Worker Nodes on private networks attempt to retrieve bundles from a Content Delivery Network (CDN). If blocked, they automatically fall back to downloading from the Leader Node.
- Optional S3 firewall allowlist: For Worker Nodes with internet access, consider allowlisting the regional Amazon S3 endpoint used by your bundle bucket in your firewall or proxy configuration. For example, endpoints look like s3.eu-central-1.amazonaws.com or s3.us-west-2.amazonaws.com, and dual-stack variants look like s3.dualstack.us-west-2.amazonaws.com or bucket-name.s3.dualstack.us-west-2.amazonaws.com.
S3 endpoints are regional and do not share a single wildcard hostname, so patterns such as *.s3.amazonaws.com are not sufficient for firewall or proxy allowlists. Refer to the AWS documentation: Using Amazon S3 dual-stack endpoints for the complete and current list of S3 regional and dual-stack endpoints.
Authenticating to S3
When storing configuration bundles in an Amazon S3 bucket, Cribl Stream needs appropriate credentials to access the bucket. There are several ways to provide these credentials, with IAM roles being the most secure and recommended approach.
IAM Roles (Recommended)
The best practice is to configure an IAM role with read access to your S3 bucket and attach this role to the EC2 instance (or other AWS service) where your Cribl Stream Leader and Worker Nodes are running. Cribl Stream will automatically use the instance’s IAM role to authenticate with S3. This method avoids the need to manage AWS credentials directly within Cribl and is the most secure option.
To authenticate via IAM roles:
- Create an IAM role with the AmazonS3ReadOnlyAccess policy (or a custom policy with more restrictive permissions if needed; a sample policy sketch follows below) for the S3 bucket containing your bundles.
- Attach this IAM role to the EC2 instance(s) running your Cribl Stream Leader and Worker Nodes.
- Provide the path to your S3 bucket using the CRIBL_DIST_LEADER_BUNDLE_URL environment variable or the S3 Bundle Bucket URL field in the Distributed/Leader settings (for example, s3://your-bucket-name/path/to/bundles). The path/to/bundles portion is optional.
No further configuration is required. Cribl Stream will automatically discover and use the instance’s IAM role.
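If you opt for a custom policy instead of AmazonS3ReadOnlyAccess, the sketch below shows one way to scope read access to a single bucket. The bucket name is a placeholder, and the exact actions Cribl Stream requires are not listed in this section, so treat s3:GetObject and s3:ListBucket as assumptions and verify them for your deployment. If your Leader uploads bundles to the same bucket, it will likely also need write permissions (for example, s3:PutObject).

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ListBundleBucket",
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::your-bucket-name"
    },
    {
      "Sid": "ReadBundleObjects",
      "Effect": "Allow",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::your-bucket-name/*"
    }
  ]
}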
AWS Access Keys
Alternatively, you can provide AWS access keys directly. Ensure that the environment in which Cribl Stream runs contains the following variables:
export AWS_ACCESS_KEY_ID="YOUR_ACCESS_KEY_ID"
export AWS_SECRET_ACCESS_KEY="YOUR_SECRET_ACCESS_KEY"
export CRIBL_DIST_LEADER_BUNDLE_URL="s3://your-bucket-name/path/to/bundles"

Encryption and Backward Compatibility
If your Workers are running versions older than 4.9.0 and your Leader is on 4.9.0 or newer, deployments will always use Leader-hosted bundles. This happens because of how encryption works:
- Newer Leaders encrypt bundles before uploading them to S3, to prevent the exposure of sensitive information such as secret keys.
- Older Workers verify each downloaded bundle by comparing its checksum (a unique identifier) against the value they expect.
- Because encryption changes the bundle's checksum, older Workers always detect a mismatch when verifying the encrypted bundle, so verification of the S3-hosted bundle fails.
You might see an error similar to the one below in your logs:
Checksum mismatch, expected=f781b2bace2e1869e840b0ee200b57378ecae985, found=a59af697c140521ea7f42af7630ac1fce88e4ff0