These docs are for Cribl Stream 4.11 and are no longer actively maintained.
See the latest version (4.14).
Amazon S3 Destination
The Data Lake Amazon S3 Destination sends data to Amazon Simple Storage Service (Amazon S3) and uses a partitioning scheme that is designed to work with Cribl Search. You can organize your bucket structure by event fields for referencing in Cribl Search. See the Cribl Search setup for Data Lake Amazon S3 for details.
Type: Non-Streaming | TLS Support: Yes | PQ Support: No
Configuring Cribl Stream to Output to S3 Destinations
From the top nav, click Manage, then select a Worker Group to configure. Next, you have two options:
To configure via the graphical QuickConnect UI. Next, click Add Destination at right. From the resulting drawer’s tiles, select Data Lakes > Amazon S3. Next, click either Add Destination or (if displayed) Select Existing. The resulting drawer will provide the options below.
Or, to configure via the Routing UI, click Data > Destinations. From the resulting page’s tiles or the Destinations left nav, select Data Lakes > Amazon S3. Next, click Add Destination to open a New Destination modal that provides the options below.
General Settings
Output ID: Enter a unique name to identify this S3 definition.
S3 bucket name: Name of the destination S3 Bucket. This value can be a constant, or a JavaScript expression that will be evaluated only at init time. For example, referencing a Global Variable: myBucket-${C.vars.myVar}.
Optional Settings
Partition by fields: List of fields to partition the path by, in addition to time (which is included automatically).
The partitioning scheme is designed to work with Cribl Search. This scheme optimizes how objects are searched relative to the search’s time range.
In AWS S3, bucket content is returned in ascending alphanumeric order, by default, they are sorted “oldest-first”. This Destination uses alpha characters to prefix objects, so ascending sort returns them in “newest-first” order, allowing Search to find files that will likely have the requested objects soonest.
The effective partition will be <4-letters>-YYYY/<2-letters>-MM/<2-letters>-DD/<2-letters>-HH/<list/of/fields>. For example, hjhh-2023/ag-07/ai-25/aj-14/<list/of/fields>.
In Cribl Search, if you create a corresponding Data Lake Amazon S3 Dataset, the UI will automatically render the path with wildcards and tokens in this format: <path>/${*}-${_time:%Y}/${*}-${_time:%m}/${*}-${_time:%d}/${*}-${_time:%H}/<list/of/fields>.
Backpressure behavior: Select whether to block or drop events when all receivers are exerting backpressure. (Causes might include an accumulation of too many files needing to be closed.) Defaults to Block.
Tags: Optionally, add tags that you can use to filter and group Destinations in Cribl Stream’s Manage Destinations page. These tags aren’t added to processed events. Use a tab or hard return between (arbitrary) tag names.
Authentication
Use the Authentication method drop-downs to select one of these options:
Auto: This default option uses the AWS SDK for JavaScript to automatically obtain credentials in the following order of attempts:
- IAM Roles for Amazon EC2: Loaded from AWS Identity and Access Management (IAM) roles attached to an EC2 instance.
- Shared Credentials File: Loaded from the shared credentials file (~/.aws/credentials).
- Environment Variables: Loaded from environment variables AWS_ACCESS_KEY_IDandAWS_SECRET_ACCESS_KEY.
- JSON File on Disk: Loaded from a JSON file on disk.
- Other Credential-Provider Classes: Other credential-provider classes provided by the AWS SDK for JavaScript.
The Auto method works both when running on AWS and in other environments where the necessary credentials are available through one of the above methods.
SSO Providers
When using the auto authentication method, you can leverage SSO providers like SAML and Okta to issue temporary credentials. These credentials should be set in the environment variables
AWS_ACCESS_KEY_IDandAWS_SECRET_ACCESS_KEY. The AWS SDK will then use these environment variables to authenticate.
Manual: If not running on AWS, you can select this option to enter a static set of user-associated IAM credentials (your access key and secret key) directly or by reference. This is useful for Workers not in an AWS VPC, for example, those running a private cloud. The Manual option exposes these corresponding additional fields:
- Access key: Enter your AWS access key. If not present, will fall back to the - env.AWS_ACCESS_KEY_IDenvironment variable, or to the metadata endpoint for IAM role credentials.
- Secret key: Enter your AWS secret key. If not present, will fall back to the - env.AWS_SECRET_ACCESS_KEYenvironment variable, or to the metadata endpoint for IAM credentials.
The values for Access key and Secret key can be a constant, or a JavaScript expression (such as ${C.env.MY_VAR}) enclosed in quotes or backticks, which allows configuration with environment variables.
Secret: If not running on AWS, you can select this option to supply a stored secret that references an AWS access key and secret key. The Secret option exposes this additional field:
- Secret key pair: Use the drop-down to select an API key/secret key pair that you’ve configured in Cribl Stream’s secrets manager. A Create link is available to store a new, reusable secret.
Assume Role
When using Assume Role to access resources in a different Region than Cribl Stream, you can target the AWS Security Token Service (STS) endpoint specific to that Region by using the CRIBL_AWS_STS_REGION environment variable on your Worker Node. Setting an invalid Region results in a fallback to the global STS endpoint.
Enable for S3: Toggle on to use Assume Role credentials to access S3.
AssumeRole ARN: Enter the Amazon Resource Name (ARN) of the role to assume.
External ID: Enter the External ID to use when assuming role. This is required only when assuming a role that requires this ID in order to delegate third-party access. For details, see AWS’ documentation.
Duration (seconds): Duration of the Assumed Role’s session, in seconds. Minimum is 900 (15 minutes). Maximum is 43200 (12 hours). Defaults to 3600 (1 hour).
Processing Settings
Post‑Processing
Pipeline: Pipeline or Pack to process data before sending the data out using this output.
System fields: A list of fields to automatically add to events that use this output. By default, includes cribl_pipe (identifying the Cribl Stream Pipeline that processed the event). Supports c* wildcards. Other options include:
- cribl_host– Cribl Stream Node that processed the event.
- cribl_input– Cribl Stream Source that processed the event.
- cribl_output– Cribl Stream Destination that processed the event.
- cribl_route– Cribl Stream Route (or QuickConnect) that processed the event.
- cribl_wp– Cribl Stream Worker Process that processed the event.
Advanced Settings
Max file size (MB): Maximum uncompressed output file size. Files of this size will be closed and moved to final output location. Defaults to 32.
Max file open time (sec): Maximum amount of time to write to a file. Files open for longer than this limit will be closed and moved to final output location. Defaults to 300.
Max file idle time (sec): Maximum amount of time to keep inactive files open. Files open for longer than this limit will be closed and moved to final output location. Defaults to 30.
Max open files: Maximum number of files to keep open concurrently. When this limit is exceeded, on any individual Worker Process, Cribl Stream will close the oldest open files, and move them to the final output location. Defaults to 100.
Cribl Stream will close files when any of the four above conditions is met.
Max concurrent file parts: Maximum number of parts to upload in parallel per file. A value of 1 tells the Destination to send the whole file at once. When set to 2 or above, IAM permissions must include those required for multipart uploads. Defaults to 4; highest allowed value is 10.
Disk space protection: Specifies whether to Block (default) or Drop incoming events when the disk space falls below the globally defined Min free disk space amount.
Add Output ID: When toggled on (default), adds the Output ID field’s value to the staging location’s file path. This ensures that each Destination’s logs will write to its own bucket.
For a Destination originally configured in a Cribl Stream version below 2.4.0, the Add Output ID behavior will be switched off on the backend, regardless of this toggle’s state. This is to avoid losing any files pending in the original staging directory, upon Cribl Stream upgrade and restart. To enable this option for such Destinations, Cribl’s recommended migration path is:
- Clone the Destination.
- Redirect the Routes referencing the original Destination to instead reference the new, cloned Destination.
This way, the original Destination will process pending files (after an idle timeout), and the new, cloned Destination will process newly arriving events with Add output ID enabled.
Remove staging dirs: Toggle this on to delete empty staging directories after moving files. This prevents the proliferation of orphaned empty directories. When enabled, exposes this additional option:
- Staging cleanup period: How often (in seconds) to delete empty directories when Remove staging dirs is enabled. Defaults to 300seconds (every 5 minutes). Minimum configurable interval is10seconds; maximum is86400seconds (every 24 hours).
Enable dead-lettering: Toggle this on to set a maximum number of retries, and to move files to a designated directory when write failures exceed that limit. This prevents data flow blockage and excessive error logging due to undeliverable files. When enabled, exposes two additional fields:
- Dead-letter location: Specify the storage location for undeliverable files. Defaults to $CRIBL_HOME/state/outputs/dead-letter.
- Maximum retry limit: Configure the retry limit for failed file deliveries. This setting defines how many times the system will attempt to move a file to its intended location before it is deemed undeliverable and placed in the dead-letter directory. Defaults to 20.
Endpoint: S3 service endpoint. If empty, Cribl Stream will automatically construct the endpoint from the AWS Region. To access the AWS S3 endpoints, use the path-style URL format. You don’t need to specify the bucket name in the URL, because Cribl Stream will automatically add it to the URL path. For details, see AWS’ Path‑Style Requests topic.
Object ACL: Object ACL (Access Control List) to assign to uploaded objects.
Storage class: Select a storage class for uploaded objects. Defaults to Standard. The other options are: Reduced Redundancy Storage; Standard, Infrequent Access; One Zone, Infrequent Access; Intelligent Tiering; Glacier; or Deep Archive.
Server-side encryption: Encryption type for uploaded objects – used to enable encryption on data at rest. Defaults to no encryption; the other options are Amazon S3 Managed Key or AWS KMS Managed Key.
AWS S3 always encrypts data in transit using HTTPS, with default one-way authentication from server to clients. With other S3-compatible stores (such as our native MinIO Destination), use an
https://URL to invoke in-transit encryption. Two-way authentication is not required to get encryption, and requires clients to possess a certificate.
Signature version: Signature version to use for signing S3 requests. Defaults to v4.
Reuse connections: Whether to reuse connections between requests. Toggling on (default) can improve performance.
Reject unauthorized certificates: Whether to accept certificates that cannot be verified against a valid Certificate Authority (for example, self-signed certificates). Defaults to toggled on.
Environment: If you’re using GitOps, optionally use this field to specify a single Git branch on which to enable this configuration. If empty, the config will be enabled everywhere.
IAM Permissions
The following permissions are always needed to write to an Amazon S3 bucket:
- s3:ListBucket
- s3:GetBucketLocation
- s3:PutObject
If your Destination needs to do multipart uploads to S3, two more permissions are needed:
- kms:GenerateDataKey
- kms:Decrypt
See the AWS documentation.
Internal Fields
Cribl Stream uses a set of internal fields to assist in forwarding data to a Destination.
Field for this Destination:
- __partition
Proxying Requests
If you need to proxy HTTP/S requests, see System Proxy Configuration.