These docs are for Cribl Stream 4.12 and are no longer actively maintained.
See the latest version (4.13).
Zscaler Cloud NSS Source
The Zscaler Cloud NSS Source receives log data over HTTP/S from Zscaler Cloud Nanolog Streaming Service (NSS).
If you’re sending from a Zscaler NSS Virtual Machine, use a Syslog Source over TCP.
Type: Push | TLS Support: YES | Event Breaker Support: YES
This Source supports gzip-compressed inbound data when the
Content‑Encoding: gzip
connection header is set.
Zscaler Cloud NSS Feed
When adding a Zcaler Cloud NSS feed, configure the following:
- API URL: The URL varies depending on your Cribl environment. Use the same port and HEC endpoint you define in your Zscaler Cloud NSS Source.
- Cribl.Cloud: Enter the Workspace Public Ingress Address in the following format:
https://<groupName>.<workspaceName>.<organizationId>.cribl.cloud:<port><hec-endpoint>
- On-prem: Enter the IP address or FQDN for either your Cribl Stream instance, or the load balancer you’re using with your Cribl Stream instances.
- Cribl.Cloud: Enter the Workspace Public Ingress Address in the following format:
- HTTP Headers:
- Key 1: Enter
Authorization
. - Value 1: Enter the Auth Token from your Cribl Source.
- Key 2: Enter
Content-Encoding
. - Value 2: Enter
gzip
.
- Key 1: Enter
Configure a Zscaler Cloud NSS Source
On the top bar, select Products, and then select Cribl Stream. Under Worker Groups, select a Worker Group. Next, you have two options:
- To configure via QuickConnect, navigate to Routing > QuickConnect (Stream) or Collect (Edge). Select Add Source and select the Source you want from the list, choosing either Select Existing or Add New.
- To configure via the Routes, select Data > Sources (Stream) or More > Sources (Edge). Select the Source you want. Next, select Add Source.
Configure the following under General Settings:
- Input ID: Enter a unique name to identify this Source definition. If you clone this Source, Cribl Stream will add
-CLONE
to the original Input ID. - Description: Optionally, enter a description.
The next three settings configure the location where the Source listens for incoming HTTP(S) data.- Address: Enter the hostname/IP. Such as:
localhost
or0.0.0.0
. Supports IPv4 and IPv6 addresses. - Port: Enter the network port.
- HEC endpoint: Enter the absolute path. Defaults to
/services/collector
.
- Input ID: Enter a unique name to identify this Source definition. If you clone this Source, Cribl Stream will add
Next, you can configure the following Optional Settings:
- Allowed Indexes: List the values allowed in the HEC event index field. Allows wildcards. Leave blank to skip validation.
- Zscaler HEC Acks: Whether to respond with a
200
HTTP status code acknowledging receipt of theACK
request. If toggled off (default), the Source responds with a400
code to indicateACK
processing is disabled. - Tags: Optionally, add tags that you can use to filter and group Sources in Cribl Stream’s UI. These tags aren’t added to processed events. Use a tab or hard return between (arbitrary) tag names.
Optionally, you can enable TLS and persistent queue, then configure processing and advanced settings.
Create an Auth Token. You’ll provide the token in an HTTP header with the key
Authorization
, see the Overview section for details.Select Save, then Commit & Deploy.
TLS Settings (Server Side)
Enabled: Defaults to toggled off. When toggled on:
Certificate name: Name of the predefined certificate.
Private key path: Server path containing the private key (in PEM format) to use. Path can reference $ENV_VARS
.
Passphrase: Passphrase to use to decrypt private key.
Certificate path: Server path containing certificates (in PEM format) to use. Path can reference $ENV_VARS
.
CA certificate path: Server path containing CA certificates (in PEM format) to use. Path can reference $ENV_VARS
.
Authenticate client (mutual auth): Require clients to present their certificates. Used to perform mutual authentication using SSL certs. Default is toggled off. When toggled on:
Validate client certificates: Reject certificates that are not authorized by a CA in the CA certificate path, or by another trusted CA (such as, the system’s CA). Default is toggled off.
Common name: Regex that a peer certificate’s subject attribute must match in order to connect. Defaults to
.*
. Matches on the substring afterCN=
. As needed, escape regex tokens to match literal characters. (For example, to match the subjectCN=worker.cribl.local
, you would enter:worker\.cribl\.local
.) If the subject attribute contains Subject Alternative Name (SAN) entries, the Source will check the regex against all of those but ignore the Common Name (CN) entry (if any). If the certificate has no SAN extension, the Source will check the regex against the single name in the CN.
Minimum TLS version: Optionally, select the minimum TLS version to accept from connections.
Maximum TLS version: Optionally, select the maximum TLS version to accept from connections.
Persistent Queue Settings
In the Persistent Queue Settings tab, you can optionally specify persistent queue storage, using the following controls. Persistent queue buffers and preserves incoming events when a downstream Destination has an outage or experiences backpressure.
Before enabling persistent queue, learn more about persistent queue behavior and how to optimize it with your system:
On Cribl-managed Cloud Workers (with an Enterprise plan), this tab exposes only the Enable persistent queue toggle. If enabled, PQ is automatically configured in
Always On
mode, with a maximum queue size of 1 GB disk space allocated per PQ‑enabled Source, per Worker Process.The 1 GB limit is on uncompressed inbound data, and the queue does not perform any compression. This limit is not configurable. For configurable queue size, compression, mode, and other options below, use a hybrid Group.
Enable persistent queue: Default is toggled off. When toggled on:
Mode: Select a condition for engaging persistent queues.
Always On
: This default option will always write events to the persistent queue, before forwarding them to the Cribl Stream data processing engine.Smart
: This option will engage PQ only when the Source detects backpressure from the Cribl Stream data processing engine.
Smart
mode only engages when necessary, such as when a downstream Destination becomes blocked and the Buffer size limit reaches its limit. When persistent queue is set toSmart
mode, Cribl attempts to flush the queue when every new event arrives. The only time events stay in the buffer is when a downstream Destination becomes blocked.
Buffer size limit: The maximum number of events to hold in memory before reporting backpressure to the sender and writing the queue to disk. Defaults to 1000
. This buffer is for all connections, not just per Worker Process. For that reason, this can dramatically expand memory usage. Connections share this limit, which may result in slightly lower throughput for higher numbers of connections. For higher numbers of connections, consider increasing the limit.
Commit frequency: The number of events to send downstream before committing that Stream has read them. Defaults to 42
.
File size limit: The maximum data volume to store in each queue file before closing it and (optionally) applying the configured Compression. Enter a numeral with units of KB, MB, and so forth. If not specified, Cribl Stream applies the default 1 MB
.
Queue size limit: The maximum amount of disk space that the queue is allowed to consume on each Worker Process. Once this limit is reached, this Source will stop queueing data and block incoming data. Required, and defaults to 5
GB. Accepts positive numbers with units of KB
, MB
, GB
, and so forth. Can be set as high as 1 TB
, unless you’ve configured a different Worker Process PQ size limit in Group or Fleet settings.
Queue file path: The location for the persistent queue files. Defaults to $CRIBL_HOME/state/queues
. To this field’s specified path, Cribl Stream will append /<worker-id>/inputs/<input-id>
.
Compression: Optional codec to compress the persisted data after a file closes. Defaults to None
; Gzip
is also available.
In Cribl Stream 4.1 and newer, the Source persistent queue default Mode is
Always on
, to best ensure events’ delivery. For details on optimizing this selection, see Optimize Source Persistent Queues (sPQ).You can optimize Workers’ startup connections and CPU load at Group/Fleet settings > Worker Processes.
Processing Settings
Fields
In this section, you can add Fields to each event using Eval-like functionality.
Name: Field name.
Value: JavaScript expression to determine field’s value (can be a constant).
Fields specified on the Fields tab will normally override fields of the same name in events. But you can specify that fields in events should override these fields’ values.
In particular, where incoming events have no
index
field, this Source adds one with the literal valuedefault
. You can override this value by using Add Field to specify anindex
field, and then setting its Value to an expression of the following form:index == 'default' ? 'myIndex' : index
Fields here are evaluated and applied after any fields specified in the Auth Tokens section.
Pre-Processing
In this section’s Pipeline drop-down list, you can select a single existing Pipeline or Pack to process data from this input before the data is sent through the Routes.
Auth Tokens
Provides the authentication token required to access this Source’s data. Ensure the token is valid and has the necessary permissions for data retrieval.
Not providing a token allows unauthorized access.
Authentication method: Choose between manually entering a Token or selecting a stored Secret. If you opt for manual entry, type the token directly or Generate a new one. For added security, use the secrets store to select a stored Token (text secret) or Create a new one.
Enable token: Controls the status of the specified authentication token. When toggled on, the specified auth token is active and can be used for client access. If toggled off, the token is disabled, causing it to be skipped and ignored during initialization. This means that any client attempting to use a disabled token will not be granted access.
Disabling a token can be helpful when you need to temporarily stop receiving data from a particular service or application. Disabling a token prevents the service or application from sending data to your endpoint without having to completely delete the token.
See also Periodic Logging for information on how auth tokens affect product logging.
Description: Optional (but recommended), the description for this token. Although this field is optional, the token description becomes the display name for the token. You can select this display name to expand or collapse the token details.
Allowed indexes: Optionally, specify which indexes are permissible for events ingested using this token. Supports multiple index values and wildcards.
Fields: Fields to add to events referencing this token. Each field is a Name/Value pair.
Fields specified on the Auth Tokens tab will normally override fields of the same name in events. But you can specify that fields in events should override these fields’ values.
In particular, where incoming events have no
index
field, this Source adds one with the literal valuedefault
. You can override this value by using Add Field to specify anindex
field, and then setting its Value to an expression of the following form:index == 'default' ? 'myIndex' : index
Fields here are evaluated and applied before any fields specified in the Fields section.
Advanced Settings
Emit per-token request metrics: Toggle on to emit per-token <prefix>.http.perToken
and summary <prefix>.http.summary
request metrics. This enables Periodic Logging.
Show originating IP: Toggle on when clients are connecting through a proxy that supports the X-Forwarded-For
header to keep the client’s original IP address on the event instead of the proxy’s IP address. This setting affects how the Source handles the __srcIpPort
field.
Capture request headers: Toggle on to add request headers to events, in the __headers
field.
CORS allowed origins: If you need to enable cross-origin resource sharing with Zscaler senders, use this field to specify the HTTP origins to which Cribl Stream should send Access-Control-Allow-*
headers. You can enter multiple domains and use wildcards. This and its companion CORS allowed headers option should seldom be needed – see Working with CORS Headers below.
CORS allowed headers: List HTTP headers that Cribl Stream will send in a CORS preflight response to your configured (above) CORS allowed origins as Access-Control-Allow-Headers
. Enter *
to allow all headers.
Active request limit: Maximum number of active requests allowed for this Source, per Worker Process. Defaults to 256
. Enter 0
for unlimited.
Raising this limit can increase throughput by allowing more concurrent data requests, but increases resource usage and load on both your Cribl Stream infrastructure and on downstream Destinations. Before raising this limit, ensure:
- Your Cribl Stream deployment has sufficient capacity to support higher request concurrency, including CPU, memory, or number of Worker Processes.
- Downstream Destinations are correctly sized and tuned to accept the higher data ingest rate, preventing backpressure. See Manage Backpressure for more information.
Improper sizing on either side can result in dropped events, delayed processing, or overall system instability.
Activity log sample rate: Determines how often request activity is logged at the info
level. The default 100
value logs every 100th value; a 1
value would log every request; a 10
value would log every 10th request; and so forth.
Requests-per-socket limit: The maximum number of requests Cribl Stream should allow on one socket before instructing the client to close the connection. Defaults to 0
(unlimited). See Balancing Connection Reuse Against Request Distribution below.
Socket timeout (seconds): How long Cribl Stream should wait before assuming that an inactive socket has timed out. The default 0
value means wait forever.
Request timeout (seconds): How long to wait for an incoming request to complete before aborting it. The default 0
value means wait indefinitely.
Keep-alive timeout (seconds): After the last response is sent, Cribl Stream will wait this long for additional data before closing the socket connection. Defaults to 5
seconds; minimum is 1
second; maximum is 600
seconds (10 minutes).
The longer the Keep‑alive timeout, the more Cribl Stream will reuse connections. The shorter the timeout, the closer Cribl Stream gets to creating a new connection for every request. When request frequency is high, you can use longer timeouts to reduce the number of connections created, which mitigates the associated cost.
IP allowlist regex: Grants access to requests originating from specific IP addresses that match a defined pattern. Unmatched requests are rejected with a 403 (Forbidden) status code. Defaults to .*
(allow all).
IP denylist regex: Blocks requests originating from specific IP addresses that match a defined pattern, even if they would be allowed by default. Rejected requests receive a 403 (Forbidden) status code. Defaults to ^$
(allow all).
Environment: If you’re using GitOps, optionally use this field to specify a single Git branch on which to enable this configuration. If empty, the config will be enabled everywhere.
Balancing Connection Reuse Against Request Distribution
Requests-per-socket limit allows you to limit the number of HTTP requests an upstream client can send on one network connection. Once the limit is reached, Cribl Stream uses HTTP headers to inform the client that it must establish a new connection to send any more requests. (Specifically, Cribl Stream sets the HTTP Connection
header to close
.) After that, if the client disregards what Cribl Stream has asked it to do and tries to send another HTTP request over the existing connection, Cribl Stream will respond with an HTTP status code of 503 Service Unavailable
.
Use this setting to strike a balance between connection reuse by the client, and distribution of requests among one or more Worker Node processes by Cribl Stream:
When a client sends a sequence of requests on the same connection, that is called connection reuse. Because connection reuse benefits client performance by avoiding the overhead of creating new connections, clients have an incentive to maximize connection reuse.
Meanwhile, a single process on that Worker Node will handle all the requests of a single network connection, for the lifetime of the connection. When receiving a large overall set of data, Cribl Stream performs better when the workload is distributed across multiple Worker Node processes. In that situation, it makes sense to limit connection reuse.
There is no one-size-fits-all solution, because of variation in the size of the payload a client sends with a request and in the number of requests a client wants to send in one sequence. Start by estimating how long connections will stay open. To do this, multiply the typical time that requests take to process (based on payload size) times the number of requests the client typically wants to send.
If the result is 60 seconds or longer, set Requests-per-socket limit to force the client to create a new connection sooner. This way, more data can be spread over more Worker Node processes within a given unit of time.
For example: Suppose a client tries to send thousands of requests over a very few connections that stay open for hours on end. By setting a relatively low Requests-per-socket limit, you can ensure that the same work is done over more, shorter-lived connections distributed between more Worker Node processes, yielding better performance from Cribl Stream.
A final point to consider is that one Cribl Stream Source can receive requests from more than one client, making it more complicated to determine an optimal value for Requests-per-socket limit.
Connected Destinations
Select Send to Routes to enable conditional routing, filtering, and cloning of this Source’s data via the Routing table.
Select QuickConnect to send this Source’s data to one or more Destinations via independent, direct connections.
Internal Fields
Cribl Stream uses a set of internal fields to assist in handling of data. These “meta” fields are not part of an event, but they are accessible, and Functions can use them to make processing decisions.
Fields for this Source:
__headers
– Added only when Advanced Settings > Capture request headers is toggled on.__hecToken
__inputId
__ctrlFields
__srcIpPort
– See details below.__TZ
Overriding __srcIpPort
with Client IP/Port
The __srcIpPort
field’s value contains the IP address and (optionally) port of the Zscaler Cloud NSS sending data to this Source.
When any proxies (including load balancers) lie between the Zscaler Cloud NSS and the Source, the last proxy adds an X‑Forwarded‑For
header whose value is the IP/port of the original client. With multiple proxies, this header’s value will be an array, whose first item is the original client IP/port.
If X‑Forwarded‑For
is present, and Advanced Settings > Show originating IP is toggled off, the original client IP/port in this header will override the value of __srcIpPort
.
If Show originating IP is toggled on, the X‑Forwarded‑For
header’s contents will not override the __srcIpPort
value. (Here, the upstream proxy can convey the client IP/port without using this header.)
Working with CORS Headers
The CORS allowed origins and CORS allowed headers settings are needed only when you want JavaScript code running in a web browser to send events to the Zscaler Cloud NSS Source. In this (uncommon) scenario, the code in question sends a preflight request that includes CORS (Cross-Origin Resource Sharing) headers, and the Zscaler Cloud NSS Source includes CORS headers in its response.
If the settings (and thus the CORS headers in the response) are correct, the browser will allow the code in question to complete requests to the Zscaler Cloud NSS Source – that is, allow the code to read the Source’s responses. (These settings are optional because for some web applications it’s sufficient to send requests without reading responses.)
The CORS header dance begins when the web browser sends a request with a header stating its origin. What happens next depends on how you’ve configured the CORS header settings, as explained in the tables below. (For these examples, we’ll assume an origin of https://mycoolwebapp:3001
.)
CORS allowed origins value | What Zscaler Cloud NSS Source does | Does Browser then allow code to read response? |
---|---|---|
* | Send Access-Control-Allow-Origin: * header. | Yes |
Multiple wildcard values | If origin value matches any wildcard, send Access-Control-Allow-Origin: https://mycoolwebapp:3001 header. | Yes |
Nothing (not configured) | Send neither an Access-Control-Allow-Origin nor an Access-Control-Allow-Headers header. | No |
The web browser’s request might also include an Access-Control-Request-Headers
header that lists one or more headers that it wants to use in a subsequent POST
request (that is, the “actual” as opposed to preflight request). The Zscaler Cloud NSS Source can then respond with an Access-Control-Allow-Headers
header if and only if it’s also sending the Access-Control-Allow-Origin
header.
CORS allowed headers value | What Zscaler Cloud NSS Source does | Does Browser then allow code to read response? |
---|---|---|
* | Send Access-Control-Allow-Headers header with the same list the client sent. | Yes |
Multiple wildcard values | If any headers from the client’s list match wildcards, send Access-Control-Allow-Headers header listing the matching headers. | Yes |
Nothing (not configured) | Do not send an Access-Control-Allow-Headers header. | Probably not |
Periodic Logging
Cribl Stream emits metrics about incoming requests when Advanced Settings > Emit per-token request metrics is enabled. These metrics can be routed to a Destination by enabling the Cribl Metrics Source. There are two categories of metrics: per-token metrics provide details for each configured token, while summary metrics aggregate data across all tokens. If no tokens are configured, only summary metrics will be emitted.
Per-token metric names will be prefixed by <basePrefix>.http.perToken.
, where <basePrefix>
is the configured base prefix for the Cribl Metrics Source. The available metric names are listed below, excluding the prefix. As these are counter-type metrics, they will not be emitted if the value is zero.
bytesReceived
numRequests
numEvents
numErrors
Note: this is cumulative of specific error types below as well as a catch-all for other typesnumParserErrors
numToDisabledToken
The summary metric names will be prefixed by <basePrefix>.http.summary.
where <basePrefix>
is the configured base prefix for the Cribl Metrics Source. The available metric names are listed below, excluding the prefix. As these are counter-type metrics, they will not be emitted if the value is zero.
bytesReceived
numRequests
numEvents
numErrors
Note: this is cumulative of specific error types below as well as a catch-all for other typesnumParserErrors
numToDisabledToken
numAuthFailures
numToWrongUrl
numAckReqs
numReqsAcked
Troubleshooting
The Source’s configuration modal has helpful tabs for troubleshooting:
Live Data: Try capturing live data to see real-time events as they are ingested. On the Live Data tab, click Start Capture to begin viewing real-time data.
Logs: Review and search the logs that provide detailed information about the ingestion process, including any errors or warnings that may have occurred.
You can also view the Monitoring page that provides a comprehensive overview of data volume and rate, helping you identify ingestion issues. Analyze the graphs showing events and bytes in/out over time.