These docs are for Cribl Edge 4.10 and are no longer actively maintained.
See the latest version (4.13).
OpenTelemetry (OTel) Source
Cribl Edge supports receiving trace, metric, and log events from OTLP-compliant senders. Cribl Edge supports log events beginning in v.4.7.
Type: Push | TLS Support: YES | Event Breaker Support: No
Supported and Unsupported Input Data
In Cribl Edge 4.0.3 and later, the Open Telemetry Source supports receiving compressed inbound data (with DEFLATE or gzip compression), as well as uncompressed data.
In Cribl Edge 4.1 and later, this Source supports receiving telemetry data over either of the transports that the OpenTelemetry Protocol (OTLP) describes: gRPC or HTTP. OTLP defines Protocol buffer (Protobuf) schemas for its payloads (requests and responses). With the HTTP transport, this Source supports Binary Protobuf payload encoding, but currently does not support JSON Protobuf.
The OpenTelemetry Project’s Signals documentation provides these hierarchical definitions of Cribl Edge’s supported trace and metric event types:
- A trace tracks the progression of a single request.
- Each trace is a tree of spans.
- A span object represents the work being done by the individual services, or components, involved in a request as that request flows through a system.
- A metric provides aggreggated statistical information.
- A metric contains individual measurements called data points.
- A log provides a recording of an event.
- See the OTLP Logs Function topic for an explanation of OTLP logs structure.
Breaking Changes Between OTLP 0.19.0 and 1.3.1
With version 0.19.0, the OTLP Protobuf spec introduced breaking changes in its field names. This means that if you created Routes or Pipelines that work with OTLP version 0.10.0, they may break when you use them with OTLP 1.3.1.
To use Routes or Pipelines created before Cribl Edge 4.7 with OTLP 1.3.1, you must check their filters and logic for any use of obsolete field names, and modify them accordingly,
Configure an OTel Source
On the top bar, select Products, and then select Cribl Edge. Under Fleets, select a Fleet. Next, you have two options:
- To configure via QuickConnect, navigate to Routing > QuickConnect (Stream) or Collect (Edge). Select Add Source and select the Source you want from the list, choosing either Select Existing or Add New.
- To configure via the Routes, select Data > Sources (Stream) or More > Sources (Edge). Select the Source you want. Next, select Add Source.
In the New Source modal, configure the following under General Settings:
- Input ID: Unique ID for this Source. For example,
OTel042
. - Description: Optionally, enter a description.
- Address: Enter the hostname/IP to listen on. Defaults to
0.0.0.0
(all addresses, IPv4 format). - Port: By default, OTel applications send output to port
4317
when using the gRPC protocol, and port4318
when using HTTP. This setting defaults to4317
– you must change it if you set Protocol (below) toHTTP
, or you want Cribl Edge to collect data from an OTel application that is using a different port.
Port
4318
is not available on Cribl-managed Worker Groups in Cribl.Cloud.- Input ID: Unique ID for this Source. For example,
Next, you can configure the following Optional Settings:
- Protocol: Use the drop-down to choose the protocol matching the data you will ingest:
gRPC
(the default), orHTTP
. - Tags: Optionally, add tags that you can use to filter and group Sources in Cribl Edge’s UI. These tags aren’t added to processed events. Use a tab or hard return between (arbitrary) tag names.
- Protocol: Use the drop-down to choose the protocol matching the data you will ingest:
Optionally, you can adjust the Authentication, TLS, Persistent Queue Settings, Processing and Advanced settings, or Connected Destinations outlined in the sections below.
Select Save, then Commit & Deploy.
Authentication
Select one of the following options for authentication:
None: Don’t use authentication.
Auth token: Enter the bearer token that must be included in the authorization header.
Auth token (text secret): Provide an HTTP token referenced by a secret. Select a stored text secret in the resulting drop-down, or click Create to configure a new secret.
Basic: Displays Username and Password fields for you to enter HTTP Basic authentication credentials.
Basic (credentials secret): Provide username and password credentials referenced by a secret. Select a stored text secret in the resulting Credentials secret drop-down, or click Create to configure a new secret.
TLS Settings (Server Side)
In OTel terminology, your Cribl Edge OTel Source will receive OTel data from a Collector running on a local agent. In Cribl Edge’s terminology, the Collector is the client and the OTel Source is the server. This is why this Source’s UI identifies the Source’s TLS Settings as “server-side.”
For more about this client-server relationship, see the TLS Configuration Example below.
Enabled: Defaults to toggled off. When toggled on:
Certificate name: Name of the predefined certificate.
Private key path: Server path containing the private key (in PEM format) to use. Path can reference $ENV_VARS
.
Certificate path: Server path containing certificates (in PEM format) to use. Path can reference $ENV_VARS
.
CA certificate path: Server path containing CA certificates (in PEM format) to use. Path can reference $ENV_VARS
.
Authenticate client (mutual auth): Require clients to present their certificates. Used to perform mutual authentication using SSL certs (mTLS). Default is toggled off. When toggled on:
Validate client certs: Reject certificates that are not authorized by a CA in the CA certificate path, or by another trusted CA (for example, the system’s CA). Default is toggled on.
Common Name: Regex that a peer certificate’s subject attribute must match in order to connect. Defaults to
.*
. Matches on the substring afterCN=
. As needed, escape regex tokens to match literal characters. (For example, to match the subjectCN=worker.cribl.local
, you would enter:worker\.cribl\.local
.) If the subject attribute contains Subject Alternative Name (SAN) entries, the Source will check the regex against all of those but ignore the Common Name (CN) entry (if any). If the certificate has no SAN extension, the Source will check the regex against the single name in the CN.
Minimum TLS version: Defines the oldest TLS version allowed for incoming connections. Connections using an earlier version will be rejected. Use this setting to enforce a baseline for secure communication.
Maximum TLS version: Defines the newest TLS version that can be negotiated during a connection. Use this setting to restrict connections to tested and supported TLS versions.
Persistent Queue Settings
In the Persistent Queue Settings tab, you can optionally specify persistent queue storage, using the following controls. Persistent queue buffers and preserves incoming events when a downstream Destination has an outage or experiences backpressure.
Before enabling persistent queue, learn more about persistent queue behavior and how to optimize it with your system:
On Cribl-managed Cloud Workers (with an Enterprise plan), this tab exposes only the Enable Persistent Queue toggle. If enabled, PQ is automatically configured in
Always On
mode, with a maximum queue size of 1 GB disk space allocated per PQ‑enabled Source, per Worker Process.The 1 GB limit is on uncompressed inbound data, and the queue does not perform any compression. This limit is not configurable. For configurable queue size, compression, mode, and other options below, use a hybrid Group.
Enable Persistent Queue: Default is toggled off. When toggled on:
Mode: Select a condition for engaging persistent queues.
Always On
: This default option will always write events to the persistent queue, before forwarding them to the Cribl Stream data processing engine.Smart
: This option will engage PQ only when the Source detects backpressure from the Cribl Stream data processing engine.
Smart
mode only engages when necessary, such as when a downstream Destination becomes blocked and the Max buffer size reaches its limit. When persistent queue is set toSmart
mode, Cribl attempts to flush the queue when every new event arrives. The only time events stay in the buffer is when a downstream Destination becomes blocked.
Max buffer size: The maximum number of events to hold in memory before reporting backpressure to the sender and writing the queue to disk. Defaults to 1000
. This buffer is for all connections, not just per Worker Process. For that reason, this can dramatically expand memory usage. Connections share this limit, which may result in slightly lower throughput for higher numbers of connections. For higher numbers of connections, consider increasing the limit.
Commit frequency: The number of events to send downstream before committing that Stream has read them. Defaults to 42
.
Max file size: The maximum data volume to store in each queue file before closing it and (optionally) applying the configured Compression. Enter a numeral with units of KB, MB, and so forth. If not specified, Cribl Stream applies the default 1 MB
.
Max queue size: The maximum amount of disk space that the queue is allowed to consume on each Worker Process. Once this limit is reached, this Source will stop queueing data and block incoming data. Required, and defaults to 5
GB. Accepts positive numbers with units of KB
, MB
, GB
, and so forth. Can be set as high as 1 TB
, unless you’ve configured a different Worker Process PQ size limit in Group/Fleet settings.
Queue file path: The location for the persistent queue files. Defaults to $CRIBL_HOME/state/queues
. To this field’s specified path, Cribl Stream will append /<worker-id>/inputs/<input-id>
.
Compression: Optional codec to compress the persisted data after a file closes. Defaults to None
; Gzip
is also available.
In Cribl Stream 4.1 and later, the Source persistent queue default Mode is
Always on
, to best ensure events’ delivery. For details on optimizing this selection, see Optimize Source Persistent Queues (sPQ).You can optimize Workers’ startup connections and CPU load at Group/Fleet settings > Worker Processes.
Processing Settings
Fields
In this section, you can add Fields to each event, using Eval-like functionality.
Name: Field name.
Value: JavaScript expression to compute field’s value, enclosed in quotes or backticks. (Can evaluate to a constant.)
Pre-Processing
In this section’s Pipeline drop-down list, you can select a single existing Pipeline or Pack to process data from this input before the data is sent through the Routes.
Advanced Settings
Advanced Settings always displays OTLP version, Extract spans, Extract metrics, and Environment. When OTLP version is set to 1.3.1
, the UI also displays Extract logs.
The Extract spans, Extract metrics, and Extract logs settings are unique to OTel. Toggling off (default) allows Cribl Edge to essentially function as a bump on the wire, generating a single event for each incoming OTel event. This can be useful when, for example, you want to send whole OTel events to persistent storage.
OTLP version: The drop-down offers
0.10.0
(default) and1.3.1
.Extract spans: Toggle on if you want Cribl Edge to generate an individual event for each span. (Recall that traces contain multiple spans.)
Extract metrics: Toggle on if you want Cribl Edge to generate an individual event for each data point. (Recall that OTel metric events contain multiple data points.)
Extract logs: Available only when OTLP version is set to
1.3.1
. Toggle on if you want Cribl Edge to generate an individual event for each log record. Cribl recommends doing this to make it easier to transform and manipulate logs.Environment: Optionally, specify a single Git branch on which to enable this configuration. If this field is empty, the config will be enabled everywhere.
When General Settings > Protocol is set to gRPC
, the UI displays one additional setting:
- Max active connections: Maximum number of active connections allowed per Worker Process. Defaults to
1000
. Set a lower value if connection storms are causing the Source to hang. Set0
for unlimited connections.
When General Settings > Protocol is set to HTTP
, the UI displays these additional settings:
Health check endpoint: Toggle on to enable a health check endpoint specific to this Source,
http(s)://<host>:<port>/cribl_health
. A200
HTTP response code is returned when the Source is healthy. Otherwise, two errors you could receive are:ECONNRESET
where the Source failed to initialize due to not having listeners on the port.503
orServer is busy, max active connections reached
indicate there are too many connections per Worker Process.
Max active requests: Maximum number of active requests allowed for this Source, per Worker Process. Defaults to
256
. Enter0
for unlimited.Max requests per socket: The maximum number of requests Cribl Edge should allow on one socket before instructing the client to close the connection. Defaults to
0
(unlimited). See Balancing Connection Reuse Against Request Distribution below.Socket timeout (seconds): How long Cribl Edge should wait before assuming that an inactive socket has timed out. The default
0
value means wait forever.Request timeout (seconds): How long to wait for an incoming request to complete before aborting it. The default
0
value means wait indefinitely.Keep-alive timeout (seconds): After the last response is sent, Cribl Edge will wait this long for additional data before closing the socket connection. Defaults to
5
seconds; minimum is1
second; maximum is600
seconds (10 minutes).IP allowlist regex: Grants access to requests originating from specific IP addresses that match a defined pattern. Unmatched requests are rejected with a 403 (Forbidden) status code. Defaults to
.*
(allow all).IP denylist regex: Blocks requests originating from specific IP addresses that match a defined pattern, even if they would be allowed by default. Rejected requests receive a 403 (Forbidden) status code. Defaults to
^$
(allow all).
Balancing Connection Reuse Against Request Distribution
Max requests per socket allows you to limit the number of HTTP requests an upstream client can send on one network connection. Once the limit is reached, Cribl Edge uses HTTP headers to inform the client that it must establish a new connection to send any more requests. (Specifically, Cribl Edge sets the HTTP Connection
header to close
.) After that, if the client disregards what Cribl Edge has asked it to do and tries to send another HTTP request over the existing connection, Cribl Edge will respond with an HTTP status code of 503 Service Unavailable
.
Use this setting to strike a balance between connection reuse by the client, and distribution of requests among one or more Edge Node processes by Cribl Edge:
When a client sends a sequence of requests on the same connection, that is called connection reuse. Because connection reuse benefits client performance by avoiding the overhead of creating new connections, clients have an incentive to maximize connection reuse.
Meanwhile, a single process on that Edge Node will handle all the requests of a single network connection, for the lifetime of the connection. When receiving a large overall set of data, Cribl Edge performs better when the workload is distributed across multiple Edge Node processes. In that situation, it makes sense to limit connection reuse.
There is no one-size-fits-all solution, because of variation in the size of the payload a client sends with a request and in the number of requests a client wants to send in one sequence. Start by estimating how long connections will stay open. To do this, multiply the typical time that requests take to process (based on payload size) times the number of requests the client typically wants to send.
If the result is 60 seconds or longer, set Max requests per socket to force the client to create a new connection sooner. This way, more data can be spread over more Edge Node processes within a given unit of time.
For example: Suppose a client tries to send thousands of requests over a very few connections that stay open for hours on end. By setting a relatively low Max requests per socket, you can ensure that the same work is done over more, shorter-lived connections distributed between more Edge Node processes, yielding better performance from Cribl Edge.
A final point to consider is that one Cribl Edge Source can receive requests from more than one client, making it more complicated to determine an optimal value for Max requests per socket.
Connected Destinations
Select Send to Routes to enable conditional routing, filtering, and cloning of this Source’s data via the Routing table.
Select QuickConnect to send this Source’s data to one or more Destinations via independent, direct connections.
Internal Fields
OTel internal fields are metadata fields added to events processed by the OTel Source. These fields provide essential context about the event’s origin and processing details and can be referenced in filters, Routes, and Pipelines to apply specific logic based on the event’s metadata. These fields are not included in the final output unless explicitly configured at the Pipeline or Destination stage.
Fields for this Source:
__otlp.version
– The OTLP version used to parse the event. For example,0.10.0
or1.3.1
. This helps ensure compatibility with upstream systems and configurations.__otlp.type
– The type of telemetry data. For example,logs
,metrics
, ortraces
. This is crucial for distinguishing between different signal types in Pipelines.__otlp.extracted
– A boolean flag that denotes whether individual records were extracted from a batch event. Iftrue
, the event represents an extracted record. This behavior is directly governed by the Extract spans, Extract metrics, and Extract logs general settings of the OTel Source.
TLS Configuration Example
Here’s a simple example for using TLS to secure the communication between an OpenTelemetry client and your Cribl Edge OTel Source.
Choose or generate a certificate and key. If you need to generate a certificate/key pair, you can adapt the following OpenSSL command:
openssl req -nodes -new -x509 -newkey rsa:2048 -keyout myKey.pem -out myCert.pem -days 420
This example command will generate both a self-signed cert named
myCert.pem
(certified for 420 days), and an unencrypted, 2048-bit RSA private key namedmyKey.pem
.Configure the TLS Settings (Server Side). Toggle Enabled on, then:
- Enter the appropriate values in the Certificate name, Private key path, and Certificate path fields. A Create link is available if you need a new certificate, and Certificate name also works as a drop-down to allow you to choose from any existing certificates.
- Leave the CA certificate path field empty.
- Leave Authenticate client (mutual auth) toggled off.
Configure the OTel client. See the OTel Collector TLS Configuration Settings README for an explanation of the relevant settings. The config file might be named
otel-config.yaml
,otel-local-config.yaml
, or justconfig.yaml
, depending on your environment. This YAML file will have anexporters
section, which you must edit to include anotlp
sub-section, as follows:- Add an
endpoint
whose value is the IP address of either (a) the Cribl Edge Edge Node on which your OTel Source is running, or (b) the IP address of the load balancer for the relevant Fleet. In the example snippet below, this is the<Cribl_IP_address>
. Specify the port on which Cribl Edge’s OTel Source is listening; port4317
is the default. - Set
tls
>insecure
tofalse
. This matches your setting TLS Settings (Server Side) > Enabled to toggled on for the Cribl Edge OTel Source. - Set
tls
>insecure_skip_verify
totrue
. This matches your setting TLS Settings (Server Side) > Authenticate client (mutual auth) to toggled off for the Cribl Edge OTel Source. Settinginsecure_skip_verify
totrue
is also required if you’re using a self-signed certificate.
Here’s how the section you edited should look:
exporters: otlp: endpoint: "https://<Cribl_IP_address>:4317" tls: insecure: false insecure_skip_verify: true
- Add an
Troubleshooting
The Source’s configuration modal has helpful tabs for troubleshooting:
Live Data: Try capturing live data to see real-time events as they are ingested. On the Live Data tab, click Start Capture to begin viewing real-time data.
Logs: Review and search the logs that provide detailed information about the ingestion process, including any errors or warnings that may have occurred.
You can also view the Monitoring page that provides a comprehensive overview of data volume and rate, helping you identify ingestion issues. Analyze the graphs showing events and bytes in/out over time.
Common Issue
When your upstream sender is the OpenTelemetry Collector and its logs show the error Exporting failed. Will retry the request after interval
, try increasing the duration specified in Advanced Settings > Keep-alive timeout (seconds). Choose a duration that aligns with your OpenTelemetry Collector configuration, such that the OpenTelemetry Collector has enough time to send its events before the keep-alive timeout elapses.