Internal Logs

Cribl Stream generates internal application logs that monitor its own operations and health. They provide valuable insights into the system’s behavior, performance, and potential issues.

Distributed deployments emit a larger set of logs than single-instance deployments. We’ll describe the distributed set first.

You can display and export all internal logs from the UI. In Cribl Stream, select Monitoring in the sidebar, then select Logs. In Cribl Edge, select Logs in the sidebar. Log persistence depends on event volume, not time; for details, see Log Rotation and Retention.

Several logs listed on this page are exposed only in customer-managed (on-prem) deployments. In Cribl.Cloud, Leaders support Cribl Stream Worker Node logs on hybrid Workers.

However, Organization Members who have the Admin Permission on Cribl Search can use that product to search the cribl_internal_logs Dataset for additional details about the Leader and its Cribl-managed Workers.

Leader Node Logs (Distributed)

The API/main process emits the following logs into the Leader Node’s $CRIBL_HOME/log/ directory.

Logfile Name | Description | Equivalent on Logs page
cribl.log | Principal log in Cribl Stream. Includes telemetry/license-validation logs. Corresponds to top-level cribl.log on Diag page. | Leader > API Process
access.log | API calls, e.g., GET /api/v1/version/info. | Leader > Access
audit.log | Actions pertaining to files, e.g., create, update, commit, deploy, delete. | Leader > Audit
notifications.log | Messages that appear in the Notification list in the UI. | Leader > Notifications
ui-access.log | Interactions with different UI components described as URLs, e.g., /settings/apidocs, /dashboard/logs. | Leader > UI Access

The API/main process emits the following service logs into the Leader Node’s $CRIBL_HOME/log/service/ directory. Each service includes a cribl.log file that logs the service’s internal telemetry and an access.log file that logs which API calls the service has handled.

Service Name | Description | Equivalent on Logs page
Connections Service | Handles all worker connections and communication, including heartbeats, bundle deploys, teleporting, restarting, etc. Workers are assigned to connection processes using a round-robin algorithm. | Leader > Connections Service
Lease Renewal Service | Handles lease renewal for the primary Leader Node. | Leader > Lease Renewal Service
Metrics Service | Handles in-memory metrics, merging of incoming packets, metrics persistence and rehydration, and UI queries for metrics. | Leader > Metrics Service
Notifications Service | Triggers Notifications based on its configuration. | Leader > Notifications Service

The Config Helper process for each Worker Group/Fleet emits the following log in $CRIBL_HOME/log/group/GROUPNAME.

Logfile Name | Description | Equivalent on Logs page
cribl.log | Messages about config maintenance, previews, etc. | GROUPNAME > Config helper

Worker Node Logs (Distributed)

The API Process emits the following log in $CRIBL_HOME/log/.

Logfile Name | Description | Equivalent on Logs page
cribl.log | Messages about the Worker/Edge Node communicating with the Leader Node (i.e., with its API Process) and other API requests, e.g., sending metrics, reaping job artifacts. | GROUPNAME > Worker:HOSTNAME > API Process

Each Worker Process emits the following logs in $CRIBL_HOME/log/worker/N/, where N is the Worker/Edge Node Process ID. (The metrics.log file is written only when HTTP-based Destinations exist.)

Logfile Name | Description | Equivalent on Logs page
cribl.log | Messages about the Worker/Edge Node processing data. | GROUPNAME > Worker:HOSTNAME > Worker Process N
metrics.log | Messages about the Worker/Edge Node’s outbound HTTP request statistics. | GROUPNAME > Worker:HOSTNAME > Worker Process N

For convenience, the UI aggregates the Worker/Edge Node Process logs as follows.

Logfile Name | Description | Equivalent on Logs page
N/A | Aggregation of all the Worker Process N logs and the API Process log. | GROUPNAME > WORKER_NAME
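
The directory layout described in the sections above can be summarized in a small lookup. The following is a minimal Python sketch, not part of Cribl Stream: the function name `classify_log` and its return labels are hypothetical. It assumes paths are given relative to $CRIBL_HOME/log/, and it does not guess at the subdirectory names used under service/, which this page does not list.

```python
from pathlib import PurePosixPath

# Top-level files emitted by the API/main process, as listed on this page.
API_PROCESS_LOGS = {
    "cribl.log", "access.log", "audit.log", "notifications.log", "ui-access.log",
}

def classify_log(relative_path: str) -> str:
    """Label a log path given relative to $CRIBL_HOME/log/ (hypothetical helper)."""
    parts = PurePosixPath(relative_path).parts
    if len(parts) == 1 and parts[0] in API_PROCESS_LOGS:
        return "API/main process"
    if len(parts) >= 2 and parts[0] == "service":
        # Leader services; per-service subdirectory names are not assumed here.
        return "Leader service"
    if len(parts) >= 2 and parts[0] == "group":
        return f"Config Helper for {parts[1]}"
    if len(parts) >= 2 and parts[0] == "worker":
        return f"Worker Process {parts[1]}"
    return "unknown"

print(classify_log("worker/3/metrics.log"))
print(classify_log("group/default/cribl.log"))
```

A helper like this can be handy when sorting the contents of a diag bundle or a copied log directory by emitting process.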

In Cribl.Cloud, the logs listed above are currently available only on customer-managed hybrid Workers. The single-instance logs listed below are not relevant to Cribl.Cloud.

Single‑Instance Logs

The API/main process emits the same logs as it does for a Distributed deployment, in $CRIBL_HOME/log/:

  • cribl.log
  • access.log
  • audit.log
  • notifications.log
  • ui-access.log

Each Worker/Edge Node Process emits the following logs in $CRIBL_HOME/log/worker/N/, where N is the Worker/Edge Node Process ID. (The metrics.log file is written only when HTTP-based Destinations exist.)

Logfile Name | Description | Equivalent on Logs page
cribl.log | Messages about the Worker/Edge Node processing data. | GROUPNAME > Worker:HOSTNAME > Worker Process N
metrics.log | Messages about the Worker/Edge Node’s outbound HTTP request statistics. | GROUPNAME > Worker:HOSTNAME > Worker Process N

_raw stats Event Fields

Each Worker/Edge Node Process logs this information once per minute. Each event therefore covers only that Worker/Edge Node Process, over the 1-minute time span defined by the startTime and endTime fields.

Sample Event

{"time":"2022-11-17T16:54:05.349Z","cid":"w0","channel":"server","level":"info","message":"_raw stats","inEvents":307965,"outEvents":495848,"inBytes":52756162,"outBytes":83028013,"startTime":1668703980,"endTime":1668704040,"activeCxn":0,"openCxn":0,"closeCxn":0,"rejectCxn":0,"abortCxn":0,"pqInEvents":62000,"pqOutEvents":114591,"pqInBytes":12163896,"pqOutBytes":22481509,"pqTotalBytes":480467058,"droppedEvents":0,"tasksStarted":6,"tasksCompleted":6,"activeEP":9,"blockedEP":0,"cpuPerc":101.09,"eluPerc":97.81,"mem":{"heap":277,"heapTotal":287,"ext":46,"rss":453,"buffers":0}}
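
To pull these events out of a Worker Process log, you can filter on the message field. The following is a minimal Python sketch, assuming the log is newline-delimited JSON as the sample event suggests. The helper name `iter_raw_stats` is hypothetical, the first sample line is abbreviated from the event above, and the second line is invented purely to show the filter.

```python
import json
from typing import Iterable, Iterator

def iter_raw_stats(lines: Iterable[str]) -> Iterator[dict]:
    """Yield parsed `_raw stats` events from newline-delimited JSON log lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if isinstance(event, dict) and event.get("message") == "_raw stats":
            yield event

sample_lines = [
    # Abbreviated copy of the sample event above.
    '{"time":"2022-11-17T16:54:05.349Z","cid":"w0","channel":"server",'
    '"level":"info","message":"_raw stats","inEvents":307965,"outEvents":495848}',
    # Hypothetical non-stats line, included only to demonstrate filtering.
    '{"time":"2022-11-17T16:54:06.000Z","cid":"w0","level":"info","message":"other"}',
    "not json",
]

stats = list(iter_raw_stats(sample_lines))
print(len(stats), stats[0]["cid"])
```

In practice, you would pass an open file handle for a Worker Process cribl.log instead of `sample_lines`.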

Field Descriptions

Field | Description
abortCxn | Number of TCP connections that were aborted.
activeCxn | Number of TCP connections newly opened at the time the _raw stats are logged. (This is a gauge when exported in internal metrics, and can otherwise be ignored as an instantaneous measurement. Only some application protocols count toward this; e.g., any HTTP-based Source does not count.)
activeEP | Number of currently active event processors (EPs). EPs are used to process events through Breakers and Pipelines as the events are received from Sources and sent to Destinations. EPs are typically created per TCP connection (such as for HTTP).
blockedEP | Number of currently blocked event processors (caused by blocking Destinations).
closeCxn | Number of TCP connections that were closed.
cpuPerc | CPU utilization from the combined user and system activity over the last 60 seconds.
droppedEvents | This is equivalent to the total.dropped_events metric. Drops can occur from Pipeline Functions, from Destination Backpressure, or from other errors. Any event not sent to a Destination is considered dropped.
eluPerc | Event loop utilization. Represents the percentage of time over the last 60 seconds that the Node.js runtime spent processing events within its event loop.
endTime | The end of the timespan represented by these metrics. (Will always be 60 seconds after startTime.)
inBytes | Number of bytes received from all Sources (based only off _raw).
inEvents | Number of events received from all inputs after Event Breakers are applied. This can be larger than outEvents if events are dropped via Drop, Aggregation, Suppression, Sampling, or similar Functions.
mem.buffers | Memory allocated for ArrayBuffers and SharedArrayBuffers.
mem.ext | External section of process memory, in MB.
mem.heap | Used heap section of process memory, in MB.
mem.heapTotal | Total heap section of process memory, in MB.
mem.rss | Resident set size section of process memory, in MB.
openCxn | Same as activeCxn, but tracked as a counter rather than a gauge. So openCxn will show all connections newly opened each minute, and is more accurate than using activeCxn.
outBytes | Number of bytes sent to all Destinations (based only off _raw).
outEvents | Number of events sent out to all Destinations. This can be larger than inEvents due to creating event clones or entirely new unique events (such as when using the Aggregation Function).
pqInBytes | Number of bytes that were written to persistent queues, across all Destinations.
pqInEvents | Number of events that were written to persistent queues, across all Destinations.
pqOutBytes | Number of bytes that were flushed from persistent queues, across all Destinations.
pqOutEvents | Number of events that were flushed from persistent queues, across all Destinations.
pqTotalBytes | Amount of data currently stored in persistent queues, across all Destinations.
rejectCxn | Number of TCP connections that were rejected.
startTime | The beginning of the timespan represented by these metrics.
tasksCompleted | The number of tasks the process has started and completed for all collection jobs for which it was executing tasks.
tasksStarted | The number of tasks the process started for all collection jobs for which it was executing tasks.
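
Because each event covers a fixed 60-second window, the counters convert easily into rates. The following is a minimal Python sketch using values from the sample event above. The function name `summarize` and the derived field names are hypothetical, and reading "bytes written minus bytes flushed" as net persistent-queue change for the window is an interpretation, not something the product reports directly.

```python
WINDOW_SECONDS = 60  # each _raw stats event spans one minute

def summarize(event: dict) -> dict:
    """Derive per-second rates and simple ratios from one _raw stats event."""
    mem = event["mem"]
    return {
        "in_events_per_sec": event["inEvents"] / WINDOW_SECONDS,
        "out_events_per_sec": event["outEvents"] / WINDOW_SECONDS,
        # Assumption: a negative value means the queues drained during this window.
        "pq_net_bytes": event["pqInBytes"] - event["pqOutBytes"],
        "heap_used_fraction": mem["heap"] / mem["heapTotal"],
    }

# Subset of the sample event shown earlier on this page.
event = {
    "inEvents": 307965,
    "outEvents": 495848,
    "pqInBytes": 12163896,
    "pqOutBytes": 22481509,
    "mem": {"heap": 277, "heapTotal": 287},
}

summary = summarize(event)
print(summary["in_events_per_sec"])  # 5132.75
print(summary["pq_net_bytes"])       # -10317613
```

Here the sample Worker Process received about 5,133 events per second, and its persistent queues flushed roughly 10.3 million more bytes than were written to them during the minute.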

Troubleshooting With the cribl_stderr Log

In rare instances, Cribl Stream’s Node.js backend may encounter an out-of-memory (OOM) condition or another fatal error that prevents it from logging to its usual location. To aid in troubleshooting, a special cribl_stderr.log file is created, containing timestamped error details in UTC. This file is intended for Cribl Support and is not accessible through the UI.

To prevent excessive disk usage, Cribl Stream rotates cribl_stderr.log, limiting each file to 5 MB and keeping a maximum of 5 rotated copies. This addresses an issue where the log file could grow rapidly as errors were logged, impacting system performance.
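
These limits imply a rough upper bound on disk usage. The following is a minimal Python sketch of the arithmetic. It assumes that one active file can exist alongside the 5 rotated copies, which this page does not state explicitly, and the function name is hypothetical.

```python
def max_stderr_log_footprint_mb(file_limit_mb: int = 5, rotated_copies: int = 5) -> int:
    """Upper bound on disk used by cribl_stderr.log files, in MB."""
    # Assumption: the active file plus all rotated copies are each at the size limit.
    return file_limit_mb * (rotated_copies + 1)

print(max_stderr_log_footprint_mb())  # 30
```

If the active file counts toward the 5 copies instead, the bound drops to 25 MB.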