Cribl - Docs

Getting started with Cribl LogStream


Pipelines

What are Pipelines

After data has been matched by a route, it gets delivered to a pipeline. A pipeline is a set of functions that work on the data, composed as an ordered list. As with routes, the order in which the functions are listed matters.

Functions in a pipeline are evaluated in order, top down.
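As a rough sketch, a pipeline's ordered function list could look like the following. This is a hypothetical configuration fragment: the function IDs and conf fields are illustrative and not verified against any particular LogStream version.

```yaml
# Hypothetical pipeline configuration fragment.
# Functions are evaluated top-down, in the order listed here.
functions:
  - id: eval                # runs first: add or modify fields
    conf:
      add:
        - name: env
          value: "'prod'"   # value is a JavaScript expression
  - id: drop                # runs second, and sees eval's output
    filter: status==404     # filter is a JavaScript expression
```

Reordering the two entries would change the result, since each function receives the output of the one above it.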

How do Pipelines Work

Events are always delivered to the beginning of a pipeline via a route. They are processed by each function, in order. A pipeline of chained functions always moves events in one direction, toward the outside of the system. This is by design, to keep things simple and to avoid potential loops.

Types of Pipelines


Input Pipelines

These pipelines are attached to a Source (or Input) for the purpose of conditioning events before they're delivered to a processing pipeline. They are optional; typical use cases include event formatting, or applying functions to all events of a given input. E.g., extract the message field from all Elastic Sources before pushing events to various processing pipelines.
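The Elastic example above might be sketched as an input pipeline like the following. This is a hypothetical fragment: the function ID, field names, and regex are illustrative assumptions, not a verified LogStream configuration.

```yaml
# Hypothetical input (pre-processing) pipeline fragment:
# pull structure out of the Elastic `message` field before events
# reach the various processing pipelines.
functions:
  - id: regex_extract
    conf:
      # Illustrative pattern: capture a log level and the rest of the line.
      regex: /^(?<level>\w+)\s+(?<msg>.*)$/
      srcField: message     # field name is an assumption for this sketch
```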

Processing Pipelines

These are the classic event processing pipelines.

Output Pipelines

These pipelines are attached to a Destination (or Output) for the purpose of conditioning events before they're sent out. Typical use cases include applying functions that transform or shape events per the receiver's requirements. E.g., ensure that a _time field exists on all events bound for a Splunk receiver.
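The Splunk example above could be sketched as an output pipeline like this. Again a hypothetical fragment: the Eval function's conf shape and the filter expression are illustrative assumptions.

```yaml
# Hypothetical output (post-processing) pipeline fragment:
# guarantee a _time field on every event bound for Splunk.
functions:
  - id: eval
    filter: _time == null           # only touch events missing _time
    conf:
      add:
        - name: _time
          value: Date.now() / 1000  # epoch seconds, as Splunk expects
```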

Destination Selection in Processing Pipelines

A pipeline can be configured with an output destination, but it is considered a best practice to define the destination at the route level instead. This keeps the pipeline independent and reusable. (Note that a destination defined at the route level overrides one defined at the pipeline level.)
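In other words, the route carries the destination and the pipeline stays generic. A hypothetical routes fragment (names and fields are illustrative) might look like:

```yaml
# Hypothetical routes fragment: the destination is set on the route,
# so the pipeline itself can be reused with any output.
routes:
  - name: splunk_traffic
    filter: sourcetype=='access_combined'  # JavaScript filter expression
    pipeline: shape_access_logs            # reusable processing pipeline
    output: splunk_prod                    # destination chosen here, not in the pipeline
```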

Other Considerations

Functions in a pipeline are equipped with their own filters. Even though filters are not required, it is advised that you use them as often as possible. As with routes, the general goal is to minimize the work a function does; the fewer events a function has to operate on, the better the overall performance. For example, if a pipeline has two functions, f1 and f2, where f1 operates only on events from source 'foo' and f2 only on events from source 'bar', it may make sense to apply a source=='foo' filter to f1 and a source=='bar' filter to f2.
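That f1/f2 example might be sketched as follows. This is a hypothetical fragment: the function IDs are placeholders, and the conf bodies are elided.

```yaml
# Hypothetical pipeline fragment: per-function filters keep each
# function from running against events it was never meant for.
functions:
  - id: f1                   # placeholder ID for the first function
    filter: source=='foo'    # f1 only ever sees 'foo' events
    conf: {}
  - id: f2                   # placeholder ID for the second function
    filter: source=='bar'    # f2 only ever sees 'bar' events
    conf: {}
```

Events from any other source pass through both functions untouched, which is usually cheaper than having each function inspect and skip them internally.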