Elasticsearch API

Cribl Edge supports receiving data over HTTP/S using the Elasticsearch Bulk API.

Type: Push | TLS Support: YES | Event Breaker Support: No
This Source supports gzip-compressed inbound data when the Content-Encoding: gzip connection header is set.

For examples of receiving data from popular senders via this API, see Configuring Elastic Beats below.

Configuring Cribl Edge to Receive Elasticsearch Bulk API Data over HTTP(S)

From the top nav, click Manage, then select a Fleet to configure. Next, you have two options:

To configure via the graphical QuickConnect UI, click Routing > QuickConnect (Stream) or Collect (Edge). Next, click Add Source at left. From the resulting drawer’s tiles, select [Push >] Elasticsearch API. Next, click either Add Destination or (if displayed) Select Existing. The resulting drawer will provide the options below.

Or, to configure via the Routing UI, click Data > Sources (Stream) or More > Sources (Edge). From the resulting page’s tiles or left nav, select [Push >] Elasticsearch API. Next, click New Source to open a New Source modal that provides the options below.

Cribl Edge ships with an Elasticsearch API Source preconfigured to listen on Port 9200. You can clone or directly modify this Source to further configure it, and then enable it.

General Settings

Input ID: Enter a unique name to identify this Elasticsearch Source definition.

Address: Enter the hostname/IP on which to listen for Elasticsearch data. (E.g., localhost or 0.0.0.0.)

Port: Enter the port number.

Elasticsearch API endpoint (for Bulk API): Absolute path on which to listen for Elasticsearch API requests. Defaults to /. Cribl Edge automatically appends _bulk, so (e.g.) /myPath becomes /myPath/_bulk. Requests could then be made to either /myPath/_bulk or /myPath/<myIndexName>/_bulk. Other entries are faked as success.

Optional Settings

Tags: Optionally, add tags that you can use to filter and group Sources in Cribl Edge’s Manage Sources page. These tags aren’t added to processed events. Use a tab or hard return between (arbitrary) tag names.

Authentication

In the Authentication type drop-down, select one of the following options:

None: Don’t use authentication.
Basic: Displays Username and Password fields for you to enter HTTP Basic authentication credentials. Click Generate if you need a new password.
Basic (credentials secret): Provide username and password credentials referenced by a secret. Select a stored text secret in the resulting Credentials secret drop-down, or click Create to configure a new secret.
Auth tokens: Use HTTP token authentication. Click Add Token and, in the resulting Token field, enter the bearer token that must be included in the HTTP authorization header. Click Generate if you need a new token. Click Add Token to display additional rows to specify more tokens.

TLS Settings (Server Side)

Enabled defaults to No. When toggled to Yes:

Certificate name: Name of the predefined certificate.

Private key path: Server path containing the private key (in PEM format) to use. Path can reference $ENV_VARS.

Passphrase: Passphrase to use to decrypt private key.

Certificate path: Server path containing certificates (in PEM format) to use. Path can reference $ENV_VARS.

CA certificate path: Server path containing CA certificates (in PEM format) to use. Path can reference $ENV_VARS.

Authenticate client (mutual auth): Require clients to present their certificates. Used to perform mutual authentication using SSL certs. Defaults to No. When toggled to Yes:

Validate client certs: Reject certificates that are not authorized by a CA in the CA certificate path, or by another trusted CA (e.g., the system’s CA). Defaults to Yes.
Common name: Regex matching subject common names in peer certificates allowed to connect. Defaults to .*. Matches on the substring after CN=. As needed, escape regex tokens to match literal characters. E.g., to match the subject CN=worker.cribl.local, you would enter: worker\.cribl\.local.

Minimum TLS version: Optionally, select the minimum TLS version to accept from connections.

Maximum TLS version: Optionally, select the maximum TLS version to accept from connections.

Persistent Queue Settings

In this section, you can optionally specify persistent queue storage, using the following controls. This will buffer and preserve incoming events when a downstream Destination is down, or exhibiting backpressure.

On Cribl-managed Cribl.Cloud Workers (with an Enterprise plan), this tab exposes only the Enable Persistent Queue toggle. If enabled, PQ is automatically configured in Always On mode, with a maximum queue size of 1 GB disk space allocated per Worker Process.

Enable Persistent Queue: Defaults to No. When toggled to Yes:

Mode: Select a condition for engaging persistent queues.

Always On: This default option will always write events to the persistent queue, before forwarding them to Cribl Edge’s data processing engine.
Smart: This option will engage PQ only when the Source detects backpressure from Cribl Edge’s data processing engine.

Max buffer size: The maximum number of events to hold in memory before reporting backpressure to the Source. Defaults to 1000.

Commit frequency: The number of events to send downstream before committing that Stream has read them. Defaults to 42.

Max file size: The maximum data volume to store in each queue file before closing it and (optionally) applying the configured Compression. Enter a numeral with units of KB, MB, etc. If not specified, Cribl Edge applies the default 1 MB.

Max queue size: The maximum amount of disk space that the queue is allowed to consume, on each Worker Process. Once this limit is reached, Cribl Edge will stop queueing data, and will apply the Queue-full behavior. Enter a numeral with units of KB, MB, etc. If not specified, the implicit 0 default will enable Cribl Edge to fill all available disk space on the volume.

Queue file path: The location for the persistent queue files. Defaults to $CRIBL_HOME/state/queues. To this field’s specified path, Cribl Edge will append /<worker-id>/inputs/<input-id>.

Compression: Optional codec to compress the persisted data after a file is closed. Defaults to None; Gzip is also available.

As of Cribl Edge 4.1, Source-side PQ’s default Mode changed from Smart to Always on. This option more reliably ensures events’ delivery, and the change does not affect existing Sources’ configurations. However:
If you create Stream Sources programmatically, and you want to enforce the previous Smart mode, you’ll need to update your existing code.
If you enable Always on, this can reduce data throughput. As a trade-off for data durability, you might need to either accept slower throughput, or provision more machines/faster disks.
You can optimize Workers’ startup connections and CPU load at Fleet Settings > Worker Processes.

Processing Settings

Fields

In this section, you can add Fields to each event using Eval-like functionality.

Name: Field name.

Value: JavaScript expression to compute field’s value, enclosed in quotes or backticks. (Can evaluate to a constant.)

Pre-Processing

In this section’s Pipeline drop-down list, you can select a single existing Pipeline to process data from this input before the data is sent through the Routes.

Advanced Settings

Enable proxy protocol: Toggle to Yes if the connection is proxied by a device that supports Proxy Protocol v1 or v2. This setting affects how the Source handles the __srcIpPort field.

Capture request headers: Toggle this to Yes to add request headers to events, in the __headers field.

Max active requests: Maximum number of active requests allowed for this Source, per Worker Process. Defaults to 256. Enter 0 for unlimited.

Activity log sample rate: Determines how often request activity is logged at the info level. The default 100 value logs every 100th value; a 1 value would log every request; a 10 value would log every 10th request; etc.

Socket timeout (seconds): How long Cribl Edge should wait before assuming that an inactive socket has timed out. The default 0 value means wait forever.

Request timeout (seconds): How long to wait for an incoming request to complete before aborting it. The default 0 value means wait indefinitely.

Keep-alive timeout (seconds): After the last response is sent, Cribl Edge will wait this long for additional data before closing the socket connection. Defaults to 5 seconds; minimum is 1 second; maximum is 600 seconds (10 minutes).

The longer the Keep-alive timeout, the more Cribl Edge will reuse connections. The shorter the timeout, the closer Cribl Edge gets to creating a new connection for every request. When request frequency is high, you can use longer timeouts to reduce the number of connections created, which mitigates the associated cost.

Environment: If you’re using GitOps, optionally use this field to specify a single Git branch on which to enable this configuration. If empty, the config will be enabled everywhere.

Enable proxy mode: If you toggle this to Yes, see Proxy Mode below for the resulting options.

Extra HTTP Headers: Name-value pairs to pass as additional HTTP headers. By default, Cribl Edge’s responses to HTTP requests include the X-elastic-product header, with an Elasticsearch value. (This is required by certain clients, including some Elastic Beats.)

API Version: To upstream Elastic Beats, this Cribl Edge Source will appear as an Elasticsearch instance matching the version that you set in this drop-down:

6.8.4 - Retained for backward compatibility.
8.3.2 - This default entry matches Elasticsearch’s current 8.3.x versions.
Custom - Opens an HTTP response object in the Custom API Version editor. This object replicates what an Elasticsearch server would send in its HTTP responses to a client. You can edit the version : number, and any other fields, as required to satisfy the Elastic Beat sending data to this Source. (This Custom option supports future Elasticsearch releases, as long as Elasticsearch keeps the same response-object structure.)

Proxy Mode

Enabling proxy mode allows Cribl Edge to proxy non-Bulk API requests to a downstream Elasticsearch server. This can be useful when integrating with Elasticsearch API senders like Elastic Endgame agents, which send requests that Cribl Edge does not natively support. Sliding Enable proxy mode to Yes exposes the following controls.

Proxy URL: URL of the Elasticsearch server that will proxy non-bulk requests, e.g.: http://elastic:9200.

Reject unauthorized certificates: Reject certificates that are not authorized by a trusted CA (e.g., the system’s CA). Defaults to No.

Remove headers: Enter any headers that you want removed from proxied requests. Press Tab or Return to separate header names.

Proxy request timeout: How long, in seconds, to wait for a proxy request to complete before aborting it. Defaults to 60 seconds; minimum timeout is 1 second.

To understand how Proxy request timeout interacts with the X-Forwarded-For header, see Overriding __srcIpPort with Client IP/Port.

Authentication method: Select one of the following options.

None: Don’t use authentication.
Manual: Displays Username and Password fields for you to enter HTTP Basic authentication credentials.
Secret: This option exposes a Credentials secret drop-down, in which you can select a stored secret that references the credentials described above. A Create link is available to store a new, reusable secret.

Connected Destinations

Select Send to Routes to enable conditional routing, filtering, and cloning of this Source’s data via the Routing table.

Select QuickConnect to send this Source’s data to one or more Destinations via independent, direct connections.

Field Normalization

When ingesting data from Elasticsearch, this Source performs the following field normalization:

Renames @timestamp to _time, maintaining millisecond precision.
Renames host.name to host.
Stores original host.name in __host to retain all host-related information.

The Elasticsearch Destination does the reverse, and it also recognizes the presence of __host.

Internal Settings

Cribl Edge uses a set of internal fields to assist in handling of data. These “meta” fields are not part of an event, but they are accessible, and Functions can use them to make processing decisions.

Fields for this Source:

__headers - Added only when Advanced Settings > Capture request headers is set to Yes.
__host
__id
__index
__inputId
__srcIpPort - See details below.
__type

Overriding `__srcIpPort` with Client IP/Port

The __srcIpPort field’s value contains the IP address and (optionally) port of the Elasticsearch client sending data to this Source.

When any proxies (including load balancers) lie between the Elasticsearch client and the Source, the last proxy adds an X-Forwarded-For header whose value is the IP/port of the original client. With multiple proxies, this header’s value will be an array, whose first item is the original client IP/port.

If X-Forwarded-For is present, and Advanced Settings > Enable proxy protocol is set to No, the original client IP/port in this header will override the value of __srcIpPort.

If Enable proxy protocol is set to Yes, the X-Forwarded-For header’s contents will not override the __srcIpPort value. (Here, the upstream proxy can convey the client IP/port without using this header.)

Configuring Elastic Beats

Beats are open-source data shippers that act as agents, sending data to Elasticsearch (or to other services, in this case Cribl Edge). The Beats most popular with Cribl users are Filebeat and Winlogbeat.

To set up a Beat to send data to Cribl Edge, edit the Beat’s YAML configuration file: filebeat.yml for Filebeat, winlogbeat.yml for Winlogbeat, and so on. In the config file, you’ll specify your Cribl Edge Elasticsearch Source endpoint as the Beat’s Elasticsearch output. To the Beat, Cribl Edge will appear as an instance of Elasticsearch.

If you’re using HTTP token authentication (which is disabled by default, both on-prem and on Cribl.Cloud): First, set the token. Then add the following to the Beat config file under output.elasticsearch.headers, substituting your token for myToken42:

output.elasticsearch:
  headers:
    Authorization: "myToken42"

Configuring an Elastic Agent

An Elastic Agent is single agent for logs, metrics, security data, and threat prevention that sends data to Elasticsearch (or to other services, in this case Cribl Edge).

When you are sending from an Elastic Agent to Cribl Edge, the elastic-agent.yml file doesn’t pay any attention to the settings for output.elasticsarch.allow_older_versions: true. As a result, the Elasticsearch API Source will not get any data.

To set it up correctly, go to the Advanced Settings tab, and change the API Version to Custom. In the Custom API Version editor, edit the version : number to match the version of the Elastic Agent you are using. This should allow the data to start flowing to the Source.

Elasticsearch API

Configuring Cribl Edge to Receive Elasticsearch Bulk API Data over HTTP(S)​

General Settings​

Optional Settings​

Authentication​

TLS Settings (Server Side)​

Persistent Queue Settings​

Processing Settings​

Fields​

Pre-Processing​

Advanced Settings​

Proxy Mode​

Connected Destinations​

Field Normalization​

Internal Settings​

Overriding __srcIpPort with Client IP/Port​

Configuring Elastic Beats​

Configuring an Elastic Agent​

Common Resources