These docs are for Cribl Edge 4.9 and are no longer actively maintained.
See the latest version (4.13).
Prometheus Edge Scraper Source
Cribl Edge supports receiving batched data from Prometheus targets.
Type: Internal | TLS Support: No | Event Breaker Support: No
This Source currently does not support Prometheus metadata.
This is an Internal (Pull) Source. To ingest Prometheus streaming data, see Prometheus Remote Write.
In addition to the functionality already supported in the existing Prometheus Scraper, this Source is designed to work in Kubernetes environments. It also no longer uses the internal jobs framework, which allows it to handle large-scale Cribl Edge deployments.
Configuring Cribl Edge to Scrape Prometheus Data
- On the top bar, select Products, and then select Cribl Edge. Under Fleets, select a Fleet. Next, you have two options:
- To configure via QuickConnect, navigate to Collect. Select Add Source and select the Source you want from the list, choosing either Select Existing or Add New.
- To configure via the Routes, select More > Sources. Select the Source you want. Next, select Add Source.
- Configure the following under General Settings:
- Input ID: Enter a unique name to identify this Source definition. If you clone this Source, Cribl Edge will add `-CLONE` to the original Input ID.
- Description: Optionally, enter a description.
- Discovery type: Use this drop-down to select a discovery mechanism for targets. See Discovery Type below for the options and the resulting controls displayed. Some Discovery type options replace this section’s Targets field with additional controls – while also adding an AWS IAM left tab to the modal.
- Poll interval: Specify how often (in seconds) to scrape targets for metrics. Defaults to `15` seconds. This value must be an integer that divides evenly into `60`.
- Next, you can configure the following Optional Settings:
- Extra dimensions: Specify the dimensions to include in events. Defaults to `host` and `source`.
- Tags: Optionally, add tags that you can use for filtering and grouping in the Cribl Edge UI. Use a tab or hard return between (arbitrary) tag names. These tags aren’t added to processed events.
- Optionally, configure any Authentication, Processing, and Advanced settings, or Connected Destinations outlined in the sections below.
- Select Save, then Commit & Deploy.
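The Poll interval rule above (an integer number of seconds that divides evenly into 60) can be checked with a small helper. This is only an illustrative sketch: `isValidPollInterval` is a hypothetical name, not part of Cribl Edge, and the requirement that the value be positive is an assumption that follows from the divisibility rule.

```javascript
// Hypothetical helper (not a Cribl Edge API): checks the documented rule that
// the poll interval is an integer number of seconds dividing evenly into 60.
function isValidPollInterval(seconds) {
  return Number.isInteger(seconds) && seconds > 0 && 60 % seconds === 0;
}

// Collect every interval between 1 and 60 seconds that satisfies the rule.
const validIntervals = [];
for (let s = 1; s <= 60; s++) {
  if (isValidPollInterval(s)) validIntervals.push(s);
}
console.log(validIntervals.join(', '));
```

Under this rule, the default of 15 is valid, while a value such as 45 is not.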
Discovery Type
Use this drop-down to select a discovery mechanism for targets. To manually enter a targets list, use Static (the default). To enable dynamic discovery of endpoints to scrape, select DNS, AWS EC2, Kubernetes Node, or Kubernetes Pods. Each selection exposes different controls and/or tabs, listed below.
Static Discovery
The `Static` option adds a General Settings > Targets field, in which you enter a list of specific Prometheus targets from which to pull metrics. Click Add Target to expose a table with the following options:
- Protocol: Select `http` (the default) or `https` as the protocol to use when collecting metrics.
- Host: Specify the host name to pull metrics from.
- Port: Specify the port number to append to the metrics URL for discovered targets. Defaults to `9090`.
- Path: Specify a path to use when collecting metrics from discovered targets. Defaults to `/metrics`.
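To illustrate how the four Static target fields combine into a scrape URL, here is a minimal sketch. `buildTargetUrl` is a hypothetical helper and the host names are made up; the defaults mirror the documented ones, but how Cribl Edge assembles the URL internally is not described on this page.

```javascript
// Hypothetical helper: assembles a scrape URL from the Static target fields.
// Defaults mirror the documented ones (http, 9090, /metrics).
function buildTargetUrl({ protocol = 'http', host, port = 9090, path = '/metrics' }) {
  if (!host) throw new Error('host is required');
  const url = new URL(`${protocol}://${host}:${port}`);
  url.pathname = path; // the URL setter normalizes a missing leading slash
  return url.toString();
}

// Example targets (made-up hosts).
console.log(buildTargetUrl({ host: 'node-exporter.example.internal' }));
console.log(buildTargetUrl({ protocol: 'https', host: '10.0.0.5', port: 9100 }));
```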
DNS Discovery
The `DNS` option adds the following extra fields to its General Settings tab:
- Record type: Select the DNS record type to resolve. Defaults to `SRV` (Service). Other options are `A` or `AAAA`.
- DNS names: Enter a list of DNS names to resolve.
- Protocol: Select `http` (the default) or `https` as the protocol to use when collecting metrics.
- Path: Specify a path to use when collecting metrics from discovered targets. Defaults to `/metrics`.
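As a rough illustration of SRV-based discovery, the sketch below turns SRV records into scrape URLs. The record shape matches what Node's `dns.promises.resolveSrv()` returns; the sample names and ports are invented, `srvRecordsToUrls` is a hypothetical helper, and taking the port from the SRV record is an assumption that this page does not confirm.

```javascript
// Sample records in the shape returned by Node's dns.promises.resolveSrv().
// Host names and ports are made up for illustration.
const sampleSrvRecords = [
  { name: 'metrics-1.example.internal', port: 9100, priority: 10, weight: 5 },
  { name: 'metrics-2.example.internal', port: 9100, priority: 10, weight: 5 },
];

// Hypothetical helper: maps SRV records to scrape URLs.
// Assumption: the target port comes from the SRV record itself.
function srvRecordsToUrls(records, protocol = 'http', path = '/metrics') {
  return records.map((r) => `${protocol}://${r.name}:${r.port}${path}`);
}

console.log(srvRecordsToUrls(sampleSrvRecords));
```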
AWS EC2 Discovery
The `AWS EC2` option adds AWS IAM to the modal, and adds extra fields to the General Settings tab:
- Protocol: Select `http` (the default) or `https` as the protocol to use when collecting metrics.
- Port: Specify the port number to append to the metrics URL corresponding to discovered targets. Defaults to `9090`.
- Path: Specify a path to use when collecting metrics from discovered targets. Defaults to `/metrics`.
- Region: Select the AWS region in which to discover EC2 instances with metrics endpoints to scrape.
- Use public IP: The `Yes` default uses the public IP address for discovered targets. Toggle to `No` to use a private IP address.
- Search filter: Click Add filter to apply filters when searching for EC2 instances. Each filter row provides two columns:
  - Filter name: Select standard attributes from the drop-down, or type in custom attributes.
  - Filter values: Enter values to match within this row’s attribute. Press `Enter` between values. (If you specify no values, the search will return only `running` EC2 instances.)
Selecting `AWS EC2` also adds controls to the Advanced Settings tab, as described below.
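The Search filter rows correspond naturally to the `Filters` parameter of the EC2 `DescribeInstances` API, which takes `{ Name, Values }` pairs. The sketch below shows one possible mapping. `toEc2Filters` is a hypothetical helper, the tag name is made up, and the fallback to `instance-state-name = running` is one reading of the note above, not confirmed behavior.

```javascript
// Hypothetical helper: converts Search filter rows into the Filters parameter
// shape used by the EC2 DescribeInstances API ({ Name, Values } pairs).
function toEc2Filters(rows) {
  const filters = rows
    .filter((row) => Array.isArray(row.values) && row.values.length > 0)
    .map((row) => ({ Name: row.name, Values: row.values }));
  // Assumption: with no filter values at all, restrict to running instances.
  if (filters.length === 0) {
    filters.push({ Name: 'instance-state-name', Values: ['running'] });
  }
  return filters;
}

// Example: match instances tagged Environment=prod or Environment=staging.
console.log(toEc2Filters([{ name: 'tag:Environment', values: ['prod', 'staging'] }]));
console.log(toEc2Filters([]));
```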
Kubernetes Node
The `Kubernetes Node` option supports collecting metrics from an endpoint on the local node where Cribl Edge is running. Selecting this option adds the following extra fields to the General Settings tab:
- Protocol: Select `http` (the default) or `https` as the protocol to use when collecting metrics.
- Port: Specify the port number to append to the metrics URL corresponding to discovered targets. Defaults to `9090`.
- Path: Specify a path to use when collecting metrics from discovered targets. Defaults to `/metrics`.
Selecting `Kubernetes Node` also adds a control to the Authentication tab, as described below.
You must first authorize the Source to discover a Kubernetes Node. For details, see RBAC for Prometheus Edge Scraper.
Kubernetes Pods
The `Kubernetes Pods` option supports building a list of endpoints to collect from Pods on the same node where Cribl Edge is running. This option adds the following extra fields to the General Settings tab. Use a custom expression to set the configuration properties on each field.
- Protocol: Set the protocol to use when collecting metrics. Defaults to: `metadata.annotations['prometheus.io/scheme'] || 'http'`
- Port: Specify the port number to append to the metrics URL corresponding to discovered targets. Defaults to: `metadata.annotations['prometheus.io/port'] || '9090'`
- Path: Specify a path to use when collecting metrics from discovered targets. Defaults to: `metadata.annotations['prometheus.io/path'] || '/metrics'`
- Filter rules: Optionally, add rules to determine which Pods to discover for metrics. If no rules are provided, Cribl Edge will search all Pods. Otherwise, it will search Pods where rules’ expressions evaluate to `true`. Each rule consists of an expression and an optional description.
  - Filter expression: JavaScript expression applied to Pods’ objects. Return `true` to include the Pod. Filters are based on the Kubernetes Pod Object definition, and are evaluated from top to bottom. The first expression evaluating to `false` excludes a Pod from collection. The default filter is `metadata.annotations['prometheus.io/scrape']`, which scrapes the Pod if the annotation is true.
  - Description: Optional description of the rule.
Selecting `Kubernetes Pods` also adds a control to the Authentication tab, as described below.
You must first authorize the Source to discover Kubernetes Pods. For details, see RBAC for Prometheus Edge Scraper.
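The default expressions and the top-to-bottom filter evaluation described above can be sketched as follows. This is a simplified illustration, not Cribl Edge's implementation: `compile` and `discoverTargets` are hypothetical helpers, the Pod objects are made up and contain only the fields the default expressions read, and using `status.podIP` as the host is an assumption.

```javascript
// Minimal Pod objects holding only the fields the default expressions read.
// Names, IPs, and annotation values are made up for illustration.
const pods = [
  {
    metadata: { name: 'api', annotations: { 'prometheus.io/scrape': 'true', 'prometheus.io/port': '8080' } },
    status: { podIP: '10.1.0.7' },
  },
  {
    metadata: { name: 'batch', annotations: {} },
    status: { podIP: '10.1.0.8' },
  },
];

// Hypothetical evaluator: exposes the Pod's top-level keys to the expression.
function compile(expression) {
  return new Function('pod', `const { metadata, spec, status } = pod; return (${expression});`);
}

// Documented default expressions for the Kubernetes Pods discovery type.
const defaults = {
  protocol: "metadata.annotations['prometheus.io/scheme'] || 'http'",
  port: "metadata.annotations['prometheus.io/port'] || '9090'",
  path: "metadata.annotations['prometheus.io/path'] || '/metrics'",
  rules: ["metadata.annotations['prometheus.io/scrape']"],
};

// Rules run top to bottom; the first falsy result excludes the Pod.
// An empty rule list keeps every Pod.
// Assumption: the scrape host is the Pod's status.podIP.
function discoverTargets(podList, config = defaults) {
  const rules = config.rules.map(compile);
  const [protocol, port, path] = [config.protocol, config.port, config.path].map(compile);
  return podList
    .filter((pod) => rules.every((rule) => Boolean(rule(pod))))
    .map((pod) => `${protocol(pod)}://${pod.status.podIP}:${port(pod)}${path(pod)}`);
}

console.log(discoverTargets(pods));                             // default rule
console.log(discoverTargets(pods, { ...defaults, rules: [] })); // no rules
```

Note that this sketch relies on plain JavaScript truthiness, so an annotation holding the string `'false'` would still pass the default rule here. It does not model how Cribl Edge interprets annotation values.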
Authentication (Prometheus)
Use the Authentication method drop-down to select one of these authentication options for Prometheus:
Manual: In the resulting Username and Password fields, enter Basic authentication credentials corresponding to your Prometheus targets.
Secret: This option exposes a Secret drop-down, in which you can select a stored secret that references your credentials described above. The secret can reside in Cribl Edge’s internal secrets manager or (if enabled) in an external KMS. Click Create if you need to configure a new secret.
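For reference, Basic authentication credentials are sent as an `Authorization` header containing the Base64 encoding of `username:password` (RFC 7617). The sketch below shows that encoding; `basicAuthHeader` is a hypothetical helper, and the credentials are placeholders.

```javascript
// Builds an HTTP Basic Authorization header value as defined in RFC 7617.
// Hypothetical helper for illustration; the credentials below are placeholders.
function basicAuthHeader(username, password) {
  const token = Buffer.from(`${username}:${password}`, 'utf8').toString('base64');
  return `Basic ${token}`;
}

const headers = { Authorization: basicAuthHeader('scraper', 'example-password') };
console.log(headers);
```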
Authentication Settings for Kubernetes
This additional setting appears only when General Settings > Discovery Type is set to `Kubernetes Pods` or `Kubernetes Node`.
- Kubernetes: Select this option to use Kubernetes authentication. There are no details to configure, because Cribl Edge will authenticate using the Environment Variables or Service Account tokens mounted in the Edge Pod(s). For details, see Deploying via Kubernetes.
AWS IAM
With the AWS EC2 target discovery type, you can configure AssumeRole behavior on AWS.
Assume Role
Enable for EC2: Toggle to `Yes` if you want to use `AssumeRole` credentials to access EC2.
AssumeRole ARN: Enter the Amazon Resource Name (ARN) of the role to assume.
External ID: Enter the External ID to use when assuming the role.
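These three settings line up with the request parameters of the AWS STS `AssumeRole` API (`RoleArn`, `ExternalId`, and the required `RoleSessionName`). A minimal sketch of that mapping follows. `buildAssumeRoleParams` is a hypothetical helper, the session name is an arbitrary placeholder, and the ARN and External ID are made up.

```javascript
// Hypothetical helper: maps the Assume Role settings to AWS STS AssumeRole
// request parameters. Returns null when AssumeRole is not enabled.
function buildAssumeRoleParams({ enabled, roleArn, externalId }) {
  if (!enabled) return null;
  // Basic shape check for an IAM role ARN, e.g. arn:aws:iam::123456789012:role/name
  if (!/^arn:aws[a-z-]*:iam::\d{12}:role\/.+$/.test(roleArn)) {
    throw new Error(`Not an IAM role ARN: ${roleArn}`);
  }
  // RoleSessionName is required by STS; this value is an arbitrary placeholder.
  const params = { RoleArn: roleArn, RoleSessionName: 'prometheus-edge-scraper' };
  if (externalId) params.ExternalId = externalId;
  return params;
}

console.log(buildAssumeRoleParams({
  enabled: true,
  roleArn: 'arn:aws:iam::123456789012:role/edge-discovery',
  externalId: 'example-external-id',
}));
```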
AWS Authentication Options
Auto: This default option uses the AWS SDK for JavaScript to automatically obtain credentials in the following order of attempts:
- IAM Roles for Amazon EC2: Loaded from AWS Identity and Access Management (IAM) roles attached to an EC2 instance.
- Shared Credentials File: Loaded from the shared credentials file (`~/.aws/credentials`).
- Environment Variables: Loaded from environment variables `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`.
- JSON File on Disk: Loaded from a JSON file on disk.
- Other Credential-Provider Classes: Other credential-provider classes provided by the AWS SDK for JavaScript.
The `Auto` method works both when running on AWS and in other environments where the necessary credentials are available through one of the above methods.
SSO Providers
When using the Auto authentication method, you can leverage SSO providers like SAML and Okta to issue temporary credentials. These credentials should be set in the environment variables `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`. The AWS SDK will then use these environment variables to authenticate.
Manual: If not running on AWS, you can select this option to enter a static set of user-associated IAM credentials (your access key and secret key) directly or by reference. This is useful for Edge Nodes not in an AWS VPC, for example, those running a private cloud. This option displays:
Access key: Enter your AWS access key. If not present, will fall back to the `env.AWS_ACCESS_KEY_ID` environment variable, or to the metadata endpoint for IAM role credentials.
Secret key: Enter your AWS secret key. If not present, will fall back to the `env.AWS_SECRET_ACCESS_KEY` environment variable, or to the metadata endpoint for IAM credentials.
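The fallback order described for the Manual option (configured value first, then the environment variable, then the metadata endpoint) can be sketched as below. `resolveStaticCredentials` is a hypothetical helper, not Cribl Edge code. The key values are the placeholder credentials used in AWS documentation, and the metadata-endpoint step is only signaled by returning `null`.

```javascript
// Hypothetical helper: resolves static credentials using the documented
// fallback order (configured value, then environment variable).
// A null result means the caller would fall back to the metadata endpoint.
function resolveStaticCredentials(config, env = process.env) {
  const accessKeyId = config.accessKey || env.AWS_ACCESS_KEY_ID;
  const secretAccessKey = config.secretKey || env.AWS_SECRET_ACCESS_KEY;
  if (accessKeyId && secretAccessKey) return { accessKeyId, secretAccessKey };
  return null;
}

// Placeholder keys taken from AWS documentation examples.
const exampleEnv = {
  AWS_ACCESS_KEY_ID: 'AKIAIOSFODNN7EXAMPLE',
  AWS_SECRET_ACCESS_KEY: 'wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY',
};

console.log(resolveStaticCredentials({}, exampleEnv)); // values from the environment
console.log(resolveStaticCredentials({}, {}));         // null: metadata endpoint next
```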
Secret: If not running on AWS, you can select this option to supply a stored secret that references an AWS access key and secret key. This option displays:
- Secret key pair: Use the drop-down to select a secret key pair that you’ve configured in Cribl Edge’s internal secrets manager or (if enabled) an external KMS. Click Create if you need to configure a key pair.
Processing Settings
Fields
In this section, you can add Fields to each event using Eval-like functionality.
Name: Field name.
Value: JavaScript expression to compute field’s value, enclosed in quotes or backticks. (Can evaluate to a constant.)
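As an illustration of Name/Value pairs, the sketch below evaluates two Value expressions against a sample event: a quoted constant and a backtick template. `applyFields` is a hypothetical helper, the event is made up, and exposing the event's fields to the expression as variables is an assumption of this sketch rather than a description of Cribl Edge's evaluator.

```javascript
// Hypothetical helper: evaluates each field's Value expression with the
// event's own fields in scope, then adds the result under the field's Name.
function applyFields(event, fields) {
  const names = Object.keys(event);
  const values = Object.values(event);
  const result = { ...event };
  for (const { name, value } of fields) {
    result[name] = new Function(...names, `return (${value});`)(...values);
  }
  return result;
}

// Made-up event and field definitions.
const sampleEvent = { host: 'edge-01', source: 'prometheus', _value: 42 };
const fieldDefs = [
  { name: 'env', value: "'prod'" },                  // constant, enclosed in quotes
  { name: 'origin', value: '`${source}@${host}`' },  // template, enclosed in backticks
];

console.log(applyFields(sampleEvent, fieldDefs));
```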
Pre-Processing
In this section’s Pipeline drop-down list, you can select a single existing Pipeline to process data from this input before the data is sent through the Routes.
Disk Spooling
Enable disk spooling: Whether to save metrics to disk. When set to `Yes`, it exposes this section’s remaining fields.
Bucket time span: The amount of time that data is held in each bucket before it’s written to disk. The default is 10 minutes (`10m`).
Max data size: Maximum disk space the persistent metrics can consume. Once reached, Cribl Edge will delete older data. Example values: `420 MB`, `4 GB`. Default value: `1 GB`.
Max data age: How long to retain data. Once reached, Cribl Edge will delete older data. Example values: `2h`, `4d`. Default value: `24h` (24 hours).
Compression: Defaults to `gzip`.
Cribl Edge will write metrics to the default location: `CRIBL_HOME/state/spool/in/edge_prometheus/${inputId}`. Use the environment variable `CRIBL_SPOOL_DIR` to change the default path.
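To make the example values above concrete, the following sketch converts time spans such as `10m` or `4d` to seconds and sizes such as `420 MB` to bytes. Both parsers are hypothetical helpers. The accepted unit set and the use of binary multiples (1 MB = 1024 × 1024 bytes) are assumptions, since this page does not specify them.

```javascript
// Assumed unit tables; this page does not define the accepted units.
const TIME_UNITS = { s: 1, m: 60, h: 3600, d: 86400 };
const SIZE_UNITS = { KB: 1024, MB: 1024 ** 2, GB: 1024 ** 3, TB: 1024 ** 4 };

// Hypothetical parser: '10m' -> seconds.
function parseTimeSpan(text) {
  const match = /^(\d+)\s*([smhd])$/.exec(text.trim());
  if (!match) throw new Error(`Unrecognized time span: ${text}`);
  return Number(match[1]) * TIME_UNITS[match[2]];
}

// Hypothetical parser: '420 MB' -> bytes (binary multiples assumed).
function parseDataSize(text) {
  const match = /^(\d+(?:\.\d+)?)\s*(KB|MB|GB|TB)$/i.exec(text.trim());
  if (!match) throw new Error(`Unrecognized data size: ${text}`);
  return Number(match[1]) * SIZE_UNITS[match[2].toUpperCase()];
}

console.log(parseTimeSpan('10m'), parseTimeSpan('24h'));
console.log(parseDataSize('1 GB'), parseDataSize('420 MB'));
```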
Advanced Settings
HTTP Connection Timeout: The amount of time (in milliseconds) to wait for the HTTP connection to establish before it times out. `1-60000` is the allowable range. Enter `0` to disable the timeout.
Environment: If you’re using GitOps, optionally use this field to specify a single Git branch on which to enable this configuration. If empty, the config will be enabled everywhere.
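The HTTP Connection Timeout rule above (1 to 60000 milliseconds, with 0 disabling the timeout) can be expressed as a small check. `normalizeConnectionTimeout` is a hypothetical helper. Requiring an integer and representing "no timeout" as `null` are choices made for this sketch.

```javascript
// Hypothetical helper: validates the documented range (1-60000 ms) and maps
// 0 to null, meaning "no timeout". Integer-only input is an assumption.
function normalizeConnectionTimeout(ms) {
  if (!Number.isInteger(ms) || ms < 0 || ms > 60000) {
    throw new RangeError(`Timeout must be 0 or an integer from 1 to 60000, got ${ms}`);
  }
  return ms === 0 ? null : ms;
}

console.log(normalizeConnectionTimeout(5000));
console.log(normalizeConnectionTimeout(0));
```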
Advanced Settings for AWS
These additional settings appear only when General Settings > Discovery Type is set to `AWS EC2`.
Endpoint: Specify an EC2-compatible service endpoint. If empty, this defaults to AWS’ Region-specific endpoint.
Signature version: The signature version to use for signing EC2 requests. Defaults to `v4`.
Reuse connections: Whether to reuse connections between requests. The default setting (`Yes`) can improve performance.
Reject unauthorized certificates: Whether to reject certificates that cannot be verified against a valid Certificate Authority (for example, self-signed certificates). Defaults to `Yes`, the restrictive option.
Connected Destinations
Select Send to Routes to enable conditional routing, filtering, and cloning of this Source’s data via the Routing table.
Select QuickConnect to send this Source’s data to one or more Destinations via independent, direct connections.
Internal Fields
Cribl Edge uses a set of internal fields to assist in handling of data. These “meta” fields are not part of an event, but they are accessible, and Functions can use them to make processing decisions.
Fields for this Source:
__source
__isBroken
__inputId
__final
__criblMetrics
__channel
__cloneCount
__kube_pod when Discovery type is set to Kubernetes Node.
__kube_node and __kube_pod when Discovery type is set to Kubernetes Pods.
Proxying Requests
If you need to proxy HTTP/S requests, see System Proxy Configuration.
Troubleshooting
"Dropping request because token invalid","authToken": "Bas…Njc="
The specified token is invalid. Note that the above message is logged only at the debug level.