Windows System Metrics Details
Consult the reference for the metrics reported by the System Metrics Source on Windows.
Events generated by the Windows Metrics Source include metrics metadata to designate dimension and metric fields. The host field contains the hostname, and is included as a dimension in all of them.
The collectors include System, CPU, Memory, Network, and Disk.
In the Source’s configuration modal, you can set the level of detail for each type of metric:
- Basic enables minimal metrics, averaged or aggregated.
- All enables full, detailed metrics, specified for individual CPUs, interfaces, and so on.
- Custom displays sub-menus and buttons from which you can choose a level of detail (Basic, All, Custom, or Disabled) for each type of event.
- Disabled means that no metrics will be generated.
Basic and Custom have different meanings depending on the event type; each section below describes them.
The tables outline the metrics emitted for each mode (Basic or Custom) and where applicable, the dimensions (to indicate where the metrics are coming from).
System
With System Metrics enabled, Cribl Edge captures CPU load averages, uptime, and count. The Custom option allows you to include detailed metrics. These are Windows-specific metrics, including OS information, system uptime, and CPU architecture.
Metrics for the overall system include the following:
| Name | Description | Type | Dimensions | Mode |
|---|---|---|---|---|
windows_cs_logical_processors | Number of installed logical processors. | Gauge | N/A | Basic |
windows_cs_physical_memory_bytes | Total installed physical memory. | Gauge | N/A | Basic |
windows_os_info | Contains full product name & version in labels. | Gauge | product, version | Basic |
windows_os_physical_memory_free_bytes | Bytes of physical memory currently unused and available. | Gauge | N/A | Basic |
windows_os_processes | Number of process contexts currently loaded or running on the operating system. | Gauge | N/A | Basic |
windows_system_processor_queue_length | Number of threads in the processor queue. | Gauge | N/A | Basic |
windows_system_threads | Number of Windows system threads. | Gauge | N/A | Basic |
windows_cs_hostname | Labeled system hostname information. | Gauge | hostname, domain, fqdn | Custom: Detailed |
windows_cpu_info | Labeled CPU information. | Gauge | architecture, device_id, description, family, l2_cache_size, l3_cache_size, name | Custom: Detailed |
windows_os_paging_limit_bytes | Total number of bytes that can be stored in the operating system paging files. | Gauge | N/A | Custom: Detailed |
windows_os_paging_free_bytes | Number of bytes that can be mapped into the operating system paging files without causing any other pages to be swapped out. | Gauge | N/A | Custom: Detailed |
windows_os_processes_limit | Maximum number of process contexts the operating system can support. | Gauge | N/A | Custom: Detailed |
windows_os_process_memory_limit_bytes | Maximum number of bytes of memory that can be allocated to a process. | Gauge | N/A | Custom: Detailed |
windows_os_virtual_memory_bytes | Bytes of virtual memory. | Gauge | N/A | Custom: Detailed |
windows_system_exception_dispatches_total | Total exceptions dispatched by the system. | Counter | N/A | Custom: Detailed |
windows_system_system_calls_total | Total combined calls to Windows NT system service routines by all processes running on the computer. | Counter | N/A | Custom: Detailed |
windows_system_system_up_time | Time of last boot of system. | Gauge | N/A | Custom: Detailed |
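
As a rough illustration of how the Basic-mode gauges above can be combined, the following sketch derives the share of physical memory in use from windows_cs_physical_memory_bytes and windows_os_physical_memory_free_bytes. The function name and the sample values are hypothetical; only the two metric names come from the table.

```python
def physical_memory_used_percent(total_bytes: float, free_bytes: float) -> float:
    """Percent of installed physical memory that is currently in use."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    used_bytes = total_bytes - free_bytes
    return 100.0 * used_bytes / total_bytes


# Hypothetical sample: 16 GiB installed, 4 GiB free.
sample = {
    "windows_cs_physical_memory_bytes": 16 * 1024**3,
    "windows_os_physical_memory_free_bytes": 4 * 1024**3,
}
print(physical_memory_used_percent(
    sample["windows_cs_physical_memory_bytes"],
    sample["windows_os_physical_memory_free_bytes"],
))  # 75.0
```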
CPU
Basic level captures active, user, system, idle, and iowait percentages over all CPUs.
Custom level toggles the following on or off: Per CPU metrics, Detailed metrics (i.e., metrics for all CPU states), and CPU time metrics (i.e., raw, monotonic CPU time counters).
Metrics for CPUs include the following:
| Name | Description | Type | Dimensions | Mode |
|---|---|---|---|---|
windows_cpu_percent_active_all | CPU percent active usage | Gauge | core, mode | Basic |
windows_cpu_percent_active | CPU percent active usage per CPU | Gauge | core, mode | Basic or Custom: Per CPU and CPU time metrics |
windows_cpu_percent | CPU percent active usage | Gauge | core, mode is set to user, idle, privileged, interrupt, dpc | Basic or Custom: Per CPU and CPU time metrics |
windows_cpu_parking_status | Parking Status represents whether a processor is parked or not. | Counter | core | Basic or Custom: Per CPU and CPU time metrics |
windows_cpu_core_frequency_mhz | Core frequency in megahertz. | Gauge | core | Basic or Custom: Per CPU and CPU time metrics |
windows_cpu_time_all_total | Sum of all cpu_time across all cores. | Gauge | mode | Basic or Custom: CPU time metrics |
windows_cpu_cstate_seconds_total | Time spent in low-power idle state. | Counter | core, state | Custom: Per CPU and Detailed metrics |
windows_cpu_time_total | Time that processor spent in different modes (idle, user, system etc.). | Counter | core, mode | Custom: Per CPU and CPU time metrics |
windows_cpu_interrupts_total | Total number of received and serviced hardware interrupts. | Counter | core | Custom: Per CPU or Detailed metrics |
windows_cpu_dpcs_total | Total number of received and serviced deferred procedure calls (DPCs). | Counter | core | Custom: Per CPU or Detailed metrics |
windows_cpu_clock_interrupts_total | Total number of received and serviced clock tick interrupts. | Counter | core | Custom: Per CPU or Detailed metrics |
windows_cpu_idle_break_events_total | Total number of times the processor was woken from idle. | Counter | core | Custom: Per CPU or Detailed metrics |
windows_cpu_processor_performance | Average performance of the processor while it is executing instructions. | Gauge | core | Custom: Per CPU and Detailed metrics |
windows_cpu_percent_processor_performance | Average performance of the processor while it is executing instructions, as a percentage of the nominal performance of the processor. | Gauge | core | Custom: Per CPU and Detailed metrics |
windows_cpu_percent_processor_utility | Amount of work a processor is completing, as a percentage of the amount of work the processor could complete if it were running at its nominal performance and never idle. | Gauge | core | Custom: Per CPU and Detailed metrics |
windows_cpu_average_idle_time | Processor idle time. | Gauge | mode | Custom: Per CPU and Detailed metrics |
windows_cpu_percent_privilege_utility | Amount of work a processor is completing while executing in privileged mode. | Gauge | mode | Custom: Per CPU and Detailed metrics |
windows_cpu_interrupts_total_per_sec | Number of received and serviced hardware interrupts, averaged per second. | Gauge | mode | Custom: Per CPU and Detailed metrics |
windows_cpu_dpcs_total_per_sec | Number of received and serviced deferred procedure calls (DPCs), averaged per second. | Gauge | mode | Custom: Per CPU and Detailed metrics |
windows_cpu_clock_interrupts_total_per_sec | Number of received and serviced clock tick interrupts, averaged per second. | Gauge | mode | Custom: Per CPU and Detailed metrics |
windows_cpu_idle_break_events_total_per_sec | Number of times the processor was woken from idle, averaged per second. | Gauge | mode | Custom: Per CPU and Detailed metrics |
windows_cpu_percent_all_total | Percentage of all cpu_time across all cores. | Gauge | mode is set to dpc, idle, interrupt, privilege, user, active | Custom: CPU time metrics |
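
To show how a raw counter such as windows_cpu_time_total relates to the percentage gauges, here is a minimal sketch that turns two samples of one core's per-mode counters into percentage shares. It assumes the counter values are cumulative seconds and that the modes together account for all elapsed CPU time; the function name and sample numbers are hypothetical, and the Source's own calculation may differ.

```python
from typing import Dict


def cpu_mode_percent(prev: Dict[str, float], curr: Dict[str, float]) -> Dict[str, float]:
    """Share of CPU time spent in each mode between two samples of one core.

    Both arguments map a mode name (for example "idle" or "user") to the
    cumulative time counter for that mode. Assumption: values are seconds.
    """
    deltas = {mode: curr[mode] - prev.get(mode, 0.0) for mode in curr}
    total = sum(deltas.values())
    if total <= 0:
        # No time elapsed (or counters reset): report zeros rather than divide.
        return {mode: 0.0 for mode in deltas}
    return {mode: 100.0 * delta / total for mode, delta in deltas.items()}


# Hypothetical samples taken 10 seconds apart for a single core.
previous = {"idle": 1000.0, "user": 200.0, "privileged": 100.0}
current = {"idle": 1006.0, "user": 203.0, "privileged": 101.0}
print(cpu_mode_percent(previous, current))
# {'idle': 60.0, 'user': 30.0, 'privileged': 10.0}
```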
Memory
Basic level captures total, used, available, swap_free, and swap_total.
Custom level toggles Detailed metrics on or off. (These are metrics for all memory states.)
Metrics for memory include the following:
| Name | Description | Type | Dimensions | Mode |
|---|---|---|---|---|
windows_memory_available_bytes | Physical memory that is immediately available for allocation to a process or for system use. This is the sum of the standby (cached), free, and zero page lists. | Gauge | N/A | Basic |
windows_memory_cache_bytes | Number of bytes currently being used by the filesystem cache | Gauge | N/A | Basic |
windows_memory_cache_bytes_peak | Maximum number of CacheBytes after the system was last restarted. | Gauge | N/A | Basic |
windows_memory_cache_faults_total | Faults that occur when a page sought in the filesystem cache is not found there and must be retrieved elsewhere in memory (soft fault) or from disk (hard fault). | Counter | N/A | Basic |
windows_memory_commit_limit | Bytes of virtual memory that can be committed without having to extend paging files. | Gauge | N/A | Basic |
windows_memory_committed_bytes | Bytes of committed virtual memory. | Gauge | N/A | Basic |
windows_memory_pool_paged_allocs_total | Calls to allocate space in the paged pool, regardless of the amount of space allocated in each call. | Counter | N/A | Basic |
windows_memory_pool_paged_bytes | Number of bytes in the paged pool. | Gauge | N/A | Basic |
windows_memory_pool_paged_resident_bytes | The size, in bytes, of the portion of the paged pool that is currently resident and active in physical memory. The paged pool is an area of the system virtual memory used for objects that can be written to disk when they are not being used. | Gauge | N/A | Basic |
windows_memory_demand_zero_faults_total | Number of Zeroed pages required to satisfy faults. Windows uses zeroed pages as a security measure to prevent processes from seeing data stored by earlier processes that previously used the memory space. | Counter | N/A | Custom: Detailed |
windows_memory_free_and_zero_page_list_bytes | Physical memory allocated to free and zero pages, in bytes. This memory does not contain cached data. It is immediately available for allocation to a process or system use. | Gauge | N/A | Custom: Detailed |
windows_memory_free_system_page_table_entries | Page table entries not being used by the system. | Gauge | N/A | Custom: Detailed |
windows_memory_modified_page_list_bytes | Physical memory, in bytes, assigned to the modified page list. This memory contains cached data and code that is not actively in use by processes, the system, and the system cache. This memory needs to be written out before it will be available for allocation to a process or for system use. | Gauge | N/A | Custom: Detailed |
windows_memory_page_faults_total | Overall rate at which faulted pages are handled by the processor. | Counter | N/A | Custom: Detailed |
windows_memory_swap_page_reads_total | Disk page reads (a single read operation reading several pages is still only counted once). | Counter | N/A | Custom: Detailed |
windows_memory_swap_pages_read_total | Pages read across all page reads (i.e., counting all pages read even if they are read in a single operation). | Counter | N/A | Custom: Detailed |
windows_memory_swap_pages_written_total | Pages written across all page writes (i.e., counting all pages written even if they are written in a single operation). | Counter | N/A | Custom: Detailed |
windows_memory_swap_page_operations_total | Total number of swap page reads and writes (PagesPersec). | Counter | N/A | Custom: Detailed |
windows_memory_swap_page_writes_total | Disk page writes (a single write operation writing several pages is still only counted once). | Counter | N/A | Custom: Detailed |
windows_memory_pool_nonpaged_allocs_total | The number of calls to allocate space in the non-paged pool. The non-paged pool is an area of system memory for objects that cannot be written to disk, and must remain in physical memory as long as they are allocated. | Counter | N/A | Custom: Detailed |
windows_memory_pool_nonpaged_bytes | Non-paged pool, in bytes. The non-paged pool is an area of the system virtual memory that is used for objects that cannot be written to disk, but must remain in physical memory as long as they are allocated. | Gauge | N/A | Custom: Detailed |
windows_memory_standby_cache_core_bytes | Physical memory, in bytes, that is assigned to the core standby cache page lists. This memory contains cached data and code that is not actively in use by processes, the system, and the system cache. It is immediately available for allocation to a process or for system use. If the system runs out of available free and zero memory, memory on lower priority standby cache page lists will be repurposed before memory on higher priority standby cache page lists. | Gauge | N/A | Custom: Detailed |
windows_memory_standby_cache_normal_priority_bytes | Physical memory, in bytes, that is assigned to the normal priority standby cache page lists. This memory contains cached data and code that is not actively in use by processes, the system, and the system cache. It is immediately available for allocation to a process or for system use. If the system runs out of available free and zero memory, memory on lower priority standby cache page lists will be repurposed before memory on higher priority standby cache page lists. | Gauge | N/A | Custom: Detailed |
windows_memory_standby_cache_reserve_bytes | Physical memory, in bytes, that is assigned to the reserve standby cache page lists. This memory contains cached data and code that is not actively in use by processes, the system, and the system cache. It is immediately available for allocation to a process or for system use. If the system runs out of available free and zero memory, memory on lower priority standby cache page lists will be repurposed before memory on higher priority standby cache page lists. | Gauge | N/A | Custom: Detailed |
windows_memory_system_cache_resident_bytes | The size, in bytes, of the portion of the system file cache which is currently resident and active in physical memory. | Gauge | N/A | Custom: Detailed |
windows_memory_system_code_resident_bytes | The size, in bytes, of the pageable operating system code that is currently resident and active in physical memory. This value is a component of Memory/System Code Total Bytes. Memory/System Code Resident Bytes (and Memory/System Code Total Bytes) does not include code that must remain in physical memory and cannot be written to disk. | Gauge | N/A | Custom: Detailed |
windows_memory_system_code_total_bytes | The size, in bytes, of the pageable operating system code currently mapped into the system virtual address space. This value is calculated by summing the bytes in Ntoskrnl.exe, Hal.dll, the boot drivers, and filesystems loaded by Ntldr/osloader. This counter does not include code that must remain in physical memory and cannot be written to disk. | Gauge | N/A | Custom: Detailed |
windows_memory_system_driver_resident_bytes | The size, in bytes, of the pageable physical memory being used by device drivers. It is the working set (physical memory area) of the drivers. This value is a component of Memory/System Driver Total Bytes, which also includes driver memory that has been written to disk. Neither Memory/System Driver Resident Bytes nor Memory/System Driver Total Bytes includes memory that cannot be written to disk. | Gauge | N/A | Custom: Detailed |
windows_memory_system_driver_total_bytes | The size, in bytes, of the pageable virtual memory currently being used by device drivers. Pageable memory can be written to disk when it is not being used. It includes both physical memory (Memory/System Driver Resident Bytes) and code and data paged to disk. It is a component of Memory/System Code Total Bytes. | Gauge | N/A | Custom: Detailed |
windows_memory_transition_faults_total | Rate at which page faults are resolved, by recovering pages that were being used by another process sharing the page, or were on the modified page list or the standby list, or were being written to disk at the time of the page fault. The pages were recovered without additional disk activity. Transition faults are counted in numbers of faults; because only one page is faulted in each operation, it is also equal to the number of pages faulted. | Counter | N/A | Custom: Detailed |
windows_memory_transition_pages_repurposed_total | Rate at which the number of transition cache pages were reused for a different purpose. These pages would have otherwise remained in the page cache to provide a (fast) soft fault (instead of retrieving it from backing store) in the event the page was accessed in the future. | Counter | N/A | Custom: Detailed |
windows_memory_write_copies_total | The number of page faults caused by attempting to write that were satisfied by copying the page from elsewhere in physical memory. | Counter | N/A | Custom: Detailed |
windows_memory_used_percent | Percent of committed memory used. | Gauge | N/A | Custom: Detailed |
windows_memory_available_percent | Percent of committed memory available. | Gauge | N/A | Custom: Detailed |
windows_memory_cache_faults_per_sec | Rate of cache faults computed per sec. | Gauge | N/A | Custom: Detailed |
windows_memory_demand_zero_faults_per_sec | Rate of Zeroed pages required to satisfy faults computed per sec. | Gauge | N/A | Custom: Detailed |
windows_memory_page_faults_per_sec | Rate at which faulted pages are handled by the processor computed per sec. | Gauge | N/A | Custom: Detailed |
windows_memory_page_reads_per_sec | Disk page reads computed per sec. | Gauge | N/A | Custom: Detailed |
windows_memory_pages_input_per_sec | Pages read across all page reads computed per sec. | Gauge | N/A | Custom: Detailed |
windows_memory_pages_output_per_sec | Pages written across all page writes computed per sec. | Gauge | N/A | Custom: Detailed |
windows_memory_pages_per_sec | Total number of swap page reads and writes computed per sec. | Gauge | N/A | Custom: Detailed |
windows_memory_page_writes_per_sec | Disk page writes computed per sec. | Gauge | N/A | Custom: Detailed |
windows_memory_transition_faults_sec | Rate at which page faults are resolved computed per sec. | Gauge | N/A | Custom: Detailed |
windows_memory_transition_pages_repurposed_per_sec | Rate at which the number of transition cache pages were reused for a different purpose computed per sec. | Gauge | N/A | Custom: Detailed |
windows_memory_write_copies_per_sec | Rate at which the number of page faults caused by attempting to write that were satisfied by copying the page from elsewhere in physical memory computed per sec. | Gauge | N/A | Custom: Detailed |
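
Many rows above come in pairs: a cumulative Counter (for example, windows_memory_page_faults_total) and a per-second Gauge (windows_memory_page_faults_per_sec). The sketch below shows one plausible way to derive a comparable rate yourself from two counter samples. The function name, timestamps, and values are hypothetical, and this is not a statement of how the Source computes its own per-second gauges.

```python
from typing import Optional


def per_second_rate(prev_value: float, prev_ts: float,
                    curr_value: float, curr_ts: float) -> Optional[float]:
    """Average per-second increase of a counter between two samples.

    Returns None when the interval is not positive or the counter went
    backwards (for example, because a reboot reset it).
    """
    elapsed = curr_ts - prev_ts
    if elapsed <= 0 or curr_value < prev_value:
        return None
    return (curr_value - prev_value) / elapsed


# Hypothetical samples of a page-fault counter taken 10 seconds apart.
print(per_second_rate(120_000, 0.0, 120_900, 10.0))  # 90.0
```

Returning None on a reset, rather than a negative rate, keeps a single reboot from producing a misleading spike in downstream charts.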
Network
Basic level captures bytes, packets, errors, and connections over all interfaces.
Custom level exposes the following:
- The Interface filter, which specifies which network interfaces to include or exclude. (An empty filter will include all metrics.)
- Per interface metrics, which toggle on or off.
- Detailed metrics, which toggle on or off. If on, the Protocol metrics toggle appears, allowing you to choose whether to generate metrics for ICMP, ICMPMsg, IP, TCP, UDP, and UDPLite.
Metrics for networks include the following:
| Name | Description | Type | Dimensions | Mode |
|---|---|---|---|---|
windows_net_packets_outbound_discarded_total | Total outbound packets to be discarded even though no errors had been detected to prevent transmission. | Counter | nic | Custom: Detailed metrics |
windows_net_packets_outbound_errors_total | Total packets that could not be transmitted due to errors. | Counter | nic | Custom: Detailed metrics |
windows_net_packets_received_discarded_total | Total inbound packets that were chosen to be discarded even though no errors had been detected to prevent delivery. | Counter | nic | Custom: Detailed metrics |
windows_net_packets_received_errors_total | Total packets that could not be received due to errors. | Counter | nic | Custom: Detailed metrics |
windows_net_packets_received_unknown_total | Total packets received by interface that were discarded because of an unknown or unsupported protocol. | Counter | nic | Custom: Detailed metrics |
windows_net_packets_received_non_unicast_total | Total non-unicast (subnet broadcast or subnet multicast) packets that are delivered to a higher-layer protocol. | Counter | nic | Custom: Detailed metrics |
windows_net_packets_received_non_unicast_total_per_sec | Rate at which non-unicast (subnet broadcast or subnet multicast) packets are delivered to a higher-layer protocol computed per sec. | Gauge | nic | Custom: Per Interface metrics and Detailed metrics |
windows_net_packets_received_unicast_total | Total subnet-unicast packets that are delivered to a higher-layer protocol. | Counter | nic | Custom: Detailed metrics |
windows_net_packets_received_unicast_total_per_sec | Rate at which subnet-unicast packets are delivered to a higher-layer protocol computed per sec. | Gauge | nic | Custom: Per Interface metrics and Detailed metrics |
windows_net_packets_sent_unicast_total | Total packets requested to be transmitted to subnet-unicast addresses by higher-level protocols. | Counter | nic | Custom: Detailed metrics |
windows_net_packets_sent_non_unicast_total_per_sec | Rate at which packets are requested to be transmitted to non-unicast (subnet broadcast or subnet multicast) addresses by higher-level protocols, computed per sec. | Gauge | nic | Custom: Detailed metrics |
windows_net_bytes_received_total | Total bytes received by interface. | Counter | nic | Custom: Per Interface metrics |
windows_net_bytes_received_total_per_sec | Total bytes received by interface computed per sec. | Gauge | nic | Custom: Per Interface metrics and Detailed metrics |
windows_net_bytes_sent_total | Total bytes transmitted by interface. | Counter | nic | Custom: Per Interface metrics |
windows_net_bytes_sent_total_per_sec | Total bytes transmitted by interface computed per sec. | Gauge | nic | Custom: Per Interface metrics and Detailed metrics |
windows_net_bytes_total | Total bytes received and transmitted by interface. | Counter | nic | Custom: Per Interface metrics |
windows_net_bytes_total_per_sec | Total bytes received and transmitted by interface per sec. | Gauge | nic | Custom: Per Interface metrics and Detailed metrics |
windows_net_packets_received_total_per_sec | Total packets received by interface computed per sec. | Gauge | nic | Custom: Per Interface metrics and Detailed metrics |
windows_net_packets_total | Total packets received and transmitted by interface. | Counter | nic | Custom: Per Interface metrics |
windows_net_packets_total_per_sec | Total packets received and transmitted by interface computed per sec. | Gauge | nic | Custom: Per Interface metrics and Detailed metrics |
windows_net_packets_sent_total | Total packets transmitted by interface. | Counter | nic | Custom: Per Interface metrics |
windows_net_packets_sent_total_per_sec | Total packets transmitted by interface computed per sec. | Gauge | nic | Custom: Per Interface metrics and Detailed metrics |
windows_net_packets_sent_unicast_total_per_sec | Rate at which packets are requested to be transmitted to subnet-unicast addresses by higher-level protocols computed per sec. | Gauge | nic | Custom: Per Interface metrics and Detailed metrics |
windows_net_packets_sent_non_unicast_total | Total packets requested to be transmitted to non-unicast (subnet broadcast or subnet multicast) addresses by higher-level protocols. | Counter | nic | Custom: Per Interface metrics and Detailed metrics |
windows_net_current_bandwidth_bytes | Estimate of the interface’s current bandwidth in bytes per second. | Gauge | nic | Custom: Detailed metrics |
windows_tcp_connection_failures_all_total | Number of times TCP connections have made a direct transition to the CLOSED state from the SYN-SENT state or the SYN-RCVD state, plus the number of times TCP connections have made a direct transition from the SYN-RCVD state to the LISTEN state. | Counter | af | Custom: Detailed and Protocol metrics (TCPv4 and TCPv6) |
windows_tcp_connections_active_all_total | Number of times TCP connections have made a direct transition from the CLOSED state to the SYN-SENT state. | Counter | af | Custom: Detailed and Protocol metrics (TCPv4 and TCPv6) |
windows_tcp_connections_established | Number of TCP connections for which the current state is either ESTABLISHED or CLOSE-WAIT. | Gauge | af | Custom: Detailed and Protocol metrics (TCPv4 and TCPv6) |
windows_tcp_connections_passive_all_total | Number of times TCP connections have made a direct transition from the LISTEN state to the SYN-RCVD state. | Counter | af | Custom: Detailed and Protocol metrics (TCPv4 and TCPv6) |
windows_tcp_connections_reset_total | Number of times TCP connections have made a direct transition to the CLOSED state from either the ESTABLISHED state or the CLOSE-WAIT state. | Counter | af | Custom: Detailed and Protocol metrics (TCPv4 and TCPv6) |
windows_tcp_segments_total | Total segments sent or received using the TCP protocol. | Counter | af | Custom: Detailed and Protocol metrics (TCPv4 and TCPv6) |
windows_tcp_segments_received_all_total | Total segments received using the TCP protocol. | Counter | af | Custom: Detailed and Protocol metrics (TCPv4 and TCPv6) |
windows_tcp_segments_retransmitted_all_total | Total segments retransmitted using the TCP protocol. | Counter | af | Custom: Detailed and Protocol metrics (TCPv4 and TCPv6) |
windows_tcp_segments_sent_all_total | Total segments sent using the TCP protocol. | Counter | af | Custom: Detailed and Protocol metrics (TCPv4 and TCPv6) |
windows_tcp_segments_all_total_per_sec | Total segments sent or received using the TCP protocol computed per sec. | Gauge | af | Custom: Detailed and Protocol metrics (TCPv4 and TCPv6) |
windows_tcp_segments_received_all_total_per_sec | Total segments received using the TCP protocol computed per sec. | Gauge | af | Custom: Detailed and Protocol metrics (TCPv4 and TCPv6) |
windows_tcp_segments_retransmitted_all_total_per_sec | Total segments retransmitted using the TCP protocol computed per sec. | Gauge | af | Custom: Detailed and Protocol metrics (TCPv4 and TCPv6) |
windows_tcp_segments_sent_all_total_per_sec | Total segments sent using the TCP protocol computed per sec. | Gauge | af | Custom: Detailed and Protocol metrics (TCPv4 and TCPv6) |
windows_net_datagrams_all_total | Total datagrams sent or received using the UDP protocol. | Counter | af | Custom: Detailed and Protocol metrics (UDPv4 and UDPv6) |
windows_net_datagrams_no_port_all_total | Rate of received UDP datagrams for which there was no application at the destination port. | Gauge | af | Custom: Detailed and Protocol metrics (UDPv4 and UDPv6) |
windows_net_datagrams_received_all_total | Rate at which UDP datagrams are delivered to UDP users. | Gauge | af | Custom: Detailed and Protocol metrics (UDPv4 and UDPv6) |
windows_net_datagrams_received_errors_all_total | Number of received UDP datagrams that could not be delivered excluding errors due to lack of an application at the destination port. | Gauge | af | Custom: Detailed and Protocol metrics (UDPv4 and UDPv6) |
windows_net_datagrams_sent_all_total | Total UDP datagrams sent from the entity. | Gauge | af | Custom: Detailed and Protocol metrics (UDPv4 and UDPv6) |
windows_net_datagrams_no_port_all_total_per_sec | Rate of received UDP datagrams for which there was no application at the destination port, computed per sec. | Gauge | af | Custom: Detailed and Protocol metrics (UDPv4 and UDPv6) |
windows_net_datagrams_received_all_total_per_sec | Rate at which UDP datagrams are delivered to UDP users, computed per sec. | Gauge | af | Custom: Detailed and Protocol metrics (UDPv4 and UDPv6) |
windows_net_datagrams_received_errors_all_total_per_sec | Rate of received UDP datagrams that could not be delivered, excluding errors due to lack of an application at the destination port, computed per sec. | Gauge | af | Custom: Detailed and Protocol metrics (UDPv4 and UDPv6) |
windows_net_datagrams_sent_all_total_per_sec | Rate at which UDP datagrams are sent from the entity, computed per sec. | Gauge | af | Custom: Detailed and Protocol metrics (UDPv4 and UDPv6) |
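
As an example of combining two of the interface gauges above, this sketch estimates how busy a NIC is by dividing windows_net_bytes_total_per_sec by windows_net_current_bandwidth_bytes. It takes the table's description of the bandwidth metric (bytes per second) at face value; the function name and sample values are hypothetical.

```python
def interface_utilization_percent(bytes_per_sec: float,
                                  bandwidth_bytes_per_sec: float) -> float:
    """Throughput as a percentage of the interface's estimated bandwidth."""
    if bandwidth_bytes_per_sec <= 0:
        # Some virtual interfaces may report no bandwidth; avoid dividing by zero.
        return 0.0
    return 100.0 * bytes_per_sec / bandwidth_bytes_per_sec


# Hypothetical sample: 25 MB/s of traffic on a link estimated at 125 MB/s.
print(interface_utilization_percent(25_000_000, 125_000_000))  # 20.0
```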
Disk
Basic level captures disk usage (%), bytes read and written, and read and write operations, over all mounted disks.
Custom level exposes the following:
- The Volume filter, which specifies which Windows volumes to include or exclude. Supports wildcards and ! (not) operators. An empty filter will include all volumes.
- Per volume metrics, which toggle on or off.
- Detailed metrics, which toggle on or off.
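
The include/exclude behavior of a filter with wildcards and ! operators can be sketched as follows. This is only an illustration under stated assumptions: patterns are evaluated in order, the first matching pattern decides, a volume that matches nothing is excluded, and matching is case-insensitive. None of those details are confirmed by this page, and volume_included is a hypothetical helper, not part of Cribl Edge.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def volume_included(volume: str, patterns: Iterable[str]) -> bool:
    """Decide whether a volume passes an ordered list of filter patterns."""
    pattern_list: List[str] = list(patterns)
    if not pattern_list:
        return True  # An empty filter includes all volumes.
    for pattern in pattern_list:
        negated = pattern.startswith("!")
        glob = pattern[1:] if negated else pattern
        # Assumption: case-insensitive match, first matching pattern wins.
        if fnmatchcase(volume.upper(), glob.upper()):
            return not negated
    return False  # Assumption: no match means the volume is excluded.


volumes = ["C:", "D:", "E:"]
print([v for v in volumes if volume_included(v, ["!D:", "*"])])  # ['C:', 'E:']
print([v for v in volumes if volume_included(v, [])])  # ['C:', 'D:', 'E:']
```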
Metrics for Disk include the following:
| Name | Description | Type | Dimensions | Mode |
|---|---|---|---|---|
windows_logical_disk_requests_queued | Outstanding requests on the disk at the time the performance data is collected | Gauge | volume | Basic |
windows_logical_disk_read_bytes_total | Rate at which bytes are transferred from the disk during read operations. | Counter | volume | Basic |
windows_logical_disk_reads_total | Rate of read operations on the disk. | Counter | volume | Basic |
windows_logical_disk_write_bytes_total | Rate at which bytes are transferred to the disk during write operations. | Counter | volume | Basic |
windows_logical_disk_writes_total | Rate of write operations on the disk. | Counter | volume | Basic |
windows_logical_disk_write_latency_seconds_total | Shows the average time, in seconds, of a write operation to the disk. | Counter | volume | Custom: Detailed metrics |
windows_logical_disk_read_latency_seconds_total | Shows the average time, in seconds, of a read operation from the disk. | Counter | volume | Custom: Detailed metrics |
windows_logical_disk_read_write_latency_seconds_total | Shows the time, in seconds, of the average disk transfer. | Counter | volume | Custom: Detailed metrics |
windows_logical_disk_read_seconds_total | Seconds the disk was busy servicing read requests. | Counter | volume | Custom: Detailed metrics |
windows_logical_disk_idle_seconds_total | Seconds the disk was idle (not servicing read/write requests). | Counter | volume | Custom: Detailed metrics |
windows_logical_disk_split_ios_total | Number of I/Os to the disk split into multiple I/Os. | Counter | volume | Custom: Detailed metrics |
windows_logical_disk_percent_read_time | Percent of time the disk was busy servicing read requests. | Gauge | volume | Custom: Detailed metrics |
windows_logical_disk_percent_write_time | Percent of time the disk was busy servicing write requests. | Gauge | volume | Custom: Detailed metrics |
windows_logical_disk_percent_time | Percent of time the disk was busy servicing read or write requests. | Gauge | volume | Custom: Detailed metrics |
windows_logical_disk_percent_idle_time | Percent of time the disk was idle. | Gauge | volume | Custom: Detailed metrics |
windows_logical_disk_percent_free_space | Percent space free on volume. | Gauge | volume | Custom: Detailed metrics |
windows_logical_disk_average_disk_sec_per_transfer | Measures the average time of data reads and writes on the disk. | Gauge | volume | Custom: Detailed metrics |
windows_logical_disk_average_disk_sec_per_read | Average time, in seconds, of a read operation from the disk. | Gauge | volume | Custom: Detailed metrics |
windows_logical_disk_average_disk_sec_per_write | Average time, in seconds, of a write operation to the disk. | Gauge | volume | Custom: Detailed metrics |
windows_logical_disk_split_ios_per_sec | Rate the I/Os to the disk were split into multiple I/Os per sec. | Gauge | volume | Custom: Detailed metrics |
windows_logical_disk_bytes_per_sec | Exposes the rate bytes are transferred to or from the disk during write or read operations per sec. | Gauge | volume | Custom: Detailed metrics |
windows_logical_disk_read_bytes_per_sec | Exposes the rate at which bytes are transferred from the disk during read operations per sec. | Gauge | volume | Custom: Detailed metrics |
windows_logical_disk_reads_per_sec | Exposes the rate of read operations on the disk per sec. | Gauge | volume | Custom: Detailed metrics |
windows_logical_disk_transfers_per_sec | How fast data is being read and written for a specific logical disk per sec. | Gauge | volume | Custom: Detailed metrics |
windows_logical_disk_write_bytes_per_sec | Exposes the rate at which bytes are transferred from the disk during write operations per sec. | Gauge | volume | Custom: Detailed metrics |
The windows_logical_disk_free_bytes and windows_logical_disk_size_bytes metrics are not updated in real time and might lag by 10 to 15 minutes. This matches the behavior of the Windows performance counters.
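The table above distinguishes Counter metrics (cumulative totals, such as windows_logical_disk_split_ios_total) from Gauge metrics (point-in-time values, such as windows_logical_disk_split_ios_per_sec). As a rough illustration of how the two kinds relate, the following Python sketch derives a per-second rate from two samples of a cumulative counter. The counter_rate function, the sample values, and the handling of counter resets are assumptions made for this example, not a description of how Cribl Edge computes its metrics.

```python
from typing import Optional


def counter_rate(prev_value: float, prev_ts: float,
                 curr_value: float, curr_ts: float) -> Optional[float]:
    """Return the per-second rate between two samples of a cumulative counter.

    Returns None when the interval is not positive or the counter went
    backwards (for example, after a restart), because no rate can be derived.
    """
    elapsed = curr_ts - prev_ts
    if elapsed <= 0 or curr_value < prev_value:
        return None
    return (curr_value - prev_value) / elapsed


# Hypothetical samples of windows_logical_disk_split_ios_total, 10 seconds apart.
rate = counter_rate(prev_value=1200, prev_ts=100.0, curr_value=1250, curr_ts=110.0)
print(rate)  # 5.0
```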
Process Metrics
With Process Metrics enabled, Cribl Edge captures process-specific metrics from Windows servers and reports them as events. This allows you to monitor specific processes on Cribl.Cloud instances. You can generate events for any process object.
Process-specific metrics are not affected by the Host Metrics detail setting.
For information on how to configure the Windows Metrics Source to generate process-specific metrics, see the relevant section of the Windows Metrics page.
Process-specific metrics include the following:
| Name | Description | Type | Dimensions |
|---|---|---|---|
process_start_time | Time the process started. | gauge | process_cmdline, process_service |
process_cpu_time_total | Elapsed time that the process’s threads have spent executing instructions in either privileged mode or user mode. Included in this count is code executed to handle some hardware interrupts and trap conditions. | counter | process_cmdline, process_service |
process_handles | Total number of handles the process has open. This number is the sum of the handles currently open by each thread in the process. | gauge | process_cmdline, process_service |
process_io_bytes_total | Total number of bytes issued to I/O operations in either read, write, or other mode. This property counts all I/O activity generated by the process to include file, network, and device I/Os. Read and write modes include data operations; other mode includes those that don’t involve data, like control operations. | counter | process_cmdline, process_service |
process_io_operations_total | Total number of I/O operations issued in either read, write, or other mode. This property counts all I/O activity generated by the process to include file, network, and device I/Os. Read and write mode includes data operations; other mode includes those that do not involve data, such as control operations. | counter | process_cmdline, process_service |
process_page_faults_total | Total number of page faults by threads executing in this process. A page fault occurs when a thread refers to a virtual memory page that is not in its working set in main memory. This does not cause the page to be fetched from disk if it is on the standby list and hence already in main memory, or if it is in use by another process with which the page is shared. | counter | process_cmdline, process_service |
process_page_file_bytes | Current number of bytes this process has used in the paging files. Paging files are used to store pages of memory used by the process that are not contained in other files. Paging files are shared by all processes, and lack of space in paging files can prevent other processes from allocating memory. | gauge | process_cmdline, process_service |
process_pool_bytes | Last observed number of bytes in the paged or nonpaged pool. The paged pool is an area of system memory (physical memory used by the operating system) for objects that can be written to disk when they are not being used. The nonpaged pool is an area of system memory (physical memory used by the operating system) for objects that cannot be written to disk, but must remain in physical memory as long as they are allocated. Nonpaged pool bytes are calculated differently than paged pool bytes, so they may not equal the total of paged pool bytes. | gauge | process_cmdline, process_service |
process_priority_base | Current base priority of this process. Threads within a process can raise and lower their own base priority, relative to the process’s base priority. | gauge | process_cmdline, process_service |
process_private_bytes | Current number of bytes this process has allocated that can’t be shared with other processes. | gauge | process_cmdline, process_service |
process_threads | Number of threads currently active in this process. Every running process has at least one thread. | gauge | process_cmdline, process_service |
process_virtual_bytes | Current size, in bytes, of the virtual address space that the process is using. Use of virtual space doesn’t imply use of either disk or main memory pages. Virtual space is finite and when the process uses too much, it can limit its ability to load libraries. | gauge | process_cmdline, process_service |
process_working_set_private_byte | Size of the working set, in bytes, that is used for this process only and not shared or shareable by other processes. | gauge | process_cmdline, process_service |
process_working_set_peak_bytes | Maximum size, in bytes, of the working set of this process at any point in time. The working set is the set of memory pages touched recently by the threads in the process. If free memory is above a threshold, pages are left in the working set of a process even if they are not in use. When free memory falls below a threshold, pages are trimmed from working sets. If pages are needed, they will be soft-faulted back into the working set before they leave main memory. | gauge | process_cmdline, process_service |
process_working_set_bytes | Current number of bytes in the working set of this process. If free memory is above a threshold, pages are left in the working set of a process even if they aren’t in use. When free memory falls below a threshold, pages are trimmed from working sets. If pages are needed, they will be soft-faulted back into the working set before they leave main memory. | gauge | process_cmdline, process_service |
Additional Process Metrics Properties
These properties, along with the dimensions listed in the previous table, are included with each metric:
- process_pid
- process_ppid
- process_exe_path
- process_name
Missing property values are represented as null.
Windows Filter Expressions
Cribl Edge allows you to filter Windows events based on various properties. This enables you to isolate specific events of interest for analysis or further processing.
For example, to identify all processes running as explorer, use the following filter expression:
processName == 'explorer'
This expression will return all instances of the explorer process.
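To illustrate what such a filter does, here is a minimal Python sketch that applies the same equality test to a few made-up events. Cribl Edge evaluates its own filter expressions; the event dictionaries and the matches_filter helper below are hypothetical and only mirror the comparison shown above.

```python
from typing import Any, Dict, List

# Made-up events for illustration; real events carry many more fields.
events: List[Dict[str, Any]] = [
    {"processName": "explorer", "process_pid": 4120},
    {"processName": "svchost", "process_pid": 812},
    {"processName": "explorer", "process_pid": 9344},
]


def matches_filter(event: Dict[str, Any]) -> bool:
    """Mirror the expression processName == 'explorer' for a single event."""
    return event.get("processName") == "explorer"


# Keep only the events that satisfy the filter.
matching = [e for e in events if matches_filter(e)]
print([e["process_pid"] for e in matching])  # [4120, 9344]
```

Events that lack the processName field simply do not match, which is why the helper uses a lookup that tolerates missing keys.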
GPU Metrics
With GPU Metrics enabled, Cribl Edge captures temperature, utilization, memory, power, clock, and throttle metrics from NVIDIA GPUs.
The All option adds per-GPU events with identifying dimensions (gpu_index, gpu_name, gpu_uuid) and detailed PCIe, ECC, encoder, and power metrics.
In Basic mode, or when Per GPU metrics is disabled in Custom mode, the collector emits a single aggregated metric event for all GPUs during each collection interval.
These metrics use the same name with the _all suffix, such as node_gpu_temperature_celsius_all.
Across GPUs, those _all values use sum, average, or maximum, depending on how the metric is defined.
The node_gpu_count gauge appears only on the aggregated event and does not have an _all equivalent.
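The aggregation described above can be sketched as follows. This is an illustration only: which statistic applies to which metric is an assumption made for the example (sum for memory bytes, average for utilization, maximum for temperature), and the aggregate helper is hypothetical rather than the collector's actual code.

```python
from statistics import mean
from typing import Callable, Dict, List, Sequence

# Assumed mapping from metric name to aggregation function; the real choice
# depends on how each metric is defined.
AGGREGATIONS: Dict[str, Callable[[Sequence[float]], float]] = {
    "node_gpu_memory_used_bytes": sum,
    "node_gpu_utilization_gpu_percent": mean,
    "node_gpu_temperature_celsius": max,
}


def aggregate(per_gpu: List[Dict[str, float]]) -> Dict[str, float]:
    """Collapse per-GPU readings into one event with _all-suffixed names."""
    # node_gpu_count has no _all equivalent; it only exists on this event.
    event: Dict[str, float] = {"node_gpu_count": len(per_gpu)}
    for name, func in AGGREGATIONS.items():
        values = [gpu[name] for gpu in per_gpu if name in gpu]
        if values:
            event[f"{name}_all"] = func(values)
    return event


# Made-up readings for two GPUs.
readings = [
    {"node_gpu_memory_used_bytes": 2_000_000_000,
     "node_gpu_utilization_gpu_percent": 40.0,
     "node_gpu_temperature_celsius": 61.0},
    {"node_gpu_memory_used_bytes": 1_000_000_000,
     "node_gpu_utilization_gpu_percent": 60.0,
     "node_gpu_temperature_celsius": 70.0},
]
print(aggregate(readings))
```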
| Name | Description | Type | Mode |
|---|---|---|---|
node_gpu_count | Number of NVIDIA GPUs detected (aggregated event only). | Gauge | Basic |
node_gpu_temperature_celsius | GPU core temperature in degrees Celsius. | Gauge | All or Custom: Per GPU |
node_gpu_temperature_memory_celsius | GPU memory temperature in degrees Celsius. | Gauge | All or Custom: Per GPU |
node_gpu_fan_speed_percent | Fan speed as a percent of maximum. | Gauge | All or Custom: Per GPU |
node_gpu_utilization_gpu_percent | Percent utilization of GPU compute. | Gauge | All or Custom: Per GPU |
node_gpu_utilization_memory_percent | Percent utilization of GPU memory. | Gauge | All or Custom: Per GPU |
node_gpu_utilization_encoder_percent | Percent utilization of the hardware video encoder. | Gauge | All or Custom: Per GPU |
node_gpu_utilization_decoder_percent | Percent utilization of the hardware video decoder. | Gauge | All or Custom: Per GPU |
node_gpu_power_draw_watts | Instantaneous GPU power draw in watts (see also average and instantaneous detail metrics). | Gauge | All or Custom: Per GPU |
node_gpu_power_limit_watts | Configured GPU power limit in watts. | Gauge | All or Custom: Per GPU |
node_gpu_memory_total_bytes | Total GPU framebuffer memory size in bytes. | Gauge | All or Custom: Per GPU |
node_gpu_memory_used_bytes | GPU memory used in bytes. | Gauge | All or Custom: Per GPU |
node_gpu_memory_free_bytes | GPU memory free in bytes. | Gauge | All or Custom: Per GPU |
node_gpu_memory_reserved_bytes | GPU memory reserved but not actively used in bytes. | Gauge | All or Custom: Per GPU |
node_gpu_clocks_gr_mhz | GPU graphics clock in megahertz. | Gauge | All or Custom: Per GPU |
node_gpu_clocks_sm_mhz | Streaming multiprocessor clock in megahertz. | Gauge | All or Custom: Per GPU |
node_gpu_clocks_mem_mhz | GPU memory clock in megahertz. | Gauge | All or Custom: Per GPU |
node_gpu_clocks_video_mhz | Video clock in megahertz. | Gauge | All or Custom: Per GPU |
node_gpu_clocks_max_gr_mhz | Maximum advertised graphics clock in megahertz. | Gauge | All or Custom: Per GPU |
node_gpu_clocks_max_mem_mhz | Maximum advertised memory clock in megahertz. | Gauge | All or Custom: Per GPU |
node_gpu_clocks_max_sm_mhz | Maximum advertised streaming multiprocessor clock in megahertz. | Gauge | All or Custom: Per GPU |
node_gpu_throttle_gpu_idle | Whether the GPU idle clock throttle reason is active; 1 = active, 0 = not active. | Gauge | All or Custom: Per GPU |
node_gpu_throttle_applications_clocks_setting | Whether the applications clocks-setting throttle reason is active; 1 = active, 0 = not active. | Gauge | All or Custom: Per GPU |
node_gpu_throttle_sw_power_cap | Whether the software power cap throttle reason is active; 1 = active, 0 = not active. | Gauge | All or Custom: Per GPU |
node_gpu_throttle_hw_thermal_slowdown | Whether the hardware thermal slowdown throttle reason is active; 1 = active, 0 = not active. | Gauge | All or Custom: Per GPU |
node_gpu_throttle_hw_power_brake_slowdown | Whether the hardware power brake slowdown throttle reason is active; 1 = active, 0 = not active. | Gauge | All or Custom: Per GPU |
node_gpu_throttle_sw_thermal_slowdown | Whether the software thermal slowdown throttle reason is active; 1 = active, 0 = not active. | Gauge | All or Custom: Per GPU |
node_gpu_throttle_sync_boost | Whether the synchronous boost throttle reason is active; 1 = active, 0 = not active. | Gauge | All or Custom: Per GPU |
node_gpu_throttle_hw_slowdown | Whether the hardware slowdown throttle reason is active; 1 = active, 0 = not active. | Gauge | All or Custom: Per GPU |
node_gpu_pcie_gen_current | Current PCIe link generation. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_pcie_gen_gpucurrent | Negotiated PCIe link generation reported by the GPU. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_pcie_gen_max | Maximum PCIe link generation. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_pcie_gen_gpumax | Maximum PCIe link generation supported by the GPU. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_pcie_gen_hostmax | Maximum PCIe link generation supported by the host. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_pcie_width_current | Current PCIe link width in lanes. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_pcie_width_max | Maximum PCIe link width in lanes. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_accounting_buffer_size | Process accounting statistics buffer size in KiB. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_utilization_jpeg_percent | JPEG engine utilization in percent. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_utilization_ofa_percent | OFA engine utilization in percent. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_encoder_session_count | Number of NVENC encoder sessions. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_encoder_average_fps | Encoder throughput in average frames per second. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_encoder_average_latency_us | Encoder latency in microseconds. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_ecc_corrected_volatile_device_memory | Corrected ECC error count (volatile) for device memory. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_ecc_corrected_volatile_dram | Corrected ECC error count (volatile) for DRAM. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_ecc_corrected_volatile_register_file | Corrected ECC error count (volatile) for register file. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_ecc_corrected_volatile_l1_cache | Corrected ECC error count (volatile) for L1 cache. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_ecc_corrected_volatile_l2_cache | Corrected ECC error count (volatile) for L2 cache. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_ecc_corrected_volatile_texture_memory | Corrected ECC error count (volatile) for texture memory. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_ecc_corrected_volatile_cbu | Corrected ECC error count (volatile) for CBU. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_ecc_corrected_volatile_sram | Corrected ECC error count (volatile) for SRAM. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_ecc_corrected_volatile_total | Corrected ECC error count (volatile) for memory (total). | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_ecc_corrected_aggregate_device_memory | Corrected ECC error count (aggregate lifetime) for device memory. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_ecc_corrected_aggregate_dram | Corrected ECC error count (aggregate lifetime) for DRAM. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_ecc_corrected_aggregate_register_file | Corrected ECC error count (aggregate lifetime) for register file. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_ecc_corrected_aggregate_l1_cache | Corrected ECC error count (aggregate lifetime) for L1 cache. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_ecc_corrected_aggregate_l2_cache | Corrected ECC error count (aggregate lifetime) for L2 cache. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_ecc_corrected_aggregate_texture_memory | Corrected ECC error count (aggregate lifetime) for texture memory. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_ecc_corrected_aggregate_cbu | Corrected ECC error count (aggregate lifetime) for CBU. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_ecc_corrected_aggregate_sram | Corrected ECC error count (aggregate lifetime) for SRAM. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_ecc_corrected_aggregate_total | Corrected ECC error count (aggregate lifetime) for memory (total). | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_ecc_uncorrected_volatile_device_memory | Uncorrected ECC error count (volatile) for device memory. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_ecc_uncorrected_volatile_dram | Uncorrected ECC error count (volatile) for DRAM. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_ecc_uncorrected_volatile_register_file | Uncorrected ECC error count (volatile) for register file. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_ecc_uncorrected_volatile_l1_cache | Uncorrected ECC error count (volatile) for L1 cache. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_ecc_uncorrected_volatile_l2_cache | Uncorrected ECC error count (volatile) for L2 cache. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_ecc_uncorrected_volatile_texture_memory | Uncorrected ECC error count (volatile) for texture memory. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_ecc_uncorrected_volatile_cbu | Uncorrected ECC error count (volatile) for CBU. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_ecc_uncorrected_volatile_sram | Uncorrected ECC error count (volatile) for SRAM. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_ecc_uncorrected_volatile_total | Uncorrected ECC error count (volatile) for memory (total). | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_ecc_uncorrected_aggregate_device_memory | Uncorrected ECC error count (aggregate lifetime) for device memory. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_ecc_uncorrected_aggregate_dram | Uncorrected ECC error count (aggregate lifetime) for DRAM. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_ecc_uncorrected_aggregate_register_file | Uncorrected ECC error count (aggregate lifetime) for register file. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_ecc_uncorrected_aggregate_l1_cache | Uncorrected ECC error count (aggregate lifetime) for L1 cache. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_ecc_uncorrected_aggregate_l2_cache | Uncorrected ECC error count (aggregate lifetime) for L2 cache. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_ecc_uncorrected_aggregate_texture_memory | Uncorrected ECC error count (aggregate lifetime) for texture memory. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_ecc_uncorrected_aggregate_cbu | Uncorrected ECC error count (aggregate lifetime) for CBU. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_ecc_uncorrected_aggregate_sram | Uncorrected ECC error count (aggregate lifetime) for SRAM. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_ecc_uncorrected_aggregate_total | Uncorrected ECC error count (aggregate lifetime) for memory (total). | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_ecc_uncorrected_volatile_sram_parity | Uncorrected ECC error count (volatile) for SRAM parity. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_ecc_uncorrected_volatile_sram_secded | Uncorrected ECC error count (volatile) for SRAM SEC-DED. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_ecc_uncorrected_aggregate_sram_parity | Uncorrected ECC error count (aggregate lifetime) for SRAM parity. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_ecc_uncorrected_aggregate_sram_secded | Uncorrected ECC error count (aggregate lifetime) for SRAM SEC-DED. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_ecc_uncorrected_aggregate_sram_threshold_exceeded | Uncorrected ECC error count (aggregate lifetime) for SRAM threshold exceeded. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_ecc_uncorrected_aggregate_sram_l2 | Uncorrected ECC error count (aggregate lifetime) for SRAM L2. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_ecc_uncorrected_aggregate_sram_sm | Uncorrected ECC error count (aggregate lifetime) for SRAM SM. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_ecc_uncorrected_aggregate_sram_mcu | Uncorrected ECC error count (aggregate lifetime) for SRAM MCU. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_ecc_uncorrected_aggregate_sram_pcie | Uncorrected ECC error count (aggregate lifetime) for SRAM PCIe. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_ecc_uncorrected_aggregate_sram_other | Uncorrected ECC error count (aggregate lifetime) for SRAM other. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_retired_pages_sbe | Number of framebuffer pages retired after single-bit errors. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_retired_pages_dbe | Number of framebuffer pages retired after double-bit errors. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_remapped_rows_correctable | Number of DRAM rows requiring remapping after correctable ECC. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_remapped_rows_uncorrectable | Number of DRAM rows requiring remapping after uncorrectable ECC. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_remapped_rows_histogram_max | Remapped-row histogram count for worst-case mapping impact. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_remapped_rows_histogram_high | Remapped-row histogram count for high mapping impact. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_remapped_rows_histogram_partial | Remapped-row histogram count for partial mapping impact. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_remapped_rows_histogram_low | Remapped-row histogram count for low mapping impact. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_remapped_rows_histogram_none | Remapped-row histogram count for no measurable mapping impact. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_temperature_throttle_celsius | GPU throttle temperature threshold (t limit) in degrees Celsius. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_power_draw_average_watts | Average GPU power draw in watts. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_power_draw_instant_watts | Instantaneous GPU power draw in watts. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_enforced_power_limit_watts | Enforced GPU power limit in watts. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_power_default_limit_watts | Default GPU power limit in watts. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_power_min_limit_watts | Minimum GPU power limit setting in watts. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_power_max_limit_watts | Maximum GPU power limit setting in watts. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_module_power_draw_average_watts | Average module-level power draw in watts. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_module_power_draw_instant_watts | Instantaneous module-level power draw in watts. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_module_power_limit_watts | Module-level power limit in watts. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_module_enforced_power_limit_watts | Enforced module-level power limit in watts. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_module_power_default_limit_watts | Default module-level power limit in watts. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_module_power_min_limit_watts | Minimum module-level power limit setting in watts. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_module_power_max_limit_watts | Maximum module-level power limit setting in watts. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_power_smoothing_primary_floor_watts | Power-smoothing primary floor in watts. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_power_smoothing_secondary_floor_watts | Power-smoothing secondary floor in watts. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_power_smoothing_min_primary_activation_offset | Minimum activation offset for the primary power floor. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_power_smoothing_min_primary_activation_point | Minimum activation point for the primary power floor. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_power_smoothing_window_multiplier | Power-smoothing window multiplier. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_power_smoothing_curr_secondary_floor_watts | Active profile secondary floor in watts. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_power_smoothing_curr_primary_act_win_multiplier | Active profile primary-floor activation-window multiplier. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_power_smoothing_curr_primary_tar_win_multiplier | Active profile primary-floor target-window multiplier. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_power_smoothing_curr_primary_act_offset | Primary-floor activation offset for the active smoothing profile. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_power_smoothing_admin_secondary_floor_watts | Admin override secondary floor in watts. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_power_smoothing_admin_primary_act_win_multiplier | Admin primary activation-window multiplier override. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_power_smoothing_admin_primary_tar_win_multiplier | Admin primary target-window multiplier override. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_power_smoothing_admin_primary_act_offset | Admin primary activation offset. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_clocks_applications_gr_mhz | Requested application-driven graphics clock in megahertz. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_clocks_applications_mem_mhz | Requested application-driven memory clock in megahertz. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_clocks_default_applications_gr_mhz | Default application graphics clock in megahertz. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_clocks_default_applications_mem_mhz | Default application memory clock in megahertz. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_protected_memory_total_bytes | Total protected VRAM capacity in bytes. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_protected_memory_used_bytes | Protected VRAM used in bytes. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_protected_memory_free_bytes | Protected VRAM free in bytes. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_throttle_counter_sw_power_cap | Count of software power-cap clock-throttle events. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_throttle_counter_sync_boost | Count of synchronous-boost throttle events. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_throttle_counter_sw_thermal_slowdown | Count of software thermal slowdown events. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_throttle_counter_hw_thermal_slowdown | Count of hardware thermal slowdown events. | Gauge | All or Custom: Per GPU and Detailed |
node_gpu_throttle_counter_hw_power_brake_slowdown | Count of hardware power-brake slowdown events. | Gauge | All or Custom: Per GPU and Detailed |
gpu_serial | Board serial number. | Property | All or Custom: Per GPU |
driver_version | Installed NVIDIA driver version. | Property | All or Custom: Per GPU |
pci_bus_id | Full PCI bus identifier in domain:bus:device.function format. | Property | All or Custom: Per GPU |
pci_bus | PCI bus number. | Property | All or Custom: Per GPU |
pci_device | PCI device number on the bus. | Property | All or Custom: Per GPU |
pci_device_id | PCI vendor and device ID. | Property | All or Custom: Per GPU |
persistence_mode | Whether persistence mode is enabled. | Property | All or Custom: Per GPU |
pstate | Current GPU performance state (P0 is maximum performance). | Property | All or Custom: Per GPU |
pci_domain | PCI domain number. | Property | All or Custom: Per GPU and Detailed |
pci_base_class | PCI base class code. | Property | All or Custom: Per GPU and Detailed |
pci_sub_class | PCI sub-class code. | Property | All or Custom: Per GPU and Detailed |
pci_sub_device_id | PCI subsystem device ID. | Property | All or Custom: Per GPU and Detailed |
vgpu_driver_cap_heterogenous_multi_vgpu | Whether the driver supports heterogeneous multi-vGPU. | Property | All or Custom: Per GPU and Detailed |
vgpu_device_cap_fractional_multi_vgpu | Whether the device supports fractional multi-vGPU. | Property | All or Custom: Per GPU and Detailed |
vgpu_device_cap_heterogeneous_time_slice_profile | Whether the device supports heterogeneous time-slice profiles. | Property | All or Custom: Per GPU and Detailed |
vgpu_device_cap_heterogeneous_time_slice_sizes | Whether the device supports heterogeneous time-slice sizes. | Property | All or Custom: Per GPU and Detailed |
vgpu_device_cap_homogeneous_placements | Whether the device supports homogeneous vGPU placements. | Property | All or Custom: Per GPU and Detailed |
vgpu_device_cap_mig_time_slicing | Whether the device supports MIG time-slicing. | Property | All or Custom: Per GPU and Detailed |
vgpu_device_cap_mig_time_slicing_mode | Current MIG time-slicing mode. | Property | All or Custom: Per GPU and Detailed |
display_mode | Whether the display feature is enabled. | Property | All or Custom: Per GPU and Detailed |
display_attached | Whether a display is physically attached to this GPU. | Property | All or Custom: Per GPU and Detailed |
display_active | Whether a display is actively receiving output from this GPU. | Property | All or Custom: Per GPU and Detailed |
addressing_mode | Memory addressing mode. | Property | All or Custom: Per GPU and Detailed |
accounting_mode | Whether process accounting is enabled. | Property | All or Custom: Per GPU and Detailed |
driver_model_current | Active driver model, such as WDDM or TCC on Windows; typically N/A on Linux. | Property | All or Custom: Per GPU and Detailed |
driver_model_pending | Pending driver model, applied after the next reboot; typically N/A on Linux. | Property | All or Custom: Per GPU and Detailed |
vbios_version | Video BIOS version string. | Property | All or Custom: Per GPU and Detailed |
inforom_img | InfoROM image version. | Property | All or Custom: Per GPU and Detailed |
inforom_oem | InfoROM OEM object version. | Property | All or Custom: Per GPU and Detailed |
inforom_ecc | InfoROM ECC object version. | Property | All or Custom: Per GPU and Detailed |
inforom_pwr | InfoROM power management object version. | Property | All or Custom: Per GPU and Detailed |
inforom_checksum_validation | InfoROM data integrity check result. | Property | All or Custom: Per GPU and Detailed |
gpu_recovery_action | Recommended recovery action following a GPU error. | Property | All or Custom: Per GPU and Detailed |
reset_status_reset_required | Whether a GPU reset is required to clear an error condition. | Property | All or Custom: Per GPU and Detailed |
reset_status_drain_and_reset_recommended | Whether a drain-and-reset is the recommended recovery procedure. | Property | All or Custom: Per GPU and Detailed |
gom_current | Current GPU operation mode (All On, Compute, or Low DP). | Property | All or Custom: Per GPU and Detailed |
gom_pending | Pending GPU operation mode, applied after the next reboot. | Property | All or Custom: Per GPU and Detailed |
clocks_throttle_reasons_supported | Bitmask of clock throttle reasons supported by this GPU. | Property | All or Custom: Per GPU and Detailed |
clocks_throttle_reasons_active | Bitmask of clock throttle reasons currently active. | Property | All or Custom: Per GPU and Detailed |
compute_mode | Compute access mode (Default, Exclusive Thread, Prohibited, or Exclusive Process). | Property | All or Custom: Per GPU and Detailed |
compute_cap | CUDA compute capability in major.minor format. | Property | All or Custom: Per GPU and Detailed |
dram_encryption_mode_current | Current DRAM encryption mode. | Property | All or Custom: Per GPU and Detailed |
dram_encryption_mode_pending | Pending DRAM encryption mode, applied after the next reboot. | Property | All or Custom: Per GPU and Detailed |
ecc_mode_current | Current ECC mode. | Property | All or Custom: Per GPU and Detailed |
ecc_mode_pending | Pending ECC mode, applied after the next reboot. | Property | All or Custom: Per GPU and Detailed |
retired_pages_pending | Whether pending page retirements require a reboot to take effect. | Property | All or Custom: Per GPU and Detailed |
remapped_rows_pending | Whether pending row remappings require a reboot to take effect. | Property | All or Custom: Per GPU and Detailed |
remapped_rows_failure | Whether a row remapping failure has been recorded. | Property | All or Custom: Per GPU and Detailed |
power_management | Whether power management is supported for this GPU. | Property | All or Custom: Per GPU and Detailed |
power_smoothing_supported | Whether delayed power smoothing is supported. | Property | All or Custom: Per GPU and Detailed |
mig_mode_current | Current MIG (Multi-Instance GPU) mode. | Property | All or Custom: Per GPU and Detailed |
mig_mode_pending | Pending MIG mode, applied after the next reboot. | Property | All or Custom: Per GPU and Detailed |
gsp_mode_current | Current GSP firmware mode. | Property | All or Custom: Per GPU and Detailed |
gsp_mode_default | Default GSP firmware mode. | Property | All or Custom: Per GPU and Detailed |
c2c_mode | Current chip-to-chip interconnect (C2C) mode. | Property | All or Custom: Per GPU and Detailed |
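fabric_state | State of the GPU's NVLink fabric registration process (for example, In Progress or Completed). | Property | All or Custom: Per GPU and Detailed |
fabric_status | Result status of the NVLink fabric registration; meaningful once registration has completed. | Property | All or Custom: Per GPU and Detailed |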
fabric_clique_id | NVLink fabric clique identifier. | Property | All or Custom: Per GPU and Detailed |
fabric_cluster_uuid | NVLink fabric cluster unique identifier. | Property | All or Custom: Per GPU and Detailed |
fabric_health_summary | Overall NVLink fabric health summary. | Property | All or Custom: Per GPU and Detailed |
fabric_health_bandwidth | NVLink fabric bandwidth health status. | Property | All or Custom: Per GPU and Detailed |
fabric_health_route_recovery_in_progress | Whether NVLink fabric route recovery is in progress. | Property | All or Custom: Per GPU and Detailed |
fabric_health_route_unhealthy | Whether any NVLink fabric routes are unhealthy. | Property | All or Custom: Per GPU and Detailed |
fabric_health_access_timeout_recovery | Whether access-timeout recovery is in progress on the NVLink fabric. | Property | All or Custom: Per GPU and Detailed |
fabric_health_incorrect_configuration | Whether the NVLink fabric has an incorrect configuration. | Property | All or Custom: Per GPU and Detailed |
fabric_health_partition_assigned | Whether the NVLink fabric partition has been assigned. | Property | All or Custom: Per GPU and Detailed |
platform_chassis_serial_number | Serial number of the chassis this GPU is installed in. | Property | All or Custom: Per GPU and Detailed |
platform_slot_number | Chassis slot number for this GPU module. | Property | All or Custom: Per GPU and Detailed |
platform_tray_index | Tray index within the chassis for this GPU module. | Property | All or Custom: Per GPU and Detailed |
platform_host_id | Host identifier in the platform topology. | Property | All or Custom: Per GPU and Detailed |
platform_peer_type | Type of the platform peer connection. | Property | All or Custom: Per GPU and Detailed |
platform_module_id | Module identifier within the platform. | Property | All or Custom: Per GPU and Detailed |
platform_gpu_fabric_guid | NVLink fabric GUID assigned to this GPU in the platform topology. | Property | All or Custom: Per GPU and Detailed |
hostname | Hostname of the system as reported by nvidia-smi. | Property | All or Custom: Per GPU and Detailed |
timestamp | Collection timestamp fetched from nvidia-smi. Used internally to maintain CSV column alignment; not emitted on any event. | Query | Basic |
count | Total number of GPUs returned by nvidia-smi. Produces the node_gpu_count gauge on the aggregated event; not emitted as a separate field. | Query | Basic |
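
The two Query fields above are consumed during collection rather than emitted. The following Python sketch illustrates one way that handling could work. It is not Cribl Edge's actual implementation. The column order in `FIELDS`, the function names, and the sample CSV text are assumptions for illustration only. The sketch assumes one CSV row per GPU, with `timestamp` and `count` as the leading columns.

```python
import csv
import io

# Hypothetical column order: the two Query fields first, then two of the
# Property fields from the table above. The real query list is not shown here.
FIELDS = ["timestamp", "count", "compute_mode", "ecc_mode_current"]


def parse_gpu_rows(csv_text, fields=FIELDS):
    """Map each CSV row (one row per GPU) onto the requested field names."""
    rows = []
    reader = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
    for values in reader:
        if not values:
            continue  # skip blank lines
        if len(values) != len(fields):
            # A column-count mismatch means values no longer line up with names.
            raise ValueError(
                f"expected {len(fields)} columns, got {len(values)}"
            )
        rows.append(dict(zip(fields, (v.strip() for v in values))))
    return rows


def summarize(rows):
    """Return (aggregated_event, per_gpu_events).

    `count` feeds the node_gpu_count gauge on the aggregated event, and
    neither `timestamp` nor `count` is copied onto the per-GPU events.
    """
    gpu_count = int(rows[0]["count"]) if rows else 0
    per_gpu = [
        {k: v for k, v in row.items() if k not in ("timestamp", "count")}
        for row in rows
    ]
    return {"node_gpu_count": gpu_count}, per_gpu


if __name__ == "__main__":
    # Made-up sample text shaped like CSV output without a header row.
    sample = (
        "2024/01/01 00:00:00.000, 2, Default, Enabled\n"
        "2024/01/01 00:00:00.000, 2, Default, Disabled\n"
    )
    aggregated, per_gpu = summarize(parse_gpu_rows(sample))
    print(aggregated)
    print(per_gpu)
```

Raising an error on a column-count mismatch is a design choice in this sketch: it prevents a value from being silently attributed to the wrong field when a row is malformed.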