Application Observability required metrics and labels

Application Observability uses metrics and labels to present data. This guide lists the metrics and labels it requires and describes what each one is used for.

Metric names differ depending on whether span and service graph metrics are generated by OTEL connectors in Grafana Alloy or the OTEL Collector, or by Tempo. Label names are the same for both options.
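To illustrate how names vary by generator, the span latency metric names listed in the span metrics table of this guide can be collected into a small lookup. This is a minimal sketch: the dictionary keys and the function name `span_latency_series` are hypothetical, and only the metric names themselves come from this guide.

```python
# Base names of the span latency histogram per generator, copied from the
# span metrics table in this guide. The dictionary keys are hypothetical.
SPAN_LATENCY_BASE = {
    # Tempo, Beyla
    "tempo_or_beyla": "traces_spanmetrics_latency",
    # OTEL Collector >= v0.109, Grafana Alloy >= v1.5.0
    "otel_current": "traces_span_metrics_duration_seconds",
    # OTEL Collector v0.94 to v0.108, Grafana Alloy v1.0 to v1.4.3, Grafana Agent >= v0.40
    "otel_legacy": "duration_seconds",
}


def span_latency_series(source: str, native: bool = False) -> list:
    """Return the series names to expect for one generator.

    A native histogram is a single series name; a classic histogram is
    exposed as the _count, _sum, and _bucket series.
    """
    base = SPAN_LATENCY_BASE[source]
    if native:
        return [base]
    return [f"{base}_{suffix}" for suffix in ("count", "sum", "bucket")]


print(span_latency_series("otel_current"))
print(span_latency_series("tempo_or_beyla", native=True))
```

A lookup like this can help when checking which of the alternative names a given stack actually emits.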

Application Observability uses the following common labels:

  • job: Identifies services. The value is service.namespace and service.name joined by a slash (service.namespace/service.name), or just service.name when the service.namespace attribute is not present.
  • deployment_environment: Lets you filter by environment, for example, “prod”, “dev”, and “ops”. You can change the exact name of this label via configuration. We strongly recommend that you use this label on all metrics. It is required for the baselines feature to work.
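The rule for the job label can be sketched in a few lines of Python. This is an illustration only: the function names `job_label` and `split_job` are hypothetical and not part of any Grafana or OpenTelemetry API, and splitting on the first slash assumes that the namespace itself contains no slash.

```python
from typing import Optional, Tuple


def job_label(service_name: str, service_namespace: Optional[str] = None) -> str:
    """Build the job value from the service.name and service.namespace attributes."""
    # A missing or empty namespace is treated as "not present" (assumption).
    if service_namespace:
        return f"{service_namespace}/{service_name}"
    return service_name


def split_job(job: str) -> Tuple[Optional[str], str]:
    """Split a job value back into (namespace, name).

    Assumes the namespace contains no slash, so the first slash is the separator.
    """
    namespace, sep, name = job.partition("/")
    if not sep:
        return None, job
    return namespace, name


print(job_label("checkout", "shop"))  # shop/checkout
print(job_label("checkout"))          # checkout
print(split_job("shop/checkout"))     # ('shop', 'checkout')
```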

Span metrics and service graph metrics

Application Observability generates metrics from traces or through Beyla. Regardless of origin, these metrics drive all of the main views in the user interface.

Note

You can use any additional labels not mentioned here for filtering or grouping metrics in the user interface.

Metric name:

  • target_info (Alloy, OTEL Collector)
  • traces_target_info (Tempo, Beyla)

Mandatory: yes

Description: This metric stores resource attributes. Service inventory and service metadata are derived from it.

Mandatory labels:

  • job

Recommended labels:

  • telemetry_sdk_*: Determines service technology and SDK metadata
  • deployment_environment: Allows filtering by environment
  • k8s_*: Determines Kubernetes metadata
  • emb_*: Determines Embrace metadata
  • gf_feo11y_*: Determines frontend observability metadata
  • cloud_*: Determines cloud metadata

Metric name (one of the following):

  • traces_spanmetrics_latency_count, traces_spanmetrics_latency_sum, traces_spanmetrics_latency_bucket (classic histogram for Tempo, Beyla)
  • traces_spanmetrics_latency (native histogram for Tempo, Beyla)
  • traces_span_metrics_duration_seconds_count, traces_span_metrics_duration_seconds_sum, traces_span_metrics_duration_seconds_bucket (classic histogram for OTEL Collector >= v0.109, Grafana Alloy >= v1.5.0)
  • traces_span_metrics_duration_seconds (native histogram for OTEL Collector >= v0.109, Grafana Alloy >= v1.5.0)
  • duration_seconds_count, duration_seconds_sum, duration_seconds_bucket (classic histogram for OTEL Collector v0.94 to v0.108, Grafana Alloy v1.0 to v1.4.3, Grafana Agent >= v0.40)
  • duration_seconds (native histogram for OTEL Collector v0.94 to v0.108, Grafana Alloy v1.0 to v1.4.3, Grafana Agent >= v0.40)

Mandatory: yes

Description: These metrics power RED metric panels and baselines. They are necessary for Application Observability.

Mandatory labels:

  • job
  • span_kind: Distinguishes incoming from outgoing requests
  • status_code: Determines if a request was successful or not; used for the errors panel
  • le: Defines the upper bound of a histogram bucket

Recommended labels:

  • deployment_environment: Allows filtering by environment
  • span_name: Defines the operation name, for example, an HTTP endpoint or RPC function. Per-operation breakdowns won’t work without this label.

Metric name:

  • traces_service_graph_request_total
  • traces_service_graph_request_failed_total

Mandatory: no

Description: These metrics indicate service graph request totals. Service maps and inbound/outbound panels won’t work without them. Application Observability also uses them to derive uninstrumented services. You can disable service graph generation to reduce the number of metric series.

Mandatory labels:

  • client
  • client_service_namespace
  • server
  • server_service_namespace

Service graph metrics don’t have job labels. Application Observability parses the job label from other metrics to derive the service namespace and name, which it then matches against the client name and namespace, or the server name and namespace, when querying service graph metrics.

Recommended labels:

  • client_deployment_environment: Allows filtering by environment
  • server_deployment_environment: Allows filtering by environment
  • connection_type: Determines if a service or a database is instrumented or not

Metric name (one of the following):

  • traces_service_graph_request_client_seconds_bucket, traces_service_graph_request_client_seconds_count, traces_service_graph_request_client_seconds_sum, traces_service_graph_request_server_seconds_bucket, traces_service_graph_request_server_seconds_count, traces_service_graph_request_server_seconds_sum (classic histogram)
  • traces_service_graph_request_client_seconds, traces_service_graph_request_server_seconds (native histogram)

Mandatory: no

Description: These metrics determine service graph request latency histograms. Service maps and inbound/outbound panels won’t work without them. Application Observability also uses them to derive uninstrumented services. You can disable service graph generation to reduce the number of metric series.

Mandatory labels:

  • client
  • client_service_namespace
  • server
  • server_service_namespace
  • le: Defines the upper bound of a histogram bucket

Service graph metrics don’t have job labels. Application Observability parses the job label from other metrics to derive the service namespace and name, which it then matches against the client name and namespace, or the server name and namespace, when querying service graph metrics.

Recommended labels:

  • client_deployment_environment: Allows filtering by environment
  • server_deployment_environment: Allows filtering by environment
  • connection_type: Determines if a service or a database is instrumented or not
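Because service graph metrics carry no job label, a job value has to be translated into client or server matchers before querying them. The following Python sketch shows one way to build such a PromQL selector. It is not the query that Application Observability runs: the helper names are hypothetical, and matching an absent namespace with an empty-string matcher is an assumption that relies on PromQL treating `label=""` as "label not set".

```python
from typing import Optional, Tuple


def split_job(job: str) -> Tuple[Optional[str], str]:
    """Split a job value into (namespace, name); assumes no slash in the namespace."""
    namespace, sep, name = job.partition("/")
    return (namespace, name) if sep else (None, job)


def quote(value: str) -> str:
    """Quote a label value for PromQL, escaping backslashes and double quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def service_graph_selector(
    job: str,
    side: str = "server",
    metric: str = "traces_service_graph_request_total",
) -> str:
    """Build a selector matching a service on the client or server side."""
    if side not in ("client", "server"):
        raise ValueError("side must be 'client' or 'server'")
    namespace, name = split_job(job)
    matchers = [
        side + "=" + quote(name),
        # An empty matcher also matches series without the label (assumption
        # about how a missing namespace should be handled).
        side + "_service_namespace=" + quote(namespace or ""),
    ]
    return metric + "{" + ", ".join(matchers) + "}"


print(service_graph_selector("shop/checkout"))
print(service_graph_selector("checkout", side="client"))
```

Passing `side="server"` selects requests received by the service (inbound), and `side="client"` selects requests it sends (outbound).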

Host info metrics

Application Observability requires the host info metric to calculate the number of host hours for billing. The metric should have one series for each host that sends Application Observability telemetry. You lose access to Application Observability if no host info metric series have been present during the last 30 days.

Metric name: traces_host_info

Mandatory: yes

Required labels:

  • grafana_host_id: Unique identifier of a host, for example, a k8s node
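Since the metric is expected to have one series per host, the number of reporting hosts equals the number of distinct grafana_host_id values. The Python sketch below counts them from a list of label sets. The input shape (one dictionary of labels per series, with the metric name under `__name__`) and the function name `distinct_hosts` are assumptions for illustration, and this is not how Grafana calculates billing.

```python
from typing import Dict, Iterable, Set


def distinct_hosts(series: Iterable[Dict[str, str]]) -> Set[str]:
    """Collect the distinct grafana_host_id values of traces_host_info series."""
    hosts: Set[str] = set()
    for labels in series:
        if labels.get("__name__") != "traces_host_info":
            continue  # ignore other metrics
        host_id = labels.get("grafana_host_id")
        if host_id:  # skip series that lack the required label
            hosts.add(host_id)
    return hosts


sample = [
    {"__name__": "traces_host_info", "grafana_host_id": "node-a"},
    {"__name__": "traces_host_info", "grafana_host_id": "node-a"},  # same host twice
    {"__name__": "traces_host_info", "grafana_host_id": "node-b"},
    {"__name__": "traces_host_info"},  # missing required label
]
print(sorted(distinct_hosts(sample)))  # ['node-a', 'node-b']
```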

Runtime metrics

Application Observability uses runtime metrics to drive runtime dashboards for JVM, .NET, and Golang.

Note

You can use any additional labels not mentioned here for filtering metrics in the user interface.

Runtime: JVM

Metric names:

  • jvm_class_count
  • process_runtime_jvm_classes_current_loaded
  • jvm_classes_loaded
  • jvm_memory_used
  • jvm_memory_used_bytes
  • jvm_memory_limit
  • jvm_memory_limit_bytes
  • process_runtime_jvm_memory_usage
  • process_runtime_jvm_memory_usage_bytes
  • process_runtime_jvm_memory_limit
  • process_runtime_jvm_memory_limit_bytes
  • jvm_memory_max
  • jvm_memory_max_bytes
  • jvm_gc_duration_sum
  • jvm_gc_duration_seconds_sum
  • process_runtime_jvm_gc_duration_sum
  • process_runtime_jvm_gc_duration_seconds_sum
  • jvm_gc_pause_sum
  • jvm_gc_pause_seconds_sum
  • jvm_gc_pause_milliseconds_sum
  • jvm_cpu_recent_utilization
  • jvm_cpu_recent_utilization_ratio
  • process_runtime_jvm_system_cpu_utilization
  • process_runtime_jvm_system_cpu_utilization_ratio
  • system_cpu_usage
  • jvm_thread_count
  • process_runtime_jvm_threads_count
  • jvm_threads_live

Required labels:

  • job

Recommended labels:

  • instance: Correlates CPU/memory usage to a particular instance

Runtime: Golang

Metric names:

  • process_runtime_go_mem_live_objects
  • process_runtime_go_mem_heap_sys
  • process_runtime_go_mem_heap_alloc
  • process_runtime_go_mem_heap_alloc_bytes
  • process_runtime_go_mem_heap_idle
  • process_runtime_go_mem_heap_idle_bytes
  • process_runtime_go_mem_heap_inuse
  • process_runtime_go_mem_heap_inuse_bytes
  • process_runtime_go_mem_heap_released
  • process_runtime_go_mem_heap_released_bytes
  • process_runtime_go_mem_lookups
  • process_runtime_go_mem_lookups_total
  • process_runtime_go_mem_heap_objects
  • process_runtime_go_goroutines
  • process_runtime_go_gc_count
  • process_runtime_go_gc_count_total
  • process_runtime_go_cgo_calls

Required labels:

  • job

Recommended labels:

  • instance: Correlates CPU/memory usage to a particular instance

Runtime: .NET

Metric names:

  • process_runtime_dotnet_gc_objects_size
  • process_runtime_dotnet_gc_objects_size_bytes
  • process_threads
  • process_thread_count
  • process_cpu_time
  • process_cpu_time_seconds_total
  • process_memory_usage
  • process_memory_usage_bytes

Required labels:

  • job

Recommended labels:

  • instance: Correlates CPU/memory usage to a particular instance