Analyzing Prometheus metrics usage with Grafana Explore

Begin by logging in to your Grafana Cloud organization and navigating to the Cloud Portal. From there, click Log In on your Grafana instance.

From the Grafana UI, navigate to Explore in the menu sidebar:

Explore in sidebar

You’ll see the Explore interface. Using the data sources dropdown, select the data source corresponding to your Cloud Prometheus metrics endpoint. Its name will be grafanacloud-your_stack_name-prom:

Data sources dropdown

Once you’ve selected the correct data source, change the time window for the query to Last 5 minutes:

Change the query time window

If you skip this step, the query fails with an “expanding series: query must contain metric name” error, because Grafana Cloud limits the size of expensive queries.

Now that you’ve adjusted the time range for your query, enter the following PromQL query in the query toolbar:

topk(10, count by (__name__)({__name__=~".+"}))

Query toolbar

This query counts the series for each metric name and returns the 10 metrics with the highest cardinality, that is, the most series.
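To make the counting step concrete, here is a minimal Python sketch of what `topk(10, count by (__name__)(...))` computes. It is an illustration only, not how Prometheus evaluates the query: the function name `top_metrics_by_series` and the sample label sets are hypothetical, and the series are assumed to be already loaded in memory as dictionaries.

```python
from collections import Counter


def top_metrics_by_series(series, k=10):
    """Count series per metric name and return the k largest counts.

    Mirrors topk(k, count by (__name__)(...)) on in-memory label sets.
    """
    counts = Counter(labels["__name__"] for labels in series)
    return counts.most_common(k)


# Hypothetical sample data: each dict is the label set of one series.
series = [
    {"__name__": "http_requests_total", "verb": "GET", "code": "200"},
    {"__name__": "http_requests_total", "verb": "GET", "code": "500"},
    {"__name__": "http_requests_total", "verb": "POST", "code": "200"},
    {"__name__": "up", "job": "node"},
    {"__name__": "up", "job": "api"},
    {"__name__": "process_start_time_seconds", "job": "api"},
]

print(top_metrics_by_series(series, k=2))
# → [('http_requests_total', 3), ('up', 2)]
```

Each unique combination of label values is one series, so a metric's cardinality is simply the number of label sets that share its name.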

Next, change the Query Type to Instant. Your metrics’ cardinalities are likely not changing much over time, so we only need a snapshot of the current counts, not a graph of metrics and their cardinalities over time.

When you’re done, press SHIFT+ENTER or click Run Query in the top right corner of your screen. You should see a table with metrics and their corresponding cardinalities:

Query result table
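If you would rather work with the same result outside the Grafana UI, the sketch below parses the JSON that the Prometheus HTTP API returns for an instant query (`/api/v1/query`) into rows like those in the table. It assumes you have already fetched the response body from your own endpoint with your own credentials; the helper name `parse_instant_vector`, the second metric name, and its value are hypothetical.

```python
import json


def parse_instant_vector(body):
    """Turn an instant-query JSON response into (metric_name, count) rows."""
    payload = json.loads(body)
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"expected an instant vector, got {data['resultType']}")
    rows = []
    for sample in data["result"]:
        name = sample["metric"].get("__name__", "")
        _timestamp, value = sample["value"]  # the sample value arrives as a string
        rows.append((name, int(float(value))))
    # Highest cardinality first, like the table in the UI.
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Hypothetical response body in the instant-vector format.
sample_body = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"__name__": "example_metric_total"},
             "value": [1700000000.0, "1200"]},
            {"metric": {"__name__": "apiserver_request_duration_seconds_bucket"},
             "value": [1700000000.0, "8294"]},
        ],
    },
})

for name, count in parse_instant_vector(sample_body):
    print(f"{name}\t{count}")
```

Sorting the rows in Python keeps the output ordered even when you run the query without topk.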

You can adjust the 10 parameter in the PromQL query to any number or omit the topk operator entirely:

count by (__name__)({__name__=~".+"})

This returns a list of all metrics and their associated cardinalities. To learn more about these queries and PromQL, see Querying Prometheus in the official Prometheus documentation.

From here, you can query any individual high-cardinality metric to drill down into its label combinations. For example, the apiserver_request_duration_seconds_bucket metric above has 8294 different label combinations, so we can dig in by entering the metric name on its own as the query. Ensure that Query type is still set to Instant, or your query may time out:

Metrics query result

This returns a list of series for the apiserver_request_duration_seconds_bucket metric across all label values.

To count the dimensionality of a label, or the number of unique values for a given label, run the following query:

count(count by (label_name) (metric_name))

Be sure to replace label_name with the name of the label, and metric_name with the name of the metric. In the example above, this query would be:

count(count by (verb) (apiserver_request_duration_seconds_bucket))

This counts the number of unique values of the verb label (the HTTP verb) on the apiserver_request_duration_seconds_bucket metric.
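The nested count can be easier to follow as a set operation: the inner `count by (label_name)` produces one group per distinct label value, and the outer `count` counts those groups. The following Python sketch illustrates this on hypothetical in-memory data. The function name `label_dimensionality` and the sample series are assumptions made for the example.

```python
def label_dimensionality(series, metric_name, label_name):
    """Number of distinct values of label_name across a metric's series.

    Mirrors count(count by (label_name) (metric_name)). A series without
    the label is grouped under the empty value, as in PromQL. Unlike
    PromQL, which returns an empty result when nothing matches, this
    sketch returns 0.
    """
    values = {
        labels.get(label_name, "")
        for labels in series
        if labels.get("__name__") == metric_name
    }
    return len(values)


# Hypothetical sample data: four series of one metric.
metric = "apiserver_request_duration_seconds_bucket"
series = [
    {"__name__": metric, "verb": "GET", "le": "0.1"},
    {"__name__": metric, "verb": "GET", "le": "0.5"},
    {"__name__": metric, "verb": "POST", "le": "0.1"},
    {"__name__": metric, "verb": "POST", "le": "0.5"},
]

print(label_dimensionality(series, metric, "verb"))  # → 2
print(label_dimensionality(series, metric, "le"))    # → 2
```

Running this for every label of a metric gives a quick ranking of which labels contribute the most distinct values.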

By digging into label dimensionality, you can identify high-cardinality labels that you can optimize by dropping them, aggregating them away, or using recording rules. To learn more, see Reducing Prometheus metric usage.
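Before changing anything, it can help to estimate how many series a metric would have left if a given label were removed. The sketch below does this on hypothetical in-memory data; it corresponds roughly to `count(sum without (label_name) (metric_name))` in PromQL. The function name `series_count_without` and the sample series are assumptions, and the result is only an estimate of the effect, not a description of how any particular relabeling or recording rule behaves.

```python
def series_count_without(series, metric_name, dropped_label):
    """Count the distinct label sets left after removing one label."""
    remaining = {
        frozenset((k, v) for k, v in labels.items() if k != dropped_label)
        for labels in series
        if labels.get("__name__") == metric_name
    }
    return len(remaining)


# Hypothetical sample data: five series of one metric.
metric = "http_requests_total"
series = [
    {"__name__": metric, "verb": "GET", "instance": "a"},
    {"__name__": metric, "verb": "GET", "instance": "b"},
    {"__name__": metric, "verb": "GET", "instance": "c"},
    {"__name__": metric, "verb": "POST", "instance": "a"},
    {"__name__": metric, "verb": "POST", "instance": "b"},
]

print(series_count_without(series, metric, "instance"))  # → 2
print(series_count_without(series, metric, "verb"))      # → 3
```

In this example, removing instance collapses five series into two, while removing verb leaves three, so instance is the better candidate to aggregate away.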