---
title: "Analyze log costs with Grafana Explore | Grafana Cloud documentation"
description: "Learn how to analyze log costs with Grafana Explore."
---

# Analyze log costs with Grafana Explore

Grafana Cloud provides a managed Loki environment for storing logs. Similar to metric labels in Prometheus, Loki indexes only the log metadata using labels. This guide will help you analyze and understand log usage in Grafana Cloud.

> Note
> 
> This process works with any Grafana Loki installation, not just within Grafana Cloud.

## Before you begin

To view and manage logs, you must have the following:

- A Grafana Cloud account
- Admin or Editor user permissions for the managed Grafana Cloud instance

## Limitations

If you expect queries to return a large number of results, use a shorter time range to avoid timeout errors, because Grafana Cloud limits the size of expensive queries. For more information on time durations in queries, see [Prometheus querying basics](https://prometheus.io/docs/prometheus/latest/querying/basics/#time-durations). You can also aggregate the results along a label dimension. See [Aggregating logs usage](#aggregating-logs-usage) below.

If you would prefer to explore logs with a command line interface, see the [LogCLI documentation](/docs/loki/latest/query/logcli/).

## View log usage by log stream

In this section, we’ll query for metrics about each log stream that contains the label `job`, using a wildcard regex.

1. Log in to your instance and click the **Explore** (compass) icon in the menu sidebar.
2. Use the data sources dropdown located at the top of the page to select the data source corresponding to your Grafana Cloud Logs endpoint. The data source name should be similar to `grafanacloud-<yourstackname>-logs`.

3. Select **Instant** for the **Query Type**, and **Code** for the query editor. Then use the following LogQL queries in the query toolbar to explore usage for your environment (logs generated by Synthetic Monitoring checks are also returned):
   
   - Number of entries for each log stream over a five minute interval:
     
     ```logql
     count_over_time({job=~".+"}[5m])
     ```
     
     This count is listed in the far-right column of the results table under **Value #A**.
   - Bytes used by each log stream for the past five minutes:
     
     ```logql
     bytes_over_time({job=~".+"}[5m])
     ```
   - Count the number of entries within the last minute and return any job with greater than 100 log lines. Adjust the number of log lines as needed for more insight:
     
     ```logql
     count_over_time({job=~".+"}[1m]) > 100
     ```
   - Count of logs ingested over the past hour and specify `filename`, `host`, and `job` names if those labels exist:
     
     ```logql
     sum(count_over_time({job=~".+"}[1h])) by (filename, host, job)
     ```

> Note
> 
> The `job` label is used in these statements, but you can use other labels. If you are not sure which labels your environment might be using, click on the **Log browser** tab in **Explore** to review the available labels. If an expected label is missing, this is a good indication that these logs are not being successfully received by your Grafana Cloud environment.

4. Save your queries (optional).
   
   You can save a query in your query history to quickly access your favorites.
   
   You can also download the query results as a text file directly from Explore using Grafana Inspector. For more information, see [Grafana Inspector](/docs/grafana/latest/explore/explore-inspector/).

## Aggregating logs usage

You may want to view your logs usage grouped by some dimension, such as an app name, team name, cluster, or even log level. These dimensions don’t need to be in the label set. You may also want to track usage over a long time period. The examples below use log lines from the Grafana Cloud Synthetic Monitoring app because the queries are easy to adapt for other purposes. They rely on the `bytes_over_time` [metric query](/docs/loki/latest/query/metric_queries/), which is the most relevant to Grafana Cloud costs.

### Aggregation by label

Aggregation by label is a great option if you have labeled log streams for apps, services, or teams. A typical Synthetic Monitoring log line looks like this:

```none
level=info target=https://grafana-assets.grafana.net probe=Amsterdam region=EMEA instance=https://grafana-assets.grafana.net job=grafanaAssetsCDN check_name=http source=synthetic-monitoring-agent label_env=production msg="Check succeeded" duration_seconds=0.242576593
```

We might want to determine the bytes ingested per probe, so we can use the following query:

```logql
sum by (probe) (bytes_over_time({source="synthetic-monitoring-agent"} [1m]))
```

If a result comes up with an empty string under the `probe` column, then some of your queried log streams do not contain that label.

### Aggregation by log line content

Sometimes, you might need to aggregate something that you don’t have scraped into a label. Continuing our Synthetic Monitoring example, we don’t have the response message (`msg`) mapped into a label, since it is [potentially unbounded](/docs/loki/latest/get-started/labels/bp-labels/#label-values-must-always-be-bounded).

However, you may need to see logs ingested by message. You can do this by using `logfmt` to extract log fields for aggregation:

```logql
sum by (msg) (bytes_over_time({source="synthetic-monitoring-agent"} | logfmt [1m]))
```
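To build intuition for what `| logfmt` extracts from each line, here's a rough Python sketch of the same key-value parsing. Loki's actual parser is more sophisticated; this only illustrates how every field in the line becomes a candidate label:

```python
import shlex


def parse_logfmt(line: str) -> dict:
    """Rough logfmt parse: split on spaces (respecting quotes), then on '='.
    Loki's `| logfmt` stage does this server-side; this is an illustration."""
    fields = {}
    for token in shlex.split(line):
        if "=" in token:
            key, _, value = token.partition("=")
            fields[key] = value
    return fields


line = 'probe=Amsterdam source=synthetic-monitoring-agent msg="Check succeeded"'
fields = parse_logfmt(line)
# Every key becomes a potential label, which is why unbounded fields
# (like msg) can explode the number of temporary streams.
```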

> **What if I get time series limit errors?**

`logfmt` is a very powerful tool, but because it creates a temporary log stream for every combination of log line fields, it can easily hit the time series limits. You can use `regexp` instead to ensure you don’t increase the cardinality of results too much before aggregation.

Moving away from our Synthetic Monitoring example, consider this log line:

`logger=context traceID=12ab34cd56ef userId=x orgId=y uname=grafanauser t=2022-12-19T23:22:18.825687729Z level=info msg="Request Completed" method=POST path=/api/ds/query status=200 remote_addr=127.0.0.6 time_ms=38 duration=38.275733ms size=1369 referer="https://grafanauser.grafana.net/d/89gh01ij/super-cool-dashboard?from=now-1h&orgId=1&refresh=10s&to=now" db_call_count=1 handler=/api/ds/query`

This is a typical log line you’ll see from the Grafana backend, and in many Go web services. But if you try to use `logfmt` to insert it into a metric query, it’ll quickly create hundreds of thousands of log streams due to fields like the timestamp, duration, size, and referer.

For example, the following query might fail due to time series limits:

```logql
sum by (logger) (bytes_over_time({app="grafana"} | logfmt | __error__="" [1m]))
```

To avoid these errors, you can replace `logfmt` and add a regular expression:

```logql
sum by (logger) (bytes_over_time({app="grafana"} | regexp `logger=(?P<logger>[^ ]+)` | logger != "" [1m]))
```

Here, we take advantage of named capture groups to enrich the log labels with a single new label.
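The same named-capture syntax works in most regex engines, so you can check the pattern locally before running it in LogQL. A quick Python test of the pattern against the example line:

```python
import re

# The same named capture group the LogQL `regexp` stage uses,
# applied to the example Grafana backend log line above.
line = (
    'logger=context traceID=12ab34cd56ef userId=x orgId=y uname=grafanauser '
    'level=info msg="Request Completed" method=POST path=/api/ds/query status=200'
)

match = re.search(r"logger=(?P<logger>[^ ]+)", line)
assert match is not None
# Only one new label is produced, no matter how many other fields vary.
logger_label = match.group("logger")  # "context"
```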

> **What if I STILL get time series limit errors?**

This depends on the use case. Query time series limits are set in the Loki configuration to help manage resources. Raising them can be an option, but should be done with caution. Alternatives include:

1. Filtering out the highest-cardinality log streams. In this example, the label value `logger=context` is associated with a large number of log streams, so we can exclude it with the following selector:
   
   ```logql
   {app="grafana"} != "logger=context" | regexp `logger=(?P<logger>[^ ]+)` | logger != ""
   ```
2. Adjusting the regex to match a subset of those fields. This query selects only loggers that start with ‘A’ through ‘H’ (case insensitive):
   
   ```logql
   {app="grafana"} | regexp `logger=(?P<logger>[a-hA-H][^ ]+)` | logger != ""
   ```
3. Scrubbing one or more labels with `label_format`: This will reduce the number of time series in the result set. This scrubs the `pod` label, which often has a Kubernetes UID attached and inflates the number of time series created.
   
   ```logql
   {app="grafana"} | label_format pod=`` | regexp `logger=(?P<logger>[a-hA-H][^ ]+)` | logger != ""
   ```
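The third option works because the number of series is driven by the count of distinct label combinations, not by log volume. A small Python illustration using made-up streams:

```python
# Made-up streams illustrating why scrubbing a label shrinks the result set:
# distinct label combinations, not log volume, drive the series count.
streams = [
    {"logger": "context", "pod": "grafana-7f9c4-abc12"},
    {"logger": "context", "pod": "grafana-7f9c4-def34"},
    {"logger": "migrator", "pod": "grafana-7f9c4-abc12"},
    {"logger": "migrator", "pod": "grafana-7f9c4-def34"},
]

# Each distinct label set is one time series.
before = {tuple(sorted(s.items())) for s in streams}

# Equivalent of `label_format pod=``: blank out pod before aggregating.
scrubbed = [{**s, "pod": ""} for s in streams]
after = {tuple(sorted(s.items())) for s in scrubbed}

# Four series collapse to two -- one per logger.
```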

> **Why are you using such small time ranges?**

In short, these queries will almost always be fairly slow.

The example queries here all use a small time range, which suits two cases:

1. sudden and recent spikes in usage
2. use in recording rules for continuous tracking

Running any of the queries over a long time range will be fairly slow. Since they require the querier to process and mutate every single line, it’s just not possible to get response times down to what you can get with time-series databases such as Prometheus. See below to read about setting up recording rules.

> **What if I don’t have a label or any log line content common across all of my log lines?**

You’ll need to combine the approaches above. Suppose you’re using the following query:

```logql
sum by (logger) (bytes_over_time({app="grafana"} | regexp `logger=(?P<logger>[^ ]+)` | logger != "" [1m]))
```

You can use the following stream selector to pull logs that don’t contain a logger, and configure another query to retrieve them:

```logql
{app="grafana"} !~ `logger=[^ ]+`
```

Once you’ve crafted another query to account for the remaining logs, check your work to prevent double counting. One approach is to use the empty value from one `sum by (<something>)` query to spot-check the other.
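In practice you would compare the query results, but the idea can be sketched in Python with sample lines: the matched and unmatched selectors should partition your log volume exactly, with every byte counted by exactly one query.

```python
import re

# Sample lines standing in for two complementary stream selectors.
lines = [
    'logger=context msg="Request Completed" status=200',
    'logger=migrator msg="migration done"',
    'msg="no logger field here"',
]

pattern = re.compile(r"logger=[^ ]+")
with_logger = [l for l in lines if pattern.search(l)]        # first query's share
without_logger = [l for l in lines if not pattern.search(l)]  # second query's share

# The two shares should add up to the total -- no double counting, no gaps.
total_bytes = sum(len(l) for l in lines)
split_bytes = sum(len(l) for l in with_logger) + sum(len(l) for l in without_logger)
```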

### Recording rules for long-term tracking

You can use recording rules for very high-resolution usage attribution, usage attribution across multiple dimensions, and usage attribution over long periods of time. To create a recording rule:

1. Hover over the Grafana Alerting icon (bell).
2. Click **New Alert Rule**.
3. Use a descriptive name.
4. Select **Mimir or Loki Recording Rule**.
5. Select the Loki data source you want to perform usage attribution on.
6. Set the query.
7. If desired, set the **Namespace** and **Group**.
8. Click **Save**.
9. Return to Explore and query your hosted metrics with the alert name. You might have to wait a few moments for results to be returned.

> **What if I have a large volume of historical logs I want to attribute from before I set up the recording rules?**

If you’re using the queries in a dashboard, optimize them by running them as range queries with a specific minimum interval, then sum the results on the visualization panel using transformations. We tested this approach on up to a week of internal logs, totalling 23 PB, which took almost three minutes.
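The range-query shape looks roughly like this, sketched in Python as the parameters for Loki's `/loki/api/v1/query_range` endpoint. The step value is whatever minimum interval you choose; the query is the earlier Synthetic Monitoring example:

```python
import time
from urllib.parse import urlencode

# Range-query parameters for /loki/api/v1/query_range; a coarse `step`
# (the minimum interval) keeps the number of evaluation points small.
end = int(time.time())
start = end - 7 * 24 * 3600  # one week back
params = urlencode({
    "query": 'sum by (probe) (bytes_over_time({source="synthetic-monitoring-agent"}[1h]))',
    "start": start,
    "end": end,
    "step": "1h",  # one data point per hour
})
```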

## Related Grafana Cloud resources

- [LogQL](/docs/grafana-cloud/connect-externally-hosted/data-sources/loki/)
- [How labels in Loki can make log queries faster and easier](/blog/2020/04/21/how-labels-in-loki-can-make-log-queries-faster-and-easier/)
- [How to alert on high cardinality data with Grafana Loki](/blog/2021/05/28/how-to-alert-on-high-cardinality-data-with-grafana-loki/)
- [Getting started with Loki](/go/webinar/loki-getting-started-emea/)
- [Get more and spend less with Grafana Loki for logs](/go/grafanaconline/2021/loki/)
- [Effective troubleshooting with Grafana Loki - Query basics](/go/webinar/effective-troubleshooting-with-grafana-loki/)
- [Introduction to Loki: Like Prometheus, but for Logs](/go/webinar/intro-to-loki-like-prometheus-but-for-logs/)
- [How to use LogQL range aggregations in Loki](/blog/2021/01/11/how-to-use-logql-range-aggregations-in-loki/)
