
Important: This documentation is for an older version. It applies only to the release noted; many of the features and functions have since been updated or replaced. Please view the current version.


Observing Grafana Loki

Both Grafana Loki and Promtail expose a /metrics endpoint that exposes Prometheus metrics. You will need a local Prometheus instance, with Loki and Promtail added as scrape targets. See configuring Prometheus for more information.
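A minimal scrape configuration for both endpoints might look like the following sketch. The hostnames are assumptions; the ports reflect the common defaults of 3100 for Loki and 9080 for Promtail, but check your own deployment:

```yaml
# prometheus.yml (fragment) - add Loki and Promtail as scrape targets.
scrape_configs:
  - job_name: loki
    static_configs:
      - targets: ['loki:3100']      # hypothetical host; /metrics is the default metrics path
  - job_name: promtail
    static_configs:
      - targets: ['promtail:9080']  # hypothetical host
```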

All components of Loki expose the following metrics:

| Metric Name | Metric Type | Description |
| --- | --- | --- |
| loki_log_messages_total | Counter | DEPRECATED. Use internal_log_messages_total for the same functionality. Total number of log messages created by loki itself. |
| loki_internal_log_messages_total | Counter | Total number of log messages created by loki itself. |
| loki_request_duration_seconds | Histogram | Number of received HTTP requests. |
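Because loki_request_duration_seconds is a histogram, it can be used to derive request latency percentiles. A sketch of such a query; the route label name is an assumption based on common Loki deployments, so verify it against your own /metrics output:

```promql
# Approximate 99th-percentile request latency over the last 5 minutes, per route.
histogram_quantile(0.99, sum by (le, route) (rate(loki_request_duration_seconds_bucket[5m])))
```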

The Loki Distributors expose the following metrics:

| Metric Name | Metric Type | Description |
| --- | --- | --- |
| loki_distributor_ingester_appends_total | Counter | The total number of batch appends sent to ingesters. |
| loki_distributor_ingester_append_failures_total | Counter | The total number of failed batch appends sent to ingesters. |
| loki_distributor_bytes_received_total | Counter | The total number of uncompressed bytes received, per tenant and retention hours. |
| loki_distributor_lines_received_total | Counter | The total number of log entries received per tenant (not necessarily the number of lines, as an entry can have more than one line of text). |
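The distributor counters above translate naturally into per-tenant ingest rates. A sketch; the tenant label name is an assumption, as some Loki versions label it differently:

```promql
# Uncompressed ingest rate in bytes/s, per tenant, over the last 5 minutes.
sum by (tenant) (rate(loki_distributor_bytes_received_total[5m]))

# Log entries per second, per tenant.
sum by (tenant) (rate(loki_distributor_lines_received_total[5m]))
```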

The Loki Ingesters expose the following metrics:

| Metric Name | Metric Type | Description |
| --- | --- | --- |
| cortex_ingester_flush_queue_length | Gauge | The total number of series pending in the flush queue. |
| loki_chunk_store_index_entries_per_chunk | Histogram | Number of index entries written to storage per chunk. |
| loki_ingester_memory_chunks | Gauge | The total number of chunks in memory. |
| loki_ingester_memory_streams | Gauge | The total number of streams in memory. |
| loki_ingester_chunk_age_seconds | Histogram | Distribution of chunk ages when flushed. |
| loki_ingester_chunk_encode_time_seconds | Histogram | Distribution of chunk encode times. |
| loki_ingester_chunk_entries | Histogram | Distribution of lines per chunk when flushed. |
| loki_ingester_chunk_size_bytes | Histogram | Distribution of chunk sizes when flushed. |
| loki_ingester_chunk_utilization | Histogram | Distribution of chunk utilization (filled uncompressed bytes vs. maximum uncompressed bytes) when flushed. |
| loki_ingester_chunk_compression_ratio | Histogram | Distribution of chunk compression ratios when flushed. |
| loki_ingester_chunk_stored_bytes_total | Counter | Total bytes stored in chunks per tenant. |
| loki_ingester_chunks_created_total | Counter | The total number of chunks created in the ingester. |
| loki_ingester_chunks_stored_total | Counter | Total stored chunks per tenant. |
| loki_ingester_received_chunks | Counter | The total number of chunks received by this ingester whilst joining during the handoff process. |
| loki_ingester_samples_per_chunk | Histogram | The number of samples in a chunk. |
| loki_ingester_sent_chunks | Counter | The total number of chunks sent by this ingester whilst leaving during the handoff process. |
| loki_ingester_streams_created_total | Counter | The total number of streams created per tenant. |
| loki_ingester_streams_removed_total | Counter | The total number of streams removed per tenant. |
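Histograms such as loki_ingester_chunk_utilization can show how full chunks are at flush time, which helps when tuning chunk settings. A sketch of one such query:

```promql
# Approximate median chunk utilization at flush time over the last hour;
# values well below 1 suggest chunks are being flushed before filling up.
histogram_quantile(0.5, sum by (le) (rate(loki_ingester_chunk_utilization_bucket[1h])))
```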

The Loki compactor exposes the following metrics:

| Metric Name | Metric Type | Description |
| --- | --- | --- |
| loki_compactor_delete_requests_processed_total | Counter | Number of delete requests processed per user. |
| loki_compactor_delete_requests_chunks_selected_total | Counter | Number of chunks selected while building delete plans, per user. |
| loki_compactor_delete_processing_fails_total | Counter | Number of times the delete phase of compaction has failed. |
| loki_compactor_load_pending_requests_attempts_total | Counter | Number of attempts that were made to load pending requests, with status. |
| loki_compactor_oldest_pending_delete_request_age_seconds | Gauge | Age in seconds of the oldest delete request that is past its cancellation period. |
| loki_compactor_pending_delete_requests_count | Gauge | Count of delete requests which are past their cancellation period and have not finished processing yet. |
| loki_compactor_deleted_lines | Counter | Number of deleted lines per user. |
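The compactor gauges above lend themselves to alerting on stuck delete requests. A hypothetical Prometheus alerting rule; the alert name and the 24-hour threshold are illustrative, not official defaults:

```yaml
# Hypothetical alerting rule: fire if the oldest pending delete request
# has been waiting more than a day past its cancellation period.
groups:
  - name: loki-compactor
    rules:
      - alert: LokiOldPendingDeleteRequest
        expr: loki_compactor_oldest_pending_delete_request_age_seconds > 86400
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "Loki compactor has a delete request pending for over 24h."
```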

Promtail exposes these metrics:

| Metric Name | Metric Type | Description |
| --- | --- | --- |
| promtail_read_bytes_total | Gauge | Number of bytes read. |
| promtail_read_lines_total | Counter | Number of lines read. |
| promtail_dropped_bytes_total | Counter | Number of bytes dropped because they failed to be sent to the ingester after all retries. |
| promtail_dropped_entries_total | Counter | Number of log entries dropped because they failed to be sent to the ingester after all retries. |
| promtail_encoded_bytes_total | Counter | Number of bytes encoded and ready to send. |
| promtail_file_bytes_total | Gauge | Number of bytes read from files. |
| promtail_files_active_total | Gauge | Number of active files. |
| promtail_request_duration_seconds | Histogram | Number of send requests. |
| promtail_sent_bytes_total | Counter | Number of bytes sent. |
| promtail_sent_entries_total | Counter | Number of log entries sent to the ingester. |
| promtail_targets_active_total | Gauge | Number of total active targets. |
| promtail_targets_failed_total | Counter | Number of total failed targets. |
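The dropped counters above are a natural basis for a data-loss check. A sketch:

```promql
# Rate of log entries Promtail dropped after exhausting all retries;
# a value persistently above zero indicates data loss.
sum(rate(promtail_dropped_entries_total[5m]))
```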

Most of these metrics are counters and should continuously increase during normal operations:

  1. Your app emits a log line to a file that is tracked by Promtail.
  2. Promtail reads the new line and increases its counters.
  3. Promtail forwards the log line to a Loki distributor, where the received counters should increase.
  4. The Loki distributor forwards the log line to a Loki ingester, where the request duration counter should increase.
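One way to verify the pipeline above end to end is to compare what Promtail reports sending with what the distributors report receiving. A rough sketch; the two series carry different label sets, so aggregate both sides before comparing, and expect small transient differences from batching and retries:

```promql
# Entries/s leaving all Promtail instances...
sum(rate(promtail_sent_entries_total[5m]))

# ...should roughly match entries/s arriving at the distributors.
sum(rate(loki_distributor_lines_received_total[5m]))
```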

If Promtail uses any pipelines with metrics stages, those metrics will also be exposed by Promtail at its /metrics endpoint. See Promtail’s documentation on Pipelines for more information.

An example Grafana dashboard was built by the community and is available as dashboard 10004.

Metrics cardinality

Some of the Loki observability metrics are emitted per tracked file (active), with the file path included in labels. This increases the quantity of label values across the environment, thereby increasing cardinality. Best practices with Prometheus labels discourage increasing cardinality in this way. Review your emitted metrics before scraping with Prometheus, and configure the scraping to avoid this issue.
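For example, a Prometheus scrape job can drop the per-file series before ingestion using metric relabeling. A sketch; the target address is hypothetical, and which per-file metrics your Promtail version emits should be confirmed against its /metrics output:

```yaml
scrape_configs:
  - job_name: promtail
    static_configs:
      - targets: ['promtail:9080']  # hypothetical address
    metric_relabel_configs:
      # Drop per-file metrics, which carry the file path as a label value
      # and can drive up cardinality on hosts with many tracked files.
      - source_labels: [__name__]
        regex: promtail_file_bytes_total|promtail_read_bytes_total
        action: drop
```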

Mixins

The Loki repository has a mixin that includes a set of dashboards, recording rules, and alerts. Together, the mixin gives you a comprehensive package for monitoring Loki in production.

For more information about mixins, take a look at the docs for the monitoring-mixins project.