Self-hosted Grafana Mimir integration for Grafana Cloud

The Grafana Mimir integration allows you to send self-monitoring metrics and logs from Grafana Mimir or GEM running in your Kubernetes cluster to Grafana Cloud.

This integration comes with pre-built dashboards to help monitor the health of your Mimir or GEM cluster and understand per-tenant usage and behavior.

Note: Pre-configured alerts that signal when your cluster is unhealthy or nearing capacity limits are coming soon.

Install Self-hosted Grafana Mimir integration for Grafana Cloud

  1. In your Grafana Cloud instance, click Integrations and Connections (lightning bolt icon).
  2. Navigate to the Self-hosted Grafana Mimir tile and review the prerequisites. Then click Install integration.
  3. After the integration is installed, follow the steps on the Configuration Details page to set up Grafana Agent and start sending Self-hosted Grafana Mimir metrics to your Grafana Cloud instance.

Post-install configuration for the Self-hosted Grafana Mimir integration

This integration is configured to work with the mimir-distributed Helm chart v2.x.x deployed on Kubernetes.

Starting from version 3.0.0, the Helm chart sends metrics to a Prometheus-compatible server and logs to a Loki cluster. Grafana Agent is automatically installed and configured by the Helm chart.

Be sure to modify the chart configuration to provide your Grafana Agent credentials.

For more information on setting up metrics and logs collection, refer to Self-hosted Grafana Mimir integration for Grafana Cloud.
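
As a rough illustration of the credential step described above, the following Helm values fragment sketches how the chart's meta-monitoring section might point Grafana Agent at Grafana Cloud. The key names follow the metaMonitoring section of the mimir-distributed chart as best understood here and should be checked against the values.yaml of your chart version. The endpoint URLs, the numeric usernames, and the secret name grafana-cloud-credentials are placeholders, not values from this page.

```yaml
# Sketch only: verify key names against your chart version's values.yaml.
metaMonitoring:
  serviceMonitor:
    enabled: true
  grafanaAgent:
    enabled: true
    installOperator: true
    metrics:
      remote:
        # Placeholder: your Grafana Cloud Prometheus remote-write endpoint.
        url: "https://<your-prometheus-endpoint>/api/prom/push"
        auth:
          # Placeholder: your Grafana Cloud metrics instance ID.
          username: "<metrics-instance-id>"
          # Hypothetical Kubernetes Secret holding a Grafana Cloud API key.
          passwordSecretName: "grafana-cloud-credentials"
          passwordSecretKey: "password"
    logs:
      remote:
        # Placeholder: your Grafana Cloud Loki push endpoint.
        url: "https://<your-loki-endpoint>/loki/api/v1/push"
        auth:
          # Placeholder: your Grafana Cloud logs instance ID.
          username: "<logs-instance-id>"
          passwordSecretName: "grafana-cloud-credentials"
          passwordSecretKey: "password"
```

Keeping the API key in a Kubernetes Secret, rather than inline in the values file, avoids committing credentials to version control.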

Dashboards

The Self-hosted Grafana Mimir integration installs the following dashboards in your Grafana Cloud instance to help monitor your metrics.

  • Mimir / Alertmanager
  • Mimir / Alertmanager resources
  • Mimir / Compactor
  • Mimir / Compactor resources
  • Mimir / Config
  • Mimir / Object Store
  • Mimir / Overrides
  • Mimir / Overview
  • Mimir / Queries
  • Mimir / Reads
  • Mimir / Reads networking
  • Mimir / Reads resources
  • Mimir / Remote ruler reads
  • Mimir / Remote ruler reads resources
  • Mimir / Rollout progress
  • Mimir / Ruler
  • Mimir / Scaling
  • Mimir / Slow queries
  • Mimir / Tenants
  • Mimir / Top tenants
  • Mimir / Writes
  • Mimir / Writes networking
  • Mimir / Writes resources

[Image: Tenants dashboard]

[Image: Writes resources dashboard]

Metrics

The following metrics are automatically written to your Grafana Cloud instance through this integration:

  • cluster_job:cortex_alertmanager_alerts_invalid_total:rate5m
  • cluster_job:cortex_alertmanager_alerts_received_total:rate5m
  • cluster_job:cortex_alertmanager_partial_state_merges_failed_total:rate5m
  • cluster_job:cortex_alertmanager_partial_state_merges_total:rate5m
  • cluster_job:cortex_alertmanager_state_replication_failed_total:rate5m
  • cluster_job:cortex_alertmanager_state_replication_total:rate5m
  • cluster_job:cortex_ingester_queried_exemplars_bucket:sum_rate
  • cluster_job:cortex_ingester_queried_exemplars_count:sum_rate
  • cluster_job:cortex_ingester_queried_exemplars_sum:sum_rate
  • cluster_job:cortex_ingester_queried_samples_bucket:sum_rate
  • cluster_job:cortex_ingester_queried_samples_count:sum_rate
  • cluster_job:cortex_ingester_queried_samples_sum:sum_rate
  • cluster_job:cortex_ingester_queried_series_bucket:sum_rate
  • cluster_job:cortex_ingester_queried_series_count:sum_rate
  • cluster_job:cortex_ingester_queried_series_sum:sum_rate
  • cluster_job_integration:cortex_alertmanager_notifications_failed_total:rate5m
  • cluster_job_integration:cortex_alertmanager_notifications_total:rate5m
  • cluster_job_pod:cortex_alertmanager_alerts:sum
  • cluster_job_pod:cortex_alertmanager_silences:sum
  • cluster_job_route:cortex_querier_request_duration_seconds_bucket:sum_rate
  • cluster_job_route:cortex_querier_request_duration_seconds_count:sum_rate
  • cluster_job_route:cortex_querier_request_duration_seconds_sum:sum_rate
  • cluster_job_route:cortex_request_duration_seconds_bucket:sum_rate
  • cluster_job_route:cortex_request_duration_seconds_count:sum_rate
  • cluster_job_route:cortex_request_duration_seconds_sum:sum_rate
  • cluster_namespace_deployment:actual_replicas:count
  • cluster_namespace_deployment_reason:required_replicas:count
  • cluster_namespace_job:cortex_distributor_exemplars_in:rate5m
  • cluster_namespace_job:cortex_distributor_received_exemplars:rate5m
  • cluster_namespace_job:cortex_distributor_received_samples:rate5m
  • cluster_namespace_job:cortex_ingester_ingested_exemplars:rate5m
  • cluster_namespace_job:cortex_ingester_tsdb_exemplar_exemplars_appended:rate5m
  • container_cpu_usage_seconds_total
  • container_fs_writes_bytes_total
  • container_memory_rss
  • container_memory_usage_bytes
  • container_memory_working_set_bytes
  • container_network_receive_bytes_total
  • container_network_transmit_bytes_total
  • container_spec_cpu_period
  • container_spec_cpu_quota
  • container_spec_memory_limit_bytes
  • cortex_alertmanager_alerts
  • cortex_alertmanager_alerts_invalid_total
  • cortex_alertmanager_alerts_received_total
  • cortex_alertmanager_notification_latency_seconds_bucket
  • cortex_alertmanager_notification_latency_seconds_count
  • cortex_alertmanager_notification_latency_seconds_sum
  • cortex_alertmanager_notifications_failed_total
  • cortex_alertmanager_notifications_total
  • cortex_alertmanager_partial_state_merges_failed_total
  • cortex_alertmanager_partial_state_merges_total
  • cortex_alertmanager_ring_check_errors_total
  • cortex_alertmanager_silences
  • cortex_alertmanager_state_fetch_replica_state_failed_total
  • cortex_alertmanager_state_fetch_replica_state_total
  • cortex_alertmanager_state_initial_sync_completed_total
  • cortex_alertmanager_state_initial_sync_duration_seconds_bucket
  • cortex_alertmanager_state_initial_sync_duration_seconds_count
  • cortex_alertmanager_state_initial_sync_duration_seconds_sum
  • cortex_alertmanager_state_persist_failed_total
  • cortex_alertmanager_state_persist_total
  • cortex_alertmanager_state_replication_failed_total
  • cortex_alertmanager_state_replication_total
  • cortex_alertmanager_sync_configs_failed_total
  • cortex_alertmanager_sync_configs_total
  • cortex_alertmanager_tenants_discovered
  • cortex_alertmanager_tenants_owned
  • cortex_bucket_blocks_count
  • cortex_bucket_index_load_duration_seconds_bucket
  • cortex_bucket_index_load_duration_seconds_count
  • cortex_bucket_index_load_duration_seconds_sum
  • cortex_bucket_index_load_failures_total
  • cortex_bucket_index_loaded
  • cortex_bucket_index_loads_total
  • cortex_bucket_store_block_drop_failures_total
  • cortex_bucket_store_block_drops_total
  • cortex_bucket_store_block_load_failures_total
  • cortex_bucket_store_block_loads_total
  • cortex_bucket_store_blocks_loaded
  • cortex_bucket_store_indexheader_lazy_load_duration_seconds_bucket
  • cortex_bucket_store_indexheader_lazy_load_duration_seconds_count
  • cortex_bucket_store_indexheader_lazy_load_duration_seconds_sum
  • cortex_bucket_store_indexheader_lazy_load_total
  • cortex_bucket_store_indexheader_lazy_unload_total
  • cortex_bucket_store_series_blocks_queried_sum
  • cortex_bucket_store_series_data_fetched_sum
  • cortex_bucket_store_series_data_touched_sum
  • cortex_bucket_store_series_get_all_duration_seconds_bucket
  • cortex_bucket_store_series_get_all_duration_seconds_count
  • cortex_bucket_store_series_get_all_duration_seconds_sum
  • cortex_bucket_store_series_hash_cache_hits_total
  • cortex_bucket_store_series_hash_cache_requests_total
  • cortex_bucket_store_series_merge_duration_seconds_bucket
  • cortex_bucket_store_series_merge_duration_seconds_count
  • cortex_bucket_store_series_merge_duration_seconds_sum
  • cortex_bucket_store_series_result_series_count
  • cortex_bucket_store_series_result_series_sum
  • cortex_bucket_stores_gate_queries_in_flight
  • cortex_build_info
  • cortex_cache_fetched_keys
  • cortex_cache_hits
  • cortex_cache_memory_hits_total
  • cortex_cache_memory_requests_total
  • cortex_cache_request_duration_seconds_bucket
  • cortex_cache_request_duration_seconds_count
  • cortex_cache_request_duration_seconds_sum
  • cortex_compactor_block_cleanup_failures_total
  • cortex_compactor_blocks_cleaned_total
  • cortex_compactor_blocks_marked_for_deletion_total
  • cortex_compactor_last_successful_run_timestamp_seconds
  • cortex_compactor_meta_sync_duration_seconds_bucket
  • cortex_compactor_meta_sync_duration_seconds_count
  • cortex_compactor_meta_sync_duration_seconds_sum
  • cortex_compactor_meta_sync_failures_total
  • cortex_compactor_meta_syncs_total
  • cortex_compactor_runs_completed_total
  • cortex_compactor_runs_failed_total
  • cortex_compactor_runs_started_total
  • cortex_compactor_tenants_discovered
  • cortex_compactor_tenants_processing_failed
  • cortex_compactor_tenants_processing_succeeded
  • cortex_compactor_tenants_skipped
  • cortex_config_hash
  • cortex_discarded_exemplars_total
  • cortex_discarded_requests_total
  • cortex_discarded_samples_total
  • cortex_distributor_deduped_samples_total
  • cortex_distributor_exemplars_in_total
  • cortex_distributor_latest_seen_sample_timestamp_seconds
  • cortex_distributor_non_ha_samples_received_total
  • cortex_distributor_received_exemplars_total
  • cortex_distributor_received_requests_total
  • cortex_distributor_received_samples_total
  • cortex_distributor_replication_factor
  • cortex_distributor_requests_in_total
  • cortex_distributor_samples_in_total
  • cortex_frontend_query_range_duration_seconds_count
  • cortex_frontend_query_result_cache_attempted_total
  • cortex_frontend_query_result_cache_skipped_total
  • cortex_frontend_query_sharding_rewrites_attempted_total
  • cortex_frontend_query_sharding_rewrites_succeeded_total
  • cortex_frontend_sharded_queries_per_query_bucket
  • cortex_frontend_sharded_queries_per_query_count
  • cortex_frontend_sharded_queries_per_query_sum
  • cortex_frontend_split_queries_total
  • cortex_inflight_requests
  • cortex_ingester_active_series
  • cortex_ingester_active_series_custom_tracker
  • cortex_ingester_client_request_duration_seconds_bucket
  • cortex_ingester_client_request_duration_seconds_count
  • cortex_ingester_client_request_duration_seconds_sum
  • cortex_ingester_ingested_exemplars_total
  • cortex_ingester_ingested_samples_total
  • cortex_ingester_memory_series
  • cortex_ingester_memory_series_created_total
  • cortex_ingester_memory_series_removed_total
  • cortex_ingester_queried_exemplars_bucket
  • cortex_ingester_queried_exemplars_count
  • cortex_ingester_queried_exemplars_sum
  • cortex_ingester_queried_samples_bucket
  • cortex_ingester_queried_samples_count
  • cortex_ingester_queried_samples_sum
  • cortex_ingester_queried_series_bucket
  • cortex_ingester_queried_series_count
  • cortex_ingester_queried_series_sum
  • cortex_ingester_shipper_upload_failures_total
  • cortex_ingester_shipper_uploads_total
  • cortex_ingester_tsdb_checkpoint_creations_failed_total
  • cortex_ingester_tsdb_checkpoint_creations_total
  • cortex_ingester_tsdb_compaction_duration_seconds_bucket
  • cortex_ingester_tsdb_compaction_duration_seconds_count
  • cortex_ingester_tsdb_compaction_duration_seconds_sum
  • cortex_ingester_tsdb_compactions_failed_total
  • cortex_ingester_tsdb_compactions_total
  • cortex_ingester_tsdb_exemplar_exemplars_appended_total
  • cortex_ingester_tsdb_exemplar_exemplars_in_storage
  • cortex_ingester_tsdb_exemplar_last_exemplars_timestamp_seconds
  • cortex_ingester_tsdb_exemplar_series_with_exemplars_in_storage
  • cortex_ingester_tsdb_mmap_chunk_corruptions_total
  • cortex_ingester_tsdb_storage_blocks_bytes
  • cortex_ingester_tsdb_symbol_table_size_bytes
  • cortex_ingester_tsdb_wal_corruptions_total
  • cortex_ingester_tsdb_wal_truncate_duration_seconds_count
  • cortex_ingester_tsdb_wal_truncate_duration_seconds_sum
  • cortex_ingester_tsdb_wal_truncations_failed_total
  • cortex_ingester_tsdb_wal_truncations_total
  • cortex_kv_request_duration_seconds_bucket
  • cortex_kv_request_duration_seconds_count
  • cortex_kv_request_duration_seconds_sum
  • cortex_limits_defaults
  • cortex_limits_overrides
  • cortex_memcache_request_duration_seconds_bucket
  • cortex_memcache_request_duration_seconds_count
  • cortex_memcache_request_duration_seconds_sum
  • cortex_prometheus_notifications_dropped_total
  • cortex_prometheus_notifications_errors_total
  • cortex_prometheus_notifications_queue_capacity
  • cortex_prometheus_notifications_queue_length
  • cortex_prometheus_notifications_sent_total
  • cortex_prometheus_rule_evaluation_duration_seconds_count
  • cortex_prometheus_rule_evaluation_duration_seconds_sum
  • cortex_prometheus_rule_evaluation_failures_total
  • cortex_prometheus_rule_evaluations_total
  • cortex_prometheus_rule_group_duration_seconds_count
  • cortex_prometheus_rule_group_duration_seconds_sum
  • cortex_prometheus_rule_group_iterations_missed_total
  • cortex_prometheus_rule_group_rules
  • cortex_querier_blocks_consistency_checks_failed_total
  • cortex_querier_blocks_consistency_checks_total
  • cortex_querier_request_duration_seconds_bucket
  • cortex_querier_request_duration_seconds_count
  • cortex_querier_request_duration_seconds_sum
  • cortex_querier_storegateway_instances_hit_per_query_bucket
  • cortex_querier_storegateway_instances_hit_per_query_count
  • cortex_querier_storegateway_instances_hit_per_query_sum
  • cortex_querier_storegateway_refetches_per_query_bucket
  • cortex_querier_storegateway_refetches_per_query_count
  • cortex_querier_storegateway_refetches_per_query_sum
  • cortex_query_frontend_queue_duration_seconds_bucket
  • cortex_query_frontend_queue_duration_seconds_count
  • cortex_query_frontend_queue_duration_seconds_sum
  • cortex_query_frontend_queue_length
  • cortex_query_frontend_retries_bucket
  • cortex_query_frontend_retries_count
  • cortex_query_frontend_retries_sum
  • cortex_query_scheduler_queue_duration_seconds_bucket
  • cortex_query_scheduler_queue_duration_seconds_count
  • cortex_query_scheduler_queue_duration_seconds_sum
  • cortex_query_scheduler_queue_length
  • cortex_request_duration_seconds_bucket
  • cortex_request_duration_seconds_count
  • cortex_request_duration_seconds_sum
  • cortex_ruler_managers_total
  • cortex_runtime_config_hash
  • cortex_tcp_connections
  • cortex_tcp_connections_limit
  • go_memstats_heap_inuse_bytes
  • kube_deployment_spec_replicas
  • kube_deployment_status_replicas_unavailable
  • kube_deployment_status_replicas_updated
  • kube_persistentvolumeclaim_labels
  • kube_pod_container_info
  • kube_pod_container_resource_requests
  • kube_pod_container_resource_requests_cpu_cores
  • kube_pod_container_resource_requests_memory_bytes
  • kube_statefulset_replicas
  • kube_statefulset_status_replicas_current
  • kube_statefulset_status_replicas_ready
  • kube_statefulset_status_replicas_updated
  • kubelet_volume_stats_capacity_bytes
  • kubelet_volume_stats_used_bytes
  • memcached_limit_bytes
  • node_disk_read_bytes_total
  • node_disk_written_bytes_total
  • node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate
  • prometheus_engine_query_duration_seconds
  • prometheus_tsdb_compaction_duration_seconds_bucket
  • prometheus_tsdb_compaction_duration_seconds_count
  • prometheus_tsdb_compaction_duration_seconds_sum
  • prometheus_tsdb_compactions_total
  • thanos_cache_memcached_hits_total
  • thanos_cache_memcached_requests_total
  • thanos_memcached_operation_duration_seconds_bucket
  • thanos_memcached_operation_duration_seconds_count
  • thanos_memcached_operation_duration_seconds_sum
  • thanos_memcached_operations_total
  • thanos_objstore_bucket_operation_duration_seconds_bucket
  • thanos_objstore_bucket_operation_duration_seconds_count
  • thanos_objstore_bucket_operation_duration_seconds_sum
  • thanos_objstore_bucket_operation_failures_total
  • thanos_objstore_bucket_operations_total
  • thanos_store_index_cache_hits_total
  • thanos_store_index_cache_requests_total
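
Names in the list above that contain colons, such as cluster_job_route:cortex_request_duration_seconds_count:sum_rate, follow the Prometheus recording-rule naming convention level:metric:operations. The prefix names the labels the series is aggregated by, the middle part is the source metric, and the suffix names the operations applied. The rule below is a minimal sketch of how such a series could be produced. The group name and the 5m rate window are assumptions for illustration, and the rules actually used with Mimir may differ.

```yaml
groups:
  - name: mimir_example_rules   # hypothetical group name
    rules:
      # level = cluster, job, route; metric = cortex_request_duration_seconds_count;
      # operations = sum of rate
      - record: cluster_job_route:cortex_request_duration_seconds_count:sum_rate
        # The 5m window is an assumption; adjust it to match your rule definitions.
        expr: sum by (cluster, job, route) (rate(cortex_request_duration_seconds_count[5m]))
```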

Cost

By connecting your Self-hosted Grafana Mimir instance to Grafana Cloud, you might incur charges. To view information on the number of active series that your Grafana Cloud account uses for metrics included in each Cloud tier, see Active series and dpm usage and Cloud tier pricing.