
Monitor Grafana Mimir (self-hosted) easily with Grafana

Grafana Mimir is an open source, horizontally scalable, highly available, multi-tenant TSDB that provides long-term storage for Prometheus. Grafana Cloud’s out-of-the-box monitoring solution lets you monitor a self-hosted Mimir instance. The Grafana Cloud forever-free tier includes 3 users and up to 10k metrics series to support your monitoring needs.

Figure: Self-hosted Grafana Mimir monitoring dashboard overview example

Key metrics included

cluster_job:cortex_alertmanager_alerts_invalid_total:rate5m
cluster_job:cortex_alertmanager_alerts_received_total:rate5m
cluster_job:cortex_alertmanager_partial_state_merges_failed_total:rate5m
cluster_job:cortex_alertmanager_partial_state_merges_total:rate5m
cluster_job:cortex_alertmanager_state_replication_failed_total:rate5m
cluster_job:cortex_alertmanager_state_replication_total:rate5m
cluster_job:cortex_ingester_queried_exemplars_bucket:sum_rate
cluster_job:cortex_ingester_queried_exemplars_count:sum_rate
cluster_job:cortex_ingester_queried_exemplars_sum:sum_rate
cluster_job:cortex_ingester_queried_samples_bucket:sum_rate
cluster_job:cortex_ingester_queried_samples_count:sum_rate
cluster_job:cortex_ingester_queried_samples_sum:sum_rate
cluster_job:cortex_ingester_queried_series_bucket:sum_rate
cluster_job:cortex_ingester_queried_series_count:sum_rate
cluster_job:cortex_ingester_queried_series_sum:sum_rate
cluster_job_integration:cortex_alertmanager_notifications_failed_total:rate5m
cluster_job_integration:cortex_alertmanager_notifications_total:rate5m
cluster_job_pod:cortex_alertmanager_alerts:sum
cluster_job_pod:cortex_alertmanager_silences:sum
cluster_job_route:cortex_querier_request_duration_seconds_bucket:sum_rate
cluster_job_route:cortex_querier_request_duration_seconds_count:sum_rate
cluster_job_route:cortex_querier_request_duration_seconds_sum:sum_rate
cluster_job_route:cortex_request_duration_seconds_bucket:sum_rate
cluster_job_route:cortex_request_duration_seconds_count:sum_rate
cluster_job_route:cortex_request_duration_seconds_sum:sum_rate
cluster_namespace_deployment:actual_replicas:count
cluster_namespace_deployment_reason:required_replicas:count
cluster_namespace_job:cortex_distributor_exemplars_in:rate5m
cluster_namespace_job:cortex_distributor_received_exemplars:rate5m
cluster_namespace_job:cortex_distributor_received_samples:rate5m
cluster_namespace_job:cortex_ingester_ingested_exemplars:rate5m
cluster_namespace_job:cortex_ingester_tsdb_exemplar_exemplars_appended:rate5m
cluster_namespace_job_route:cortex_request_duration_seconds:99quantile
cluster_namespace_pod:cortex_ingester_ingested_samples_total:rate1m
container_cpu_usage_seconds_total
container_fs_writes_bytes_total
container_memory_rss
container_memory_usage_bytes
container_memory_working_set_bytes
container_network_receive_bytes_total
container_network_transmit_bytes_total
container_spec_cpu_period
container_spec_cpu_quota
container_spec_memory_limit_bytes
cortex_alertmanager_alerts
cortex_alertmanager_alerts_invalid_total
cortex_alertmanager_alerts_received_total
cortex_alertmanager_dispatcher_aggregation_groups
cortex_alertmanager_notification_latency_seconds_bucket
cortex_alertmanager_notification_latency_seconds_count
cortex_alertmanager_notification_latency_seconds_sum
cortex_alertmanager_notifications_failed_total
cortex_alertmanager_notifications_total
cortex_alertmanager_partial_state_merges_failed_total
cortex_alertmanager_partial_state_merges_total
cortex_alertmanager_ring_check_errors_total
cortex_alertmanager_silences
cortex_alertmanager_state_fetch_replica_state_failed_total
cortex_alertmanager_state_fetch_replica_state_total
cortex_alertmanager_state_initial_sync_completed_total
cortex_alertmanager_state_initial_sync_duration_seconds_bucket
cortex_alertmanager_state_initial_sync_duration_seconds_count
cortex_alertmanager_state_initial_sync_duration_seconds_sum
cortex_alertmanager_state_persist_failed_total
cortex_alertmanager_state_persist_total
cortex_alertmanager_state_replication_failed_total
cortex_alertmanager_state_replication_total
cortex_alertmanager_sync_configs_failed_total
cortex_alertmanager_sync_configs_total
cortex_alertmanager_tenants_discovered
cortex_alertmanager_tenants_owned
cortex_bucket_blocks_count
cortex_bucket_index_last_successful_update_timestamp_seconds
cortex_bucket_index_load_duration_seconds_bucket
cortex_bucket_index_load_duration_seconds_count
cortex_bucket_index_load_duration_seconds_sum
cortex_bucket_index_load_failures_total
cortex_bucket_index_loaded
cortex_bucket_index_loads_total
cortex_bucket_store_block_drop_failures_total
cortex_bucket_store_block_drops_total
cortex_bucket_store_block_load_failures_total
cortex_bucket_store_block_loads_total
cortex_bucket_store_blocks_loaded
cortex_bucket_store_indexheader_lazy_load_duration_seconds_bucket
cortex_bucket_store_indexheader_lazy_load_duration_seconds_count
cortex_bucket_store_indexheader_lazy_load_duration_seconds_sum
cortex_bucket_store_indexheader_lazy_load_total
cortex_bucket_store_indexheader_lazy_unload_total
cortex_bucket_store_series_batch_preloading_load_duration_seconds_sum
cortex_bucket_store_series_batch_preloading_wait_duration_seconds_sum
cortex_bucket_store_series_blocks_queried_sum
cortex_bucket_store_series_data_size_fetched_bytes_sum
cortex_bucket_store_series_data_size_touched_bytes_sum
cortex_bucket_store_series_hash_cache_hits_total
cortex_bucket_store_series_hash_cache_requests_total
cortex_bucket_store_series_request_stage_duration_seconds_bucket
cortex_bucket_store_series_request_stage_duration_seconds_count
cortex_bucket_store_series_request_stage_duration_seconds_sum
cortex_bucket_stores_blocks_last_successful_sync_timestamp_seconds
cortex_bucket_stores_tenants_synced
cortex_build_info
cortex_cache_fetched_keys
cortex_cache_hits
cortex_cache_memory_hits_total
cortex_cache_memory_requests_total
cortex_cache_request_duration_seconds_bucket
cortex_cache_request_duration_seconds_count
cortex_cache_request_duration_seconds_sum
cortex_compactor_block_cleanup_failures_total
cortex_compactor_block_cleanup_last_successful_run_timestamp_seconds
cortex_compactor_blocks_cleaned_total
cortex_compactor_blocks_marked_for_deletion_total
cortex_compactor_blocks_marked_for_no_compaction_total
cortex_compactor_group_compaction_runs_started_total
cortex_compactor_last_successful_run_timestamp_seconds
cortex_compactor_meta_sync_duration_seconds_bucket
cortex_compactor_meta_sync_duration_seconds_count
cortex_compactor_meta_sync_duration_seconds_sum
cortex_compactor_meta_sync_failures_total
cortex_compactor_meta_syncs_total
cortex_compactor_runs_completed_total
cortex_compactor_runs_failed_total
cortex_compactor_runs_started_total
cortex_compactor_tenants_discovered
cortex_compactor_tenants_processing_failed
cortex_compactor_tenants_processing_succeeded
cortex_compactor_tenants_skipped
cortex_config_hash
cortex_discarded_exemplars_total
cortex_discarded_requests_total
cortex_discarded_samples_total
cortex_distributor_deduped_samples_total
cortex_distributor_exemplars_in_total
cortex_distributor_inflight_push_requests
cortex_distributor_instance_limits
cortex_distributor_latest_seen_sample_timestamp_seconds
cortex_distributor_non_ha_samples_received_total
cortex_distributor_received_exemplars_total
cortex_distributor_received_requests_total
cortex_distributor_received_samples_total
cortex_distributor_replication_factor
cortex_distributor_requests_in_total
cortex_distributor_samples_in_total
cortex_frontend_query_range_duration_seconds_count
cortex_frontend_query_result_cache_attempted_total
cortex_frontend_query_result_cache_skipped_total
cortex_frontend_query_sharding_rewrites_attempted_total
cortex_frontend_query_sharding_rewrites_succeeded_total
cortex_frontend_sharded_queries_per_query_bucket
cortex_frontend_sharded_queries_per_query_count
cortex_frontend_sharded_queries_per_query_sum
cortex_frontend_split_queries_total
cortex_inflight_requests
cortex_ingester_active_series
cortex_ingester_active_series_custom_tracker
cortex_ingester_client_request_duration_seconds_bucket
cortex_ingester_client_request_duration_seconds_count
cortex_ingester_client_request_duration_seconds_sum
cortex_ingester_ingested_exemplars_total
cortex_ingester_ingested_samples_total
cortex_ingester_instance_limits
cortex_ingester_memory_series
cortex_ingester_memory_series_created_total
cortex_ingester_memory_series_removed_total
cortex_ingester_memory_users
cortex_ingester_oldest_unshipped_block_timestamp_seconds
cortex_ingester_queried_exemplars_bucket
cortex_ingester_queried_exemplars_count
cortex_ingester_queried_exemplars_sum
cortex_ingester_queried_samples_bucket
cortex_ingester_queried_samples_count
cortex_ingester_queried_samples_sum
cortex_ingester_queried_series_bucket
cortex_ingester_queried_series_count
cortex_ingester_queried_series_sum
cortex_ingester_shipper_upload_failures_total
cortex_ingester_shipper_uploads_total
cortex_ingester_tsdb_checkpoint_creations_failed_total
cortex_ingester_tsdb_checkpoint_creations_total
cortex_ingester_tsdb_checkpoint_deletions_failed_total
cortex_ingester_tsdb_compaction_duration_seconds_bucket
cortex_ingester_tsdb_compaction_duration_seconds_count
cortex_ingester_tsdb_compaction_duration_seconds_sum
cortex_ingester_tsdb_compactions_failed_total
cortex_ingester_tsdb_compactions_total
cortex_ingester_tsdb_exemplar_exemplars_appended_total
cortex_ingester_tsdb_exemplar_exemplars_in_storage
cortex_ingester_tsdb_exemplar_last_exemplars_timestamp_seconds
cortex_ingester_tsdb_exemplar_series_with_exemplars_in_storage
cortex_ingester_tsdb_head_truncations_failed_total
cortex_ingester_tsdb_mmap_chunk_corruptions_total
cortex_ingester_tsdb_storage_blocks_bytes
cortex_ingester_tsdb_symbol_table_size_bytes
cortex_ingester_tsdb_wal_corruptions_total
cortex_ingester_tsdb_wal_truncate_duration_seconds_count
cortex_ingester_tsdb_wal_truncate_duration_seconds_sum
cortex_ingester_tsdb_wal_truncations_failed_total
cortex_ingester_tsdb_wal_truncations_total
cortex_ingester_tsdb_wal_writes_failed_total
cortex_kv_request_duration_seconds_bucket
cortex_kv_request_duration_seconds_count
cortex_kv_request_duration_seconds_sum
cortex_limits_defaults
cortex_limits_overrides
cortex_memcache_request_duration_seconds_bucket
cortex_memcache_request_duration_seconds_count
cortex_memcache_request_duration_seconds_sum
cortex_prometheus_notifications_dropped_total
cortex_prometheus_notifications_errors_total
cortex_prometheus_notifications_queue_capacity
cortex_prometheus_notifications_queue_length
cortex_prometheus_notifications_sent_total
cortex_prometheus_rule_evaluation_duration_seconds_count
cortex_prometheus_rule_evaluation_duration_seconds_sum
cortex_prometheus_rule_evaluation_failures_total
cortex_prometheus_rule_evaluations_total
cortex_prometheus_rule_group_duration_seconds_count
cortex_prometheus_rule_group_duration_seconds_sum
cortex_prometheus_rule_group_iterations_missed_total
cortex_prometheus_rule_group_iterations_total
cortex_prometheus_rule_group_rules
cortex_querier_blocks_consistency_checks_failed_total
cortex_querier_blocks_consistency_checks_total
cortex_querier_blocks_last_successful_scan_timestamp_seconds
cortex_querier_request_duration_seconds_bucket
cortex_querier_request_duration_seconds_count
cortex_querier_request_duration_seconds_sum
cortex_querier_storegateway_instances_hit_per_query_bucket
cortex_querier_storegateway_instances_hit_per_query_count
cortex_querier_storegateway_instances_hit_per_query_sum
cortex_querier_storegateway_refetches_per_query_bucket
cortex_querier_storegateway_refetches_per_query_count
cortex_querier_storegateway_refetches_per_query_sum
cortex_query_frontend_queries_total
cortex_query_frontend_queue_duration_seconds_bucket
cortex_query_frontend_queue_duration_seconds_count
cortex_query_frontend_queue_duration_seconds_sum
cortex_query_frontend_queue_length
cortex_query_frontend_retries_bucket
cortex_query_frontend_retries_count
cortex_query_frontend_retries_sum
cortex_query_scheduler_queue_duration_seconds_bucket
cortex_query_scheduler_queue_duration_seconds_count
cortex_query_scheduler_queue_duration_seconds_sum
cortex_query_scheduler_queue_length
cortex_request_duration_seconds_bucket
cortex_request_duration_seconds_count
cortex_request_duration_seconds_sum
cortex_ring_members
cortex_ruler_managers_total
cortex_ruler_queries_failed_total
cortex_ruler_queries_total
cortex_ruler_ring_check_errors_total
cortex_ruler_write_requests_failed_total
cortex_ruler_write_requests_total
cortex_runtime_config_hash
cortex_runtime_config_last_reload_successful
cortex_tcp_connections
cortex_tcp_connections_limit
go_memstats_heap_inuse_bytes
keda_metrics_adapter_scaler_errors
keda_metrics_adapter_scaler_metrics_value
kube_deployment_spec_replicas
kube_deployment_status_replicas_unavailable
kube_deployment_status_replicas_updated
kube_horizontalpodautoscaler_spec_target_metric
kube_horizontalpodautoscaler_status_condition
kube_persistentvolumeclaim_labels
kube_pod_container_info
kube_pod_container_resource_requests
kube_pod_container_resource_requests_cpu_cores
kube_pod_container_resource_requests_memory_bytes
kube_statefulset_replicas
kube_statefulset_status_current_revision
kube_statefulset_status_replicas_current
kube_statefulset_status_replicas_ready
kube_statefulset_status_replicas_updated
kube_statefulset_status_update_revision
kubelet_volume_stats_capacity_bytes
kubelet_volume_stats_used_bytes
memberlist_client_cluster_members_count
memcached_limit_bytes
mimir_continuous_test_queries_failed_total
mimir_continuous_test_query_result_checks_failed_total
mimir_continuous_test_writes_failed_total
node_disk_read_bytes_total
node_disk_written_bytes_total
node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate
process_memory_map_areas
process_memory_map_areas_limit
process_start_time_seconds
prometheus_tsdb_compaction_duration_seconds_bucket
prometheus_tsdb_compaction_duration_seconds_count
prometheus_tsdb_compaction_duration_seconds_sum
prometheus_tsdb_compactions_total
rollout_operator_last_successful_group_reconcile_timestamp_seconds
test_exporter_test_case_result_total
thanos_cache_hits_total
thanos_cache_memcached_hits_total
thanos_cache_memcached_requests_total
thanos_cache_operation_duration_seconds_bucket
thanos_cache_operation_duration_seconds_count
thanos_cache_operation_duration_seconds_sum
thanos_cache_operation_failures_total
thanos_cache_operations_total
thanos_cache_requests_total
thanos_memcached_operation_duration_seconds_bucket
thanos_memcached_operation_duration_seconds_count
thanos_memcached_operation_duration_seconds_sum
thanos_memcached_operation_failures_total
thanos_memcached_operations_total
thanos_objstore_bucket_last_successful_upload_time
thanos_objstore_bucket_operation_duration_seconds_bucket
thanos_objstore_bucket_operation_duration_seconds_count
thanos_objstore_bucket_operation_duration_seconds_sum
thanos_objstore_bucket_operation_failures_total
thanos_objstore_bucket_operations_total
thanos_shipper_last_successful_upload_time
thanos_store_index_cache_hits_total
thanos_store_index_cache_requests_total
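
Many names in the list above contain colons (for example `cluster_job:cortex_alertmanager_alerts_invalid_total:rate5m`), which is the shape Prometheus recommends for recording rules: `level:metric:operations`. The sketch below assumes the listed names follow that convention and splits a name into its three parts, treating names without colons as raw metrics. The `RuleName` type and `parse_metric_name` function are hypothetical helpers for illustration, not part of Mimir or Grafana Cloud.

```python
from typing import NamedTuple, Optional


class RuleName(NamedTuple):
    level: Optional[str]       # aggregation labels, e.g. "cluster_job"
    metric: str                # underlying metric name
    operations: Optional[str]  # applied operations, e.g. "rate5m" or "sum_rate"


def parse_metric_name(name: str) -> RuleName:
    """Split a name on ':' assuming the level:metric:operations convention."""
    parts = name.split(":")
    if len(parts) == 3:
        return RuleName(parts[0], parts[1], parts[2])
    # Any other shape is treated as a raw (non-recorded) metric.
    return RuleName(None, name, None)


if __name__ == "__main__":
    for n in (
        "cluster_job_route:cortex_request_duration_seconds_bucket:sum_rate",
        "cortex_build_info",
    ):
        print(parse_metric_name(n))
```

Grouping the parsed results by `level` or by `metric` is one way to see which raw series each aggregated series is likely derived from.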

Key alerting rules included

MimirIngesterUnhealthy
MimirRequestErrors
MimirRequestLatency
MimirQueriesIncorrect
MimirInconsistentRuntimeConfig
MimirBadRuntimeConfig
MimirFrontendQueriesStuck
MimirSchedulerQueriesStuck
MimirCacheRequestErrors
MimirIngesterRestarts
MimirKVStoreFailure
MimirMemoryMapAreasTooHigh
MimirIngesterInstanceHasNoTenants
MimirRulerInstanceHasNoRuleGroups
MimirRingMembersMismatch
MimirIngesterReachingSeriesLimit (Warning)
MimirIngesterReachingSeriesLimit (Critical)
MimirIngesterReachingTenantsLimit (Warning)
MimirIngesterReachingTenantsLimit (Critical)
MimirReachingTCPConnectionsLimit
MimirDistributorReachingInflightPushRequestLimit
MimirRolloutStuck (Warning)
MimirRolloutStuck (Critical)
RolloutOperatorNotReconciling
MimirProvisioningTooManyActiveSeries
MimirProvisioningTooManyWrites
MimirAllocatingTooMuchMemory (Warning)
MimirAllocatingTooMuchMemory (Critical)
MimirRulerTooManyFailedPushes
MimirRulerTooManyFailedQueries
MimirRulerMissedEvaluations
MimirRulerFailedRingCheck
MimirRulerRemoteEvaluationFailing
MimirGossipMembersMismatch
EtcdAllocatingTooMuchMemory (Warning)
EtcdAllocatingTooMuchMemory (Critical)
MimirAlertmanagerSyncConfigsFailing
MimirAlertmanagerRingCheckFailing
MimirAlertmanagerPartialStateMergeFailing
MimirAlertmanagerReplicationFailing
MimirAlertmanagerPersistStateFailing
MimirAlertmanagerInitialSyncFailed
MimirAlertmanagerAllocatingTooMuchMemory (Warning)
MimirAlertmanagerAllocatingTooMuchMemory (Critical)
MimirAlertmanagerInstanceHasNoTenants
MimirIngesterHasNotShippedBlocks
MimirIngesterHasNotShippedBlocksSinceStart
MimirIngesterHasUnshippedBlocks
MimirIngesterTSDBHeadCompactionFailed
MimirIngesterTSDBHeadTruncationFailed
MimirIngesterTSDBCheckpointCreationFailed
MimirIngesterTSDBCheckpointDeletionFailed
MimirIngesterTSDBWALTruncationFailed
MimirIngesterTSDBWALCorrupted
MimirIngesterTSDBWALCorrupted
MimirIngesterTSDBWALWritesFailed
MimirQuerierHasNotScanTheBucket
MimirStoreGatewayHasNotSyncTheBucket
MimirStoreGatewayNoSyncedTenants
MimirBucketIndexNotUpdated
MimirCompactorHasNotSuccessfullyCleanedUpBlocks
MimirCompactorHasNotSuccessfullyRunCompaction
MimirCompactorHasNotUploadedBlocks
MimirCompactorSkippedBlocksWithOutOfOrderChunks
MimirAutoscalerNotActive
MimirAutoscalerKedaFailing
MimirContinuousTestNotRunningOnWrites
MimirContinuousTestNotRunningOnReads
MimirContinuousTestFailed
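
Some entries in the list above carry a severity suffix such as `(Warning)` or `(Critical)`, while others have none. As a minimal sketch, assuming each entry is either a bare alert name or a name followed by one parenthesized severity, the hypothetical helper below separates the two so the list can be grouped or filtered by severity. It only parses the text shown on this page and makes no claim about how the rules themselves are defined.

```python
import re
from typing import Optional, Tuple

# An alert name, optionally followed by a severity in parentheses.
_ENTRY = re.compile(r"^(?P<name>\w+)(?:\s+\((?P<severity>\w+)\))?$")


def parse_alert_entry(entry: str) -> Tuple[str, Optional[str]]:
    """Return (alert_name, severity); severity is None when no suffix is present."""
    match = _ENTRY.match(entry.strip())
    if match is None:
        raise ValueError(f"unrecognized alert entry: {entry!r}")
    severity = match.group("severity")
    return match.group("name"), severity.lower() if severity else None


if __name__ == "__main__":
    print(parse_alert_entry("MimirRolloutStuck (Critical)"))
    print(parse_alert_entry("MimirRequestErrors"))
```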