Visualization and monitoring solutions

Monitor Kafka easily with Grafana

Monitor your deployment of Kafka, the popular open source distributed event streaming platform, with Grafana Cloud’s out-of-the-box monitoring solution. The Grafana Cloud forever-free tier includes 3 users and up to 10k metric series to support your monitoring needs.

Key metrics included

jvm_gc_collection_seconds_sum
jvm_memory_bytes_max
jvm_memory_bytes_used
kafka_cluster_partition_underminisr
kafka_cluster_partition_underreplicated
kafka_connect_app_info
kafka_connect_connect_metrics_connection_count
kafka_connect_connect_metrics_failed_authentication_total
kafka_connect_connect_metrics_incoming_byte_rate
kafka_connect_connect_metrics_io_ratio
kafka_connect_connect_metrics_network_io_rate
kafka_connect_connect_metrics_outgoing_byte_rate
kafka_connect_connect_metrics_request_rate
kafka_connect_connect_metrics_response_rate
kafka_connect_connect_metrics_successful_authentication_rate
kafka_connect_connect_worker_metrics_connector_count
kafka_connect_connect_worker_metrics_connector_destroyed_task_count
kafka_connect_connect_worker_metrics_connector_failed_task_count
kafka_connect_connect_worker_metrics_connector_paused_task_count
kafka_connect_connect_worker_metrics_connector_running_task_count
kafka_connect_connect_worker_metrics_connector_startup_failure_total
kafka_connect_connect_worker_metrics_connector_startup_success_total
kafka_connect_connect_worker_metrics_connector_total_task_count
kafka_connect_connect_worker_metrics_connector_unassigned_task_count
kafka_connect_connect_worker_metrics_task_count
kafka_connect_connect_worker_metrics_task_startup_failure_total
kafka_connect_connect_worker_metrics_task_startup_success_total
kafka_connect_connect_worker_rebalance_metrics_rebalance_avg_time_ms
kafka_connect_connect_worker_rebalance_metrics_time_since_last_rebalance_ms
kafka_connect_connector_info
kafka_connect_connector_metrics
kafka_connect_connector_task_metrics_batch_size_avg
kafka_connect_connector_task_metrics_batch_size_max
kafka_connect_connector_task_metrics_offset_commit_avg_time_ms
kafka_connect_connector_task_metrics_offset_commit_success_percentage
kafka_connect_connector_task_metrics_pause_ratio
kafka_connect_connector_task_metrics_running_ratio
kafka_connect_sink_task_metrics_partition_count
kafka_connect_sink_task_metrics_put_batch_avg_time_ms
kafka_connect_sink_task_metrics_put_batch_max_time_ms
kafka_connect_source_task_metrics_poll_batch_avg_time_ms
kafka_connect_source_task_metrics_poll_batch_max_time_ms
kafka_connect_source_task_metrics_source_record_active_count_avg
kafka_connect_source_task_metrics_source_record_active_count_max
kafka_connect_source_task_metrics_source_record_poll_rate
kafka_connect_source_task_metrics_source_record_write_rate
kafka_connect_task_error_metrics_deadletterqueue_produce_requests
kafka_connect_task_error_metrics_total_errors_logged
kafka_connect_task_error_metrics_total_record_errors
kafka_connect_task_error_metrics_total_record_failures
kafka_connect_task_error_metrics_total_records_skipped
kafka_connect_task_error_metrics_total_retries
kafka_consumer_lag_millis
kafka_consumergroup_current_offset
kafka_consumergroup_uncommitted_offsets
kafka_controller_controllerstats_uncleanleaderelectionspersec
kafka_controller_kafkacontroller_activecontrollercount
kafka_controller_kafkacontroller_offlinepartitionscount
kafka_controller_kafkacontroller_preferredreplicaimbalancecount
kafka_coordinator_group_groupmetadatamanager_numgroups
kafka_coordinator_group_groupmetadatamanager_numgroupscompletingrebalance
kafka_coordinator_group_groupmetadatamanager_numgroupsdead
kafka_coordinator_group_groupmetadatamanager_numgroupsempty
kafka_coordinator_group_groupmetadatamanager_numgroupspreparingrebalance
kafka_coordinator_group_groupmetadatamanager_numgroupsstable
kafka_log_log_logendoffset
kafka_log_log_logstartoffset
kafka_log_log_size
kafka_network_acceptor_acceptorblockedpercent
kafka_network_requestchannel_requestqueuesize
kafka_network_requestchannel_responsequeuesize
kafka_network_requestmetrics_localtimems
kafka_network_requestmetrics_remotetimems
kafka_network_requestmetrics_requestqueuetimems
kafka_network_requestmetrics_requestspersec
kafka_network_requestmetrics_responsequeuetimems
kafka_network_requestmetrics_responsesendtimems
kafka_network_socketserver_networkprocessoravgidlepercent
kafka_schema_registry_jersey_metrics_request_latency_99
kafka_schema_registry_jersey_metrics_request_rate
kafka_schema_registry_jetty_metrics_connections_active
kafka_schema_registry_registered_count
kafka_schema_registry_schemas_created
kafka_server_brokertopicmetrics_bytesinpersec
kafka_server_brokertopicmetrics_bytesoutpersec
kafka_server_brokertopicmetrics_fetchmessageconversionspersec
kafka_server_brokertopicmetrics_messagesinpersec
kafka_server_brokertopicmetrics_producemessageconversionspersec
kafka_server_brokertopicmetrics_totalfetchrequestspersec
kafka_server_brokertopicmetrics_totalproducerequestspersec
kafka_server_kafkarequesthandlerpool_requesthandleravgidlepercent_total
kafka_server_kafkaserver_brokerstate
kafka_server_replicamanager_isrexpandspersec
kafka_server_replicamanager_isrshrinkspersec
kafka_server_replicamanager_leadercount
kafka_server_replicamanager_partitioncount
kafka_server_replicamanager_underreplicatedpartitions
kafka_server_sessionexpirelistener_zookeeperauthfailurespersec
kafka_server_sessionexpirelistener_zookeeperdisconnectspersec
kafka_server_sessionexpirelistener_zookeeperexpirespersec
kafka_server_sessionexpirelistener_zookeepersyncconnectspersec
kafka_server_socketservermetrics_connection_close_rate
kafka_server_socketservermetrics_connection_count
kafka_server_socketservermetrics_connection_creation_rate
kafka_server_socketservermetrics_connections
kafka_server_zookeeperclientmetrics_zookeeperrequestlatencyms
kafka_streams_stream_state_metrics_delete_latency_avg
kafka_streams_stream_state_metrics_delete_latency_max
kafka_streams_stream_state_metrics_delete_rate
kafka_streams_stream_state_metrics_fetch_latency_avg
kafka_streams_stream_state_metrics_fetch_rate
kafka_streams_stream_state_metrics_put_if_absent_latency_avg
kafka_streams_stream_state_metrics_put_if_absent_latency_max
kafka_streams_stream_state_metrics_put_if_absent_rate_rate
kafka_streams_stream_state_metrics_put_latency_avg
kafka_streams_stream_state_metrics_put_latency_max
kafka_streams_stream_state_metrics_put_rate
kafka_streams_stream_state_metrics_restore_latency_avg
kafka_streams_stream_state_metrics_restore_latency_max
kafka_streams_stream_state_metrics_restore_rate
kafka_streams_stream_thread_metrics_commit_latency_avg
kafka_streams_stream_thread_metrics_commit_latency_max
kafka_streams_stream_thread_metrics_poll_latency_avg
kafka_streams_stream_thread_metrics_poll_latency_max
kafka_streams_stream_thread_metrics_process_latency_avg
kafka_streams_stream_thread_metrics_process_latency_max
kafka_streams_stream_thread_metrics_punctuate_latency_avg
kafka_streams_stream_thread_metrics_punctuate_latency_max
kafka_topic_partition_current_offset
ksql_ksql_engine_query_stats_error_queries
ksql_ksql_engine_query_stats_liveness_indicator
ksql_ksql_engine_query_stats_messages_consumed_per_sec
ksql_ksql_engine_query_stats_messages_produced_per_sec
ksql_ksql_engine_query_stats_not_running_queries
ksql_ksql_engine_query_stats_num_active_queries
ksql_ksql_engine_query_stats_num_idle_queries
ksql_ksql_engine_query_stats_num_persistent_queries
ksql_ksql_engine_query_stats_pending_shutdown_queries
ksql_ksql_engine_query_stats_rebalancing_queries
ksql_ksql_engine_query_stats_running_queries
ksql_ksql_metrics_ksql_queries_query_status
process_cpu_seconds_total
zookeeper_avgrequestlatency
zookeeper_inmemorydatatree_nodecount
zookeeper_inmemorydatatree_watchcount
zookeeper_maxrequestlatency
zookeeper_minrequestlatency
zookeeper_numaliveconnections
zookeeper_outstandingrequests
zookeeper_quorumsize
zookeeper_status_quorumsize
zookeeper_ticktime
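
As a rough illustration of how a few of the metrics above might be consumed, the following sketch parses a small sample in the Prometheus text exposition format and derives JVM heap utilization from jvm_memory_bytes_used and jvm_memory_bytes_max. The sample values, the area="heap" label, and the helper names parse_samples and heap_utilization are assumptions made for this example; they are not part of the Grafana Cloud solution. The parser is deliberately simplified and skips lines that carry timestamps or label values containing a closing brace.

```python
import re

# Hypothetical scrape output; metric names come from the list above,
# the values and the area="heap" label are made up for illustration.
SAMPLE = """\
# HELP jvm_memory_bytes_used Used bytes of a given JVM memory area.
jvm_memory_bytes_used{area="heap"} 536870912
jvm_memory_bytes_max{area="heap"} 1073741824
kafka_server_replicamanager_underreplicatedpartitions 0
kafka_controller_kafkacontroller_offlinepartitionscount 0
"""

# Simplified pattern: metric name, optional {labels}, then a numeric value.
LINE_RE = re.compile(
    r"^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?P<labels>\{[^}]*\})?\s+(?P<value>\S+)$"
)


def parse_samples(text):
    """Return {(metric_name, labels_string): value} for each sample line."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip anything this simplified pattern cannot read
        key = (match.group("name"), match.group("labels") or "")
        samples[key] = float(match.group("value"))
    return samples


def heap_utilization(samples):
    """Fraction of the maximum heap currently in use (0.0 to 1.0)."""
    used = samples[("jvm_memory_bytes_used", '{area="heap"}')]
    maximum = samples[("jvm_memory_bytes_max", '{area="heap"}')]
    return used / maximum


samples = parse_samples(SAMPLE)
print(f"heap utilization: {heap_utilization(samples):.0%}")
```

In a real deployment the same ratio would more likely be computed in a dashboard query than in application code; the sketch only shows how the two metrics relate.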

Key alerting rules included

KafkaOfflinePartitonCount (Critical)
KafkaUnderReplicatedPartitionCount (Critical)
KafkaActiveController (Critical)
KafkaUncleanLeaderElection (Critical)
KafkaISRExpandRate (Warning)
KafkaISRShrinkRate (Warning)
KafkaBrokerCount (Critical)
KafkaZookeeperSyncConnect (Warning)
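
To give a feel for what rules like these check, here is a minimal sketch that evaluates three of the alerts above against cluster-wide metric totals. The conditions (more than zero offline partitions, more than zero under-replicated partitions, an active controller count other than one) are assumptions chosen for illustration; the expressions and thresholds actually shipped with the solution may differ. The function name firing_alerts and the input dictionary are hypothetical.

```python
# Illustrative conditions only; the shipped rule expressions may differ.
# Each entry: (alert name, severity, metric summed across brokers, condition).
RULES = [
    (
        "KafkaOfflinePartitonCount",
        "Critical",
        "kafka_controller_kafkacontroller_offlinepartitionscount",
        lambda total: total > 0,
    ),
    (
        "KafkaUnderReplicatedPartitionCount",
        "Critical",
        "kafka_server_replicamanager_underreplicatedpartitions",
        lambda total: total > 0,
    ),
    (
        "KafkaActiveController",
        "Critical",
        "kafka_controller_kafkacontroller_activecontrollercount",
        lambda total: total != 1,
    ),
]


def firing_alerts(cluster_totals):
    """Return (alert name, severity) pairs whose assumed condition holds.

    cluster_totals maps a metric name to its value summed over all brokers.
    Rules whose metric is missing from the input are skipped.
    """
    firing = []
    for name, severity, metric, condition in RULES:
        if metric not in cluster_totals:
            continue
        if condition(cluster_totals[metric]):
            firing.append((name, severity))
    return firing


# Hypothetical totals: three under-replicated partitions, otherwise healthy.
totals = {
    "kafka_controller_kafkacontroller_offlinepartitionscount": 0,
    "kafka_server_replicamanager_underreplicatedpartitions": 3,
    "kafka_controller_kafkacontroller_activecontrollercount": 1,
}
print(firing_alerts(totals))
```

With the hypothetical totals above, only the under-replicated partition rule fires. Production alerting would normally be expressed as alert rules evaluated by the monitoring backend rather than in Python.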