Kafka integration

The Kafka integration provides streaming platform monitoring for Apache Kafka clusters, including brokers, topics, and consumer groups.

What it’s for: Monitoring broker health, consumer lag, and message throughput
Who uses it: Data engineers, platform teams, anyone running Kafka streaming infrastructure
Under the hood: Collects JMX metrics from Kafka brokers via the JMX exporter

Metrics collected

  • Broker — JVM, requests, replication
  • Topics — Messages in/out, partitions
  • Consumer — Lag, offset, group status
  • ZooKeeper — Connections, latency (if used)

Trade-offs

Best for: Apache Kafka clusters and streaming architectures

Pros:

  • Pre-built dashboards: broker, topic, consumer
  • Pre-built alerts: consumer lag, broker health
  • Partition distribution visibility

Cons:

  • JMX configuration required
  • High cardinality with many topics
  • Complex metric space

Learning path

Deploy this integration step by step.

Script

Kafka is a complex system: brokers, topics, partitions, and consumer groups all work together, and the integration gives you visibility into all of them. Broker metrics are the foundation of cluster health: JVM memory and garbage collection, request handling rates, and replication status. Kafka runs on the JVM, so JVM health directly affects performance.

Topic metrics show message rates and partition counts, but for most teams consumer lag is the metric that matters most. Lag tells you whether your consumers are keeping up with producers. Growing lag means processing is falling behind, and eventually that affects your applications. Because lag is so critical, the integration surfaces it prominently.
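To make the idea of lag concrete, here is a minimal sketch of how it is usually defined: per partition, the log end offset minus the consumer group's committed offset. The function names and sample offsets are hypothetical and are not part of the integration. Treating a missing commit as offset 0 is a simplifying assumption.

```python
from typing import Dict, List


def partition_lag(log_end_offsets: Dict[int, int],
                  committed_offsets: Dict[int, int]) -> Dict[int, int]:
    """Return lag per partition: log end offset minus committed offset."""
    lag = {}
    for partition, end_offset in log_end_offsets.items():
        # Assumption: a partition with no commit yet is counted from offset 0.
        committed = committed_offsets.get(partition, 0)
        # Clamp at zero so a stale end-offset reading never yields negative lag.
        lag[partition] = max(end_offset - committed, 0)
    return lag


def is_lag_growing(samples: List[int]) -> bool:
    """True if total lag increased at every step of the sampled window."""
    return len(samples) >= 2 and all(b > a for a, b in zip(samples, samples[1:]))


# Hypothetical offsets for a three-partition topic.
end_offsets = {0: 1500, 1: 1200, 2: 900}
committed = {0: 1450, 1: 1200, 2: 600}

lag = partition_lag(end_offsets, committed)
total_lag = sum(lag.values())
```

A single lag reading says little on its own. The trend across several samples, as in `is_lag_growing`, is what indicates that consumers are falling behind.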

Partition distribution shows whether load is balanced across brokers. Imbalanced partitions mean some brokers work harder than others, a common source of unexpected failures.
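One rough way to quantify imbalance is to count partition leaders per broker and compare the busiest broker against an even spread. The sketch below assumes you already have a mapping from (topic, partition) to leader broker ID. The data, function names, and ratio are illustrative and are not how the integration computes its dashboards.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def leaders_per_broker(leaders: Dict[Tuple[str, int], int],
                       broker_ids: Iterable[int]) -> Dict[int, int]:
    """Count how many partitions each broker leads, including brokers with none."""
    counts = Counter(leaders.values())
    return {broker: counts.get(broker, 0) for broker in broker_ids}


def imbalance_ratio(counts: Dict[int, int]) -> float:
    """Busiest broker's leader count divided by the even share (1.0 = balanced)."""
    total = sum(counts.values())
    if total == 0:
        return 1.0
    even_share = total / len(counts)
    return max(counts.values()) / even_share


# Hypothetical cluster: three brokers, two topics with three partitions each.
leaders = {
    ("orders", 0): 1, ("orders", 1): 1, ("orders", 2): 1,
    ("payments", 0): 2, ("payments", 1): 1, ("payments", 2): 3,
}
counts = leaders_per_broker(leaders, broker_ids=[1, 2, 3])
ratio = imbalance_ratio(counts)
```

In this sample, broker 1 leads four of the six partitions, twice its even share. Leader count is only a proxy for load, since partitions differ in traffic.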

Setup requires JMX access to the Kafka brokers. If JMX isn’t enabled on your brokers, you’ll need to enable it first. One warning about cardinality: environments with thousands of topics and consumer groups generate enormous numbers of time series. Consider filtering to the topics and groups that actually matter.
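The cardinality warning can be illustrated with back-of-the-envelope arithmetic and an allowlist filter. Everything here is an assumption for illustration: the per-partition metric counts are made up, the estimate is a worst-case upper bound in which every group consumes every topic, and the regex allowlist is not the integration's actual filter syntax.

```python
import re
from typing import Iterable, List


def estimate_series(topics: int, partitions_per_topic: int, consumer_groups: int,
                    metrics_per_partition: int,
                    metrics_per_group_partition: int) -> int:
    """Upper-bound estimate of time series from topic and consumer-group metrics."""
    topic_series = topics * partitions_per_topic * metrics_per_partition
    # Worst case: every consumer group reads every partition of every topic.
    group_series = (consumer_groups * topics * partitions_per_topic
                    * metrics_per_group_partition)
    return topic_series + group_series


def filter_names(names: Iterable[str], allow_patterns: Iterable[str]) -> List[str]:
    """Keep only names that fully match at least one allowlist regex."""
    compiled = [re.compile(pattern) for pattern in allow_patterns]
    return [name for name in names if any(rx.fullmatch(name) for rx in compiled)]


# Hypothetical environment: 2000 topics, 6 partitions each, 50 consumer groups.
worst_case = estimate_series(2000, 6, 50,
                             metrics_per_partition=3,
                             metrics_per_group_partition=2)

# Hypothetical allowlist that keeps business topics and drops temporary ones.
kept = filter_names(["orders", "orders-dlq", "tmp-123", "payments"],
                    [r"orders.*", r"payments"])
```

Under these assumptions the worst case comes to over a million series, most of it from the consumer-group term, which is why filtering groups as well as topics tends to matter.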

For traditional deployments, ZooKeeper metrics are included. If you’ve moved to KRaft (Kafka’s newer consensus mechanism), there is no ZooKeeper ensemble to monitor, and the metadata quorum exposes a different set of metrics.

This integration is essential for production Kafka. Streaming reliability depends on catching problems early.