Apache Cassandra integration for Grafana Cloud

Apache Cassandra is an open source NoSQL distributed database. This integration for Grafana Cloud allows users to collect metrics and system logs for monitoring an Apache Cassandra instance or clustered deployment.

Metrics include the number of nodes in a cluster, virtual memory and CPU usage, read/write latencies, compaction tasks, and garbage collections. The integration also provides visualizations for cluster, node, and keyspace metrics.

This integration includes 8 useful alerts and 3 pre-built dashboards to help monitor and visualize Apache Cassandra metrics and logs.

Before you begin

For the integration to work properly, you must set up the JMX Exporter for Prometheus on each node in your cluster.

Set up JMX Exporter

Each instance of the JMX Exporter can be run with this Apache Cassandra JMX Exporter configuration file.

For more information on how to configure the JVM on each node, refer to the JMX Exporter documentation.

Apache Cassandra nodes expose port 7199 for JMX by default. Make sure the port matches your environment, and configure the exporter's JMX URL accordingly, for example: jmxUrl: service:jmx:rmi:///jndi/rmi://localhost:7199/jmxrmi
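As a rough illustration of where jmxUrl fits, a minimal JMX Exporter configuration might look like the following. This is only a sketch that assumes the exporter runs as a standalone HTTP server on the node. The catch-all rule is a placeholder assumption; the Apache Cassandra configuration file referenced above defines the actual rules that produce the metric names used by the dashboards.

```yaml
# Minimal JMX Exporter sketch (assumption: standalone exporter on the same host as Cassandra).
# For real deployments, use the Apache Cassandra configuration file referenced above.
jmxUrl: service:jmx:rmi:///jndi/rmi://localhost:7199/jmxrmi  # 7199 is Cassandra's default JMX port
lowercaseOutputName: true        # emit lowercase metric names
lowercaseOutputLabelNames: true  # emit lowercase label names
rules:
  - pattern: ".*"                # placeholder catch-all; the real file uses specific patterns
```

If JMX listens on a different port or host in your environment, only the jmxUrl line needs to change.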

Once the exporter is deployed, the Grafana Agent can scrape it.

Install Apache Cassandra integration for Grafana Cloud

  1. In your Grafana Cloud stack, click Connections.
  2. Navigate to the Apache Cassandra tile and review the prerequisites. Then click Install integration.
  3. Once the integration is installed, follow the steps on the Configuration Details page to set up Grafana Agent and start sending Apache Cassandra metrics and logs to your Grafana Cloud instance.

Post-install configuration for the Apache Cassandra integration

After enabling the metrics generation, instruct the Grafana Agent to scrape your Apache Cassandra nodes.

Make sure to change targets in the snippet according to your environment.

You must configure a custom label for this integration:

  • cluster, the value that identifies an Apache Cassandra cluster

You can define a cluster label by adding an extra label to the scrape_configs of the JMX Exporter.

If you want to show logs and metrics signals correlated in your dashboards as a single pane of glass, ensure the following:

  • job and instance label values must match for the Apache Cassandra integration and logs scrape config in your agent configuration file.
  • job must be set to integrations/apache-cassandra
  • instance label must be set to a value that uniquely identifies your Apache Cassandra node. Replace the default hostname value manually to match your environment. Note that if you use localhost for multiple nodes, the dashboards will not be able to filter correctly by instance.
  • cluster must be the value that identifies the Apache Cassandra cluster this node belongs to.

metrics:
  configs:
    - scrape_configs:
      - job_name: integrations/apache-cassandra
        static_configs:
        - targets:
          - '<node1>:<exporter-port>'
          - '<node2>:<exporter-port>'
          - '<node3>:<exporter-port>'
          labels: 
            cluster: '<your-cluster-name>'
logs:
  configs:
    scrape_configs:
    - job_name: integrations/apache-cassandra
      static_configs:
        - targets: [localhost] 
          labels:
            job: integrations/apache-cassandra
            instance: '<your-instance-name>:<exporter-port>'
            cluster: '<your-cluster-name>'
            __path__: /var/log/cassandra/system.log
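
If an agent runs on each node and scrapes the exporter over localhost, one way to keep instance unique is to set it explicitly as a target label. The sketch below assumes a hypothetical hostname (cassandra-node-1.example.internal), exporter port (9145), and cluster name (prod-cassandra); none of these values come from the integration itself.

```yaml
# Metrics scrape_configs entry for an agent running on the node itself (sketch).
- job_name: integrations/apache-cassandra
  static_configs:
    - targets: ['localhost:9145']   # hypothetical exporter port
      labels:
        # Override the default instance label, which would otherwise be "localhost:9145".
        instance: 'cassandra-node-1.example.internal:9145'
        cluster: 'prod-cassandra'   # hypothetical cluster name
```

Use the same instance and cluster values in the logs scrape config on that node so that logs and metrics stay correlated.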

Dashboards

The Apache Cassandra integration installs the following dashboards in your Grafana Cloud instance to help monitor your system.

  • Apache Cassandra keyspaces
  • Apache Cassandra nodes
  • Apache Cassandra overview

[Screenshot: Apache Cassandra overview 1]

[Screenshot: Apache Cassandra overview 2]

[Screenshot: Apache Cassandra nodes 1]

Alerts

The Apache Cassandra integration includes the following useful alerts:

| Alert | Description |
| --- | --- |
| HighReadLatency | Critical: There is a high level of read latency within the node. |
| HighWriteLatency | Critical: There is a high level of write latency within the node. |
| HighPendingCompactionTasks | Warning: Compaction task queue is filling up. |
| BlockedCompactionTasksFound | Critical: Compaction task queue is full. |
| HintsStoredOnNode | Warning: Hints have been recently written to this node. |
| UnavailableWriteRequestsFound | Critical: Unavailable exceptions have been encountered while performing writes in this cluster. |
| HighCpuUsage | Critical: A node has a CPU usage higher than the configured threshold. |
| HighMemoryUsage | Critical: A node has a higher memory utilization than the configured threshold. |
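
To give a sense of how an alert of this kind is typically expressed, here is an illustrative Prometheus alerting rule built on the cassandra_compaction_pendingtasks metric. It is not the rule shipped with the integration: the group name, threshold (30), and duration (15m) are placeholder assumptions to adjust for your workload.

```yaml
# Illustrative Prometheus alerting rule; not the rule shipped with the integration.
groups:
  - name: apache-cassandra-example   # hypothetical group name
    rules:
      - alert: HighPendingCompactionTasks
        # Threshold and duration below are placeholder assumptions.
        expr: cassandra_compaction_pendingtasks > 30
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: Compaction task queue is filling up.
          description: 'Node {{ $labels.instance }} in cluster {{ $labels.cluster }} has {{ $value }} pending compaction tasks.'
```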

Metrics

The most important metrics provided by the Apache Cassandra integration, which are used on the dashboards and in the Prometheus alerts and rules, are as follows:

| Name | Type | Description |
| --- | --- | --- |
| cassandra_cache_hitrate | gauge | Attribute exposed for management org.apache.cassandra.metrics:name=HitRate,type=Cache,attribute=Value |
| cassandra_cache_size | gauge | Attribute exposed for management org.apache.cassandra.metrics:name=Size,type=Cache,attribute=Value |
| cassandra_clientrequest_latency_seconds_count | untyped | Attribute exposed for management org.apache.cassandra.metrics:name=Latency,type=ClientRequest,attribute=Count |
| cassandra_clientrequest_timeouts_count | untyped | Attribute exposed for management org.apache.cassandra.metrics:name=Timeouts,type=ClientRequest,attribute=Count |
| cassandra_clientrequest_unavailables_count | untyped | Attribute exposed for management org.apache.cassandra.metrics:name=Unavailables,type=ClientRequest,attribute=Count |
| cassandra_compaction_pendingtasks | gauge | Attribute exposed for management org.apache.cassandra.metrics:name=PendingTasks,type=Compaction,attribute=Value |
| cassandra_connection_largemessageactivetasks | | |
| cassandra_connection_largemessagedroppedtasks | gauge | Attribute exposed for management org.apache.cassandra.metrics:name=LargeMessageDroppedTasks,type=Connection,attribute=Value |
| cassandra_connection_largemessagependingtasks | gauge | Attribute exposed for management org.apache.cassandra.metrics:name=LargeMessagePendingTasks,type=Connection,attribute=Value |
| cassandra_connection_smallmessageactivetasks | | |
| cassandra_connection_smallmessagedroppedtasks | gauge | Attribute exposed for management org.apache.cassandra.metrics:name=SmallMessageDroppedTasks,type=Connection,attribute=Value |
| cassandra_connection_smallmessagependingtasks | gauge | Attribute exposed for management org.apache.cassandra.metrics:name=SmallMessagePendingTasks,type=Connection,attribute=Value |
| cassandra_connection_timeouts_count | untyped | Attribute exposed for management org.apache.cassandra.metrics:name=Timeouts,type=Connection,attribute=Count |
| cassandra_down_endpoint_count | gauge | Attribute exposed for management org.apache.cassandra.net:name=null,type=FailureDetector,attribute=DownEndpointCount |
| cassandra_keyspace_caspreparelatency_seconds | gauge | Attribute exposed for management org.apache.cassandra.metrics:name=CasPrepareLatency,type=Keyspace,attribute=50thPercentile |
| cassandra_keyspace_pendingcompactions | gauge | Attribute exposed for management org.apache.cassandra.metrics:name=PendingCompactions,type=Keyspace,attribute=Value |
| cassandra_keyspace_readlatency_seconds | gauge | Attribute exposed for management org.apache.cassandra.metrics:name=ReadLatency,type=Keyspace,attribute=50thPercentile |
| cassandra_keyspace_readlatency_seconds_average | gauge | Attribute exposed for management org.apache.cassandra.metrics:name=ReadLatency,type=Keyspace,attribute=Mean |
| cassandra_keyspace_readlatency_seconds_count | untyped | Attribute exposed for management org.apache.cassandra.metrics:name=ReadLatency,type=Keyspace,attribute=Count |
| cassandra_keyspace_repairjobscompleted_count | | |
| cassandra_keyspace_repairjobsstarted_count | | |
| cassandra_keyspace_totaldiskspaceused | gauge | Attribute exposed for management org.apache.cassandra.metrics:name=TotalDiskSpaceUsed,type=Keyspace,attribute=Value |
| cassandra_keyspace_writelatency_seconds | gauge | Attribute exposed for management org.apache.cassandra.metrics:name=WriteLatency,type=Keyspace,attribute=50thPercentile |
| cassandra_keyspace_writelatency_seconds_average | gauge | Attribute exposed for management org.apache.cassandra.metrics:name=WriteLatency,type=Keyspace,attribute=Mean |
| cassandra_keyspace_writelatency_seconds_count | untyped | Attribute exposed for management org.apache.cassandra.metrics:name=WriteLatency,type=Keyspace,attribute=Count |
| cassandra_keyspace_writelatency_seconds_sum | untyped | Attribute exposed for management org.apache.cassandra.metrics:name=WriteTotalLatency,type=Keyspace,attribute=Count |
| cassandra_messaging_crossnodelatency_seconds | | |
| cassandra_storage_load_count | untyped | Attribute exposed for management org.apache.cassandra.metrics:name=Load,type=Storage,attribute=Count |
| cassandra_storage_totalhints_count | untyped | Attribute exposed for management org.apache.cassandra.metrics:name=TotalHints,type=Storage,attribute=Count |
| cassandra_table_maxpartitionsize | | |
| cassandra_table_readlatency_seconds_count | | |
| cassandra_table_readlatency_seconds_sum | | |
| cassandra_threadpools_currentlyblockedtasks_count | untyped | Attribute exposed for management org.apache.cassandra.metrics:name=CurrentlyBlockedTasks,type=ThreadPools,attribute=Count |
| cassandra_up_endpoint_count | gauge | Attribute exposed for management org.apache.cassandra.net:name=null,type=FailureDetector,attribute=UpEndpointCount |
| jvm_gc_collection_count | gauge | java.lang:name=ConcurrentMarkSweep,type=GarbageCollector,attribute=CollectionCount |
| jvm_gc_duration_seconds | gauge | CompositeType for GC info for ConcurrentMarkSweep java.lang:name=ConcurrentMarkSweep,type=GarbageCollector,attribute=duration |
| jvm_memory_usage_used_bytes | gauge | java.lang.management.MemoryUsage java.lang:name=null,type=Memory,attribute=used |
| jvm_physical_memory_size | gauge | java.lang:name=null,type=OperatingSystem,attribute=TotalPhysicalMemorySize |
| jvm_process_cpu_load | gauge | java.lang:name=null,type=OperatingSystem,attribute=ProcessCpuLoad |
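
As an example of how these metrics can be combined, the following Prometheus recording rule sketches a per-node mean write latency over the last five minutes. It assumes that the _sum and _count series (reported as untyped) behave as monotonically increasing counters; the group and record names are hypothetical and are not part of the integration.

```yaml
# Illustrative recording rule; names are hypothetical and not shipped with the integration.
groups:
  - name: apache-cassandra-example-recordings
    rules:
      - record: cassandra:keyspace_writelatency_seconds:mean5m
        # Total write latency accrued per second divided by writes per second
        # gives the mean latency per write over the 5-minute window.
        expr: |
          sum by (cluster, instance) (rate(cassandra_keyspace_writelatency_seconds_sum[5m]))
          /
          sum by (cluster, instance) (rate(cassandra_keyspace_writelatency_seconds_count[5m]))
```

Aggregating by cluster and instance keeps the result aligned with the labels the dashboards filter on.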

Changelog

# 0.0.1 - March 2023

* Initial release

Cost

By connecting your Apache Cassandra instance to Grafana Cloud you might incur charges. To view information on the number of active series that your Grafana Cloud account uses for metrics included in each Cloud tier, see Active series and dpm usage and Cloud tier pricing.