Apache Spark integration for Grafana Cloud
Apache Spark is a unified analytics engine for large-scale data processing.
Before you begin, install the Apache Spark integration in Grafana Cloud.
This integration monitors an Apache Spark cluster using Spark's built-in Prometheus plugin, which is available in Spark 3.0 and later. To enable the plugin, follow the official Spark documentation. This tutorial by dzlab might be helpful as well.
After enabling the plugin, configure Grafana Agent to scrape your Spark nodes.
Add the following labels to each scrape target: instance_type and spark_cluster.
The value of instance_type must be one of master, worker, application, or driver, so that the integration can identify which type of instance the node is.
The value of spark_cluster must identify the Spark cluster. If you monitor several clusters, give each one a unique name, and use the same value for every instance that belongs to the same cluster.
metrics:
  wal_directory: /tmp/wal
  configs:
    # Grafana Agent groups scrape jobs under a named metrics instance.
    # The instance name 'integrations' is an example; choose your own.
    - name: integrations
      scrape_configs:
        # Spark master
        - job_name: 'integrations/spark-master'
          scrape_interval: 10s
          metrics_path: '/metrics/master/prometheus'
          static_configs:
            - targets: ['spark-master:8080']
              labels:
                instance_type: 'master'
                spark_cluster: 'my-cluster'
        # Spark worker
        - job_name: 'integrations/spark-worker'
          scrape_interval: 10s
          metrics_path: '/metrics/prometheus'
          static_configs:
            - targets: ['spark-worker:8081']
              labels:
                instance_type: 'worker'
                spark_cluster: 'my-cluster'
        # Spark driver
        - job_name: 'integrations/spark-driver'
          scrape_interval: 10s
          metrics_path: '/metrics/prometheus/'
          static_configs:
            - targets: ['spark-driver:4040']
              labels:
                instance_type: 'driver'
                spark_cluster: 'my-cluster'
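The configuration above does not show the application instance type or a second cluster. As a rough sketch, additional scrape jobs for those cases might look like the following. The job names, the host spark-master-2, and the cluster name my-other-cluster are hypothetical. The applications path is the default from Spark's metrics.properties template and is assumed to be unchanged in your setup, served on the master's web UI port.

```yaml
# Sketch only: extra jobs to place alongside the master, worker, and driver jobs.
# Job names, hostnames, and cluster names here are illustrative assumptions.
- job_name: 'integrations/spark-application'
  scrape_interval: 10s
  # Assumed default applications path from Spark's metrics.properties template.
  metrics_path: '/metrics/applications/prometheus'
  static_configs:
    - targets: ['spark-master:8080']
      labels:
        instance_type: 'application'
        spark_cluster: 'my-cluster'
# Master of a second cluster: same instance_type, different spark_cluster value.
- job_name: 'integrations/spark-master-other'
  scrape_interval: 10s
  metrics_path: '/metrics/master/prometheus'
  static_configs:
    - targets: ['spark-master-2:8080']
      labels:
        instance_type: 'master'
        spark_cluster: 'my-other-cluster'
```

Keeping spark_cluster identical for all instances of one cluster, and distinct between clusters, is what lets the integration group nodes correctly.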
For the full list of options, see the Grafana Agent configuration reference.