Consul Server Monitoring Dashboard

Maintained by the Consul team at HashiCorp. Displays critical health metrics for Consul servers, which are key to understanding the servers' behavior and stability in production. Also offers pre-built sections and panels for understanding Consul usage by feature, such as KV, DNS, the Catalog, and ACLs.
Last updated: 4 months ago


Critical metrics are based on the "key metrics" section of Consul's telemetry docs (https://www.consul.io/docs/agent/telemetry.html); see those docs for more information on individual stats. If you have any questions, please reach out on our community discussion board: https://discuss.hashicorp.com/c/consul/29

Due to Consul's architecture, some metrics are emitted by both server and client agents. Typical deployments run many more clients than servers, which can add noise when monitoring Consul server health. To filter this out, we recommend adding a label in Prometheus's scrape_config based on the role of the Consul agent on each host, e.g. role="server" for Consul servers and role="client" for Consul client agents. You can then adapt the panel queries to filter on role="server", so that they show only the time series emitted by servers. See the scrape_config documentation: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config
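
As a rough illustration of the labeling described above, a scrape configuration might look something like the sketch below. The host names and the job name are hypothetical placeholders. The sketch also assumes that each agent exposes Prometheus-format metrics on its HTTP API (default port 8500) at /v1/agent/metrics, which requires prometheus_retention_time to be set in the agent's telemetry configuration. Adjust all of these to match your environment.

```yaml
# Sketch only: host names and job name are hypothetical placeholders.
scrape_configs:
  - job_name: consul
    # Consul's agent metrics endpoint, requested in Prometheus format.
    metrics_path: /v1/agent/metrics
    params:
      format: ["prometheus"]
    static_configs:
      # Consul servers: every series scraped from these targets gets role="server".
      - targets:
          - "consul-server-1.example.internal:8500"
          - "consul-server-2.example.internal:8500"
          - "consul-server-3.example.internal:8500"
        labels:
          role: server
      # Consul client agents: every series scraped from these targets gets role="client".
      - targets:
          - "consul-client-1.example.internal:8500"
        labels:
          role: client
```

With labels like these in place, a panel query can be narrowed to servers by adding the matcher to its selector, for example consul_raft_leader_lastContact{role="server"}. The exact metric names available depend on your Consul version and telemetry settings.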