The CloudWatch metrics integration continuously pulls metrics that have tags applied to them from CloudWatch and pushes them to your Grafana Cloud hosted metrics instance. Then you can drill into your data and identify issues.
With this integration, you can:
- Pull CloudWatch metrics from multiple AWS accounts and regions, without installing the Grafana Agent.
- Create multiple configurations called “scrape jobs” to separate data.
- Ingest the tags from your AWS instance and make them available for querying and alerting.
- Query and alert on metrics data using the Prometheus query language (PromQL).
- Use preconfigured dashboards out of the box.
The CloudWatch metrics integration offers out-of-the-box dashboards for different services, so you don’t need to build them.
Note: If you are using the data source and not the integration, refer to AWS CloudWatch data source.
How it works
The CloudWatch metrics integration uses an open source exporter to continually pull CloudWatch metrics and store them in Prometheus format. You can then query the stored metrics with PromQL at no additional cost, using the same kinds of expressions you already run against other Prometheus data.
You can create any number of scrape jobs, which are sets of configurations that dictate which services, regions, and AWS accounts to collect from. In this way, you can logically split your data into specific jobs and scrape any number of AWS accounts to better organize your data.
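As an illustration of how scrape jobs partition collection, the following Python sketch models a job as a set of accounts, regions, and services and expands it into individual collection targets. The `ScrapeJob` class, its field names, and the example role ARN are hypothetical and are not the integration's actual configuration schema.

```python
from dataclasses import dataclass
from itertools import product
from typing import List, Tuple


@dataclass(frozen=True)
class ScrapeJob:
    """Hypothetical model of a scrape job; not the integration's real schema."""
    name: str
    role_arns: Tuple[str, ...]   # one IAM role per AWS account to collect from
    regions: Tuple[str, ...]
    services: Tuple[str, ...]    # CloudWatch namespaces, for example "AWS/EC2"

    def targets(self) -> List[Tuple[str, str, str]]:
        # Every (account role, region, service) combination covered by this job.
        return list(product(self.role_arns, self.regions, self.services))


# Example job with a placeholder account ID and role name.
prod = ScrapeJob(
    name="prod-compute",
    role_arns=("arn:aws:iam::111111111111:role/example-cloudwatch-role",),
    regions=("us-east-1", "eu-west-1"),
    services=("AWS/EC2", "AWS/RDS"),
)

for role_arn, region, service in prod.targets():
    print(prod.name, role_arn, region, service)
```

Keeping each job small, for example one per environment or team, is one way to make it clear which job any given metric came from.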
As part of creating a job, Grafana needs access to the CloudWatch data available in your account. To grant access, we use AWS account delegation. Grafana can then assume a role that has access only to your CloudWatch data, with no need to share access and secret keys.
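To make the delegation model concrete, here is a minimal Python sketch that builds the two IAM policy documents such a role typically needs: a trust policy that lets another AWS account assume the role, and a permissions policy limited to reading CloudWatch metrics. The account ID and external ID are placeholders, the use of an external ID condition is an assumption based on the common AWS pattern for third-party access, and the two listed actions are illustrative rather than the full set the integration requires.

```python
import json

# Placeholder values (assumptions): use the values shown in your own setup.
TRUSTED_ACCOUNT_ID = "123456789012"
EXTERNAL_ID = "example-external-id"


def build_trust_policy(trusted_account_id: str, external_id: str) -> dict:
    """Trust policy: who may assume the role, and under what condition."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{trusted_account_id}:root"},
                "Action": "sts:AssumeRole",
                # Assumed safeguard: require a shared external ID.
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


def build_read_only_policy() -> dict:
    """Permissions policy: illustrative read-only CloudWatch actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["cloudwatch:ListMetrics", "cloudwatch:GetMetricData"],
                "Resource": "*",
            }
        ],
    }


print(json.dumps(build_trust_policy(TRUSTED_ACCOUNT_ID, EXTERNAL_ID), indent=2))
print(json.dumps(build_read_only_policy(), indent=2))
```

Because the role is assumed through `sts:AssumeRole`, no long-lived access key or secret key leaves your account.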
You can use Grafana Cloud to connect to more than 60 of the most popular AWS services, including EC2, Lambda, EBS, RDS, S3, ECS, ELB, and Billing. To see a complete list of services and what is gathered for each one, refer to Services.
Timestamps in Grafana and CloudWatch metrics
The timestamp of a metric pulled by the CloudWatch metrics integration is set to the time the metric is pulled, not the time CloudWatch assigned to it. This might seem counterintuitive, but the intent is to simplify the writing of alert queries. As a result, metrics from the integration always appear more delayed than they actually are.
Assume you are looking at a single metric, CPU Maximum, pulled every five minutes. The integration therefore requests data from CloudWatch with a period of five minutes.
- CloudWatch timestamps mark the beginning of a period, not the end.
- CloudWatch samples become visible at the beginning of a period and continue to be aggregated until the period ends.
- The CloudWatch metrics integration pulls on a consistent interval, and only requests data that has been fully aggregated.
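In this example, the period that CloudWatch stamps 0:00 ends at 0:05, and the integration pulls it at 0:08, once it is fully aggregated. This results in a Grafana Cloud timestamp of 0:08 for a metric CloudWatch stamped at 0:00.
If the CloudWatch timestamp were used instead: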
- Metrics would appear to be eight minutes old when ingested.
- Any alert queries written would need to consider this extra variable delay.
The pull timestamp gives the appearance of an eight-minute delay relative to the CloudWatch timestamp, but only three minutes have actually passed since the value stopped being updated. Because each sample is stamped with the time it was ingested, your alert queries can remain simple.
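The arithmetic in this example can be checked with a short Python sketch. The five-minute period comes from the example above. The three-minute wait is not a documented setting: it is inferred from the 0:08 pull time. The function name and the calendar date are arbitrary.

```python
from datetime import datetime, timedelta, timezone

PERIOD = timedelta(minutes=5)            # CloudWatch period from the example
AGGREGATION_WAIT = timedelta(minutes=3)  # inferred: pull at 0:08, period ends at 0:05


def describe_sample(period_start: datetime) -> dict:
    """Relate the CloudWatch timestamp to the Grafana Cloud pull timestamp."""
    period_end = period_start + PERIOD            # value stops updating here
    pulled_at = period_end + AGGREGATION_WAIT     # integration pulls the sample
    return {
        "cloudwatch_timestamp": period_start,     # marks the start of the period
        "grafana_timestamp": pulled_at,           # set to the pull time
        "apparent_delay": pulled_at - period_start,
        "actual_staleness": pulled_at - period_end,
    }


# Arbitrary date; only the time of day matters for the example.
sample = describe_sample(datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc))
print(sample["grafana_timestamp"].strftime("%H:%M"))  # 00:08
print(sample["apparent_delay"])                       # 0:08:00
print(sample["actual_staleness"])                     # 0:03:00
```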