Services
CloudWatch metrics supports the following services and lets you choose from a wide range of metrics and statistics. Metrics in bold text are included in the default configuration. The following statistics are available for all metrics: Average, Maximum, Minimum, Sum, SampleCount, p50, p75, p90, p95, and p99.
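
To illustrate how the statistics listed above relate to the CloudWatch API, the following sketch builds one `GetMetricData` query per statistic for a single metric. The 300-second period mirrors the 5-minute scrape interval listed for each service. The helper name `build_queries` and the `example-api` dimension value are hypothetical, and this is not necessarily how the integration itself queries CloudWatch.

```python
from typing import Dict, List

# Statistics listed in this section.
STATISTICS = ["Average", "Maximum", "Minimum", "Sum", "SampleCount",
              "p50", "p75", "p90", "p95", "p99"]

# Assumption: align the query period with the 5-minute scrape interval.
PERIOD_SECONDS = 300


def build_queries(namespace: str, metric_name: str,
                  dimensions: Dict[str, str]) -> List[dict]:
    """Build one GetMetricData query per statistic for a single metric."""
    metric = {
        "Namespace": namespace,
        "MetricName": metric_name,
        "Dimensions": [{"Name": k, "Value": v}
                       for k, v in sorted(dimensions.items())],
    }
    return [
        {
            # Query IDs must start with a lowercase letter.
            "Id": stat.lower(),
            "MetricStat": {"Metric": metric,
                           "Period": PERIOD_SECONDS,
                           "Stat": stat},
            "ReturnData": True,
        }
        for stat in STATISTICS
    ]


# Hypothetical example: all ten statistics for API Gateway latency.
queries = build_queries("AWS/ApiGateway", "Latency", {"ApiName": "example-api"})
# With boto3 installed and credentials configured, these could be passed as:
#   boto3.client("cloudwatch").get_metric_data(
#       MetricDataQueries=queries, StartTime=start, EndTime=end)
```

Building the queries separately from the API call keeps the example runnable without AWS credentials.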
AWS/ACMPrivateCA
Function: Provides a private certificate authority for managing SSL/TLS certificates
Scrape interval: 5 minutes
Metric | CloudWatch metric | Purpose |
---|---|---|
aws_acmprivateca_info | ||
aws_acmprivateca_crlgenerated | CRLGenerated | Monitors the number of Certificate Revocation Lists (CRLs) generated. Used to ensure the regular creation of revocation lists for certificate management. |
aws_acmprivateca_failure | Failure | Tracks the number of failures in Private CA operations. Useful for identifying issues in certificate issuance or other operations. |
aws_acmprivateca_misconfigured_crlbucket | MisconfiguredCRLBucket | Monitors the number of instances where the CRL bucket is misconfigured. Useful for ensuring proper configuration and access to the CRL storage bucket. |
aws_acmprivateca_success | Success | Tracks the number of successful operations within the ACM Private CA. Useful for monitoring operational efficiency and successful certificate issuances. |
aws_acmprivateca_time | Time | Measures the time taken for various operations in ACM Private CA, helping to monitor performance and identify any slowdowns in certificate processing. |
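
The names in the Metric column appear to follow mechanically from the namespace and the CloudWatch metric name. The sketch below is a rule inferred from the table rows, not a documented algorithm: the namespace is lowercased with `/` replaced by `_`, and the metric name gets an underscore wherever a lowercase letter or digit is followed by an uppercase letter. The function name `exported_name` is hypothetical.

```python
import re

# An underscore is inserted where a lowercase letter or digit meets an uppercase letter.
_BOUNDARY = re.compile(r"([a-z0-9])([A-Z])")
# Any character that is not a letter or digit becomes an underscore.
_NON_ALNUM = re.compile(r"[^a-zA-Z0-9]")


def exported_name(namespace: str, cloudwatch_metric: str) -> str:
    """Derive the Metric column value from a namespace and CloudWatch metric name.

    Inferred from the table rows; treat it as an approximation.
    """
    # The namespace is only sanitized and lowercased, not split on case changes.
    prefix = _NON_ALNUM.sub("_", namespace).lower()
    split = _BOUNDARY.sub(r"\1_\2", cloudwatch_metric)
    suffix = _NON_ALNUM.sub("_", split).lower()
    return f"{prefix}_{suffix}"


print(exported_name("AWS/ACMPrivateCA", "MisconfiguredCRLBucket"))
```

Runs of capitals stay together under this rule, which is why `CRLGenerated` maps to `crlgenerated` rather than `crl_generated`.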
AWS/AmazonMQ
Function: Managed message broker service for Apache ActiveMQ and RabbitMQ
Scrape interval: 5 minutes
Metric | CloudWatch metric | Purpose |
---|---|---|
aws_amazonmq_info | ||
aws_amazonmq_ack_rate | AckRate | Monitors the acknowledgment rate of messages, ensuring efficient message processing and acknowledgment. |
aws_amazonmq_burst_balance | BurstBalance | Tracks the balance of burst credits, monitoring if the broker can handle sudden spikes in traffic. |
aws_amazonmq_channel_count | ChannelCount | Monitors the number of active channels, indicating resource usage and load on the broker. |
aws_amazonmq_confirm_rate | ConfirmRate | Measures the rate at which messages are confirmed, ensuring message delivery guarantees. |
aws_amazonmq_connection_count | ConnectionCount | Tracks the number of active connections, helping monitor broker usage and possible overloading. |
aws_amazonmq_consumer_count | ConsumerCount | Monitors the number of consumers connected, useful for understanding broker demand and throughput. |
aws_amazonmq_cpu_credit_balance | CpuCreditBalance | Tracks the remaining CPU credits, important for ensuring the broker has enough processing power to handle workload. |
aws_amazonmq_cpu_utilization | CpuUtilization | Measures the percentage of CPU usage, helping identify potential performance bottlenecks. |
aws_amazonmq_current_connections_count | CurrentConnectionsCount | Shows the number of currently connected clients, useful for tracking session loads. |
aws_amazonmq_dequeue_count | DequeueCount | Monitors the number of messages dequeued, which helps gauge message consumption activity. |
aws_amazonmq_dispatch_count | DispatchCount | Measures the number of messages dispatched to consumers, helping monitor message flow. |
aws_amazonmq_enqueue_count | EnqueueCount | Tracks the number of messages enqueued, giving insights into the volume of messages entering the system. |
aws_amazonmq_enqueue_time | EnqueueTime | Measures the time taken to enqueue messages, used to monitor latency and performance. |
aws_amazonmq_established_connections_count | EstablishedConnectionsCount | Tracks the number of successfully established connections, used to monitor system stability. |
aws_amazonmq_exchange_count | ExchangeCount | Monitors the number of exchanges, useful for analyzing message routing activity. |
aws_amazonmq_expired_count | ExpiredCount | Tracks the number of messages that have expired without being consumed, useful for monitoring failed message deliveries. |
aws_amazonmq_heap_usage | HeapUsage | Measures the heap memory usage of the broker, useful for detecting memory-related performance issues. |
aws_amazonmq_in_flight_count | InFlightCount | Monitors the number of messages currently in transit, helping to ensure the broker isn’t overwhelmed by unacknowledged messages. |
aws_amazonmq_inactive_durable_topic_subscribers_count | InactiveDurableTopicSubscribersCount | Monitors inactive durable subscribers, useful for tracking unused resources or inefficient topic subscriptions. |
aws_amazonmq_job_scheduler_store_percent_usage | JobSchedulerStorePercentUsage | Measures the percentage of the job scheduler store usage, important for capacity planning and performance. |
aws_amazonmq_journal_files_for_fast_recovery | JournalFilesForFastRecovery | Monitors the number of journal files available for fast recovery, ensuring quick system recovery. |
aws_amazonmq_journal_files_for_full_recovery | JournalFilesForFullRecovery | Tracks journal files required for full recovery, ensuring data durability and integrity during failures. |
aws_amazonmq_memory_usage | MemoryUsage | Measures the memory usage of the broker, ensuring the broker has adequate memory for message processing. |
aws_amazonmq_message_count | MessageCount | Tracks the total number of messages in the broker, providing insights into message load and storage. |
aws_amazonmq_message_ready_count | MessageReadyCount | Monitors the number of messages ready for delivery, helping gauge the efficiency of message consumption. |
aws_amazonmq_message_unacknowledged_count | MessageUnacknowledgedCount | Tracks unacknowledged messages, useful for detecting potential message delivery problems. |
aws_amazonmq_network_in | NetworkIn | Measures the incoming network traffic, useful for tracking data ingestion and throughput. |
aws_amazonmq_network_out | NetworkOut | Measures the outgoing network traffic, helping monitor data egress and bandwidth usage. |
aws_amazonmq_open_transaction_count | OpenTransactionCount | Tracks the number of open transactions, useful for identifying resource contention or potential system stalls. |
aws_amazonmq_producer_count | ProducerCount | Monitors the number of producers, useful for understanding message production activity in the system. |
aws_amazonmq_publish_rate | PublishRate | Measures the rate at which messages are being published, providing insights into message inflow. |
aws_amazonmq_queue_count | QueueCount | Tracks the number of active queues, useful for analyzing message distribution across queues. |
aws_amazonmq_queue_size | QueueSize | Monitors the size of the message queues, helping gauge message backlog and system load. |
aws_amazonmq_rabbit_mqdisk_free | RabbitMQDiskFree | Tracks the available disk space for RabbitMQ, ensuring that there’s enough storage for message persistence. |
aws_amazonmq_rabbit_mqdisk_free_limit | RabbitMQDiskFreeLimit | Monitors the disk free space threshold, alerting when approaching critical limits to avoid disruptions. |
aws_amazonmq_rabbit_mqfd_used | RabbitMQFdUsed | Tracks the number of file descriptors used by RabbitMQ, ensuring system resources are not exhausted. |
aws_amazonmq_rabbit_mqmem_limit | RabbitMQMemLimit | Monitors the memory usage limit for RabbitMQ, ensuring the broker doesn’t run out of memory. |
aws_amazonmq_rabbit_mqmem_used | RabbitMQMemUsed | Measures the memory currently in use by RabbitMQ, useful for monitoring resource efficiency. |
aws_amazonmq_receive_count | ReceiveCount | Tracks the number of received messages, helping monitor message inflow and processing rates. |
aws_amazonmq_store_percent_usage | StorePercentUsage | Monitors the percentage of the store usage, ensuring sufficient capacity for message persistence. |
aws_amazonmq_system_cpu_utilization | SystemCpuUtilization | Measures the CPU usage of the underlying system, helping to detect potential CPU bottlenecks. |
aws_amazonmq_temp_percent_usage | TempPercentUsage | Monitors the percentage usage of temporary storage, useful for avoiding storage exhaustion during peak loads. |
aws_amazonmq_total_consumer_count | TotalConsumerCount | Tracks the total number of consumers, helping assess the overall load and activity on the broker. |
aws_amazonmq_total_dequeue_count | TotalDequeueCount | Monitors the total number of dequeued messages, useful for analyzing message consumption rates. |
aws_amazonmq_total_enqueue_count | TotalEnqueueCount | Tracks the total number of enqueued messages, providing insights into message production volumes. |
aws_amazonmq_total_message_count | TotalMessageCount | Monitors the total count of messages in the system, giving an overview of the message load. |
aws_amazonmq_total_producer_count | TotalProducerCount | Tracks the total number of producers, useful for understanding message inflow activity. |
aws_amazonmq_volume_read_ops | VolumeReadOps | Measures the number of read operations on the broker’s volume, helping monitor disk I/O performance. |
aws_amazonmq_volume_write_ops | VolumeWriteOps | Measures the number of write operations on the broker’s volume, useful for detecting disk I/O bottlenecks. |
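
The RabbitMQ disk and memory metrics above are most useful in pairs: free disk against its limit, and used memory against its limit. The following sketch shows one way to combine them. It assumes both values in each pair are reported in the same unit, and the `RabbitSample` type and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RabbitSample:
    """One reading of the paired RabbitMQ metrics (same unit within each pair)."""
    disk_free: float        # RabbitMQDiskFree
    disk_free_limit: float  # RabbitMQDiskFreeLimit
    mem_used: float         # RabbitMQMemUsed
    mem_limit: float        # RabbitMQMemLimit


def disk_headroom(sample: RabbitSample) -> float:
    """Free disk space remaining above the limit; negative means below the limit."""
    return sample.disk_free - sample.disk_free_limit


def memory_used_ratio(sample: RabbitSample) -> float:
    """Fraction of the memory limit currently in use (0.0 if no limit is reported)."""
    if sample.mem_limit <= 0:
        return 0.0
    return sample.mem_used / sample.mem_limit


# Hypothetical reading: 50 GB free with a 20 GB limit, 1.5 GB used of a 3 GB limit.
sample = RabbitSample(disk_free=50e9, disk_free_limit=20e9,
                      mem_used=1.5e9, mem_limit=3e9)
print(disk_headroom(sample), memory_used_ratio(sample))
```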
AWS/ApiGateway
Function: Enables developers to create and manage APIs for accessing data and services
Scrape interval: 5 minutes
Includes: Out-of-the-box dashboard
Metric | CloudWatch metric | Purpose |
---|---|---|
aws_apigateway_info | ||
aws_apigateway_4xx | 4xx | Monitors the number of 4xx client errors, used to track issues related to invalid requests from clients. |
aws_apigateway_5xx | 5xx | Tracks the number of 5xx server errors, used to monitor API Gateway or backend server issues. |
aws_apigateway_count | Count | Measures the total number of API requests, providing insights into traffic volume. |
aws_apigateway_integration_latency | IntegrationLatency | Monitors the latency between API Gateway and the backend integration, useful for diagnosing performance issues in backend services. |
aws_apigateway_latency | Latency | Tracks overall API latency, including both API Gateway processing and backend integration latency, helping to monitor user experience. |
aws_apigateway_4_xxerror | 4XXError | Measures the occurrence of 4xx errors (client errors), useful for understanding the rate of client-related issues. |
aws_apigateway_5_xxerror | 5XXError | Monitors 5xx errors (server errors), used to detect server-side failures in the API Gateway or its backend. |
aws_apigateway_cache_hit_count | CacheHitCount | Tracks the number of times API requests were served from the cache, helping to monitor the efficiency of cache usage. |
aws_apigateway_cache_miss_count | CacheMissCount | Monitors the number of cache misses, useful for optimizing cache configuration and reducing backend load. |
aws_apigateway_client_error | ClientError | Measures errors originating from the client (4xx), used to monitor the rate of invalid requests sent by clients. |
aws_apigateway_connect_count | ConnectCount | Tracks the number of successful WebSocket connection requests, providing insights into the usage of WebSocket APIs. |
aws_apigateway_data_processed | DataProcessed | Monitors the amount of data processed by the API Gateway, useful for analyzing API data transfer and throughput. |
aws_apigateway_execution_error | ExecutionError | Tracks execution errors during the API request process, useful for identifying failures in API execution logic. |
aws_apigateway_integration_error | IntegrationError | Monitors errors that occur during integration with backend services, useful for detecting issues in backend communication. |
aws_apigateway_message_count | MessageCount | Tracks the number of messages sent and received in WebSocket APIs, useful for monitoring message flow in real-time communication APIs. |
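
Several of the API Gateway metrics above become more informative when combined, for example errors relative to total requests, or cache hits relative to all cache lookups. The sketch below assumes each input is the Sum statistic of the named metric over the same time window; the function names are hypothetical.

```python
from typing import Dict


def _ratio(numerator: float, denominator: float) -> float:
    """Safe division that returns 0.0 when there is no traffic."""
    return numerator / denominator if denominator else 0.0


def error_rates(count: float, client_errors: float,
                server_errors: float) -> Dict[str, float]:
    """Share of requests that ended in 4xx or 5xx errors.

    count         -> Sum of Count
    client_errors -> Sum of 4XXError
    server_errors -> Sum of 5XXError
    """
    return {
        "4xx": _ratio(client_errors, count),
        "5xx": _ratio(server_errors, count),
    }


def cache_hit_ratio(hits: float, misses: float) -> float:
    """Fraction of cache lookups served from the cache (CacheHitCount vs CacheMissCount)."""
    return _ratio(hits, hits + misses)


# Hypothetical window: 1000 requests, 25 client errors, 5 server errors.
rates = error_rates(count=1000, client_errors=25, server_errors=5)
hit_ratio = cache_hit_ratio(hits=300, misses=100)
```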
AWS/AppStream
Function: Delivers cloud-based desktops and applications to end-users on any device
Scrape interval: 5 minutes
Metric | CloudWatch metric | Purpose |
---|---|---|
aws_appstream_info | ||
aws_appstream_actual_capacity | ActualCapacity | Monitors the actual number of available instances for streaming, used to ensure enough resources are deployed. |
aws_appstream_available_capacity | AvailableCapacity | Tracks the number of instances available for use but not currently in use, helping to gauge spare capacity for handling future demand. |
aws_appstream_capacity_utilization | CapacityUtilization | Measures the percentage of capacity utilization, useful for optimizing resource allocation and ensuring cost-effective usage. |
aws_appstream_desired_capacity | DesiredCapacity | Represents the desired number of instances based on scaling policies, helping to monitor scaling efficiency and capacity planning. |
aws_appstream_in_use_capacity | InUseCapacity | Tracks the number of instances currently in use, helping to monitor active workload and resource consumption. |
aws_appstream_insufficient_capacity_error | InsufficientCapacityError | Measures the number of times a capacity request failed due to insufficient resources, indicating capacity shortages or bottlenecks. |
aws_appstream_pending_capacity | PendingCapacity | Monitors instances that are in the process of being provisioned, helping to track the status of scaling events. |
aws_appstream_running_capacity | RunningCapacity | Tracks the total number of running instances, providing insights into the active resources currently being used to support users. |
AWS/AppSync
Function: Managed service for building GraphQL APIs that connect to data sources like DynamoDB
Scrape interval: 5 minutes
Metric | CloudWatch metric | Purpose |
---|---|---|
aws_appsync_info | ||
aws_appsync_4_xxerror | 4XXError | Monitors client-side (4xx) errors in requests, useful for tracking invalid requests made by clients. |
aws_appsync_5_xxerror | 5XXError | Tracks server-side (5xx) errors, helping to detect issues in the API or the server infrastructure. |
aws_appsync_active_connections | ActiveConnections | Measures the number of active WebSocket connections, useful for understanding the real-time activity on the AppSync API. |
aws_appsync_active_subscriptions | ActiveSubscriptions | Tracks the number of active subscriptions, helping to monitor usage and engagement with subscription-based real-time data services. |
aws_appsync_connect_client_error | ConnectClientError | Monitors errors encountered by clients while trying to establish connections, indicating issues in the client-side configuration or request. |
aws_appsync_connect_server_error | ConnectServerError | Tracks server-side errors during the connection process, helping to identify server-side failures or misconfigurations during connection attempts. |
aws_appsync_connect_success | ConnectSuccess | Measures the successful WebSocket connection attempts, useful for monitoring overall connection success rates. |
aws_appsync_connection_duration | ConnectionDuration | Monitors the duration of WebSocket connections, helping to gauge session longevity and user engagement. |
aws_appsync_disconnect_client_error | DisconnectClientError | Tracks errors that occur when clients try to disconnect, useful for monitoring client-side disconnection issues. |
aws_appsync_disconnect_server_error | DisconnectServerError | Monitors server-side errors during disconnection, helping to detect issues in properly closing WebSocket connections. |
aws_appsync_disconnect_success | DisconnectSuccess | Measures successful disconnections from WebSocket connections, useful for ensuring smooth session terminations. |
aws_appsync_latency | Latency | Tracks the time taken to process requests, useful for monitoring API performance and identifying latency issues. |
aws_appsync_publish_data_message_client_error | PublishDataMessageClientError | Monitors client-side errors during data message publishing, used to detect issues with client-side data transmission. |
aws_appsync_publish_data_message_server_error | PublishDataMessageServerError | Tracks server-side errors during data message publishing, helping to identify issues in server-side message handling or transmission. |
aws_appsync_publish_data_message_size | PublishDataMessageSize | Measures the size of data messages being published, useful for tracking payload sizes and ensuring efficient message transmission. |
aws_appsync_publish_data_message_success | PublishDataMessageSuccess | Tracks successful data message publications, helping to monitor overall message delivery success. |
aws_appsync_requests | Requests | Measures the total number of requests processed by AppSync, providing insights into traffic and API usage. |
aws_appsync_subscribe_client_error | SubscribeClientError | Monitors client-side errors during subscription attempts, useful for tracking issues in subscribing to real-time data feeds. |
aws_appsync_subscribe_server_error | SubscribeServerError | Tracks server-side errors during subscription attempts, helping to identify server failures when clients try to subscribe. |
aws_appsync_subscribe_success | SubscribeSuccess | Measures successful subscription attempts, useful for monitoring subscription adoption and engagement rates. |
aws_appsync_tokens_consumed | TokensConsumed | Tracks the number of tokens consumed by requests, useful for managing API rate limits and monitoring user activity. |
aws_appsync_unsubscribe_client_error | UnsubscribeClientError | Monitors client-side errors during unsubscription attempts, used to detect issues when clients try to unsubscribe from data feeds. |
aws_appsync_unsubscribe_server_error | UnsubscribeServerError | Tracks server-side errors during unsubscription attempts, useful for identifying server-side issues when clients try to unsubscribe. |
aws_appsync_unsubscribe_success | UnsubscribeSuccess | Measures successful unsubscription attempts, ensuring smooth termination of real-time data subscriptions. |
AWS/ApplicationELB
Function: Distributes incoming traffic to targets like EC2 instances, containers, and IP addresses
Scrape interval: 5 minutes
Includes: Out-of-the-box dashboard
Metric | CloudWatch metric | Purpose |
---|---|---|
aws_applicationelb_info | ||
aws_applicationelb_active_connection_count | ActiveConnectionCount | Monitors the number of active connections, useful for understanding current load on the load balancer. |
aws_applicationelb_client_tlsnegotiation_error_count | ClientTLSNegotiationErrorCount | Tracks the number of failed TLS negotiations between clients and the load balancer, used to detect TLS handshake issues. |
aws_applicationelb_consumed_lcus | ConsumedLCUs | Measures the number of Load Balancer Capacity Units (LCUs) used, helping to track resource consumption and cost. |
aws_applicationelb_elbauth_error | ELBAuthError | Tracks errors during authentication processes, useful for monitoring failures in authentication workflows. |
aws_applicationelb_elbauth_failure | ELBAuthFailure | Monitors failed authentication attempts, helping detect potential security issues or configuration problems. |
aws_applicationelb_elbauth_latency | ELBAuthLatency | Measures the latency of authentication requests, useful for identifying delays in authentication workflows. |
aws_applicationelb_elbauth_refresh_token_success | ELBAuthRefreshTokenSuccess | Tracks successful refresh token requests, useful for monitoring token refresh operations. |
aws_applicationelb_elbauth_success | ELBAuthSuccess | Measures successful authentication requests, useful for monitoring authentication performance. |
aws_applicationelb_elbauth_user_claims_size_exceeded | ELBAuthUserClaimsSizeExceeded | Monitors instances where user claims exceed the allowed size, which can help in tuning authentication configurations. |
aws_applicationelb_httpcode_elb_3_xx_count | HTTPCode_ELB_3XX_Count | Tracks the number of 3xx redirection responses that originate from the load balancer, useful for monitoring redirects on the load balancer. |
aws_applicationelb_httpcode_elb_4_xx_count | HTTPCode_ELB_4XX_Count | Monitors the number of 4xx client error responses that originate from the load balancer, useful for detecting malformed or invalid client requests. |
aws_applicationelb_httpcode_elb_5_xx_count | HTTPCode_ELB_5XX_Count | Tracks the number of 5xx server error responses that originate from the load balancer rather than from targets, helping identify load balancer or target connectivity issues. |
aws_applicationelb_httpcode_target_2_xx_count | HTTPCode_Target_2XX_Count | Measures the number of successful 2xx responses from targets, useful for tracking successful request handling. |
aws_applicationelb_httpcode_target_3_xx_count | HTTPCode_Target_3XX_Count | Monitors the number of 3xx redirects from target servers, useful for understanding traffic redirection by targets. |
aws_applicationelb_httpcode_target_4_xx_count | HTTPCode_Target_4XX_Count | Tracks 4xx client errors returned by target servers, helping identify configuration or client-side issues. |
aws_applicationelb_httpcode_target_5_xx_count | HTTPCode_Target_5XX_Count | Monitors the number of 5xx errors returned by target servers, useful for identifying server-side issues. |
aws_applicationelb_ipv6_processed_bytes | IPv6ProcessedBytes | Measures the number of bytes processed over IPv6, useful for tracking IPv6 traffic volume. |
aws_applicationelb_ipv6_request_count | IPv6RequestCount | Tracks the number of IPv6 requests, providing insights into IPv6 usage and adoption. |
aws_applicationelb_new_connection_count | NewConnectionCount | Monitors the number of new connections established, helping understand connection initiation patterns. |
aws_applicationelb_processed_bytes | ProcessedBytes | Measures the total amount of data processed by the load balancer, useful for tracking overall throughput. |
aws_applicationelb_rejected_connection_count | RejectedConnectionCount | Tracks the number of connections rejected by the load balancer, useful for identifying capacity or configuration issues. |
aws_applicationelb_request_count | RequestCount | Measures the total number of requests handled by the load balancer, useful for monitoring traffic volume. |
aws_applicationelb_rule_evaluations | RuleEvaluations | Tracks the number of rule evaluations on the load balancer, helping to monitor rule complexity and processing time. |
aws_applicationelb_target_connection_error_count | TargetConnectionErrorCount | Monitors the number of connection errors to target servers, useful for identifying connectivity issues between the load balancer and targets. |
aws_applicationelb_target_response_time | TargetResponseTime | Measures the response time of target servers, helping to track backend performance and latency. |
aws_applicationelb_target_tlsnegotiation_error_count | TargetTLSNegotiationErrorCount | Tracks failed TLS negotiations between the load balancer and target servers, useful for detecting SSL/TLS issues with backend services. |
aws_applicationelb_anomalous_host_count | AnomalousHostCount | Monitors the number of hosts showing anomalous behavior, helping detect potential security issues or performance outliers. |
aws_applicationelb_desync_mitigation_mode_non_compliant_request_count | DesyncMitigationMode_NonCompliant_Request_Count | Tracks non-compliant requests under desync mitigation mode, useful for monitoring and securing application traffic. |
aws_applicationelb_dropped_invalid_header_request_count | DroppedInvalidHeaderRequestCount | Monitors requests dropped due to invalid headers, helping identify and fix misconfigurations or potential security risks. |
aws_applicationelb_forwarded_invalid_header_request_count | ForwardedInvalidHeaderRequestCount | Tracks invalid header requests that were forwarded, helping detect improper traffic that bypassed filtering. |
aws_applicationelb_grpc_request_count | GrpcRequestCount | Measures the number of gRPC requests handled, useful for tracking gRPC-based API traffic. |
aws_applicationelb_httpcode_elb_500_count | HTTPCode_ELB_500_Count | Tracks the number of 500 Internal Server Errors from the load balancer, useful for detecting backend or load balancer failures. |
aws_applicationelb_httpcode_elb_502_count | HTTPCode_ELB_502_Count | Monitors the number of 502 Bad Gateway errors, indicating backend communication failures. |
aws_applicationelb_httpcode_elb_503_count | HTTPCode_ELB_503_Count | Tracks the number of 503 Service Unavailable errors, helping detect capacity or service availability issues. |
aws_applicationelb_httpcode_elb_504_count | HTTPCode_ELB_504_Count | Measures the number of 504 Gateway Timeout errors, indicating backend timeouts. |
aws_applicationelb_http_fixed_response_count | HTTP_Fixed_Response_Count | Tracks the number of fixed responses sent by the load balancer, useful for monitoring traffic directed to predefined responses. |
aws_applicationelb_http_redirect_count | HTTP_Redirect_Count | Monitors the number of HTTP redirects sent by the load balancer, useful for tracking traffic redirection. |
aws_applicationelb_http_redirect_url_limit_exceeded_count | HTTP_Redirect_Url_Limit_Exceeded_Count | Tracks instances where the redirect URL limit was exceeded, indicating potential configuration issues. |
aws_applicationelb_healthy_host_count | HealthyHostCount | Measures the number of healthy hosts behind the load balancer, helping monitor service availability. |
aws_applicationelb_healthy_state_dns | HealthyStateDNS | Monitors DNS health state, useful for ensuring DNS routing functionality. |
aws_applicationelb_healthy_state_routing | HealthyStateRouting | Tracks the health of routing decisions by the load balancer, ensuring smooth traffic distribution. |
aws_applicationelb_lambda_internal_error | LambdaInternalError | Monitors internal errors in AWS Lambda functions invoked by the load balancer, useful for debugging serverless application issues. |
aws_applicationelb_lambda_target_processed_bytes | LambdaTargetProcessedBytes | Measures the bytes processed by Lambda targets, providing insights into data throughput for serverless applications. |
aws_applicationelb_lambda_user_error | LambdaUserError | Tracks user-triggered errors in Lambda functions, helping to identify issues in function logic or inputs. |
aws_applicationelb_mitigated_host_count | MitigatedHostCount | Monitors the number of hosts mitigated due to anomalies, useful for tracking security incidents. |
aws_applicationelb_non_sticky_request_count | NonStickyRequestCount | Measures the number of non-sticky requests handled, helping to monitor session persistence performance. |
aws_applicationelb_request_count_per_target | RequestCountPerTarget | Tracks the number of requests processed per target, useful for understanding traffic distribution and load balancing efficiency. |
aws_applicationelb_standard_processed_bytes | StandardProcessedBytes | Measures the total amount of bytes processed, useful for tracking data throughput on standard targets. |
aws_applicationelb_un_healthy_host_count | UnHealthyHostCount | Monitors the number of unhealthy hosts behind the load balancer, helping to identify availability issues. |
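aws_applicationelb_unhealthy_routing_request_count | UnhealthyRoutingRequestCount | Tracks the number of requests routed using the routing failover (fail open) action, useful for detecting when traffic is being sent to unhealthy targets. |
aws_applicationelb_unhealthy_state_dns | UnhealthyStateDNS | Monitors the number of zones that do not meet the DNS healthy state requirements, useful for detecting zones marked unhealthy in DNS. |
aws_applicationelb_unhealthy_state_routing | UnhealthyStateRouting | Tracks the number of zones that do not meet the routing healthy state requirements, useful for detecting zones where traffic is distributed to all targets, including unhealthy ones. |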
AWS/Athena
Function: Interactive query service to analyze data in S3 using SQL
Scrape interval: 5 minutes
Metric | CloudWatch metric | Purpose |
---|---|---|
aws_athena_info | ||
aws_athena_engine_execution_time | EngineExecutionTime | Measures the time taken by the query engine to execute a query, helping to monitor query performance and identify execution bottlenecks. |
aws_athena_processed_bytes | ProcessedBytes | Tracks the amount of data processed by the query engine, useful for understanding query cost and efficiency. |
aws_athena_query_planning_time | QueryPlanningTime | Monitors the time taken to plan and prepare the query for execution, helping identify delays during the query planning phase. |
aws_athena_query_queue_time | QueryQueueTime | Measures the time a query spends in the queue before execution, useful for monitoring system load and query prioritization issues. |
aws_athena_service_processing_time | ServiceProcessingTime | Tracks the time taken by Athena’s internal services to process a query, helping to identify processing delays within the service. |
aws_athena_total_execution_time | TotalExecutionTime | Measures the total time from query submission to completion, providing a comprehensive view of query performance and potential bottlenecks. |
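
The Athena timing metrics above describe separate phases of a query, so comparing them can show where most of the time goes. The sketch below picks the phase with the largest share of the summed phase times. It assumes all four values use the same unit and does not assume they add up exactly to TotalExecutionTime; the function name `dominant_phase` is hypothetical.

```python
from typing import Dict, Tuple

# The per-phase timing metrics from the table above.
PHASES = ("QueryQueueTime", "QueryPlanningTime",
          "EngineExecutionTime", "ServiceProcessingTime")


def dominant_phase(times: Dict[str, float]) -> Tuple[str, float]:
    """Return the phase with the largest time and its share of the summed phases.

    Phases missing from `times` count as zero.
    """
    values = {phase: times.get(phase, 0.0) for phase in PHASES}
    total = sum(values.values())
    if total == 0:
        return ("", 0.0)
    name = max(values, key=values.get)
    return (name, values[name] / total)


# Hypothetical query timings, all in the same unit.
example = {
    "QueryQueueTime": 200,
    "QueryPlanningTime": 300,
    "EngineExecutionTime": 1400,
    "ServiceProcessingTime": 100,
}
phase, share = dominant_phase(example)
```

A large share for `QueryQueueTime` would point at queuing rather than query complexity, while a large `EngineExecutionTime` share points at the query itself.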
AWS/AutoScaling
Function: Automatically adjusts capacity to maintain performance and cost efficiency
Scrape interval: 5 minutes
Metric | CloudWatch metric | Purpose |
---|---|---|
aws_autoscaling_info | ||
aws_autoscaling_group_and_warm_pool_desired_capacity | GroupAndWarmPoolDesiredCapacity | Monitors the desired capacity of both the Auto Scaling group and the warm pool, used to ensure adequate resources are provisioned. |
aws_autoscaling_group_and_warm_pool_total_capacity | GroupAndWarmPoolTotalCapacity | Tracks the total capacity of the Auto Scaling group and warm pool, providing an overview of the available resources. |
aws_autoscaling_group_desired_capacity | GroupDesiredCapacity | Measures the desired number of instances in the Auto Scaling group, useful for capacity planning and scaling decisions. |
aws_autoscaling_group_in_service_capacity | GroupInServiceCapacity | Tracks the number of instances currently in service, helping to monitor the active workload. |
aws_autoscaling_group_in_service_instances | GroupInServiceInstances | Monitors the actual number of instances currently running in the group, useful for managing resource availability. |
aws_autoscaling_group_max_size | GroupMaxSize | Measures the maximum size of the Auto Scaling group, helping ensure the group does not exceed the defined limit. |
aws_autoscaling_group_min_size | GroupMinSize | Tracks the minimum size of the Auto Scaling group, ensuring a baseline level of capacity is maintained. |
aws_autoscaling_group_pending_capacity | GroupPendingCapacity | Monitors the capacity of instances that are pending launch, useful for understanding the state of scaling events. |
aws_autoscaling_group_pending_instances | GroupPendingInstances | Tracks the number of instances that are pending launch, helping monitor scaling processes in progress. |
aws_autoscaling_group_standby_capacity | GroupStandbyCapacity | Measures the capacity of instances in standby mode, useful for tracking inactive but available resources. |
aws_autoscaling_group_standby_instances | GroupStandbyInstances | Monitors the number of instances in standby mode, helping assess resource availability for scaling. |
aws_autoscaling_group_terminating_capacity | GroupTerminatingCapacity | Tracks the capacity of instances being terminated, helping to monitor scaling down activities. |
aws_autoscaling_group_terminating_instances | GroupTerminatingInstances | Monitors the number of instances being terminated, useful for understanding scaling down operations. |
aws_autoscaling_group_total_capacity | GroupTotalCapacity | Measures the total capacity of the Auto Scaling group, providing a complete view of resources available for scaling. |
aws_autoscaling_group_total_instances | GroupTotalInstances | Tracks the total number of instances in the Auto Scaling group, helping to monitor overall resource allocation. |
aws_autoscaling_predictive_scaling_capacity_forecast | PredictiveScalingCapacityForecast | Provides forecasted capacity based on predictive scaling, helping to plan for future resource needs. |
aws_autoscaling_predictive_scaling_load_forecast | PredictiveScalingLoadForecast | Tracks forecasted load on the Auto Scaling group, helping to ensure capacity meets future demand. |
aws_autoscaling_predictive_scaling_metric_pair_correlation | PredictiveScalingMetricPairCorrelation | Measures the correlation between metric pairs for predictive scaling, useful for improving prediction accuracy. |
aws_autoscaling_warm_pool_desired_capacity | WarmPoolDesiredCapacity | Monitors the desired capacity of the warm pool, helping to ensure the pool has sufficient resources for quick scaling. |
aws_autoscaling_warm_pool_min_size | WarmPoolMinSize | Tracks the minimum size of the warm pool, ensuring a baseline level of resources for rapid scaling. |
aws_autoscaling_warm_pool_pending_capacity | WarmPoolPendingCapacity | Measures the capacity of instances pending in the warm pool, useful for understanding warm pool availability. |
aws_autoscaling_warm_pool_terminating_capacity | WarmPoolTerminatingCapacity | Monitors the capacity of instances being terminated in the warm pool, helping to track scaling down activities. |
aws_autoscaling_warm_pool_total_capacity | WarmPoolTotalCapacity | Tracks the total capacity of the warm pool, providing a complete view of available resources for quick scaling. |
aws_autoscaling_warm_pool_warmed_capacity | WarmPoolWarmedCapacity | Measures the capacity of warmed instances in the warm pool, useful for tracking resources that are ready for immediate use. |
AWS/Backup
Function: Centralized backup service to automate and manage backups across AWS services
Scrape interval: 5 minutes
Metric | CloudWatch metric | Purpose |
---|---|---|
aws_backup_info | ||
aws_backup_number_of_backup_jobs_aborted | NumberOfBackupJobsAborted | Tracks the number of backup jobs that a user cancelled before completion, useful for distinguishing deliberate cancellations from failed backup operations. |
aws_backup_number_of_backup_jobs_completed | NumberOfBackupJobsCompleted | Measures the number of backup jobs successfully completed, useful for tracking the effectiveness of backup operations. |
aws_backup_number_of_backup_jobs_created | NumberOfBackupJobsCreated | Tracks the total number of backup jobs initiated, helping to monitor backup frequency and schedule adherence. |
aws_backup_number_of_backup_jobs_expired | NumberOfBackupJobsExpired | Monitors the number of backup jobs that AWS Backup attempted to delete under the retention lifecycle but could not delete, useful for ensuring data retention policies are followed. |
aws_backup_number_of_backup_jobs_failed | NumberOfBackupJobsFailed | Measures the number of backup jobs that have failed, useful for identifying errors in the backup process. |
aws_backup_number_of_backup_jobs_pending | NumberOfBackupJobsPending | Tracks the number of backup jobs currently in a pending state, helping monitor delays or scheduling issues. |
aws_backup_number_of_backup_jobs_running | NumberOfBackupJobsRunning | Monitors the number of backup jobs that are currently running, useful for tracking ongoing backup processes. |
aws_backup_number_of_copy_jobs_completed | NumberOfCopyJobsCompleted | Measures the number of copy jobs successfully completed, helping track backup data replication across regions or storage tiers. |
aws_backup_number_of_copy_jobs_created | NumberOfCopyJobsCreated | Tracks the number of initiated copy jobs, useful for monitoring data replication schedules. |
aws_backup_number_of_copy_jobs_failed | NumberOfCopyJobsFailed | Monitors the number of failed copy jobs, helping to detect issues with backup replication processes. |
aws_backup_number_of_copy_jobs_running | NumberOfCopyJobsRunning | Tracks the number of copy jobs currently in progress, useful for monitoring ongoing replication activities. |
aws_backup_number_of_recovery_points_cold | NumberOfRecoveryPointsCold | Measures the number of cold (archived) recovery points, useful for tracking long-term storage of backup data. |
aws_backup_number_of_recovery_points_completed | NumberOfRecoveryPointsCompleted | Tracks the total number of recovery points successfully created, helping to ensure that data can be restored when needed. |
aws_backup_number_of_recovery_points_deleting | NumberOfRecoveryPointsDeleting | Monitors the number of recovery points being deleted, useful for tracking clean-up or retention policy actions. |
aws_backup_number_of_recovery_points_expired | NumberOfRecoveryPointsExpired | Measures the number of recovery points that AWS Backup attempted to delete under the retention lifecycle but could not delete, useful for ensuring compliance with retention policies. |
aws_backup_number_of_recovery_points_partial | NumberOfRecoveryPointsPartial | Tracks the number of incomplete (partial) recovery points, helping to identify issues with backup integrity or storage capacity. |
aws_backup_number_of_restore_jobs_completed | NumberOfRestoreJobsCompleted | Measures the number of successful restore jobs, useful for tracking data recovery operations. |
aws_backup_number_of_restore_jobs_failed | NumberOfRestoreJobsFailed | Monitors the number of restore jobs that have failed, useful for identifying problems in the recovery process. |
aws_backup_number_of_restore_jobs_pending | NumberOfRestoreJobsPending | Tracks the number of restore jobs that are pending, useful for monitoring delays in data recovery. |
aws_backup_number_of_restore_jobs_running | NumberOfRestoreJobsRunning | Monitors the number of restore jobs currently in progress, helping to track ongoing recovery processes. |
AWS/Billing
Function: Provides detailed usage and cost data for AWS services. This service publishes metrics only in specific AWS regions, so any job configured with this service gathers data only from the us-east-1 region.
Scrape interval: 5 minutes
Includes: Out-of-the-box dashboard
Metric | CloudWatch metric | Purpose |
---|---|---|
aws_billing_estimated_charges | EstimatedCharges | Tracks the estimated charges for your AWS account, providing insights into overall AWS cost and usage. This is useful for budget monitoring and cost management over time, helping to identify cost spikes or unusual charges. |
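As a rough illustration of the region behavior described for this service, the following Python sketch models how a job's configured regions collapse to `us-east-1` when the namespace is `AWS/Billing`. The function name `effective_regions` and both constants are hypothetical. They only mirror the statement above and are not the integration's actual code or configuration.

```python
from typing import List

# Hypothetical constants that mirror the AWS/Billing description above.
BILLING_NAMESPACE = "AWS/Billing"
BILLING_REGION = "us-east-1"


def effective_regions(namespace: str, configured_regions: List[str]) -> List[str]:
    """Return the regions a job would gather data from for a namespace.

    Illustrative only: AWS/Billing is modeled as always resolving to
    us-east-1, and every other namespace keeps its configured regions.
    """
    if namespace == BILLING_NAMESPACE:
        return [BILLING_REGION]
    return list(configured_regions)
```

For example, calling `effective_regions("AWS/Billing", ["eu-west-1", "us-west-2"])` yields only `us-east-1`, while the same call with `"AWS/Backup"` returns both configured regions unchanged.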
AWS/Cassandra
Function: Managed Apache Cassandra-compatible database service
Scrape interval: 5 minutes
Metric | CloudWatch metric | Purpose |
---|---|---|
aws_cassandra_info | ||
aws_cassandra_account_max_reads | AccountMaxReads | Reports the maximum number of read capacity units that the account can use, helping monitor read activity against account limits. |
aws_cassandra_account_max_table_level_reads | AccountMaxTableLevelReads | Reports the maximum number of read capacity units that a single table in the account can use, useful for checking per-table read limits. |
aws_cassandra_account_max_table_level_writes | AccountMaxTableLevelWrites | Reports the maximum number of write capacity units that a single table in the account can use, useful for checking per-table write limits. |
aws_cassandra_account_max_writes | AccountMaxWrites | Reports the maximum number of write capacity units that the account can use, useful for managing overall write throughput against account limits. |
aws_cassandra_account_provisioned_read_capacity_utilization | AccountProvisionedReadCapacityUtilization | Monitors the utilization of provisioned read capacity, helping ensure optimal read capacity allocation. |
aws_cassandra_account_provisioned_write_capacity_utilization | AccountProvisionedWriteCapacityUtilization | Tracks the utilization of provisioned write capacity, ensuring efficient use of write resources. |
aws_cassandra_conditional_check_failed_requests | ConditionalCheckFailedRequests | Measures the number of failed conditional checks, useful for monitoring logical errors during write operations. |
aws_cassandra_consumed_read_capacity_units | ConsumedReadCapacityUnits | Tracks the number of read capacity units consumed, helping monitor read activity and optimize capacity. |
aws_cassandra_consumed_write_capacity_units | ConsumedWriteCapacityUnits | Monitors the number of write capacity units consumed, providing insights into write operations and capacity optimization. |
aws_cassandra_max_provisioned_table_read_capacity_utilization | MaxProvisionedTableReadCapacityUtilization | Tracks the maximum utilization of provisioned read capacity at the table level, helping manage read resources per table. |
aws_cassandra_max_provisioned_table_write_capacity_utilization | MaxProvisionedTableWriteCapacityUtilization | Monitors the maximum utilization of provisioned write capacity at the table level, ensuring efficient use of write resources per table. |
aws_cassandra_returned_item_count | ReturnedItemCount | Measures the total number of items returned by read operations, useful for understanding query efficiency. |
aws_cassandra_returned_item_count_by_select | ReturnedItemCountBySelect | Tracks the number of items returned by select queries, helping optimize query results and performance. |
aws_cassandra_successful_request_count | SuccessfulRequestCount | Monitors the number of successful requests, providing insights into the operational success rate of read and write operations. |
aws_cassandra_successful_request_latency | SuccessfulRequestLatency | Measures the latency of successful requests, helping to optimize performance and identify bottlenecks. |
aws_cassandra_system_errors | SystemErrors | Tracks the number of system-related errors, useful for identifying and addressing infrastructure or service issues. |
aws_cassandra_user_errors | UserErrors | Monitors the number of user-related errors, helping identify application-level issues or misconfigurations. |
AWS/CertificateManager
Function: Manages the provisioning, renewal, and deployment of SSL/TLS certificates
Scrape interval: 5 minutes
Metric | CloudWatch metric | Purpose |
---|---|---|
aws_certificatemanager_info | ||
aws_certificatemanager_days_to_expiry | DaysToExpiry | Tracks the number of days remaining until an SSL/TLS certificate expires. This metric is useful for monitoring certificate lifecycles and ensuring that certificates are renewed before expiration to avoid service disruptions. |
AWS/CloudFront
Function: Content delivery network to deliver data, videos, applications globally
Scrape interval: 5 minutes
Includes: Out-of-the-box dashboard
Metric | CloudWatch metric | Purpose |
---|---|---|
aws_cloudfront_info | ||
aws_cloudfront_4xx_error_rate | 4xxErrorRate | Tracks the rate of 4xx client-side errors, helping to monitor user request issues. |
aws_cloudfront_5xx_error_rate | 5xxErrorRate | Tracks the rate of 5xx server-side errors, useful for detecting backend or CloudFront issues. |
aws_cloudfront_bytes_downloaded | BytesDownloaded | Measures the total bytes downloaded via CloudFront, useful for monitoring bandwidth usage. |
aws_cloudfront_bytes_uploaded | BytesUploaded | Monitors the amount of data uploaded to CloudFront, helping track upload activity. |
aws_cloudfront_requests | Requests | Tracks the total number of requests processed by CloudFront, providing insight into traffic volume. |
aws_cloudfront_total_error_rate | TotalErrorRate | Measures the combined rate of all error responses (both 4xx and 5xx), helping monitor service reliability. |
aws_cloudfront_401_error_rate | 401ErrorRate | Tracks the rate of 401 Unauthorized errors, useful for monitoring authentication issues. |
aws_cloudfront_403_error_rate | 403ErrorRate | Monitors the rate of 403 Forbidden errors, helping to detect access control issues. |
aws_cloudfront_404_error_rate | 404ErrorRate | Measures the rate of 404 Not Found errors, useful for tracking invalid requests or missing resources. |
aws_cloudfront_502_error_rate | 502ErrorRate | Tracks the rate of 502 Bad Gateway errors, indicating backend server or network issues. |
aws_cloudfront_503_error_rate | 503ErrorRate | Monitors the rate of 503 Service Unavailable errors, helping to detect capacity or availability issues. |
aws_cloudfront_504_error_rate | 504ErrorRate | Tracks the rate of 504 Gateway Timeout errors, indicating backend server delays. |
aws_cloudfront_cache_hit_rate | CacheHitRate | Measures the percentage of requests served from CloudFront’s cache, useful for optimizing content delivery efficiency. |
aws_cloudfront_function_compute_utilization | FunctionComputeUtilization | Tracks the compute utilization of CloudFront Functions, helping to monitor resource usage for custom code execution. |
aws_cloudfront_function_execution_errors | FunctionExecutionErrors | Monitors the number of execution errors in CloudFront Functions, helping to identify failures in custom logic. |
aws_cloudfront_function_invocations | FunctionInvocations | Tracks the total number of CloudFront Function invocations, useful for monitoring function usage. |
aws_cloudfront_function_throttles | FunctionThrottles | Measures throttled CloudFront Function invocations, indicating capacity or rate-limiting issues. |
aws_cloudfront_function_validation_errors | FunctionValidationErrors | Tracks validation errors for CloudFront Functions, useful for debugging incorrect function configurations. |
aws_cloudfront_lambda_execution_error | LambdaExecutionError | Monitors errors during Lambda@Edge function execution, useful for identifying issues with serverless logic. |
aws_cloudfront_lambda_limit_exceeded_errors | LambdaLimitExceededErrors | Tracks instances where Lambda@Edge functions exceed their resource limits, helping detect performance bottlenecks. |
aws_cloudfront_lambda_validation_error | LambdaValidationError | Measures Lambda@Edge validation errors, useful for ensuring proper configuration. |
aws_cloudfront_origin_latency | OriginLatency | Tracks the latency from CloudFront to the origin server, helping to identify performance bottlenecks in origin server communication. |
AWS/Cognito
Function: Provides authentication, authorization, and user management for web and mobile apps
Scrape interval: 5 minutes
Metric | CloudWatch metric | Purpose |
---|---|---|
aws_cognito_info | ||
aws_cognito_account_take_over_risk | AccountTakeOverRisk | Tracks the risk of account takeover attempts, useful for detecting malicious login attempts. |
aws_cognito_compromised_credentials_risk | CompromisedCredentialsRisk | Monitors the risk of compromised credentials, helping to detect and mitigate security threats. |
aws_cognito_federation_successes | FederationSuccesses | Tracks the number of successful federated sign-ins, useful for monitoring third-party identity provider usage. |
aws_cognito_federation_throttles | FederationThrottles | Measures the number of throttled federation sign-in attempts, useful for identifying rate-limiting issues. |
aws_cognito_no_risk | NoRisk | Tracks the number of no-risk sign-ins, indicating successful and secure login attempts. |
aws_cognito_override_block | OverrideBlock | Tracks requests that Amazon Cognito blocked because of the configuration provided by the developer, useful for auditing how blocking rules affect sign-in traffic. |
aws_cognito_risk | Risk | Tracks general login risk events, helping to monitor suspicious activity. |
aws_cognito_sign_in_successes | SignInSuccesses | Tracks the number of successful sign-ins, helping to monitor user authentication success. |
aws_cognito_sign_in_throttles | SignInThrottles | Measures the number of throttled sign-in attempts, useful for detecting excessive login activity or rate-limiting. |
aws_cognito_sign_up_successes | SignUpSuccesses | Tracks successful user sign-ups, providing insight into account creation trends. |
aws_cognito_sign_up_throttles | SignUpThrottles | Measures throttled sign-up attempts, useful for identifying potential rate-limiting or abuse during account creation. |
aws_cognito_token_refresh_successes | TokenRefreshSuccesses | Tracks the number of successful token refreshes, useful for monitoring user session continuity. |
aws_cognito_token_refresh_throttles | TokenRefreshThrottles | Monitors the number of throttled token refresh requests, helping identify rate-limiting or session issues. |
AWS/DDoSProtection
Function: Protects against distributed denial of service attacks with AWS Shield
Scrape interval: 5 minutes
Metric | CloudWatch metric | Purpose |
---|---|---|
aws_ddosprotection_info | ||
aws_ddosprotection_ddo_sattack_bits_per_second | DDoSAttackBitsPerSecond | Monitors the volume of a DDoS attack in terms of data transfer per second, useful for detecting bandwidth-based attacks. |
aws_ddosprotection_ddo_sattack_packets_per_second | DDoSAttackPacketsPerSecond | Tracks the number of packets involved in a DDoS attack per second, helping to identify packet flood attacks. |
aws_ddosprotection_ddo_sattack_requests_per_second | DDoSAttackRequestsPerSecond | Monitors the number of requests in a DDoS attack per second, useful for identifying application-layer DDoS attacks. |
aws_ddosprotection_ddo_sdetected | DDoSDetected | Tracks the detection of DDoS attacks, providing alerts when a potential attack is detected. |
aws_ddosprotection_volume_bits_per_second | VolumeBitsPerSecond | Monitors the data transfer volume per second during a DDoS attack, helping to understand the scale of the attack. |
aws_ddosprotection_volume_packets_per_second | VolumePacketsPerSecond | Measures the volume of packets per second, useful for tracking the size of DDoS attacks in terms of packet rate. |
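The metric names in these tables appear to follow a simple pattern. The namespace is lowercased with `/` replaced by `_`. The CloudWatch metric name gets an underscore wherever a lowercase letter or digit is followed by an uppercase letter, and is then lowercased. This would explain why `DDoSAttackBitsPerSecond` becomes `ddo_sattack_bits_per_second` rather than `ddos_attack_bits_per_second`. The Python sketch below reproduces that pattern as inferred from the rows on this page. It is an illustration, not the exporter's actual implementation, and the function name `prometheus_name` is hypothetical.

```python
import re

# An underscore is inserted between a lowercase letter or digit
# and the uppercase letter that immediately follows it.
_BOUNDARY = re.compile(r"([a-z0-9])([A-Z])")


def prometheus_name(namespace: str, cloudwatch_metric: str) -> str:
    """Derive a metric name from a CloudWatch namespace and metric name.

    Assumption: this rule is inferred from the tables on this page.
    """
    prefix = namespace.replace("/", "_").lower()
    snake = _BOUNDARY.sub(r"\1_\2", cloudwatch_metric).lower()
    return f"{prefix}_{snake}"


print(prometheus_name("AWS/DDoSProtection", "DDoSAttackBitsPerSecond"))
print(prometheus_name("AWS/CloudFront", "4xxErrorRate"))
print(prometheus_name("AWS/DMS", "CPUUtilization"))
```

Because runs of capital letters are never split, acronyms such as `CPU`, `CDC`, and `CRL` stay fused to the word that follows them, as in `aws_dms_cpuutilization`.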
AWS/DMS
Function: Migrates databases to AWS with minimal downtime
Scrape interval: 5 minutes
Metric | CloudWatch metric | Purpose |
---|---|---|
aws_dms_info | ||
aws_dms_cdcchanges_disk_source | CDCChangesDiskSource | Tracks the number of rows accumulating on disk and waiting to be committed from the source during Change Data Capture (CDC), useful for spotting source-side backlog that has spilled to disk. |
aws_dms_cdcchanges_disk_target | CDCChangesDiskTarget | Monitors the number of rows accumulating on disk and waiting to be committed to the target during CDC, useful for spotting target-side backlog that has spilled to disk. |
aws_dms_cdcchanges_memory_source | CDCChangesMemorySource | Tracks the number of rows accumulating in memory and waiting to be committed from the source during CDC, helping monitor in-memory backlog on the source side. |
aws_dms_cdcchanges_memory_target | CDCChangesMemoryTarget | Monitors the number of rows accumulating in memory and waiting to be committed to the target during CDC, useful for tracking in-memory backlog on the target side. |
aws_dms_cdcincoming_changes | CDCIncomingChanges | Measures the number of incoming changes during CDC operations, helping to monitor the rate of data changes. |
aws_dms_cdclatency_source | CDCLatencySource | Tracks latency on the source side during CDC operations, helping to identify performance issues with data changes. |
aws_dms_cdclatency_target | CDCLatencyTarget | Monitors the latency on the target side during CDC operations, useful for tracking potential bottlenecks. |
aws_dms_cdcthroughput_bandwidth_source | CDCThroughputBandwidthSource | Measures the source bandwidth usage during CDC operations, helping to monitor network usage. |
aws_dms_cdcthroughput_bandwidth_target | CDCThroughputBandwidthTarget | Monitors the target bandwidth usage during CDC, useful for tracking data transfer rates. |
aws_dms_cdcthroughput_rows_source | CDCThroughputRowsSource | Tracks the number of rows processed from the source during CDC operations, useful for monitoring data throughput. |
aws_dms_cdcthroughput_rows_target | CDCThroughputRowsTarget | Monitors the number of rows written to the target during CDC, helping to ensure data is migrated efficiently. |
aws_dms_cpuutilization | CPUUtilization | Measures the CPU usage of DMS instances, helping to ensure that the system has enough resources to perform migrations. |
aws_dms_free_storage_space | FreeStorageSpace | Tracks the amount of free storage available on the DMS instance, useful for preventing storage exhaustion during migrations. |
aws_dms_freeable_memory | FreeableMemory | Monitors the available memory on the DMS instance, useful for ensuring that enough memory is available for operations. |
aws_dms_full_load_throughput_bandwidth_source | FullLoadThroughputBandwidthSource | Tracks bandwidth usage during full load operations on the source, useful for monitoring network utilization. |
aws_dms_full_load_throughput_bandwidth_target | FullLoadThroughputBandwidthTarget | Monitors bandwidth usage during full load operations on the target, helping track data transfer efficiency. |
aws_dms_full_load_throughput_rows_source | FullLoadThroughputRowsSource | Tracks the number of rows processed from the source during full load migrations, helping to monitor data throughput. |
aws_dms_full_load_throughput_rows_target | FullLoadThroughputRowsTarget | Monitors the number of rows loaded to the target during full load operations, helping to ensure migration progress. |
aws_dms_network_receive_throughput | NetworkReceiveThroughput | Tracks the network receive rate, helping to monitor inbound network performance during migrations. |
aws_dms_network_transmit_throughput | NetworkTransmitThroughput | Measures the network transmit rate, useful for monitoring outbound network performance. |
aws_dms_read_iops | ReadIOPS | Tracks the number of read operations per second, helping to monitor disk read performance. |
aws_dms_read_latency | ReadLatency | Measures the latency of read operations, helping to identify performance issues in disk reads. |
aws_dms_read_throughput | ReadThroughput | Monitors the throughput of read operations, useful for tracking how much data is being read during migrations. |
aws_dms_swap_usage | SwapUsage | Tracks the amount of swap space used, helping monitor memory performance. |
aws_dms_write_iops | WriteIOPS | Measures the number of write operations per second, useful for monitoring disk write performance. |
aws_dms_write_latency | WriteLatency | Tracks the latency of write operations, helping identify performance issues during data writes. |
aws_dms_write_throughput | WriteThroughput | Monitors the throughput of write operations, helping to understand the speed of data writes during migration operations. |
AWS/DX
Function: Provides a dedicated network connection to AWS through AWS Direct Connect
Scrape interval: 5 minutes
Metric | CloudWatch metric | Purpose |
---|---|---|
aws_dx_info | ||
aws_dx_connection_bps_egress | ConnectionBpsEgress | Measures the egress bandwidth (bits per second) for Direct Connect connections, helping monitor outbound data transfer. |
aws_dx_connection_bps_ingress | ConnectionBpsIngress | Monitors the ingress bandwidth (bits per second), providing insights into inbound data transfer rates. |
aws_dx_connection_crcerror_count | ConnectionCRCErrorCount | Tracks CRC errors on the connection, useful for identifying data integrity issues or hardware problems. |
aws_dx_connection_encryption_state | ConnectionEncryptionState | Monitors the encryption state of Direct Connect connections, helping ensure secure data transfer. |
aws_dx_connection_error_count | ConnectionErrorCount | Tracks the number of errors on the Direct Connect connection, useful for diagnosing connectivity issues. |
aws_dx_connection_light_level_rx | ConnectionLightLevelRx | Measures the received light level, helping monitor the health of fiber optic connections. |
aws_dx_connection_light_level_tx | ConnectionLightLevelTx | Tracks the transmitted light level, helping ensure proper signal strength in fiber optic connections. |
aws_dx_connection_pps_egress | ConnectionPpsEgress | Monitors the number of packets per second being transmitted (egress), useful for tracking network traffic patterns. |
aws_dx_connection_pps_ingress | ConnectionPpsIngress | Tracks the number of packets per second being received (ingress), useful for understanding inbound traffic load. |
aws_dx_connection_state | ConnectionState | Monitors the operational state of Direct Connect connections, helping to detect connection status changes. |
aws_dx_virtual_interface_bps_egress | VirtualInterfaceBpsEgress | Measures the outbound bandwidth usage for virtual interfaces, helping track the data flow from virtual interfaces. |
aws_dx_virtual_interface_bps_ingress | VirtualInterfaceBpsIngress | Monitors inbound bandwidth usage for virtual interfaces, providing insight into data ingress through virtual interfaces. |
aws_dx_virtual_interface_pps_egress | VirtualInterfacePpsEgress | Tracks the number of outbound packets per second for virtual interfaces, helping monitor packet-based traffic. |
aws_dx_virtual_interface_pps_ingress | VirtualInterfacePpsIngress | Measures the number of inbound packets per second for virtual interfaces, useful for monitoring packet-level ingress. |
AWS/DocDB
Function: Managed document database service that supports MongoDB workloads
Scrape interval: 5 minutes
Metric | CloudWatch metric | Purpose |
---|---|---|
aws_docdb_info | ||
aws_docdb_backup_retention_period_storage_used | BackupRetentionPeriodStorageUsed | Tracks the amount of storage used for backup retention, helping manage backup costs and storage. |
aws_docdb_buffer_cache_hit_ratio | BufferCacheHitRatio | Monitors the cache hit ratio, helping to ensure data is being effectively cached. |
aws_docdb_cpuutilization | CPUUtilization | Measures the CPU usage of the database, useful for monitoring resource consumption. |
aws_docdb_change_stream_log_size | ChangeStreamLogSize | Tracks the size of the change stream log, helping monitor the volume of changes being processed. |
aws_docdb_dbcluster_replica_lag_maximum | DBClusterReplicaLagMaximum | Monitors the maximum replication lag between the primary and replica nodes in the cluster. |
aws_docdb_dbcluster_replica_lag_minimum | DBClusterReplicaLagMinimum | Tracks the minimum replication lag, helping ensure data replication is kept in sync. |
aws_docdb_dbinstance_replica_lag | DBInstanceReplicaLag | Monitors replication lag at the instance level, useful for tracking data consistency across instances. |
aws_docdb_database_connections | DatabaseConnections | Tracks the number of active connections to the database, helping monitor connection load. |
aws_docdb_database_connections_max | DatabaseConnectionsMax | Monitors the peak number of open database connections observed within a one-minute period, helping detect connection spikes before they exhaust the connection limit. |
aws_docdb_database_cursors | DatabaseCursors | Tracks the number of database cursors in use, helping monitor query processing. |
aws_docdb_database_cursors_max | DatabaseCursorsMax | Monitors the peak number of open cursors observed within a one-minute period, useful for managing usage against cursor limits. |
aws_docdb_database_cursors_timed_out | DatabaseCursorsTimedOut | Tracks cursors that have timed out, helping identify performance issues. |
aws_docdb_disk_queue_depth | DiskQueueDepth | Measures the depth of the disk I/O queue, useful for monitoring disk performance. |
aws_docdb_documents_deleted | DocumentsDeleted | Tracks the number of documents deleted, helping to monitor data deletion operations. |
aws_docdb_documents_inserted | DocumentsInserted | Measures the number of documents inserted, helping to track data growth in the database. |
aws_docdb_documents_returned | DocumentsReturned | Tracks the number of documents returned by queries, useful for monitoring query performance. |
aws_docdb_documents_updated | DocumentsUpdated | Measures the number of documents updated, helping track changes in the database. |
aws_docdb_engine_uptime | EngineUptime | Monitors the total uptime of the database engine, useful for tracking availability. |
aws_docdb_free_local_storage | FreeLocalStorage | Tracks the amount of free storage on the database node, helping to prevent storage exhaustion. |
aws_docdb_freeable_memory | FreeableMemory | Monitors the amount of free memory, useful for ensuring sufficient memory availability. |
aws_docdb_network_receive_throughput | NetworkReceiveThroughput | Measures the amount of data being received by the database, useful for tracking inbound network usage. |
aws_docdb_network_throughput | NetworkThroughput | Monitors overall network throughput, helping track both inbound and outbound traffic. |
aws_docdb_network_transmit_throughput | NetworkTransmitThroughput | Measures the amount of data being transmitted from the database, helping track outbound traffic. |
aws_docdb_opcounters_command | OpcountersCommand | Tracks the number of database commands executed, useful for monitoring operational throughput. |
aws_docdb_opcounters_delete | OpcountersDelete | Monitors the number of delete operations, useful for tracking data modifications. |
aws_docdb_opcounters_getmore | OpcountersGetmore | Measures the number of getMore operations, useful for monitoring pagination in queries. |
aws_docdb_opcounters_insert | OpcountersInsert | Tracks the number of insert operations, helping monitor data insert performance. |
aws_docdb_opcounters_query | OpcountersQuery | Monitors the number of queries executed, useful for tracking query load. |
aws_docdb_opcounters_update | OpcountersUpdate | Measures the number of update operations, helping monitor data modifications in the database. |
aws_docdb_read_iops | ReadIOPS | Tracks the number of input/output operations per second for reads, helping to monitor read performance. |
aws_docdb_read_latency | ReadLatency | Measures the latency of read operations, helping to identify performance issues with data retrieval. |
aws_docdb_read_throughput | ReadThroughput | Monitors the rate of data being read from the database, useful for tracking read performance. |
aws_docdb_snapshot_storage_used | SnapshotStorageUsed | Tracks the amount of storage used for database snapshots, helping manage backup storage costs. |
aws_docdb_swap_usage | SwapUsage | Monitors the amount of swap space used, helping track memory efficiency. |
aws_docdb_total_backup_storage_billed | TotalBackupStorageBilled | Tracks the amount of backup storage billed, useful for understanding backup costs. |
aws_docdb_volume_bytes_used | VolumeBytesUsed | Measures the amount of storage volume in use, helping track database storage usage. |
aws_docdb_volume_read_iops | VolumeReadIOPs | Tracks the number of read input/output operations per second on the storage volume, useful for monitoring storage performance. |
aws_docdb_volume_write_iops | VolumeWriteIOPs | Measures the number of write I/O operations per second, helping monitor write performance on the storage volume. |
aws_docdb_write_iops | WriteIOPS | Tracks the number of write operations per second, useful for tracking write throughput. |
aws_docdb_write_latency | WriteLatency | Measures the latency of write operations, helping to identify performance bottlenecks during data insertion or updates. |
aws_docdb_write_throughput | WriteThroughput | Monitors the rate at which data is written to the database, useful for understanding write performance. |
AWS/DynamoDB
Function: Fully managed NoSQL database service for low-latency applications at scale
Scrape interval: 5 minutes
Metric | CloudWatch metric | Purpose |
---|---|---|
aws_dynamodb_info | ||
aws_dynamodb_account_max_reads | AccountMaxReads | Reports the maximum number of read capacity units that the account can use, helping track read activity against the account-level quota. |
aws_dynamodb_account_max_table_level_reads | AccountMaxTableLevelReads | Reports the maximum number of read capacity units that a single table or global secondary index in the account can use, helping to check per-table read limits. |
aws_dynamodb_account_max_table_level_writes | AccountMaxTableLevelWrites | Reports the maximum number of write capacity units that a single table or global secondary index in the account can use, useful for checking per-table write limits. |
aws_dynamodb_account_max_writes | AccountMaxWrites | Reports the maximum number of write capacity units that the account can use, helping monitor write throughput against the account-level quota. |
aws_dynamodb_account_provisioned_read_capacity_utilization | AccountProvisionedReadCapacityUtilization | Monitors the utilization of the provisioned read capacity, helping ensure sufficient read capacity allocation. |
aws_dynamodb_account_provisioned_write_capacity_utilization | AccountProvisionedWriteCapacityUtilization | Tracks the utilization of the provisioned write capacity, useful for efficient capacity management. |
aws_dynamodb_age_of_oldest_unreplicated_record | AgeOfOldestUnreplicatedRecord | Measures the age of the oldest unreplicated record, helping track replication lag. |
aws_dynamodb_conditional_check_failed_requests | ConditionalCheckFailedRequests | Tracks the number of failed conditional checks, useful for identifying logical issues during write operations. |
aws_dynamodb_consumed_change_data_capture_units | ConsumedChangeDataCaptureUnits | Measures the number of consumed Change Data Capture units, helping monitor CDC-based operations. |
aws_dynamodb_consumed_read_capacity_units | ConsumedReadCapacityUnits | Monitors the total read capacity units consumed, helping track and optimize read operations. |
aws_dynamodb_consumed_write_capacity_units | ConsumedWriteCapacityUnits | Measures the total write capacity units consumed, useful for monitoring and optimizing write operations. |
aws_dynamodb_failed_to_replicate_record_count | FailedToReplicateRecordCount | Tracks the number of records that failed to replicate, useful for identifying replication issues. |
aws_dynamodb_max_provisioned_table_read_capacity_utilization | MaxProvisionedTableReadCapacityUtilization | Measures the maximum utilization of the provisioned read capacity at the table level, useful for understanding table-specific read activity. |
aws_dynamodb_max_provisioned_table_write_capacity_utilization | MaxProvisionedTableWriteCapacityUtilization | Tracks the maximum utilization of provisioned write capacity at the table level, helping optimize write capacity. |
aws_dynamodb_on_demand_max_read_request_units | OnDemandMaxReadRequestUnits | Monitors the maximum number of read request units in on-demand mode, useful for managing scaling costs. |
aws_dynamodb_on_demand_max_write_request_units | OnDemandMaxWriteRequestUnits | Tracks the maximum number of write request units in on-demand mode, helping optimize scaling and cost management. |
aws_dynamodb_online_index_consumed_write_capacity | OnlineIndexConsumedWriteCapacity | Measures the write capacity consumed by online index builds, useful for tracking index creation overhead. |
aws_dynamodb_online_index_percentage_progress | OnlineIndexPercentageProgress | Monitors the progress of online index creation, useful for understanding index build status. |
aws_dynamodb_online_index_throttle_events | OnlineIndexThrottleEvents | Tracks throttle events during online index creation, useful for detecting capacity constraints. |
aws_dynamodb_pending_replication_count | PendingReplicationCount | Monitors the number of records pending replication, useful for tracking replication progress. |
aws_dynamodb_provisioned_read_capacity_units | ProvisionedReadCapacityUnits | Tracks the total provisioned read capacity units, useful for managing resource allocation. |
aws_dynamodb_provisioned_write_capacity_units | ProvisionedWriteCapacityUnits | Monitors the total provisioned write capacity units, helping ensure proper capacity allocation. |
aws_dynamodb_read_throttle_events | ReadThrottleEvents | Measures the number of throttled read requests, useful for identifying capacity limitations. |
aws_dynamodb_replication_latency | ReplicationLatency | Tracks the replication latency, helping ensure timely data consistency across replicas. |
aws_dynamodb_returned_bytes | ReturnedBytes | Monitors the amount of data returned in response to queries, useful for tracking data retrieval patterns. |
aws_dynamodb_returned_item_count | ReturnedItemCount | Measures the total number of items returned by read operations, useful for monitoring query performance. |
aws_dynamodb_returned_records_count | ReturnedRecordsCount | Tracks the number of records returned by queries, useful for understanding query load and performance. |
aws_dynamodb_successful_request_latency | SuccessfulRequestLatency | Monitors the latency of successful requests, useful for optimizing request performance. |
aws_dynamodb_system_errors | SystemErrors | Tracks system-level errors, helping identify infrastructure or platform issues. |
aws_dynamodb_throttled_put_record_count | ThrottledPutRecordCount | Monitors the number of records throttled when DynamoDB writes change data to a Kinesis data stream (PutRecord), useful for sizing stream capacity. |
aws_dynamodb_throttled_requests | ThrottledRequests | Tracks the total number of throttled requests, helping to identify capacity limitations or traffic spikes. |
aws_dynamodb_time_to_live_deleted_item_count | TimeToLiveDeletedItemCount | Measures the number of items deleted due to Time to Live (TTL) expiration, useful for managing automatic data deletion. |
aws_dynamodb_transaction_conflict | TransactionConflict | Monitors the number of transaction conflicts, helping to optimize transaction performance. |
aws_dynamodb_user_errors | UserErrors | Tracks user-level errors, helping identify application issues. |
aws_dynamodb_write_throttle_events | WriteThrottleEvents | Monitors the number of throttled write requests, useful for identifying capacity constraints during write operations. |
AWS/EBS
Function: Block storage for use with EC2 instances
Scrape interval: 5 minutes
Includes: Out-of-the-box dashboard
Metric | Cloudwatch metric | Purpose |
---|---|---|
aws_ebs_info | ||
aws_ebs_volume_read_bytes | VolumeReadBytes | Measures the total bytes read from the EBS volume, useful for monitoring data retrieval activity. |
aws_ebs_volume_write_bytes | VolumeWriteBytes | Tracks the total bytes written to the EBS volume, helping monitor data write operations. |
aws_ebs_volume_read_ops | VolumeReadOps | Monitors the number of read operations on the EBS volume, useful for tracking read performance. |
aws_ebs_volume_write_ops | VolumeWriteOps | Measures the number of write operations on the EBS volume, helping to monitor write throughput. |
aws_ebs_volume_total_read_time | VolumeTotalReadTime | Tracks the total time spent on read operations, useful for understanding read latency. |
aws_ebs_volume_total_write_time | VolumeTotalWriteTime | Monitors the total time spent on write operations, helping to understand write latency. |
aws_ebs_volume_idle_time | VolumeIdleTime | Measures the amount of idle time for the EBS volume, useful for understanding periods of inactivity. |
aws_ebs_volume_queue_length | VolumeQueueLength | Tracks the length of the queue for I/O requests on the EBS volume, helping to identify potential performance bottlenecks. |
aws_ebs_volume_throughput_percentage | VolumeThroughputPercentage | Monitors the throughput percentage of the EBS volume, useful for ensuring optimal performance. |
aws_ebs_volume_consumed_read_write_ops | VolumeConsumedReadWriteOps | Measures the number of read and write operations consumed, helping track IOPS utilization. |
aws_ebs_burst_balance | BurstBalance | Tracks the balance of burst credits available for burstable performance EBS volumes, helping manage performance spikes. |
aws_ebs_enable_copied_image_deprecation_completed | EnableCopiedImageDeprecationCompleted | Measures the completion of copied image deprecation operations, useful for lifecycle management. |
aws_ebs_enable_copied_image_deprecation_failed | EnableCopiedImageDeprecationFailed | Tracks the failure of copied image deprecation operations, helping identify issues with deprecation. |
aws_ebs_enable_image_deprecation_completed | EnableImageDeprecationCompleted | Measures the completion of image deprecation operations, helping monitor deprecation success. |
aws_ebs_enable_image_deprecation_failed | EnableImageDeprecationFailed | Tracks the failure of image deprecation operations, useful for identifying deprecation issues. |
aws_ebs_images_copied_region_completed | ImagesCopiedRegionCompleted | Monitors the completion of image copy operations across regions, helping manage multi-region image availability. |
aws_ebs_images_copied_region_deregister_completed | ImagesCopiedRegionDeregisterCompleted | Tracks the completion of deregistration of copied images across regions, useful for lifecycle management. |
aws_ebs_images_copied_region_deregistered_failed | ImagesCopiedRegionDeregisteredFailed | Measures failures during the deregistration of copied images, helping identify operational issues. |
aws_ebs_images_copied_region_failed | ImagesCopiedRegionFailed | Tracks failures in region-to-region image copy operations, useful for identifying cross-region availability issues. |
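aws_ebs_images_copied_region_started | ImagesCopiedRegionStarted | Tracks the number of region-to-region image copy operations started, useful for comparing against completed and failed copies. |
aws_ebs_images_create_completed | ImagesCreateCompleted | Tracks the successful completion of image creation, helping monitor image-based backup operations. |
aws_ebs_images_create_failed | ImagesCreateFailed | Measures the number of failed image creation attempts, useful for detecting backup failures. |
aws_ebs_images_create_started | ImagesCreateStarted | Tracks the number of image creation operations started, useful for comparing against completed and failed creations. |
aws_ebs_images_deregister_completed | ImagesDeregisterCompleted | Tracks the completion of image deregistration, useful for lifecycle management. |
aws_ebs_images_deregister_failed | ImagesDeregisterFailed | Measures the number of failed image deregistration attempts, helping identify operational issues. |
aws_ebs_resources_targeted | ResourcesTargeted | Monitors the number of resources targeted by snapshot or image lifecycle policies, useful for verifying policy scope. |
aws_ebs_snapshots_copied_account_completed | SnapshotsCopiedAccountCompleted | Tracks the completion of cross-account snapshot copy operations, helping monitor snapshot sharing between accounts. |
aws_ebs_snapshots_copied_account_delete_completed | SnapshotsCopiedAccountDeleteCompleted | Tracks the completion of deletion of cross-account snapshot copies, useful for lifecycle management. |
aws_ebs_snapshots_copied_account_delete_failed | SnapshotsCopiedAccountDeleteFailed | Measures failures during the deletion of cross-account snapshot copies, helping identify operational issues. |
aws_ebs_snapshots_copied_account_failed | SnapshotsCopiedAccountFailed | Tracks failures in cross-account snapshot copy operations, useful for identifying cross-account backup issues. |
aws_ebs_snapshots_copied_account_started | SnapshotsCopiedAccountStarted | Tracks the number of cross-account snapshot copy operations started, useful for comparing against completed and failed copies. |
aws_ebs_snapshots_copied_region_completed | SnapshotsCopiedRegionCompleted | Monitors the completion of snapshot copy operations across regions, helping manage multi-region snapshot availability. |
aws_ebs_snapshots_copied_region_delete_completed | SnapshotsCopiedRegionDeleteCompleted | Tracks the completion of deletion of cross-region snapshot copies, useful for lifecycle management. |
aws_ebs_snapshots_copied_region_delete_failed | SnapshotsCopiedRegionDeleteFailed | Measures failures during the deletion of cross-region snapshot copies, helping identify operational issues. |
aws_ebs_snapshots_copied_region_failed | SnapshotsCopiedRegionFailed | Tracks failures in region-to-region snapshot copy operations, useful for identifying cross-region availability issues. |
aws_ebs_snapshots_copied_region_started | SnapshotsCopiedRegionStarted | Tracks the number of region-to-region snapshot copy operations started, useful for comparing against completed and failed copies. |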
aws_ebs_snapshots_create_completed | SnapshotsCreateCompleted | Tracks the successful completion of snapshot creation, helping monitor backup operations. |
aws_ebs_snapshots_create_failed | SnapshotsCreateFailed | Measures the number of failed snapshot creation attempts, useful for detecting backup failures. |
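aws_ebs_snapshots_create_started | SnapshotsCreateStarted | Tracks the number of snapshot creation operations started, useful for comparing against completed and failed creations. |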
aws_ebs_snapshots_delete_completed | SnapshotsDeleteCompleted | Tracks the completion of snapshot deletion, useful for storage management. |
aws_ebs_snapshots_delete_failed | SnapshotsDeleteFailed | Measures the number of failed snapshot deletion attempts, helping track operational issues with snapshot management. |
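aws_ebs_snapshots_shared_completed | SnapshotsSharedCompleted | Tracks the completion of snapshot sharing operations, useful for monitoring cross-account snapshot access. |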
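
To make the relationship between a table row and CloudWatch more concrete, the following is a minimal sketch of the query structure that the CloudWatch `GetMetricData` API accepts for one EBS metric. It assumes the 5-minute scrape interval corresponds to a 300-second `Period`; this document does not state how the integration builds its requests. The helper name `build_metric_query` and the volume ID are hypothetical.

```python
# Assumption: a 5-minute scrape interval maps to a 300-second CloudWatch period.
SCRAPE_INTERVAL_SECONDS = 300


def build_metric_query(namespace, metric_name, dimensions, stat="Average",
                       period=SCRAPE_INTERVAL_SECONDS):
    """Build one MetricDataQueries entry for CloudWatch GetMetricData.

    `stat` can be any statistic listed at the top of this page,
    for example "Average", "Sum", or "p99".
    """
    return {
        "Id": "q0",  # must start with a lowercase letter
        "MetricStat": {
            "Metric": {
                "Namespace": namespace,
                "MetricName": metric_name,
                "Dimensions": [
                    {"Name": name, "Value": value}
                    for name, value in dimensions.items()
                ],
            },
            "Period": period,
            "Stat": stat,
        },
        "ReturnData": True,
    }


# Hypothetical volume ID, used only for illustration.
query = build_metric_query(
    "AWS/EBS", "VolumeReadBytes", {"VolumeId": "vol-0123456789abcdef0"}, stat="Sum"
)
print(query)
```

With `boto3` installed and credentials configured, the resulting dictionary could be passed to `boto3.client("cloudwatch").get_metric_data(MetricDataQueries=[query], StartTime=..., EndTime=...)`.
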
AWS/EC2
Function: Virtual servers in the cloud for running applications
Scrape interval: 5 minutes
Includes: Out-of-the-box dashboard
Metric | Cloudwatch metric | Purpose |
---|---|---|
aws_ec2_info | ||
aws_ec2_cpuutilization | CPUUtilization | Measures the percentage of allocated compute capacity in use on the EC2 instance, useful for monitoring processing load. |
aws_ec2_network_in | NetworkIn | Measures the amount of data received by the EC2 instance, useful for monitoring inbound traffic. |
aws_ec2_network_out | NetworkOut | Monitors the amount of data sent from the EC2 instance, helping track outbound traffic. |
aws_ec2_network_packets_in | NetworkPacketsIn | Tracks the number of network packets received, useful for understanding inbound network traffic patterns. |
aws_ec2_network_packets_out | NetworkPacketsOut | Measures the number of network packets sent, helping monitor outbound network activity. |
aws_ec2_disk_read_bytes | DiskReadBytes | Monitors the number of bytes read from the instance’s storage, useful for tracking data retrieval performance. |
aws_ec2_disk_write_bytes | DiskWriteBytes | Measures the number of bytes written to the instance’s storage, helping to track storage write operations. |
aws_ec2_disk_read_ops | DiskReadOps | Tracks the number of read operations on the instance’s storage, useful for monitoring storage performance. |
aws_ec2_disk_write_ops | DiskWriteOps | Measures the number of write operations on the instance’s storage, helping track write activity. |
aws_ec2_status_check_failed | StatusCheckFailed | Tracks whether the EC2 instance has failed the instance or system status checks, useful for identifying potential issues. |
aws_ec2_status_check_failed_instance | StatusCheckFailed_Instance | Monitors whether the instance has failed the instance-level status checks, helping to detect internal instance issues. |
aws_ec2_status_check_failed_system | StatusCheckFailed_System | Tracks failures in the system-level status checks, useful for identifying infrastructure issues impacting the instance. |
aws_ec2_ebsiobalance_percent | EBSIOBalance% | Measures the I/O balance of attached EBS volumes, helping to ensure that the instance has adequate I/O capacity. |
aws_ec2_ebsbyte_balance_percent | EBSByteBalance% | Tracks the byte balance of attached EBS volumes, useful for managing storage throughput. |
aws_ec2_ebsread_ops | EBSReadOps | Monitors the number of read operations on attached EBS volumes, useful for tracking storage read performance. |
aws_ec2_ebswrite_ops | EBSWriteOps | Tracks the number of write operations on attached EBS volumes, helping to monitor storage write activity. |
aws_ec2_ebsread_bytes | EBSReadBytes | Measures the number of bytes read from attached EBS volumes, useful for monitoring data retrieval performance. |
aws_ec2_ebswrite_bytes | EBSWriteBytes | Tracks the number of bytes written to attached EBS volumes, helping to monitor data write performance. |
aws_ec2_cpucredit_balance | CPUCreditBalance | Monitors the remaining CPU credits for burstable instances, helping ensure that sufficient CPU credits are available for performance. |
aws_ec2_cpucredit_usage | CPUCreditUsage | Tracks the number of CPU credits used, useful for monitoring the consumption of burstable instances. |
aws_ec2_cpusurplus_credit_balance | CPUSurplusCreditBalance | Measures the surplus CPU credits available for burstable instances, useful for tracking instance performance capacity. |
aws_ec2_cpusurplus_credits_charged | CPUSurplusCreditsCharged | Tracks the number of surplus CPU credits charged, helping manage costs associated with overutilization. |
aws_ec2_dedicated_host_cpuutilization | DedicatedHostCPUUtilization | Measures the CPU usage of dedicated EC2 hosts, helping to optimize host-level resource allocation. |
aws_ec2_metadata_no_token | MetadataNoToken | Monitors the number of times the instance metadata service was accessed without a token (IMDSv1), useful for identifying workloads that have not moved to token-based access. |
aws_ec2_status_check_failed_attached_ebs | StatusCheckFailed_AttachedEBS | Tracks status check failures related to attached EBS volumes, helping monitor storage health and performance. |
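
The metric names in the first column appear to follow a regular pattern derived from the CloudWatch namespace and metric name. The following is a minimal sketch of that pattern, inferred only from the rows in the table above; it is an assumption, not the integration's documented algorithm, and the function name `prom_name` is hypothetical. It may not reproduce every row in this reference.

```python
import re


def prom_name(namespace: str, metric: str) -> str:
    """Approximate the metric name shown in the first table column.

    Inferred rules (assumptions):
      * the namespace is lowercased and "/" becomes "_"
      * "_" is inserted where a lowercase letter or digit meets an uppercase letter
      * "%" becomes "_percent"; other non-alphanumeric characters become "_"
      * the result is lowercased
    """
    ns = re.sub(r"[^a-z0-9]+", "_", namespace.lower()).strip("_")
    name = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", metric)
    name = name.replace("%", "_percent")
    name = re.sub(r"[^A-Za-z0-9]+", "_", name).lower().strip("_")
    return f"{ns}_{name}"


for cloudwatch_metric in ("CPUUtilization", "NetworkPacketsIn",
                          "EBSIOBalance%", "StatusCheckFailed_Instance"):
    print(cloudwatch_metric, "->", prom_name("AWS/EC2", cloudwatch_metric))
```

Runs of capital letters stay joined because the split happens only at a lowercase-to-uppercase boundary, which is why `CPUUtilization` maps to `cpuutilization` rather than `cpu_utilization`.
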
AWS/EC2Spot
Function: Uses spare EC2 capacity at reduced prices for workloads with flexible start times
Scrape interval: 5 minutes
Metric | Cloudwatch metric | Purpose |
---|---|---|
aws_ec2spot_info | ||
aws_ec2spot_available_instance_pools_count | AvailableInstancePoolsCount | Monitors the number of instance pools available for Spot requests, useful for tracking availability. |
aws_ec2spot_bids_submitted_for_capacity | BidsSubmittedForCapacity | Tracks the number of bids submitted for capacity in Spot instances, helping monitor the Spot instance bidding process. |
aws_ec2spot_eligible_instance_pool_count | EligibleInstancePoolCount | Measures the number of eligible instance pools for Spot requests, useful for understanding Spot market options. |
aws_ec2spot_fulfilled_capacity | FulfilledCapacity | Tracks the capacity fulfilled by Spot instances, helping monitor the success rate of Spot requests. |
aws_ec2spot_max_percent_capacity_allocation | MaxPercentCapacityAllocation | Measures the maximum percent of capacity allocated, useful for understanding the allocation of Spot instances. |
aws_ec2spot_pending_capacity | PendingCapacity | Tracks the pending Spot instance capacity, helping monitor Spot instance provisioning. |
aws_ec2spot_percent_capacity_allocation | PercentCapacityAllocation | Monitors the percentage of capacity allocated to Spot instances, useful for managing resource allocation. |
aws_ec2spot_target_capacity | TargetCapacity | Tracks the target capacity for Spot instances, useful for monitoring Spot instance request goals. |
aws_ec2spot_terminating_capacity | TerminatingCapacity | Measures the capacity being terminated in Spot instances, helping track Spot instance lifecycle management. |
AWS/ECR
Function: Managed container image registry for storing Docker images
Scrape interval: 5 minutes
Metric | Cloudwatch metric | Purpose |
---|---|---|
aws_ecr_repository_pull_count | RepositoryPullCount | Monitors the number of pulls from an ECR repository, useful for tracking container image usage. |
AWS/ECS
Function: Fully managed container orchestration service for running Docker containers
Scrape interval: 5 minutes
Includes: Out-of-the-box dashboard
Metric | Cloudwatch metric | Purpose |
---|---|---|
aws_ecs_info | ||
aws_ecs_cpureservation | CPUReservation | Tracks the CPU reserved for ECS tasks, helping monitor resource reservation. |
aws_ecs_cpuutilization | CPUUtilization | Monitors the CPU utilization of ECS tasks, useful for tracking resource usage. |
aws_ecs_gpureservation | GPUReservation | Tracks GPU reservation for ECS tasks, helping manage GPU resources. |
aws_ecs_memory_reservation | MemoryReservation | Monitors the memory reserved for ECS tasks, helping track memory resource allocation. |
aws_ecs_memory_utilization | MemoryUtilization | Tracks the memory utilization of ECS tasks, useful for monitoring memory resource consumption. |
AWS/EFS
Function: Scalable and fully managed file storage for use with EC2 instances
Scrape interval: 5 minutes
Metric | Cloudwatch metric | Purpose |
---|---|---|
aws_efs_info | ||
aws_efs_burst_credit_balance | BurstCreditBalance | Monitors the balance of burst credits for EFS, useful for managing performance bursts. |
aws_efs_client_connections | ClientConnections | Tracks the number of client connections to EFS, useful for understanding file system usage. |
aws_efs_data_read_iobytes | DataReadIOBytes | Measures the amount of data read from EFS, helping track read performance. |
aws_efs_data_write_iobytes | DataWriteIOBytes | Tracks the amount of data written to EFS, helping monitor write performance. |
aws_efs_metadata_iobytes | MetadataIOBytes | Monitors the metadata operations on EFS, useful for tracking metadata-related I/O. |
aws_efs_metered_iobytes | MeteredIOBytes | Tracks the amount of metered I/O operations, helping manage performance limits. |
aws_efs_percent_iolimit | PercentIOLimit | Monitors the percentage of the I/O limit reached, useful for performance management. |
aws_efs_permitted_throughput | PermittedThroughput | Measures the allowed throughput for EFS, helping monitor throughput limits. |
aws_efs_storage_bytes | StorageBytes | Tracks the total storage used by EFS, useful for managing storage capacity. |
aws_efs_total_iobytes | TotalIOBytes | Measures the total I/O operations, helping monitor overall file system performance. |
AWS/ELB
Function: Distributes traffic across multiple targets like EC2 instances and containers
Scrape interval: 5 minutes
Metric | Cloudwatch metric | Purpose | |
---|---|---|---|
aws_elb_info | |||
aws_elb_backend_connection_errors | BackendConnectionErrors | Tracks the number of connection errors between ELB and the backend instances, useful for identifying connection issues. | |
aws_elb_healthy_host_count | HealthyHostCount | Monitors the number of healthy backend instances, helping track instance health. | |
aws_elb_httpcode_backend_2_xx | HTTPCode_Backend_2XX | Tracks successful responses (2XX) from the backend, useful for monitoring backend application performance. | |
aws_elb_httpcode_backend_3_xx | HTTPCode_Backend_3XX | Measures redirection responses (3XX) from the backend, helping monitor routing performance. | |
aws_elb_httpcode_backend_4_xx | HTTPCode_Backend_4XX | Tracks client errors (4XX) from the backend, useful for identifying issues with client requests. | |
aws_elb_httpcode_backend_5_xx | HTTPCode_Backend_5XX | Monitors server errors (5XX) from the backend, helping track server-side issues. | |
aws_elb_httpcode_elb_4_xx | HTTPCode_ELB_4XX | Measures client errors (4XX) at the ELB level, useful for tracking errors handled by the ELB. | |
aws_elb_httpcode_elb_5_xx | HTTPCode_ELB_5XX | Tracks server errors (5XX) at the ELB level, helping monitor ELB server-side performance. | |
aws_elb_latency | Latency | Monitors the latency of requests through the ELB, useful for tracking response times. | |
aws_elb_request_count | RequestCount | Tracks the number of requests handled by the ELB, useful for monitoring traffic levels. | |
aws_elb_spillover_count | SpilloverCount | Measures the number of requests that were rejected due to lack of available resources, helping track capacity limitations. | |
aws_elb_surge_queue_length | SurgeQueueLength | Tracks the length of the request queue, useful for monitoring traffic surges. | |
aws_elb_un_healthy_host_count | UnHealthyHostCount | Monitors the number of unhealthy backend instances, helping identify infrastructure issues. | |
aws_elb_estimated_albactive_connection_count | EstimatedALBActiveConnectionCount | Tracks the number of active connections to the ALB, useful for monitoring load balancer usage. | |
aws_elb_estimated_albconsumed_lcus | EstimatedALBConsumedLCUs | Measures the load balancer capacity units (LCUs) consumed by the ALB, helping monitor resource usage. | |
aws_elb_estimated_albnew_connection_count | EstimatedALBNewConnectionCount | Tracks the number of new connections established with the ALB, useful for monitoring connection traffic. | |
aws_elb_estimated_processed_bytes | EstimatedProcessedBytes | Monitors the total bytes processed by the ALB, helping to track data flow through the load balancer. | |
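
The request and error counters above are often more useful as a ratio than as raw counts. The following is a minimal sketch of a backend error ratio, assuming both inputs are the Sum statistic of `RequestCount` and `HTTPCode_Backend_5XX` over the same period; the function name and sample values are hypothetical.

```python
def backend_5xx_ratio(request_count_sum: float, backend_5xx_sum: float) -> float:
    """Return the share of requests that ended in a backend 5XX response.

    Both arguments are assumed to be Sum values for the same time window.
    Returns 0.0 when no requests were recorded, to avoid dividing by zero.
    """
    if request_count_sum <= 0:
        return 0.0
    return backend_5xx_sum / request_count_sum


# Illustrative values only: 200 requests, 5 of them backend 5XX responses.
ratio = backend_5xx_ratio(200, 5)
print(f"{ratio:.1%}")
```
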
AWS/ES
Function: Managed Elasticsearch service for real-time search and analytics
Scrape interval: 5 minutes
Metric | Cloudwatch metric | Purpose | |
---|---|---|---|
aws_es_info | | Provides general information about the Elasticsearch service | |
aws_es_2xx | 2xx | Tracks successful requests to the Elasticsearch service | |
aws_es_3xx | 3xx | Tracks redirection requests to the Elasticsearch service | |
aws_es_4xx | 4xx | Tracks client error responses from the Elasticsearch service | |
aws_es_5xx | 5xx | Tracks server error responses from the Elasticsearch service | |
aws_es_adanomaly_detectors_index_status_red | ADAnomalyDetectorsIndexStatus.red | Indicates if the anomaly detection index is in a red (critical) state | |
aws_es_adanomaly_detectors_index_status_index_exists | ADAnomalyDetectorsIndexStatusIndexExists | Tracks whether the anomaly detection index exists or not | |
aws_es_adanomaly_results_index_status_red | ADAnomalyResultsIndexStatus.red | Indicates if the anomaly results index is in a red (critical) state | |
aws_es_adanomaly_results_index_status_index_exists | ADAnomalyResultsIndexStatusIndexExists | Tracks whether the anomaly results index exists or not | |
aws_es_adexecute_failure_count | ADExecuteFailureCount | Tracks the number of times anomaly detection execution has failed | |
aws_es_adexecute_request_count | ADExecuteRequestCount | Tracks the number of anomaly detection execution requests | |
aws_es_adhcexecute_failure_count | ADHCExecuteFailureCount | Tracks the number of high cardinality anomaly detection execution failures | |
aws_es_adhcexecute_request_count | ADHCExecuteRequestCount | Tracks the number of high cardinality anomaly detection execution requests | |
aws_es_admodels_checkpoint_index_status_red | ADModelsCheckpointIndexStatus.red | Indicates if the model checkpoint index is in a red (critical) state | |
aws_es_admodels_checkpoint_index_status_index_exists | ADModelsCheckpointIndexStatusIndexExists | Tracks whether the model checkpoint index exists | |
aws_es_adplugin_unhealthy | ADPluginUnhealthy | Indicates if the anomaly detection plugin is in an unhealthy state | |
aws_es_alerting_degraded | AlertingDegraded | Indicates if the alerting feature is in a degraded state | |
aws_es_alerting_index_exists | AlertingIndexExists | Tracks whether the alerting index exists | |
aws_es_alerting_index_status_green | AlertingIndexStatus.green | Indicates if the alerting index is in a green (healthy) state | |
aws_es_alerting_index_status_red | AlertingIndexStatus.red | Indicates if the alerting index is in a red (critical) state | |
aws_es_alerting_index_status_yellow | AlertingIndexStatus.yellow | Indicates if the alerting index is in a yellow (warning) state | |
aws_es_alerting_nodes_not_on_schedule | AlertingNodesNotOnSchedule | Tracks the number of nodes not on schedule for alerting | |
aws_es_alerting_nodes_on_schedule | AlertingNodesOnSchedule | Tracks the number of nodes on schedule for alerting | |
aws_es_alerting_scheduled_job_enabled | AlertingScheduledJobEnabled | Indicates if alerting scheduled jobs are enabled | |
aws_es_asynchronous_search_cancelled | AsynchronousSearchCancelled | Tracks the number of asynchronous search requests that were canceled | |
aws_es_asynchronous_search_completion_rate | AsynchronousSearchCompletionRate | Tracks the rate of successful asynchronous search completions | |
aws_es_asynchronous_search_failure_rate | AsynchronousSearchFailureRate | Tracks the rate of failed asynchronous search requests | |
aws_es_asynchronous_search_initialized_rate | AsynchronousSearchInitializedRate | Tracks the rate of initialized asynchronous search requests | |
aws_es_asynchronous_search_max_running_time | AsynchronousSearchMaxRunningTime | Tracks the maximum time taken by asynchronous search requests | |
aws_es_asynchronous_search_persist_failed_rate | AsynchronousSearchPersistFailedRate | Tracks the rate of failed attempts to persist asynchronous search results | |
aws_es_asynchronous_search_persist_rate | AsynchronousSearchPersistRate | Tracks the rate of successful attempts to persist asynchronous search results | |
aws_es_asynchronous_search_rejected | AsynchronousSearchRejected | Tracks the number of asynchronous search requests that were rejected | |
aws_es_asynchronous_search_running_current | AsynchronousSearchRunningCurrent | Tracks the number of currently running asynchronous search requests | |
aws_es_asynchronous_search_store_health | AsynchronousSearchStoreHealth | Tracks the health of the store for asynchronous search | |
aws_es_asynchronous_search_store_size | AsynchronousSearchStoreSize | Tracks the size of the asynchronous search store | |
aws_es_asynchronous_search_stored_response_count | AsynchronousSearchStoredResponseCount | Tracks the number of responses stored for asynchronous search | |
aws_es_asynchronous_search_submission_rate | AsynchronousSearchSubmissionRate | Tracks the rate of submitted asynchronous search requests | |
aws_es_auto_follow_leader_call_failure | AutoFollowLeaderCallFailure | Tracks the number of failures when trying to call the leader for cross-cluster replication | |
aws_es_auto_follow_num_failed_start_replication | AutoFollowNumFailedStartReplication | Tracks the number of failed attempts to start cross-cluster replication | |
aws_es_auto_follow_num_success_start_replication | AutoFollowNumSuccessStartReplication | Tracks the number of successful attempts to start cross-cluster replication | |
aws_es_auto_tune_changes_history_heap_size | AutoTuneChangesHistoryHeapSize | Tracks the heap size usage history for auto-tune changes | |
aws_es_auto_tune_changes_history_jvmyoung_gen_args | AutoTuneChangesHistoryJVMYoungGenArgs | Tracks JVM young generation arguments for auto-tune changes | |
aws_es_auto_tune_failed | AutoTuneFailed | Tracks the number of failed auto-tune attempts | |
aws_es_auto_tune_succeeded | AutoTuneSucceeded | Tracks the number of successful auto-tune attempts | |
aws_es_auto_tune_value | AutoTuneValue | Tracks the value of auto-tune changes | |
aws_es_automated_snapshot_failure | AutomatedSnapshotFailure | Tracks the number of failures in automated snapshots | |
aws_es_avg_point_in_time_alive_time | AvgPointInTimeAliveTime | Tracks the average time that point-in-time (PIT) search contexts stay alive | |
aws_es_burst_balance | BurstBalance | Tracks the burst balance for the service | |
aws_es_cpucredit_balance | CPUCreditBalance | Tracks the balance of CPU credits for the nodes | |
aws_es_cpuutilization | CPUUtilization | Tracks the CPU utilization of the nodes | |
aws_es_cluster_index_writes_blocked | ClusterIndexWritesBlocked | Tracks whether index writes are blocked at the cluster level | |
aws_es_cluster_status_green | ClusterStatus.green | Indicates if the cluster is in a green (healthy) state | |
aws_es_cluster_status_red | ClusterStatus.red | Indicates if the cluster is in a red (critical) state | |
aws_es_cluster_status_yellow | ClusterStatus.yellow | Indicates if the cluster is in a yellow (warning) state | |
aws_es_cluster_used_space | ClusterUsedSpace | Tracks the amount of used storage space in the cluster | |
aws_es_cold_storage_space_utilization | ColdStorageSpaceUtilization | Tracks the storage utilization of cold data | |
aws_es_cold_to_warm_migration_failure_count | ColdToWarmMigrationFailureCount | Tracks the number of failures during migration from cold to warm storage | |
aws_es_cold_to_warm_migration_latency | ColdToWarmMigrationLatency | Tracks the latency of migration from cold to warm storage | |
aws_es_cold_to_warm_migration_queue_size | ColdToWarmMigrationQueueSize | Tracks the queue size for migration from cold to warm storage | |
aws_es_cold_to_warm_migration_success_count | ColdToWarmMigrationSuccessCount | Tracks the number of successful migrations from cold to warm storage | |
aws_es_coordinating_write_rejected | CoordinatingWriteRejected | Tracks the number of rejected coordinating node write requests | |
aws_es_cross_cluster_inbound_replication_requests | CrossClusterInboundReplicationRequests | Tracks the number of inbound replication requests for cross-cluster replication | |
aws_es_cross_cluster_inbound_requests | CrossClusterInboundRequests | Tracks the number of inbound requests for cross-cluster replication | |
aws_es_cross_cluster_outbound_connections | CrossClusterOutboundConnections | Tracks the number of outbound connections for cross-cluster replication | |
aws_es_cross_cluster_outbound_replication_requests | CrossClusterOutboundReplicationRequests | Tracks the number of outbound replication requests for cross-cluster replication | |
aws_es_cross_cluster_outbound_requests | CrossClusterOutboundRequests | Tracks the number of outbound requests for cross-cluster replication | |
aws_es_current_point_in_time | CurrentPointInTime | Tracks the number of currently active point-in-time (PIT) search contexts | |
aws_es_data_nodes | DataNodes | Tracks the number of data nodes in the Elasticsearch cluster | |
aws_es_data_nodes_shards_active | DataNodesShards.active | Tracks the number of active shards on data nodes | |
aws_es_data_nodes_shards_initializing | DataNodesShards.initializing | Tracks the number of shards that are initializing on data nodes | |
aws_es_data_nodes_shards_relocating | DataNodesShards.relocating | Tracks the number of shards that are relocating on data nodes | |
aws_es_data_nodes_shards_unassigned | DataNodesShards.unassigned | Tracks the number of unassigned shards on data nodes | |
aws_es_deleted_documents | DeletedDocuments | Tracks the number of deleted documents from the Elasticsearch cluster | |
aws_es_disk_queue_depth | DiskQueueDepth | Tracks the depth of the disk queue | |
aws_es_reporting_failed_request_sys_err_count | ESReportingFailedRequestSysErrCount | Tracks the number of failed reporting requests due to system errors | |
aws_es_reporting_failed_request_user_err_count | ESReportingFailedRequestUserErrCount | Tracks the number of failed reporting requests due to user errors | |
aws_es_reporting_request_count | ESReportingRequestCount | Tracks the number of reporting requests submitted to Elasticsearch | |
aws_es_reporting_success_count | ESReportingSuccessCount | Tracks the number of successful reporting requests | |
aws_es_elasticsearch_requests | ElasticsearchRequests | Tracks the number of requests to Elasticsearch | |
aws_es_follower_check_point | FollowerCheckPoint | Tracks the checkpoint of a follower node in cross-cluster replication | |
aws_es_free_storage_space | FreeStorageSpace | Tracks the available storage space in the Elasticsearch cluster | |
aws_es_has_active_point_in_time | HasActivePointInTime | Indicates whether there is an active point-in-time (PIT) search context | |
aws_es_has_used_point_in_time | HasUsedPointInTime | Indicates whether a point-in-time (PIT) search context has been used | |
aws_es_hot_storage_space_utilization | HotStorageSpaceUtilization | Tracks the storage utilization of hot data | |
aws_es_hot_to_warm_migration_failure_count | HotToWarmMigrationFailureCount | Tracks the number of failures during migration from hot to warm storage | |
aws_es_hot_to_warm_migration_force_merge_latency | HotToWarmMigrationForceMergeLatency | Tracks the latency of force merging during migration from hot to warm storage | |
aws_es_hot_to_warm_migration_processing_latency | HotToWarmMigrationProcessingLatency | Tracks the latency of processing migration from hot to warm storage | |
aws_es_hot_to_warm_migration_queue_size | HotToWarmMigrationQueueSize | Tracks the queue size for migration from hot to warm storage | |
aws_es_hot_to_warm_migration_snapshot_latency | HotToWarmMigrationSnapshotLatency | Tracks the latency of snapshotting during migration from hot to warm storage | |
aws_es_hot_to_warm_migration_success_count | HotToWarmMigrationSuccessCount | Tracks the number of successful migrations from hot to warm storage | |
aws_es_hot_to_warm_migration_success_latency | HotToWarmMigrationSuccessLatency | Tracks the latency of successful migrations from hot to warm storage | |
aws_es_indexing_latency | IndexingLatency | Tracks the latency of indexing documents in the Elasticsearch cluster | |
aws_es_indexing_rate | IndexingRate | Tracks the rate of indexing documents in the Elasticsearch cluster | |
aws_es_invalid_host_header_requests | InvalidHostHeaderRequests | Tracks the number of requests with invalid host headers | |
aws_es_iops_throttle | IopsThrottle | Tracks throttling of input/output operations | |
aws_es_jvmgcold_collection_count | JVMGCOldCollectionCount | Tracks the number of garbage collection events in the old generation of JVM | |
aws_es_jvmgcold_collection_time | JVMGCOldCollectionTime | Tracks the time spent in garbage collection in the old generation of JVM | |
aws_es_jvmgcyoung_collection_count | JVMGCYoungCollectionCount | Tracks the number of garbage collection events in the young generation of JVM | |
aws_es_jvmgcyoung_collection_time | JVMGCYoungCollectionTime | Tracks the time spent in garbage collection in the young generation of JVM | |
aws_es_jvmmemory_pressure | JVMMemoryPressure | Tracks memory pressure on the JVM used by Elasticsearch | |
aws_es_kmskey_error | KMSKeyError | Tracks the number of errors related to KMS keys used by the Elasticsearch cluster | |
aws_es_kmskey_inaccessible | KMSKeyInaccessible | Tracks the number of times a KMS key is inaccessible for the Elasticsearch cluster | |
aws_es_knncache_capacity_reached | KNNCacheCapacityReached | Tracks when the KNN cache capacity is reached | |
aws_es_knncircuit_breaker_triggered | KNNCircuitBreakerTriggered | Tracks when the KNN circuit breaker is triggered | |
aws_es_knneviction_count | KNNEvictionCount | Tracks the number of evictions from the KNN cache | |
aws_es_knngraph_index_errors | KNNGraphIndexErrors | Tracks errors during KNN graph indexing | |
aws_es_knngraph_index_requests | KNNGraphIndexRequests | Tracks the number of KNN graph index requests | |
aws_es_knngraph_memory_usage | KNNGraphMemoryUsage | Tracks memory usage by the KNN graph | |
aws_es_knngraph_query_errors | KNNGraphQueryErrors | Tracks errors during KNN graph queries | |
aws_es_knngraph_query_requests | KNNGraphQueryRequests | Tracks the number of KNN graph query requests | |
aws_es_knnhit_count | KNNHitCount | Tracks the number of hits returned by KNN queries | |
aws_es_knnload_exception_count | KNNLoadExceptionCount | Tracks the number of exceptions during KNN data loading | |
aws_es_knnload_success_count | KNNLoadSuccessCount | Tracks the number of successful KNN data load operations | |
aws_es_knnmiss_count | KNNMissCount | Tracks the number of KNN cache misses | |
aws_es_knnquery_requests | KNNQueryRequests | Tracks the number of KNN queries | |
aws_es_knnscript_compilation_errors | KNNScriptCompilationErrors | Tracks the number of errors during KNN script compilation | |
aws_es_knnscript_compilations | KNNScriptCompilations | Tracks the number of KNN script compilations | |
aws_es_knnscript_query_errors | KNNScriptQueryErrors | Tracks errors during KNN script queries | |
aws_es_knnscript_query_requests | KNNScriptQueryRequests | Tracks the number of KNN script queries | |
aws_es_knntotal_load_time | KNNTotalLoadTime | Tracks the total load time for KNN operations | |
aws_es_kibana_concurrent_connections | KibanaConcurrentConnections | Tracks the number of concurrent Kibana connections | |
aws_es_kibana_healthy_nodes | KibanaHealthyNodes | Tracks the number of healthy Kibana nodes | |
aws_es_kibana_heap_total | KibanaHeapTotal | Tracks the total heap size of Kibana | |
aws_es_kibana_heap_used | KibanaHeapUsed | Tracks the heap size used by Kibana | |
aws_es_kibana_heap_utilization | KibanaHeapUtilization | Tracks the heap utilization of Kibana | |
aws_es_kibana_os1_minute_load | KibanaOS1MinuteLoad | Tracks the 1-minute load average of the Kibana node’s operating system | |
aws_es_kibana_reporting_failed_request_sys_err_count | KibanaReportingFailedRequestSysErrCount | Tracks the number of failed Kibana reporting requests due to system errors | |
aws_es_kibana_reporting_failed_request_user_err_count | KibanaReportingFailedRequestUserErrCount | Tracks the number of failed Kibana reporting requests due to user errors | |
aws_es_kibana_reporting_request_count | KibanaReportingRequestCount | Tracks the number of Kibana reporting requests | |
aws_es_kibana_reporting_success_count | KibanaReportingSuccessCount | Tracks the number of successful Kibana reporting requests | |
aws_es_kibana_request_total | KibanaRequestTotal | Tracks the total number of requests sent to Kibana | |
aws_es_kibana_response_times_max_in_millis | KibanaResponseTimesMaxInMillis | Tracks the maximum response time of Kibana requests in milliseconds | |
aws_es_ltrfeature_memory_usage_in_bytes | LTRFeatureMemoryUsageInBytes | Tracks memory usage by LTR features in bytes | |
aws_es_ltrfeatureset_memory_usage_in_bytes | LTRFeaturesetMemoryUsageInBytes | Tracks memory usage by LTR feature sets in bytes | |
aws_es_ltrmemory_usage | LTRMemoryUsage | Tracks overall memory usage by LTR features | |
aws_es_ltrmodel_memory_usage_in_bytes | LTRModelMemoryUsageInBytes | Tracks memory usage by LTR models in bytes | |
aws_es_ltrrequest_error_count | LTRRequestErrorCount | Tracks the number of errors in LTR requests | |
aws_es_ltrrequest_total_count | LTRRequestTotalCount | Tracks the total number of LTR requests | |
aws_es_ltrstatus_red | LTRStatus.red | Indicates if the LTR status is in a red (critical) state | |
aws_es_leader_check_point | LeaderCheckPoint | Tracks the checkpoint of the leader node in cross-cluster replication | |
aws_es_master_cpucredit_balance | MasterCPUCreditBalance | Tracks the balance of CPU credits for the master node | |
aws_es_master_cpuutilization | MasterCPUUtilization | Tracks the CPU utilization of the master node | |
aws_es_master_free_storage_space | MasterFreeStorageSpace | Tracks the free storage space available on the master node | |
aws_es_master_jvmmemory_pressure | MasterJVMMemoryPressure | Tracks JVM memory pressure on the master node | |
aws_es_master_old_gen_jvmmemory_pressure | MasterOldGenJVMMemoryPressure | Tracks old generation JVM memory pressure on the master node | |
aws_es_master_reachable_from_node | MasterReachableFromNode | Tracks whether the master node is reachable from the data nodes | |
aws_es_master_sys_memory_utilization | MasterSysMemoryUtilization | Tracks system memory utilization of the master node | |
aws_es_max_provisioned_throughput | MaxProvisionedThroughput | Tracks the maximum provisioned throughput for Elasticsearch | |
aws_es_nodes | Nodes | Tracks the number of nodes in the Elasticsearch cluster | |
aws_es_old_gen_jvmmemory_pressure | OldGenJVMMemoryPressure | Tracks old generation JVM memory pressure on the nodes | |
aws_es_open_search_dashboards_concurrent_connections | OpenSearchDashboardsConcurrentConnections | Tracks the number of concurrent connections to OpenSearch Dashboards | |
aws_es_open_search_dashboards_healthy_node | OpenSearchDashboardsHealthyNode | Tracks the number of healthy OpenSearch Dashboard nodes | |
aws_es_open_search_dashboards_healthy_nodes | OpenSearchDashboardsHealthyNodes | Tracks the number of healthy OpenSearch Dashboard nodes | |
aws_es_open_search_dashboards_heap_total | OpenSearchDashboardsHeapTotal | Tracks the total heap size of OpenSearch Dashboards | |
aws_es_open_search_dashboards_heap_used | OpenSearchDashboardsHeapUsed | Tracks the heap size used by OpenSearch Dashboards | |
aws_es_open_search_dashboards_heap_utilization | OpenSearchDashboardsHeapUtilization | Tracks the heap utilization of OpenSearch Dashboards | |
aws_es_open_search_dashboards_os1_minute_load | OpenSearchDashboardsOS1MinuteLoad | Tracks the 1-minute load average of the OpenSearch Dashboards node’s operating system | |
aws_es_open_search_dashboards_request_total | OpenSearchDashboardsRequestTotal | Tracks the total number of requests sent to OpenSearch Dashboards | |
aws_es_open_search_dashboards_response_times_max_in_millis | OpenSearchDashboardsResponseTimesMaxInMillis | Tracks the maximum response time of OpenSearch Dashboards requests in milliseconds | |
aws_es_open_search_requests | OpenSearchRequests | Tracks the number of requests to OpenSearch | |
aws_es_opensearch_dashboards_reporting_failed_request_sys_err_count | OpensearchDashboardsReportingFailedRequestSysErrCount | Tracks the number of failed OpenSearch Dashboards reporting requests due to system errors | |
aws_es_opensearch_dashboards_reporting_failed_request_user_err_count | OpensearchDashboardsReportingFailedRequestUserErrCount | Tracks the number of failed OpenSearch Dashboards reporting requests due to user errors | |
aws_es_opensearch_dashboards_reporting_request_count | OpensearchDashboardsReportingRequestCount | Tracks the number of OpenSearch Dashboards reporting requests | |
aws_es_opensearch_dashboards_reporting_success_count | OpensearchDashboardsReportingSuccessCount | Tracks the number of successful OpenSearch Dashboards reporting requests | |
aws_es_pplfailed_request_count_by_cus_err | PPLFailedRequestCountByCusErr | Tracks the number of PPL failed requests due to customer errors | |
aws_es_pplfailed_request_count_by_sys_err | PPLFailedRequestCountBySysErr | Tracks the number of PPL failed requests due to system errors | |
aws_es_pplrequest_count | PPLRequestCount | Tracks the total number of PPL requests | |
aws_es_primary_write_rejected | PrimaryWriteRejected | Tracks the number of rejected primary write requests | |
aws_es_read_iops | ReadIOPS | Tracks input/output operations per second for reads | |
aws_es_read_iopsmicro_bursting | ReadIOPSMicroBursting | Tracks micro-bursting of input/output operations for reads | |
aws_es_read_latency | ReadLatency | Tracks the latency of read operations in the Elasticsearch cluster | |
aws_es_read_throughput | ReadThroughput | Tracks the throughput of read operations | |
aws_es_read_throughput_micro_bursting | ReadThroughputMicroBursting | Tracks micro-bursting of read throughput | |
aws_es_remote_storage_used_space | RemoteStorageUsedSpace | Tracks the amount of used space in remote storage | |
aws_es_remote_storage_write_rejected | RemoteStorageWriteRejected | Tracks the number of rejected write operations in remote storage | |
aws_es_replica_write_rejected | ReplicaWriteRejected | Tracks the number of rejected replica write requests | |
aws_es_replication_num_bootstrapping_indices | ReplicationNumBootstrappingIndices | Tracks the number of indices in the bootstrapping state for replication | |
aws_es_replication_num_failed_indices | ReplicationNumFailedIndices | Tracks the number of failed replication indices | |
aws_es_replication_num_paused_indices | ReplicationNumPausedIndices | Tracks the number of paused replication indices | |
aws_es_replication_num_syncing_indices | ReplicationNumSyncingIndices | Tracks the number of replication indices currently syncing | |
aws_es_replication_rate | ReplicationRate | Tracks the rate of replication in Elasticsearch | |
aws_es_sqldefault_cursor_request_count | SQLDefaultCursorRequestCount | Tracks the number of default SQL cursor requests | |
aws_es_sqlfailed_request_count_by_cus_err | SQLFailedRequestCountByCusErr | Tracks the number of SQL failed requests due to customer errors | |
aws_es_sqlfailed_request_count_by_sys_err | SQLFailedRequestCountBySysErr | Tracks the number of SQL failed requests due to system errors | |
aws_es_sqlrequest_count | SQLRequestCount | Tracks the total number of SQL requests | |
aws_es_sqlunhealthy | SQLUnhealthy | Tracks whether the SQL plugin is in an unhealthy state | |
aws_es_search_latency | SearchLatency | Tracks the latency of search operations in the Elasticsearch cluster | |
aws_es_search_rate | SearchRate | Tracks the rate of search operations | |
aws_es_search_shard_task_cancelled | SearchShardTaskCancelled | Tracks the number of search shard tasks that were canceled | |
aws_es_search_task_cancelled | SearchTaskCancelled | Tracks the number of canceled search tasks | |
aws_es_searchable_documents | SearchableDocuments | Tracks the number of searchable documents | |
aws_es_segment_count | SegmentCount | Tracks the number of segments in the Elasticsearch cluster | |
aws_es_shards_active | Shards.active | Tracks the number of active shards | |
aws_es_shards_active_primary | Shards.activePrimary | Tracks the number of active primary shards | |
aws_es_shards_delayed_unassigned | Shards.delayedUnassigned | Tracks the number of delayed unassigned shards | |
aws_es_shards_initializing | Shards.initializing | Tracks the number of initializing shards | |
aws_es_shards_relocating | Shards.relocating | Tracks the number of relocating shards | |
aws_es_shards_unassigned | Shards.unassigned | Tracks the number of unassigned shards | |
aws_es_sys_memory_utilization | SysMemoryUtilization | Tracks system memory utilization | |
aws_es_threadpool_bulk_queue | ThreadpoolBulkQueue | Tracks the size of the bulk thread pool queue | |
aws_es_threadpool_bulk_rejected | ThreadpoolBulkRejected | Tracks the number of bulk thread pool tasks that were rejected | |
aws_es_threadpool_bulk_threads | ThreadpoolBulkThreads | Tracks the number of active threads in the bulk thread pool | |
aws_es_threadpool_force_merge_queue | ThreadpoolForce_mergeQueue | Tracks the size of the force merge thread pool queue | |
aws_es_threadpool_force_merge_rejected | ThreadpoolForce_mergeRejected | Tracks the number of force merge thread pool tasks that were rejected | |
aws_es_threadpool_force_merge_threads | ThreadpoolForce_mergeThreads | Tracks the number of active threads in the force merge thread pool | |
aws_es_threadpool_index_queue | ThreadpoolIndexQueue | Tracks the size of the index thread pool queue | |
aws_es_threadpool_index_rejected | ThreadpoolIndexRejected | Tracks the number of index thread pool tasks that were rejected | |
aws_es_threadpool_index_threads | ThreadpoolIndexThreads | Tracks the number of active threads in the index thread pool | |
aws_es_threadpool_search_queue | ThreadpoolSearchQueue | Tracks the size of the search thread pool queue | |
aws_es_threadpool_search_rejected | ThreadpoolSearchRejected | Tracks the number of search thread pool tasks that were rejected | |
aws_es_threadpool_search_threads | ThreadpoolSearchThreads | Tracks the number of active threads in the search thread pool | |
aws_es_threadpool_write_queue | ThreadpoolWriteQueue | Tracks the size of the write thread pool queue | |
aws_es_threadpool_write_rejected | ThreadpoolWriteRejected | Tracks the number of write thread pool tasks that were rejected | |
aws_es_threadpool_write_threads | ThreadpoolWriteThreads | Tracks the number of active threads in the write thread pool | |
aws_es_threadpoolsql_worker_queue | Threadpoolsql-workerQueue | Tracks the size of the SQL worker thread pool queue | |
aws_es_threadpoolsql_worker_rejected | Threadpoolsql-workerRejected | Tracks the number of SQL worker thread pool tasks that were rejected | |
aws_es_threadpoolsql_worker_threads | Threadpoolsql-workerThreads | Tracks the number of active threads in the SQL worker thread pool | |
aws_es_throughput_throttle | ThroughputThrottle | Tracks throttling of throughput in the Elasticsearch cluster | |
aws_es_total_point_in_time | TotalPointInTime | Tracks the total number of point-in-time snapshots | |
aws_es_warm_cpuutilization | WarmCPUUtilization | Tracks the CPU utilization of warm data nodes | |
aws_es_warm_free_storage_space | WarmFreeStorageSpace | Tracks the available storage space in warm data nodes | |
aws_es_warm_jvmgcold_collection_count | WarmJVMGCOldCollectionCount | Tracks the number of garbage collection events in the old generation of JVM on warm data nodes | |
aws_es_warm_jvmgcyoung_collection_count | WarmJVMGCYoungCollectionCount | Tracks the number of garbage collection events in the young generation of JVM on warm data nodes | |
aws_es_warm_jvmgcyoung_collection_time | WarmJVMGCYoungCollectionTime | Tracks the time spent in garbage collection in the young generation of JVM on warm data nodes | |
aws_es_warm_jvmmemory_pressure | WarmJVMMemoryPressure | Tracks memory pressure on warm data nodes | |
aws_es_warm_old_gen_jvmmemory_pressure | WarmOldGenJVMMemoryPressure | Tracks old generation JVM memory pressure on warm data nodes | |
aws_es_warm_search_latency | WarmSearchLatency | Tracks the latency of search operations on warm data nodes | |
aws_es_warm_search_rate | WarmSearchRate | Tracks the rate of search operations on warm data nodes | |
aws_es_warm_searchable_documents | WarmSearchableDocuments | Tracks the number of searchable documents on warm data nodes | |
aws_es_warm_storage_space_utilization | WarmStorageSpaceUtilization | Tracks storage space utilization on warm data nodes | |
aws_es_warm_sys_memory_utilization | WarmSysMemoryUtilization | Tracks system memory utilization on warm data nodes | |
aws_es_warm_threadpool_search_queue | WarmThreadpoolSearchQueue | Tracks the size of the search thread pool queue on warm data nodes | |
aws_es_warm_threadpool_search_rejected | WarmThreadpoolSearchRejected | Tracks the number of search thread pool tasks that were rejected on warm data nodes | |
aws_es_warm_threadpool_search_threads | WarmThreadpoolSearchThreads | Tracks the number of active threads in the search thread pool on warm data nodes | |
aws_es_warm_to_cold_migration_failure_count | WarmToColdMigrationFailureCount | Tracks the number of failures during migration from warm to cold storage | |
aws_es_warm_to_cold_migration_latency | WarmToColdMigrationLatency | Tracks the latency of migration from warm to cold storage | |
aws_es_warm_to_cold_migration_queue_size | WarmToColdMigrationQueueSize | Tracks the queue size for migration from warm to cold storage | |
aws_es_warm_to_cold_migration_success_count | WarmToColdMigrationSuccessCount | Tracks the number of successful migrations from warm to cold storage | |
aws_es_warm_to_hot_migration_queue_size | WarmToHotMigrationQueueSize | Tracks the queue size for migration from warm to hot storage | |
aws_es_write_iops | WriteIOPS | Tracks input/output operations per second for writes | |
aws_es_write_iopsmicro_bursting | WriteIOPSMicroBursting | Tracks micro-bursting of input/output operations for writes | |
aws_es_write_latency | WriteLatency | Tracks the latency of write operations in the Elasticsearch cluster | |
aws_es_write_throughput | WriteThroughput | Tracks the throughput of write operations | |
aws_es_write_throughput_micro_bursting | WriteThroughputMicroBursting | Tracks micro-bursting of write throughput |
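The exported names in these tables appear to follow a mechanical mapping from the CloudWatch metric name: an underscore is inserted where a lowercase letter or digit is followed by an uppercase letter, the result is lowercased, characters such as `.` and `-` become underscores, and the lowercased namespace is used as a prefix. The following Python sketch reproduces that pattern as inferred from the rows listed here. The function names `to_snake` and `exported_name` are hypothetical, and the rule that drops a leading repeat of the service name is an assumption based on rows such as `ESReportingRequestCount`; this is not the exporter's actual implementation.

```python
import re

# Boundary between a lowercase letter or digit and a following uppercase letter.
_CAMEL_BOUNDARY = re.compile(r"([a-z0-9])([A-Z])")
# Any single character that is not a lowercase letter or digit (applied after lowercasing).
_NON_ALNUM = re.compile(r"[^a-z0-9]")


def to_snake(name: str) -> str:
    """Convert a CloudWatch-style name such as 'Shards.activePrimary' to snake_case."""
    with_breaks = _CAMEL_BOUNDARY.sub(r"\1_\2", name)
    return _NON_ALNUM.sub("_", with_breaks.lower())


def exported_name(namespace: str, cloudwatch_metric: str) -> str:
    """Guess the exported metric name for a namespace such as 'AWS/ES'."""
    # The namespace is lowercased as a whole, so 'AWS/ElastiCache' -> 'aws_elasticache'.
    prefix = _NON_ALNUM.sub("_", namespace.lower())
    service = prefix.split("_")[-1]
    metric = to_snake(cloudwatch_metric)
    # Assumption: a leading repeat of the service name is dropped,
    # as in ESReportingRequestCount -> aws_es_reporting_request_count.
    if metric.startswith(service):
        metric = metric[len(service):].lstrip("_")
    return f"{prefix}_{metric}"


if __name__ == "__main__":
    for cw_name in ("JVMGCOldCollectionCount", "Shards.activePrimary", "ESReportingRequestCount"):
        print(cw_name, "->", exported_name("AWS/ES", cw_name))
```

Because runs of capitals are never split, acronyms fuse with the word that follows them, which is why `JVMGCOldCollectionCount` maps to `jvmgcold_collection_count` rather than `jvm_gc_old_collection_count`.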
AWS/ElastiCache
Function: Managed Redis and Memcached for real-time caching
Scrape interval: 5 minutes
Metric | Cloudwatch metric | Purpose | |
---|---|---|---|
aws_elasticache_info | |||
aws_elasticache_active_defrag_hits | ActiveDefragHits | Tracks the number of active defragmentation hits in ElastiCache | |
aws_elasticache_authentication_failures | AuthenticationFailures | Monitors the number of failed authentication attempts | |
aws_elasticache_bytes_read_from_disk | BytesReadFromDisk | Measures the number of bytes read from disk in the ElastiCache cluster | |
aws_elasticache_bytes_read_into_memcached | BytesReadIntoMemcached | Tracks the number of bytes read into Memcached | |
aws_elasticache_bytes_used_for_cache | BytesUsedForCache | Monitors the total amount of memory used for cache | |
aws_elasticache_bytes_used_for_cache_items | BytesUsedForCacheItems | Measures the memory used by items in cache | |
aws_elasticache_bytes_used_for_hash | BytesUsedForHash | Tracks memory used for hash tables in the cache | |
aws_elasticache_bytes_used_for_memory_db | BytesUsedForMemoryDB | Monitors memory usage for MemoryDB in ElastiCache | |
aws_elasticache_bytes_written_out_from_memcached | BytesWrittenOutFromMemcached | Tracks the number of bytes written out from Memcached | |
aws_elasticache_bytes_written_to_disk | BytesWrittenToDisk | Measures the number of bytes written to disk in the ElastiCache cluster | |
aws_elasticache_cpucredit_balance | CPUCreditBalance | Tracks the balance of CPU credits for burstable instance types | |
aws_elasticache_cpucredit_usage | CPUCreditUsage | Monitors CPU credit usage for burstable instance types | |
aws_elasticache_cpuutilization | CPUUtilization | Measures the CPU utilization of the ElastiCache instance | |
aws_elasticache_cache_hit_rate | CacheHitRate | Tracks the cache hit rate, indicating how often requested data is found in cache | |
aws_elasticache_cache_hits | CacheHits | Measures the total number of cache hits | |
aws_elasticache_cache_misses | CacheMisses | Tracks the number of cache misses, when requested data is not found in cache | |
aws_elasticache_cas_badval | CasBadval | Monitors the number of CAS operations that failed due to bad values | |
aws_elasticache_cas_hits | CasHits | Tracks the number of successful CAS operations | |
aws_elasticache_cas_misses | CasMisses | Measures the number of CAS requests where the requested key was not found | |
aws_elasticache_channel_authorization_failures | ChannelAuthorizationFailures | Tracks the number of channel authorization failures | |
aws_elasticache_cluster_based_cmds | ClusterBasedCmds | Monitors the number of cluster-based commands executed | |
aws_elasticache_cluster_based_cmds_latency | ClusterBasedCmdsLatency | Tracks the latency of cluster-based commands | |
aws_elasticache_cmd_config_get | CmdConfigGet | Measures the number of configuration GET commands executed | |
aws_elasticache_cmd_config_set | CmdConfigSet | Tracks the number of configuration SET commands executed | |
aws_elasticache_cmd_flush | CmdFlush | Monitors the number of flush commands executed in the ElastiCache cluster | |
aws_elasticache_cmd_get | CmdGet | Tracks the number of GET commands executed in the cache | |
aws_elasticache_cmd_set | CmdSet | Measures the number of SET commands executed in the cache | |
aws_elasticache_cmd_touch | CmdTouch | Tracks the number of touch commands executed in the cache | |
aws_elasticache_command_authorization_failures | CommandAuthorizationFailures | Monitors the number of command authorization failures in the ElastiCache cluster | |
aws_elasticache_curr_config | CurrConfig | Tracks the current configuration state of the ElastiCache instance | |
aws_elasticache_curr_connections | CurrConnections | Measures the current number of open connections to the ElastiCache instance | |
aws_elasticache_curr_items | CurrItems | Tracks the current number of items in the cache | |
aws_elasticache_curr_volatile_items | CurrVolatileItems | Monitors the number of volatile items in the cache | |
aws_elasticache_db0_average_ttl | DB0AverageTTL | Measures the average time-to-live (TTL) of items in the cache | |
**aws_elasticache_database_capacity_usage_counted_for_evict_percentage** | **DatabaseCapacityUsageCountedForEvictPercentage** | Tracks the percentage of database capacity usage considered for eviction | |
aws_elasticache_database_capacity_usage_percentage | DatabaseCapacityUsagePercentage | Monitors the overall percentage of database capacity usage | |
**aws_elasticache_database_memory_usage_counted_for_evict_percentage** | **DatabaseMemoryUsageCountedForEvictPercentage** | Tracks the percentage of database memory usage considered for eviction | |
aws_elasticache_database_memory_usage_percentage | DatabaseMemoryUsagePercentage | Measures the overall memory usage percentage in the ElastiCache cluster | |
aws_elasticache_decr_hits | DecrHits | Monitors the number of successful DECR (decrement) operations | |
aws_elasticache_decr_misses | DecrMisses | Tracks the number of DECR requests where the requested key was not found | |
aws_elasticache_delete_hits | DeleteHits | Measures the number of successful DELETE operations | |
aws_elasticache_delete_misses | DeleteMisses | Tracks the number of DELETE requests where the requested key was not found | |
aws_elasticache_engine_cpuutilization | EngineCPUUtilization | Monitors the CPU utilization of the ElastiCache engine | |
aws_elasticache_eval_based_cmds | EvalBasedCmds | Tracks the number of EVAL-based commands executed in the cache | |
aws_elasticache_eval_based_cmds_latency | EvalBasedCmdsLatency | Measures the latency of EVAL-based commands in the cache | |
aws_elasticache_evicted_unfetched | EvictedUnfetched | Monitors the number of items evicted before being fetched | |
aws_elasticache_evictions | Evictions | Tracks the total number of evictions in the cache | |
aws_elasticache_expired_unfetched | ExpiredUnfetched | Measures the number of items that expired before being fetched | |
aws_elasticache_freeable_memory | FreeableMemory | Tracks the amount of free memory available in the ElastiCache cluster | |
aws_elasticache_geo_spatial_based_cmds | GeoSpatialBasedCmds | Monitors the number of geospatial commands executed | |
aws_elasticache_geo_spatial_based_cmds_latency | GeoSpatialBasedCmdsLatency | Measures the latency of geospatial commands | |
aws_elasticache_get_hits | GetHits | Tracks the number of successful GET operations in the cache | |
aws_elasticache_get_misses | GetMisses | Measures the number of GET requests where the requested key was not found | |
aws_elasticache_get_type_cmds | GetTypeCmds | Monitors the number of GET-type commands executed | |
aws_elasticache_get_type_cmds_latency | GetTypeCmdsLatency | Measures the latency of GET-type commands executed | |
aws_elasticache_global_datastore_replication_lag | GlobalDatastoreReplicationLag | ||
aws_elasticache_hash_based_cmds | HashBasedCmds | ||
aws_elasticache_hash_based_cmds_latency | HashBasedCmdsLatency | ||
aws_elasticache_hyper_log_log_based_cmds | HyperLogLogBasedCmds | ||
aws_elasticache_hyper_log_log_based_cmds_latency | HyperLogLogBasedCmdsLatency | ||
aws_elasticache_iam_authentication_expirations | IamAuthenticationExpirations | ||
aws_elasticache_iam_authentication_throttling | IamAuthenticationThrottling | ||
aws_elasticache_incr_hits | IncrHits | ||
aws_elasticache_incr_misses | IncrMisses | ||
aws_elasticache_is_master | IsMaster | ||
aws_elasticache_is_primary | IsPrimary | ||
aws_elasticache_json_based_cmds | JsonBasedCmds | ||
aws_elasticache_json_based_cmds_latency | JsonBasedCmdsLatency | ||
aws_elasticache_json_based_get_cmds | JsonBasedGetCmds | ||
aws_elasticache_key_authorization_failures | KeyAuthorizationFailures | ||
aws_elasticache_key_based_cmds | KeyBasedCmds | ||
aws_elasticache_key_based_cmds_latency | KeyBasedCmdsLatency | ||
aws_elasticache_keys_tracked | KeysTracked | ||
aws_elasticache_keyspace_hits | KeyspaceHits | ||
aws_elasticache_keyspace_misses | KeyspaceMisses | ||
aws_elasticache_list_based_cmds | ListBasedCmds | ||
aws_elasticache_list_based_cmds_latency | ListBasedCmdsLatency | ||
aws_elasticache_master_link_health_status | MasterLinkHealthStatus | ||
aws_elasticache_max_replication_throughput | MaxReplicationThroughput | ||
aws_elasticache_memory_fragmentation_ratio | MemoryFragmentationRatio | ||
aws_elasticache_network_bandwidth_in_allowance_exceeded | NetworkBandwidthInAllowanceExceeded | ||
aws_elasticache_network_bandwidth_out_allowance_exceeded | NetworkBandwidthOutAllowanceExceeded | ||
aws_elasticache_network_bytes_in | NetworkBytesIn | ||
aws_elasticache_network_bytes_out | NetworkBytesOut | ||
aws_elasticache_network_conntrack_allowance_exceeded | NetworkConntrackAllowanceExceeded | ||
aws_elasticache_network_link_local_allowance_exceeded | NetworkLinkLocalAllowanceExceeded | ||
aws_elasticache_network_max_bytes_in | NetworkMaxBytesIn | ||
aws_elasticache_network_max_bytes_out | NetworkMaxBytesOut | ||
aws_elasticache_network_max_packets_in | NetworkMaxPacketsIn | ||
aws_elasticache_network_max_packets_out | NetworkMaxPacketsOut | ||
aws_elasticache_network_packets_in | NetworkPacketsIn | ||
aws_elasticache_network_packets_out | NetworkPacketsOut | ||
aws_elasticache_network_packets_per_second_allowance_exceeded | NetworkPacketsPerSecondAllowanceExceeded | ||
aws_elasticache_new_connections | NewConnections | ||
aws_elasticache_new_items | NewItems | ||
aws_elasticache_num_items_read_from_disk | NumItemsReadFromDisk | ||
aws_elasticache_num_items_written_to_disk | NumItemsWrittenToDisk | ||
aws_elasticache_primary_link_health_status | PrimaryLinkHealthStatus | ||
aws_elasticache_pub_sub_based_cmds | PubSubBasedCmds | ||
aws_elasticache_pub_sub_based_cmds_latency | PubSubBasedCmdsLatency | ||
aws_elasticache_reclaimed | Reclaimed | ||
aws_elasticache_replication_bytes | ReplicationBytes | ||
aws_elasticache_replication_delayed_write_commands | ReplicationDelayedWriteCommands | ||
aws_elasticache_replication_lag | ReplicationLag | ||
aws_elasticache_save_in_progress | SaveInProgress | ||
aws_elasticache_search_based_cmds | SearchBasedCmds | ||
aws_elasticache_search_based_get_cmds | SearchBasedGetCmds | ||
aws_elasticache_search_based_set_cmds | SearchBasedSetCmds | ||
aws_elasticache_search_number_of_indexed_keys | SearchNumberOfIndexedKeys | ||
aws_elasticache_search_number_of_indexes | SearchNumberOfIndexes | ||
aws_elasticache_search_total_index_size | SearchTotalIndexSize | ||
aws_elasticache_set_based_cmds | SetBasedCmds | ||
aws_elasticache_set_based_cmds_latency | SetBasedCmdsLatency | ||
aws_elasticache_set_type_cmds | SetTypeCmds | ||
aws_elasticache_set_type_cmds_latency | SetTypeCmdsLatency | ||
aws_elasticache_slabs_moved | SlabsMoved | ||
aws_elasticache_sorted_set_based_cmds | SortedSetBasedCmds | ||
aws_elasticache_sorted_set_based_cmds_latency | SortedSetBasedCmdsLatency | ||
aws_elasticache_stream_based_cmds | StreamBasedCmds | ||
aws_elasticache_stream_based_cmds_latency | StreamBasedCmdsLatency | ||
aws_elasticache_string_based_cmds | StringBasedCmds | ||
aws_elasticache_string_based_cmds_latency | StringBasedCmdsLatency | ||
aws_elasticache_swap_usage | SwapUsage | ||
aws_elasticache_touch_hits | TouchHits | ||
aws_elasticache_touch_misses | TouchMisses | ||
aws_elasticache_traffic_management_active | TrafficManagementActive | ||
aws_elasticache_unused_memory | UnusedMemory |
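The `CacheHits`, `CacheMisses`, and `CacheHitRate` rows above are related: a hit rate is the share of lookups that were served from the cache. As a minimal sketch, the ratio can be derived from the two counters over the same time window. The function name `cache_hit_ratio` is hypothetical, and the result is returned as a fraction between 0 and 1; whether the CloudWatch `CacheHitRate` metric itself is reported as a fraction or a percentage is not stated here.

```python
from typing import Optional


def cache_hit_ratio(hits: float, misses: float) -> Optional[float]:
    """Return hits / (hits + misses), or None when there were no lookups at all."""
    total = hits + misses
    if total == 0:
        # No lookups in the window, so the ratio is undefined rather than zero.
        return None
    return hits / total


if __name__ == "__main__":
    # Example: sums of CacheHits and CacheMisses over the same 5-minute window.
    print(cache_hit_ratio(hits=900, misses=100))
```

Returning `None` for an idle window keeps a quiet cache from being mistaken for one that misses every request.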
AWS/ElasticBeanstalk
Function: Service to quickly deploy and manage applications in the cloud without provisioning resources
Scrape interval: 5 minutes
Metric | Cloudwatch metric | Purpose |
---|---|---|
aws_elasticbeanstalk_info | ElasticBeanstalk Info | General information about the AWS Elastic Beanstalk environment |
aws_elasticbeanstalk_application_latency_p10 | ApplicationLatencyP10 | Tracks the 10th percentile application latency for Elastic Beanstalk |
aws_elasticbeanstalk_application_latency_p50 | ApplicationLatencyP50 | Measures the median (50th percentile) application latency |
aws_elasticbeanstalk_application_latency_p75 | ApplicationLatencyP75 | Tracks the 75th percentile latency of requests in Elastic Beanstalk |
aws_elasticbeanstalk_application_latency_p85 | ApplicationLatencyP85 | Measures the 85th percentile latency for Elastic Beanstalk applications |
aws_elasticbeanstalk_application_latency_p90 | ApplicationLatencyP90 | Tracks the 90th percentile application latency |
aws_elasticbeanstalk_application_latency_p95 | ApplicationLatencyP95 | Measures the 95th percentile latency for Elastic Beanstalk applications |
aws_elasticbeanstalk_application_latency_p99 | ApplicationLatencyP99 | Tracks the 99th percentile application latency |
aws_elasticbeanstalk_application_latency_p99_9 | ApplicationLatencyP99.9 | Measures the 99.9th percentile application latency in Elastic Beanstalk |
aws_elasticbeanstalk_application_requests2xx | ApplicationRequests2xx | Tracks the number of successful application requests with 2xx status codes |
aws_elasticbeanstalk_application_requests3xx | ApplicationRequests3xx | Measures the number of application requests with 3xx (redirection) status codes |
aws_elasticbeanstalk_application_requests4xx | ApplicationRequests4xx | Tracks the number of client error requests with 4xx status codes |
aws_elasticbeanstalk_application_requests5xx | ApplicationRequests5xx | Measures the number of server error requests with 5xx status codes |
aws_elasticbeanstalk_application_requests_total | ApplicationRequestsTotal | Tracks the total number of application requests received |
aws_elasticbeanstalk_cpuidle | CPUIdle | Measures the idle CPU time of instances within Elastic Beanstalk |
aws_elasticbeanstalk_cpuiowait | CPUIowait | Tracks the CPU time spent waiting for I/O operations to complete |
aws_elasticbeanstalk_cpuirq | CPUIrq | Measures the time spent on interrupt requests (IRQ) on the CPU |
aws_elasticbeanstalk_cpunice | CPUNice | Tracks the CPU time spent on user processes that have been “niced” |
aws_elasticbeanstalk_cpusoftirq | CPUSoftirq | Monitors CPU time used for soft interrupt requests |
aws_elasticbeanstalk_cpusystem | CPUSystem | Tracks the amount of CPU time spent executing system-level tasks |
aws_elasticbeanstalk_cpuuser | CPUUser | Measures the amount of CPU time spent executing user processes |
aws_elasticbeanstalk_environment_health | EnvironmentHealth | Monitors the overall health status of the Elastic Beanstalk environment |
aws_elasticbeanstalk_instance_health | InstanceHealth | Tracks the health status of individual instances in Elastic Beanstalk |
aws_elasticbeanstalk_instances_degraded | InstancesDegraded | Monitors the number of instances with degraded health |
aws_elasticbeanstalk_instances_info | InstancesInfo | Provides general information about the state of instances in Elastic Beanstalk |
aws_elasticbeanstalk_instances_no_data | InstancesNoData | Tracks the number of instances reporting no data |
aws_elasticbeanstalk_instances_ok | InstancesOk | Monitors the number of healthy instances in the environment |
aws_elasticbeanstalk_instances_pending | InstancesPending | Measures the number of instances in a pending state |
aws_elasticbeanstalk_instances_severe | InstancesSevere | Tracks the number of instances with severe health problems |
aws_elasticbeanstalk_instances_unknown | InstancesUnknown | Monitors the number of instances with unknown health status |
aws_elasticbeanstalk_instances_warning | InstancesWarning | Tracks the number of instances in warning status |
aws_elasticbeanstalk_load_average1min | LoadAverage1min | Measures the system load average over the last 1 minute |
aws_elasticbeanstalk_load_average5min | LoadAverage5min | Tracks the system load average over the last 5 minutes |
aws_elasticbeanstalk_root_filesystem_util | RootFilesystemUtil | Monitors the usage of the root file system |
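
The request counters above are often combined into a ratio rather than read on their own. The following is a minimal sketch, assuming you already have the `Sum` values of `aws_elasticbeanstalk_application_requests5xx` and `aws_elasticbeanstalk_application_requests_total` for the same scrape window. The `error_ratio` helper and the sample numbers are hypothetical and only illustrate the arithmetic.

```python
def error_ratio(requests_5xx: float, requests_total: float) -> float:
    """Return the share of requests that ended in a 5xx status (0.0 to 1.0)."""
    if requests_total <= 0:
        return 0.0  # no traffic in the window, so there is nothing to divide
    return requests_5xx / requests_total


# Hypothetical Sum values for one 5-minute scrape window (not real data).
sample = {
    "aws_elasticbeanstalk_application_requests5xx": 12.0,
    "aws_elasticbeanstalk_application_requests_total": 4800.0,
}

ratio = error_ratio(
    sample["aws_elasticbeanstalk_application_requests5xx"],
    sample["aws_elasticbeanstalk_application_requests_total"],
)
print(f"5xx ratio: {ratio:.4f}")
```

Returning 0.0 for an empty window is a design choice for this sketch; you may prefer to skip the window instead.
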
AWS/ElasticMapReduce
Function: Managed big data platform for processing large amounts of data using Hadoop
Scrape interval: 5 minutes
Metric | Cloudwatch metric | Purpose |
---|---|---|
aws_elasticmapreduce_info | ElasticMapReduce Info | General information about the state of the AWS Elastic MapReduce cluster |
aws_elasticmapreduce_apps_completed | AppsCompleted | Tracks the number of applications that have successfully completed |
aws_elasticmapreduce_apps_failed | AppsFailed | Monitors the number of applications that have failed |
aws_elasticmapreduce_apps_killed | AppsKilled | Tracks the number of applications that were terminated or killed |
aws_elasticmapreduce_apps_pending | AppsPending | Measures the number of applications that are in the pending state |
aws_elasticmapreduce_apps_running | AppsRunning | Tracks the number of applications currently running |
aws_elasticmapreduce_apps_submitted | AppsSubmitted | Measures the total number of applications that have been submitted |
aws_elasticmapreduce_backup_failed | BackupFailed | Tracks the number of backup attempts that failed |
aws_elasticmapreduce_capacity_remaining_gb | CapacityRemainingGB | Measures the remaining storage capacity in gigabytes within the cluster |
aws_elasticmapreduce_cluster_status | ClusterStatus | Monitors the overall status of the Elastic MapReduce cluster |
aws_elasticmapreduce_container_allocated | ContainerAllocated | Tracks the number of containers allocated for running tasks |
aws_elasticmapreduce_container_pending | ContainerPending | Measures the number of containers pending allocation |
aws_elasticmapreduce_container_pending_ratio | ContainerPendingRatio | Tracks the ratio of pending containers to total containers |
aws_elasticmapreduce_container_reserved | ContainerReserved | Monitors the number of containers reserved for future tasks |
aws_elasticmapreduce_core_nodes_pending | CoreNodesPending | Tracks the number of core nodes that are pending |
aws_elasticmapreduce_core_nodes_running | CoreNodesRunning | Measures the number of core nodes that are currently running |
aws_elasticmapreduce_corrupt_blocks | CorruptBlocks | Monitors the number of blocks that are identified as corrupt |
aws_elasticmapreduce_dfs_pending_replication_blocks | DfsPendingReplicationBlocks | Tracks the number of HDFS blocks that are pending replication |
aws_elasticmapreduce_hbase | HBase | Monitors the health and activity of the HBase database in the cluster |
aws_elasticmapreduce_hdfsbytes_read | HDFSBytesRead | Measures the number of bytes read from HDFS in the cluster |
aws_elasticmapreduce_hdfsbytes_written | HDFSBytesWritten | Tracks the number of bytes written to HDFS |
aws_elasticmapreduce_hdfsutilization | HDFSUtilization | Monitors the utilization of HDFS in the cluster |
aws_elasticmapreduce_hbase_backup_failed | HbaseBackupFailed | Tracks the number of failed backups for HBase in the cluster |
aws_elasticmapreduce_io | IO | Monitors input/output (I/O) operations in the cluster |
aws_elasticmapreduce_is_idle | IsIdle | Tracks whether the cluster or a node is currently idle |
aws_elasticmapreduce_jobs_failed | JobsFailed | Measures the number of failed jobs in the cluster |
aws_elasticmapreduce_jobs_running | JobsRunning | Tracks the number of currently running jobs |
aws_elasticmapreduce_live_data_nodes | LiveDataNodes | Monitors the number of live data nodes in the cluster |
aws_elasticmapreduce_live_task_trackers | LiveTaskTrackers | Tracks the number of live task trackers |
aws_elasticmapreduce_mractive_nodes | MRActiveNodes | Measures the number of active MapReduce nodes in the cluster |
aws_elasticmapreduce_mrdecommissioned_nodes | MRDecommissionedNodes | Tracks the number of decommissioned MapReduce nodes |
aws_elasticmapreduce_mrlost_nodes | MRLostNodes | Monitors the number of lost MapReduce nodes in the cluster |
aws_elasticmapreduce_mrrebooted_nodes | MRRebootedNodes | Measures the number of rebooted MapReduce nodes |
aws_elasticmapreduce_mrtotal_nodes | MRTotalNodes | Tracks the total number of MapReduce nodes |
aws_elasticmapreduce_mrunhealthy_nodes | MRUnhealthyNodes | Monitors the number of unhealthy MapReduce nodes |
aws_elasticmapreduce_map_reduce | Map/Reduce | General metric for MapReduce activity in the cluster |
aws_elasticmapreduce_map_slots_open | MapSlotsOpen | Tracks the number of open Map slots in the cluster |
aws_elasticmapreduce_map_tasks_remaining | MapTasksRemaining | Monitors the number of remaining Map tasks |
aws_elasticmapreduce_map_tasks_running | MapTasksRunning | Tracks the number of Map tasks currently running |
aws_elasticmapreduce_memory_allocated_mb | MemoryAllocatedMB | Measures the memory allocated in MB in the cluster |
aws_elasticmapreduce_memory_available_mb | MemoryAvailableMB | Tracks the available memory in MB in the cluster |
aws_elasticmapreduce_memory_reserved_mb | MemoryReservedMB | Monitors the memory reserved for future tasks in MB |
aws_elasticmapreduce_memory_total_mb | MemoryTotalMB | Tracks the total memory available in MB in the cluster |
aws_elasticmapreduce_missing_blocks | MissingBlocks | Measures the number of missing HDFS blocks in the cluster |
aws_elasticmapreduce_most_recent_backup_duration | MostRecentBackupDuration | Tracks the duration of the most recent backup |
aws_elasticmapreduce_node_status | NodeStatus | Monitors the overall status of the nodes in the cluster |
aws_elasticmapreduce_pending_deletion_blocks | PendingDeletionBlocks | Tracks the number of HDFS blocks pending deletion |
aws_elasticmapreduce_reduce_slots_open | ReduceSlotsOpen | Measures the number of open Reduce slots in the cluster |
aws_elasticmapreduce_reduce_tasks_remaining | ReduceTasksRemaining | Monitors the number of remaining Reduce tasks |
aws_elasticmapreduce_reduce_tasks_running | ReduceTasksRunning | Tracks the number of currently running Reduce tasks |
aws_elasticmapreduce_remaining_map_tasks_per_slot | RemainingMapTasksPerSlot | Measures the remaining Map tasks per slot |
aws_elasticmapreduce_s3_bytes_read | S3BytesRead | Tracks the number of bytes read from S3 during the cluster operation |
aws_elasticmapreduce_s3_bytes_written | S3BytesWritten | Measures the number of bytes written to S3 during the cluster operation |
aws_elasticmapreduce_task_nodes_pending | TaskNodesPending | Tracks the number of task nodes that are pending allocation |
aws_elasticmapreduce_task_nodes_running | TaskNodesRunning | Monitors the number of running task nodes in the cluster |
aws_elasticmapreduce_time_since_last_successful_backup | TimeSinceLastSuccessfulBackup | Measures the time elapsed since the last successful backup |
aws_elasticmapreduce_total_load | TotalLoad | Tracks the total computational load on the cluster |
aws_elasticmapreduce_under_replicated_blocks | UnderReplicatedBlocks | Monitors the number of under-replicated HDFS blocks in the cluster |
aws_elasticmapreduce_yarnmemory_available_percentage | YARNMemoryAvailablePercentage | Tracks the percentage of available YARN memory in the cluster |
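
The metric names in these tables appear to be derived mechanically from the CloudWatch namespace and metric name. The sketch below reproduces that pattern as inferred from the rows on this page; it is not a documented rule and may not hold for every service. The function name `to_metric_name` is hypothetical.

```python
import re


def to_metric_name(namespace: str, cloudwatch_metric: str) -> str:
    """Guess the exported metric name for a CloudWatch namespace and metric."""
    # Namespace: lowercase, with every non-alphanumeric run turned into "_".
    prefix = re.sub(r"[^a-z0-9]+", "_", namespace.lower())
    # Metric: add "_" where a lowercase letter or digit is followed by an
    # uppercase letter, so runs of capitals such as "HDFS" stay together.
    split = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", cloudwatch_metric)
    # Then lowercase and replace separators such as "." or "/" with "_".
    suffix = re.sub(r"[^a-z0-9]+", "_", split.lower())
    return f"{prefix}_{suffix}"


for metric in ("HDFSBytesRead", "S3BytesRead", "Map/Reduce", "CapacityRemainingGB"):
    print(to_metric_name("AWS/ElasticMapReduce", metric))
```

Checking a guessed name against the table before using it in a query is still advisable, since the rule is only an observation.
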
AWS/Events
Function: Delivers a near real-time stream of system events for building reactive applications
Scrape interval: 5 minutes
Metric | Cloudwatch metric | Purpose |
---|---|---|
aws_events_info | | General information about AWS Events |
aws_events_dead_letter_invocations | DeadLetterInvocations | Tracks the number of times a message is sent to the dead letter queue |
aws_events_events | Events | Monitors the total number of events received by AWS Events |
aws_events_failed_invocations | FailedInvocations | Tracks the number of invocation failures |
aws_events_ingestionto_invocation_complete_latency | IngestiontoInvocationCompleteLatency | Measures the latency from event ingestion to invocation completion |
aws_events_ingestionto_invocation_start_latency | IngestiontoInvocationStartLatency | Measures the latency from event ingestion to invocation start |
aws_events_invocation_attempts | InvocationAttempts | Tracks the total number of invocation attempts |
aws_events_invocations | Invocations | Tracks the total number of invocations |
aws_events_invocations_created | InvocationsCreated | Monitors the number of invocations created |
aws_events_invocations_failed_to_be_sent_to_dlq | InvocationsFailedToBeSentToDlq | Tracks the number of invocations that failed to be sent to the dead letter queue |
aws_events_invocations_sent_to_dlq | InvocationsSentToDlq | Tracks the number of invocations successfully sent to the dead letter queue |
aws_events_matched_events | MatchedEvents | Monitors the number of events that matched event rules |
aws_events_put_events_approximate_call_count | PutEventsApproximateCallCount | Measures the approximate number of PutEvents API call requests |
aws_events_put_events_approximate_failed_count | PutEventsApproximateFailedCount | Tracks the approximate number of PutEvents API call failures |
aws_events_put_events_approximate_success_count | PutEventsApproximateSuccessCount | Monitors the approximate number of successful PutEvents API call requests |
aws_events_put_events_approximate_throttled_count | PutEventsApproximateThrottledCount | Tracks the approximate number of throttled PutEvents API call requests |
aws_events_put_events_entries_count | PutEventsEntriesCount | Measures the number of event entries in PutEvents requests |
aws_events_put_events_failed_entries_count | PutEventsFailedEntriesCount | Tracks the number of failed event entries in PutEvents requests |
aws_events_put_events_latency | PutEventsLatency | Monitors the latency of PutEvents API requests |
aws_events_put_events_request_size | PutEventsRequestSize | Measures the size of PutEvents API requests |
aws_events_put_partner_events_approximate_call_count | PutPartnerEventsApproximateCallCount | Monitors the approximate number of PutPartnerEvents API call requests |
aws_events_put_partner_events_approximate_failed_count | PutPartnerEventsApproximateFailedCount | Tracks the approximate number of failed PutPartnerEvents API call requests |
aws_events_put_partner_events_approximate_success_count | PutPartnerEventsApproximateSuccessCount | Measures the approximate number of successful PutPartnerEvents API call requests |
aws_events_put_partner_events_approximate_throttled_count | PutPartnerEventsApproximateThrottledCount | Tracks the approximate number of throttled PutPartnerEvents API call requests |
aws_events_put_partner_events_entries_count | PutPartnerEventsEntriesCount | Measures the number of event entries in PutPartnerEvents requests |
aws_events_put_partner_events_failed_entries_count | PutPartnerEventsFailedEntriesCount | Monitors the number of failed event entries in PutPartnerEvents requests |
aws_events_put_partner_events_latency | PutPartnerEventsLatency | Tracks the latency of PutPartnerEvents API requests |
aws_events_retry_invocation_attempts | RetryInvocationAttempts | Measures the number of retry invocation attempts |
aws_events_successful_invocation_attempts | SuccessfulInvocationAttempts | Tracks the number of successful invocation attempts |
aws_events_throttled_rules | ThrottledRules | Monitors the number of rules that were throttled |
aws_events_triggered_rules | TriggeredRules | Tracks the number of event rules that were triggered |
AWS/FSx
Function: Managed file systems optimized for specific workloads like Windows and Lustre
Scrape interval: 5 minutes
Metric | Cloudwatch metric | Purpose |
---|---|---|
aws_fsx_info | | General information about FSx |
aws_fsx_cpuutilization | CPUUtilization | Measures the percentage of CPU utilization on the FSx file system |
aws_fsx_client_connections | ClientConnections | Tracks the number of active client connections to the FSx file system |
aws_fsx_data_read_bytes | DataReadBytes | Monitors the total bytes read from the file system |
aws_fsx_data_read_operations | DataReadOperations | Measures the number of data read operations |
aws_fsx_data_write_bytes | DataWriteBytes | Tracks the total bytes written to the file system |
aws_fsx_data_write_operations | DataWriteOperations | Monitors the number of data write operations |
aws_fsx_deduplication_saved_storage | DeduplicationSavedStorage | Measures the amount of storage saved through data deduplication |
aws_fsx_disk_iops_utilization | DiskIopsUtilization | Tracks the percentage of disk IOPS (Input/Output Operations Per Second) utilization |
aws_fsx_disk_read_bytes | DiskReadBytes | Monitors the total bytes read from the disk |
aws_fsx_disk_read_operations | DiskReadOperations | Measures the number of disk read operations |
aws_fsx_disk_throughput_balance | DiskThroughputBalance | Tracks the balance of disk throughput usage |
aws_fsx_disk_throughput_utilization | DiskThroughputUtilization | Measures the percentage of disk throughput utilization |
aws_fsx_disk_write_bytes | DiskWriteBytes | Tracks the total bytes written to the disk |
aws_fsx_disk_write_operations | DiskWriteOperations | Monitors the number of disk write operations |
aws_fsx_file_server_disk_iops_balance | FileServerDiskIopsBalance | Measures the balance of IOPS utilization on the file server |
aws_fsx_file_server_disk_iops_utilization | FileServerDiskIopsUtilization | Tracks the percentage of IOPS utilization on the file server |
aws_fsx_file_server_disk_throughput_balance | FileServerDiskThroughputBalance | Measures the balance of disk throughput on the file server |
aws_fsx_file_server_disk_throughput_utilization | FileServerDiskThroughputUtilization | Monitors the percentage of disk throughput utilization on the file server |
aws_fsx_free_data_storage_capacity | FreeDataStorageCapacity | Tracks the amount of free data storage capacity available |
aws_fsx_free_storage_capacity | FreeStorageCapacity | Measures the total amount of free storage capacity available |
aws_fsx_memory_utilization | MemoryUtilization | Monitors the percentage of memory utilization on the file system |
aws_fsx_metadata_operations | MetadataOperations | Tracks the number of metadata operations (like file system metadata lookups) |
aws_fsx_network_throughput_utilization | NetworkThroughputUtilization | Measures the percentage of network throughput utilization |
aws_fsx_storage_capacity_utilization | StorageCapacityUtilization | Tracks the percentage of storage capacity utilization |
AWS/Firehose
Function: Service to reliably load streaming data into AWS data stores like S3 and Redshift
Scrape interval: 5 minutes
Metric | Cloudwatch metric | Purpose | |
---|---|---|---|
aws_firehose_info | | General information about Firehose | |
aws_firehose_active_partitions_limit | ActivePartitionsLimit | Tracks the limit of active partitions | |
aws_firehose_backup_to_s3_bytes | BackupToS3.Bytes | Measures the amount of data backed up to S3 in bytes | |
aws_firehose_backup_to_s3_data_freshness | BackupToS3.DataFreshness | Monitors the data freshness of backups to S3 | |
aws_firehose_backup_to_s3_records | BackupToS3.Records | Tracks the number of records backed up to S3 | |
aws_firehose_backup_to_s3_success | BackupToS3.Success | Measures the success rate of data backup to S3 | |
aws_firehose_bytes_per_second_limit | BytesPerSecondLimit | Monitors the bytes per second limit for data delivery | |
aws_firehose_data_read_from_kinesis_stream_bytes | DataReadFromKinesisStream.Bytes | Tracks the amount of data read from a Kinesis stream in bytes | |
aws_firehose_data_read_from_kinesis_stream_records | DataReadFromKinesisStream.Records | Tracks the number of records read from a Kinesis stream | |
aws_firehose_data_read_from_source_backpressured | DataReadFromSource.Backpressured | Indicates whether the data source is backpressured | |
aws_firehose_data_read_from_source_bytes | DataReadFromSource.Bytes | Monitors the amount of data read from the source in bytes | |
aws_firehose_data_read_from_source_records | DataReadFromSource.Records | Tracks the number of records read from the source | |
aws_firehose_delivery_to_amazon_open_search_serverless_auth_failure | DeliveryToAmazonOpenSearchServerless.AuthFailure | Tracks authorization failures during delivery to Amazon OpenSearch Serverless | |
aws_firehose_delivery_to_amazon_open_search_serverless_bytes | DeliveryToAmazonOpenSearchServerless.Bytes | Measures the amount of data delivered to Amazon OpenSearch Serverless in bytes | |
aws_firehose_delivery_to_amazon_open_search_serverless_data_freshness | DeliveryToAmazonOpenSearchServerless.DataFreshness | Monitors the data freshness during delivery to Amazon OpenSearch Serverless | |
aws_firehose_delivery_to_amazon_open_search_serverless_delivery_rejected | DeliveryToAmazonOpenSearchServerless.DeliveryRejected | Tracks the number of rejected deliveries to Amazon OpenSearch Serverless | |
aws_firehose_delivery_to_amazon_open_search_serverless_records | DeliveryToAmazonOpenSearchServerless.Records | Measures the number of records delivered to Amazon OpenSearch Serverless | |
aws_firehose_delivery_to_amazon_open_search_serverless_success | DeliveryToAmazonOpenSearchServerless.Success | Tracks the success rate of delivery to Amazon OpenSearch Serverless | |
aws_firehose_delivery_to_amazon_open_search_service_auth_failure | DeliveryToAmazonOpenSearchService.AuthFailure | Monitors authorization failures during delivery to Amazon OpenSearch Service | |
aws_firehose_delivery_to_amazon_open_search_service_bytes | DeliveryToAmazonOpenSearchService.Bytes | Tracks the amount of data delivered to Amazon OpenSearch Service in bytes | |
aws_firehose_delivery_to_amazon_open_search_service_data_freshness | DeliveryToAmazonOpenSearchService.DataFreshness | Monitors the data freshness during delivery to Amazon OpenSearch Service | |
aws_firehose_delivery_to_amazon_open_search_service_delivery_rejected | DeliveryToAmazonOpenSearchService.DeliveryRejected | Tracks the number of rejected deliveries to Amazon OpenSearch Service | |
aws_firehose_delivery_to_amazon_open_search_service_records | DeliveryToAmazonOpenSearchService.Records | Measures the number of records delivered to Amazon OpenSearch Service | |
aws_firehose_delivery_to_amazon_open_search_service_success | DeliveryToAmazonOpenSearchService.Success | Tracks the success rate of delivery to Amazon OpenSearch Service | |
aws_firehose_delivery_to_elasticsearch_bytes | DeliveryToElasticsearch.Bytes | Measures the amount of data delivered to Elasticsearch in bytes | |
aws_firehose_delivery_to_elasticsearch_records | DeliveryToElasticsearch.Records | Tracks the number of records delivered to Elasticsearch | |
aws_firehose_delivery_to_elasticsearch_success | DeliveryToElasticsearch.Success | Monitors the success rate of delivery to Elasticsearch | |
aws_firehose_delivery_to_http_endpoint_bytes | DeliveryToHttpEndpoint.Bytes | Measures the amount of data delivered to an HTTP endpoint in bytes | |
aws_firehose_delivery_to_http_endpoint_data_freshness | DeliveryToHttpEndpoint.DataFreshness | Monitors the data freshness during delivery to an HTTP endpoint | |
aws_firehose_delivery_to_http_endpoint_processed_bytes | DeliveryToHttpEndpoint.ProcessedBytes | Tracks the amount of data processed at an HTTP endpoint | |
aws_firehose_delivery_to_http_endpoint_processed_records | DeliveryToHttpEndpoint.ProcessedRecords | Monitors the number of records processed at an HTTP endpoint | |
aws_firehose_delivery_to_http_endpoint_records | DeliveryToHttpEndpoint.Records | Tracks the number of records delivered to an HTTP endpoint | |
aws_firehose_delivery_to_http_endpoint_success | DeliveryToHttpEndpoint.Success | Measures the success rate of delivery to an HTTP endpoint | |
aws_firehose_delivery_to_redshift_bytes | DeliveryToRedshift.Bytes | Tracks the amount of data delivered to Redshift in bytes | |
aws_firehose_delivery_to_redshift_records | DeliveryToRedshift.Records | Monitors the number of records delivered to Redshift | |
aws_firehose_delivery_to_redshift_success | DeliveryToRedshift.Success | Measures the success rate of delivery to Redshift | |
aws_firehose_delivery_to_s3_bytes | DeliveryToS3.Bytes | Tracks the amount of data delivered to S3 in bytes | |
aws_firehose_delivery_to_s3_data_freshness | DeliveryToS3.DataFreshness | Monitors the data freshness during delivery to S3 | |
aws_firehose_delivery_to_s3_object_count | DeliveryToS3.ObjectCount | Tracks the number of objects delivered to S3 | |
aws_firehose_delivery_to_s3_records | DeliveryToS3.Records | Monitors the number of records delivered to S3 | |
aws_firehose_delivery_to_s3_success | DeliveryToS3.Success | Measures the success rate of delivery to S3 | |
aws_firehose_delivery_to_snowflake_bytes | DeliveryToSnowflake.Bytes | Tracks the amount of data delivered to Snowflake in bytes | |
aws_firehose_delivery_to_snowflake_data_commit_latency | DeliveryToSnowflake.DataCommitLatency | Measures the latency for data commit during delivery to Snowflake | |
aws_firehose_delivery_to_snowflake_data_freshness | DeliveryToSnowflake.DataFreshness | Monitors the data freshness during delivery to Snowflake | |
aws_firehose_delivery_to_snowflake_records | DeliveryToSnowflake.Records | Tracks the number of records delivered to Snowflake | |
aws_firehose_delivery_to_snowflake_success | DeliveryToSnowflake.Success | Measures the success rate of delivery to Snowflake | |
aws_firehose_delivery_to_splunk_bytes | DeliveryToSplunk.Bytes | Tracks the amount of data delivered to Splunk in bytes | |
aws_firehose_delivery_to_splunk_data_ack_latency | DeliveryToSplunk.DataAckLatency | Measures the acknowledgment latency during delivery to Splunk | |
aws_firehose_delivery_to_splunk_data_freshness | DeliveryToSplunk.DataFreshness | Monitors the data freshness during delivery to Splunk | |
aws_firehose_delivery_to_splunk_records | DeliveryToSplunk.Records | Tracks the number of records delivered to Splunk | |
aws_firehose_delivery_to_splunk_success | DeliveryToSplunk.Success | Measures the success rate of delivery to Splunk | |
aws_firehose_describe_delivery_stream_latency | DescribeDeliveryStream.Latency | Tracks the latency for describing a delivery stream | |
aws_firehose_describe_delivery_stream_requests | DescribeDeliveryStream.Requests | Measures the number of requests to describe a delivery stream | |
aws_firehose_execute_processing_duration | ExecuteProcessing.Duration | Tracks the duration of data processing during delivery | |
aws_firehose_execute_processing_success | ExecuteProcessing.Success | Measures the success rate of data processing during delivery | |
aws_firehose_failed_conversion_bytes | FailedConversion.Bytes | Tracks the number of bytes that failed during conversion | |
aws_firehose_failed_conversion_records | FailedConversion.Records | Monitors the number of records that failed during conversion | |
aws_firehose_failed_validation_bytes | FailedValidation.Bytes | Tracks the number of bytes that failed during validation | |
aws_firehose_failed_validation_records | FailedValidation.Records | Monitors the number of records that failed during validation | |
aws_firehose_incoming_bytes | IncomingBytes | Tracks the amount of incoming data in bytes | |
aws_firehose_incoming_put_requests | IncomingPutRequests | Measures the number of incoming put requests | |
aws_firehose_incoming_records | IncomingRecords | Monitors the number of incoming records | |
aws_firehose_jqprocessing_duration | JQProcessing.Duration | Tracks the duration of JQ (JSON Query) processing | |
aws_firehose_kmskey_access_denied | KMSKeyAccessDenied | Monitors instances where access to the KMS (Key Management Service) key is denied | |
aws_firehose_kmskey_disabled | KMSKeyDisabled | Tracks the instances where the KMS key is disabled | |
aws_firehose_kmskey_invalid_state | KMSKeyInvalidState | Monitors the instances where the KMS key is in an invalid state | |
aws_firehose_kmskey_not_found | KMSKeyNotFound | Tracks the instances where the KMS key is not found | |
aws_firehose_kafka_offset_lag | KafkaOffsetLag | Monitors the lag in Kafka offset | |
aws_firehose_kinesis_millis_behind_latest | KinesisMillisBehindLatest | Tracks the time lag (in milliseconds) behind the latest record in Kinesis | |
aws_firehose_list_delivery_streams_latency | ListDeliveryStreams.Latency | Measures the latency in listing delivery streams | |
aws_firehose_list_delivery_streams_requests | ListDeliveryStreams.Requests | Tracks the number of requests for listing delivery streams | |
aws_firehose_output_decompressed_bytes_failed | OutputDecompressedBytes.Failed | Measures the number of decompressed bytes that failed | |
aws_firehose_output_decompressed_bytes_success | OutputDecompressedBytes.Success | Tracks the number of decompressed bytes that succeeded | |
aws_firehose_output_decompressed_records_failed | OutputDecompressedRecords.Failed | Monitors the number of decompressed records that failed | |
aws_firehose_output_decompressed_records_success | OutputDecompressedRecords.Success | Tracks the number of decompressed records that succeeded | |
aws_firehose_partition_count | PartitionCount | Measures the count of partitions during data delivery | |
aws_firehose_partition_count_exceeded | PartitionCountExceeded | Monitors instances where partition count exceeds limits | |
aws_firehose_per_partition_throughput | PerPartitionThroughput | Measures the throughput per partition during data delivery | |
aws_firehose_put_record_bytes | PutRecord.Bytes | Tracks the number of bytes delivered via PutRecord API | |
aws_firehose_put_record_latency | PutRecord.Latency | Measures the latency in PutRecord API calls | |
aws_firehose_put_record_requests | PutRecord.Requests | Monitors the number of requests via PutRecord API | |
aws_firehose_put_record_batch_bytes | PutRecordBatch.Bytes | Tracks the number of bytes delivered via PutRecordBatch API | |
aws_firehose_put_record_batch_latency | PutRecordBatch.Latency | Measures the latency in PutRecordBatch API calls | |
aws_firehose_put_record_batch_records | PutRecordBatch.Records | Monitors the number of records delivered via PutRecordBatch API | |
aws_firehose_put_record_batch_requests | PutRecordBatch.Requests | Measures the number of requests via PutRecordBatch API | |
aws_firehose_put_requests_per_second_limit | PutRequestsPerSecondLimit | Monitors the limit on PutRecord requests per second | |
aws_firehose_records_per_second_limit | RecordsPerSecondLimit | Tracks the limit on records processed per second | |
aws_firehose_resource_count | ResourceCount | Monitors the count of resources in the data delivery stream | |
aws_firehose_source_throttled_delay | SourceThrottled.Delay | Measures the delay caused by throttling on the data source | |
aws_firehose_succeed_conversion_bytes | SucceedConversion.Bytes | Tracks the number of bytes successfully converted | |
aws_firehose_succeed_conversion_records | SucceedConversion.Records | Monitors the number of records successfully converted | |
aws_firehose_succeed_processing_bytes | SucceedProcessing.Bytes | Measures the number of bytes successfully processed | |
aws_firehose_succeed_processing_records | SucceedProcessing.Records | Tracks the number of records successfully processed | |
aws_firehose_throttled_describe_stream | ThrottledDescribeStream | Monitors instances of throttled DescribeStream API calls | |
aws_firehose_throttled_get_records | ThrottledGetRecords | Measures instances of throttled GetRecords API calls | |
aws_firehose_throttled_get_shard_iterator | ThrottledGetShardIterator | Tracks instances of throttled GetShardIterator API calls | |
aws_firehose_throttled_records | ThrottledRecords | Measures instances where records are throttled | |
aws_firehose_update_delivery_stream_latency | UpdateDeliveryStream.Latency | Measures the latency in updating delivery streams | |
aws_firehose_update_delivery_stream_requests | UpdateDeliveryStream.Requests | Tracks the number of requests for updating delivery streams |
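
The limit metrics in this table are most useful when compared with the matching throughput metric. As a rough sketch, assuming `aws_firehose_incoming_bytes` is read with the `Sum` statistic over one 5-minute scrape window and `aws_firehose_bytes_per_second_limit` is read for the same window, utilization could be estimated as shown below. The helper name and the sample values are hypothetical.

```python
SCRAPE_INTERVAL_SECONDS = 300  # the 5-minute scrape interval listed for this service


def throughput_utilization(
    incoming_bytes_sum: float,
    bytes_per_second_limit: float,
    window_seconds: int = SCRAPE_INTERVAL_SECONDS,
) -> float:
    """Return average ingest rate as a fraction of the bytes-per-second limit."""
    if bytes_per_second_limit <= 0 or window_seconds <= 0:
        raise ValueError("limit and window must be positive")
    bytes_per_second = incoming_bytes_sum / window_seconds
    return bytes_per_second / bytes_per_second_limit


# Hypothetical values: 600 MB ingested in the window against a 5 MB/s limit.
utilization = throughput_utilization(600_000_000, 5_000_000)
print(f"Throughput utilization: {utilization:.0%}")
```

Because the rate is averaged over the whole window, short bursts above the limit can be hidden; a shorter window would show them more clearly.
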
AWS/GameLift
Function: Managed service for deploying, operating, and scaling dedicated game servers
Scrape interval: 5 minutes
Metric | Cloudwatch metric | Purpose |
---|---|---|
aws_gamelift_info | | General information about GameLift |
aws_gamelift_activating_game_sessions | ActivatingGameSessions | Tracks the number of game sessions currently being activated |
aws_gamelift_active_game_sessions | ActiveGameSessions | Monitors the number of active game sessions |
aws_gamelift_active_instances | ActiveInstances | Tracks the number of active GameLift instances |
aws_gamelift_active_server_processes | ActiveServerProcesses | Monitors the number of active server processes |
aws_gamelift_available_game_servers | AvailableGameServers | Tracks the number of available game servers |
aws_gamelift_available_game_sessions | AvailableGameSessions | Monitors the number of available game sessions |
aws_gamelift_average_wait_time | AverageWaitTime | Tracks the average wait time for players |
aws_gamelift_current_player_sessions | CurrentPlayerSessions | Monitors the number of current active player sessions |
aws_gamelift_current_tickets | CurrentTickets | Tracks the number of current active matchmaking tickets |
aws_gamelift_desired_instances | DesiredInstances | Tracks the number of desired instances for the fleet |
aws_gamelift_draining_available_game_servers | DrainingAvailableGameServers | Monitors the number of available game servers that are draining |
aws_gamelift_draining_utilized_game_servers | DrainingUtilizedGameServers | Tracks the number of utilized game servers that are draining |
aws_gamelift_first_choice_not_viable | FirstChoiceNotViable | Monitors the number of times the first placement choice was not viable |
aws_gamelift_first_choice_out_of_capacity | FirstChoiceOutOfCapacity | Tracks the number of times the first placement choice ran out of capacity |
aws_gamelift_game_session_interruptions | GameSessionInterruptions | Monitors the number of game session interruptions |
aws_gamelift_healthy_server_processes | HealthyServerProcesses | Tracks the number of healthy server processes |
aws_gamelift_idle_instances | IdleInstances | Monitors the number of idle instances in the fleet |
aws_gamelift_instance_interruptions | InstanceInterruptions | Tracks the number of GameLift instance interruptions |
aws_gamelift_lowest_latency_placement | LowestLatencyPlacement | Monitors placements based on the lowest latency |
aws_gamelift_lowest_price_placement | LowestPricePlacement | Tracks placements based on the lowest price |
aws_gamelift_match_acceptances_timed_out | MatchAcceptancesTimedOut | Monitors the number of match acceptance timeouts |
aws_gamelift_matches_accepted | MatchesAccepted | Tracks the number of matches that have been accepted |
aws_gamelift_matches_created | MatchesCreated | Monitors the number of matches that have been created |
aws_gamelift_matches_placed | MatchesPlaced | Tracks the number of matches successfully placed |
aws_gamelift_matches_rejected | MatchesRejected | Monitors the number of rejected matches |
aws_gamelift_max_instances | MaxInstances | Tracks the maximum number of instances |
aws_gamelift_min_instances | MinInstances | Monitors the minimum number of instances |
aws_gamelift_percent_available_game_sessions | PercentAvailableGameSessions | Tracks the percentage of available game sessions |
aws_gamelift_percent_healthy_server_processes | PercentHealthyServerProcesses | Monitors the percentage of healthy server processes |
aws_gamelift_percent_idle_instances | PercentIdleInstances | Tracks the percentage of idle instances |
aws_gamelift_placement | Placement | Monitors the match placement process |
aws_gamelift_placements_canceled | PlacementsCanceled | Tracks the number of canceled placements |
aws_gamelift_placements_failed | PlacementsFailed | Monitors the number of failed placements |
aws_gamelift_placements_started | PlacementsStarted | Tracks the number of placement processes started |
aws_gamelift_placements_succeeded | PlacementsSucceeded | Monitors the number of successful placements |
aws_gamelift_placements_timed_out | PlacementsTimedOut | Tracks the number of timed-out placements |
aws_gamelift_player_session_activations | PlayerSessionActivations | Monitors the number of activated player sessions |
aws_gamelift_players_started | PlayersStarted | Tracks the number of players who have started their sessions |
aws_gamelift_queue_depth | QueueDepth | Monitors the depth of the matchmaking queue |
aws_gamelift_rule_evaluations_failed | RuleEvaluationsFailed | Tracks the number of failed rule evaluations during matchmaking |
aws_gamelift_rule_evaluations_passed | RuleEvaluationsPassed | Monitors the number of passed rule evaluations during matchmaking |
aws_gamelift_server_process_abnormal_terminations | ServerProcessAbnormalTerminations | Tracks the number of abnormal terminations of server processes |
aws_gamelift_server_process_activations | ServerProcessActivations | Monitors the number of server process activations |
aws_gamelift_server_process_terminations | ServerProcessTerminations | Tracks the number of server process terminations |
aws_gamelift_tickets_failed | TicketsFailed | Monitors the number of failed matchmaking tickets |
aws_gamelift_tickets_started | TicketsStarted | Tracks the number of matchmaking tickets that have started |
aws_gamelift_tickets_timed_out | TicketsTimedOut | Monitors the number of matchmaking tickets that have timed out |
aws_gamelift_time_to_match | TimeToMatch | Tracks the average time taken to find a match |
aws_gamelift_time_to_ticket_success | TimeToTicketSuccess | Monitors the time taken to successfully complete a matchmaking ticket |
aws_gamelift_utilized_game_servers | UtilizedGameServers | Tracks the number of utilized game servers |
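The exported metric names in these tables appear to follow a simple pattern: the namespace is lowercased with `/` replaced by `_`, and the CloudWatch metric name is converted from CamelCase to snake_case. The Python sketch below reproduces that pattern for the GameLift rows above. The function name `exported_name` is hypothetical and is not part of any exporter's API. Some names on this page, such as those carrying a service prefix (for example `glue.driver.bytesRead` or `KafkaAppLogsDiskUsed`), are mapped differently and are not covered by this sketch.

```python
import re


def exported_name(namespace: str, cloudwatch_metric: str) -> str:
    """Approximate the exported name for a CloudWatch metric (illustrative only)."""
    # "AWS/GameLift" -> "aws_gamelift"
    prefix = namespace.lower().replace("/", "_")
    # Replace separators such as "." or "-" with underscores.
    name = re.sub(r"[^A-Za-z0-9]+", "_", cloudwatch_metric)
    # Insert an underscore at each lowercase/digit -> uppercase boundary.
    name = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name)
    return f"{prefix}_{name.lower()}"


print(exported_name("AWS/GameLift", "ActiveGameSessions"))
print(exported_name("AWS/GameLift", "FirstChoiceOutOfCapacity"))
```

Treat the tables as the source of truth: the sketch is only a convenience for guessing a name before looking it up.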
AWS/GlobalAccelerator
Function: Provides static IP addresses to improve availability and performance for global applications
Scrape interval: 5 minutes
Metric | Cloudwatch metric | Purpose |
---|---|---|
aws_globalaccelerator_info | | General information about Global Accelerator |
aws_globalaccelerator_healthy_endpoint_count | HealthyEndpointCount | Monitors the number of healthy endpoints in the accelerator |
aws_globalaccelerator_new_flow_count | NewFlowCount | Tracks the number of new network flows being processed |
aws_globalaccelerator_processed_bytes_in | ProcessedBytesIn | Monitors the volume of incoming traffic processed by the accelerator |
aws_globalaccelerator_processed_bytes_out | ProcessedBytesOut | Tracks the volume of outgoing traffic processed by the accelerator |
aws_globalaccelerator_unhealthy_endpoint_count | UnhealthyEndpointCount | Monitors the number of unhealthy endpoints in the accelerator |
AWS/Glue
Function: Managed ETL service that prepares and loads data for analytics
Scrape interval: 5 minutes
Metric | Cloudwatch metric | Purpose | |
---|---|---|---|
aws_glue_info | | General information about AWS Glue | |
aws_glue_all_disk_available_gb | glue.ALL.disk.available_GB | Tracks the available disk space in gigabytes for all Glue resources | |
aws_glue_all_disk_used_percentage | glue.ALL.disk.used.percentage | Measures the percentage of disk space used across all Glue resources | |
aws_glue_all_disk_used_gb | glue.ALL.disk.used_GB | Tracks the used disk space in gigabytes for all Glue resources | |
aws_glue_all_jvm_heap_usage | glue.ALL.jvm.heap.usage | Monitors the JVM heap usage for all Glue resources | |
aws_glue_all_jvm_heap_used | glue.ALL.jvm.heap.used | Measures the amount of JVM heap used across all Glue resources | |
aws_glue_all_memory_heap_available | glue.ALL.memory.heap.available | Tracks the available memory heap for all Glue resources | |
aws_glue_all_memory_heap_used | glue.ALL.memory.heap.used | Measures the used memory heap for all Glue resources | |
aws_glue_all_memory_heap_used_percentage | glue.ALL.memory.heap.used.percentage | Measures the percentage of memory heap used across all Glue resources | |
aws_glue_all_memory_non_heap_available | glue.ALL.memory.non-heap.available | Monitors the available non-heap memory for all Glue resources | |
aws_glue_all_memory_non_heap_percentage | glue.ALL.memory.non-heap.percentage | Tracks the percentage of non-heap memory used | |
aws_glue_all_memory_non_heap_used | glue.ALL.memory.non-heap.used | Measures the used non-heap memory across all Glue resources | |
aws_glue_all_memory_total_available | glue.ALL.memory.total.available | Tracks the total available memory for all Glue resources | |
aws_glue_all_memory_total_used | glue.ALL.memory.total.used | Measures the total used memory for all Glue resources | |
aws_glue_all_memory_total_used_percentage | glue.ALL.memory.total.used.percentage | Measures the total percentage of memory used | |
aws_glue_all_s3_filesystem_read_bytes | glue.ALL.s3.filesystem.read_bytes | Tracks the total number of bytes read from S3 filesystems | |
aws_glue_all_s3_filesystem_write_bytes | glue.ALL.s3.filesystem.write_bytes | Tracks the total number of bytes written to S3 filesystems | |
aws_glue_all_system_cpu_system_load | glue.ALL.system.cpuSystemLoad | Monitors the system CPU load across all Glue resources | |
aws_glue_driver_block_manager_disk_disk_space_used_mb | glue.driver.BlockManager.disk.diskSpaceUsed_MB | Measures the disk space used by the block manager in megabytes | |
aws_glue_driver_executor_allocation_manager_executors_number_all_executors | glue.driver.ExecutorAllocationManager.executors.numberAllExecutors | Tracks the number of executors across all Glue drivers | |
aws_glue_driver_executor_allocation_manager_executors_number_max_needed_executors | glue.driver.ExecutorAllocationManager.executors.numberMaxNeededExecutors | Tracks the maximum number of executors needed | |
aws_glue_driver_aggregate_bytes_read | glue.driver.aggregate.bytesRead | Tracks the total bytes read across all Glue driver instances | |
aws_glue_driver_aggregate_elapsed_time | glue.driver.aggregate.elapsedTime | Measures the total elapsed time for tasks | |
aws_glue_driver_aggregate_num_completed_stages | glue.driver.aggregate.numCompletedStages | Tracks the total number of completed stages | |
aws_glue_driver_aggregate_num_completed_tasks | glue.driver.aggregate.numCompletedTasks | Tracks the total number of completed tasks | |
aws_glue_driver_aggregate_num_failed_tasks | glue.driver.aggregate.numFailedTasks | Measures the number of failed tasks | |
aws_glue_driver_aggregate_num_killed_tasks | glue.driver.aggregate.numKilledTasks | Tracks the number of killed tasks | |
aws_glue_driver_aggregate_records_read | glue.driver.aggregate.recordsRead | Tracks the total number of records read by drivers | |
aws_glue_driver_aggregate_shuffle_bytes_written | glue.driver.aggregate.shuffleBytesWritten | Measures the number of shuffle bytes written | |
aws_glue_driver_aggregate_shuffle_local_bytes_read | glue.driver.aggregate.shuffleLocalBytesRead | Tracks the number of shuffle bytes read locally | |
aws_glue_driver_bytes_read | glue.driver.bytesRead | Measures the total bytes read by drivers | |
aws_glue_driver_bytes_written | glue.driver.bytesWritten | Measures the total bytes written by drivers | |
aws_glue_driver_disk_available_gb | glue.driver.disk.available_GB | Tracks the available disk space for Glue drivers | |
aws_glue_driver_disk_used_percentage | glue.driver.disk.used.percentage | Measures the percentage of disk space used by Glue drivers | |
aws_glue_driver_disk_used_gb | glue.driver.disk.used_GB | Measures the used disk space in gigabytes for Glue drivers | |
aws_glue_driver_files_read | glue.driver.filesRead | Tracks the total number of files read | |
aws_glue_driver_files_written | glue.driver.filesWritten | Measures the total number of files written | |
aws_glue_driver_jvm_heap_usage | glue.driver.jvm.heap.usage | Monitors the JVM heap usage of Glue drivers | |
aws_glue_driver_jvm_heap_used | glue.driver.jvm.heap.used | Measures the used JVM heap for Glue drivers | |
aws_glue_driver_memory_heap_available | glue.driver.memory.heap.available | Tracks the available heap memory for Glue drivers | |
aws_glue_driver_memory_heap_used | glue.driver.memory.heap.used | Measures the used heap memory for Glue drivers | |
aws_glue_driver_memory_heap_used_percentage | glue.driver.memory.heap.used.percentage | Measures the percentage of heap memory used | |
aws_glue_driver_memory_non_heap_available | glue.driver.memory.non-heap.available | Tracks the available non-heap memory for Glue drivers | |
aws_glue_driver_memory_non_heap_percentage | glue.driver.memory.non-heap.percentage | Measures the percentage of non-heap memory used | |
aws_glue_driver_memory_non_heap_used | glue.driver.memory.non-heap.used | Tracks the non-heap memory used by Glue drivers | |
aws_glue_driver_memory_total_available | glue.driver.memory.total.available | Tracks the total available memory for Glue drivers | |
aws_glue_driver_memory_total_used | glue.driver.memory.total.used | Measures the total memory used by Glue drivers | |
aws_glue_driver_memory_total_used_percentage | glue.driver.memory.total.used.percentage | Tracks the percentage of total memory used | |
aws_glue_driver_partitions_read | glue.driver.partitionsRead | Tracks the number of partitions read by drivers | |
aws_glue_driver_records_read | glue.driver.recordsRead | Tracks the number of records read by Glue drivers | |
aws_glue_driver_records_written | glue.driver.recordsWritten | Measures the number of records written by Glue drivers | |
aws_glue_driver_s3_filesystem_read_bytes | glue.driver.s3.filesystem.read_bytes | Measures the bytes read from S3 filesystem by drivers | |
aws_glue_driver_s3_filesystem_write_bytes | glue.driver.s3.filesystem.write_bytes | Tracks the bytes written to S3 filesystem by drivers | |
aws_glue_driver_skewness_job | glue.driver.skewness.job | Tracks skewness in job execution | |
aws_glue_driver_skewness_stage | glue.driver.skewness.stage | Tracks skewness in stages of execution | |
aws_glue_driver_streaming_batch_processing_time_in_ms | glue.driver.streaming.batchProcessingTimeInMs | Measures the batch processing time in milliseconds for streaming jobs | |
aws_glue_driver_streaming_num_records | glue.driver.streaming.numRecords | Tracks the number of records processed in streaming jobs | |
aws_glue_driver_system_cpu_system_load | glue.driver.system.cpuSystemLoad | Monitors the CPU system load on Glue drivers | |
aws_glue_driver_worker_utilization | glue.driver.workerUtilization | Tracks the worker utilization rate | |
aws_glue_error_all | glue.error.ALL | Tracks all errors occurring in Glue | |
aws_glue_succeed_all | glue.succeed.ALL | Measures the success rate of all Glue jobs |
AWS/IoT
Function: Provides cloud services to connect IoT devices to the cloud and manage IoT workloads
Scrape interval: 5 minutes
Metric | Cloudwatch metric | Purpose |
---|---|---|
aws_iot_info | | General information about AWS IoT |
aws_iot_canceled_job_execution_count | CanceledJobExecutionCount | Tracks the count of canceled job executions |
aws_iot_canceled_job_execution_total_count | CanceledJobExecutionTotalCount | Tracks the total count of canceled job executions |
aws_iot_client_error | ClientError | Monitors the client error count |
aws_iot_connect_auth_error | Connect.AuthError | Tracks authentication errors during connection attempts |
aws_iot_connect_client_error | Connect.ClientError | Measures client-side errors during connection attempts |
aws_iot_connect_server_error | Connect.ServerError | Tracks server-side errors during connection attempts |
aws_iot_connect_success | Connect.Success | Measures successful connection attempts |
aws_iot_connect_throttle | Connect.Throttle | Monitors throttled connection attempts |
aws_iot_delete_thing_shadow_accepted | DeleteThingShadow.Accepted | Tracks successful shadow deletions |
aws_iot_failed_job_execution_count | FailedJobExecutionCount | Tracks the count of failed job executions |
aws_iot_failed_job_execution_total_count | FailedJobExecutionTotalCount | Measures the total count of failed job executions |
aws_iot_failure | Failure | Tracks overall failure events |
aws_iot_get_thing_shadow_accepted | GetThingShadow.Accepted | Measures the number of successful shadow retrievals |
aws_iot_in_progress_job_execution_count | InProgressJobExecutionCount | Tracks the count of in-progress job executions |
aws_iot_in_progress_job_execution_total_count | InProgressJobExecutionTotalCount | Measures the total count of in-progress job executions |
aws_iot_non_compliant_resources | NonCompliantResources | Tracks the count of non-compliant resources |
aws_iot_num_log_batches_failed_to_publish_throttled | NumLogBatchesFailedToPublishThrottled | Monitors log batches that failed to publish due to throttling |
aws_iot_num_log_events_failed_to_publish_throttled | NumLogEventsFailedToPublishThrottled | Measures log events that failed to publish due to throttling |
aws_iot_parse_error | ParseError | Tracks the number of message parse errors |
aws_iot_ping_success | Ping.Success | Measures successful ping operations |
aws_iot_publish_in_auth_error | PublishIn.AuthError | Tracks authentication errors during inbound publish operations |
aws_iot_publish_in_client_error | PublishIn.ClientError | Monitors client-side errors during inbound publish operations |
aws_iot_publish_in_server_error | PublishIn.ServerError | Tracks server-side errors during inbound publish operations |
aws_iot_publish_in_success | PublishIn.Success | Measures successful inbound publish operations |
aws_iot_publish_in_throttle | PublishIn.Throttle | Tracks throttled inbound publish operations |
aws_iot_publish_out_auth_error | PublishOut.AuthError | Tracks authentication errors during outbound publish operations |
aws_iot_publish_out_client_error | PublishOut.ClientError | Monitors client-side errors during outbound publish operations |
aws_iot_publish_out_success | PublishOut.Success | Measures successful outbound publish operations |
aws_iot_queued_job_execution_count | QueuedJobExecutionCount | Tracks the count of job executions in the queue |
aws_iot_queued_job_execution_total_count | QueuedJobExecutionTotalCount | Measures the total count of queued job executions |
aws_iot_rejected_job_execution_count | RejectedJobExecutionCount | Tracks the count of rejected job executions |
aws_iot_rejected_job_execution_total_count | RejectedJobExecutionTotalCount | Measures the total count of rejected job executions |
aws_iot_removed_job_execution_count | RemovedJobExecutionCount | Tracks the count of removed job executions |
aws_iot_removed_job_execution_total_count | RemovedJobExecutionTotalCount | Measures the total count of removed job executions |
aws_iot_resources_evaluated | ResourcesEvaluated | Measures the number of resources evaluated |
aws_iot_rule_message_throttled | RuleMessageThrottled | Tracks the number of rule messages throttled |
aws_iot_rule_not_found | RuleNotFound | Measures instances where rules were not found |
aws_iot_rules_executed | RulesExecuted | Tracks the number of executed rules |
aws_iot_server_error | ServerError | Monitors server-side errors |
aws_iot_subscribe_auth_error | Subscribe.AuthError | Tracks authentication errors during subscription attempts |
aws_iot_subscribe_client_error | Subscribe.ClientError | Measures client-side errors during subscription attempts |
aws_iot_subscribe_server_error | Subscribe.ServerError | Tracks server-side errors during subscription attempts |
aws_iot_subscribe_success | Subscribe.Success | Measures successful subscription attempts |
aws_iot_subscribe_throttle | Subscribe.Throttle | Monitors throttled subscription attempts |
aws_iot_succeeded_job_execution_count | SucceededJobExecutionCount | Tracks the count of successful job executions |
aws_iot_succeeded_job_execution_total_count | SucceededJobExecutionTotalCount | Measures the total count of successful job executions |
aws_iot_success | Success | Tracks overall successful operations |
aws_iot_topic_match | TopicMatch | Measures the number of successful topic matches |
aws_iot_unsubscribe_client_error | Unsubscribe.ClientError | Monitors client-side errors during unsubscribe operations |
aws_iot_unsubscribe_server_error | Unsubscribe.ServerError | Tracks server-side errors during unsubscribe operations |
aws_iot_unsubscribe_success | Unsubscribe.Success | Measures successful unsubscribe operations |
aws_iot_unsubscribe_throttle | Unsubscribe.Throttle | Monitors throttled unsubscribe operations |
aws_iot_update_thing_shadow_accepted | UpdateThingShadow.Accepted | Measures successful shadow update operations |
aws_iot_violations | Violations | Tracks policy violations |
aws_iot_violations_cleared | ViolationsCleared | Measures cleared violations |
aws_iot_violations_invalidated | ViolationsInvalidated | Tracks invalidated violations |
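The `Connect.*` metrics above can be combined into a single connection health figure. The Python sketch below shows one possible way to do that, assuming each argument is the `Sum` statistic of the corresponding metric over the same time window, and assuming the five metrics count disjoint outcomes of connection attempts. Both assumptions, and the function name `connect_success_ratio`, are illustrative and not taken from this page.

```python
from typing import Optional


def connect_success_ratio(
    success: float,
    auth_error: float,
    client_error: float,
    server_error: float,
    throttle: float,
) -> Optional[float]:
    """Fraction of connection attempts that succeeded, or None if there were none."""
    # Total attempts = successes plus every failure category (assumed disjoint).
    attempts = success + auth_error + client_error + server_error + throttle
    if attempts == 0:
        return None  # No traffic in the window, so the ratio is undefined.
    return success / attempts


# Hypothetical sums for one 5-minute window.
ratio = connect_success_ratio(success=90, auth_error=4, client_error=3,
                              server_error=2, throttle=1)
print(ratio)
```

Returning `None` for an empty window avoids reporting a misleading 0% or 100% when no devices attempted to connect.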
AWS/Kafka
Function: Managed Apache Kafka service for building real-time streaming applications
Scrape interval: 5 minutes
Metric | Cloudwatch metric | Purpose |
---|---|---|
aws_kafka_info | | General information about the Kafka cluster |
aws_kafka_active_controller_count | ActiveControllerCount | Indicates how many active controllers are in the Kafka cluster |
aws_kafka_burst_balance | BurstBalance | Measures the burst balance remaining for the Kafka broker instances |
aws_kafka_bw_in_allowance_exceeded | BwInAllowanceExceeded | Tracks the instances where incoming bandwidth allowance has been exceeded |
aws_kafka_bw_out_allowance_exceeded | BwOutAllowanceExceeded | Tracks the instances where outgoing bandwidth allowance has been exceeded |
aws_kafka_bytes_in_per_sec | BytesInPerSec | Measures the rate of incoming bytes per second into the Kafka cluster |
aws_kafka_bytes_out_per_sec | BytesOutPerSec | Measures the rate of outgoing bytes per second from the Kafka cluster |
aws_kafka_cpucredit_balance | CPUCreditBalance | Shows the remaining CPU credits for instances running in burstable performance mode |
aws_kafka_client_connection_count | ClientConnectionCount | Indicates the total number of client connections to the Kafka brokers |
aws_kafka_conn_track_allowance_exceeded | ConnTrackAllowanceExceeded | Tracks instances where the connection tracking allowance is exceeded |
aws_kafka_connection_close_rate | ConnectionCloseRate | Monitors the rate at which connections are being closed |
aws_kafka_connection_count | ConnectionCount | Displays the number of open connections to the Kafka brokers |
aws_kafka_connection_creation_rate | ConnectionCreationRate | Tracks the rate of new connections being created to the Kafka brokers |
aws_kafka_cpu_credit_usage | CPUCreditUsage | Shows the CPU credits consumed by the Kafka instances running in burstable mode |
aws_kafka_cpu_idle | CPUIdle | Indicates the percentage of idle CPU resources on Kafka instances |
aws_kafka_cpu_io_wait | CpuIoWait | Measures the time instances spend waiting for I/O operations to complete |
aws_kafka_cpu_system | CpuSystem | Tracks CPU usage by the system processes on Kafka instances |
aws_kafka_cpu_user | CpuUser | Shows CPU usage by user processes on Kafka instances |
aws_kafka_estimated_max_time_lag | EstimatedMaxTimeLag | Estimates the time needed to drain the maximum consumer offset lag (MaxOffsetLag) |
aws_kafka_estimated_time_lag | EstimatedTimeLag | Estimates the time needed for a consumer group to drain its partition offset lag |
aws_kafka_fetch_consumer_local_time_ms_mean | FetchConsumerLocalTimeMsMean | Measures the average time it takes to fetch messages locally by the consumer |
aws_kafka_fetch_consumer_request_queue_time_ms_mean | FetchConsumerRequestQueueTimeMsMean | Indicates the average time messages spend in the consumer request queue |
aws_kafka_fetch_consumer_response_queue_time_ms_mean | FetchConsumerResponseQueueTimeMsMean | Tracks the average time it takes for a consumer to queue a response |
aws_kafka_fetch_consumer_response_send_time_ms_mean | FetchConsumerResponseSendTimeMsMean | Measures the average time taken to send a consumer response |
aws_kafka_fetch_consumer_total_time_ms_mean | FetchConsumerTotalTimeMsMean | Tracks the total time spent processing a consumer fetch request |
aws_kafka_fetch_follower_local_time_ms_mean | FetchFollowerLocalTimeMsMean | Measures the average time it takes for a Kafka broker follower to fetch messages locally |
aws_kafka_fetch_follower_request_queue_time_ms_mean | FetchFollowerRequestQueueTimeMsMean | Measures the time follower fetch requests spend in the queue |
aws_kafka_fetch_follower_response_queue_time_ms_mean | FetchFollowerResponseQueueTimeMsMean | Tracks the time follower fetch responses spend in the response queue |
aws_kafka_fetch_follower_response_send_time_ms_mean | FetchFollowerResponseSendTimeMsMean | Measures the time it takes for a Kafka broker follower to send a fetch response |
aws_kafka_fetch_follower_total_time_ms_mean | FetchFollowerTotalTimeMsMean | Tracks the total time for a Kafka broker follower to fetch messages |
aws_kafka_fetch_message_conversions_per_sec | FetchMessageConversionsPerSec | Monitors the rate of message format conversions during fetching |
aws_kafka_fetch_throttle_byte_rate | FetchThrottleByteRate | Measures the rate at which fetching is throttled due to byte rate limits |
aws_kafka_fetch_throttle_queue_size | FetchThrottleQueueSize | Indicates the number of messages in the fetch throttle queue |
aws_kafka_fetch_throttle_time | FetchThrottleTime | Tracks the total time Kafka throttles fetch requests |
aws_kafka_global_partition_count | GlobalPartitionCount | Displays the total number of partitions in the Kafka cluster |
aws_kafka_global_topic_count | GlobalTopicCount | Shows the total number of topics in the Kafka cluster |
aws_kafka_heap_memory_after_gc | HeapMemoryAfterGC | Tracks the percentage of total heap memory in use after garbage collection |
aws_kafka_app_logs_disk_used | KafkaAppLogsDiskUsed | Measures the amount of disk space used by Kafka application logs |
aws_kafka_data_logs_disk_used | KafkaDataLogsDiskUsed | Measures the disk space used by Kafka data logs |
aws_kafka_leader_count | LeaderCount | Shows the number of partition leaders per broker, not including replicas |
aws_kafka_max_offset_lag | MaxOffsetLag | Measures the maximum consumer offset lag across all partitions in a topic |
aws_kafka_memory_buffered | MemoryBuffered | Indicates the amount of memory currently buffered by Kafka |
aws_kafka_memory_cached | MemoryCached | Shows the amount of memory cached by Kafka |
aws_kafka_memory_free | MemoryFree | Displays the amount of free memory on Kafka brokers |
aws_kafka_memory_used | MemoryUsed | Measures the total amount of memory being used by Kafka brokers |
aws_kafka_messages_in_per_sec | MessagesInPerSec | Tracks the number of messages produced per second in the Kafka cluster |
aws_kafka_network_processor_avg_idle_percent | NetworkProcessorAvgIdlePercent | Measures the idle percentage of the network processors |
aws_kafka_network_rx_dropped | NetworkRxDropped | Shows the number of dropped incoming network packets |
aws_kafka_network_rx_errors | NetworkRxErrors | Tracks the number of errors on received network packets |
aws_kafka_network_rx_packets | NetworkRxPackets | Measures the number of network packets received |
aws_kafka_network_tx_dropped | NetworkTxDropped | Tracks the number of dropped outgoing network packets |
aws_kafka_network_tx_errors | NetworkTxErrors | Shows the number of errors on transmitted network packets |
aws_kafka_network_tx_packets | NetworkTxPackets | Tracks the number of network packets transmitted |
aws_kafka_offline_partitions_count | OfflinePartitionsCount | Monitors the number of Kafka partitions that are offline |
aws_kafka_offset_lag | OffsetLag | Measures partition-level consumer lag, in number of offsets |
aws_kafka_partition_count | PartitionCount | Displays the number of topic partitions per broker, including replicas |
aws_kafka_pps_allowance_exceeded | PpsAllowanceExceeded | Tracks instances where the packets-per-second allowance has been exceeded |
aws_kafka_produce_local_time_ms_mean | ProduceLocalTimeMsMean | Measures the average time taken to produce messages locally |
aws_kafka_produce_message_conversions_per_sec | ProduceMessageConversionsPerSec | Monitors the rate of message conversions during production |
aws_kafka_produce_message_conversions_time_ms_mean | ProduceMessageConversionsTimeMsMean | Tracks the time taken to convert messages during production |
aws_kafka_produce_request_queue_time_ms_mean | ProduceRequestQueueTimeMsMean | Measures the time produce requests spend in the queue |
aws_kafka_produce_response_queue_time_ms_mean | ProduceResponseQueueTimeMsMean | Monitors the time produce responses spend in the queue |
aws_kafka_produce_response_send_time_ms_mean | ProduceResponseSendTimeMsMean | Tracks the time it takes to send produce responses |
aws_kafka_produce_throttle_byte_rate | ProduceThrottleByteRate | Measures the rate at which production is throttled due to byte rate limits |
aws_kafka_produce_throttle_queue_size | ProduceThrottleQueueSize | Tracks the size of the production throttle queue |
aws_kafka_produce_throttle_time | ProduceThrottleTime | Measures the total time Kafka throttles produce requests |
aws_kafka_produce_total_time_ms_mean | ProduceTotalTimeMsMean | Tracks the total time spent on producing messages |
aws_kafka_remote_copy_bytes_per_sec | RemoteCopyBytesPerSec | Measures the rate of bytes copied remotely |
aws_kafka_remote_copy_errors_per_sec | RemoteCopyErrorsPerSec | Tracks the rate of errors during remote copying |
aws_kafka_remote_copy_lag_bytes | RemoteCopyLagBytes | Monitors the lag in bytes during remote copying |
aws_kafka_remote_fetch_bytes_per_sec | RemoteFetchBytesPerSec | Tracks the rate of bytes fetched remotely |
aws_kafka_remote_fetch_errors_per_sec | RemoteFetchErrorsPerSec | Measures the rate of errors during remote fetching |
aws_kafka_remote_fetch_requests_per_sec | RemoteFetchRequestsPerSec | Tracks the number of remote fetch requests per second |
aws_kafka_remote_log_manager_tasks_avg_idle_percent | RemoteLogManagerTasksAvgIdlePercent | Monitors the idle percentage of remote log manager tasks |
aws_kafka_remote_log_reader_avg_idle_percent | RemoteLogReaderAvgIdlePercent | Tracks the idle percentage of remote log reader tasks |
aws_kafka_remote_log_reader_task_queue_size | RemoteLogReaderTaskQueueSize | Measures the size of the remote log reader task queue |
aws_kafka_replication_bytes_in_per_sec | ReplicationBytesInPerSec | Tracks the rate of incoming replication bytes |
aws_kafka_replication_bytes_out_per_sec | ReplicationBytesOutPerSec | Measures the rate of outgoing replication bytes |
aws_kafka_request_bytes_mean | RequestBytesMean | Tracks the average size of Kafka requests |
aws_kafka_request_exempt_from_throttle_time | RequestExemptFromThrottleTime | Tracks the time requests are exempt from throttling |
aws_kafka_request_handler_avg_idle_percent | RequestHandlerAvgIdlePercent | Measures the idle percentage of request handlers |
aws_kafka_request_throttle_queue_size | RequestThrottleQueueSize | Tracks the size of the request throttle queue |
aws_kafka_request_throttle_time | RequestThrottleTime | Measures the time requests are throttled in Kafka |
aws_kafka_request_time | RequestTime | Monitors the overall time spent handling requests in Kafka |
aws_kafka_root_disk_used | RootDiskUsed | Tracks the amount of disk space used by the root partition |
aws_kafka_sum_offset_lag | SumOffsetLag | Measures the total offset lag across all partitions |
aws_kafka_swap_free | SwapFree | Tracks the amount of free swap memory available on Kafka brokers |
aws_kafka_swap_used | SwapUsed | Measures the amount of swap memory used by Kafka brokers |
aws_kafka_tcpconnections | TCPConnections | Tracks the total number of TCP connections on the Kafka cluster |
aws_kafka_tcp_connections | TcpConnections | Monitors the active TCP connections in the Kafka cluster |
aws_kafka_traffic_bytes | TrafficBytes | Measures the total traffic in bytes on Kafka brokers |
aws_kafka_traffic_shaping | TrafficShaping | Tracks instances where traffic shaping is applied to Kafka brokers |
aws_kafka_under_min_isr_partition_count | UnderMinIsrPartitionCount | Tracks the number of partitions below the minimum in-sync replicas |
aws_kafka_under_replicated_partitions | UnderReplicatedPartitions | Measures the number of under-replicated partitions in the Kafka cluster |
aws_kafka_volume_queue_length | VolumeQueueLength | Tracks the queue length for disk I/O operations |
aws_kafka_volume_read_bytes | VolumeReadBytes | Measures the number of bytes read from disk |
aws_kafka_volume_read_ops | VolumeReadOps | Tracks the number of read operations on the disk |
aws_kafka_volume_total_read_time | VolumeTotalReadTime | Measures the total time spent on disk read operations |
aws_kafka_volume_total_write_time | VolumeTotalWriteTime | Measures the total time spent on disk write operations |
aws_kafka_volume_write_bytes | VolumeWriteBytes | Tracks the number of bytes written to disk |
aws_kafka_volume_write_ops | VolumeWriteOps | Measures the number of write operations on the disk |
aws_kafka_zoo_keeper_request_latency_ms_mean | ZooKeeperRequestLatencyMsMean | Measures the average latency of requests to ZooKeeper |
aws_kafka_zoo_keeper_session_state | ZooKeeperSessionState | Tracks the current session state of ZooKeeper |
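To show how a row in the table above relates to a CloudWatch request, the Python sketch below builds one entry for the `MetricDataQueries` list of the CloudWatch `GetMetricData` API. It only constructs the dictionary and makes no AWS call. The helper name `build_metric_query`, the cluster name `example-cluster`, and the dimension names `Cluster Name` and `Broker ID` are assumptions for illustration; check the dimensions your cluster actually publishes. The 300-second period mirrors the 5-minute scrape interval listed for this service.

```python
from typing import Dict


def build_metric_query(
    query_id: str,
    namespace: str,
    metric_name: str,
    dimensions: Dict[str, str],
    stat: str = "Average",
    period_seconds: int = 300,
) -> dict:
    """Build one MetricDataQueries entry for a GetMetricData request."""
    return {
        "Id": query_id,  # Must start with a lowercase letter.
        "MetricStat": {
            "Metric": {
                "Namespace": namespace,
                "MetricName": metric_name,  # Value from the "Cloudwatch metric" column.
                "Dimensions": [
                    {"Name": name, "Value": value}
                    for name, value in dimensions.items()
                ],
            },
            "Period": period_seconds,
            "Stat": stat,  # For example Average, Maximum, Sum, or p99.
        },
        "ReturnData": True,
    }


# Hypothetical cluster and broker identifiers.
query = build_metric_query(
    "under_replicated",
    "AWS/Kafka",
    "UnderReplicatedPartitions",
    {"Cluster Name": "example-cluster", "Broker ID": "1"},
    stat="Maximum",
)
print(query)
```

The resulting dictionary could be passed, together with a start and end time, to a client such as boto3's `get_metric_data`.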
AWS/Kinesis
Function: Managed service for real-time data processing and analytics
Scrape interval: 5 minutes
Metric | Cloudwatch metric | Purpose |
---|---|---|
aws_kinesis_info | ||
aws_kinesis_get_records_bytes | GetRecords.Bytes | Measures the total number of bytes retrieved by the GetRecords call. |
aws_kinesis_get_records_iterator_age | GetRecords.IteratorAge | Measures the age of the last record retrieved using the iterator. |
aws_kinesis_get_records_iterator_age_milliseconds | GetRecords.IteratorAgeMilliseconds | Measures the age of the iterator in milliseconds for the GetRecords call. |
aws_kinesis_get_records_latency | GetRecords.Latency | Tracks the latency of the GetRecords call to retrieve data from a stream. |
aws_kinesis_get_records_records | GetRecords.Records | Tracks the total number of records retrieved by the GetRecords call. |
aws_kinesis_get_records_success | GetRecords.Success | Measures the success rate of the GetRecords call. |
aws_kinesis_incoming_bytes | IncomingBytes | Tracks the number of incoming bytes written to the stream. |
aws_kinesis_incoming_records | IncomingRecords | Measures the total number of records being written to the stream. |
aws_kinesis_iterator_age_milliseconds | IteratorAgeMilliseconds | Tracks the age of the iterator used in GetRecords, measured in milliseconds. |
aws_kinesis_outgoing_bytes | OutgoingBytes | Tracks the total number of outgoing bytes from the stream. |
aws_kinesis_outgoing_records | OutgoingRecords | Measures the total number of outgoing records from the stream. |
aws_kinesis_put_record_bytes | PutRecord.Bytes | Measures the total number of bytes in the PutRecord call. |
aws_kinesis_put_record_latency | PutRecord.Latency | Tracks the latency of PutRecord requests to write data to the stream. |
aws_kinesis_put_record_success | PutRecord.Success | Measures the success rate of the PutRecord call. |
aws_kinesis_put_records_bytes | PutRecords.Bytes | Measures the total number of bytes written using the PutRecords call. |
aws_kinesis_put_records_failed_records | PutRecords.FailedRecords | Tracks the number of failed records in the PutRecords call. |
aws_kinesis_put_records_latency | PutRecords.Latency | Measures the latency of PutRecords requests to the stream. |
aws_kinesis_put_records_records | PutRecords.Records | Tracks the total number of records written using the PutRecords call. |
aws_kinesis_put_records_success | PutRecords.Success | Measures the success rate of the PutRecords call. |
aws_kinesis_put_records_successful_records | PutRecords.SuccessfulRecords | Measures the total number of successful records in the PutRecords call. |
aws_kinesis_put_records_throttled_records | PutRecords.ThrottledRecords | Tracks the number of throttled records in the PutRecords call due to exceeding throughput limits. |
aws_kinesis_put_records_total_records | PutRecords.TotalRecords | Measures the total number of records submitted via PutRecords. |
aws_kinesis_read_provisioned_throughput_exceeded | ReadProvisionedThroughputExceeded | Tracks the number of times read requests exceeded the provisioned throughput. |
aws_kinesis_subscribe_to_shard_rate_exceeded | SubscribeToShard.RateExceeded | Tracks the number of times the rate for SubscribeToShard exceeded limits. |
aws_kinesis_subscribe_to_shard_success | SubscribeToShard.Success | Measures the success rate of SubscribeToShard operations. |
aws_kinesis_subscribe_to_shard_event_bytes | SubscribeToShardEvent.Bytes | Tracks the number of bytes received in shard events during SubscribeToShard operations. |
aws_kinesis_subscribe_to_shard_event_millis_behind_latest | SubscribeToShardEvent.MillisBehindLatest | Tracks how far, in milliseconds, the records read during SubscribeToShard operations are behind the tip of the stream. |
aws_kinesis_subscribe_to_shard_event_records | SubscribeToShardEvent.Records | Measures the number of records received in shard events during SubscribeToShard operations. |
aws_kinesis_subscribe_to_shard_event_success | SubscribeToShardEvent.Success | Tracks the success rate of SubscribeToShard events. |
aws_kinesis_write_provisioned_throughput_exceeded | WriteProvisionedThroughputExceeded | Measures the number of times write operations exceeded the provisioned throughput limits. |
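The rows in these tables suggest a regular mapping from a CloudWatch namespace and metric name to the exported metric name. The following is a minimal sketch of that mapping, inferred only from the rows listed on this page; it is not confirmed to be the logic the exporter uses, the function name `to_series_name` is hypothetical, and metric names containing digits are not covered.

```python
import re


def to_series_name(namespace: str, metric: str) -> str:
    """Derive an exported metric name from a CloudWatch namespace and metric.

    The rule is inferred from the table rows, not from exporter source code.
    """
    # Namespace: lowercase, "/" becomes "_" (AWS/Kinesis -> aws_kinesis).
    prefix = namespace.lower().replace("/", "_")
    # Metric: dots become underscores.
    name = metric.replace(".", "_")
    # Insert "_" only where a lowercase letter is followed by an uppercase one.
    # Runs of capitals are not split, so "TCPConnections" stays one word
    # while "TcpConnections" becomes "tcp_connections".
    name = re.sub(r"(?<=[a-z])(?=[A-Z])", "_", name)
    return f"{prefix}_{name.lower()}"


if __name__ == "__main__":
    for ns, metric in [
        ("AWS/Kinesis", "GetRecords.IteratorAgeMilliseconds"),
        ("AWS/Kafka", "TCPConnections"),
        ("AWS/Kafka", "TcpConnections"),
    ]:
        print(to_series_name(ns, metric))
```

This explains why pairs such as `TCPConnections` and `TcpConnections` appear as two distinct rows: the difference in capitalization yields two different exported names.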
AWS/KinesisAnalytics
Function: Processes streaming data in real time using SQL or Apache Flink
Scrape interval: 5 minutes
Metric | Cloudwatch metric | Purpose |
---|---|---|
aws_kinesisanalytics_bytes | Bytes | Tracks the total amount of data processed by Kinesis Analytics. |
aws_kinesisanalytics_input_processing_dropped_records | InputProcessing.DroppedRecords | Measures the number of dropped records during input processing. |
aws_kinesisanalytics_input_processing_duration | InputProcessing.Duration | Tracks the duration of input processing. |
aws_kinesisanalytics_input_processing_ok_bytes | InputProcessing.OkBytes | Measures the number of bytes successfully processed during input. |
aws_kinesisanalytics_input_processing_ok_records | InputProcessing.OkRecords | Tracks the number of records successfully processed during input. |
aws_kinesisanalytics_input_processing_processing_failed_records | InputProcessing.ProcessingFailedRecords | Measures the number of records that failed during input processing. |
aws_kinesisanalytics_input_processing_success | InputProcessing.Success | Tracks the success rate of input processing operations. |
aws_kinesisanalytics_kpus | KPUs | Monitors the number of Kinesis Processing Units (KPUs) used. |
aws_kinesisanalytics_lambda_delivery_delivery_failed_records | LambdaDelivery.DeliveryFailedRecords | Measures the number of failed records delivered to AWS Lambda by Kinesis Analytics. |
aws_kinesisanalytics_lambda_delivery_duration | LambdaDelivery.Duration | Tracks the duration of record delivery to AWS Lambda. |
aws_kinesisanalytics_lambda_delivery_ok_records | LambdaDelivery.OkRecords | Measures the number of records successfully delivered to AWS Lambda. |
aws_kinesisanalytics_millis_behind_latest | MillisBehindLatest | Tracks the time Kinesis Analytics is behind the latest record in milliseconds. |
aws_kinesisanalytics_records | Records | Measures the total number of records processed by Kinesis Analytics. |
aws_kinesisanalytics_success | Success | Tracks the success rate of all Kinesis Analytics operations. |
aws_kinesisanalytics_back_pressured_time_ms_per_second | backPressuredTimeMsPerSecond | Measures the amount of time in milliseconds Kinesis Analytics was back-pressured. |
aws_kinesisanalytics_busy_time_ms_per_second | busyTimeMsPerSecond | Tracks the time Kinesis Analytics spent in a busy state, processing data. |
aws_kinesisanalytics_bytes_requested_per_fetch | bytesRequestedPerFetch | Measures the number of bytes requested in each fetch operation. |
aws_kinesisanalytics_bytes_consumed_rate | bytes_consumed_rate | Tracks the rate at which bytes are consumed from the stream. |
aws_kinesisanalytics_commits_failed | commitsFailed | Measures the number of failed commit operations. |
aws_kinesisanalytics_commits_succeeded | commitsSucceeded | Tracks the number of successful commit operations. |
aws_kinesisanalytics_committedoffsets | committedoffsets | Monitors the committed offsets of records processed. |
aws_kinesisanalytics_container_cpuutilization | containerCPUUtilization | Tracks the CPU utilization of the Kinesis Analytics container. |
aws_kinesisanalytics_container_disk_utilization | containerDiskUtilization | Monitors the disk utilization of the Kinesis Analytics container. |
aws_kinesisanalytics_container_memory_utilization | containerMemoryUtilization | Measures the memory utilization of the Kinesis Analytics container. |
aws_kinesisanalytics_cpu_utilization | cpuUtilization | Tracks the overall CPU utilization of Kinesis Analytics. |
aws_kinesisanalytics_current_input_watermark | currentInputWatermark | Monitors the current watermark for input data. |
aws_kinesisanalytics_current_output_watermark | currentOutputWatermark | Tracks the current watermark for output data. |
aws_kinesisanalytics_currentoffsets | currentoffsets | Measures the current offsets for processed records. |
aws_kinesisanalytics_downtime | downtime | Tracks the total downtime of the Kinesis Analytics application. |
aws_kinesisanalytics_full_restarts | fullRestarts | Measures the number of full restarts of the Kinesis Analytics application. |
aws_kinesisanalytics_heap_memory_utilization | heapMemoryUtilization | Monitors the heap memory utilization. |
aws_kinesisanalytics_idle_time_ms_per_second | idleTimeMsPerSecond | Tracks the idle time of Kinesis Analytics in milliseconds per second. |
aws_kinesisanalytics_last_checkpoint_duration | lastCheckpointDuration | Measures the duration of the last checkpoint process. |
aws_kinesisanalytics_last_checkpoint_size | lastCheckpointSize | Monitors the size of the last checkpoint. |
aws_kinesisanalytics_managed_memory_total | managedMemoryTotal | Tracks the total managed memory available. |
aws_kinesisanalytics_managed_memory_used | managedMemoryUsed | Measures the amount of managed memory currently in use. |
aws_kinesisanalytics_managed_memory_utilization | managedMemoryUtilization | Tracks the utilization of managed memory. |
aws_kinesisanalytics_num_late_records_dropped | numLateRecordsDropped | Measures the number of late records dropped by Kinesis Analytics. |
aws_kinesisanalytics_num_records_in | numRecordsIn | Tracks the number of records ingested by Kinesis Analytics. |
aws_kinesisanalytics_num_records_in_per_second | numRecordsInPerSecond | Monitors the rate of incoming records per second. |
aws_kinesisanalytics_num_records_out | numRecordsOut | Measures the number of records output by Kinesis Analytics. |
aws_kinesisanalytics_num_records_out_per_second | numRecordsOutPerSecond | Tracks the rate of outgoing records per second. |
aws_kinesisanalytics_number_of_failed_checkpoints | numberOfFailedCheckpoints | Measures the number of failed checkpoints in Kinesis Analytics. |
aws_kinesisanalytics_old_generation_gccount | oldGenerationGCCount | Tracks the count of garbage collection events in the old generation heap space. |
aws_kinesisanalytics_old_generation_gctime | oldGenerationGCTime | Measures the time spent in garbage collection for the old generation heap. |
aws_kinesisanalytics_records_lag_max | records_lag_max | Tracks the maximum lag of records being processed by Kinesis Analytics. |
aws_kinesisanalytics_thread_count | threadCount | Monitors the number of active threads in the Kinesis Analytics application. |
aws_kinesisanalytics_uptime | uptime | Measures the uptime of the Kinesis Analytics application. |
aws_kinesisanalytics_zeppelin_cpu_utilization | zeppelinCpuUtilization | Tracks the CPU utilization of the Zeppelin server used by Kinesis Analytics. |
aws_kinesisanalytics_zeppelin_heap_memory_utilization | zeppelinHeapMemoryUtilization | Monitors the heap memory utilization of the Zeppelin server. |
aws_kinesisanalytics_zeppelin_server_uptime | zeppelinServerUptime | Tracks the uptime of the Zeppelin server. |
aws_kinesisanalytics_zeppelin_thread_count | zeppelinThreadCount | Monitors the number of active threads in the Zeppelin server. |
aws_kinesisanalytics_zeppelin_waiting_jobs | zeppelinWaitingJobs | Measures the number of jobs waiting to be processed by the Zeppelin server. |
AWS/Lambda
Function: Serverless compute service that runs code in response to events
Scrape interval: 5 minutes
Includes: Out-of-the-box dashboard
Metric | Cloudwatch metric | Purpose |
---|---|---|
aws_lambda_info | ||
aws_lambda_invocations | Invocations | Tracks the number of times your AWS Lambda function is invoked. |
aws_lambda_errors | Errors | Monitors the number of invocations that result in an error. |
aws_lambda_throttles | Throttles | Measures the number of times your Lambda function is throttled due to exceeding the concurrency limit. |
aws_lambda_duration | Duration | Tracks the amount of time a Lambda function takes to execute. |
aws_lambda_async_event_age | AsyncEventAge | Measures the age of an asynchronous event when Lambda begins executing the associated function. |
aws_lambda_async_events_dropped | AsyncEventsDropped | Monitors the number of asynchronous events dropped due to Lambda service errors or throttling. |
aws_lambda_async_events_received | AsyncEventsReceived | Tracks the number of asynchronous events received by the Lambda function. |
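aws_lambda_claimed_account_concurrency | ClaimedAccountConcurrency | Monitors the amount of account concurrency that is unavailable for on-demand invocations: unreserved concurrent executions plus allocated reserved and provisioned concurrency. |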
aws_lambda_concurrent_executions | ConcurrentExecutions | Tracks the number of concurrent executions across all Lambda functions in your account. |
aws_lambda_dead_letter_errors | DeadLetterErrors | Measures the number of failed invocations that couldn’t be sent to the Dead Letter Queue. |
aws_lambda_destination_delivery_failures | DestinationDeliveryFailures | Tracks the number of failures when delivering function results to a destination service. |
aws_lambda_iterator_age | IteratorAge | Measures the age of the last record in the event source before Lambda starts processing. |
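aws_lambda_offset_lag | OffsetLag | Tracks the offset lag for self-managed Apache Kafka and Amazon MSK event sources: the difference between the last record written to a topic and the last record processed by the function's consumer group. |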
aws_lambda_oversized_record_count | OversizedRecordCount | Measures the number of records that exceeded the maximum size supported by Lambda. |
aws_lambda_post_runtime_extensions_duration | PostRuntimeExtensionsDuration | Tracks the time taken by post-runtime extensions after Lambda function execution. |
aws_lambda_provisioned_concurrency_invocations | ProvisionedConcurrencyInvocations | Measures the number of invocations served by functions with provisioned concurrency. |
aws_lambda_provisioned_concurrency_spillover_invocations | ProvisionedConcurrencySpilloverInvocations | Tracks the number of invocations that were served by standard concurrency when provisioned concurrency was exhausted. |
aws_lambda_provisioned_concurrency_utilization | ProvisionedConcurrencyUtilization | Measures the percentage of provisioned concurrency that is being used by your Lambda function. |
aws_lambda_provisioned_concurrent_executions | ProvisionedConcurrentExecutions | Tracks the number of concurrent executions using provisioned concurrency. |
aws_lambda_recursive_invocations_dropped | RecursiveInvocationsDropped | Measures the number of recursive invocations that were dropped. |
aws_lambda_unreserved_concurrent_executions | UnreservedConcurrentExecutions | Tracks the number of concurrent executions by functions that have no reserved concurrency configured. |
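The Invocations, Errors, and Throttles rows are often combined into rates. The sketch below shows one way to do that in Python; the `window` values are made-up Sum statistics for a single 5-minute scrape window, and the helper name `ratio` is hypothetical. It assumes that throttled requests are not counted in Invocations, so the throttle fraction is taken over invocations plus throttles.

```python
def ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


# Hypothetical Sum values for one 5-minute scrape window (illustration only).
window = {
    "aws_lambda_invocations": 1200.0,
    "aws_lambda_errors": 18.0,
    "aws_lambda_throttles": 6.0,
}

# Share of invocations that ended in a function error.
error_rate = ratio(window["aws_lambda_errors"], window["aws_lambda_invocations"])

# Share of all requests (invoked plus throttled) that were throttled.
throttle_rate = ratio(
    window["aws_lambda_throttles"],
    window["aws_lambda_invocations"] + window["aws_lambda_throttles"],
)

if __name__ == "__main__":
    print(f"error rate: {error_rate:.4f}, throttle rate: {throttle_rate:.4f}")
```

Guarding against a zero denominator keeps the calculation usable for idle functions, which report no invocations in a window.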
AWS/Logs
Function: Centralized logging service for monitoring and troubleshooting applications
Scrape interval: 5 minutes
Metric | Cloudwatch metric | Purpose |
---|---|---|
aws_logs_info | ||
aws_logs_delivery_errors | DeliveryErrors | Tracks the number of errors that occurred while attempting to deliver log data to the CloudWatch Logs destination. |
aws_logs_delivery_throttling | DeliveryThrottling | Measures the number of times log delivery was throttled due to exceeding the delivery limits. |
aws_logs_forwarded_bytes | ForwardedBytes | Monitors the total volume of log data in bytes that was successfully forwarded to the CloudWatch Logs destination. |
aws_logs_forwarded_log_events | ForwardedLogEvents | Tracks the number of log events successfully forwarded to the CloudWatch Logs destination. |
aws_logs_incoming_bytes | IncomingBytes | Measures the total volume of incoming log data in bytes received by CloudWatch Logs. |
aws_logs_incoming_log_events | IncomingLogEvents | Tracks the number of log events received by CloudWatch Logs. |
AWS/MWAA
Function: Managed service for Apache Airflow to manage workflows and orchestration
Scrape interval: 5 minutes
Metric | Cloudwatch metric | Purpose |
---|---|---|
aws_mwaa_active_connection_count | ActiveConnectionCount | Tracks the number of active connections to the Managed Workflows for Apache Airflow (MWAA) environment. |
aws_mwaa_approximate_age_of_oldest_task | ApproximateAgeOfOldestTask | Measures the approximate age of the oldest task waiting in the MWAA environment's queue. |
aws_mwaa_cpuutilization | CPUUtilization | Monitors the percentage of CPU utilization in the MWAA environment. |
aws_mwaa_database_connections | DatabaseConnections | Tracks the number of connections to the database used by MWAA. |
aws_mwaa_disk_queue_depth | DiskQueueDepth | Measures the depth of the disk queue, indicating the number of IO operations waiting to be processed. |
aws_mwaa_freeable_memory | FreeableMemory | Monitors the amount of free memory available in the MWAA environment. |
aws_mwaa_memory_utilization | MemoryUtilization | Tracks the percentage of memory utilized in the MWAA environment. |
aws_mwaa_queued_tasks | QueuedTasks | Measures the number of tasks waiting to be executed in the MWAA environment. |
aws_mwaa_running_tasks | RunningTasks | Tracks the number of tasks currently running in the MWAA environment. |
aws_mwaa_volume_write_iops | VolumeWriteIOPS | Monitors the input/output operations per second (IOPS) for write operations on the volume. |
aws_mwaa_write_iops | WriteIOPS | Tracks the number of write operations per second in the MWAA environment. |
aws_mwaa_write_latency | WriteLatency | Measures the latency of write operations in the MWAA environment. |
aws_mwaa_write_throughput | WriteThroughput | Monitors the amount of data written per second in the MWAA environment. |
AWS/MediaConnect
Function: Secure and reliable transport of live video streams
Scrape interval: 5 minutes
Metric | Cloudwatch metric | Purpose |
---|---|---|
aws_mediaconnect_info | ||
aws_mediaconnect_arqrecovered | ARQRecovered | Monitors the number of Automatic Repeat reQuest (ARQ) packets successfully recovered in the MediaConnect flow. |
aws_mediaconnect_arqrequests | ARQRequests | Tracks the number of ARQ requests made by MediaConnect flows. |
aws_mediaconnect_bit_rate | BitRate | Measures the bitrate of the MediaConnect stream. |
aws_mediaconnect_caterror | CATError | Detects Conditional Access Table (CAT) errors in the MediaConnect stream. |
aws_mediaconnect_crcerror | CRCError | Tracks the number of cyclic redundancy check (CRC) errors in the stream. |
aws_mediaconnect_connected | Connected | Monitors the connection status of the MediaConnect flow. |
aws_mediaconnect_connected_outputs | ConnectedOutputs | Tracks the number of outputs connected to the MediaConnect flow. |
aws_mediaconnect_connection_attempts | ConnectionAttempts | Measures the number of attempts made to establish a connection for the flow. |
aws_mediaconnect_consecutive_drops | ConsecutiveDrops | Monitors the number of consecutive dropped packets in the MediaConnect flow. |
aws_mediaconnect_consecutive_not_recovered | ConsecutiveNotRecovered | Tracks the number of consecutive packets that were not successfully recovered. |
aws_mediaconnect_continuity_counter | ContinuityCounter | Monitors the continuity counter of the stream to detect missing packets. |
aws_mediaconnect_disconnections | Disconnections | Tracks the number of times the MediaConnect flow was disconnected. |
aws_mediaconnect_dropped_packets | DroppedPackets | Monitors the number of packets dropped in the MediaConnect flow. |
aws_mediaconnect_egress_bridge_bit_rate | EgressBridgeBitRate | Tracks the bitrate for egress bridge flows. |
aws_mediaconnect_egress_bridge_caterror | EgressBridgeCATError | Detects CAT errors in egress bridge flows. |
aws_mediaconnect_egress_bridge_crcerror | EgressBridgeCRCError | Monitors the CRC errors in egress bridge flows. |
aws_mediaconnect_egress_bridge_continuity_counter | EgressBridgeContinuityCounter | Measures the continuity of the egress bridge stream to detect missing packets. |
aws_mediaconnect_egress_bridge_dropped_packets | EgressBridgeDroppedPackets | Tracks the number of packets dropped in the egress bridge flows. |
aws_mediaconnect_egress_bridge_failover_switches | EgressBridgeFailoverSwitches | Monitors failover switches in the egress bridge flows. |
aws_mediaconnect_egress_bridge_merge_active | EgressBridgeMergeActive | Indicates if an egress bridge merge is active. |
aws_mediaconnect_egress_bridge_not_recovered_packets | EgressBridgeNotRecoveredPackets | Tracks the number of packets that were not recovered in the egress bridge. |
aws_mediaconnect_egress_bridge_paterror | EgressBridgePATError | Detects Program Association Table (PAT) errors in the egress bridge. |
aws_mediaconnect_egress_bridge_pcraccuracy_error | EgressBridgePCRAccuracyError | Monitors errors related to the accuracy of Program Clock Reference (PCR) in the egress bridge. |
aws_mediaconnect_egress_bridge_pcrerror | EgressBridgePCRError | Tracks PCR errors in the egress bridge. |
aws_mediaconnect_egress_bridge_piderror | EgressBridgePIDError | Monitors Packet Identifier (PID) errors in the egress bridge stream. |
aws_mediaconnect_egress_bridge_pmterror | EgressBridgePMTError | Detects errors in the Program Map Table (PMT) in the egress bridge. |
aws_mediaconnect_egress_bridge_ptserror | EgressBridgePTSError | Tracks Presentation Time Stamp (PTS) errors in the egress bridge stream. |
aws_mediaconnect_egress_bridge_packet_loss_percent | EgressBridgePacketLossPercent | Measures the percentage of packet loss in the egress bridge. |
aws_mediaconnect_egress_bridge_recovered_packets | EgressBridgeRecoveredPackets | Tracks the number of recovered packets in the egress bridge stream. |
aws_mediaconnect_egress_bridge_source_bit_rate | EgressBridgeSourceBitRate | Monitors the bitrate of the source in the egress bridge. |
aws_mediaconnect_egress_bridge_source_caterror | EgressBridgeSourceCATError | Detects CAT errors in the source of the egress bridge. |
aws_mediaconnect_egress_bridge_source_crcerror | EgressBridgeSourceCRCError | Tracks CRC errors in the source of the egress bridge. |
aws_mediaconnect_egress_bridge_source_continuity_counter | EgressBridgeSourceContinuityCounter | Measures the continuity of the source stream in the egress bridge to detect missing packets. |
aws_mediaconnect_egress_bridge_source_dropped_packets | EgressBridgeSourceDroppedPackets | Monitors the number of dropped packets in the source stream of the egress bridge. |
aws_mediaconnect_egress_bridge_source_merge_active | EgressBridgeSourceMergeActive | Indicates if the source merge is active in the egress bridge. |
aws_mediaconnect_egress_bridge_source_merge_latency | EgressBridgeSourceMergeLatency | Measures latency during source merge in the egress bridge. |
aws_mediaconnect_egress_bridge_source_not_recovered_packets | EgressBridgeSourceNotRecoveredPackets | Tracks the number of packets not recovered in the source of the egress bridge. |
aws_mediaconnect_egress_bridge_source_paterror | EgressBridgeSourcePATError | Detects PAT errors in the source of the egress bridge. |
aws_mediaconnect_egress_bridge_source_pcraccuracy_error | EgressBridgeSourcePCRAccuracyError | Monitors errors in the accuracy of the PCR in the source of the egress bridge. |
aws_mediaconnect_egress_bridge_source_pcrerror | EgressBridgeSourcePCRError | Tracks PCR errors in the source stream of the egress bridge. |
aws_mediaconnect_egress_bridge_source_piderror | EgressBridgeSourcePIDError | |
aws_mediaconnect_egress_bridge_source_pmterror | EgressBridgeSourcePMTError | |
aws_mediaconnect_egress_bridge_source_ptserror | EgressBridgeSourcePTSError | |
aws_mediaconnect_egress_bridge_source_packet_loss_percent | EgressBridgeSourcePacketLossPercent | |
aws_mediaconnect_egress_bridge_source_recovered_packets | EgressBridgeSourceRecoveredPackets | |
aws_mediaconnect_egress_bridge_source_tsbyte_error | EgressBridgeSourceTSByteError | |
aws_mediaconnect_egress_bridge_source_tssync_loss | EgressBridgeSourceTSSyncLoss | |
aws_mediaconnect_egress_bridge_source_total_packets | EgressBridgeSourceTotalPackets | |
aws_mediaconnect_egress_bridge_source_transport_error | EgressBridgeSourceTransportError | |
aws_mediaconnect_egress_bridge_tsbyte_error | EgressBridgeTSByteError | |
aws_mediaconnect_egress_bridge_tssync_loss | EgressBridgeTSSyncLoss | |
aws_mediaconnect_egress_bridge_total_packets | EgressBridgeTotalPackets | |
aws_mediaconnect_egress_bridge_transport_error | EgressBridgeTransportError | |
aws_mediaconnect_failover_switches | FailoverSwitches | |
aws_mediaconnect_ingress_bridge_bit_rate | IngressBridgeBitRate | |
aws_mediaconnect_ingress_bridge_caterror | IngressBridgeCATError | |
aws_mediaconnect_ingress_bridge_crcerror | IngressBridgeCRCError | |
aws_mediaconnect_ingress_bridge_continuity_counter | IngressBridgeContinuityCounter | |
aws_mediaconnect_ingress_bridge_dropped_packets | IngressBridgeDroppedPackets | |
aws_mediaconnect_ingress_bridge_failover_switches | IngressBridgeFailoverSwitches | |
aws_mediaconnect_ingress_bridge_merge_active | IngressBridgeMergeActive | |
aws_mediaconnect_ingress_bridge_not_recovered_packets | IngressBridgeNotRecoveredPackets | |
aws_mediaconnect_ingress_bridge_paterror | IngressBridgePATError | |
aws_mediaconnect_ingress_bridge_pcraccuracy_error | IngressBridgePCRAccuracyError | |
aws_mediaconnect_ingress_bridge_pcrerror | IngressBridgePCRError | |
aws_mediaconnect_ingress_bridge_piderror | IngressBridgePIDError | |
aws_mediaconnect_ingress_bridge_pmterror | IngressBridgePMTError | |
aws_mediaconnect_ingress_bridge_ptserror | IngressBridgePTSError | |
aws_mediaconnect_ingress_bridge_packet_loss_percent | IngressBridgePacketLossPercent | |
aws_mediaconnect_ingress_bridge_recovered_packets | IngressBridgeRecoveredPackets | |
aws_mediaconnect_ingress_bridge_source_arqrecovered | IngressBridgeSourceARQRecovered | |
aws_mediaconnect_ingress_bridge_source_arqrequests | IngressBridgeSourceARQRequests | |
aws_mediaconnect_ingress_bridge_source_bit_rate | IngressBridgeSourceBitRate | |
aws_mediaconnect_ingress_bridge_source_caterror | IngressBridgeSourceCATError | |
aws_mediaconnect_ingress_bridge_source_crcerror | IngressBridgeSourceCRCError | |
aws_mediaconnect_ingress_bridge_source_continuity_counter | IngressBridgeSourceContinuityCounter | |
aws_mediaconnect_ingress_bridge_source_dropped_packets | IngressBridgeSourceDroppedPackets | |
aws_mediaconnect_ingress_bridge_source_fecpackets | IngressBridgeSourceFECPackets | |
aws_mediaconnect_ingress_bridge_source_fecrecovered | IngressBridgeSourceFECRecovered | |
aws_mediaconnect_ingress_bridge_source_merge_active | IngressBridgeSourceMergeActive | |
aws_mediaconnect_ingress_bridge_source_merge_latency | IngressBridgeSourceMergeLatency | |
aws_mediaconnect_ingress_bridge_source_not_recovered_packets | IngressBridgeSourceNotRecoveredPackets | |
aws_mediaconnect_ingress_bridge_source_overflow_packets | IngressBridgeSourceOverflowPackets | |
aws_mediaconnect_ingress_bridge_source_paterror | IngressBridgeSourcePATError | |
aws_mediaconnect_ingress_bridge_source_pcraccuracy_error | IngressBridgeSourcePCRAccuracyError | |
aws_mediaconnect_ingress_bridge_source_pcrerror | IngressBridgeSourcePCRError | |
aws_mediaconnect_ingress_bridge_source_piderror | IngressBridgeSourcePIDError | |
aws_mediaconnect_ingress_bridge_source_pmterror | IngressBridgeSourcePMTError | |
aws_mediaconnect_ingress_bridge_source_ptserror | IngressBridgeSourcePTSError | |
aws_mediaconnect_ingress_bridge_source_packet_loss_percent | IngressBridgeSourcePacketLossPercent | |
aws_mediaconnect_ingress_bridge_source_recovered_packets | IngressBridgeSourceRecoveredPackets | |
aws_mediaconnect_ingress_bridge_source_round_trip_time | IngressBridgeSourceRoundTripTime | |
aws_mediaconnect_ingress_bridge_source_tsbyte_error | IngressBridgeSourceTSByteError | |
aws_mediaconnect_ingress_bridge_source_tssync_loss | IngressBridgeSourceTSSyncLoss | |
aws_mediaconnect_ingress_bridge_source_total_packets | IngressBridgeSourceTotalPackets | |
aws_mediaconnect_ingress_bridge_source_transport_error | IngressBridgeSourceTransportError | |
aws_mediaconnect_ingress_bridge_tsbyte_error | IngressBridgeTSByteError | |
aws_mediaconnect_ingress_bridge_tssync_loss | IngressBridgeTSSyncLoss | |
aws_mediaconnect_ingress_bridge_total_packets | IngressBridgeTotalPackets | |
aws_mediaconnect_ingress_bridge_transport_error | IngressBridgeTransportError | |
aws_mediaconnect_jitter | Jitter | |
aws_mediaconnect_latency | Latency | |
aws_mediaconnect_maintenance_canceled | MaintenanceCanceled | |
aws_mediaconnect_maintenance_failed | MaintenanceFailed | |
aws_mediaconnect_maintenance_rescheduled | MaintenanceRescheduled | |
aws_mediaconnect_maintenance_scheduled | MaintenanceScheduled | |
aws_mediaconnect_maintenance_started | MaintenanceStarted | |
aws_mediaconnect_maintenance_succeeded | MaintenanceSucceeded | |
aws_mediaconnect_merge_active | MergeActive | |
aws_mediaconnect_merge_latency | MergeLatency | |
aws_mediaconnect_not_recovered_packets | NotRecoveredPackets | |
aws_mediaconnect_output_connected | OutputConnected | |
aws_mediaconnect_output_disconnections | OutputDisconnections | |
aws_mediaconnect_output_dropped_payloads | OutputDroppedPayloads | |
aws_mediaconnect_output_late_payloads | OutputLatePayloads | |
aws_mediaconnect_output_total_bytes | OutputTotalBytes | |
aws_mediaconnect_output_total_payloads | OutputTotalPayloads | |
aws_mediaconnect_overflow_packets | OverflowPackets | |
aws_mediaconnect_paterror | PATError | |
aws_mediaconnect_pcraccuracy_error | PCRAccuracyError | |
aws_mediaconnect_pcrerror | PCRError | |
aws_mediaconnect_piderror | PIDError | |
aws_mediaconnect_pmterror | PMTError | |
aws_mediaconnect_ptserror | PTSError | |
aws_mediaconnect_packet_loss_percent | PacketLossPercent | |
aws_mediaconnect_recovered_packets | RecoveredPackets | |
aws_mediaconnect_round_trip_time | RoundTripTime | |
aws_mediaconnect_source_arqrecovered | SourceARQRecovered | |
aws_mediaconnect_source_arqrequests | SourceARQRequests | |
aws_mediaconnect_source_bit_rate | SourceBitRate | |
aws_mediaconnect_source_caterror | SourceCATError | |
aws_mediaconnect_source_crcerror | SourceCRCError | |
aws_mediaconnect_source_connected | SourceConnected | |
aws_mediaconnect_source_continuity_counter | SourceContinuityCounter | |
aws_mediaconnect_source_disconnections | SourceDisconnections | |
aws_mediaconnect_source_dropped_packets | SourceDroppedPackets | |
aws_mediaconnect_source_dropped_payloads | SourceDroppedPayloads | |
aws_mediaconnect_source_fecpackets | SourceFECPackets | |
aws_mediaconnect_source_fecrecovered | SourceFECRecovered | |
aws_mediaconnect_source_late_payloads | SourceLatePayloads | |
aws_mediaconnect_source_merge_active | SourceMergeActive | |
aws_mediaconnect_source_merge_latency | SourceMergeLatency | |
aws_mediaconnect_source_merge_status_warn_mismatch | SourceMergeStatusWarnMismatch | |
aws_mediaconnect_source_merge_status_warn_solo | SourceMergeStatusWarnSolo | |
aws_mediaconnect_source_missing_packets | SourceMissingPackets | |
aws_mediaconnect_source_not_recovered_packets | SourceNotRecoveredPackets | |
aws_mediaconnect_source_overflow_packets | SourceOverflowPackets | |
aws_mediaconnect_source_paterror | SourcePATError | |
aws_mediaconnect_source_pcraccuracy_error | SourcePCRAccuracyError | |
aws_mediaconnect_source_pcrerror | SourcePCRError | |
aws_mediaconnect_source_piderror | SourcePIDError | |
aws_mediaconnect_source_pmterror | SourcePMTError | |
aws_mediaconnect_source_ptserror | SourcePTSError | |
aws_mediaconnect_source_packet_loss_percent | SourcePacketLossPercent | |
aws_mediaconnect_source_recovered_packets | SourceRecoveredPackets | |
aws_mediaconnect_source_round_trip_time | SourceRoundTripTime | |
aws_mediaconnect_source_selected | SourceSelected | |
aws_mediaconnect_source_tsbyte_error | SourceTSByteError | |
aws_mediaconnect_source_tssync_loss | SourceTSSyncLoss | |
aws_mediaconnect_source_total_bytes | SourceTotalBytes | |
aws_mediaconnect_source_total_packets | SourceTotalPackets | |
aws_mediaconnect_source_total_payloads | SourceTotalPayloads | |
aws_mediaconnect_source_transport_error | SourceTransportError | |
aws_mediaconnect_tsbyte_error | TSByteError | |
aws_mediaconnect_tssync_loss | TSSyncLoss | |
aws_mediaconnect_total_packets | TotalPackets | |
aws_mediaconnect_transport_error | TransportError | |
aws_mediaconnect_uptime | Uptime | |
AWS/MediaTailor
Function: Personalizes advertisement insertion in video streams for a seamless experience
Scrape interval: 5 minutes
Metric | Cloudwatch metric | Purpose |
---|---|---|
aws_mediatailor_info | ||
aws_mediatailor_ad_decision_server_ads | AdDecisionServer.Ads | Tracks the number of ads provided by the Ad Decision Server (ADS). |
aws_mediatailor_ad_decision_server_duration | AdDecisionServer.Duration | Measures the duration of requests made to the Ad Decision Server. |
aws_mediatailor_ad_decision_server_errors | AdDecisionServer.Errors | Monitors the number of errors returned by the Ad Decision Server. |
aws_mediatailor_ad_decision_server_fill_rate | AdDecisionServer.FillRate | Tracks the rate at which ad slots are successfully filled by the Ad Decision Server. |
aws_mediatailor_ad_decision_server_timeouts | AdDecisionServer.Timeouts | Tracks the number of timeouts during requests to the Ad Decision Server. |
aws_mediatailor_ad_not_ready | AdNotReady | Indicates the number of instances where ads were not ready to be served. |
aws_mediatailor_avails_duration | Avails.Duration | Measures the duration of available ad opportunities (avails). |
aws_mediatailor_avails_fill_rate | Avails.FillRate | Tracks the rate at which avails are filled with ads. |
aws_mediatailor_avails_filled_duration | Avails.FilledDuration | Measures the total filled duration of ad avails. |
aws_mediatailor_get_manifest_errors | GetManifest.Errors | Monitors the number of errors encountered while retrieving the manifest. |
aws_mediatailor_origin_errors | Origin.Errors | Tracks the number of errors originating from the content origin server. |
aws_mediatailor_origin_timeouts | Origin.Timeouts | Monitors the number of timeouts from requests to the content origin server. |
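The metric names in the first column appear to be derived mechanically from the namespace and the CloudWatch metric name. The sketch below is inferred from the rows in these tables and is not a documented algorithm; `to_metric_name` is a hypothetical helper. It assumes that the namespace is lowercased with `/` turned into `_`, that an underscore is inserted wherever a lowercase letter or digit is followed by an uppercase letter, that `%` becomes `_percent`, and that any other punctuation (such as the `.` in `AdDecisionServer.FillRate`) becomes `_`.

```python
import re

# Assumed rule, inferred from the tables on this page (not an official algorithm):
# split wherever a lowercase letter or digit is followed by an uppercase letter.
_BOUNDARY = re.compile(r"([a-z0-9])([A-Z])")


def to_metric_name(namespace: str, cloudwatch_metric: str) -> str:
    """Hypothetical helper: derive the scraped metric name from a CloudWatch name."""
    prefix = namespace.lower().replace("/", "_")       # "AWS/MediaTailor" -> "aws_mediatailor"
    name = cloudwatch_metric.replace("%", "_percent")  # "EBSIOBalance%" -> "EBSIOBalance_percent"
    name = _BOUNDARY.sub(r"\1_\2", name)               # "FillRate" -> "Fill_Rate"
    name = re.sub(r"[^A-Za-z0-9_]", "_", name)         # "." and other punctuation -> "_"
    return f"{prefix}_{name.lower()}"


print(to_metric_name("AWS/MediaTailor", "AdDecisionServer.FillRate"))
print(to_metric_name("AWS/MediaTailor", "Avails.FilledDuration"))
```

Runs of capitals are not split, which is why a name such as `SourceTSByteError` maps to `source_tsbyte_error`. At least one row on this page, `aws_rds_to_aurora_postgre_sqlreplica_lag`, does not follow this rule exactly, so treat the tables as authoritative.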
AWS/NATGateway
Function: Manages network address translation to securely connect instances to the internet
Scrape interval: 5 minutes
Includes: Out-of-the-box dashboard
Metric | Cloudwatch metric | Purpose |
---|---|---|
aws_natgateway_info | ||
aws_natgateway_active_connection_count | ActiveConnectionCount | Tracks the number of active connections to the NAT Gateway. |
aws_natgateway_bytes_in_from_destination | BytesInFromDestination | Measures the amount of data received by the NAT Gateway from the destination (in bytes). |
aws_natgateway_bytes_in_from_source | BytesInFromSource | Measures the amount of data received by the NAT Gateway from the source (in bytes). |
aws_natgateway_bytes_out_to_destination | BytesOutToDestination | Tracks the data sent from the NAT Gateway to the destination (in bytes). |
aws_natgateway_bytes_out_to_source | BytesOutToSource | Measures the data sent from the NAT Gateway to the source (in bytes). |
aws_natgateway_connection_attempt_count | ConnectionAttemptCount | Counts the number of attempts to establish a connection via the NAT Gateway. |
aws_natgateway_connection_established_count | ConnectionEstablishedCount | Measures the successful establishment of connections through the NAT Gateway. |
aws_natgateway_error_port_allocation | ErrorPortAllocation | Tracks errors related to port allocation failures in the NAT Gateway. |
aws_natgateway_idle_timeout_count | IdleTimeoutCount | Counts the number of times connections are closed due to idle timeouts on the NAT Gateway. |
aws_natgateway_packets_drop_count | PacketsDropCount | Measures the number of packets dropped by the NAT Gateway. |
aws_natgateway_packets_in_from_destination | PacketsInFromDestination | Tracks the number of packets received by the NAT Gateway from the destination. |
aws_natgateway_packets_in_from_source | PacketsInFromSource | Measures the number of packets received by the NAT Gateway from the source. |
aws_natgateway_packets_out_to_destination | PacketsOutToDestination | Tracks the number of packets sent from the NAT Gateway to the destination. |
aws_natgateway_packets_out_to_source | PacketsOutToSource | Measures the number of packets sent from the NAT Gateway to the source. |
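Byte and packet metrics such as `BytesOutToDestination` are totals per collection period, so the `Sum` statistic is often easier to read as an average per-second rate. A minimal sketch, assuming the CloudWatch period matches the 5-minute scrape interval; the function name and the sample value are illustrative only.

```python
# Assumption: the CloudWatch period equals the 5-minute scrape interval.
SCRAPE_INTERVAL_SECONDS = 5 * 60


def per_second_rate(period_sum: float, period_seconds: int = SCRAPE_INTERVAL_SECONDS) -> float:
    """Average per-second rate for a Sum statistic collected over one period."""
    if period_seconds <= 0:
        raise ValueError("period_seconds must be positive")
    return period_sum / period_seconds


# Illustrative value: bytes sent to destinations during one period.
bytes_out = 1_500_000_000
rate_bytes = per_second_rate(bytes_out)
print(f"{rate_bytes * 8 / 1e6:.1f} Mbit/s")
```

If your period differs from the scrape interval, pass it explicitly as `period_seconds`.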
AWS/Neptune
Function: Managed graph database service for building and running graph applications
Scrape interval: 5 minutes
Metric | Cloudwatch metric | Purpose |
---|---|---|
aws_neptune_info | ||
aws_neptune_cpuutilization | CPUUtilization | Monitors the percentage of CPU resources used by the Neptune database instance. |
aws_neptune_cluster_replica_lag | ClusterReplicaLag | Measures the replication lag between the Neptune writer and reader nodes in milliseconds. |
aws_neptune_cluster_replica_lag_maximum | ClusterReplicaLagMaximum | Tracks the maximum replica lag during the monitored period. |
aws_neptune_cluster_replica_lag_minimum | ClusterReplicaLagMinimum | Tracks the minimum replica lag during the monitored period. |
aws_neptune_engine_uptime | EngineUptime | Monitors the total uptime of the Neptune engine instance. |
aws_neptune_free_local_storage | FreeLocalStorage | Monitors the amount of local storage available on the Neptune instance. |
aws_neptune_freeable_memory | FreeableMemory | Tracks the amount of available memory on the Neptune instance. |
aws_neptune_gremlin_errors | GremlinErrors | Counts the errors encountered in Gremlin queries. |
aws_neptune_gremlin_http1xx | GremlinHttp1xx | Tracks HTTP 1xx responses for Gremlin queries. |
aws_neptune_gremlin_http2xx | GremlinHttp2xx | Tracks HTTP 2xx (successful) responses for Gremlin queries. |
aws_neptune_gremlin_http4xx | GremlinHttp4xx | Monitors HTTP 4xx (client error) responses for Gremlin queries. |
aws_neptune_gremlin_http5xx | GremlinHttp5xx | Tracks HTTP 5xx (server error) responses for Gremlin queries. |
aws_neptune_gremlin_requests | GremlinRequests | Monitors the total number of Gremlin requests made. |
aws_neptune_gremlin_requests_per_sec | GremlinRequestsPerSec | Measures the rate of Gremlin requests per second. |
aws_neptune_gremlin_web_socket_available_connections | GremlinWebSocketAvailableConnections | Tracks available WebSocket connections for Gremlin. |
aws_neptune_gremlin_web_socket_client_errors | GremlinWebSocketClientErrors | Monitors WebSocket client errors for Gremlin. |
aws_neptune_gremlin_web_socket_server_errors | GremlinWebSocketServerErrors | Monitors WebSocket server errors for Gremlin. |
aws_neptune_gremlin_web_socket_success | GremlinWebSocketSuccess | Counts successful WebSocket connections for Gremlin. |
aws_neptune_http100 | Http100 | Monitors HTTP 100 responses from the Neptune instance. |
aws_neptune_http101 | Http101 | Tracks HTTP 101 responses (Switching Protocols). |
aws_neptune_http1xx | Http1xx | Tracks all HTTP 1xx responses for requests made to the Neptune instance. |
aws_neptune_http200 | Http200 | Tracks HTTP 200 (OK) responses. |
aws_neptune_http2xx | Http2xx | Monitors all HTTP 2xx responses (successful requests). |
aws_neptune_http400 | Http400 | Tracks HTTP 400 (bad request) responses. |
aws_neptune_http403 | Http403 | Monitors HTTP 403 (forbidden) responses. |
aws_neptune_http405 | Http405 | Tracks HTTP 405 (method not allowed) responses. |
aws_neptune_http413 | Http413 | Tracks HTTP 413 (request entity too large) responses. |
aws_neptune_http429 | Http429 | Monitors HTTP 429 (too many requests) responses. |
aws_neptune_http4xx | Http4xx | Tracks all HTTP 4xx (client error) responses. |
aws_neptune_http500 | Http500 | Monitors HTTP 500 (internal server error) responses. |
aws_neptune_http501 | Http501 | Tracks HTTP 501 (not implemented) responses. |
aws_neptune_http5xx | Http5xx | Monitors all HTTP 5xx (server error) responses. |
aws_neptune_loader_errors | LoaderErrors | Counts errors encountered during bulk loader operations. |
aws_neptune_loader_requests | LoaderRequests | Tracks requests made to the bulk loader. |
aws_neptune_network_receive_throughput | NetworkReceiveThroughput | Monitors the network throughput for data received by the Neptune instance. |
aws_neptune_network_throughput | NetworkThroughput | Measures the total network throughput (incoming and outgoing) of the Neptune instance. |
aws_neptune_network_transmit_throughput | NetworkTransmitThroughput | Tracks the network throughput for data transmitted by the Neptune instance. |
aws_neptune_sparql_errors | SparqlErrors | Monitors errors encountered in SPARQL queries. |
aws_neptune_sparql_http1xx | SparqlHttp1xx | Tracks HTTP 1xx responses for SPARQL queries. |
aws_neptune_sparql_http2xx | SparqlHttp2xx | Tracks HTTP 2xx responses for SPARQL queries. |
aws_neptune_sparql_http4xx | SparqlHttp4xx | Monitors HTTP 4xx responses for SPARQL queries. |
aws_neptune_sparql_http5xx | SparqlHttp5xx | Tracks HTTP 5xx responses for SPARQL queries. |
aws_neptune_sparql_requests | SparqlRequests | Measures the number of SPARQL requests made to the Neptune instance. |
aws_neptune_sparql_requests_per_sec | SparqlRequestsPerSec | Tracks the rate of SPARQL requests per second. |
aws_neptune_status_errors | StatusErrors | Monitors the number of status errors reported by the Neptune instance. |
aws_neptune_status_requests | StatusRequests | Tracks the number of status requests made to the Neptune instance. |
aws_neptune_volume_bytes_used | VolumeBytesUsed | Measures the amount of storage used by the Neptune instance. |
aws_neptune_volume_read_iops | VolumeReadIOPs | Monitors the read input/output operations per second on the Neptune instance’s volume. |
aws_neptune_volume_write_iops | VolumeWriteIOPs | Tracks the write input/output operations per second on the Neptune instance’s volume. |
AWS/NetworkELB
Function: Provides highly scalable and fault-tolerant network load balancing for traffic distribution
Scrape interval: 5 minutes
Includes: Out-of-the-box dashboard
Metric | Cloudwatch metric | Purpose |
---|---|---|
aws_networkelb_info | ||
aws_networkelb_active_flow_count | ActiveFlowCount | Monitors the total number of active flow connections through the Network Load Balancer. |
aws_networkelb_active_flow_count_tls | ActiveFlowCount_TLS | Tracks the number of active flow connections through the Network Load Balancer that are using TLS. |
aws_networkelb_client_tlsnegotiation_error_count | ClientTLSNegotiationErrorCount | Monitors the number of client TLS negotiation errors, indicating issues with SSL/TLS handshakes. |
aws_networkelb_consumed_lcus | ConsumedLCUs | Measures Load Balancer Capacity Units (LCUs) consumed by the Network Load Balancer. |
aws_networkelb_healthy_host_count | HealthyHostCount | Tracks the number of healthy targets available to receive traffic. |
aws_networkelb_new_flow_count | NewFlowCount | Measures the number of new flow connections established with the Network Load Balancer. |
aws_networkelb_new_flow_count_tls | NewFlowCount_TLS | Tracks the number of new flow connections using TLS. |
aws_networkelb_processed_bytes | ProcessedBytes | Measures the total amount of data processed by the Network Load Balancer. |
aws_networkelb_target_tlsnegotiation_error_count | TargetTLSNegotiationErrorCount | Monitors TLS negotiation errors on the target side, indicating failed handshakes. |
aws_networkelb_tcp_client_reset_count | TCP_Client_Reset_Count | Tracks the number of TCP client resets, indicating client-initiated connection terminations. |
aws_networkelb_tcp_target_reset_count | TCP_Target_Reset_Count | Monitors TCP resets initiated by the target, indicating failed connections. |
aws_networkelb_un_healthy_host_count | UnHealthyHostCount | Measures the number of targets marked as unhealthy by the load balancer. |
aws_networkelb_active_flow_count_tcp | ActiveFlowCount_TCP | Monitors the number of active TCP flows through the Network Load Balancer. |
aws_networkelb_active_flow_count_udp | ActiveFlowCount_UDP | Tracks the number of active UDP flows through the Network Load Balancer. |
aws_networkelb_consumed_lcus_tcp | ConsumedLCUs_TCP | Measures LCUs consumed by TCP traffic. |
aws_networkelb_consumed_lcus_tls | ConsumedLCUs_TLS | Measures LCUs consumed by TLS traffic. |
aws_networkelb_consumed_lcus_udp | ConsumedLCUs_UDP | Measures LCUs consumed by UDP traffic. |
aws_networkelb_new_flow_count_tcp | NewFlowCount_TCP | Tracks the number of new TCP flow connections established. |
aws_networkelb_new_flow_count_udp | NewFlowCount_UDP | Measures the number of new UDP flow connections established. |
aws_networkelb_peak_packets_per_second | PeakPacketsPerSecond | Monitors the highest rate of packets processed by the Network Load Balancer per second. |
aws_networkelb_port_allocation_error_count | PortAllocationErrorCount | Tracks the number of errors due to port allocation failures. |
aws_networkelb_processed_bytes_tcp | ProcessedBytes_TCP | Measures the total data processed over TCP connections. |
aws_networkelb_processed_bytes_tls | ProcessedBytes_TLS | Tracks the total data processed over TLS connections. |
aws_networkelb_processed_bytes_udp | ProcessedBytes_UDP | Monitors the total data processed over UDP connections. |
aws_networkelb_processed_packets | ProcessedPackets | Tracks the total number of packets processed by the Network Load Balancer. |
aws_networkelb_security_group_blocked_flow_count_inbound_icmp | SecurityGroupBlockedFlowCount_Inbound_ICMP | Measures the number of inbound ICMP flows blocked by security groups. |
aws_networkelb_security_group_blocked_flow_count_inbound_tcp | SecurityGroupBlockedFlowCount_Inbound_TCP | Tracks the number of inbound TCP flows blocked by security groups. |
aws_networkelb_security_group_blocked_flow_count_inbound_udp | SecurityGroupBlockedFlowCount_Inbound_UDP | Monitors the number of inbound UDP flows blocked by security groups. |
aws_networkelb_security_group_blocked_flow_count_outbound_icmp | SecurityGroupBlockedFlowCount_Outbound_ICMP | Measures the number of outbound ICMP flows blocked by security groups. |
aws_networkelb_security_group_blocked_flow_count_outbound_tcp | SecurityGroupBlockedFlowCount_Outbound_TCP | Tracks the number of outbound TCP flows blocked by security groups. |
aws_networkelb_security_group_blocked_flow_count_outbound_udp | SecurityGroupBlockedFlowCount_Outbound_UDP | Monitors the number of outbound UDP flows blocked by security groups. |
aws_networkelb_tcp_elb_reset_count | TCP_ELB_Reset_Count | Tracks the number of TCP resets initiated by the Network Load Balancer itself. |
aws_networkelb_unhealthy_routing_flow_count | UnhealthyRoutingFlowCount | Monitors the number of routing flows directed to unhealthy targets. |
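`HealthyHostCount` and `UnHealthyHostCount` are often combined into a single healthy-target ratio for alerting. The following is a sketch under that assumption; `healthy_fraction` is a hypothetical helper, and both inputs are expected to come from the same period and the same target group.

```python
from typing import Optional


def healthy_fraction(healthy: float, unhealthy: float) -> Optional[float]:
    """Share of registered targets that are healthy, or None when there are no targets."""
    total = healthy + unhealthy
    if total <= 0:
        # No registered targets: a ratio would be misleading, so report nothing.
        return None
    return healthy / total


# Illustrative values: 3 healthy targets and 1 unhealthy target.
print(healthy_fraction(3, 1))
```

Returning `None` for an empty target group keeps "no targets" distinct from "all targets unhealthy", which would otherwise both read as zero.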
AWS/NetworkFirewall
Function: Managed network firewall service to secure VPCs
Scrape interval: 5 minutes
Metric | Cloudwatch metric | Purpose |
---|---|---|
aws_networkfirewall_info | ||
aws_networkfirewall_dropped_packets | DroppedPackets | Tracks the number of packets dropped by the Network Firewall, indicating blocked or failed traffic. |
aws_networkfirewall_packets | Packets | Monitors the total number of packets inspected by the Network Firewall. |
aws_networkfirewall_passed_packets | PassedPackets | Measures the number of packets allowed through the Network Firewall, indicating successful traffic. |
aws_networkfirewall_received_packet_count | ReceivedPacketCount | Tracks the total number of packets received by the Network Firewall for inspection. |
AWS/PrivateLinkEndpoints
Function: Provides private connectivity between VPCs and AWS services or third-party services
Scrape interval: 5 minutes
Metric | Cloudwatch metric | Purpose |
---|---|---|
aws_privatelinkendpoints_info | ||
aws_privatelinkendpoints_active_connections | ActiveConnections | Tracks the number of active connections through the PrivateLink endpoints. |
aws_privatelinkendpoints_bytes_processed | BytesProcessed | Measures the amount of data processed by the PrivateLink endpoints in bytes. |
aws_privatelinkendpoints_new_connections | NewConnections | Monitors the number of new connections established through the PrivateLink endpoints. |
aws_privatelinkendpoints_packets_dropped | PacketsDropped | Tracks the number of packets dropped by the PrivateLink endpoints, which could indicate errors or network issues. |
aws_privatelinkendpoints_rst_packets_received | RstPacketsReceived | Measures the number of reset (RST) packets received, which can indicate connection terminations. |
AWS/PrivateLinkServices
Function: Exposes your own services to other VPCs privately over AWS PrivateLink
Scrape interval: 5 minutes
Metric | Cloudwatch metric | Purpose |
---|---|---|
aws_privatelinkservices_info | ||
aws_privatelinkservices_active_connections | ActiveConnections | Monitors the number of active connections managed by the PrivateLink services. |
aws_privatelinkservices_bytes_processed | BytesProcessed | Measures the total amount of data processed by the PrivateLink services in bytes. |
aws_privatelinkservices_endpoints_count | EndpointsCount | Tracks the number of PrivateLink service endpoints currently connected. |
aws_privatelinkservices_new_connections | NewConnections | Monitors the number of new connections established via the PrivateLink services. |
aws_privatelinkservices_rst_packets_received | RstPacketsReceived | Measures the number of reset (RST) packets received, indicating terminated connections. |
AWS/Prometheus
Function: Managed Prometheus-compatible service for metrics monitoring and alerting
Scrape interval: 5 minutes
Metric | Cloudwatch metric | Purpose |
---|---|---|
aws_prometheus_info | ||
aws_prometheus_alert_manager_alerts_received | AlertManagerAlertsReceived | Tracks the number of alerts received by the Prometheus Alert Manager. |
aws_prometheus_alert_manager_notifications_failed | AlertManagerNotificationsFailed | Monitors the number of failed alert notifications sent by the Prometheus Alert Manager. |
aws_prometheus_alert_manager_notifications_throttled | AlertManagerNotificationsThrottled | Measures the number of alert notifications throttled due to rate limits or other constraints. |
aws_prometheus_discarded_samples | DiscardedSamples | Tracks the number of discarded samples due to errors or incorrect data. |
aws_prometheus_rule_evaluation_failures | RuleEvaluationFailures | Monitors the number of failed rule evaluations in Prometheus. |
aws_prometheus_rule_evaluations | RuleEvaluations | Measures the total number of rule evaluations performed by Prometheus. |
aws_prometheus_rule_group_iterations_missed | RuleGroupIterationsMissed | Tracks the number of rule group evaluation iterations that were missed due to processing delays. |
AWS/RDS
Function: Managed relational database service for databases like MySQL, PostgreSQL, and Oracle
Scrape interval: 5 minutes
Includes: Out-of-the-box dashboard
Metric | Cloudwatch metric | Purpose |
---|---|---|
aws_rds_info | ||
aws_rds_cpuutilization | CPUUtilization | Tracks the utilization of CPU resources by RDS instances. |
aws_rds_database_connections | DatabaseConnections | Measures the number of active database connections to RDS instances. |
aws_rds_replica_lag | ReplicaLag | Monitors how far a read replica lags behind its source DB instance. |
aws_rds_freeable_memory | FreeableMemory | Indicates the available memory that can be used by the RDS instance. |
aws_rds_free_storage_space | FreeStorageSpace | Shows the remaining storage space available on the RDS instance. |
aws_rds_free_storage_space_log_volume | FreeStorageSpaceLogVolume | |
aws_rds_swap_usage | SwapUsage | Monitors the amount of swap space used by the RDS instance. |
aws_rds_read_throughput | ReadThroughput | Measures the throughput for read operations from the database. |
aws_rds_read_latency | ReadLatency | Indicates the latency for read operations on the database. |
aws_rds_read_iops | ReadIOPS | Tracks the input/output operations per second for reads on the RDS instance. |
aws_rds_write_throughput | WriteThroughput | Measures the throughput for write operations to the database. |
aws_rds_write_latency | WriteLatency | Indicates the latency for write operations on the database. |
aws_rds_write_iops | WriteIOPS | Tracks the input/output operations per second for writes on the RDS instance. |
aws_rds_burst_balance | BurstBalance | Monitors the percentage of General Purpose SSD (gp2) burst-bucket I/O credits still available. |
aws_rds_ebsbyte_balance_percent | EBSByteBalance% | |
aws_rds_ebsiobalance_percent | EBSIOBalance% | |
aws_rds_dbload | DBLoad | Measures database load on the instance as the average number of active sessions. |
aws_rds_dbload_cpu | DBLoadCPU | Tracks the portion of database load related to CPU usage. |
aws_rds_dbload_non_cpu | DBLoadNonCPU | Measures the portion of database load unrelated to CPU usage. |
aws_rds_cpucredit_usage | CPUCreditUsage | |
aws_rds_cpucredit_balance | CPUCreditBalance | |
aws_rds_acuutilization | ACUUtilization | Monitors the utilization of Aurora Capacity Units (ACUs). |
aws_rds_aborted_clients | AbortedClients | Tracks the number of aborted client connections to the database. |
aws_rds_active_transactions | ActiveTransactions | Shows the number of active transactions on the database. |
aws_rds_aurora_binlog_replica_lag | AuroraBinlogReplicaLag | Monitors how far an Aurora MySQL binary log replica lags behind its binary log replication source. |
aws_rds_aurora_dmlrejected_master_full | AuroraDMLRejectedMasterFull | |
aws_rds_aurora_dmlrejected_writer_full | AuroraDMLRejectedWriterFull | |
aws_rds_aurora_estimated_shared_memory_bytes | AuroraEstimatedSharedMemoryBytes | |
aws_rds_aurora_global_dbdata_transfer_bytes | AuroraGlobalDBDataTransferBytes | |
aws_rds_aurora_global_dbprogress_lag | AuroraGlobalDBProgressLag | |
aws_rds_aurora_global_dbrpolag | AuroraGlobalDBRPOLag | |
aws_rds_aurora_global_dbreplicated_write_io | AuroraGlobalDBReplicatedWriteIO | |
aws_rds_aurora_global_dbreplication_lag | AuroraGlobalDBReplicationLag | |
aws_rds_aurora_memory_health_state | AuroraMemoryHealthState | Indicates the health state of memory in Aurora instances. |
aws_rds_aurora_memory_num_declined_sql_total | AuroraMemoryNumDeclinedSqlTotal | |
aws_rds_aurora_memory_num_kill_conn_total | AuroraMemoryNumKillConnTotal | |
aws_rds_aurora_memory_num_kill_query_total | AuroraMemoryNumKillQueryTotal | |
aws_rds_aurora_optimized_reads_cache_hit_ratio | AuroraOptimizedReadsCacheHitRatio | |
aws_rds_aurora_replica_lag | AuroraReplicaLag | |
aws_rds_aurora_replica_lag_maximum | AuroraReplicaLagMaximum | |
aws_rds_aurora_replica_lag_minimum | AuroraReplicaLagMinimum | |
aws_rds_aurora_slow_connection_handle_count | AuroraSlowConnectionHandleCount | |
aws_rds_aurora_slow_handshake_count | AuroraSlowHandshakeCount | |
aws_rds_aurora_volume_bytes_left_total | AuroraVolumeBytesLeftTotal | |
aws_rds_availability_percentage | AvailabilityPercentage | Measures the availability of the RDS instance in terms of percentage uptime. |
aws_rds_backtrack_change_records_creation_rate | BacktrackChangeRecordsCreationRate | |
aws_rds_backtrack_change_records_stored | BacktrackChangeRecordsStored | |
aws_rds_backtrack_window_actual | BacktrackWindowActual | |
aws_rds_backtrack_window_alert | BacktrackWindowAlert | |
aws_rds_backup_retention_period_storage_used | BackupRetentionPeriodStorageUsed | |
aws_rds_bin_log_disk_usage | BinLogDiskUsage | |
aws_rds_blocked_transactions | BlockedTransactions | |
aws_rds_buffer_cache_hit_ratio | BufferCacheHitRatio | |
aws_rds_cpusurplus_credit_balance | CPUSurplusCreditBalance | |
aws_rds_cpusurplus_credits_charged | CPUSurplusCreditsCharged | |
aws_rds_checkpoint_lag | CheckpointLag | |
aws_rds_client_connections | ClientConnections | |
aws_rds_client_connections_closed | ClientConnectionsClosed | |
aws_rds_client_connections_no_tls | ClientConnectionsNoTLS | |
aws_rds_client_connections_received | ClientConnectionsReceived | |
aws_rds_client_connections_setup_failed_auth | ClientConnectionsSetupFailedAuth | |
aws_rds_client_connections_setup_succeeded | ClientConnectionsSetupSucceeded | |
aws_rds_client_connections_tls | ClientConnectionsTLS | |
aws_rds_commit_latency | CommitLatency | |
aws_rds_commit_throughput | CommitThroughput | |
aws_rds_connection_attempts | ConnectionAttempts | |
aws_rds_ddllatency | DDLLatency | |
aws_rds_ddlthroughput | DDLThroughput | |
aws_rds_dmllatency | DMLLatency | |
aws_rds_dmlthroughput | DMLThroughput | |
aws_rds_database_connection_requests | DatabaseConnectionRequests | |
aws_rds_database_connection_requests_with_tls | DatabaseConnectionRequestsWithTLS | |
aws_rds_database_connections_borrow_latency | DatabaseConnectionsBorrowLatency | |
aws_rds_database_connections_currently_borrowed | DatabaseConnectionsCurrentlyBorrowed | |
aws_rds_database_connections_currently_in_transaction | DatabaseConnectionsCurrentlyInTransaction | |
aws_rds_database_connections_currently_session_pinned | DatabaseConnectionsCurrentlySessionPinned | |
aws_rds_database_connections_setup_failed | DatabaseConnectionsSetupFailed | |
aws_rds_database_connections_setup_succeeded | DatabaseConnectionsSetupSucceeded | |
aws_rds_database_connections_with_tls | DatabaseConnectionsWithTLS | |
aws_rds_deadlocks | Deadlocks | |
aws_rds_delete_latency | DeleteLatency | |
aws_rds_delete_throughput | DeleteThroughput | |
aws_rds_disk_queue_depth | DiskQueueDepth | |
aws_rds_disk_queue_depth_log_volume | DiskQueueDepthLogVolume | |
aws_rds_engine_uptime | EngineUptime | |
aws_rds_failed_sqlserver_agent_jobs_count | FailedSQLServerAgentJobsCount | |
aws_rds_free_ephemeral_storage | FreeEphemeralStorage | |
aws_rds_free_local_storage | FreeLocalStorage | |
aws_rds_insert_latency | InsertLatency | |
aws_rds_insert_throughput | InsertThroughput | |
aws_rds_login_failures | LoginFailures | |
aws_rds_max_database_connections_allowed | MaxDatabaseConnectionsAllowed | |
aws_rds_maximum_used_transaction_ids | MaximumUsedTransactionIDs | |
aws_rds_network_receive_throughput | NetworkReceiveThroughput | |
aws_rds_network_throughput | NetworkThroughput | |
aws_rds_network_transmit_throughput | NetworkTransmitThroughput | |
aws_rds_num_binary_log_files | NumBinaryLogFiles | |
aws_rds_oldest_replication_slot_lag | OldestReplicationSlotLag | |
aws_rds_purge_boundary | PurgeBoundary | |
aws_rds_purge_finished_point | PurgeFinishedPoint | |
aws_rds_queries | Queries | Counts the number of queries executed on the RDS instance. |
aws_rds_query_database_response_latency | QueryDatabaseResponseLatency | |
aws_rds_query_requests | QueryRequests | |
aws_rds_query_requests_no_tls | QueryRequestsNoTLS | |
aws_rds_query_requests_tls | QueryRequestsTLS | |
aws_rds_query_response_latency | QueryResponseLatency | |
aws_rds_to_aurora_postgre_sqlreplica_lag | RDSToAuroraPostgreSQLReplicaLag | |
aws_rds_read_iopsephemeral_storage | ReadIOPSEphemeralStorage | |
aws_rds_read_iopslog_volume | ReadIOPSLogVolume | |
aws_rds_read_latency_ephemeral_storage | ReadLatencyEphemeralStorage | |
aws_rds_read_latency_log_volume | ReadLatencyLogVolume | |
aws_rds_read_throughput_ephemeral_storage | ReadThroughputEphemeralStorage | |
aws_rds_read_throughput_log_volume | ReadThroughputLogVolume | |
aws_rds_replication_channel_lag | ReplicationChannelLag | |
aws_rds_replication_slot_disk_usage | ReplicationSlotDiskUsage | |
aws_rds_result_set_cache_hit_ratio | ResultSetCacheHitRatio | |
aws_rds_rollback_segment_history_list_length | RollbackSegmentHistoryListLength | |
aws_rds_row_lock_time | RowLockTime | |
aws_rds_select_latency | SelectLatency | |
aws_rds_select_throughput | SelectThroughput | |
aws_rds_serverless_database_capacity | ServerlessDatabaseCapacity | |
aws_rds_snapshot_storage_used | SnapshotStorageUsed | |
aws_rds_storage_network_receive_throughput | StorageNetworkReceiveThroughput | |
aws_rds_storage_network_throughput | StorageNetworkThroughput | Measures the combined network throughput sent to and received from the Aurora storage subsystem by each instance in the cluster. |
aws_rds_storage_network_transmit_throughput | StorageNetworkTransmitThroughput | |
aws_rds_sum_binary_log_size | SumBinaryLogSize | |
aws_rds_temp_storage_iops | TempStorageIOPS | |
aws_rds_temp_storage_throughput | TempStorageThroughput | |
aws_rds_total_backup_storage_billed | TotalBackupStorageBilled | |
aws_rds_transaction_logs_disk_usage | TransactionLogsDiskUsage | Tracks the amount of disk space used by transaction logs. |
aws_rds_transaction_logs_generation | TransactionLogsGeneration | |
aws_rds_truncate_finished_point | TruncateFinishedPoint | |
aws_rds_update_latency | UpdateLatency | |
aws_rds_update_throughput | UpdateThroughput | |
aws_rds_volume_bytes_used | VolumeBytesUsed | Shows the amount of storage used by the Aurora DB cluster volume. |
aws_rds_volume_read_iops | VolumeReadIOPs | |
aws_rds_volume_write_iops | VolumeWriteIOPs | |
aws_rds_write_iopsephemeral_storage | WriteIOPSEphemeralStorage | |
aws_rds_write_iopslog_volume | WriteIOPSLogVolume | |
aws_rds_write_latency_ephemeral_storage | WriteLatencyEphemeralStorage | |
aws_rds_write_latency_log_volume | WriteLatencyLogVolume | Monitors the latency for write operations on the log volume. |
aws_rds_write_throughput_ephemeral_storage | WriteThroughputEphemeralStorage | |
aws_rds_write_throughput_log_volume | WriteThroughputLogVolume | |
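`DBLoad` is expressed in average active sessions, so it is usually read relative to the number of vCPUs on the instance. A minimal sketch of that comparison; the vCPU count is not one of the metrics above and must be supplied from your instance class, and the function names and the threshold of 1.0 session per vCPU are illustrative assumptions rather than a recommended setting.

```python
def load_per_vcpu(db_load: float, vcpus: int) -> float:
    """Average active sessions per vCPU for one period."""
    if vcpus <= 0:
        raise ValueError("vcpus must be positive")
    return db_load / vcpus


def is_overloaded(db_load: float, vcpus: int, threshold: float = 1.0) -> bool:
    """True when load per vCPU exceeds the (assumed) threshold."""
    return load_per_vcpu(db_load, vcpus) > threshold


# Illustrative values: DBLoad of 6 average active sessions on a 4-vCPU instance.
print(load_per_vcpu(6.0, 4), is_overloaded(6.0, 4))
```

The same comparison can be applied to `DBLoadCPU` alone to check whether the load is CPU-bound rather than spent waiting on other resources.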
AWS/Redshift
Function: Fully managed data warehouse for large-scale data analytics
Scrape interval: 5 minutes
Metric | Cloudwatch metric | Purpose | |
---|---|---|---|
aws_redshift_info | |||
aws_redshift_cpuutilization | CPUUtilization | Tracks CPU utilization across Redshift clusters. | |
aws_redshift_commit_queue_length | CommitQueueLength | Measures the length of the commit queue for query execution. | |
aws_redshift_concurrency_scaling_active_clusters | ConcurrencyScalingActiveClusters | Monitors the number of active concurrency scaling clusters. | |
aws_redshift_concurrency_scaling_seconds | ConcurrencyScalingSeconds | Measures the time spent scaling for concurrency. | |
aws_redshift_database_connections | DatabaseConnections | Tracks the number of database connections to the Redshift cluster. | |
aws_redshift_health_status | HealthStatus | Provides health status of Redshift clusters. | |
aws_redshift_maintenance_mode | MaintenanceMode | Indicates if the cluster is in maintenance mode. | |
aws_redshift_max_configured_concurrency_scaling_clusters | MaxConfiguredConcurrencyScalingClusters | Tracks the maximum number of concurrency scaling clusters configured. | |
aws_redshift_network_receive_throughput | NetworkReceiveThroughput | Measures the network throughput for receiving data. | |
aws_redshift_network_transmit_throughput | NetworkTransmitThroughput | Measures the network throughput for transmitting data. | |
aws_redshift_num_exceeded_schema_quotas | NumExceededSchemaQuotas | Tracks how often schema quotas have been exceeded. | |
aws_redshift_percentage_disk_space_used | PercentageDiskSpaceUsed | Shows the percentage of disk space used by the cluster. | |
aws_redshift_percentage_quota_used | PercentageQuotaUsed | Monitors the percentage of quota used. | |
aws_redshift_queries_completed_per_second | QueriesCompletedPerSecond | Measures the number of queries completed per second. | |
aws_redshift_query_duration | QueryDuration | Tracks the duration of queries. | |
aws_redshift_query_runtime_breakdown | QueryRuntimeBreakdown | Provides a breakdown of the time spent on query execution. | |
aws_redshift_read_iops | ReadIOPS | Measures input/output operations per second for reads. | |
aws_redshift_read_latency | ReadLatency | Tracks latency for read operations. | |
aws_redshift_read_throughput | ReadThroughput | Measures throughput for read operations. | |
aws_redshift_schema_quota | SchemaQuota | Monitors schema quota usage. | |
aws_redshift_storage_used | StorageUsed | Shows the amount of storage used by the Redshift cluster. | |
aws_redshift_total_table_count | TotalTableCount | Measures the total number of tables in the cluster. | |
aws_redshift_wlmqueries_completed_per_second | WLMQueriesCompletedPerSecond | Tracks the number of queries completed per second in the Workload Management (WLM) queue. | |
aws_redshift_wlmquery_duration | WLMQueryDuration | Measures the duration of queries in the WLM queue. | |
aws_redshift_wlmqueue_length | WLMQueueLength | Tracks the length of the WLM queue. | |
aws_redshift_wlmqueue_wait_time | WLMQueueWaitTime | Measures the wait time for queries in the WLM queue. | |
aws_redshift_wlmrunning_queries | WLMRunningQueries | Shows the number of queries currently running in the WLM queue. | |
aws_redshift_write_iops | WriteIOPS | Measures input/output operations per second for writes. | |
aws_redshift_write_latency | WriteLatency | Tracks latency for write operations. | |
aws_redshift_write_throughput | WriteThroughput | Measures throughput for write operations. | |
AWS/Route53
Function: Scalable DNS and domain registration service
Scrape interval: 5 minutes
Metric | Cloudwatch metric | Purpose |
---|---|---|
aws_route53_info | ||
aws_route53_child_health_check_healthy_count | ChildHealthCheckHealthyCount | Tracks the count of healthy child health checks. |
aws_route53_connection_time | ConnectionTime | Measures the time it takes to establish a connection. |
aws_route53_dnsqueries | DNSQueries | Monitors the number of DNS queries handled by Route 53. |
aws_route53_health_check_percentage_healthy | HealthCheckPercentageHealthy | Displays the percentage of healthy Route 53 health checks. |
aws_route53_health_check_status | HealthCheckStatus | Indicates the status of health checks, showing whether they are passing or failing. |
aws_route53_sslhandshake_time | SSLHandshakeTime | Measures the time it takes to complete the SSL handshake. |
aws_route53_time_to_first_byte | TimeToFirstByte | Tracks the time taken to receive the first byte of the response after a request is sent. |
AWS/Route53Resolver
Function: DNS firewall to filter and monitor DNS queries
Scrape interval: 5 minutes
Metric | Cloudwatch metric | Purpose |
---|---|---|
aws_route53resolver_info | ||
aws_route53resolver_inbound_query_volume | InboundQueryVolume | Measures the volume of DNS queries received by the Route 53 Resolver inbound endpoint. |
aws_route53resolver_outbound_query_aggregated_volume | OutboundQueryAggregatedVolume | Tracks the total volume of outbound DNS queries across all outbound endpoints. |
aws_route53resolver_outbound_query_volume | OutboundQueryVolume | Monitors the volume of DNS queries sent by the Route 53 Resolver outbound endpoint. |
AWS/S3
Function: Scalable object storage service for a wide range of data types
Scrape interval: 5 minutes
Includes: Out-of-the-box dashboard
Metric | Cloudwatch metric | Purpose |
---|---|---|
aws_s3_info | ||
aws_s3_number_of_objects | NumberOfObjects | Tracks the total number of objects stored in an S3 bucket. |
aws_s3_bucket_size_bytes | BucketSizeBytes | Measures the total size of an S3 bucket in bytes. |
aws_s3_all_requests | AllRequests | Measures the total number of all requests made to an S3 bucket. |
aws_s3_4xx_errors | 4xxErrors | Counts the number of 4xx HTTP status code errors encountered during S3 requests. |
aws_s3_total_request_latency | TotalRequestLatency | Measures the total latency for S3 requests. |
aws_s3_5xx_errors | 5xxErrors | Counts the number of 5xx HTTP status code errors encountered during S3 requests. |
aws_s3_bytes_downloaded | BytesDownloaded | Tracks the total bytes downloaded from an S3 bucket. |
aws_s3_bytes_pending_replication | BytesPendingReplication | Measures the bytes pending replication in S3 cross-region replication scenarios. |
aws_s3_bytes_uploaded | BytesUploaded | Tracks the total bytes uploaded to an S3 bucket. |
aws_s3_delete_requests | DeleteRequests | Measures the number of delete requests made to an S3 bucket. |
aws_s3_first_byte_latency | FirstByteLatency | Tracks the latency until the first byte is sent in an S3 request. |
aws_s3_get_requests | GetRequests | Measures the number of GET requests made to an S3 bucket. |
aws_s3_head_requests | HeadRequests | Counts the number of HEAD requests made to an S3 bucket. |
aws_s3_list_requests | ListRequests | Tracks the number of LIST requests made to an S3 bucket. |
aws_s3_operations_failed_replication | OperationsFailedReplication | Counts the number of replication operations that have failed. |
aws_s3_operations_pending_replication | OperationsPendingReplication | Tracks the number of pending replication operations in an S3 bucket. |
aws_s3_post_requests | PostRequests | Counts the number of POST requests made to an S3 bucket. |
aws_s3_put_requests | PutRequests | Tracks the number of PUT requests made to an S3 bucket. |
aws_s3_replication_latency | ReplicationLatency | Measures the latency of replication operations. |
aws_s3_select_requests | SelectRequests | Measures the number of select requests made to an S3 bucket. |
aws_s3_select_returned_bytes | SelectReturnedBytes | Tracks the bytes returned by S3 Select queries. |
aws_s3_select_scanned_bytes | SelectScannedBytes | Measures the bytes scanned by S3 Select queries. |
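
The request and error counters above are often combined into a single error rate. The following is a minimal sketch, assuming you have already fetched the Sum statistic of AllRequests, 4xxErrors, and 5xxErrors for the same bucket and the same time window. The function name and its parameters are hypothetical and are not part of any AWS or exporter API.

```python
def s3_error_rate(all_requests: float, errors_4xx: float, errors_5xx: float) -> float:
    """Return the fraction of requests in one window that ended in a 4xx or 5xx status.

    All three inputs are assumed to be Sum values over the same window.
    """
    if all_requests <= 0:
        # No traffic in the window: report 0.0 rather than dividing by zero.
        return 0.0
    return (errors_4xx + errors_5xx) / all_requests


if __name__ == "__main__":
    # Illustrative numbers only: 200 requests, 6 client errors, 4 server errors.
    print(s3_error_rate(200, 6, 4))
```

Returning 0.0 for an empty window is a design choice for this sketch; treating such windows as "no data" may suit alerting better.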
AWS/SES
Function: Email service for sending marketing, notification, and transactional emails
Scrape interval: 5 minutes
Includes: Out-of-the-box dashboard
Metric | Cloudwatch metric | Purpose |
---|---|---|
aws_ses_bounce | Bounce | |
aws_ses_complaint | Complaint | |
aws_ses_delivery | Delivery | |
aws_ses_reject | Reject | |
aws_ses_send | Send | |
aws_ses_clicks | Clicks | |
aws_ses_opens | Opens | |
aws_ses_rendering_failures | Rendering Failures | |
aws_ses_reputation_bounce_rate | Reputation.BounceRate | |
aws_ses_reputation_complaint_rate | Reputation.ComplaintRate | |
AWS/SNS
Function: Managed messaging service for sending notifications to mobile devices or other services
Scrape interval: 5 minutes
Metric | Cloudwatch metric | Purpose |
---|---|---|
aws_sns_info | ||
aws_sns_number_of_messages_published | NumberOfMessagesPublished | Tracks the number of messages published to SNS topics. |
aws_sns_number_of_notifications_delivered | NumberOfNotificationsDelivered | Measures the number of successfully delivered notifications. |
aws_sns_number_of_notifications_failed | NumberOfNotificationsFailed | Tracks the number of failed notifications. |
aws_sns_number_of_notifications_filtered_out | NumberOfNotificationsFilteredOut | Measures the notifications that were filtered out based on the subscription’s filter policies. |
aws_sns_number_of_notifications_filtered_out_invalid_attributes | NumberOfNotificationsFilteredOut-InvalidAttributes | Tracks the notifications filtered out due to invalid message attributes. |
aws_sns_number_of_notifications_filtered_out_message_body | NumberOfNotificationsFilteredOut-MessageBody | Measures notifications filtered out because of the message body content. |
aws_sns_number_of_notifications_filtered_out_no_message_attributes | NumberOfNotificationsFilteredOut-NoMessageAttributes | Tracks notifications filtered out due to missing message attributes. |
aws_sns_publish_size | PublishSize | Measures the size of messages published to SNS topics. |
aws_sns_smsmonth_to_date_spent_usd | SMSMonthToDateSpentUSD | Tracks the month-to-date costs incurred for sending SMS messages. |
aws_sns_smssuccess_rate | SMSSuccessRate | Measures the success rate of sending SMS messages via SNS. |
AWS/SQS
Function: Fully managed message queuing service for decoupling and scaling microservices
Scrape interval: 5 minutes
Includes: Out-of-the-box dashboard
Metric | Cloudwatch metric | Purpose |
---|---|---|
aws_sqs_info | ||
aws_sqs_approximate_age_of_oldest_message | ApproximateAgeOfOldestMessage | Tracks the approximate age of the oldest message in the queue. |
aws_sqs_approximate_number_of_messages_delayed | ApproximateNumberOfMessagesDelayed | Measures the approximate number of messages currently delayed. |
aws_sqs_approximate_number_of_messages_not_visible | ApproximateNumberOfMessagesNotVisible | Tracks the approximate number of messages that are not visible to consumers due to being in flight. |
aws_sqs_approximate_number_of_messages_visible | ApproximateNumberOfMessagesVisible | Measures the approximate number of messages currently visible to consumers. |
aws_sqs_number_of_empty_receives | NumberOfEmptyReceives | Tracks the number of receive requests that did not return any messages. |
aws_sqs_number_of_messages_deleted | NumberOfMessagesDeleted | Measures the number of messages successfully deleted from the queue. |
aws_sqs_number_of_messages_received | NumberOfMessagesReceived | Tracks the number of messages received from the queue. |
aws_sqs_number_of_messages_sent | NumberOfMessagesSent | Measures the number of messages successfully sent to the queue. |
aws_sqs_sent_message_size | SentMessageSize | Tracks the size of messages sent to the queue. |
AWS/SageMaker
Function: Managed service for building, training, and deploying machine learning models
Scrape interval: 5 minutes
Metric | Cloudwatch metric | Purpose |
---|---|---|
aws_sagemaker_info | ||
aws_sagemaker_invocation4_xxerrors | Invocation4XXErrors | Tracks the count of 4XX errors (client-side errors) during model invocations. |
aws_sagemaker_invocation5_xxerrors | Invocation5XXErrors | Tracks the count of 5XX errors (server-side errors) during model invocations. |
aws_sagemaker_invocation_model_errors | InvocationModelErrors | Measures the errors specific to model invocations. |
aws_sagemaker_invocations | Invocations | Counts the number of invocation requests sent to a model endpoint. |
aws_sagemaker_invocations_per_copy | InvocationsPerCopy | Tracks the number of invocations per copy of the model. |
aws_sagemaker_invocations_per_instance | InvocationsPerInstance | Measures the number of invocations per instance. |
aws_sagemaker_model_cache_hit | ModelCacheHit | Tracks the instances where model cache is hit, reducing load times. |
aws_sagemaker_model_downloading_time | ModelDownloadingTime | Measures the time taken to download the model to the instance. |
aws_sagemaker_model_latency | ModelLatency | Tracks the latency of model invocations. |
aws_sagemaker_model_loading_time | ModelLoadingTime | Measures the time taken to load the model on the instance. |
aws_sagemaker_model_loading_wait_time | ModelLoadingWaitTime | Measures the wait time during the model loading process. |
aws_sagemaker_model_setup_time | ModelSetupTime | Tracks the time taken to set up the model environment. |
aws_sagemaker_model_unloading_time | ModelUnloadingTime | Measures the time taken to unload the model from the instance. |
aws_sagemaker_overhead_latency | OverheadLatency | Tracks additional latency incurred due to overheads during the invocation process. |
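
Names such as aws_sagemaker_invocation4_xxerrors follow a regular pattern that can be read off the rows in these tables: the namespace is lowercased with "/" replaced by "_", and the CloudWatch metric name is split wherever a lowercase letter or digit is followed by an uppercase letter. The sketch below reproduces that pattern for the rows shown on this page. It is inferred from the tables only, is not the exporter's actual implementation, and the function name is hypothetical.

```python
import re


def prometheus_metric_name(namespace: str, cloudwatch_metric: str) -> str:
    """Approximate the metric naming pattern visible in the tables (illustrative only)."""
    # Namespace: lowercase, "/" becomes "_" (AWS/SageMaker/Endpoints -> aws_sagemaker_endpoints).
    prefix = namespace.lower().replace("/", "_")
    # Insert "_" where a lowercase letter or digit is directly followed by an uppercase letter.
    # Runs of capitals are not split, which is why CPUUtilization becomes cpuutilization.
    split = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", cloudwatch_metric)
    # Replace remaining separators such as " ", "." and "-" with "_", then lowercase.
    suffix = re.sub(r"[^A-Za-z0-9_]", "_", split).lower()
    return f"{prefix}_{suffix}"


if __name__ == "__main__":
    print(prometheus_metric_name("AWS/SageMaker", "Invocation4XXErrors"))
    print(prometheus_metric_name("AWS/SES", "Reputation.BounceRate"))
```

This can help when looking up the exported name for a CloudWatch metric you already know, but the tables remain the authoritative list.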
AWS/SageMaker/Endpoints
Function: Provides real-time and batch inference capabilities for deployed machine learning models
Scrape interval: 5 minutes
Metric | Cloudwatch metric | Purpose |
---|---|---|
aws_sagemaker_endpoints_info | ||
aws_sagemaker_endpoints_cpureservation | CPUReservation | Tracks the amount of reserved CPU resources for SageMaker endpoints. |
aws_sagemaker_endpoints_cpuutilization | CPUUtilization | Monitors the actual CPU utilization by the SageMaker endpoint. |
aws_sagemaker_endpoints_cpuutilization_normalized | CPUUtilizationNormalized | Measures normalized CPU utilization based on instance type and capacity. |
aws_sagemaker_endpoints_disk_utilization | DiskUtilization | Tracks the disk space utilization for SageMaker endpoints. |
aws_sagemaker_endpoints_gpumemory_utilization | GPUMemoryUtilization | Monitors the actual GPU memory utilization for endpoints using GPU instances. |
aws_sagemaker_endpoints_gpumemory_utilization_normalized | GPUMemoryUtilizationNormalized | Measures normalized GPU memory utilization. |
aws_sagemaker_endpoints_gpureservation | GPUReservation | Tracks the amount of reserved GPU resources for endpoints using GPU instances. |
aws_sagemaker_endpoints_gpuutilization | GPUUtilization | Monitors the actual GPU utilization by the SageMaker endpoint. |
aws_sagemaker_endpoints_gpuutilization_normalized | GPUUtilizationNormalized | Measures normalized GPU utilization. |
aws_sagemaker_endpoints_loaded_model_count | LoadedModelCount | Tracks the number of models currently loaded on the SageMaker endpoint. |
aws_sagemaker_endpoints_memory_reservation | MemoryReservation | Tracks the amount of reserved memory for the SageMaker endpoint. |
aws_sagemaker_endpoints_memory_utilization | MemoryUtilization | Monitors the actual memory utilization by the SageMaker endpoint. |
AWS/SageMaker/InferenceRecommendationsJobs
Function: Offers guidance on optimizing inference workloads for ML models
Scrape interval: 5 minutes
Metric | Cloudwatch metric | Purpose |
---|---|---|
aws_sagemaker_inferencerecommendationsjobs_info | ||
aws_sagemaker_inferencerecommendationsjobs_client_invocation_errors | ClientInvocationErrors | Tracks the number of errors encountered during client invocations for inference recommendations. |
aws_sagemaker_inferencerecommendationsjobs_client_invocations | ClientInvocations | Monitors the number of client invocations of the inference recommendations job. |
aws_sagemaker_inferencerecommendationsjobs_client_latency | ClientLatency | Measures the latency of client invocations during the inference recommendations job. |
aws_sagemaker_inferencerecommendationsjobs_number_of_users | NumberOfUsers | Tracks the number of users interacting with the inference recommendations job. |
AWS/SageMaker/ModelBuildingPipeline
Function: Managed pipelines to automate model training and deployment processes
Scrape interval: 5 minutes
Metric | Cloudwatch metric | Purpose |
---|---|---|
aws_sagemaker_modelbuildingpipeline_info | ||
aws_sagemaker_modelbuildingpipeline_execution_duration | ExecutionDuration | Tracks the duration of pipeline executions. |
aws_sagemaker_modelbuildingpipeline_execution_failed | ExecutionFailed | Monitors the number of failed pipeline executions. |
aws_sagemaker_modelbuildingpipeline_execution_started | ExecutionStarted | Counts the number of started pipeline executions. |
aws_sagemaker_modelbuildingpipeline_execution_stopped | ExecutionStopped | Tracks pipeline executions that were stopped. |
aws_sagemaker_modelbuildingpipeline_execution_succeeded | ExecutionSucceeded | Monitors the number of successfully completed pipeline executions. |
aws_sagemaker_modelbuildingpipeline_step_duration | StepDuration | Tracks the duration of individual steps within the pipeline. |
aws_sagemaker_modelbuildingpipeline_step_failed | StepFailed | Monitors the number of failed steps within the pipeline. |
aws_sagemaker_modelbuildingpipeline_step_started | StepStarted | Counts the number of steps started in the pipeline. |
aws_sagemaker_modelbuildingpipeline_step_stopped | StepStopped | Tracks the steps that were stopped within the pipeline. |
aws_sagemaker_modelbuildingpipeline_step_succeeded | StepSucceeded | Monitors the number of successfully completed steps within the pipeline. |
AWS/SageMaker/ProcessingJobs
Function: Managed service for processing and transforming data at scale for machine learning
Scrape interval: 5 minutes
Metric | Cloudwatch metric | Purpose |
---|---|---|
aws_sagemaker_processingjobs_info | ||
aws_sagemaker_processingjobs_cpureservation | CPUReservation | Monitors the amount of CPU resources reserved for processing jobs. |
aws_sagemaker_processingjobs_cpuutilization | CPUUtilization | Tracks the utilization of CPU resources during processing jobs. |
aws_sagemaker_processingjobs_cpuutilization_normalized | CPUUtilizationNormalized | Provides normalized CPU utilization for easier comparison across different instance types. |
aws_sagemaker_processingjobs_disk_utilization | DiskUtilization | Monitors the disk utilization during the processing jobs. |
aws_sagemaker_processingjobs_gpumemory_utilization | GPUMemoryUtilization | Tracks GPU memory usage during processing jobs. |
aws_sagemaker_processingjobs_gpumemory_utilization_normalized | GPUMemoryUtilizationNormalized | Provides normalized GPU memory utilization for comparison across different instances. |
aws_sagemaker_processingjobs_gpureservation | GPUReservation | Monitors the amount of GPU resources reserved for processing jobs. |
aws_sagemaker_processingjobs_gpuutilization | GPUUtilization | Tracks the utilization of GPU resources during processing jobs. |
aws_sagemaker_processingjobs_gpuutilization_normalized | GPUUtilizationNormalized | Provides normalized GPU utilization for easier cross-instance comparison. |
aws_sagemaker_processingjobs_memory_reservation | MemoryReservation | Tracks memory resources reserved for processing jobs. |
aws_sagemaker_processingjobs_memory_utilization | MemoryUtilization | Monitors the utilization of memory resources during processing jobs. |
AWS/SageMaker/TrainingJobs
Function: Managed service for training ML models on large datasets
Scrape interval: 5 minutes
Metric | Cloudwatch metric | Purpose |
---|---|---|
aws_sagemaker_trainingjobs_info | ||
aws_sagemaker_trainingjobs_cpureservation | CPUReservation | Tracks the amount of CPU resources reserved for training jobs. |
aws_sagemaker_trainingjobs_cpuutilization | CPUUtilization | Monitors the CPU utilization during training jobs. |
aws_sagemaker_trainingjobs_cpuutilization_normalized | CPUUtilizationNormalized | Provides normalized CPU utilization across different instance types. |
aws_sagemaker_trainingjobs_disk_utilization | DiskUtilization | Monitors the disk utilization during training jobs. |
aws_sagemaker_trainingjobs_gpumemory_utilization | GPUMemoryUtilization | Tracks GPU memory utilization during training jobs. |
aws_sagemaker_trainingjobs_gpumemory_utilization_normalized | GPUMemoryUtilizationNormalized | Provides normalized GPU memory utilization for comparison across different instances. |
aws_sagemaker_trainingjobs_gpureservation | GPUReservation | Tracks the amount of GPU resources reserved for training jobs. |
aws_sagemaker_trainingjobs_gpuutilization | GPUUtilization | Monitors GPU utilization during training jobs. |
aws_sagemaker_trainingjobs_gpuutilization_normalized | GPUUtilizationNormalized | Provides normalized GPU utilization across different instances. |
aws_sagemaker_trainingjobs_memory_reservation | MemoryReservation | Monitors the amount of memory reserved for training jobs. |
aws_sagemaker_trainingjobs_memory_utilization | MemoryUtilization | Tracks the memory usage during training jobs. |
AWS/SageMaker/TransformJobs
Function: Enables large-scale, batch ML model inferences for data transformations
Scrape interval: 5 minutes
Metric | Cloudwatch metric | Purpose |
---|---|---|
aws_sagemaker_transformjobs_info | ||
aws_sagemaker_transformjobs_cpureservation | CPUReservation | Tracks the CPU resources reserved for transform jobs. |
aws_sagemaker_transformjobs_cpuutilization | CPUUtilization | Monitors the CPU utilization during transform jobs. |
aws_sagemaker_transformjobs_cpuutilization_normalized | CPUUtilizationNormalized | Provides normalized CPU utilization across different instance types during transform jobs. |
aws_sagemaker_transformjobs_disk_utilization | DiskUtilization | Monitors disk utilization during transform jobs. |
aws_sagemaker_transformjobs_gpumemory_utilization | GPUMemoryUtilization | Tracks GPU memory utilization during transform jobs. |
aws_sagemaker_transformjobs_gpumemory_utilization_normalized | GPUMemoryUtilizationNormalized | Provides normalized GPU memory utilization for comparison across different instances during transform jobs. |
aws_sagemaker_transformjobs_gpureservation | GPUReservation | Tracks the GPU resources reserved for transform jobs. |
aws_sagemaker_transformjobs_gpuutilization | GPUUtilization | Monitors GPU utilization during transform jobs. |
aws_sagemaker_transformjobs_gpuutilization_normalized | GPUUtilizationNormalized | Provides normalized GPU utilization across different instances during transform jobs. |
aws_sagemaker_transformjobs_memory_reservation | MemoryReservation | Monitors memory resources reserved for transform jobs. |
aws_sagemaker_transformjobs_memory_utilization | MemoryUtilization | Tracks memory usage during transform jobs. |
AWS/Scheduler
Function: Managed service to trigger events or workflows at a scheduled time
Scrape interval: 5 minutes
Metric | Cloudwatch metric | Purpose |
---|---|---|
aws_scheduler_invocation_attempt_count | InvocationAttemptCount | Tracks the number of attempts made for invocations. |
aws_scheduler_invocation_dropped_count | InvocationDroppedCount | Monitors the count of invocations that were dropped. |
aws_scheduler_invocation_throttle_count | InvocationThrottleCount | Counts the number of invocations that were throttled due to exceeding limits. |
aws_scheduler_invocations_failed_to_be_sent_to_dead_letter_count | InvocationsFailedToBeSentToDeadLetterCount | Tracks the number of invocations that failed to be sent to the dead letter queue. |
aws_scheduler_invocations_sent_to_dead_letter_count | InvocationsSentToDeadLetterCount | Counts the number of invocations successfully sent to the dead letter queue. |
aws_scheduler_invocations_sent_to_dead_letter_count_truncated_message_size_exceeded | InvocationsSentToDeadLetterCount_Truncated_MessageSizeExceeded | Monitors the number of invocations sent to the dead letter queue due to exceeding message size. |
aws_scheduler_target_error_count | TargetErrorCount | Tracks the count of errors encountered by the target. |
aws_scheduler_target_error_throttled_count | TargetErrorThrottledCount | Counts the number of target errors caused by throttling. |
AWS/States
Function: AWS Step Functions for orchestrating workflows and coordinating services
Scrape interval: 5 minutes
Metric | Cloudwatch metric | Purpose |
---|---|---|
aws_states_info | ||
aws_states_activities_failed | ActivitiesFailed | Tracks the number of failed activities. |
aws_states_activities_heartbeat_timed_out | ActivitiesHeartbeatTimedOut | Monitors activities whose heartbeat timed out. |
aws_states_activities_scheduled | ActivitiesScheduled | Tracks the number of activities that have been scheduled. |
aws_states_activities_started | ActivitiesStarted | Measures the number of activities that have started. |
aws_states_activities_succeeded | ActivitiesSucceeded | Tracks successful activities. |
aws_states_activities_timed_out | ActivitiesTimedOut | Tracks the number of activities that timed out. |
aws_states_activity_run_time | ActivityRunTime | Monitors the runtime of activities. |
aws_states_activity_schedule_time | ActivityScheduleTime | Tracks how long activities remain in the scheduled state before they start. |
aws_states_activity_time | ActivityTime | Tracks the total time taken by an activity. |
aws_states_consumed_capacity | ConsumedCapacity | Measures the consumed capacity for Step Functions. |
aws_states_execution_throttled | ExecutionThrottled | Monitors throttled execution attempts. |
aws_states_execution_time | ExecutionTime | Tracks the total time taken by an execution. |
aws_states_executions_aborted | ExecutionsAborted | Tracks the number of executions that were aborted. |
aws_states_executions_failed | ExecutionsFailed | Measures the number of failed executions. |
aws_states_executions_started | ExecutionsStarted | Tracks the number of executions that started. |
aws_states_executions_succeeded | ExecutionsSucceeded | Tracks successful executions. |
aws_states_executions_timed_out | ExecutionsTimedOut | Monitors executions that timed out. |
aws_states_express_execution_billed_duration | ExpressExecutionBilledDuration | Measures the billed duration for Express Workflows. |
aws_states_express_execution_billed_memory | ExpressExecutionBilledMemory | Measures the billed memory for Express Workflows. |
aws_states_express_execution_memory | ExpressExecutionMemory | Monitors the memory consumed by Express Workflows. |
aws_states_lambda_function_run_time | LambdaFunctionRunTime | Measures the runtime of Lambda functions. |
aws_states_lambda_function_schedule_time | LambdaFunctionScheduleTime | Tracks how long Lambda functions remain in the scheduled state before they start. |
aws_states_lambda_function_time | LambdaFunctionTime | Tracks the total time taken by Lambda functions. |
aws_states_lambda_functions_failed | LambdaFunctionsFailed | Monitors Lambda functions that failed. |
aws_states_lambda_functions_scheduled | LambdaFunctionsScheduled | Tracks the number of Lambda functions that were scheduled. |
aws_states_lambda_functions_started | LambdaFunctionsStarted | Tracks Lambda functions that have started. |
aws_states_lambda_functions_succeeded | LambdaFunctionsSucceeded | Measures successful Lambda function executions. |
aws_states_lambda_functions_timed_out | LambdaFunctionsTimedOut | Monitors Lambda functions that timed out. |
aws_states_provisioned_bucket_size | ProvisionedBucketSize | Tracks the provisioned bucket size for Step Functions. |
aws_states_provisioned_refill_rate | ProvisionedRefillRate | Measures the rate at which provisioned capacity is refilled. |
aws_states_service_integration_run_time | ServiceIntegrationRunTime | Measures the runtime of service integrations. |
aws_states_service_integration_schedule_time | ServiceIntegrationScheduleTime | Tracks how long service integrations remain in the scheduled state before they start. |
aws_states_service_integration_time | ServiceIntegrationTime | Monitors the total time taken by service integrations. |
aws_states_service_integrations_failed | ServiceIntegrationsFailed | Tracks failed service integrations. |
aws_states_service_integrations_scheduled | ServiceIntegrationsScheduled | Measures the number of service integrations that were scheduled. |
aws_states_service_integrations_started | ServiceIntegrationsStarted | Tracks service integrations that have started. |
aws_states_service_integrations_succeeded | ServiceIntegrationsSucceeded | Monitors successful service integrations. |
aws_states_service_integrations_timed_out | ServiceIntegrationsTimedOut | Measures service integrations that timed out. |
aws_states_throttled_events | ThrottledEvents | Tracks the number of events that were throttled. |
AWS/StorageGateway
Function: Hybrid cloud storage service connecting on-premises software appliances to AWS
Scrape interval: 5 minutes
Metric | Cloudwatch metric | Purpose |
---|---|---|
aws_storagegateway_info | ||
aws_storagegateway_cache_free | CacheFree | Tracks the amount of free cache space in the gateway. |
aws_storagegateway_cache_hit_percent | CacheHitPercent | Monitors the percentage of read operations served by the cache. |
aws_storagegateway_cache_percent_dirty | CachePercentDirty | Measures the percentage of cache space that contains data that hasn’t been uploaded yet. |
aws_storagegateway_cache_percent_used | CachePercentUsed | Tracks the percentage of used cache space. |
aws_storagegateway_cache_used | CacheUsed | Measures the amount of cache space used. |
aws_storagegateway_cloud_bytes_downloaded | CloudBytesDownloaded | Tracks the amount of data downloaded from AWS to the gateway. |
aws_storagegateway_cloud_bytes_uploaded | CloudBytesUploaded | Measures the amount of data uploaded from the gateway to AWS. |
aws_storagegateway_cloud_download_latency | CloudDownloadLatency | Tracks the latency experienced during downloads from AWS. |
aws_storagegateway_queued_writes | QueuedWrites | Monitors the number of write operations queued in the gateway. |
aws_storagegateway_read_bytes | ReadBytes | Tracks the amount of data read by the gateway. |
aws_storagegateway_read_time | ReadTime | Measures the time spent on read operations. |
aws_storagegateway_time_since_last_recovery_point | TimeSinceLastRecoveryPoint | Tracks the time since the last recovery point was created. |
aws_storagegateway_total_cache_size | TotalCacheSize | Measures the total size of the cache. |
aws_storagegateway_upload_buffer_free | UploadBufferFree | Tracks the amount of free space in the upload buffer. |
aws_storagegateway_upload_buffer_percent_used | UploadBufferPercentUsed | Measures the percentage of the upload buffer that is used. |
aws_storagegateway_upload_buffer_used | UploadBufferUsed | Monitors the amount of upload buffer space used. |
aws_storagegateway_working_storage_free | WorkingStorageFree | Measures the amount of free working storage in the gateway. |
aws_storagegateway_working_storage_percent_used | WorkingStoragePercentUsed | Tracks the percentage of working storage used. |
aws_storagegateway_working_storage_used | WorkingStorageUsed | Monitors the amount of working storage used. |
aws_storagegateway_write_bytes | WriteBytes | Tracks the amount of data written by the gateway. |
aws_storagegateway_write_time | WriteTime | Tracks the time spent on write operations. |
AWS/Timestream
Function: Managed time series database for IoT and operational applications
Scrape interval: 5 minutes
Metric | Cloudwatch metric | Purpose |
---|---|---|
aws_timestream_data_scanned_bytes | DataScannedBytes | Tracks the total amount of data scanned by Amazon Timestream during queries. |
aws_timestream_successful_request_latency | SuccessfulRequestLatency | Measures the latency of successful requests sent to Amazon Timestream. |
aws_timestream_system_errors | SystemErrors | Monitors the number of system errors occurring in Amazon Timestream. |
aws_timestream_user_errors | UserErrors | Tracks the number of user-generated errors in Amazon Timestream, such as invalid queries. |
AWS/TransitGateway
Function: Service for connecting VPCs and on-premises networks through a central hub
Scrape interval: 5 minutes
Metric | Cloudwatch metric | Purpose |
---|---|---|
aws_transitgateway_info | ||
aws_transitgateway_bytes_in | BytesIn | Tracks the total number of bytes received by the Transit Gateway. |
aws_transitgateway_bytes_out | BytesOut | Measures the total number of bytes sent from the Transit Gateway. |
aws_transitgateway_packet_drop_count_blackhole | PacketDropCountBlackhole | Monitors the number of packets dropped because they matched a blackhole route. |
aws_transitgateway_packet_drop_count_no_route | PacketDropCountNoRoute | Tracks the number of packets dropped due to no matching route found. |
aws_transitgateway_packets_in | PacketsIn | Measures the total number of packets received by the Transit Gateway. |
aws_transitgateway_packets_out | PacketsOut | Tracks the total number of packets sent from the Transit Gateway. |
AWS/TrustedAdvisor
Function: Provides real-time recommendations to improve AWS resource optimization and security. This service publishes metrics only in specific AWS regions. Any job configured with this service gathers data only from the us-east-1 region.
Scrape interval: 5 minutes
Metric | Cloudwatch metric | Purpose |
---|---|---|
aws_trustedadvisor_green_checks | GreenChecks | Tracks the number of Trusted Advisor checks in the green (optimal) status. |
aws_trustedadvisor_red_checks | RedChecks | Measures the number of Trusted Advisor checks that indicate critical issues (red status). |
aws_trustedadvisor_red_resources | RedResources | Tracks the number of resources flagged as critical or failing (red status). |
aws_trustedadvisor_service_limit_usage | ServiceLimitUsage | Monitors the usage of service limits based on Trusted Advisor service limit checks. |
aws_trustedadvisor_yellow_checks | YellowChecks | Measures the number of checks that show warnings (yellow status). |
aws_trustedadvisor_yellow_resources | YellowResources | Tracks the number of resources flagged as warnings or requiring attention (yellow status). |
AWS/Usage
Function: Tracks AWS service usage for cost monitoring and optimization
Scrape interval: 5 minutes
Metric | Cloudwatch metric | Purpose |
---|---|---|
aws_usage_call_count | CallCount | Tracks the number of API or service calls made. |
aws_usage_resource_count | ResourceCount | Measures the number of resources in use or allocated in the AWS environment. |
AWS/VPN
Function: Managed VPN service to securely connect on-premises networks to AWS
Scrape interval: 5 minutes
Includes: Out-of-the-box dashboard
Metric | Cloudwatch metric | Purpose |
---|---|---|
aws_vpn_info | ||
aws_vpn_tunnel_data_in | TunnelDataIn | Monitors the amount of inbound data being transferred through the VPN tunnel. Helps track network traffic. |
aws_vpn_tunnel_data_out | TunnelDataOut | Tracks the amount of outbound data being transferred through the VPN tunnel. Useful for bandwidth monitoring. |
aws_vpn_tunnel_state | TunnelState | Monitors the current status of the VPN tunnel (e.g., up or down). Helps in identifying tunnel connectivity issues. |
AWS/WAFV2
Function: Web application firewall to protect applications from common web exploits
Scrape interval: 5 minutes
Metric | Cloudwatch metric | Purpose |
---|---|---|
aws_wafv2_info | ||
aws_wafv2_allowed_requests | AllowedRequests | Tracks the number of requests that are allowed by the WAF rules. Useful for monitoring legitimate traffic. |
aws_wafv2_blocked_requests | BlockedRequests | Monitors the number of requests that are blocked by the WAF rules. Helps detect and prevent malicious traffic. |
aws_wafv2_captcha_requests | CaptchaRequests | Tracks the number of requests that triggered a CAPTCHA challenge. Useful for tracking potential bot traffic. |
aws_wafv2_captchas_attempted | CaptchasAttempted | Monitors the number of CAPTCHA challenges that were attempted by users. Indicates user engagement with challenges. |
aws_wafv2_captchas_solved | CaptchasSolved | Tracks the number of CAPTCHA challenges successfully solved. Helps assess CAPTCHA effectiveness. |
aws_wafv2_challenge_requests | ChallengeRequests | Monitors the number of requests that triggered additional security challenges. Useful for advanced threat detection. |
aws_wafv2_counted_requests | CountedRequests | Tracks the number of requests counted for rule evaluation but not necessarily blocked or allowed. |
aws_wafv2_passed_requests | PassedRequests | Monitors requests that passed through the challenge phase and were allowed access. |
aws_wafv2_requests_with_valid_captcha_token | RequestsWithValidCaptchaToken | Tracks the number of requests with a valid CAPTCHA token. Useful for validating CAPTCHA implementation. |
aws_wafv2_requests_with_valid_challenge_token | RequestsWithValidChallengeToken | Monitors the number of requests with valid security challenge tokens. Helps track successful security checks. |
AWS/WorkSpaces
Function: Managed desktop virtualization service for delivering cloud-based desktops
Scrape interval: 5 minutes
Metric | Cloudwatch metric | Purpose |
---|---|---|
aws_workspaces_info | ||
aws_workspaces_available | Available | Monitors the number of available WorkSpaces. Useful for tracking the availability of WorkSpaces for users. |
aws_workspaces_connection_attempt | ConnectionAttempt | Tracks the number of connection attempts to WorkSpaces. Helps monitor user access and demand. |
aws_workspaces_connection_failure | ConnectionFailure | Monitors the number of failed connection attempts. Useful for identifying connectivity issues or failures. |
aws_workspaces_connection_success | ConnectionSuccess | Tracks the number of successful connections to WorkSpaces. Indicates the success rate of user connections. |
aws_workspaces_in_session_latency | InSessionLatency | Monitors the latency experienced by users during WorkSpaces sessions. Helps assess user experience quality. |
aws_workspaces_maintenance | Maintenance | Tracks the number of WorkSpaces under maintenance. Useful for understanding maintenance impact on availability. |
aws_workspaces_session_disconnect | SessionDisconnect | Monitors the number of session disconnections. Helps detect connectivity issues or user-initiated disconnects. |
aws_workspaces_session_launch_time | SessionLaunchTime | Tracks the time taken to launch a WorkSpaces session. Useful for assessing the performance of WorkSpaces launches. |
aws_workspaces_stopped | Stopped | Monitors the number of WorkSpaces that are in the stopped state. Helps track WorkSpaces that are not running. |
aws_workspaces_unhealthy | Unhealthy | Tracks the number of unhealthy WorkSpaces. Useful for identifying potential issues with WorkSpaces health. |
aws_workspaces_user_connected | UserConnected | Monitors the number of users currently connected to WorkSpaces. Helps measure active user engagement. |
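
The connection metrics above are often easier to read as a ratio than as raw counts. The following is a minimal sketch, assuming you have already retrieved the Sum statistic of ConnectionAttempt and ConnectionFailure for the same time window. The function name and the sample values are hypothetical and not part of any product API.

```python
from typing import Optional


def connection_failure_ratio(attempts: float, failures: float) -> Optional[float]:
    """Return the share of failed connection attempts in one time window.

    `attempts` and `failures` are assumed to be the Sum of ConnectionAttempt
    and ConnectionFailure over the same window (hypothetical inputs).
    """
    # With no attempts the ratio is undefined, so report None instead of 0.
    if attempts <= 0:
        return None
    return failures / attempts


# Hypothetical sample: 40 attempts, 2 of which failed.
ratio = connection_failure_ratio(attempts=40, failures=2)
```

Returning `None` for an empty window keeps "no traffic" distinguishable from "no failures" when the result is used in alerting.
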
AmazonMWAA
Function: Managed service for Apache Airflow workflows in the cloud
Scrape interval: 5 minutes
Metric | Cloudwatch metric | Purpose |
---|---|---|
aws_amazonmwaa_info | ||
aws_amazonmwaa_collect_dbdags | CollectDBDags | Monitors how often database DAGs are collected. |
aws_amazonmwaa_critical_section_busy | CriticalSectionBusy | Counts how often a scheduler process found the critical section locked by another process. |
aws_amazonmwaa_critical_section_duration | CriticalSectionDuration | Measures the duration for which critical sections remain busy. |
aws_amazonmwaa_critical_section_query_duration | CriticalSectionQueryDuration | Monitors the time spent querying within critical sections. |
aws_amazonmwaa_dagdependency_check | DAGDependencyCheck | Monitors dependency checks between DAGs. |
aws_amazonmwaa_dagduration_failed | DAGDurationFailed | Tracks the duration of failed DAG runs. |
aws_amazonmwaa_dagduration_success | DAGDurationSuccess | Tracks the duration of successful DAG runs. |
aws_amazonmwaa_dagfile_processing_last_duration | DAGFileProcessingLastDuration | Measures the last processing time for DAG files. |
aws_amazonmwaa_dagfile_processing_last_run_seconds_ago | DAGFileProcessingLastRunSecondsAgo | Tracks the time since the last DAG file processing run. |
aws_amazonmwaa_dagfile_refresh_error | DAGFileRefreshError | Monitors errors in refreshing DAG files. |
aws_amazonmwaa_dagschedule_delay | DAGScheduleDelay | Monitors delays in DAG scheduling. |
aws_amazonmwaa_dag_bag_size | DagBagSize | Tracks the size of the DAG bag. |
aws_amazonmwaa_dag_callback_exceptions | DagCallbackExceptions | Monitors exceptions occurring in DAG callbacks. |
aws_amazonmwaa_exception_failures | ExceptionFailures | Tracks the number of exception failures. |
aws_amazonmwaa_executed_tasks | ExecutedTasks | Tracks the total number of executed tasks. |
aws_amazonmwaa_failed_celery_task_execution | FailedCeleryTaskExecution | Monitors failed task executions in Celery. |
aws_amazonmwaa_failed_slacallback | FailedSLACallback | Tracks failures in SLA callbacks. |
aws_amazonmwaa_failed_slaemail_attempts | FailedSLAEmailAttempts | Monitors failed attempts to send SLA emails. |
aws_amazonmwaa_file_path_queue_update_count | FilePathQueueUpdateCount | Tracks the number of file path queue updates. |
aws_amazonmwaa_first_task_scheduling_delay | FirstTaskSchedulingDelay | Measures the delay in scheduling the first task. |
aws_amazonmwaa_import_errors | ImportErrors | Monitors errors encountered during imports. |
aws_amazonmwaa_infra_failures | InfraFailures | Tracks infrastructure failures in the environment. |
aws_amazonmwaa_job_end | JobEnd | Monitors the number of jobs completed. |
aws_amazonmwaa_job_heartbeat_failure | JobHeartbeatFailure | Tracks heartbeat failures for jobs. |
aws_amazonmwaa_job_start | JobStart | Monitors the number of jobs started. |
aws_amazonmwaa_loaded_tasks | LoadedTasks | Tracks the number of tasks loaded in the environment. |
aws_amazonmwaa_manager_stalls | ManagerStalls | Monitors the number of times the manager process stalls. |
aws_amazonmwaa_open_slots | OpenSlots | Tracks the number of open task slots. |
aws_amazonmwaa_operator_failures | OperatorFailures | Tracks the number of operator task failures. |
aws_amazonmwaa_operator_successes | OperatorSuccesses | Tracks the number of operator task successes. |
aws_amazonmwaa_orphaned | Orphaned | Monitors orphaned task instances. |
aws_amazonmwaa_orphaned_tasks_adopted | OrphanedTasksAdopted | Tracks the number of orphaned tasks adopted. |
aws_amazonmwaa_orphaned_tasks_cleared | OrphanedTasksCleared | Tracks the number of orphaned tasks cleared. |
aws_amazonmwaa_other_callback_count | OtherCallbackCount | Tracks the number of other callbacks occurring in the environment. |
aws_amazonmwaa_poked_exceptions | PokedExceptions | Monitors the number of exceptions in poked tasks. |
aws_amazonmwaa_poked_success | PokedSuccess | Tracks successful pokes in tasks. |
aws_amazonmwaa_poked_tasks | PokedTasks | Tracks the number of poked tasks. |
aws_amazonmwaa_pool_deferred_slots | PoolDeferredSlots | Tracks deferred slots in task pools. |
aws_amazonmwaa_pool_failures | PoolFailures | Monitors the number of task pool failures. |
aws_amazonmwaa_pool_open_slots | PoolOpenSlots | Tracks the number of open slots in the task pool. |
aws_amazonmwaa_pool_queued_slots | PoolQueuedSlots | Tracks the number of queued slots in the task pool. |
aws_amazonmwaa_pool_running_slots | PoolRunningSlots | Tracks the number of running slots in the task pool. |
aws_amazonmwaa_pool_starving_tasks | PoolStarvingTasks | Tracks tasks that are starving for resources in the task pool. |
aws_amazonmwaa_processes | Processes | Tracks the number of processes running in the environment. |
aws_amazonmwaa_processor_timeouts | ProcessorTimeouts | Monitors timeouts in processors. |
aws_amazonmwaa_queued_tasks | QueuedTasks | Tracks the number of tasks in the queue. |
aws_amazonmwaa_running_tasks | RunningTasks | Tracks the number of running tasks in the environment. |
aws_amazonmwaa_slamissed | SLAMissed | Tracks the number of SLA misses in tasks. |
aws_amazonmwaa_scheduler_heartbeat | SchedulerHeartbeat | Monitors the health of the scheduler through its heartbeat. |
aws_amazonmwaa_scheduler_loop_duration | SchedulerLoopDuration | Measures the duration of scheduler loops. |
aws_amazonmwaa_sla_callback_count | SlaCallbackCount | Tracks the number of SLA callbacks made. |
aws_amazonmwaa_started_task_instances | StartedTaskInstances | Monitors the number of started task instances. |
aws_amazonmwaa_task_instance_created_using_operator | TaskInstanceCreatedUsingOperator | Tracks the number of task instances created using an operator. |
aws_amazonmwaa_task_instance_duration | TaskInstanceDuration | Monitors the duration of task instances. |
aws_amazonmwaa_task_instance_failures | TaskInstanceFailures | Tracks the number of task instance failures. |
aws_amazonmwaa_task_instance_finished | TaskInstanceFinished | Monitors the number of task instances that have finished. |
aws_amazonmwaa_task_instance_previously_succeeded | TaskInstancePreviouslySucceeded | Tracks the number of task instances that have previously succeeded. |
aws_amazonmwaa_task_instance_queued_duration | TaskInstanceQueuedDuration | Measures the time task instances spend in the queue before execution. |
aws_amazonmwaa_task_instance_scheduled_duration | TaskInstanceScheduledDuration | Tracks the time task instances spend in the scheduled state. |
aws_amazonmwaa_task_instance_successes | TaskInstanceSuccesses | Tracks the number of successful task instances. |
aws_amazonmwaa_task_removed_from_dag | TaskRemovedFromDAG | Monitors tasks that were removed from the DAG. |
aws_amazonmwaa_task_restored_to_dag | TaskRestoredToDAG | Tracks tasks that were restored to the DAG. |
aws_amazonmwaa_task_timeout_error | TaskTimeoutError | Monitors timeout errors in tasks. |
aws_amazonmwaa_tasks_executable | TasksExecutable | Tracks the number of executable tasks. |
aws_amazonmwaa_tasks_killed_externally | TasksKilledExternally | Tracks tasks that were killed externally. |
aws_amazonmwaa_tasks_pending | TasksPending | Monitors pending tasks. |
aws_amazonmwaa_tasks_running | TasksRunning | Tracks the number of tasks currently running. |
aws_amazonmwaa_tasks_starving | TasksStarving | Tracks the number of tasks starving for resources. |
aws_amazonmwaa_tasks_without_dag_run | TasksWithoutDagRun | Tracks tasks that are not associated with any DAG run. |
aws_amazonmwaa_total_parse_time | TotalParseTime | Measures the total time spent parsing DAG files. |
aws_amazonmwaa_trigger_heartbeat | TriggerHeartbeat | Tracks the heartbeat of task triggers. |
aws_amazonmwaa_triggered_dag_runs | TriggeredDagRuns | Monitors the number of DAG runs triggered. |
aws_amazonmwaa_triggers_blocked_main_thread | TriggersBlockedMainThread | Tracks the number of triggers that block the main thread. |
aws_amazonmwaa_triggers_failed | TriggersFailed | Monitors failed task triggers. |
aws_amazonmwaa_triggers_running | TriggersRunning | Tracks the number of running task triggers. |
aws_amazonmwaa_triggers_succeeded | TriggersSucceeded | Monitors successful task triggers. |
aws_amazonmwaa_updates | Updates | Tracks the number of updates made to DAGs and other configurations. |
aws_amazonmwaa_zombies_killed | ZombiesKilled | Monitors the number of zombie tasks killed in the environment. |
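
The metric names in the first column appear to follow a regular pattern derived from the CloudWatch namespace and metric name, which explains entries such as `dagduration_failed` and `slamissed` where runs of capitals are not split. The following is a minimal sketch of that pattern, inferred only from the tables on this page. The function name is hypothetical, and this is not necessarily how the exporter implements the conversion.

```python
import re


def exported_metric_name(namespace: str, cloudwatch_metric: str) -> str:
    """Guess the exported name for a CloudWatch namespace and metric name.

    Inferred from the tables on this page; treat it as an assumption.
    """
    # Lowercase the namespace and turn the "/" separator into "_".
    prefix = namespace.lower().replace("/", "_")
    # Namespaces such as AmazonMWAA or ECS/ContainerInsights have no AWS/
    # prefix, yet their exported names still start with "aws_".
    if not prefix.startswith("aws_"):
        prefix = "aws_" + prefix
    # Insert "_" only where a lowercase letter or digit meets an uppercase
    # letter, so runs of capitals (DAG, SLA, EBS) stay attached to the next word.
    snake = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", cloudwatch_metric).lower()
    return f"{prefix}_{snake}"


# Names checked against rows in the tables on this page.
mwaa_name = exported_metric_name("AmazonMWAA", "DAGDurationFailed")
vpn_name = exported_metric_name("AWS/VPN", "TunnelDataIn")
```

A helper like this can be handy for predicting the name to query when you know only the CloudWatch metric, but always confirm the result against the table for the service.
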
ECS/ContainerInsights
Function: Provides monitoring and insights for ECS clusters, tasks, and containers
Scrape interval: 5 minutes
Metric | Cloudwatch metric | Purpose |
---|---|---|
aws_ecs_containerinsights_info | ||
aws_ecs_containerinsights_container_instance_count | ContainerInstanceCount | Tracks the number of container instances in a cluster. |
aws_ecs_containerinsights_cpu_reserved | CpuReserved | Monitors the amount of CPU reserved for tasks. |
aws_ecs_containerinsights_cpu_utilized | CpuUtilized | Tracks the CPU utilization of running tasks. |
aws_ecs_containerinsights_deployment_count | DeploymentCount | Measures the number of service deployments. |
aws_ecs_containerinsights_desired_task_count | DesiredTaskCount | Monitors the desired number of running tasks in a service. |
aws_ecs_containerinsights_ebsfilesystem_size | EBSFilesystemSize | Tracks the size of the EBS filesystem attached to the ECS instance. |
aws_ecs_containerinsights_ebsfilesystem_utilized | EBSFilesystemUtilized | Monitors the utilized space in the EBS filesystem. |
aws_ecs_containerinsights_ephemeral_storage_reserved | EphemeralStorageReserved | Measures the amount of reserved ephemeral storage for tasks. |
aws_ecs_containerinsights_ephemeral_storage_utilized | EphemeralStorageUtilized | Tracks the ephemeral storage utilized by tasks. |
aws_ecs_containerinsights_memory_reserved | MemoryReserved | Monitors the amount of memory reserved for tasks in ECS. |
aws_ecs_containerinsights_memory_utilized | MemoryUtilized | Measures the memory utilized by tasks. |
aws_ecs_containerinsights_network_rx_bytes | NetworkRxBytes | Tracks the number of bytes received by the network interfaces on the instance. |
aws_ecs_containerinsights_network_tx_bytes | NetworkTxBytes | Monitors the number of bytes transmitted from the network interfaces on the instance. |
aws_ecs_containerinsights_pending_task_count | PendingTaskCount | Monitors the number of tasks that are in the pending state in the service. |
aws_ecs_containerinsights_running_task_count | RunningTaskCount | Tracks the number of running tasks in the service. |
aws_ecs_containerinsights_service_count | ServiceCount | Monitors the number of services running in the cluster. |
aws_ecs_containerinsights_storage_read_bytes | StorageReadBytes | Tracks the number of bytes read from the storage attached to the ECS instance. |
aws_ecs_containerinsights_storage_write_bytes | StorageWriteBytes | Measures the number of bytes written to storage. |
aws_ecs_containerinsights_task_count | TaskCount | Monitors the total number of tasks running in the ECS cluster. |
aws_ecs_containerinsights_task_set_count | TaskSetCount | Measures the number of task sets in a service. |
aws_ecs_containerinsights_instance_cpu_limit | instance_cpu_limit | Tracks the total CPU limit configured for the instance. |
aws_ecs_containerinsights_instance_cpu_reserved_capacity | instance_cpu_reserved_capacity | Measures the reserved CPU capacity on the instance. |
aws_ecs_containerinsights_instance_cpu_usage_total | instance_cpu_usage_total | Tracks the total CPU usage across all tasks on the instance. |
aws_ecs_containerinsights_instance_cpu_utilization | instance_cpu_utilization | Monitors the percentage of CPU utilization on the ECS instance. |
aws_ecs_containerinsights_instance_filesystem_utilization | instance_filesystem_utilization | Tracks the utilization of the filesystem attached to the ECS instance. |
aws_ecs_containerinsights_instance_memory_limit | instance_memory_limit | Measures the total memory limit configured for the instance. |
aws_ecs_containerinsights_instance_memory_reserved_capacity | instance_memory_reserved_capacity | Tracks the reserved memory capacity on the instance. |
aws_ecs_containerinsights_instance_memory_utilization | instance_memory_utilization | Monitors the percentage of memory utilization on the ECS instance. |
aws_ecs_containerinsights_instance_memory_working_set | instance_memory_working_set | Measures the working set memory on the instance, which is the amount of memory actively used. |
aws_ecs_containerinsights_instance_network_total_bytes | instance_network_total_bytes | Tracks the total number of bytes transferred (both received and transmitted) by the network interfaces. |
aws_ecs_containerinsights_instance_number_of_running_tasks | instance_number_of_running_tasks | Monitors the total number of running tasks on the instance. |
aws_ecs_containerinsights_instance_memory_utliization | instance_memory_utliization | Measures the memory utilization of the instance. |