Configure thresholds using Terraform
Threshold configurations in Knowledge Graph let you define custom thresholds for request, resource, and health assertions, so that you can set specific limits and conditions for monitoring your services and infrastructure.
For information about managing thresholds in the Knowledge Graph UI, refer to Manage thresholds.
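The examples on this page set provider = grafana.asserts, which refers to a Grafana provider configured with the alias asserts. The following is a minimal sketch of how such an alias might be declared. The variable names grafana_url and grafana_auth are hypothetical, and your stack may need additional provider settings that are not shown here.

```hcl
terraform {
  required_providers {
    grafana = {
      source = "grafana/grafana"
    }
  }
}

# Hypothetical input variables; rename them to match your own setup.
variable "grafana_url" {
  type        = string
  description = "URL of the Grafana instance (assumed input)"
}

variable "grafana_auth" {
  type        = string
  sensitive   = true
  description = "Access token used to authenticate (assumed input)"
}

provider "grafana" {
  alias = "asserts" # Referenced as grafana.asserts by the resources on this page
  url   = var.grafana_url
  auth  = var.grafana_auth
  # Assumption: additional settings may be required for your stack;
  # check the provider documentation before applying.
}
```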
Basic threshold configuration
Create a file named thresholds.tf and add the following:
# Basic threshold configuration with all three types
resource "grafana_asserts_thresholds" "basic" {
  provider = grafana.asserts
  request_thresholds = [{
    entity_name     = "payment-service"
    assertion_name  = "ErrorRatioBreach"
    request_type    = "inbound"
    request_context = "/charge"
    value           = 0.01
  }]
  resource_thresholds = [{
    assertion_name = "Saturation"
    resource_type  = "container"
    container_name = "worker"
    source         = "metrics"
    severity       = "warning"
    value          = 75
  }]
  health_thresholds = [{
    assertion_name = "ServiceDown"
    expression     = "up < 1"
    entity_type    = "Service"
  }]
}
Request threshold configurations
Configure thresholds for different service request types and contexts:
# Multiple request thresholds for different services
resource "grafana_asserts_thresholds" "request_thresholds" {
  provider = grafana.asserts
  request_thresholds = [
    {
      entity_name     = "api-service"
      assertion_name  = "ErrorRatioBreach"
      request_type    = "inbound"
      request_context = "/api/v1/users"
      value           = 0.02
    },
    {
      entity_name     = "api-service"
      assertion_name  = "LatencyP99ErrorBuildup"
      request_type    = "inbound"
      request_context = "/api/v1/orders"
      value           = 500
    },
    {
      entity_name     = "payment-gateway"
      assertion_name  = "RequestRateAnomaly"
      request_type    = "outbound"
      request_context = "/payment/process"
      value           = 1000
    }
  ]
}
Resource threshold configurations
Define resource thresholds for different severity levels:
# Resource thresholds for different severity levels
resource "grafana_asserts_thresholds" "resource_thresholds" {
  provider = grafana.asserts
  resource_thresholds = [
    {
      assertion_name = "Saturation"
      resource_type  = "container"
      container_name = "web-server"
      source         = "metrics"
      severity       = "warning"
      value          = 75
    },
    {
      assertion_name = "Saturation"
      resource_type  = "container"
      container_name = "web-server"
      source         = "metrics"
      severity       = "critical"
      value          = 90
    },
    {
      assertion_name = "ResourceRateBreach"
      resource_type  = "Pod"
      container_name = "database"
      source         = "logs"
      severity       = "warning"
      value          = 80
    }
  ]
}
Health threshold configurations
Configure health checks with Prometheus expressions:
# Health thresholds with Prometheus expressions
resource "grafana_asserts_thresholds" "health_thresholds" {
  provider = grafana.asserts
  health_thresholds = [
    {
      assertion_name = "ServiceDown"
      expression     = "up{job=\"api-service\"} < 1"
      entity_type    = "Service"
    },
    {
      assertion_name = "HighMemoryUsage"
      expression     = "memory_usage_percent > 85"
      entity_type    = "Service"
    },
    {
      assertion_name = "DatabaseConnectivity"
      expression     = "db_connection_pool_active / db_connection_pool_max > 0.9"
      entity_type    = "Service"
    }
  ]
}
Comprehensive threshold configuration
Define comprehensive thresholds for production environments:
# Production environment with comprehensive thresholds
resource "grafana_asserts_thresholds" "production" {
  provider = grafana.asserts
  request_thresholds = [
    {
      entity_name     = "frontend"
      assertion_name  = "ErrorRatioBreach"
      request_type    = "inbound"
      request_context = "/"
      value           = 0.005
    },
    {
      entity_name     = "backend-api"
      assertion_name  = "LatencyP99ErrorBuildup"
      request_type    = "inbound"
      request_context = "/api"
      value           = 200
    }
  ]
  resource_thresholds = [
    {
      assertion_name = "Saturation"
      resource_type  = "container"
      container_name = "frontend"
      source         = "metrics"
      severity       = "warning"
      value          = 70
    },
    {
      assertion_name = "Saturation"
      resource_type  = "container"
      container_name = "backend-api"
      source         = "metrics"
      severity       = "critical"
      value          = 85
    }
  ]
  health_thresholds = [
    {
      assertion_name = "ServiceDown"
      expression     = "up < 1"
      entity_type    = "Service"
    },
    {
      assertion_name = "NodeDown"
      expression     = "up{job=\"node-exporter\"} < 1"
      entity_type    = "Service"
    }
  ]
}
Resource reference
grafana_asserts_thresholds
Manage Knowledge Graph threshold configurations through the Grafana API. This resource allows you to define custom thresholds for request, resource, and health assertions.
Arguments
Request thresholds block
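Each request_thresholds block supports the following arguments, as used in the examples on this page:
- entity_name: name of the service the threshold applies to, for example "payment-service"
- assertion_name: assertion the threshold applies to, for example "ErrorRatioBreach"
- request_type: direction of the request, for example "inbound" or "outbound"
- request_context: request path the threshold applies to, for example "/charge"
- value: numeric threshold value, for example 0.01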
Resource thresholds block
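Each resource_thresholds block supports the following arguments, as used in the examples on this page:
- assertion_name: assertion the threshold applies to, for example "Saturation"
- resource_type: type of resource, for example "container"
- container_name: name of the container, for example "worker"
- source: source of the measurement, for example "metrics"
- severity: severity level, for example "warning" or "critical"
- value: numeric threshold value, for example 75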
Health thresholds block
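Each health_thresholds block supports the following arguments, as used in the examples on this page:
- assertion_name: name of the health assertion, for example "ServiceDown"
- expression: Prometheus expression that defines the condition, for example "up < 1"
- entity_type: type of entity the assertion applies to, for example "Service"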
Example
resource "grafana_asserts_thresholds" "example" {
  provider = grafana.asserts
  request_thresholds = [{
    entity_name     = "api-service"
    assertion_name  = "ErrorRatioBreach"
    request_type    = "inbound"
    request_context = "/api/v1/users"
    value           = 0.02
  }]
  resource_thresholds = [{
    assertion_name = "Saturation"
    resource_type  = "container"
    container_name = "web-server"
    source         = "metrics"
    severity       = "warning"
    value          = 75
  }]
  health_thresholds = [{
    assertion_name = "ServiceDown"
    expression     = "up{job=\"api-service\"} < 1"
    entity_type    = "Service"
  }]
}
Best practices
Threshold configuration management
- Set appropriate threshold values based on your service level objectives (SLOs)
- Use different severity levels (warning, critical) to create escalation paths
- Test threshold configurations in non-production environments first
- Monitor threshold effectiveness and adjust values based on actual performance data
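One way to test thresholds in a non-production environment first is to select the values by environment. The following is a sketch only: the environment variable, the saturation_limits map, and the staging values are hypothetical and not part of the provider. The production values mirror the comprehensive example on this page.

```hcl
# Hypothetical input that selects which set of limits to apply.
variable "environment" {
  type        = string
  description = "Deployment environment: staging or production (assumed names)"
  default     = "staging"
}

locals {
  # Illustrative saturation limits in percent for each environment.
  saturation_limits = {
    staging    = { warning = 80, critical = 95 }
    production = { warning = 70, critical = 85 }
  }

  # Limits for the selected environment.
  limits = local.saturation_limits[var.environment]
}

resource "grafana_asserts_thresholds" "per_environment" {
  provider = grafana.asserts

  resource_thresholds = [
    {
      assertion_name = "Saturation"
      resource_type  = "container"
      container_name = "backend-api"
      source         = "metrics"
      severity       = "warning"
      value          = local.limits.warning
    },
    {
      assertion_name = "Saturation"
      resource_type  = "container"
      container_name = "backend-api"
      source         = "metrics"
      severity       = "critical"
      value          = local.limits.critical
    }
  ]
}
```

Applying with the default selects the staging limits. Setting the variable to production selects the stricter values once the looser ones have been observed in practice.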
Request threshold best practices
- Configure request thresholds for critical user-facing endpoints
- Set different thresholds for different request types (inbound vs outbound)
- Consider request context when setting thresholds for specific API paths
- Use error ratio thresholds to catch service degradation early
- Review historical performance data to set realistic threshold values
Resource threshold best practices
- Set resource thresholds based on your infrastructure capacity
- Use container-specific thresholds for microservices architectures
- Configure both warning and critical thresholds for gradual escalation
- Monitor resource utilization patterns to set realistic threshold values
- Consider seasonal or periodic patterns in resource usage
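When several containers need both a warning and a critical threshold, the list can be generated instead of written out by hand. This sketch assumes that resource_thresholds accepts a list of objects, as the attribute syntax in the examples on this page suggests. If your provider version defines it as a nested block, a dynamic block would be needed instead. The local names and limit values are illustrative.

```hcl
locals {
  # Containers that should each get a warning and a critical threshold.
  containers = ["frontend", "backend-api"]

  # Illustrative saturation limits (percent) keyed by severity.
  severity_limits = {
    warning  = 70
    critical = 85
  }
}

resource "grafana_asserts_thresholds" "generated" {
  provider = grafana.asserts

  # One entry per container and severity, flattened into a single list.
  resource_thresholds = flatten([
    for name in local.containers : [
      for severity, limit in local.severity_limits : {
        assertion_name = "Saturation"
        resource_type  = "container"
        container_name = name
        source         = "metrics"
        severity       = severity
        value          = limit
      }
    ]
  ])
}
```

With two containers and two severities, this produces four threshold entries, and adding a container to the list adds both of its thresholds at once.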
Health threshold best practices
- Use Prometheus expressions that accurately reflect service health
- Test health check expressions independently before applying them
- Set up health thresholds for critical dependencies and external services
- Use composite expressions for complex health checks
- Ensure expressions perform efficiently without causing excessive load
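As an illustration of a composite expression, the following sketch combines two of the checks used earlier on this page with the PromQL or operator, so the condition holds when either job reports a target as down. The assertion name ApiOrNodeDown is hypothetical.

```hcl
resource "grafana_asserts_thresholds" "composite_health" {
  provider = grafana.asserts

  health_thresholds = [{
    # Hypothetical assertion name for the combined check.
    assertion_name = "ApiOrNodeDown"
    # Comparison operators bind more tightly than "or", so this reads as
    # (api-service target down) or (node-exporter target down).
    expression  = "up{job=\"api-service\"} < 1 or up{job=\"node-exporter\"} < 1"
    entity_type = "Service"
  }]
}
```

Running the expression string in a Prometheus query editor first is a simple way to test it independently before applying it.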
Value selection guidelines
- Start conservative and adjust based on real-world performance
- Express ratio-based thresholds as fractions in the 0-1 range (for example, 0.01 for 1%)
- Use milliseconds for latency thresholds
- Document the reasoning behind specific threshold values
- Review and update thresholds regularly based on system evolution
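One possible way to follow these guidelines is to keep each value in a named local with a comment that records its unit and the reasoning behind it. The local names and the stated reasoning in this sketch are illustrative. The values themselves are taken from the examples on this page.

```hcl
locals {
  # Ratio in the 0-1 range: 0.01 means 1% of requests may fail.
  # Illustrative reasoning: matches a 99% success objective for /charge.
  charge_error_ratio = 0.01

  # Latency in milliseconds.
  # Illustrative reasoning: order requests should complete within half a second.
  orders_p99_latency_ms = 500
}

resource "grafana_asserts_thresholds" "documented" {
  provider = grafana.asserts

  request_thresholds = [
    {
      entity_name     = "payment-service"
      assertion_name  = "ErrorRatioBreach"
      request_type    = "inbound"
      request_context = "/charge"
      value           = local.charge_error_ratio
    },
    {
      entity_name     = "api-service"
      assertion_name  = "LatencyP99ErrorBuildup"
      request_type    = "inbound"
      request_context = "/api/v1/orders"
      value           = local.orders_p99_latency_ms
    }
  ]
}
```

Putting the unit in the local's name (for example, the _ms suffix) makes it harder to mix a ratio with a millisecond value when thresholds are reviewed later.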
Validation
After applying the Terraform configuration, verify that:
- Threshold configurations are applied in your Knowledge Graph instance
- Configurations appear in the Knowledge Graph UI under Observability > Rules > Threshold
- Request thresholds correctly identify breaches for specified services
- Resource thresholds trigger at appropriate severity levels
- Health thresholds accurately reflect service status
- Threshold values align with your SLO commitments



