Troubleshoot signal correlation
This guide helps you diagnose and fix common signal correlation issues.
Can’t navigate from metrics to logs
Symptoms
- Clicking a metric doesn’t show related logs.
- Logs panel is empty when navigating from metrics.
- “No logs found” message appears.
Possible causes
- Labels don’t match between data sources.
- Time ranges are misaligned.
- Label names are different (for example, service vs service_name).
- No logs exist for that time range or label combination.
Solutions
Verify label names match:
Check Prometheus metric labels:
up{service="api"}
Check Loki log labels:
{service="api"}
Compare label names. They must be identical (case-sensitive).
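If you prefer to check this programmatically, the following is a minimal Go sketch that reads label names from the Prometheus and Loki label APIs and prints any that exist on only one side. The base URLs are placeholders, and Grafana Cloud endpoints additionally require basic authentication, which is omitted here:
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"net/http"
)

// Both Prometheus (/api/v1/labels) and Loki (/loki/api/v1/labels) return
// {"status":"success","data":["label_a","label_b",...]}.
type labelsResponse struct {
	Data []string `json:"data"`
}

func fetchLabels(url string) map[string]bool {
	resp, err := http.Get(url)
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	var lr labelsResponse
	if err := json.NewDecoder(resp.Body).Decode(&lr); err != nil {
		log.Fatal(err)
	}

	set := make(map[string]bool, len(lr.Data))
	for _, name := range lr.Data {
		set[name] = true
	}
	return set
}

func main() {
	// Placeholder URLs: point these at your own Prometheus and Loki endpoints.
	prom := fetchLabels("http://prometheus:9090/api/v1/labels")
	loki := fetchLabels("http://loki:3100/loki/api/v1/labels")

	for name := range prom {
		if !loki[name] {
			fmt.Println("only in Prometheus:", name)
		}
	}
	for name := range loki {
		if !prom[name] {
			fmt.Println("only in Loki:", name)
		}
	}
}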
If label names differ:
Use Alloy or collector configuration to normalize labels:
// Prometheus remote write
prometheus.remote_write "default" {
  external_labels = {
    service = "api",
  }
}

// Loki write
loki.write "default" {
  external_labels = {
    service = "api", // Use same name as Prometheus
  }
}
Verify labels have the same values:
up{service="api"}    # Prometheus shows: service="api"
{service="api"}      # Loki must also show: service="api"
If values differ (for example, api vs api-service), fix in your instrumentation or configuration.
Check time ranges:
Ensure both queries use the same time range. In Drilldown or Explore, the time range selector should show the same values for both panes.
Trace IDs don’t appear in logs
Symptoms
- Log entries don’t show trace IDs.
- No clickable links to traces in log view.
- Derived fields don’t work.
Possible causes
- Trace context not propagated in application.
- Logging format doesn’t include trace context.
- OpenTelemetry SDK not configured correctly.
Solutions
Verify application propagates trace context:
Check that your application code includes trace context in logs. For OpenTelemetry-instrumented apps:
Python:
from opentelemetry.instrumentation.logging import LoggingInstrumentor
# Must be called during app initialization
LoggingInstrumentor().instrument()
Go:
import (
    "log/slog"
    "os"

    "go.opentelemetry.io/otel/trace"
)

// Attach the current span's trace ID to the log record (ctx is the request context)
logger := slog.New(slog.NewJSONHandler(os.Stdout, nil))
logger.Info("request processed", "trace_id", trace.SpanFromContext(ctx).SpanContext().TraceID().String())
Check log output format:
View raw logs to confirm trace IDs are present:
kubectl logs <pod-name> | grep -i trace
Expected output should include trace_id:
{"level":"info","trace_id":"abc123...","msg":"request processed"}
If trace IDs are missing:
Add manual instrumentation to include trace context:
import logging

from opentelemetry import trace

logger = logging.getLogger(__name__)

def handle_request():
    span = trace.get_current_span()
    trace_id = format(span.get_span_context().trace_id, "032x")
    logger.info("Processing request", extra={"trace_id": trace_id})
Derived fields don’t work
Symptoms
- Trace IDs appear in logs but aren’t clickable.
- Clicking trace ID shows “No trace found”.
- Derived field configured but not appearing.
Possible causes
- Regex doesn’t match log format.
- Derived field not configured correctly.
- Tempo data source not selected.
Solutions
Test your regular expression pattern:
- Find a log entry with a trace ID.
- Copy the full log line.
- Test your regular expression against it.
Example log line:
{"timestamp":"2024-01-01T12:00:00Z","trace_id":"abc123def456","level":"info"}
Matching regular expression:
"trace_id":\s*"(\w+)"
Verify derived field configuration:
- In Grafana Cloud, go to Connections > Data sources.
- Select your Loki data source.
- Check Derived fields section:
- Name: Must be unique (for example, traceId)
- Regex: Must match your log format
- Internal link: Must be toggled ON
- Data source: Must select Tempo data source
- Query: Should be ${__value.raw}
Test the configuration:
Query logs that should have trace IDs:
{service="api"} | json | trace_id != ""
Look for trace IDs that are underlined/clickable. If they’re not clickable, the regular expression or data source configuration is incorrect.
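If the trace IDs still aren’t clickable, you can rule out the pattern itself by testing it outside Grafana. Here is a minimal Go sketch using the sample log line and pattern from earlier in this section (substitute one of your real log lines):
package main

import (
	"fmt"
	"regexp"
)

func main() {
	// Sample log line and derived-field pattern from this section.
	logLine := `{"timestamp":"2024-01-01T12:00:00Z","trace_id":"abc123def456","level":"info"}`
	pattern := regexp.MustCompile(`"trace_id":\s*"(\w+)"`)

	if m := pattern.FindStringSubmatch(logLine); m != nil {
		fmt.Println("captured trace ID:", m[1]) // expect: abc123def456
	} else {
		fmt.Println("no match: adjust the pattern to your log format")
	}
}
If the sketch captures the trace ID but Grafana doesn’t, the problem is more likely in the derived field or data source configuration than in the pattern.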
Exemplars aren’t showing on graphs
Symptoms
- No diamond points appear on Time series graphs.
- Can’t click from metrics to traces.
- Exemplars enabled but not visible.
Possible causes
- Application not emitting exemplars.
- Panel type isn’t Time series.
- Exemplars not enabled in panel settings.
- send_exemplars not enabled in Alloy/Collector.
Solutions
Verify application emits exemplars:
Check your metrics endpoint:
curl -H "Accept: application/openmetrics-text" http://your-app:9090/metrics | grep -i "traceid"
Expected output:
http_request_duration_seconds_bucket{le="0.1"} 45 # {trace_id="abc123"} 0.05
If exemplars are missing:
Update your instrumentation to include exemplars:
histogram.ObserveWithExemplar(
    duration,
    prometheus.Labels{"trace_id": traceID},
)
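The snippet above assumes histogram, duration, and traceID already exist. For context, here is a slightly fuller, illustrative sketch using client_golang that creates the histogram and falls back to a plain observation when the underlying implementation doesn’t support exemplars:
package main

import (
	"github.com/prometheus/client_golang/prometheus"
)

// Illustrative histogram; exemplars are only exposed when metrics are
// scraped or written in the OpenMetrics format.
var requestDuration = prometheus.NewHistogram(prometheus.HistogramOpts{
	Name:    "http_request_duration_seconds",
	Help:    "HTTP request latency.",
	Buckets: prometheus.DefBuckets,
})

func init() {
	prometheus.MustRegister(requestDuration)
}

func observeWithTrace(duration float64, traceID string) {
	// ObserveWithExemplar is available through the ExemplarObserver interface.
	if eo, ok := requestDuration.(prometheus.ExemplarObserver); ok {
		eo.ObserveWithExemplar(duration, prometheus.Labels{"trace_id": traceID})
		return
	}
	requestDuration.Observe(duration)
}

func main() {
	// Example call; in real code duration and traceID come from the request and its span.
	observeWithTrace(0.05, "abc123def456")
}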
Check panel configuration:
- Edit the dashboard panel.
- Verify Panel type is “Time series” (not “Graph”).
- In the legend, toggle Exemplars ON.
Verify Alloy/collector configuration:
prometheus.remote_write "default" {
  endpoint {
    url = "..."
    send_exemplars = true // Must be true
  }
}
Check data source configuration:
The Prometheus data source in Grafana must have the Tempo data source configured for exemplar linking:
- In Grafana Cloud, go to Connections > Data sources.
- Select Prometheus data source.
- Scroll to Exemplars.
- Verify Tempo data source is selected.
Can’t link traces to profiles
Symptoms
- No “Profiles” link appears on trace spans.
- Clicking profiles link shows “No data”.
- Profiles appear but for wrong time range.
Possible causes
- Span profiles not configured.
- Tempo data source not configured for traces-to-profiles.
- Label mappings incorrect.
- Profiles not available for the time range.
Solutions
Verify span profiles are enabled:
Check your application sends span profiles:
pyroscope.Start(pyroscope.Config{
    EnableSpanProfiling: true, // Must be true
})
Check Tempo data source configuration:
- In Grafana Cloud, go to Connections > Data sources.
- Select Tempo data source.
- Scroll to Traces to profiles.
- Verify:
- Enabled toggle is ON
- Pyroscope data source is selected
- Profile type is set (for example, process_cpu:cpu:nanoseconds:cpu:nanoseconds)
- Label mappings are correct
Verify attribute-to-label mappings:
Trace attributes must map to profile labels:
Trace attributes (from Tempo):
resource.service.name = "api"
Profile labels (in Pyroscope):
service_name = "api"
Mapping configuration:
- Span attribute (from traces): service.name
- Label (in profiles): service_name
Test manually:
- Note the service name and time range from a trace.
- Go to Explore with the Pyroscope data source.
- Query profiles manually:
{service_name="api"}
- Verify profiles exist for that time range.
If no profiles exist, your application may not be sending them.
Labels don’t match between signals
Symptoms
- Can’t correlate across different signal types.
- Each signal has different label names or values.
- Inconsistent data when filtering.
Possible causes
- Different instrumentation libraries use different conventions.
- External labels not configured consistently.
- Resource attributes not mapped correctly.
Solutions
Standardize label and attribute names:
Create a mapping of identifiers across signals. For a service named api, the identifiers used throughout this guide are:
- Metrics (Prometheus): service
- Logs (Loki): service
- Traces (Tempo): resource.service.name
- Profiles (Pyroscope): service_name
Configure external labels consistently:
In Alloy:
prometheus.remote_write "default" {
  external_labels = {
    service = "api",
    environment = "prod",
  }
}

loki.write "default" {
  external_labels = {
    service = "api", // Same name and value
    environment = "prod",
  }
}
For OpenTelemetry:
Use resource attributes for traces:
processors:
  resource:
    attributes:
      - key: service.name
        value: api
        action: upsert
Tempo automatically uses resource.service.name, and you can map it in the Tempo data source configuration.
Verify label and attribute consistency:
Query each data source with the same identifier:
up{service="api"}                # Prometheus
{service="api"}                  # Loki
{resource.service.name="api"}    # Tempo
{service_name="api"}             # Pyroscope
All queries should return data for the same service.
Profiles not available
Symptoms
- Profiles don’t appear in Explore or Profiles Drilldown.
- “No data” when querying Pyroscope.
- Profiles link from traces shows empty results.
Possible causes
- Daily ingestion limit reached.
- Profiles not being sent from application.
- Incorrect service_name label.
- Time range mismatch.
Solutions
Check daily ingestion limit:
If you have configured a daily ingestion limit for profiles, data is discarded when the limit is reached. Monitor your usage:
- In the Grafana main menu, click Dashboards. The Dashboards page lists all of the available dashboards in your Grafana instance.
- Click the Usage Insights dashboard.
- Check the grafanacloud_profiles_instance_period_ingested_megabytes metric.
- Compare against grafanacloud_profiles_instance_ingest_limit_megabytes.
If the limit has been reached, new data is discarded until the limit resets at 00:00 UTC.
Verify profiles are being sent:
Check your application’s Pyroscope configuration:
pyroscope.Start(pyroscope.Config{
    ApplicationName: "my-service",
    ServerAddress:   "https://profiles-xxx.grafana.net",
    // Ensure correct authentication
    BasicAuthUser:     "<PYROSCOPE_USERNAME>",
    BasicAuthPassword: "<GRAFANA_CLOUD_API_KEY>",
})
Check service_name label:
Profiles use service_name (with underscore), not service.name (with dot):
{service_name="my-service"}   // Correct for Pyroscope
{service.name="my-service"}   // Wrong, this is trace attribute syntax
Verify time range:
Profiles are continuous but may have gaps. Ensure you’re querying a time range when your application was actively running and sending data.
Ingestion limits exceeded
Symptoms
- Data gaps in metrics, logs, or traces.
- “Rate limit exceeded” errors in collector logs.
- Incomplete correlation due to missing data.
Possible causes
- Ingestion rate exceeds configured limits.
- Burst of traffic exceeded burst limits.
- High cardinality causing excessive series/streams.
Solutions
Check current limits and usage:
Query the Usage Insights data source:
# Metrics ingestion rate
rate(grafanacloud_instance_samples_per_second[5m])
# Logs ingestion rate
rate(grafanacloud_logs_instance_bytes_received_total[5m])
Compare against your limits (check usage limits documentation).
Reduce ingestion rate:
- For metrics: Use Adaptive Metrics to aggregate unused metrics.
- For logs: Filter noisy log lines at the collector level.
- For traces: Implement tail sampling to keep only important traces.
Monitor for dropped data:
Check collector logs for rate limit errors:
# In Grafana Alloy logs
grep -i "rate limit" /var/log/alloy/*.log
Query results truncated
Symptoms
- “Results may be incomplete” warning.
- Fewer results than expected.
- Query times out.
Possible causes
- Query returns too many series or streams.
- Time range too long for data volume.
- High cardinality query.
Solutions
Add more specific label filters:
Instead of:
http_requests_total
Use:
http_requests_total{service="api", environment="prod"}
Reduce time range:
Start with a shorter time range and expand if needed:
- Start with 1 hour instead of 24 hours.
- Use relative time ranges (Last 15 minutes) for initial investigation.
Use aggregation functions:
For metrics, use topk() or aggregations:
topk(10, sum by (service) (rate(http_requests_total[5m])))
For logs, add filters before aggregating:
{service="api"} |= "error" | json | line_format "{{.message}}"
Check query limits:
Data older than retention unavailable
Symptoms
- Historical correlation fails.
- “No data” for older time ranges.
- Only recent data appears.
Possible causes
- Data deleted due to retention policy.
- Different retention periods across signals.
Solutions
Check retention periods:
Default retention varies by signal:
Note
These are the defaults for Grafana Cloud paid accounts. Your company may have negotiated and paid for custom retention.
Plan for retention alignment:
If you need to correlate signals over longer periods:
- Extend logs retention (configurable from 30 days to 1 year via API).
- Export logs to long-term storage before retention expires.
Use metrics for historical context:
When logs or traces are no longer available:
- Use metrics (longer retention) to understand historical patterns.
- Correlate forward from metrics to recent logs/traces.
Verification checklist
Use this checklist to systematically verify correlation:
Shared labels and attributes
- Label and attribute names match across Prometheus, Loki, Tempo, and Pyroscope.
- Label and attribute values match exactly (case-sensitive).
- Query each data source confirms labels and attributes exist.
- Time ranges align when testing.
Trace IDs in logs
- Application code propagates trace context.
- Raw logs contain trace_id fields.
- Trace ID format is 32-character hex.
- All log entries from traced requests have trace IDs.
Derived fields (Loki to Tempo)
- Derived field configured in Loki data source.
- Regex pattern matches actual log format exactly.
- Tempo data source selected as target.
- Trace IDs are underlined/clickable in logs.
- Clicking trace ID opens correct trace.
Exemplars (Prometheus to Tempo)
- Application code emits exemplars with trace IDs.
- Metrics endpoint shows exemplars (OpenMetrics format).
- Alloy/collector has send_exemplars = true.
- Dashboard uses Time series panel type.
- Exemplars toggle enabled in panel.
- Prometheus data source links to Tempo.
- Exemplar points visible as diamonds on graph.
Traces to profiles (Tempo to Pyroscope)
- Application instrumented for both traces and profiles.
- Span profiles enabled if using span-level linking.
- Tempo data source configured for traces-to-profiles.
- Pyroscope data source selected.
- Label mappings configured correctly.
- Profiles link appears on trace spans.
- Clicking opens profile for correct time range.
Get help
If you’re still experiencing issues:
- Grafana Community Forums: https://community.grafana.com
- Grafana Cloud Support: Available in the Grafana Cloud portal
- Documentation: