Get help
Having trouble with telemetry signals? This page consolidates the most common troubleshooting steps for each signal and shows you where to escalate issues across metrics, logs, traces, and profiles.
Metrics troubleshooting
If no data appears:
- Use the `grafanacloud-[instance]-usage-insights` data source described in Troubleshoot Cloud Metrics write issues and run `{instance_type="metrics"} |= "path=write"` to surface recent write failures.
- Confirm your collectors can reach the Prometheus endpoint and that credentials match the ones you generated in Connections (see the sketch after this list).
- Inspect errors such as “sample too old” or “sample too far in the future” to check for clock skew and out-of-order replay windows.
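If the endpoint or credentials are in doubt, compare them against your collector configuration. The following is a minimal sketch of a Grafana Alloy `prometheus.remote_write` block; the push URL, instance ID, and token are placeholders to replace with the values shown under Connections → Metrics.

```alloy
prometheus.remote_write "grafana_cloud" {
  endpoint {
    // Placeholder URL: copy the exact push URL from Connections → Metrics.
    url = "https://prometheus-prod-XX-prod-us-central-0.grafana.net/api/prom/push"

    basic_auth {
      username = "<metrics instance ID>"  // numeric stack/instance ID
      password = "<access policy token>"  // token with the metrics:write scope
    }
  }
}
```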
If you receive high cardinality warnings:
- Label handling caps label names at 1024 characters, values at 2048 characters, and enforces a label-count limit. Truncated values include a hash suffix; remove unnecessary labels to avoid hitting the cap.
- Errors such as `err-mimir-label-value-too-long` or `received a series whose number of labels exceeds the limit` indicate which series needs to be relabeled (see the sketch after this list).
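One common fix is to drop or rewrite the offending label before the series reaches remote write. Below is a minimal sketch using Alloy's `prometheus.relabel` component; the label name `request_id` and the component wiring are illustrative assumptions, not values from your configuration.

```alloy
prometheus.relabel "reduce_cardinality" {
  forward_to = [prometheus.remote_write.grafana_cloud.receiver]

  // Hypothetical example: drop a high-cardinality label entirely.
  rule {
    action = "labeldrop"
    regex  = "request_id"
  }
}
```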
If you have query performance issues:
- Duplicated timestamps generate `duplicate sample for timestamp` errors; ensure scrapers are not sending the same sample twice.
- Backfilling must stay within the two-hour `out_of_order_time_window`. Replay older data chronologically so the latest timestamp always moves forward.
Logs troubleshooting
If no logs appear:
- Follow the steps in Troubleshoot Cloud Logs write issues: query `{instance_type="logs"} |= "push request failed"` to see error messages from Loki.
- Validate paths in `loki.source.file` (or whichever source you use) and confirm that the Grafana Cloud user/password pair matches the ones listed under Connections → Logs (see the sketch after this list).
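For reference, here is a minimal sketch of an Alloy file-tailing pipeline; the log path, push URL, and credentials are placeholders for the values in your environment and under Connections → Logs.

```alloy
local.file_match "app" {
  // Placeholder path: point this at the files you expect to ship.
  path_targets = [{"__path__" = "/var/log/app/*.log"}]
}

loki.source.file "app" {
  targets    = local.file_match.app.targets
  forward_to = [loki.write.grafana_cloud.receiver]
}

loki.write "grafana_cloud" {
  endpoint {
    url = "https://logs-prod-XXX.grafana.net/loki/api/v1/push"

    basic_auth {
      username = "<logs instance ID>"     // user listed under Connections → Logs
      password = "<access policy token>"  // token with the logs:write scope
    }
  }
}
```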
If you have label issues:
- Loki enforces the same 1024/2048 byte label length caps plus a maximum of 15 labels per stream. Errors such as `duplicate label name`, `invalid labels`, or `entry ... has N label names` point to the offending stream.
- Promote only low-cardinality labels; high-cardinality labels should stay in structured metadata via `otlp_config` if you enable it through the self-serve limits API (see the sketch after this list).
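For pipelines that run through Alloy, one way to keep cardinality down is to promote only a low-cardinality field as a stream label and keep high-cardinality fields in structured metadata. The field names below (`level`, `trace_id`) and the assumed JSON log format are illustrative, not values from your configuration.

```alloy
loki.process "shape_labels" {
  forward_to = [loki.write.grafana_cloud.receiver]

  // Parse fields out of JSON-formatted log lines (assumed format).
  stage.json {
    expressions = { level = "", trace_id = "" }
  }

  // Promote only the low-cardinality field as a stream label.
  stage.labels {
    values = { level = "" }
  }

  // Keep the high-cardinality field as structured metadata instead of a label.
  stage.structured_metadata {
    values = { trace_id = "" }
  }
}
```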
If you receive query timeout errors:
- `entry too far behind` errors mean data is arriving out of order (>1 hour behind the stream head).
- `entry too old` signals that data exceeded `reject_old_samples_max_age` (default one week).
- Lines larger than 256 KB are rejected unless you set `max_line_size_truncate` via the configuration API.
Traces troubleshooting
If no traces appear:
- Troubleshoot Grafana Cloud Traces recommends validating credentials (instance ID as username, token with `traces:write` scope) and ensuring you hit the correct OTLP endpoint (`https://<stack>.grafana.net/tempo` for HTTP, `<stack>.grafana.net:443` for gRPC). See the sketch after this list.
- Use the Alloy UI (`http://localhost:12345`) or `alloy fmt` to ensure receivers, processors, and exporters are healthy.
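As a point of comparison, here is a minimal sketch of an Alloy OTLP gRPC exporter with basic auth; the endpoint, instance ID, and token are placeholders to swap for your stack's values.

```alloy
otelcol.auth.basic "grafana_cloud" {
  username = "<traces instance ID>"   // stack/instance ID
  password = "<access policy token>"  // token with the traces:write scope
}

otelcol.exporter.otlp "grafana_cloud" {
  client {
    // gRPC endpoint (host:port), not the HTTP /tempo URL.
    endpoint = "<stack>.grafana.net:443"
    auth     = otelcol.auth.basic.grafana_cloud.handler
  }
}
```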
If you have missing spans:
- Metrics-generator only creates metrics for `SERVER`/`CONSUMER` span kinds by default. If spans show up in TraceQL but not RED metrics, file a Support ticket to enable the additional kinds or monitor slack as described in metrics-generator constraints.
- Tail sampling can delay or drop spans if the decision wait is longer than the trace duration; reduce the wait time or adjust caches accordingly (see the sketch after this list). Refer to Sampling for more information about sampling, policies, and examples.
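The decision wait and policies live on the tail-sampling processor. Below is a minimal sketch of Alloy's `otelcol.processor.tail_sampling`; the 15-second wait, policy names, and sampling percentage are assumptions to adapt, not recommendations.

```alloy
otelcol.processor.tail_sampling "default" {
  // Example value: long enough for most traces to complete, short enough
  // to avoid holding spans in memory longer than necessary.
  decision_wait = "15s"

  // Always keep traces that contain errors.
  policy {
    name = "errors"
    type = "status_code"
    status_code {
      status_codes = ["ERROR"]
    }
  }

  // Probabilistic baseline for everything else.
  policy {
    name = "baseline"
    type = "probabilistic"
    probabilistic {
      sampling_percentage = 10
    }
  }

  output {
    traces = [otelcol.exporter.otlp.grafana_cloud.input]
  }
}
```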
If you find sampling issues or errors:
- Review the sampling strategy guide: combine probabilistic sampling for baseline coverage with status/latency policies for critical traces. For additional context, refer to Tail sampling policies and strategies in the Tempo documentation.
- `RESOURCE_EXHAUSTED` errors are retryable; configure `sending_queue` and `retry_on_failure` blocks so exporters back off instead of dropping data (see the sketch after this list).
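These blocks sit alongside `client` on the exporter. The sketch below extends the earlier exporter example; the queue size and retry window are illustrative assumptions, not recommended values.

```alloy
otelcol.exporter.otlp "grafana_cloud" {
  client {
    endpoint = "<stack>.grafana.net:443"
    auth     = otelcol.auth.basic.grafana_cloud.handler
  }

  // Buffer spans in memory so transient RESOURCE_EXHAUSTED responses
  // don't cause immediate drops.
  sending_queue {
    enabled    = true
    queue_size = 5000
  }

  // Back off and retry instead of discarding data on retryable errors.
  retry_on_failure {
    enabled          = true
    max_elapsed_time = "2m"
  }
}
```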
Profiles troubleshooting
If profiles have trouble uploading:
- The Profiles ingestion control API rejects traffic with HTTP 422 once the daily megabyte cap is hit. Check `grafanacloud_profiles_instance_ingest_limit_megabytes` and `grafanacloud_profiles_instance_discarded_bytes_per_second{reason="ingest_limit_reached"}` to confirm.
- Only Grafana Admins can adjust the limit; include the latest `metadata.generation` in your update to avoid conflicts.
If there are symbolization errors:
- Follow the checklist in Pyroscope Symbolization: download the profile, inspect mappings with `go tool pprof -raw`, verify build IDs, and make sure a `debuginfod` server can supply the debug info.
- Only system libraries are symbolized; customer code without build IDs will continue to show raw addresses.
Cross-signal troubleshooting
This section provides troubleshooting help for when you encounter problems using your telemetry signals together.
Correlation not working
- Use the same service, environment, cluster, and region labels and attributes in every pipeline. The instrumentation guide recommends adding shared labels (for metrics, logs, and profiles) and attributes (for traces) with consistent service names so all four signals can be stitched together (see the sketch after this list).
- Ensure trace context propagates through your services so log lines and metrics exemplars can link back to traces.
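As an illustration, the same pair of labels can be attached to both metrics and logs via `external_labels` in Alloy; the label values below are placeholders and should match the resource attributes (for example `service.name`, `deployment.environment`) set in your tracing instrumentation.

```alloy
prometheus.remote_write "grafana_cloud" {
  // Placeholder values: keep these identical across every pipeline.
  external_labels = {
    cluster     = "prod-us-east-1",
    environment = "production",
  }

  endpoint {
    url = "https://prometheus-prod-XX-prod-us-central-0.grafana.net/api/prom/push"
  }
}

loki.write "grafana_cloud" {
  external_labels = {
    cluster     = "prod-us-east-1",
    environment = "production",
  }

  endpoint {
    url = "https://logs-prod-XXX.grafana.net/loki/api/v1/push"
  }
}
```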
Authentication errors
- Re-download credentials from Connections if tokens expire or scopes are missing. Logs and metrics access-policy tokens differ from traces/profile tokens, so double-check which one each collector uses.
- Verify that Alloy/OpenTelemetry exporters reference the correct regional endpoints (for example, `tempo-prod-us-central-0.grafana.net:443`). Typing an HTTP URL into a gRPC exporter returns `404`/`415` errors.
Diagnostic queries
The following sections provide queries that you can run to help you diagnose issues.
Check data is flowing
- Metrics: `sum by (id)(grafanacloud_instance_active_series)` and `sum by (id)(grafanacloud_instance_samples_per_second)` show live usage and match the metrics troubleshooting workflow.
- Logs: `{instance_type="logs"} |= "push request failed"` or `{instance_type="logs"} |= "rate limit"` highlight recent ingestion errors in the Usage Insights Loki data source.
- Traces: Use a known trace ID to run `{ trace:id = "0123456789abcdef" }` or start from the TraceQL Search builder as documented in the traces troubleshooting guide.
Verify label matching
- Metrics: `sum by (cluster, environment) (up)` confirms that the labels you expect are applied to every target. Missing labels indicate relabeling filters.
- Logs: `count_over_time({cluster!=""} |~ "ERROR"[5m])` checks that every stream carries a `cluster` label and reports errors consistently.
- Profiles: Filter flame graphs by `service_name`, `region`, or other labels coming from your Pyroscope scrapers to confirm that the same taxonomy exists across signals.
Support channels
If you still need help, here is how to get in touch with the Grafana community and Grafana Support.
Community support
- Grafana Community Forums
- Grafana Slack (`#mimir`, `#loki`, `#tempo`, `#pyroscope`, `#grafana-cloud`)
Enterprise support
- Open a support ticket
- Contact your Grafana account team for plan-specific questions or limit increases



