Quick start for telemetry signals
Modern applications generate four types of telemetry signals: metrics, logs, traces, and profiles. Each tells part of the story when you investigate issues. Metrics alert you to problems, logs explain what happened, traces show the request journey, and profiles identify code-level bottlenecks.
In this guide, you explore how these signals work together through a hands-on scenario using play.grafana.org. No setup is required.
Note
For instructions on how to send telemetry data to Grafana Cloud, refer to the Learning Journeys for your signal, or to Instrument or send data to Grafana Cloud.
Before you begin
- Familiarize yourself with each signal type described in the following section.
- Open play.grafana.org in your browser (no login required).
Understand the four observability signals
Before diving in, understand what each signal provides.
Metrics: What happened?
Metrics are aggregated numerical data over time, for example, request rate, CPU usage, or error count. They’re excellent for alerting and trend analysis, and they’re lightweight and efficient for long-term storage.
Example: “HTTP request rate increased from 100 to 500 req/sec at 2:15 PM.”
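In Grafana, Prometheus metrics are queried with PromQL. As a minimal sketch, assuming a counter named http_requests_total exists in your environment (the metric name is illustrative, not from the Play data), the following expression returns the per-second request rate averaged over the last five minutes:
rate(http_requests_total[5m])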
Logs: Why did it happen?
Logs are discrete event records with contextual information. They capture error messages, state changes, and application events. You can search and filter them.
Example: “Database connection timeout: ‘Connection pool exhausted after 30s.’”
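Loki logs are queried with LogQL. As a sketch, assuming your application logs carry a job label with the value api (illustrative, not from the Play data), this query returns only the lines containing the timeout message:
{job="api"} |= "Connection pool exhausted"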
Traces: Where did it happen?
Traces create a map of a request’s path through distributed services. They show timing and relationships between service calls and identify latency contributors.
Example: “API call spent 2.3s in database service, 0.1s in cache service.”
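Tempo traces are queried with TraceQL. As a sketch, the following finds spans from a hypothetical checkout service that took longer than two seconds, which is one way to surface latency contributors directly:
{ resource.service.name = "checkout" && duration > 2s }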
Profiles: Which code caused it?
Profiles show function-level resource consumption, such as CPU and memory. They’re visualized as flame graphs showing code execution paths and pinpoint specific functions consuming resources.
Example: “The JSON serialization function consumed 45% of CPU time.”
Investigate using all four signals
Watch this video to see how the four observability signals work together in Grafana Cloud.
Explore metrics
Metrics are your starting point for detecting anomalies.
- In Grafana Play, click Explore in the main menu.
- From the data source drop-down menu, select a Prometheus data source, for example, grafanacloud-play-prom. You can type the name of the data source in the search box to find it.
- Select Builder instead of Code to use the guided query builder.
- Click the Explain toggle to see additional descriptions.
- In the Metric field, browse the list of available metrics or search for a metric like app_checkout_total.
  - Click the Metrics explorer icon to view a list of available metrics and filter by name and type.
  - Select a metric to add it to your query.
- Click Run query to see a time series graph showing how the metric changes over time.
Metrics give you the “what?” and “when?” You can see resource usage patterns, but they don’t explain root causes.
Try this: In the Metrics explorer, filter by type (counter, gauge, histogram) to understand different metric types. Notice how you can see values change, but not why. This is where logs and traces provide context.
Refer to Prometheus metrics types for additional information.
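If app_checkout_total is a counter (you can confirm its type in the Metrics explorer), switching to Code mode lets you run a PromQL expression that turns the ever-growing total into a per-second checkout rate, which is usually easier to read for trend analysis:
rate(app_checkout_total[5m])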
Correlate with logs
Logs provide the narrative of what happened in your application.
- Click Explore in the main menu.
- From the data source drop-down menu, select a Grafana Loki data source, for example, grafanacloud-play-logs.
- Switch from Builder to Code mode using the toggle in the query editor.
- In the query editor, enter this query:
  {job=~".+"} | json
- Click the Run query icon to execute the query.
- Examine the log lines returned—notice the timestamp, level, and message fields.
Logs show discrete events with context. Error messages, stack traces, and application events help explain what happened during a metric anomaly.
Try this: Add | level="error" to the query to filter for error logs. Notice how logs include structured data (JSON fields) you can filter on. Click on a log line to expand it and see all fields. You may need to adjust the time range to see errors.
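For reference, the complete query with the error filter applied looks like this:
{job=~".+"} | json | level="error"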
Follow requests with traces
Metrics tell you what happened and logs tell you why, but neither shows where time was spent across services. Traces connect the dots across distributed services.
- In Explore, from the data source drop-down menu, select a Grafana Tempo data source, for example, grafanacloud-play-traces.
- Select the Search query type.
- Under Service Name, select a service, for example, frauddetectionservice or shippingservice.
- Click Run query to see a list of traces.
- Click any trace ID to open the trace view.
- Examine the waterfall view showing the timeline of the request, different services involved, and time spent in each service.
Traces visualize request flow. You can see exactly where time was spent across multiple services, which is critical for understanding latency in distributed systems.
Try this: Click individual spans (bars) in the trace to see details. Notice the parent-child relationships between spans. Look for spans with long durations—these are latency contributors.
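If you prefer writing queries over using the Search builder, the Tempo data source also accepts TraceQL as a query type. A sketch of a comparable search, narrowed to slower requests (the 500ms threshold is arbitrary):
{ resource.service.name = "shippingservice" && duration > 500ms }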
Optimize with profiles
Profiles show function-level resource consumption.
- In Explore, from the data source drop-down menu, select a Grafana Pyroscope data source, for example, grafanacloud-play-profiles.
- Select a profile type, for example, process_cpu-cpu.
- Click Run query to see a flame graph visualization.
- In the flame graph, width represents the time or resources consumed, height represents the call stack depth, and each box represents a function.
Profiles identify the specific code consuming resources. This is essential for performance optimization.
Try this: Click different sections of the flame graph to zoom in. Wider sections indicate more resource consumption. Follow the stack from bottom (entry points) to top (leaf functions).
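Pyroscope queries pair the selected profile type with a label selector in curly braces, similar to PromQL. As a sketch, assuming the profiles carry a service_name label (label names in the Play environment may differ, and checkoutservice is only an example), this scopes the flame graph to a single service:
{service_name="checkoutservice"}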
Result
After completing these steps, you’ve explored all four telemetry signals in Grafana:
- Metrics for detection, identifying that something happened.
- Logs for context, understanding why it happened.
- Traces for location, pinpointing where in the system it happened.
- Profiles for optimization, identifying which code caused it.
Apply signals to a real investigation
Here’s how you use all four signals to investigate a production issue.
Scenario: Your application response time increased from 200ms to 2000ms at 2:00 PM.
- Metrics detect the problem: a latency graph shows the response-time spike starting at 2:00 PM.
- Logs explain it: filtering for errors in that window surfaces messages like "Connection pool exhausted after 30s."
- Traces locate it: slow requests spend most of their time waiting on the database service.
- Profiles rule out code-level bottlenecks: no single function dominates CPU time, pointing to waiting rather than computation.
Root cause identified: Connection pool too small for traffic volume. Solution: Increase pool size from 10 to 50 connections.
Navigate between signals in Grafana
You can jump between signals in Grafana:
- From metrics to logs: In Explore, run a metric query in Prometheus, then switch the data source drop-down to a Loki or logs data source. Grafana automatically retains matching labels (like service and namespace) and builds a log query. Adjust the time range to focus on the anomaly window.
- From logs to traces: Expand a log line and look for a traceID field in the log details. Click it to open the trace view. Find logs with traces using:
  {service="api"} | traceID != ""
- From traces to profiles: Expand a span with profiling data and click Profiles for this span to view a flame graph. Find instrumented spans using:
  {span.pyroscope.profile.id != nil}
In some cases, moving between signals requires additional configuration, such as matching labels and attributes across data sources or settings on the data source itself. For example, to move from metrics to traces, you need to set up shared labels or attributes. Refer to Set up correlations.
When to use each signal
- Metrics: detect that something happened and when, for example, a sudden jump in request rate.
- Logs: understand why it happened, for example, "Connection pool exhausted" errors.
- Traces: locate where in the system it happened, for example, time spent in the database service.
- Profiles: identify which code caused it, for example, a function consuming most of the CPU time.
Next steps
Now that you understand how signals work together:
- Explore correlation features: Try the split-screen mode to compare signals side-by-side.
- Learn query languages: PromQL for metrics, LogQL for logs, and TraceQL for traces.
- Set up your own stack: Try Grafana Cloud’s free tier for your applications.
- Build dashboards: Create custom dashboards combining all four signals.
Resources
- Play.grafana.org—Practice anytime
- Grafana community—Get help and share knowledge



