Use the Cloud Profiles UI to explore profiling data
The Cloud Profiles user interface (UI) is designed to make it easy to visualize and analyze profiling data. It offers several modes for viewing, analyzing, uploading, and comparing profiling data.
While code profiling has been a long-standing practice, continuous profiling represents a modern and more advanced approach to performance monitoring. This technique adds two critical dimensions to traditional profiles:
- Time: Profiling data is collected continuously, providing a time-centric view that allows querying performance data from any point in the past.
- Metadata: Profiles are enriched with metadata, adding contextual depth to the performance data.
These dimensions, coupled with the detailed nature of performance profiles, make continuous profiling a uniquely valuable tool. The Profiles UI builds on this by offering a convenient place to analyze profiles and surface insights that other signals, such as logs, metrics, and traces, can’t provide.
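To make the two dimensions concrete, here is a minimal sketch of how a profile sample might carry a timestamp and labels, and how a query could filter on both. The `ProfileSample` type, the `select` helper, and the sample data are hypothetical illustrations, not the Cloud Profiles data model or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProfileSample:
    """Hypothetical record: one stack sample plus the two extra dimensions."""
    timestamp: int   # seconds since some epoch (the time dimension)
    labels: dict     # key/value metadata such as env or region
    stack: tuple     # call stack, outermost function first
    value: int       # measured cost, for example CPU milliseconds


def select(samples, start, end, **label_matchers):
    """Return samples in [start, end) whose labels equal every matcher."""
    return [
        s for s in samples
        if start <= s.timestamp < end
        and all(s.labels.get(k) == v for k, v in label_matchers.items())
    ]


# Illustrative data only.
samples = [
    ProfileSample(100, {"env": "production", "region": "us-east-1"},
                  ("main", "handleRequest"), 30),
    ProfileSample(160, {"env": "production", "region": "us-west-1"},
                  ("main", "handleRequest"), 50),
    ProfileSample(220, {"env": "development", "region": "us-east-1"},
                  ("main", "loadConfig"), 10),
]

# Ask about a past time window, narrowed by metadata.
prod_early = select(samples, 0, 200, env="production")
```

Because every sample keeps its timestamp and labels, the same stored data can answer both "what happened at this point in the past?" and "how does production differ from development?".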
In this UI reference, you’ll learn how Cloud Profiles parallels other modern observability tools by providing a Prometheus-like querying experience. More importantly, you’ll learn how to use the extensive UI features to gain deeper insight into your application’s performance.
Key features
The following sections describe Cloud Profiles UI capabilities.
On some views, you can use Explain flame graph to generate an AI analysis of the flame graph that explains the performance bottleneck, root cause, and recommended fix. For more information, refer to Flame graph AI.
Tag Explorer
The Tag Explorer page lets you navigate and analyze performance data through tags and labels. This feature is crucial for identifying performance anomalies and understanding how different application segments behave under various conditions. Cloud Profiles intentionally doesn’t include a query language on this page. Instead, the page is designed so that you can navigate the UI and drill down into the tags that interest you most.
To use the Tag Explorer:
- Select a tag to view the corresponding profiling data
- Analyze the pie chart and the table of descriptive statistics to determine which tags, if any, are behaving abnormally
- Make use of the shortcuts to the single, comparison, and diff pages to further identify the root cause of the performance issue
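The kind of per-tag breakdown that a pie chart and a statistics table summarize can be sketched as follows. This is an assumption about the general technique (group by tag value, then compute totals and shares), not the page's actual implementation. The `tag_breakdown` function and the sample data are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def tag_breakdown(samples, tag):
    """Group sample values by one tag and compute simple statistics.

    `samples` is a list of (labels, value) pairs. Returns one row per tag
    value, sorted by total cost, largest first.
    """
    groups = defaultdict(list)
    for labels, value in samples:
        groups[labels.get(tag, "(no value)")].append(value)

    grand_total = sum(sum(values) for values in groups.values())
    rows = []
    for tag_value, values in groups.items():
        total = sum(values)
        rows.append({
            "value": tag_value,
            "total": total,
            "mean": mean(values),
            "max": max(values),
            # Fraction of all cost: the slice size in a pie chart.
            "share": total / grand_total if grand_total else 0.0,
        })
    return sorted(rows, key=lambda row: row["total"], reverse=True)


# Illustrative data only.
samples = [
    ({"region": "us-east-1"}, 40),
    ({"region": "us-east-1"}, 60),
    ({"region": "us-west-1"}, 20),
    ({"region": "eu-west-1"}, 80),
]
rows = tag_breakdown(samples, "region")
```

A tag value whose share or maximum stands out from the rest is a natural candidate to open in the single, comparison, or diff pages.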
Single view
The Single view page is built for in-depth profile analysis.
Here, you can explore a single flame graph with multiple viewing options and functionalities:
- Table view: Breaks down the profiling data into a sortable table format. Selecting Top Table displays the table and hides the flame graph.
- Sandwich view: Displays both the callers and callees for a selected function, offering a comprehensive view of function interactions. Access by clicking in the flame graph and selecting Sandwich view.
- Flame Graph view: Visualizes profiling data in a flame graph format, allowing easy identification of resource-intensive functions. Selecting Flame Graph displays the flame graph and hides the table.
- Both view: Displays both the table and the flame graph. This is the default view for Single view.
- Export Data: Exports the flame graph for offline analysis, or shares it via a flamegraph.com link for collaborative review.
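As a rough illustration of how a table relates to a flame graph, the sketch below derives per-function "self" and "total" values from a set of call stacks. The column names, the `top_table` function, and the sample stacks are assumptions made for the example. The UI's actual columns and calculations may differ.

```python
from collections import Counter


def top_table(stacks):
    """Build (function, self, total) rows from {stack: value} data.

    self  = cost of samples where the function is the innermost frame
    total = cost of samples where the function appears anywhere in the stack
    """
    self_cost = Counter()
    total_cost = Counter()
    for stack, value in stacks.items():
        self_cost[stack[-1]] += value
        # set() avoids counting a recursive function twice for one sample.
        for function in set(stack):
            total_cost[function] += value

    rows = [(fn, self_cost[fn], total_cost[fn]) for fn in total_cost]
    # Sort by self cost, largest first, with the name as a tie-breaker.
    return sorted(rows, key=lambda row: (-row[1], row[0]))


# Illustrative data only: stacks are listed outermost function first.
stacks = {
    ("main", "handleRequest", "checkDriverAvailability"): 70,
    ("main", "handleRequest", "log"): 10,
    ("main", "loadConfig", "log"): 5,
    ("main", "handleRequest"): 15,
}
rows = top_table(stacks)
```

Sorting by self cost puts the functions that do the work themselves at the top, while sorting by total cost would instead highlight entry points such as `main`.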
This screenshot shows a spike in CPU usage.
Without profiling, you would go from a spike in a CPU usage metric straight to digging through code or guessing at the cause.
However, with profiling, you can use the flame graph and table to see exactly which function is most responsible for the spike.
Often this shows up as a single node taking up a disproportionate width in the flame graph, as with the checkDriverAvailability function in this example.
In other cases, the culprit is a function that’s called many times and takes up a large amount of space across the flame graph. Here, you can use the sandwich view to see, for example, that a logging function called from many functions across the codebase is responsible.
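The idea behind a sandwich view can be sketched as collecting, for one target function, the cost attributed to each direct caller and each direct callee across all stacks. The `sandwich` function, the function names, and the data below are hypothetical and only illustrate the technique.

```python
from collections import Counter


def sandwich(stacks, target):
    """Return (callers, callees) cost counters for `target`.

    `stacks` maps a call stack (outermost function first) to its cost.
    """
    callers, callees = Counter(), Counter()
    for stack, value in stacks.items():
        for i, function in enumerate(stack):
            if function != target:
                continue
            if i > 0:
                callers[stack[i - 1]] += value   # frame directly above
            if i + 1 < len(stack):
                callees[stack[i + 1]] += value   # frame directly below
    return callers, callees


# Illustrative data only: a logging helper reached from several places.
stacks = {
    ("main", "handleRequest", "logInfo", "formatMessage"): 12,
    ("main", "loadConfig", "logInfo", "formatMessage"): 9,
    ("main", "checkDriverAvailability", "logInfo", "writeOutput"): 14,
    ("main", "handleRequest", "parseBody"): 20,
}
callers, callees = sandwich(stacks, "logInfo")
```

In this made-up data, no single `logInfo` node is wide on its own, but summing across its callers shows that it accounts for more cost than `parseBody`.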
Comparison page
The Comparison view lets you compare profiles side by side based on different label sets, different time periods, or both. This feature is valuable for understanding the impact of changes or the differences between two distinct queries of your application.
You can use the Comparison view to compare different time ranges whether or not the labels are the same. For example, when investigating the cause of a memory leak, the timeline might show a steadily increasing amount of memory allocations over time. You can use the Comparison view to compare memory allocations between a period where allocations were low and a period where they were high. This information helps you identify the function that’s causing the memory leak.
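The memory leak example can be reduced to a small calculation: take per-function allocation totals from a low period and a high period, and rank functions by how much they grew. This is a sketch of the reasoning only. The `largest_growth` function, the function names, and the numbers are hypothetical.

```python
def largest_growth(baseline, comparison):
    """Rank functions by absolute growth from baseline to comparison.

    Both arguments map a function name to its allocated bytes in one period.
    """
    functions = set(baseline) | set(comparison)
    growth = {
        fn: comparison.get(fn, 0) - baseline.get(fn, 0) for fn in functions
    }
    # Largest increase first, with the name as a tie-breaker.
    return sorted(growth.items(), key=lambda item: (-item[1], item[0]))


# Illustrative data only.
low_period = {"handleRequest": 100, "cacheStore": 50, "parseBody": 30}
high_period = {"handleRequest": 110, "cacheStore": 400, "parseBody": 28}

ranked = largest_growth(low_period, high_period)
```

Here `cacheStore` accounts for almost all of the growth, which makes it the first function to inspect.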
To run a comparison:
- Select two different sets of labels (for example, env:production vs. env:development) and/or time periods, reflected by the sub-timelines preceding each flame graph.
- View the resulting flame graphs side by side to identify disparities in performance.
There are many practical use cases for comparison.
Some examples of labels, expressed as label:value, are:
- Feature flags: Compare application performance with feature_flag:a vs. feature_flag:b
- Deployment environments: Contrast env:production vs. env:development
- Release analysis: Examine commit:release-1 vs. commit:release-2
- Region: Compare region:us-east-1 vs. region:us-west-1
Comparison diff view
The Comparison diff view extends the Comparison page by making the differences between two sets of profiling data easier to see. It normalizes the data by comparing the percentage of total time spent in each function, so the resulting flame graph compares each function’s share of time rather than its absolute time. This is important because it lets you compare two queries whose total amounts of recorded time differ.
Similar to a git diff, it takes the flame graphs from the Comparison page and highlights the differences between the two flame graphs, where red represents an increase in CPU usage from the baseline to the comparison and green represents a decrease.
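The normalization described above can be sketched in a few lines: convert each profile to per-function shares of its own total, subtract, and color the result. The `share` and `diff` functions and the sample numbers are hypothetical. The real view works on full flame graph nodes rather than flat per-function totals.

```python
def share(profile):
    """Convert {function: cost} into {function: fraction of total cost}."""
    total = sum(profile.values())
    return {fn: cost / total for fn, cost in profile.items()} if total else {}


def diff(baseline, comparison):
    """Return {function: (change in percentage points, color)}."""
    base, comp = share(baseline), share(comparison)
    result = {}
    for fn in sorted(set(base) | set(comp)):
        delta = comp.get(fn, 0.0) - base.get(fn, 0.0)
        if delta > 0:
            color = "red"        # larger share than in the baseline
        elif delta < 0:
            color = "green"      # smaller share than in the baseline
        else:
            color = "unchanged"
        result[fn] = (round(delta * 100, 1), color)
    return result


# Illustrative data only: the comparison covers three times as much CPU time.
baseline = {"handleRequest": 60, "checkDriverAvailability": 20, "log": 20}
comparison = {"handleRequest": 90, "checkDriverAvailability": 150, "log": 60}

result = diff(baseline, comparison)
```

Note that `handleRequest` grows in absolute terms (60 to 90) yet shows as green, because its share of the total falls from 60% to 30%. This is the effect of comparing shares rather than absolute time.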
Here is a diff between two label sets: