A beginner's guide to Kubernetes application monitoring

31 Jan, 2023 11 min

Application performance monitoring (APM) involves a mix of tools and practices to track specific performance metrics. Engineers use APM to monitor and maintain the health of their applications and ensure a better user experience.

This is crucial to high-quality architecture, development, and operations, but it can be difficult to achieve in Kubernetes since the container orchestration system doesn’t provide an easy way to monitor application data like it does for other cluster components.

In this article, you’ll learn what Kubernetes APM is and the key metrics that are important to track. You’ll also discover three options for monitoring your applications in Kubernetes:

  1. Building metrics logic into your application
  2. Using Kubernetes sidecar containers
  3. Deploying Grafana Agent

We’ll also discuss some challenges you might encounter when monitoring applications in Kubernetes, as well as some potential solutions.

What is Kubernetes APM?

Kubernetes APM focuses on the performance of an application or service running on your Kubernetes cluster. Along with logging and tracing, developers use APM to get a comprehensive overview of the behavior of applications and services running in their Kubernetes cluster, which provides insights into how to improve them. For example, the APM system should detect if an application has a memory leak and is consuming excessive RAM.

Kubernetes APM vs. Kubernetes monitoring

When talking about monitoring, it’s common to confuse Kubernetes APM with Kubernetes monitoring. Both are important, and both are aimed at monitoring performance. However, Kubernetes APM focuses on monitoring the performance of the applications running in the Kubernetes cluster, while Kubernetes monitoring focuses on the performance of the cluster itself.

To put it another way, if your Kubernetes cluster runs several applications, you should monitor the performance of each one separately, using metrics to understand which aspects of each one can be optimized. On the other hand, the performance of the Kubernetes cluster affects all the applications and services that run on it, so it must also be monitored.

Monitoring the Kubernetes cluster involves tracking the control plane and the worker nodes. The control plane is where the Kubernetes API server runs along with the cluster store (etcd), the controller manager and its controllers (which control the state of your cluster), and the scheduler (which decides how and where pods run). The worker nodes are where you’ll find the pods, along with the kubelet (agent), kube-proxy (networking), DNS, and the container runtime. You also need to monitor the underlying infrastructure, which can mean two to three control plane nodes and three to four worker nodes in production.

Monitoring the application means understanding how it’s performing in a pod, which holds the containers. You need to monitor the binary running in the pod, because the pod might be running even if the application isn’t. As we stated at the outset, the problem is that Kubernetes doesn’t provide an easy way to monitor application data like it does for the rest of the cluster components.

You should also monitor the resource usage of applications inside the Kubernetes cluster. Make sure to limit each application’s resources (like CPU and memory) so that one application doesn’t use up all the resources on the node and cause issues for the other applications running on the same node. One way to do this is with resource limits and limit range policies.
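As a rough illustration of the point about limits, the sketch below checks a pod manifest (already parsed into a Python dictionary) for containers that do not declare CPU or memory limits. The `spec.containers[].resources.limits` path is the standard pod field; the function name, the `REQUIRED_LIMITS` tuple, and the example pod are hypothetical and exist only for this example.

```python
from typing import Dict, List

# Limit keys we expect every container to declare (an assumption for this sketch).
REQUIRED_LIMITS = ("cpu", "memory")


def containers_missing_limits(pod: Dict) -> Dict[str, List[str]]:
    """Map each container name to the limit keys it does not declare."""
    missing: Dict[str, List[str]] = {}
    for container in pod.get("spec", {}).get("containers", []):
        limits = container.get("resources", {}).get("limits", {})
        absent = [key for key in REQUIRED_LIMITS if key not in limits]
        if absent:
            missing[container["name"]] = absent
    return missing


# Hypothetical pod manifest, as it would look after loading the YAML.
example_pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "limits-example"},
    "spec": {
        "containers": [
            {"name": "api", "image": "alpine",
             "resources": {"limits": {"cpu": "500m", "memory": "256Mi"}}},
            {"name": "worker", "image": "alpine",
             "resources": {"limits": {"cpu": "250m"}}},
            {"name": "cron", "image": "alpine"},
        ]
    },
}

if __name__ == "__main__":
    for name, keys in containers_missing_limits(example_pod).items():
        print(f"{name}: missing limits for {', '.join(keys)}")
```

A check like this could run in CI before a manifest is applied. A LimitRange policy in the namespace covers the same gap on the cluster side by supplying defaults.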

Key metrics to know for Kubernetes APM

Before getting to the how, you first need to consider the what, or which metrics are worth scraping.

Some of the key metrics your team can use to detect issues that affect application performance and user experience include:

  • Request rate. This metric allows you to visualize the number of requests users make to the application or service per unit of time. This makes it easy to identify spikes in user traffic, which in turn helps engineers plan for resource scaling at times of high and low demand.
  • Response time. This metric measures the average response time of the application. When this value exceeds a certain threshold, lags could impact the user experience, hence the importance of monitoring it.
  • Error rate. This tracks how many errors occur in a certain amount of time, which makes it a useful metric for ensuring compliance with service level agreements (SLAs).
  • Memory usage. This metric provides insight into the memory usage of the application or service. It’s useful for setting alerts and tracking application optimizations.
  • CPU usage. Like memory usage, this metric allows for evaluation of the resources consumed by the application or service. You can use this information in many ways, including planning the resources needed in the cluster during peak hours and detecting higher than usual CPU usage.
  • Persistent storage usage. This metric also has to do with the resources that the application needs, but from the viewpoint of permanent storage. In Kubernetes, managing persistent storage is just as important as managing CPU and memory usage, so it’s crucial to include this metric in your analyses.
  • Uptime. Along with error rate, this metric is useful for keeping an eye on SLAs, since it allows you to calculate the percentage of time that the application remains online.
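To make a few of these definitions concrete, here is a minimal sketch that derives request rate, average response time, error rate, and uptime from a handful of request records. The `RequestRecord` type, the sample data, and the choice to count only HTTP 5xx responses as errors are assumptions made for illustration; in practice a metrics backend would compute these values from scraped data.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RequestRecord:
    duration_ms: float  # how long the request took to serve
    status: int         # HTTP status code returned to the user


def request_rate(records: List[RequestRecord], window_seconds: float) -> float:
    """Requests per second over the observation window."""
    return len(records) / window_seconds


def average_response_time_ms(records: List[RequestRecord]) -> float:
    """Mean response time in milliseconds (0.0 when there is no traffic)."""
    if not records:
        return 0.0
    return sum(r.duration_ms for r in records) / len(records)


def error_rate(records: List[RequestRecord], window_seconds: float) -> float:
    """Errors per second, counting only 5xx responses (an assumption)."""
    errors = sum(1 for r in records if r.status >= 500)
    return errors / window_seconds


def uptime_percent(downtime_seconds: float, period_seconds: float) -> float:
    """Percentage of the period during which the application was online."""
    return 100.0 * (1.0 - downtime_seconds / period_seconds)


# Hypothetical traffic observed during a two-second window.
window = [
    RequestRecord(duration_ms=100, status=200),
    RequestRecord(duration_ms=200, status=200),
    RequestRecord(duration_ms=300, status=500),
    RequestRecord(duration_ms=400, status=503),
]

if __name__ == "__main__":
    print("requests/s:", request_rate(window, 2.0))
    print("avg response (ms):", average_response_time_ms(window))
    print("errors/s:", error_rate(window, 2.0))
    print("uptime %:", uptime_percent(2592, 30 * 24 * 3600))
```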

For more information on why you should use metrics, as well as how to use and visualize them, check out this blog about using the popular RED Method.

How to monitor application performance in Kubernetes

Increasingly, developers are taking a shift-left approach to monitoring and observability, which means they use these practices earlier in the software development lifecycle (SDLC) so they can detect and resolve problems before the software is released. To implement this approach for logging, monitoring, and metrics, you need to determine how you’re going to collect those metrics.

As mentioned earlier, Kubernetes doesn’t offer a way to get metrics for applications running on the cluster, at least not out of the box. You can use different methods to scrape these metrics from their origin containers. For the purposes of this guide, we’ll explore the three aforementioned methods:

  1. Building metrics logic into applications
  2. Using Kubernetes sidecar containers
  3. Deploying Grafana Agent

Determining which of these methods is best for your needs will depend on your particular use case, but here is a high-level overview of all three, including the pros and cons of each one, to help you make an informed decision.

Method 1: Building metrics logic into applications

The first method is familiar to senior developers, since it was used long before the appearance of Kubernetes and cloud computing. Basically, you add instrumentation directly into the application code. This can be achieved, for example, by using a Prometheus client library that corresponds to the language you’re using to develop your application.

Prometheus provides Go, Java, Scala, Python, Ruby, and Rust libraries to monitor your applications and services. Additionally, unofficial third-party client libraries are available for Bash, C, C++, Node.js, Perl, PHP, and many other popular languages. All of these libraries expose metrics via an HTTP endpoint that can be scraped by an aggregator such as Prometheus. In turn, Prometheus can send the metrics to an observability platform like Grafana Cloud for visualization and analysis.

In other words, this method consists of several steps: implementing the instrumentation in the code of the application or service to be monitored, scraping the metrics, and exporting the metrics to an external platform for analysis and visualization.
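The instrumentation step might look roughly like the sketch below. To stay self-contained it uses only the Python standard library and hand-writes the Prometheus text exposition format, so it is a stand-in for a real client library and not how the official libraries are implemented. The `Counter` class, the `app_requests_total` metric name, the `MetricsHandler` class, and port 8000 are all hypothetical.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Dict, Tuple

LabelKey = Tuple[Tuple[str, str], ...]


class Counter:
    """A tiny, non-thread-safe counter that renders Prometheus text format."""

    def __init__(self, name: str, help_text: str) -> None:
        self.name = name
        self.help_text = help_text
        self._values: Dict[LabelKey, float] = {}

    def inc(self, amount: float = 1.0, **labels: str) -> None:
        # Sort the labels so the same label set always maps to the same series.
        key = tuple(sorted(labels.items()))
        self._values[key] = self._values.get(key, 0.0) + amount

    def render(self) -> str:
        lines = [
            f"# HELP {self.name} {self.help_text}",
            f"# TYPE {self.name} counter",
        ]
        for key, value in sorted(self._values.items()):
            label_text = ",".join(f'{k}="{v}"' for k, v in key)
            suffix = f"{{{label_text}}}" if label_text else ""
            lines.append(f"{self.name}{suffix} {value:g}")
        return "\n".join(lines) + "\n"


# Hypothetical application metric; the app would call REQUESTS.inc(...) per request.
REQUESTS = Counter("app_requests_total", "Total requests handled by the app.")


class MetricsHandler(BaseHTTPRequestHandler):
    """Serves the current metric values on /metrics for a scraper to pull."""

    def do_GET(self) -> None:
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = REQUESTS.render().encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Block forever, exposing /metrics on the given port."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling `serve()` from the application's startup code would expose the endpoint. A production service would more likely use one of the client libraries mentioned above, which also handle thread safety, histograms, and label escaping.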


Pros

  • You can implement custom metrics, which gives you maximum flexibility with regard to the data collected.

Cons

  • You have to modify the application code, which requires deep knowledge of the codebase to implement in existing applications.
  • It increases the possibility of creating dependency conflicts with external aggregators or the application code itself.
  • It increases the risk of vendor lock-in if you use third-party libraries that use closed source code.

While this approach has its disadvantages, it’s a powerful method to monitor applications and part of what we recommend at Grafana Labs.

Method 2: Using Kubernetes sidecar containers

Think of this second method as a variation of the first. Instead of implementing the instrumentation directly in the code of the application or service to be monitored, this method deploys a sidecar container that runs alongside the container that hosts the application. The sidecar container executes the instrumentation code or logging agent and exports the data to the corresponding observability platform.

This strategy is possible because, in Kubernetes, containers in the same pod share resources such as storage volumes and network interfaces. In other words, the sidecar container can easily access logs and other data that the main container writes to a shared volume.

Among the advantages of sidecars is how easy it is to deploy them in Kubernetes. The following sidecar-example.yml manifest is proof of this:

apiVersion: v1
kind: Pod
metadata:
  name: sidecar-example
spec:
  containers:
  # Application container
  - name: main-app
    image: alpine
    command: ["/bin/sh"]
    args: ["-c", "while true; do date >> /var/log/app.txt; sleep 30; done"]
    # Mount the pod's shared log volume into the app container.
    # The app writes logs here.
    volumeMounts:
    - name: shared-logs
      mountPath: /var/log
  # Sidecar container
  - name: sidecar-container
    image: busybox
    command: ["sh", "-c", "while true; do cat /var/log/app.txt; sleep 30; done"]
    # Mount the same volume so the sidecar can read the app's logs.
    volumeMounts:
    - name: shared-logs
      mountPath: /var/log
  # Shared volume that both the app and the sidecar containers mount.
  volumes:
  - name: shared-logs
    emptyDir: {}

The code shown above defines two containers: main-app writes the current date to the /var/log/app.txt location every 30 seconds, and sidecar-container prints the contents of /var/log/app.txt to the console every 30 seconds. Both containers share the shared-logs volume, which makes this functionality possible. Note that instead of printing information to the console, the sidecar container could pass the data to a logging agent or whatever else you need it to do.
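If the sidecar ran a small program instead of `cat`, it could forward only the lines written since its last pass, without reprinting the whole file each time. The following is a minimal sketch of that idea, assuming the same /var/log/app.txt path as the manifest. The `read_new_lines`, `forward`, and `run` names are hypothetical, and `forward` simply prints where a real sidecar would hand the lines to a logging agent.

```python
import time
from typing import List, Tuple


def read_new_lines(path: str, offset: int) -> Tuple[List[str], int]:
    """Return complete lines appended after byte `offset`, plus the new offset."""
    try:
        with open(path, "rb") as handle:
            handle.seek(offset)
            chunk = handle.read()
    except FileNotFoundError:
        # The app may not have written anything yet.
        return [], offset
    # Consume only up to the last newline so a half-written line is read later.
    end = chunk.rfind(b"\n") + 1
    lines = chunk[:end].decode("utf-8", errors="replace").splitlines()
    return lines, offset + end


def forward(lines: List[str]) -> None:
    """Stand-in for shipping lines to a logging agent."""
    for line in lines:
        print(line)


def run(path: str = "/var/log/app.txt", interval_seconds: float = 30.0) -> None:
    """Poll the shared log file forever, forwarding only new lines."""
    offset = 0
    while True:
        lines, offset = read_new_lines(path, offset)
        forward(lines)
        time.sleep(interval_seconds)
```

This sketch does not handle log rotation or truncation; a real log shipper would need to.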


Pros

  • It separates instrumentation code from the main application code.
  • It’s relatively easy to implement in Kubernetes.

Cons

  • It adds an extra layer of complexity to your cluster. Managing the lifecycle of sidecar containers requires the same planning as the application itself.
  • It could increase the use of cluster resources.
  • It’s prone to compatibility issues when the main application is updated.
  • Not every language is supported.

Although this method offers some advantages over the previous one, it introduces new challenges. In particular, using more containers can be an issue in large-scale deployments, given the higher resource consumption. Therefore, this solution is more suitable for small to medium deployments where it’s not feasible to use agents such as Grafana Agent. That method is explained next.

Method 3: Deploying Grafana Agent

Currently, the preferred method for pulling monitoring data from containers is to install a monitoring agent such as Grafana Agent. This is an extension to the first method. If you use Grafana Agent, it collects and forwards telemetry data to open source deployments of the Grafana OSS Stack, Grafana Cloud, or Grafana Enterprise, where your data can then be analyzed. In a nutshell, when this agent is installed on each node of your Kubernetes cluster, it can pull metrics from the application and its dependencies and send them to an external monitoring platform — in this case, Grafana.
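To give a feel for what "pulling metrics" involves, the sketch below parses a scraped response in the Prometheus text format into structured samples that could then be forwarded. It is not Grafana Agent's implementation. It covers only a simple subset of the format and does not unescape label values. The `Sample` type and `parse_exposition` function are hypothetical names.

```python
import re
from typing import Dict, List, NamedTuple


class Sample(NamedTuple):
    name: str
    labels: Dict[str, str]
    value: float


# metric_name, optional {labels}, then the value (anything after it is ignored).
_SAMPLE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
# One label pair such as method="GET"; the value may contain escaped characters.
_LABEL = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_exposition(text: str) -> List[Sample]:
    """Parse sample lines, skipping blank lines and # HELP / # TYPE comments."""
    samples: List[Sample] = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = _SAMPLE.match(line)
        if match is None:
            continue  # ignore anything this simplified parser cannot read
        name, raw_labels, raw_value = match.groups()
        labels = dict(_LABEL.findall(raw_labels or ""))
        samples.append(Sample(name, labels, float(raw_value)))
    return samples


# Hypothetical body returned by an application's metrics endpoint.
scraped = (
    "# HELP app_requests_total Total requests handled by the app.\n"
    "# TYPE app_requests_total counter\n"
    'app_requests_total{method="GET",status="200"} 2\n'
    "up 1\n"
)

if __name__ == "__main__":
    for sample in parse_exposition(scraped):
        print(sample)
```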

Setting up the agent is a relatively simple process since it’s done through a ConfigMap that can be configured according to your needs. You can take a look at its manifest in the quickstart guide.

An even more convenient method is to install the Grafana Agent Operator. This Kubernetes operator simplifies the process since it automatically deploys and configures Grafana Agent based on custom resources. It eliminates manual configuration work and provides a comprehensive out-of-the-box solution.


Pros

  • The Grafana Agent not only collects metrics; it can also collect logs and traces, which makes it easy to implement a comprehensive observability solution in your Kubernetes cluster.
  • It works natively with the Grafana stack and you can send metrics to any Prometheus-compatible endpoint, resulting in no vendor lock-in.
  • It’s easy to implement via a ConfigMap or Grafana Agent Operator.

Cons

  • It may not be the best solution if your application does not support Prometheus libraries.

As you can see, this method offers multiple advantages over the previous ones. And when it’s used in tandem with the first method, you are in the best position to scrape and expose those metrics to support your application monitoring. However, as mentioned, each use case is unique, so you’d need to determine which is the best for your application or service.

Key metrics challenges

Typically, there are three questions you’ll face when planning your metrics strategy:

  1. Are metrics publicly available?
  2. How can you collect the metrics?
  3. How can you alert on the metrics?

If your application makes a metrics endpoint public, you can easily ingest and consume those metrics. Collecting metrics depends on how and where you’ll be storing them (in a time series database, or TSDB, for example). As for alerting, you’ll need to determine which use cases are most relevant and to whom. For example, if a 503 error is occurring, should that be addressed by the platform engineering team or the development team?
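One way to think about the alerting question is as a threshold plus an ownership table. The sketch below is purely illustrative: the 5% threshold, the team names, and the mapping of status codes to teams are hypothetical, and who should own a 503 is exactly the decision your organization has to make. Here the error rate is the fraction of responses that returned a 5xx status.

```python
from typing import Dict, NamedTuple, Optional


class Alert(NamedTuple):
    team: str
    message: str


# Hypothetical ownership table: which team is paged for which status code.
STATUS_OWNERS = {500: "development", 502: "platform", 503: "platform", 504: "platform"}
DEFAULT_OWNER = "development"


def evaluate(status_counts: Dict[int, int], threshold: float = 0.05) -> Optional[Alert]:
    """Return an alert when the share of 5xx responses exceeds the threshold."""
    total = sum(status_counts.values())
    errors = {code: count for code, count in status_counts.items() if code >= 500}
    if total == 0 or not errors:
        return None
    rate = sum(errors.values()) / total
    if rate <= threshold:
        return None
    # Route the alert according to the most frequent error code.
    dominant = max(errors, key=errors.get)
    team = STATUS_OWNERS.get(dominant, DEFAULT_OWNER)
    return Alert(team, f"5xx error rate {rate:.1%} (mostly HTTP {dominant})")


if __name__ == "__main__":
    print(evaluate({200: 90, 503: 8, 500: 2}))
```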


Without proper monitoring and observability, you’ll never know how well an application is performing or even whether an application is down. Using APM practices with Kubernetes can make all the difference in terms of resolving potential issues and maintaining the health of your clusters as well as your applications.

As you saw in this article, Grafana offers multiple ways for you to implement APM with your Kubernetes clusters. The added visibility and analytics that it provides can help you improve the quality of your applications as well as your Kubernetes workflow.

If you’re interested in a managed monitoring solution, check out Kubernetes Monitoring in Grafana Cloud, which is available to all Grafana Cloud users, including those in our generous free tier. If you don’t already have a Grafana Cloud account, you can sign up for a free account today!