---
title: "Infrastructure as code | Grafana Cloud documentation"
description: "Automate your Grafana IRM setup and configuration using Terraform and the OnCall API."
---

# Infrastructure as code

Infrastructure as code (IaC) enables you to automate your Grafana IRM setup and configuration using version-controlled, repeatable processes. This approach helps maintain consistency and simplifies the management of your incident response infrastructure.

Using infrastructure as code with IRM provides several key benefits:

- **Consistency**: Reduce manual errors and enforce standardized configurations.
- **Version control**: Track changes, manage rollbacks, and collaborate effectively.
- **Scalability**: Deploy and manage configurations across multiple teams and environments.
- **Automation**: Simplify updates and minimize manual intervention.
- **Compliance**: Maintain audit trails and enforce security policies through code.

## Before you begin

Ensure you have the following:

- Access to your organization’s Grafana Cloud account.
- Appropriate permissions to create and manage IRM resources.
- A Grafana Cloud service account token or OnCall API key with the necessary permissions.
- [Terraform](https://developer.hashicorp.com/terraform/downloads) installed on your system (if using Terraform).

## Supported tools

IRM supports the following infrastructure as code approaches:

- **Terraform**: Use the [Grafana Terraform provider](https://registry.terraform.io/providers/grafana/grafana/latest/docs) to manage on-call schedules, escalation chains, integrations, routes, and more.
- **OnCall API**: Build custom workflows and programmatically control your incident response setup. For API reference, refer to [OnCall API](/docs/grafana-cloud/alerting-and-irm/irm/reference/oncall-api).

## Understand IRM resource management limitations

Not all IRM resources support full IaC management. Understanding these boundaries helps you plan your automation strategy.

**Fully manageable via API and Terraform:**

- Integrations (including direct paging)
- Escalation chains and escalation policies
- Routes
- On-call schedules and shifts
- Shift swaps
- Outgoing webhooks
- Personal notification rules
- Resolution notes

**Read-only via API (provisioned through Grafana):**

- Users: Synced from your Grafana instance. Listing users with a Terraform user-agent triggers a sync.
- Teams: Synced from Grafana teams. You can reference teams by ID, but you can’t create or modify them through the OnCall API.
- Organizations: Read-only.

**Not available via the OnCall API:**

- Incident management resources: Managed through the separate [Incident API](/docs/grafana-cloud/alerting-and-irm/irm/reference/incident-api).
- ChatOps configurations (Slack, Microsoft Teams, Telegram): Managed through the IRM UI.
- Admin organization settings: Managed through the IRM UI.

> Note
> 
> IRM uses its own internal user IDs, which are different from Grafana user IDs. When you manage IRM resources with Terraform, you must map Grafana users to their corresponding IRM user IDs. For an example, refer to the [Map user IDs](#map-user-ids) section.

## Set up the Terraform provider

To authenticate Terraform requests, you can use either a **Grafana Cloud service account token** (recommended) or a **legacy OnCall API key**.

### Create an API token

1. In your Grafana Cloud instance, go to **Alerts & IRM** > **IRM**.
2. Go to **Settings** and select **Admin & API**.
3. In the **API Tokens** section, click **Create New Token**.
4. Provide a name and select appropriate permissions.
5. Save the token securely. You can’t view it again after creation.

For more information about authentication methods, refer to the [OnCall API authentication documentation](/docs/grafana-cloud/alerting-and-irm/irm/reference/oncall-api/#authentication).

### Configure the provider

Add the following configuration to your Terraform files:

```hcl
terraform {
  required_providers {
    grafana = {
      source  = "grafana/grafana"
      version = ">= 3.0.0"
    }
  }
}

provider "grafana" {
  alias               = "oncall"
  oncall_access_token = var.oncall_access_token # Store tokens in variables
}
```
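The provider block above reads `var.oncall_access_token`, so the variable has to be declared somewhere in the same module. A minimal sketch of that declaration, assuming the variable name used in the example above (the description text is illustrative):

```hcl
# Declares the input variable referenced by the provider block.
# `sensitive = true` redacts the value in plan and apply output.
variable "oncall_access_token" {
  type        = string
  description = "Token used to authenticate against the OnCall API"
  sensitive   = true
}
```

Terraform reads the value from the `TF_VAR_oncall_access_token` environment variable when it is set, which avoids committing the token to a `.tfvars` file. Marking a variable as sensitive doesn't keep it out of the state file, so the state still needs to be stored securely.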

## Example configurations

### Map user IDs

IRM uses internal user IDs that differ from Grafana user IDs. Use the `grafana_oncall_user` data source to look up IRM user IDs by username (typically the user’s email address):

```hcl
// Look up users in IRM by username
data "grafana_oncall_user" "all_users" {
  provider = grafana.oncall
  // Extract a flat set of all users from all teams
  for_each = toset(flatten([
    for team_name, username_list in local.teams : [
      username_list
    ]
  ]))
  username = each.key
}

// On-call groups / teams
locals {
  teams = {
    emea = [
      "alfa@grafana.com",
      "bravo@grafana.com",
      "charlie@grafana.com",
      "delta@grafana.com",
      "echo@grafana.com",
      "foxtrot@grafana.com",
      "golf@grafana.com",
    ]
  }
  // The OnCall API operates with resource IDs, so convert emails into IDs
  teams_map_of_user_id = { for team_name, username_list in local.teams : team_name => [
  for username in username_list : data.grafana_oncall_user.all_users[username].id] }
  // Reverse lookup: find a user by their OnCall ID
  users_map_by_id = { for username, oncall_user in data.grafana_oncall_user.all_users :
  oncall_user.id => oncall_user }
}
```
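To check that each email address resolved to an IRM user before wiring the IDs into schedules or escalation steps, one option is to expose the computed list as an output. This sketch assumes the `local.teams_map_of_user_id` value defined above; the output name is hypothetical:

```hcl
# Exposes the resolved IRM user IDs for the emea team so they can be inspected
output "emea_oncall_user_ids" {
  description = "IRM user IDs resolved for the emea team"
  value       = local.teams_map_of_user_id["emea"]
}
```

After an apply, running `terraform output emea_oncall_user_ids` prints the list in the same order as the emails in `local.teams`.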

### Define an on-call schedule

Create a web-based schedule with on-call shifts. Schedules and shifts are separate resources: define the shifts first, then reference them in the schedule.

```hcl
resource "grafana_oncall_on_call_shift" "week_shift" {
  provider  = grafana.oncall
  name      = "Weekly Rotation"
  type      = "rolling_users"
  start     = "2024-01-01T08:00:00"
  duration  = 60 * 60 * 24 * 7 # One week in seconds (604800)
  frequency = "weekly"
  rolling_users = [
    [data.grafana_oncall_user.all_users["alfa@grafana.com"].id],
    [data.grafana_oncall_user.all_users["bravo@grafana.com"].id],
  ]
  time_zone = "UTC"
}

resource "grafana_oncall_schedule" "primary" {
  provider  = grafana.oncall
  name      = "Primary On-Call Rotation"
  type      = "web"
  time_zone = "UTC"
  shifts    = [grafana_oncall_on_call_shift.week_shift.id]
}
```
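The shift above lists users one by one. Assuming the `local.teams_map_of_user_id` map from the [Map user IDs](#map-user-ids) section is defined in the same module, a rotation over a whole team could instead be generated with a `for` expression. The resource name and shift name here are hypothetical:

```hcl
resource "grafana_oncall_on_call_shift" "emea_week_shift" {
  provider  = grafana.oncall
  name      = "EMEA Weekly Rotation"
  type      = "rolling_users"
  start     = "2024-01-01T08:00:00"
  duration  = 60 * 60 * 24 * 7 # One week in seconds
  frequency = "weekly"
  # Wrap each user ID in its own list so that each rotation slot holds one person
  rolling_users = [
    for user_id in local.teams_map_of_user_id["emea"] : [user_id]
  ]
  time_zone = "UTC"
}
```

The rotation order follows the order of the email list in `local.teams`, so reordering that list changes who is on call first.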

For more schedule examples, refer to [Schedules as code](/docs/grafana-cloud/alerting-and-irm/irm/on-call-schedules/schedules-as-code).

### Create an escalation chain

Escalation chains and their steps are separate resources. Define the chain first, then add escalation steps:

```hcl
resource "grafana_oncall_escalation_chain" "default" {
  provider = grafana.oncall
  name     = "Primary Escalation Chain"
}

resource "grafana_oncall_escalation" "step_notify" {
  provider            = grafana.oncall
  escalation_chain_id = grafana_oncall_escalation_chain.default.id
  type                = "notify_persons"
  persons_to_notify   = [data.grafana_oncall_user.all_users["alfa@grafana.com"].id]
  position            = 0
}

resource "grafana_oncall_escalation" "step_wait" {
  provider            = grafana.oncall
  escalation_chain_id = grafana_oncall_escalation_chain.default.id
  type                = "wait"
  duration            = 300 # Wait 5 minutes before next step
  position            = 1
}

resource "grafana_oncall_escalation" "step_notify_schedule" {
  provider                     = grafana.oncall
  escalation_chain_id          = grafana_oncall_escalation_chain.default.id
  type                         = "notify_on_call_from_schedule"
  notify_on_call_from_schedule = grafana_oncall_schedule.primary.id
  position                     = 2
}
```

### Configure an integration with labels

Create an integration and assign static labels. First, define labels using the `grafana_oncall_label` data source:

```hcl
data "grafana_oncall_label" "env_label" {
  provider = grafana.oncall
  key      = "environment"
  value    = "production"
}
```

Then, pass the label into the `grafana_oncall_integration` resource:

```hcl
resource "grafana_oncall_integration" "monitoring" {
  provider = grafana.oncall
  name     = "Production Monitoring"
  type     = "webhook"
  labels   = [data.grafana_oncall_label.env_label]

  default_route {}
}
```

IRM also supports **dynamic labels** on integrations. Dynamic labels are extracted from alert payloads at ingestion time rather than being statically assigned. Configure dynamic labels through the IRM UI or the OnCall API.

For more information, refer to [Configure labels](/docs/grafana-cloud/alerting-and-irm/irm/escalation-and-routing/labels/configure-labels).
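Routes appear in the list of manageable resources but aren't shown in the examples so far. As a rough sketch, a route could send matching alerts from the integration above to the escalation chain defined earlier. It assumes the `grafana_oncall_integration.monitoring` and `grafana_oncall_escalation_chain.default` resources from this page; the resource name and the regular expression are illustrative, so check the provider documentation for the arguments your provider version supports:

```hcl
resource "grafana_oncall_route" "critical" {
  provider            = grafana.oncall
  integration_id      = grafana_oncall_integration.monitoring.id
  escalation_chain_id = grafana_oncall_escalation_chain.default.id
  # Illustrative pattern: match payloads that contain "severity": "critical"
  routing_regex = "\"severity\": \"critical\""
  position      = 0
}
```

Alerts that match no route fall through to the integration's `default_route`.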

### Configure a direct paging integration

```hcl
resource "grafana_oncall_integration" "direct_paging" {
  provider = grafana.oncall
  name     = "Engineering Direct Paging"
  type     = "direct_paging"

  default_route {}
}
```

> Note
> 
> To manage direct paging integrations through Terraform, enable the **Manually manage direct paging integrations** setting in [Admin settings](/docs/grafana-cloud/alerting-and-irm/irm/set-up/admin-settings). Otherwise, IRM automatically creates and manages direct paging integrations for each team.
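With manual management enabled, a direct paging integration can be tied to a specific team. The following sketch assumes a Grafana team named `Engineering` already exists and has synced to IRM (the team name and resource names are hypothetical), and it reuses the escalation chain defined earlier on this page:

```hcl
# Look up the IRM team by name; teams are read-only and synced from Grafana
data "grafana_oncall_team" "engineering" {
  provider = grafana.oncall
  name     = "Engineering" # Hypothetical team name
}

resource "grafana_oncall_integration" "engineering_team_direct_paging" {
  provider = grafana.oncall
  name     = "Engineering Team Direct Paging"
  type     = "direct_paging"
  team_id  = data.grafana_oncall_team.engineering.id

  # Send pages for this team to the escalation chain defined earlier
  default_route {
    escalation_chain_id = grafana_oncall_escalation_chain.default.id
  }
}
```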

## Apply your configuration

Initialize, preview, and apply your Terraform configuration. Start by initializing the working directory:

```sh
terraform init
```

Preview the changes:

```sh
terraform plan
```

Apply the configuration:

```sh
terraform apply
```

## Service account token limitations

When using Grafana Cloud service account tokens with the OnCall API, be aware of the following limitations:

- Service accounts can’t retrieve their own user profile (`GET /api/v1/users/current/`).
- Service accounts can’t perform state-changing actions on alert groups, such as acknowledge, resolve, or silence. Use a user API token or the IRM UI for these operations.
- The `/info`, `/make_call`, and `/send_sms` endpoints require a legacy OnCall API key and don’t accept service account tokens.

If you encounter `500` errors when using service account tokens, verify the endpoint supports service account authentication and that the token has the necessary permissions.

## Continuous integration

For teams using CI/CD pipelines, consider automating the validation and deployment of your IRM configuration:

1. Create a separate workspace for each environment (development, staging, production).
2. Use pull requests to review configuration changes.
3. Implement automated testing of your Terraform configurations.
4. Configure CI pipelines to automatically apply changes after approval.

The following GitHub Actions workflow example validates the Terraform configuration on pushes to `main` and on pull requests targeting `main`, whenever files under `terraform/oncall/` change:

```yaml
name: IRM Infrastructure

on:
  push:
    branches: [main]
    paths:
      - 'terraform/oncall/**'
  pull_request:
    branches: [main]
    paths:
      - 'terraform/oncall/**'

jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
      - name: Validate Terraform
        run: |
          cd terraform/oncall
          terraform init -backend=false
          terraform validate
```

## Best practices

- Keep tokens and other sensitive values out of version control: pass them in through Terraform variables marked as sensitive, or fetch them from a secrets manager.
- Store state files securely using a remote backend.
- Use workspaces to manage separate environments (development, staging, production).
- Implement code review for configuration changes.
- Test configurations in a non-production environment before applying to production.
- Use consistent naming conventions across resources.
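
As one way to apply the state and workspace recommendations above, the sketch below stores state in an S3 bucket and adds the workspace name to a resource name. The S3 backend is only one of Terraform's remote backends, and the bucket, key, and region values are placeholders:

```hcl
terraform {
  # Remote state: placeholder values, replace with your own bucket and region
  backend "s3" {
    bucket  = "example-terraform-state"
    key     = "irm/terraform.tfstate"
    region  = "us-east-1"
    encrypt = true
  }
}

locals {
  # terraform.workspace resolves to the selected workspace, for example "staging"
  schedule_name = "Primary On-Call Rotation (${terraform.workspace})"
}
```

The `backend` block can also be merged into the existing `terraform` block that declares the provider. Referencing `local.schedule_name` in a schedule's `name` argument keeps resources from different workspaces distinguishable when they share one IRM instance.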

## Next steps

- [Grafana Terraform provider documentation](https://registry.terraform.io/providers/grafana/grafana/latest/docs)
- [Grafana OnCall Terraform resources](https://registry.terraform.io/providers/grafana/grafana/latest/docs/resources/oncall_schedule)
- [Example configurations on GitHub](https://github.com/grafana/oncall/tree/dev/terraform/examples)
- [Grafana Cloud Terraform documentation](/docs/grafana-cloud/developer-resources/infrastructure-as-code/terraform/terraform-oncall/)
- [Manage schedules as code](/docs/grafana-cloud/alerting-and-irm/irm/on-call-schedules/schedules-as-code)
- [Get started with Grafana OnCall and Terraform](/blog/get-started-with-grafana-oncall-and-terraform/)
