Canary tokens: Learn all about the unsung heroes of security at Grafana Labs

• 2025-08-25 • 11 min

We’ve written at length about a recent security incident tied to a GitHub Action, but what we haven’t publicized yet is how we found the trail left by the attacker: canary tokens.

To quickly recap the security incident in question, on April 26 a vulnerable GitHub Action workflow allowed an unauthorized user access to a limited number of tokens.

Our investigation into the incident concluded May 12, and we confirmed that there were no code modifications, unauthorized access to production systems, exposure of customer data, or access to personal information.

After the initial incident was triggered (but before our investigation concluded), the attacker deleted their fork, PRs, and workflow runs in an attempt to cover their tracks. The attack resulted in unauthorized access to secrets across five public repositories, though this did not impact production systems or customer data either. We disabled affected workflows, rotated all exposed credentials, and audited our repositories with Trufflehog, Zizmor, and Gato-X. We also conducted a full access log review in Grafana Loki and other data sources. You can read the initial announcement here and the PIR here.

Without canary tokens and our watchful team, it would have taken longer to detect the attack. In this blog, we’ll explain what canary tokens are and show you how they’ve helped us. We’ll also tell you how they can help you detect intrusions before they become serious security incidents.

What are canary tokens? And how do we use them at Grafana Labs?

Canary tokens are digital tripwires or decoys that look valuable to an intruder but have no legitimate use. If someone finds and uses one you’ve deployed, you’ll receive an immediate alert.

Named after coal-mining canaries (early warning for toxic gas), these tokens are far lighter than honeypots. They can take many simple forms, such as API keys, files, URLs, or DNS entries, making them fast and easy to deploy across your environment.

In our case, canary tokens weren’t just an experiment; they were the primary signal that told us an attacker was inside. The attacker validated an AWS API key, we got a real-time alert, teams swarmed, and the intrusion was contained within minutes.

Here are the actionable lessons from that detection and why placement and deployment matter more than the token itself:

Speed wins: Canary tokens turn hours of triage into minutes of containment.
Placement matters: Choose the right token types and place them where attackers actually look.
Precision in pinpointing: Utilize organization-level tokens for broad compromise detection and repository-level tokens to identify specific entry points.
Metadata is gold: Comprehensive reminder text, token names, and locations are crucial for efficient triage.
Integration is key: Canary tokens only shine when paired with clear naming, automated alerting, and integrated response workflows.
Iterative scaling: Begin with a small-scale placement in high-risk areas, refine your response strategy, and only then scale your use of canary tokens.

How we built our canary token infrastructure

There are two options for deploying canary tokens:

DIY setup: Generate tokens (API keys, fake files, custom URLs) yourself and build or plug into your existing monitoring system to catch any token use. This gives you full control over token behavior and alert handling, making it ideal for custom platforms.
Automated platforms: Solutions like Thinkst or TraceBit offer out-of-the-box token creation, monitoring, and notifications across major cloud providers and apps. For any bespoke tokens or corner-case scenarios, you’ll still need to handle monitoring yourself to ensure complete coverage.
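
To make the DIY option above more concrete, here is a minimal sketch of the monitoring half: it scans records shaped like AWS CloudTrail events and flags any call made with a decoy key. The key IDs, the location strings, and the `CanaryAlert` and `find_canary_hits` names are hypothetical, and this is an illustration under those assumptions rather than a description of how any particular platform works.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator

# Hypothetical inventory: access key ID of each decoy -> where it was planted.
CANARY_KEYS = {
    "AKIAEXAMPLECANARY001": "github/org-secrets",
    "AKIAEXAMPLECANARY002": "github/repo-secrets/example-repo",
}


@dataclass(frozen=True)
class CanaryAlert:
    token_location: str
    event_name: str
    source_ip: str
    event_time: str


def find_canary_hits(records: Iterable[dict]) -> Iterator[CanaryAlert]:
    """Yield an alert for every record that was made with a known canary key."""
    for record in records:
        key_id = record.get("userIdentity", {}).get("accessKeyId")
        location = CANARY_KEYS.get(key_id)
        if location is None:
            continue  # not one of our decoys, so not a tripwire
        yield CanaryAlert(
            token_location=location,
            event_name=record.get("eventName", "unknown"),
            source_ip=record.get("sourceIPAddress", "unknown"),
            event_time=record.get("eventTime", "unknown"),
        )
```

Because a canary key has no legitimate use, any match is worth an alert; no thresholds or baselines are needed.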

While a DIY approach could be sufficient for a relatively small environment, we needed something that could scale without manual intervention. So, after experimenting with Thinkst’s OSS and free editions, we switched to their cloud version for features missing in OSS, such as undetectable tokens, a robust API, audit-trail logging, and a full authentication/authorization system.

To make the most out of our setup, we connected the notifications of the Thinkst platform to Grafana Cloud IRM webhooks, as shown below:

Screenshot of a web page showing a Global Webhooks Feed section with options for managing flocks and installed webhooks.

The Grafana Cloud IRM integration then listens for the webhook:

Screenshot of a Canary Tokens configuration page showing HTTP Endpoint, labels schema, templates, and routes options for security alerts.

This setup allows us to route canary tokens’ notifications to Slack (as you can see in the Routes section above) and automate escalation chains (as you can see in the screenshot below). If you’d like to do something similar, further details on Grafana Cloud IRM integrations are available here.

A schedule interface showing five steps with start, wait, and import actions for 'Security Oncall' and 'Security Managers' notifications.

Canary token lifecycle

We use a simple, four-step lifecycle to operationalize canary tokens and turn them into an actionable detection and response capability:

Create: Generate a unique token, like an AWS API key, PDF link, or DNS subdomain that “phones home” when accessed.
Place: Embed tokens where attackers would look, like code repos, config directories, vaults, cloud consoles, databases, or shared drives.
Monitor: A background service watches for any token use and sends an alert with metadata (timestamp, IP, hostname). Keep in mind: If you’re not using a platform, you must build custom monitoring.
Respond: Investigate alerts, isolate affected assets, and remediate before any real data is touched. Keep the triggered token for forensics and replace it immediately.
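
The four steps above can be sketched as a small state model. The `Stage`, `CanaryToken`, and `respond` names are hypothetical and the stages are one possible way to track a token, not a feature of any specific product. The part worth noting is the Respond step, where the fired token is archived rather than deleted and a fresh one takes over its location.

```python
from dataclasses import dataclass, field
from enum import Enum


class Stage(Enum):
    CREATED = "created"
    PLACED = "placed"
    MONITORED = "monitored"
    TRIGGERED = "triggered"
    ARCHIVED = "archived"


@dataclass
class CanaryToken:
    name: str
    location: str = ""
    stage: Stage = Stage.CREATED
    history: list = field(default_factory=list)

    def advance(self, new_stage: Stage, note: str = "") -> None:
        # Record every transition so the token's story can be replayed later.
        self.history.append((self.stage, new_stage, note))
        self.stage = new_stage


def respond(triggered: CanaryToken, replacement_name: str) -> CanaryToken:
    """Archive the fired token for forensics and return a fresh one for the same spot."""
    triggered.advance(Stage.ARCHIVED, "kept for forensics")
    fresh = CanaryToken(name=replacement_name, location=triggered.location)
    fresh.advance(Stage.PLACED, f"replaces {triggered.name}")
    return fresh
```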

Canary tokens and TruffleHog to the rescue!

We planted AWS API key canary tokens (alongside a few others) in organization- and repository-level secrets. When the attacker exfiltrated the secrets and ran TruffleHog to validate them, TruffleHog’s sts:GetCallerIdentity call against our tokens triggered the canary. That would not have happened with free canary tokens, which TruffleHog recognizes automatically without calling the AWS API.

Because Thinkst was watching those AWS accounts, we got an immediate Slack alert. That one notification, thanks to our strategic token placement, revealed the exact entry point. The Detection & Response team reacted to the notification and raised an incident. Teams across R&D and Security sprang into action to contain and remediate the incident, which, again, resulted in no code modifications, unauthorized access to production systems, exposure of customer data, or access to personal information.

Our canary token placement strategy

Given GitHub’s central role as our primary source of truth, safeguarding it with canary tokens is crucial. Specifically, we strategically deploy AWS API key canary tokens within GitHub Secrets by placing the tokens at the organization level (to flag any repo breach) and per repository (to precisely pinpoint intrusions at the repository level). Other platforms get their own flavors. For example, Google Docs tokens live in shared Google Drive files, email-based tokens in dummy inboxes, and DNS tokens in unused subdomains.
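
As an illustration of why the two scopes complement each other, the sketch below infers an entry point from whichever tokens fired. The `org` and `repo:<name>` scope strings and the `locate_entry_point` function are hypothetical conventions chosen for this example.

```python
from typing import Iterable


def locate_entry_point(triggered_scopes: Iterable[str]) -> str:
    """Summarize what a set of fired tokens says about where the attacker got in.

    Scopes are strings like "org" or "repo:<name>" (an assumed naming scheme).
    """
    scopes = set(triggered_scopes)
    repos = sorted(s.split(":", 1)[1] for s in scopes if s.startswith("repo:"))
    if repos:
        # A repository-level token is only readable from that repository.
        return "likely entry point: " + ", ".join(repos)
    if "org" in scopes:
        # Organization-level secrets are shared, so the repository is unknown.
        return "organization secrets exposed; entry repository unknown"
    return "no canary triggered"
```

An organization-level token on its own only says that some repository was breached; a repository-level token firing alongside it narrows the search to that repository.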

The key is to map out every asset (repos, CI/CD pipelines, cloud consoles, vaults, pods, containers, etc.) and sprinkle a diverse mix of tokens wherever an attacker might snoop. We’ve deployed tens of thousands (or an infinite number, for that matter), so watch your step, since there’s a tripwire behind every click.

How you can utilize canary tokens in your security setup

Now that we’ve walked through how we deploy canary tokens, we want to share some tips for how you can utilize them as well. For the purposes of this blog, we’ll focus our advice on the same tools we use internally (because we believe in them as tried and tested technologies), but most of the general guidance still applies for other similar tools.

Shaping your placement strategy with team discussion

Some key questions you should consider before deciding on your canary placement and alerting strategy include:

Which token types catch attackers best?

Pick from Thinkst’s catalog (over two dozen types) to match your environment and threats. Mix cloud/API keys (AWS, Azure, Slack) with file- or link-based tokens (PDFs, DNS, URLs) so you cover code repos, docs, and networks. Use a variety to broaden detection.

How do you make alerts impossible to miss?

Tune your alert thresholds and channels to strike the right balance between urgency and noise. Consider:

Acknowledge alerts early: Acknowledge alerts promptly on the Thinkst platform to avoid grouping subsequent alerts and missing notifications of repeated canary token triggers from different IP addresses.
Metadata is important: For quicker investigation, ensure every notification includes canary token metadata (name, location, reminder text). The Thinkst platform automates this if the reminder text is correctly configured.

Where do alerts need to land?

Choose channels your team actually watches:

Immediate channels: Slack (with webhooks) or the Grafana Cloud IRM mobile app for 24/7 paging
Secondary channels: Security mailing lists, a SIEM (e.g., Splunk), or ticketing for structured incident tracking

Map each severity level to the right channel so nothing slips through.
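
One way to express such a mapping is a small routing table. The severity names and channel identifiers below are hypothetical placeholders; in practice this logic would usually live in your alerting tool's routing configuration rather than in application code.

```python
from typing import List

# Hypothetical severity -> channel mapping; adjust names to your own setup.
ROUTES = {
    "critical": ["irm-page", "slack"],
    "high": ["slack", "ticket"],
    "low": ["email", "siem"],
}

# Unknown severities fall back to the loudest route so nothing is dropped.
FALLBACK = ["irm-page", "slack"]


def channels_for(severity: str) -> List[str]:
    """Return the channels an alert of the given severity should reach."""
    return ROUTES.get(severity.lower(), FALLBACK)
```

Falling back to the paging route for unrecognized severities is a deliberate choice in this sketch: a misclassified canary alert is noisier, but it is never silently lost.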

Where should tokens live?

Sprinkle tokens in high-risk and high-value spots:

CI/CD pipelines: As dummy environment variables, test jobs, or secondary workflows
Private GitHub/GitLab repositories: In config files, comments or branch-specific .env files
Developer workstations & build servers: In ~/.aws/credentials, hidden directories or registry entries
Cloud accounts & consoles: Nested within IAM policies or metadata documents
Shared network drives & documentation: In ordinary looking directories or PDFs
Public GitHub/GitLab repositories: Be cautious here! Alert fatigue is a real thing. Automated secret scanners constantly sweep public repositories for inadvertently exposed credentials such as AWS API keys, so tokens left in obvious places (e.g., Git branches) will produce false positives. Limit your use of canary tokens in public repositories accordingly.

How do you prevent accidental triggers?

Educate teams: Explain the “what” and “where” of canary tokens.
Demo alerts: Show how tokens work and what notifications look like.
Announce deployments: Share rollout plans in team channels.
Document “off-limits” zones: Post common token names and locations in your wiki.
Use runbooks: Guide responders through investigations to avoid confusion.

You have two communication strategies for canary tokens:

Silent: No internal notice. This is better for catching insiders but produces more false positives.
Communicative: Teams know about the tokens. This produces fewer false positives but risks tipping off attackers.

Who are you trying to catch?

Identify the adversaries you want your canary tokens to catch, such as those listed below. Keep in mind that advanced attackers use unique methods, so tokens alone are insufficient for full access prevention.

External attackers probing repos or cloud assets
Automated scanners hunting for exposed secrets
Insider threats snooping on token-laden areas
Supply-chain attackers targeting CI/CD
Phishing or social-engineering via document or email tokens

By answering these, you’ll align token types, alerting, and placement with your organization’s risks and tools, thus turning every token into a reliable tripwire.

Best practices, limitations, and gotchas

Canary tokens provide valuable, high-confidence alerts of compromise, but triggering one means an attacker is already inside your system. While effective, they are not a foolproof solution and have limitations.

Provider support varies: Not every token type is available on every platform. If you build your own bespoke API or secret-generation service, you’ll need to roll your own monitoring. One way to do that is to log token IDs (never the secret!) into Grafana Loki, set up Grafana Alerting alerts on those IDs, and feed incidents into Grafana Cloud IRM.
Monitoring availability: If your token-watching infrastructure goes down, or worse, becomes compromised, you lose visibility entirely. So, secure your infrastructure!
False positives from insiders: Before we formally announced our token rollout, a handful of teammates accidentally tripped tokens simply by poking around. By asking them why they accessed those credentials, we quickly labeled those alerts as benign and moved on.
Obfuscated attacker source: VPNs hide real IPs, so location data can be misleading. Some browser-based tokens can add fingerprinting, but that only works for web-triggered tokens.
API rate limits: Even when vendors advertise “unlimited” API calls, their infrastructure typically enforces limitations. Blast too many token creation or update requests at once and you’ll hit throttling or even break the API.
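
For the roll-your-own monitoring mentioned under "Provider support varies", a minimal sketch of the logging step might look like this. It builds a request body in the shape accepted by Loki's push endpoint (`/loki/api/v1/push`) and deliberately includes only the token ID, never the secret. The `job` label value, the field names inside the log line, and the `build_loki_push` function are assumptions for illustration; sending the request and authenticating are left out.

```python
import json
import time
from typing import Optional


def build_loki_push(token_id: str, source_ip: str, ts_ns: Optional[int] = None) -> dict:
    """Build a Loki push payload for one canary trigger.

    Only the token ID is logged; the secret value never appears in the payload.
    """
    if ts_ns is None:
        ts_ns = time.time_ns()
    line = json.dumps(
        {"event": "canary_triggered", "token_id": token_id, "source_ip": source_ip}
    )
    return {
        "streams": [
            {
                # Keep labels low-cardinality; the token ID goes in the log line.
                "stream": {"job": "canary-monitor"},
                # Loki expects [timestamp in nanoseconds as a string, log line].
                "values": [[str(ts_ns), line]],
            }
        ]
    }
```

An alert rule could then match on the `canary_triggered` event and pull the token ID out of the log line, which keeps the number of label values small however many tokens you deploy.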

Here are the key best practices we’ve learnt over time:

Use descriptive reminders: Leverage Thinkst’s “reminder” text field to tag each token with owner, environment, and location. You can also store JSON in that field and retrieve it later via the UI or API. That metadata lets you pinpoint precisely what was touched.
One token per location: When a token fires, archive it for forensics (an attacker might reuse or sell it), then replace it, ideally with a fresh name that blends in (e.g., AWS_API_CREDS instead of CANARY_TOKEN_AWS).
Group into flocks: Organize tokens by environment, team, or platform into flocks to simplify bulk management.
Integrate alerts into incident response: Route notifications into your messaging system, your SIEM, and/or ticketing system so that nothing slips through the cracks.
Mix token types: Don’t rely solely on AWS API keys. Deploy file, URL/DNS, email, and command tokens to cover every attacker tactic.
Automate placement: Use Thinkst’s scripts or IaC examples (Terraform, Ansible, Puppet, and AWS CloudFormation) to ensure consistency. We also worked closely with Thinkst support to master their UI, API, and tooling and learn some best practices.
Test your tokens and document actions: Trigger a few tokens in a safe environment after deployment to verify alert delivery and downstream automation. These drills simulate real-world scenarios and let you refine your incident response procedures. Document each simulated incident and the steps taken, then turn them into runbooks that guide responders when a canary token is triggered for real, making the response faster and less confusing.
Communicate widely and wisely: Make your teams (or even your broader community) aware of canary tokens. That transparency reduces accidental triggers and builds security awareness.
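
As a sketch of the "Use descriptive reminders" practice above, the helpers below store structured metadata as JSON in a reminder string and read it back during triage. The field names and both function names are hypothetical; how the reminder text reaches you (UI, API, or webhook) depends on your platform and is not shown here.

```python
import json


def make_reminder(owner: str, environment: str, location: str) -> str:
    """Serialize token metadata into a JSON string suitable for a reminder field."""
    return json.dumps(
        {"owner": owner, "environment": environment, "location": location},
        sort_keys=True,
    )


def parse_reminder(reminder: str) -> dict:
    """Recover metadata from a reminder, tolerating older free-text reminders."""
    try:
        data = json.loads(reminder)
    except json.JSONDecodeError:
        return {"note": reminder}
    return data if isinstance(data, dict) else {"note": reminder}
```

Falling back to a plain `note` keeps tokens created before you adopted the JSON convention usable in the same triage path.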

By understanding these limitations and following these practices, you’ll keep your canary token deployment lean, reliable, and ready to catch real attacks the moment they unfold.

Conclusion

Canary tokens have been a force multiplier for us, turning every corner of our infrastructure into a potential tripwire and giving us the precious minutes needed to stop a breach before it escalates.

Seed your environment with the right token types, integrate alerts for automation, and never underestimate the advantage of knowing an adversary is inside before they ever touch your real assets.
