Configure Grafana private data source connect (PDC)
Private data source connect (PDC) enables you to securely connect your Grafana Cloud stack to data sources hosted on a private network.
Private data source connect (PDC) is available in all editions of Grafana Cloud.
Set up a private data source connection
To set up a private data source connection, you will first deploy the Grafana PDC agent, then configure which hosts and ports to allow on your network, and configure your data source with those ports.
Before you begin
Before you begin working with private data source connect (PDC), ensure the following:
- You have the tools you need to deploy the PDC agent within your network. You can deploy it directly to a Linux or Windows server, or use a container management system like Docker or Kubernetes.
- The OpenSSH version is 9.2 or higher on the server where the PDC agent is deployed. For more information on this version requirement, refer to the PDC scalability and security page.
- You know the local host name and port of the data source you would like to connect to, for example loki:8080.
- You have the proper set of credentials to access the data, for example, a username and password, or a token. Refer to the documentation for your data source to learn what credentials are needed.
- You have an administrator account for your Grafana Cloud organization. To learn more about Grafana Cloud permissions, refer to Grafana Cloud user roles and permissions.
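The OpenSSH requirement above can be checked before you deploy the agent. The following is a minimal sketch, not part of the PDC tooling: the function name openssh_at_least_9_2 is hypothetical, and it assumes the version banner has the usual "OpenSSH_X.Y" form.

```shell
# Succeed if an OpenSSH version banner reports 9.2 or higher.
# Argument: the banner printed by `ssh -V`, for example
# "OpenSSH_9.6p1 Ubuntu-3ubuntu13, OpenSSL 3.0.13 30 Jan 2024".
openssh_at_least_9_2() {
  local version major minor
  # Extract "MAJOR MINOR" from the "OpenSSH_X.Y" token; the optional
  # letters cover banners such as "OpenSSH_for_Windows_9.5p1".
  version=$(printf '%s\n' "$1" \
    | sed -n 's/^.*OpenSSH_[A-Za-z_]*\([0-9][0-9]*\)\.\([0-9][0-9]*\).*$/\1 \2/p' \
    | head -n 1)
  [ -n "$version" ] || return 2   # banner not recognized
  major=${version% *}
  minor=${version#* }
  [ "$major" -gt 9 ] || { [ "$major" -eq 9 ] && [ "$minor" -ge 2 ]; }
}
```

A possible invocation is openssh_at_least_9_2 "$(ssh -V 2>&1)" && echo "OpenSSH version OK". The redirect is needed because ssh -V writes its banner to standard error.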
Note
To establish an SSH connection to Grafana Cloud, the PDC agent must run on a network that allows internet egress to the following endpoints: private-datasource-connect-<cluster>.grafana.net:22 and private-datasource-connect-api-<cluster>.grafana.net:443. The <cluster> value is displayed in the Grafana UI (under Connections > Private data source connections > Configuration Details). The API endpoint (port 443) is used for signing the short-lived SSH certificates used for authenticating with the SSH endpoint (port 22).
If your data source uses AWS SigV4 (AWS Signature Version 4 Authentication), the network where the PDC agent runs must allow internet egress to sts.<region>.amazonaws.com:443. Replace <region> with the AWS region you are querying. For more details on AWS SigV4, refer to the AWS documentation.
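The egress requirements in this note can be verified from the host that will run the agent. The sketch below only builds the endpoint names from the patterns given above and attempts a TCP connection to each one. The function names are hypothetical, and it assumes a netcat (nc) build that supports the -z and -w options.

```shell
# Print the two Grafana Cloud endpoints the agent must reach for a cluster.
pdc_egress_endpoints() {
  local cluster="$1"
  printf 'private-datasource-connect-%s.grafana.net:22\n' "$cluster"
  printf 'private-datasource-connect-api-%s.grafana.net:443\n' "$cluster"
}

# Print the AWS STS endpoint needed when a data source uses SigV4.
aws_sts_endpoint() {
  local region="$1"
  printf 'sts.%s.amazonaws.com:443\n' "$region"
}

# Read host:port lines on stdin and report whether each accepts a TCP connection.
# Assumption: `nc` is installed and understands -z (scan only) and -w (timeout).
check_tcp_endpoints() {
  while IFS=: read -r host port; do
    if nc -z -w 5 "$host" "$port" >/dev/null 2>&1; then
      echo "ok   $host:$port"
    else
      echo "FAIL $host:$port"
    fi
  done
}
```

For example, pdc_egress_endpoints "$GCLOUD_PDC_CLUSTER" | check_tcp_endpoints checks both Grafana Cloud endpoints, and aws_sts_endpoint us-east-1 | check_tcp_endpoints checks STS for one region. A successful TCP connection shows only that egress is allowed; it does not validate credentials.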
Multiple PDC Networks
Your Grafana Cloud stack can have multiple PDC networks. Use a separate PDC network for each isolated private network in your infrastructure.
Each PDC network can have multiple PDC agents connected to it. The requests for a PDC network are load balanced across all PDC agents connected to that PDC network.
Tip
If you have multiple data centers that are connected to each other but only over high-latency or low-bandwidth links, consider using a separate PDC network for each data center. This allows you to choose the most suitable PDC network for each data source.
PDC connection steps
To set up a private data source connection, follow these steps:
In Grafana, go to Connections > Private data source connections. Either choose an existing PDC network or create a new one. Click the Configuration Details tab.
Select your installation method and follow the instructions on the screen, or generate an API key and follow the remaining instructions below. You will need the following environment variables from your instance:
- GCLOUD_PDC_SIGNING_TOKEN: set to the API token value generated in your Grafana Cloud instance. This is shown as token in the configuration instructions in the Private data source configuration page.
- GCLOUD_HOSTED_GRAFANA_ID: the ID of your Grafana Cloud instance. This is shown as gcloud-hosted-grafana-id in the configuration instructions in the Private data source configuration page.
- GCLOUD_PDC_CLUSTER: the cluster for your private data source connections. This is shown as cluster in the configuration instructions in the Private data source configuration page.
Connect to Grafana Cloud using the PDC agent.
There are three installation options:
- running on Kubernetes
- running the PDC Agent Docker image
- running a PDC Agent binary
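All three options read the same three environment variables, so it can help to confirm they are set before running any command. This is a minimal sketch; the function name pdc_missing_vars is hypothetical and not part of the PDC agent.

```shell
# Print the name of every required variable that is unset or empty.
# Returns 0 when all three are set, 1 otherwise.
pdc_missing_vars() {
  local missing=0 name value
  for name in GCLOUD_PDC_SIGNING_TOKEN GCLOUD_HOSTED_GRAFANA_ID GCLOUD_PDC_CLUSTER; do
    # Indirect lookup of the variable named in $name (names are fixed above).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name"
      missing=1
    fi
  done
  return "$missing"
}
```

Running pdc_missing_vars || echo "set the variables listed above first" prints the names that still need a value.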
Option 1 - Using Kubernetes
Create a Kubernetes secret with the API Key, Hosted Grafana ID and PDC Cluster values (${NAMESPACE} should be set to your desired Kubernetes namespace):
kubectl create secret generic grafana-pdc-agent \
  --namespace="${NAMESPACE}" \
  --from-literal="token=${GCLOUD_PDC_SIGNING_TOKEN}" \
  --from-literal="hosted-grafana-id=${GCLOUD_HOSTED_GRAFANA_ID}" \
  --from-literal="cluster=${GCLOUD_PDC_CLUSTER}"
Generate a Kubernetes deployment to deploy the agent. An example deployment is provided in the pdc-agent repository:
kubectl apply -f https://raw.githubusercontent.com/grafana/pdc-agent/main/production/kubernetes/pdc-agent-deployment.yaml
Option 2 - Using the pdc-agent Docker image
docker run --name pdc-agent grafana/pdc-agent:latest -token ${GCLOUD_PDC_SIGNING_TOKEN} -cluster ${GCLOUD_PDC_CLUSTER} -gcloud-hosted-grafana-id ${GCLOUD_HOSTED_GRAFANA_ID}
Option 3 - Using a pdc-agent binary
Download and unzip the binary for your OS from the PDC Agent releases page.
Run the binary:
./pdc -token ${GCLOUD_PDC_SIGNING_TOKEN} -cluster ${GCLOUD_PDC_CLUSTER} -gcloud-hosted-grafana-id ${GCLOUD_HOSTED_GRAFANA_ID}
High availability
(Optional) For high availability, you can install additional instances of the agent on your network with the same configuration. These can be deployed to different regions, data centers, or providers as long as they are on the same network. For production environments, Grafana recommends running a minimum of 3 PDC agents.
Note
Updating the agent requires a restart of the PDC agent (or a rolling update of the PDC deployment when running in Kubernetes).
Once the PDC agent successfully connects to Grafana Cloud, you will see the following message in your logs:
This is Grafana Private Data Source Connect!
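To check for this message without reading the logs by hand, you can search the agent output for it. The sketch below is one way to do that; the function name pdc_agent_connected is hypothetical, and it matches only the exact message quoted above.

```shell
# Read agent log output on stdin and succeed if the connection message appears.
pdc_agent_connected() {
  # -F: match the text literally; -q: print nothing, report via exit status.
  grep -qF 'This is Grafana Private Data Source Connect!'
}
```

With the Docker option, where the container is named pdc-agent, this could be used as docker logs pdc-agent 2>&1 | pdc_agent_connected && echo "agent connected".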
Resource requirements for pdc-agent
For information on resource requirements for the PDC agent, along with recommendations for how many PDC agents to run, refer to the PDC scalability and security page.
Configure a data source to use private data source connect (PDC)
After you have set up the PDC connection, you can set up a data source in Grafana to query your data.
Before you begin adding a data source
- Ensure the data source you want to connect to supports PDC. Refer to PDC known limitations for a list of supported data sources.
Steps to add a data source
Follow the Add a data source instructions.
Under the Private data source connection header, choose the connection to the network where your service is hosted.
In the URL field for your data source, use the same URL as if you were on your private network, instead of a public URL.
Save, test, and query your data source as usual.
Check your PDC Agent configuration
If you have trouble connecting to your data source, check the list of destinations reachable by the PDC agent, which might be restricted using the PermitRemoteOpen SSH option, set with a --ssh-option flag. You can see this list in the agent’s configuration. If your agent is running with high verbosity (-vvv), you will be able to see attempted connections in the agent logs.
Configure PDC to connect a single Grafana stack to multiple networks
Once you have set up your first private data source connection, you can connect to data sources in additional networks by creating more than one PDC network within your Grafana instance.
In Grafana, go to Connections > Private data source connections and click on Add New. Choose a name for your second connection and click the Add button.
Follow the instructions above to set up a private data source connection and deploy the PDC agent to the additional network.
When you configure a data source, select the new connection in the Private data source connect section.
Troubleshooting
For help troubleshooting the PDC agent or sending PDC queries, refer to the Troubleshooting PDC page.
View PDC activity in the grafanacloud-usage data source
Your stack’s grafanacloud-usage data source contains two metrics for tracking PDC activity:
- grafanacloud_grafana_pdc_connected_agents shows how many PDC agents are connected to our infrastructure, for each stack in your org and each PDC network (using the tunnelID label).
- grafanacloud_grafana_pdc_datasource_request_duration_seconds_rate5m_p90 shows the p90 request duration for each data source in your stacks that uses PDC. There is also a status_code label, so you can see whether requests succeeded or failed.