Important: This documentation is about an older version. It's relevant only to the release noted; many of the features and functions have been updated or replaced. Please view the current version.
Databricks datasource for Grafana
The Databricks datasource allows a direct connection to Databricks to query and visualize Databricks data in Grafana.
This datasource provides a SQL editor that formats and color-codes your SQL statements.
Note: This plugin is for Grafana Enterprise only.
Installation
For detailed instructions on how to install the plugin on Grafana Cloud or locally, check out the Plugin installation docs.
Note: This plugin uses dynamic links for Credentials authentication (deprecated). We suggest using Token authentication. If you run the plugin on bare Alpine Linux, Credentials authentication will not work. If Token authentication is not an option and Alpine Linux is a requirement, we suggest using our Alpine images.
Manual configuration
Once the plugin is installed on your Grafana instance, follow these instructions to add a new Databricks data source, and enter configuration options.
With a configuration file
It is possible to configure data sources using configuration files with Grafana’s provisioning system. To read about how it works, including all the settings that you can set for this data source, refer to Provisioning Grafana data sources.
Here is a provisioning example for this data source that authenticates with a token:
apiVersion: 1
datasources:
  - name: Databricks
    type: grafana-databricks-datasource
    jsonData:
      host: community.cloud.databricks.com
      httpPath: path-from-databricks-odbc-settings
    secureJsonData:
      token: password/personal-token
Time series
Time series visualization options become selectable after you add a field of type datetime to your query. This field is used as the timestamp. You can then select a time series visualization from the visualization options. Grafana interprets timestamp rows without an explicit time zone as UTC. Any column except time is treated as a value column.
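As a minimal sketch of a single-series query, assuming the mgbench.logs1 table and its log_time and disk_free columns used elsewhere on this page, the following returns one datetime field aliased as time plus one value column. The table and column names are illustrative and may not exist in your workspace.

```sql
-- Sketch only: mgbench.logs1, log_time, and disk_free are assumed names.
SELECT log_time AS time,                 -- datetime field, used as the timestamp
       avg(disk_free) AS avg_disk_free   -- any other column is treated as a value
FROM mgbench.logs1
GROUP BY log_time
ORDER BY log_time
```

Sorting by the timestamp column keeps the points in chronological order for the visualization.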
Multi-line time series
To create multi-line time series, the query must return at least 3 fields in the following order:
- field 1: datetime field with an alias of time
- field 2: value to group by
- field 3+: the metric values
For example:
SELECT log_time AS time, machine_group, avg(disk_free) AS avg_disk_free
FROM mgbench.logs1
GROUP BY machine_group, log_time
ORDER BY log_time
Templates and variables
To add a new Databricks query variable, refer to Add a query variable.
After creating a variable, you can use it in your Databricks queries by using Variable syntax. For more information about variables, refer to Templates and variables.
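As a rough illustration of variable syntax in a query, the sketch below filters on a dashboard variable. The variable name machine_group is hypothetical, as are the table and column names, and the example assumes a single-value variable; multi-value variables usually need a different format option.

```sql
-- Sketch only: ${machine_group} is a hypothetical single-value dashboard variable.
-- Grafana substitutes the selected value into the query text before it is sent.
SELECT log_time AS time,
       avg(disk_free) AS avg_disk_free
FROM mgbench.logs1
WHERE machine_group = '${machine_group}'
GROUP BY log_time
ORDER BY log_time
```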
Macros in Databricks Query
Macro | Description |
---|---|
$__interval_long | Converts Grafana’s interval to an INTERVAL DAY TO SECOND literal. Applicable to Spark SQL window grouping expressions. |
$__interval_long macro
In some cases, you may want to use window grouping with Spark SQL.
Example:
SELECT window.start, avg(aggqueue) FROM a17
GROUP BY window(_time, '$__interval_long')
Based on the dashboard interval, this query is translated into the following (here, for a 2-minute interval):
SELECT window.start, avg(aggqueue) FROM a17
GROUP BY window(_time, '2 MINUTE')
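To chart the windowed result as a time series, the window start can be aliased as time so that Grafana picks it up as the timestamp. This is a sketch that reuses the a17 table and its _time and aggqueue columns from the example above; the avg_aggqueue alias is an assumption added for readability.

```sql
-- Sketch only: a17, _time, and aggqueue come from the example above;
-- avg_aggqueue is an illustrative alias.
SELECT window.start AS time,             -- start of each window, used as the timestamp
       avg(aggqueue) AS avg_aggqueue     -- value column
FROM a17
GROUP BY window(_time, '$__interval_long')
ORDER BY time
```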
Macro examples
Below are examples of macro expansion when Grafana has a 1m interval.
Format | Expands to |
---|---|
$__interval_long | 1 MINUTE |
Learn more
- Add Annotations.
- Configure and use Templates and variables.
- Add Transformations.
- Set up alerting; refer to Alerts overview.