---
title: "Work with schemas | Grafana Enterprise Metrics documentation"
description: "Schemas define how data is aggregated before the Graphite query functions process the aggregated data."
---

# Work with schemas

Schemas define how data is aggregated before the Graphite query functions process it. There are two types of schemas:

- Storage-schemas define the interval to which data is aggregated and how far into the past data is available for querying.
- Aggregation-schemas define which aggregation function is used to aggregate the data.

Each of the two schema types has a global default, which can be configured via the following flags:

- `-graphite.querier.schemas.default-storage-schemas-file`
- `-graphite.querier.schemas.default-storage-aggregations-file`

If the config flag `-graphite.querier.schemas.enable-user-overrides` is enabled, the global defaults can be overridden on a per-tenant basis.

## Querying and updating a tenant’s schemas

Each of the two schema types has an API endpoint that can be used to query and update it. The endpoints are:

- `/graphite/config/storageSchemas`
- `/graphite/config/storageAggregations`

This is an example request to the Graphite querier which gets the current storage-schemas of the tenant `12345`:

```none
~$ curl -H 'X-Scope-OrgId: 12345' http://<graphite-querier>/graphite/config/storageSchemas
[default]
pattern    = .*
intervals  = 0:1s
retentions = 1m:5w

~$
```

This is an example request to the Graphite querier which updates the storage-schemas of the tenant `12345`:

```none
~$ cat storage-schemas.conf
[default]
pattern    = .*
intervals  = 0:1s
retentions = 1m:5w,2h:1y

~$ curl -H 'X-Scope-OrgId: 12345' --data-binary '@storage-schemas.conf' http://<graphite-querier>/graphite/config/storageSchemas
~$
```

The tenant schemas are stored in the object store bucket that is configured with the flags prefixed by `-graphite.querier.schemas`.

## Schema caching

When a query is received, the Graphite querier needs to obtain the tenant’s schemas. To avoid fetching the schemas from the object store for every single query, they are cached in process memory, by default for `1h`. The cache lifetime is configurable via `-graphite.querier.schemas.schema-ttl`.

### Proactive cache refreshing

When the Graphite querier reads a schema from its local cache to handle a query, it checks whether the cache entry has expired and, if not, whether it is in the first or the second half of its cache lifetime:

- If the entry has expired, it is not used. The latest schema is fetched from the object store, used to handle the query, and written to the cache.
- If the entry has not expired but is in the second half of its cache lifetime, the cached value is used. At the same time, an asynchronous job is started that fetches the latest schema from the object store and updates the cache with it.
- If the entry is in the first half of its cache lifetime, the cached value is used without any further action.

This proactive refresh mechanism prevents queries from blocking on a fetch from the object store, provided that the tenant submits queries relatively frequently. If a tenant submits queries only very infrequently, the cache entry may expire between queries, and query handling then blocks until the latest schemas have been fetched.
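The refresh rules above can be sketched in Python as follows. This is an illustrative model only, not the Graphite querier's implementation: the class `SchemaCache`, its `fetch` callback (a stand-in for reading from the object store), and the injectable `clock` are all hypothetical names introduced for this example.

```python
import threading
import time
from typing import Callable, Dict, List, Tuple


class SchemaCache:
    """Hypothetical TTL cache that refreshes entries in the second half of their lifetime."""

    def __init__(self, fetch: Callable[[str], str], ttl: float,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch    # stand-in for a read from the object store
        self._ttl = ttl        # cache lifetime in seconds
        self._clock = clock    # injectable so the behavior can be tested
        self._lock = threading.Lock()
        self._entries: Dict[str, Tuple[str, float]] = {}  # tenant -> (schema, stored_at)
        self.background_jobs: List[threading.Thread] = []  # exposed so callers can wait

    def _refresh(self, tenant: str) -> str:
        schema = self._fetch(tenant)
        with self._lock:
            self._entries[tenant] = (schema, self._clock())
        return schema

    def get(self, tenant: str) -> str:
        with self._lock:
            entry = self._entries.get(tenant)
        if entry is None:
            return self._refresh(tenant)   # nothing cached: blocking fetch
        schema, stored_at = entry
        age = self._clock() - stored_at
        if age >= self._ttl:
            return self._refresh(tenant)   # expired: blocking fetch
        if age >= self._ttl / 2:
            # Second half of the lifetime: serve the cached value now and
            # refresh in the background so later queries see fresh data.
            job = threading.Thread(target=self._refresh, args=(tenant,), daemon=True)
            job.start()
            self.background_jobs.append(job)
        return schema
```

A production version would also need to avoid starting several refresh jobs for the same tenant at once; that is left out here to keep the sketch short.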

## Schema structure and syntax

Each of the two schema types (`storage-schemas`/`storage-aggregations`) consists of a list of schemas, where each schema has a `pattern` property containing a regular expression. When the Graphite querier looks up the schema for a given metric, it loops over the list of schemas and matches each pattern against the metric name. The first schema whose pattern matches is used, which means that the order of the schemas within the configuration files matters.

It is common practice to name the last schema in each file `default` and give it the pattern `.*`. This is a catch-all that matches all metrics that haven’t matched any of the previous schemas. If there is no `default` schema and none of the defined schemas matches, a hard-coded default is used.
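To illustrate why the order matters, here is a minimal Python sketch of a first-match-wins lookup. The `Schema` type, the `find_schema` function, and the `servers` entry are hypothetical, and the sketch assumes unanchored regular expression matching (a pattern matches if it is found anywhere in the metric name), which this page does not confirm.

```python
import re
from typing import List, NamedTuple, Optional


class Schema(NamedTuple):
    name: str
    pattern: str


def find_schema(schemas: List[Schema], metric: str) -> Optional[Schema]:
    """Return the first schema whose pattern matches the metric name."""
    for schema in schemas:
        if re.search(schema.pattern, metric):
            return schema
    return None  # the caller would fall back to a hard-coded default here


# Order as it would appear in the config file: specific entries first,
# the catch-all `default` entry last.
SCHEMAS = [
    Schema("apache_busyWorkers", r"^servers\.www.*\.workers\.busyWorkers$"),
    Schema("servers", r"^servers\."),
    Schema("default", r".*"),
]
```

With this list, `servers.www01.workers.busyWorkers` resolves to `apache_busyWorkers`. If the `default` entry were moved to the top, it would match every metric and the more specific entries would never be reached.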

### Storage schemas

The format of the storage schemas is very similar to standard Graphite, with two additional parameters. For the documentation of the format refer to [storage-schemas.conf](https://graphite.readthedocs.io/en/latest/config-carbon.html#storage-schemas-conf).

Example:

```none
[apache_busyWorkers]
pattern = ^servers\.www.*\.workers\.busyWorkers$
retentions = 15s:7d,1m:21d,15m:5y
```

This example defines that:

- All metrics matching this regular expression are available for querying for up to `5y`.
- If a query only covers the most recent `7d` of data, the data is available at an interval of `15s`.
- If a query reaches back up to `21d`, the data is available at an interval of `1m`.
- If a query reaches back up to `5y`, the data is available at an interval of `15m`.
- If a query reaches back more than `5y`, it only gets data returned for the most recent `5y`.
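The selection described by this list can be modeled with a short Python sketch. The functions `parse_duration`, `parse_retentions`, and `select_retention` are hypothetical helpers written for this example, and the unit table (including a year of 365 days) is an assumption based on standard Graphite conventions rather than something stated on this page.

```python
import re
from typing import List, Tuple

# Assumed unit multipliers in seconds.
UNIT_SECONDS = {"s": 1, "m": 60, "min": 60, "h": 3600,
                "d": 86400, "w": 7 * 86400, "y": 365 * 86400}


def parse_duration(text: str) -> int:
    """Convert a string such as '15s' or '5y' to seconds."""
    match = re.fullmatch(r"(\d+)([a-z]+)", text.strip())
    if match is None or match.group(2) not in UNIT_SECONDS:
        raise ValueError("invalid duration: %r" % text)
    return int(match.group(1)) * UNIT_SECONDS[match.group(2)]


def parse_retentions(text: str) -> List[Tuple[int, int]]:
    """Parse '15s:7d,1m:21d' into a list of (interval, duration) pairs in seconds."""
    pairs = []
    for part in text.split(","):
        interval, duration = part.split(":")
        pairs.append((parse_duration(interval), parse_duration(duration)))
    return pairs


def select_retention(retentions: List[Tuple[int, int]],
                     query_from: int, now: int) -> Tuple[int, int]:
    """Pick the first retention that reaches back far enough for the query."""
    age = now - query_from
    for interval, duration in retentions:
        if age <= duration:
            return interval, duration
    # The query reaches back further than any retention: the coarsest
    # retention applies and older data is simply not returned.
    return retentions[-1]
```

For example, with `retentions = parse_retentions("15s:7d,1m:21d,15m:5y")`, a query starting ten days ago selects the `1m` retention, because ten days exceeds `7d` but fits within `21d`.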

#### Additions to the standard storage schemas

The Graphite querier supports the standard format of Graphite’s storage-schemas and storage-aggregation config files. In addition to the standard properties, it supports two more parameters in the storage-schemas config file which are specific to the Graphite querier.

##### Intervals

The `intervals` parameter defines a minimum interval at which the data of a given absolute time range must be returned. This is useful in cases where the `retentions` would otherwise assign an interval that is smaller than the real interval of the stored data, for example because the stored data was generated according to an older version of the storage-schema that has since been updated. In such a scenario, the gaps between the stored points would be filled with `math.NaN` values to enforce the interval defined by the `retentions`, which can lead to unexpected query results.

The `intervals` parameter is a list of absolute time ranges. Each range has an associated interval, which is the minimum interval at which the data in that range is returned. Each time range is defined by its beginning and ends at the beginning of the following time range. The last time range is open-ended into the future.

Example:

The string `100:15s,200:30s,300:15s` defines that:

- The data in the time range `100-200` must be returned with an interval of at least `15s`
- The data in the time range `200-300` must be returned with an interval of at least `30s`
- The data in the time range `300-<unlimited>` must be returned with an interval of at least `15s`

Keep in mind that Graphite requires every metric that is passed into the Graphite query engine to have a constant interval. This means that if a user queries the time range `100-500`, the highest minimum interval within that time range (`30s`) effectively becomes the minimum interval for the entire queried range.

The minimum interval that has been determined according to the `intervals` parameter is then used in the selection of the retention interval at which the data shall be returned.

For example, assume the `intervals` parameter defines a minimum interval of `30s`, as in the example above, and the retentions are `15s:7d,15m:5y`. The Graphite querier then picks the second retention and returns the data at an interval of `15m`. Even though the first retention would be valid for the queried time range, its interval of `15s` is smaller than the minimum interval of `30s` defined by `intervals`.
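The interaction between `intervals` and `retentions` described above can be sketched in Python. All function names here are hypothetical, and two details are assumptions that this page does not spell out: each time range is treated as half-open (its start is included, its end is not), and a query that lies entirely before the first range has no minimum interval.

```python
import re
from typing import List, Tuple

# Assumed unit multipliers in seconds.
UNIT_SECONDS = {"s": 1, "m": 60, "min": 60, "h": 3600,
                "d": 86400, "w": 7 * 86400, "y": 365 * 86400}


def parse_duration(text: str) -> int:
    match = re.fullmatch(r"(\d+)([a-z]+)", text.strip())
    if match is None or match.group(2) not in UNIT_SECONDS:
        raise ValueError("invalid duration: %r" % text)
    return int(match.group(1)) * UNIT_SECONDS[match.group(2)]


def parse_intervals(text: str) -> List[Tuple[int, int]]:
    """Parse '100:15s,200:30s' into (range_start, min_interval_seconds) pairs."""
    ranges = []
    for part in text.split(","):
        start, interval = part.split(":")
        ranges.append((int(start), parse_duration(interval)))
    return ranges


def minimum_interval(ranges: List[Tuple[int, int]],
                     query_from: int, query_until: int) -> int:
    """Highest minimum interval of all ranges that overlap the queried range."""
    result = 0
    for index, (start, interval) in enumerate(ranges):
        # A range ends where the next one begins; the last one is open-ended.
        end = ranges[index + 1][0] if index + 1 < len(ranges) else float("inf")
        if start < query_until and end > query_from:
            result = max(result, interval)
    return result


def select_retention(retentions: List[Tuple[int, int]], query_from: int,
                     now: int, min_interval: int) -> Tuple[int, int]:
    """First retention that covers the query and respects the minimum interval."""
    age = now - query_from
    for interval, duration in retentions:
        if interval >= min_interval and age <= duration:
            return interval, duration
    return retentions[-1]
```

Applied to the example, `minimum_interval(parse_intervals("100:15s,200:30s,300:15s"), 100, 500)` yields 30 seconds, and `select_retention` then skips the `15s:7d` retention in favor of `15m:5y`.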

An example schema with the intervals parameter might look like this:

```none
[default]
pattern = .*
intervals = 1594166400:30min,1625702400:15s
retentions = 15s:7d,15m:5y
```

##### Applying retentions relative to end of queried time range

The `relativeToQuery` parameter is an optional flag that can be added to a storage schema. If it is not defined, it defaults to `false`.

If set to `true`, the defined `retentions` are not applied relative to the current wall clock time, but relative to the end of the queried time range. For example, with a `retentions` setting of `15s:7d`, a time range of `7d` can be queried and is returned at an interval of `15s` even if the query requests the time range from `now()-1y-7d` to `now()-1y`, because the `7d` retention is then applied relative to `now()-1y`.
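The effect of the flag can be shown with a small Python sketch. The function `select_interval` and its parameters are hypothetical and model only the change of reference point described above; the retentions are given directly as `(interval, duration)` pairs in seconds.

```python
from typing import List, Tuple

DAY = 86400
YEAR = 365 * DAY  # assumption: a year is counted as 365 days


def select_interval(retentions: List[Tuple[int, int]], query_from: int,
                    query_until: int, now: int,
                    relative_to_query: bool = False) -> int:
    """Return the interval of the first retention that covers the query."""
    # With relativeToQuery the age of the data is measured from the end of
    # the queried range instead of from the current wall clock time.
    reference = query_until if relative_to_query else now
    age = reference - query_from
    for interval, duration in retentions:
        if age <= duration:
            return interval
    return retentions[-1][0]


# Retentions equivalent to `15s:7d,15m:5y`.
RETENTIONS = [(15, 7 * DAY), (900, 5 * YEAR)]
```

For a query from `now - 1y - 7d` to `now - 1y`, the default behavior measures an age of just over a year and selects the `15m` interval, whereas `relative_to_query=True` measures an age of seven days and selects `15s`.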

An example schema with this parameter might look like this:

```none
[default]
pattern = .*
relativeToQuery = true
retentions = 15s:7d,15m:5y
```

### Storage aggregations

The format of the storage aggregations is exactly the same as in standard Graphite.

For the documentation of the format refer to [storage-aggregation.conf](https://graphite.readthedocs.io/en/latest/config-carbon.html#storage-aggregation-conf).

A valid storage aggregations entry might look like this example:

```none
[all_min]
pattern = \.min$
xFilesFactor = 0.1
aggregationMethod = min
```
