Configurations

HM-API and its instances are configured in three main places: the HM-API CLI arguments, the HM-API ConfigMap, and the per-instance ConfigMap.

HM-API CLI arguments

This table lists the CLI parameters that HM-API knows, together with an explanation of what each of them does.

| Argument | Default | Explanation |
| --- | --- | --- |
| -alsologtostderr | false | Log to standard error as well as files |
| -api-user | test | Basic auth user for the HTTP API |
| -api-pass | test | Basic auth password for the HTTP API |
| -cassandra-service | cassandra | Name of the Cassandra Kubernetes service to access |
| -cluster | cluster | Name of cluster to be used in reported metrics |
| -config-map | hm-api-config | Name of the ConfigMap used for default settings |
| -default-concurrency | 2 | Default concurrency to be used when rolling out MT deployments |
| -default-domain | hosted-metrics.grafana.net | Domain to append to instance names |
| -dry-run | false | Starts HM-API as a mock server, mostly used for development |
| -kubeconfig | ./config | Path to the kubeconfig file |
| -lets-encrypt | false | Enable Let's Encrypt for automatic creation of SSL keys |
| -listen-host | hm-api | Hostname this HM-API can be reached at |
| -listen_port | 80 | TCP port to listen on |
| -log_backtrace_at | "" | When logging hits line file:N, emit a stack trace |
| -log_dir | "" | If non-empty, write log files in this directory |
| -logtostderr | false | Log to standard error instead of files |
| -namespace | metrictank | Namespace to provision metrictank clusters in |
| -ns1-enabled | false | Enable NS1 DNS integration |
| -ns1-key | "" | NS1 API key |
| -ns1-zone | grafana.net | DNS zone in which records should be created |
| -service-account | default | Service account used by the HM-API pods |
| -ssl | false | Use HTTPS |
| -ssl_certificate | "" | Path to SSL certificate file |
| -ssl_key | "" | Path to SSL certificate key |
| -stderrthreshold | ERROR | Logs at or above this threshold go to stderr |
| -v | 0 | Enable V-leveled logging at the specified level |
| -vmodule | "" | Comma-separated list of pattern=N settings for file-filtered logging |
| -zk-service | zookeeper | Zookeeper service name |

HM-API ConfigMap

By default the name of this ConfigMap is hm-api-config, unless specified otherwise with the -config-map argument. It contains various settings which are regularly read and used by HM-API. If an HM-API default setting is not present, or the whole ConfigMap doesn’t exist, then HM-API will add it on startup. This means it is possible that this ConfigMap gets modified when an HM-API version update is deployed which introduces new defaults.
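
The add-if-missing behavior described above can be sketched as follows. This is only an illustration, not HM-API's actual code: the function name `ensure_defaults` and the sample values are hypothetical, and the sketch assumes that values already present in the ConfigMap are left untouched.

```python
def ensure_defaults(existing, defaults):
    """Return a copy of `existing` with every missing default added.

    Keys that are already present keep their current value (an assumption
    about how HM-API treats operator edits); only absent keys are filled in.
    """
    merged = dict(existing or {})  # a missing ConfigMap behaves like an empty one
    for key, value in defaults.items():
        merged.setdefault(key, value)
    return merged


# Hypothetical example: a new HM-API version ships a default that the
# existing ConfigMap does not contain yet.
current = {"KAFKA_REPLICATION_FACTOR": "1"}
shipped = {"KAFKA_REPLICATION_FACTOR": "3", "DEPLOYMENT_TIMEOUT": "12h"}
result = ensure_defaults(current, shipped)
```

Under these assumptions the operator's edited value survives and only the new key is added, which is why a version update can change the ConfigMap without reverting local changes.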

Deployment Plans

A plan defines what sizes of instances are known to HM-API. When a plan has been selected for an instance, it is still possible to override certain limits with specific overrides, but the plan gives a basic template of sizes which should be deployed.

The plans exist under the key plans.json and the default plans look like this:

    {
      "large": {
        "Name": "Large",
        "Partitions": 128,
        "CPU": 100,
        "Mem": 1000
      },
      "medium": {
        "Name": "Medium",
        "Partitions": 32,
        "CPU": 100,
        "Mem": 1000
      },
      "small": {
        "Name": "Small",
        "Partitions": 8,
        "CPU": 50,
        "Mem": 500
      }
    }

When a new instance gets deployed, one of the configured plans must be selected. It will determine the resource requests & limits, the number of partitions in Kafka, and, based on the partitioning policy, the number of Metrictanks that will get deployed.

With the default partitioning policy (ConstantWithNoOverlap) and its default values (partitionsPerReader and partitionsPerWriter both set to 8), the number of Metrictank read deployments (and of write deployments) is <number of partitions> / 8, so a medium instance with 32 partitions results in 4 shards. Each read deployment gets 2 Metrictank pods, and each write deployment gets 1 Metrictank pod.
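
The arithmetic for the default policy can be written out as a small sketch. The function name `metrictank_layout` and the returned field names are hypothetical. The sketch also assumes the partition count divides evenly by the per-deployment values, since uneven plans are not described here.

```python
def metrictank_layout(partitions, partitions_per_reader=8, partitions_per_writer=8):
    """Estimate deployments and pods for the default ConstantWithNoOverlap policy."""
    if partitions % partitions_per_reader or partitions % partitions_per_writer:
        raise ValueError("partition count must be a multiple of the per-deployment values")
    read_deployments = partitions // partitions_per_reader
    write_deployments = partitions // partitions_per_writer
    return {
        "read_deployments": read_deployments,
        "write_deployments": write_deployments,
        "read_pods": read_deployments * 2,    # 2 pods per read deployment
        "write_pods": write_deployments * 1,  # 1 pod per write deployment
    }


layout = metrictank_layout(32)  # the default "medium" plan
```

For the medium plan this yields 4 read and 4 write deployments, that is 8 read pods and 4 write pods.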

The plan definitions may be modified by editing the config map with kubectl -n metrictank edit configmap hm-api-config.

Config Defaults

The config defaults are a long list of parameters. Some of them affect the behavior of HM-API itself, but most configure the instances managed by HM-API. These parameters exist in a map under the key config-defaults.json. All of them can also be overridden in the instance-specific configuration, which is described under Instance ConfigMap.

The parameters which configure one of the following services are prefixed by one string for each service:

| Prefix | Service |
| --- | --- |
| MT | Metrictank |
| MTIMPORTER | Data Importer Tool |
| GRAPHITE | Graphite |
| GW | Tsdb-Gw |

For example, to configure the Metrictank setting -warm-up-period, the corresponding entry in config-defaults.json would be MT_WARM_UP_PERIOD. A few service-specific settings get added to config-defaults.json by default; others can freely be added using the same string format.
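
The naming rule can be sketched as a helper. The only mapping confirmed here is -warm-up-period to MT_WARM_UP_PERIOD. That other settings follow the same rule (strip leading dashes, replace hyphens with underscores, uppercase, prepend the prefix) is an assumption, and the name `config_key` is hypothetical.

```python
def config_key(prefix, flag):
    """Build a config-defaults.json key from a service prefix and a CLI flag."""
    name = flag.lstrip("-").replace("-", "_").upper()
    return f"{prefix}_{name}"


key = config_key("MT", "-warm-up-period")
```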

For a detailed explanation of the settings for Tsdb-Gw, Metrictank, the importer tool and Graphite, please refer to their respective documentation. The following is an overview of all settings that are not specific to one of those services, with an explanation of each:

| Configuration Parameter | Default | Explanation |
| --- | --- | --- |
| CARBON_TSDB_ADDR | "" | The address to which each instance's carbon-relay-ng should send its metrics |
| CASS_KS_REPLICATION_FACTOR | "3" | The Cassandra keyspace replication factor which gets defined when a new keyspace is created |
| KAFKA_REPLICATION_FACTOR | "3" | The Kafka replication factor which gets defined when a new topic is created |
| DOMAIN | hosted-metrics.grafana.net | The domain to append to instance names. This can also be configured via the CLI parameter -default-domain |
| ALIAS | "" | Creates an additional ingress with the specified name. Usually only used in instance configs |
| SSL_SECRET_NAME | wildcard-grafana-net | Name of the secret where the wildcard SSL cert is stored. Will be used to terminate SSL in instance ingresses |
| CLUSTER | "" | Name of cluster to be used in metrics reported by the created instances |
| POD_SERVICE_ACCOUNT | default | The service account as which service pods run |
| DEPLOYMENT_TIMEOUT | 12h | How long deployment jobs wait for each deployment to become fully ready |
| METRICTANK_VERSION | latest | Metrictank version |
| GRAPHITE_VERSION | latest | Graphite version |
| TSDB_GW_VERSION | latest | Tsdb-Gw version |
| CARBON_RELAY_VERSION | latest | Carbon-Relay-Ng version |
| STATSDAEMON_VERSION | latest | Statsdaemon version |
| MTIMPORTER_VERSION | $METRICTANK_VERSION | Mt Importer utility version |
| METRICTANK_IMAGE | us.gcr.io/metrictank-gcr/metrictank | Metrictank docker image |
| MTIMPORTER_IMAGE | $METRICTANK_IMAGE | Mt Importer utility docker image |
| TSDB_GW_IMAGE | docker.io/raintank/tsdb-gw | Tsdb-Gw docker image |
| GRAPHITE_IMAGE | docker.io/raintank/graphite-mt | Graphite docker image |
| CARBON_RELAY_IMAGE | us.gcr.io/metrictank-gcr/carbon-relay-ng | Carbon-Relay-Ng docker image |
| STATSDAEMON_IMAGE | us.gcr.io/metrictank-gcr/statsdaemon | Statsdaemon docker image |
| HOSTED_METRICS_API_IMAGE | us.gcr.io/metrictank-gcr/hosted-metrics-api | Hosted Metrics API docker image |
| MT_WRITE_NODESELECTORS | "" | Optional setting to define node selectors which will get added to the write pods. For details please read the concepts documentation |
| MT_READ_NODESELECTORS | "" | Same as MT_WRITE_NODESELECTORS, but for read deployments |

Instance ConfigMap

Each instance has a configmap which stores the settings from which the instance resources are generated. The name format of that configmap looks like instance-config-<org>-<name>, so for example the name of the configmap for an instance of the org 22 with the name myinstance would look like instance-config-22-myinstance.
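
The name format can be expressed as a one-line helper, shown here only to make the pattern concrete. The function name `instance_configmap_name` is hypothetical.

```python
def instance_configmap_name(org, name):
    """Build the ConfigMap name for an instance from its org id and name."""
    return f"instance-config-{org}-{name}"


cm_name = instance_configmap_name(22, "myinstance")
```

Assuming the instance lives in the default metrictank namespace, the resulting name can then be inspected with kubectl -n metrictank get configmap instance-config-22-myinstance -o yaml.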

This configmap contains 5 keys, but only the value of deployment-config.json is relevant for configuring the instance. The values of the other 4 keys are generated from the value of deployment-config.json, so editing them is not recommended: any changes will be overwritten.

| Key | Explanation |
| --- | --- |
| deployment-config.json | JSON string with instance configurations, such as name, orgId, plan etc. Also includes the values from which the following 4 keys get generated |
| index-rules.conf | Mounted as config file in Metrictank. Generated from the value at the key indexRules in deployment-config.json |
| storage-aggregations.conf | Mounted as config file in Metrictank. Generated from the value at the key storage.aggregations in deployment-config.json |
| storage-schemas.conf | Mounted as config file in Metrictank. Generated from the value at the key storage.schemas in deployment-config.json |
| tsdb-auth.ini | Mounted as config file in Tsdb-Gw. Generated from the value at the key tsdbAuth in deployment-config.json |

For an explanation of the contents of the Metrictank and Tsdb-Gw configuration files, please refer to the documentation of the respective service.

This is a list of the other configurations in deployment-config.json, together with an explanation for each of them:

| Key | Explanation |
| --- | --- |
| org | An integer which is the org id of the instance |
| name | A string which is the name of the instance |
| plan | A structure that describes the chosen plan for this instance. It includes a full copy of the plan |
| plan.Name | The name of the plan of the instance |
| plan.ScaleFactor | A factor by which the plan CPU & Memory get multiplied to scale the instance up |
| plan.Partitions | The number of partitions this instance has in Kafka |
| plan.CPU | The amount of CPU the Metrictank pods request; the limit is always set to 8 (1 per partition) |
| plan.Mem | The amount of memory the Metrictank pods request; the limit is always set to 10x that value |
| overrides | A map of keys and values, each of which overrides or extends the settings from HM-API's default values. See Overrides |
| overridesA | A map of keys and values, similar to overrides, but these overrides are only applied in Metrictank pods with the color a. For an explanation of the concept of colors, please read the concepts documentation |
| overridesB | Same as overridesA, but for color b |
| adminKey | An admin key that usually gets auto-generated at the time the instance is created. It can be used to authenticate against the Tsdb-Gw |
| color | The color which is currently active. For an explanation of what "color" means in this context, please read the concepts documentation |
| revision | A counter which starts at 0 and is increased at every change or update to the instance |
| state | A string which shows the current state of the instance. Can be one of active, deploying, deleting, failed |
| importer | Used by the Importer utility to store its state. Refer to the concepts documentation for more details about how the importer works |
| jobData | Used to pass data to asynchronous jobs that get created by HM-API. These jobs execute long-running tasks such as deploying, deleting or modifying instances |
| rolloutConcurrency | Limits how many Metrictank deployments get updated at once. For details please read the concepts documentation |
| partitioningPolicy | A structure that defines which partitioning policy to use and its options. For details please read the concepts documentation |
| partitioningPolicy.name | The name of the partitioning policy to use for the instance |
| partitioningPolicy.options | A map of keys/values passed to the partitioning policy when it is instantiated |
| enableIndexPruningCronjob | A bool to enable/disable the optional index pruning cronjob |
| indexPruningCronjobSchedule | The schedule of the index pruning cronjob, if enabled. A random one will be generated if none is specified |
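
The relationship between the HM-API defaults and the overrides, overridesA and overridesB keys can be sketched as a layered merge. The layering order used here (defaults first, then overrides, then the color-specific map) is an assumption that this document does not confirm, and the function name and sample values are hypothetical.

```python
def effective_settings(defaults, deployment_config, pod_color=None):
    """Layer instance overrides on top of HM-API defaults.

    Assumed order (not confirmed by the documentation): defaults, then
    `overrides`, then `overridesA` / `overridesB` for pods of that color.
    """
    settings = dict(defaults)
    settings.update(deployment_config.get("overrides") or {})
    if pod_color in ("a", "b"):
        color_key = "overrides" + pod_color.upper()
        settings.update(deployment_config.get(color_key) or {})
    return settings


# Hypothetical values, for illustration only.
defaults = {"METRICTANK_VERSION": "latest", "KAFKA_REPLICATION_FACTOR": "3"}
config = {
    "overrides": {"METRICTANK_VERSION": "v1"},
    "overridesA": {"MT_WARM_UP_PERIOD": "1h"},
}
for_a = effective_settings(defaults, config, pod_color="a")
for_b = effective_settings(defaults, config, pod_color="b")
```

In this sketch only pods of color a see MT_WARM_UP_PERIOD, while both colors pick up the pinned METRICTANK_VERSION.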

Overrides

The overrides, overridesA and overridesB can contain parameters to override default values.

| Key | Explanation |
| --- | --- |
| MT_* | override for any metrictank environment variable (for all pods, read/write/query as appropriate) |
| MTIMPORTER_* | override for any metrictank environment variable which should only be applied in the importer tool's pod |
| GW_* | override for any tsdb-gw environment variable |
| GRAPHITE_* | override for any graphite environment variable |
| MTIMPORTER_NUM_PARTITIONS | number of partitions the importer writes to; by default this is generated based on the instance plan |
| MTIMPORTER_TTLS | TTLs as a comma-separated list; the default value is generated based on storage-schemas.conf |
| MTIMPORTER_HTTP_ENDPOINT | http endpoint the importer pod will listen on, in the format <host>:<port> |
| MTIMPORTER_PARTITION_SCHEME | partition scheme, such as bySeries |
| MTIMPORTER_URI_PATH | the URI at which the importer should expect posts, such as /chunks |
| MEM_METRICTANK | for read/write: mem request in MiB (limit is 10x this), for query nodes: mem limit in MiB (requested is always 25% of this) (default: plan.mem * scalefactor) |
| MEM_METRICTANK_READ | mem request in MiB (limit is 10x this) for read nodes; overrides MEM_METRICTANK |
| MEM_METRICTANK_WRITE | mem request in MiB (limit is 10x this) for write nodes; overrides MEM_METRICTANK |
| MEM_METRICTANK_QUERY | mem request in MiB (limit is 10x this) for query nodes; overrides MEM_METRICTANK |
| CPU_METRICTANK | cpu request in millis (1/1000 of a core) (for query deployments it is 25% of this) (default: plan.cpu * scalefactor) |
| CPU_METRICTANK_READ | cpu request in millis (1/1000 of a core) for read nodes; overrides CPU_METRICTANK |
| CPU_METRICTANK_WRITE | cpu request in millis (1/1000 of a core) for write nodes; overrides CPU_METRICTANK |
| CPU_METRICTANK_QUERY | cpu request in millis (1/1000 of a core) for query nodes; overrides CPU_METRICTANK |
| NODES_METRICTANK_QUERY | number of query deployments (default: partitions/8) |
| NODES_METRICTANK_READ | number of read deployments (default: partitions/8), ignored by partitioning policy ConstantWithNoOverlap |
| NODES_METRICTANK_WRITE | number of write deployments (default: partitions/8), ignored by partitioning policy ConstantWithNoOverlap |
| MEM_TSDB | mem request for tsdb-gw in MiB (limit is 5x this) (default: 250 + 100 * (plan.scalefactor - 1)) |
| CPU_TSDB | cpu request in millis (1/1000 of a core) (default: plan.cpu * scalefactor) |
| MEM_GRAPHITE | mem request for graphite in MiB (default: 250 + 100 * (plan.scalefactor - 1)) |
| CPU_GRAPHITE | cpu request in millis (1/1000 of a core) for graphite (default: plan.cpu * scalefactor/2) |
| NODES_TSDB | number of tsdb-gw pods (default: max(2, partitions/8)) |
| NODES_GRAPHITE | number of graphite pods (default: max(2, partitions/8)) |
| CARBON_ORG_ID | which org id to set for the carbon-relay-ng container (it uses this for auth when submitting data to tsdb-gw) (default: same as instance org id) |
| CARBON_RELAY_IMAGE | docker image for carbon-relay-ng (default: see hm-api config map) |
| CARBON_RELAY_VERSION | docker image version for carbon-relay-ng (default: see hm-api config map) |
| CARBON_RELAY_MEM_BASE | carbon-relay-ng request memory "base" in Mi (memory request will be base * max(2, partitions/8)) (default: 100) |
| CARBON_RELAY_CPU_BASE | carbon-relay-ng request CPU "base" in millis (cpu request will be base * max(2, partitions/8)) (default: 8) |
| MEM_CARBON_RELAY | carbon-relay-ng request memory override in Mi (limit will be 5x this) |
| CPU_CARBON_RELAY | carbon-relay-ng request CPU override in millis |
| GRAPHITE_IMAGE | docker image for graphite (default: see hm-api config map) |
| GRAPHITE_VERSION | docker image version for graphite (default: see hm-api config map) |
| HOSTED_METRICS_API_IMAGE | docker image for hosted-metrics-api (default: see hm-api config map) |
| METRICTANK_IMAGE | docker image for metrictank (default: see hm-api config map) |
| METRICTANK_VERSION | docker image version for metrictank (default: see hm-api config map) |
| MTIMPORTER_IMAGE | docker image for metrictank-importer (default: see hm-api config map) |
| MTIMPORTER_VERSION | docker image version for metrictank-importer (default: see hm-api config map) |
| STATSD_IMAGE | docker image for statsd (default: see hm-api config map) |
| STATSD_VERSION | docker image version for statsd (default: see hm-api config map) |
| TSDB_GW_IMAGE | docker image for tsdb-gw (default: see hm-api config map) |
| TSDB_GW_VERSION | docker image version for tsdb-gw (default: see hm-api config map) |
| ALIAS | alias hostname for ingress (default: see hm-api config map) |
| CLUSTER | used in metrics and jaeger traces |
| POD_SERVICE_ACCOUNT | name of the ServiceAccount to use to run pods |
| DOMAIN | domain to append to instance names (default: see hm-api config map) |
| SSL_SECRET_NAME | name of k8s secret to load for TLS |
| CARBON_TSDB_ADDR | value to set carbon-relay-ng TSDB_ADDR value to |
| CASS_KS_REPLICATION_FACTOR | replication factor for Cassandra (default 3) |
| KAFKA_REPLICATION_FACTOR | replication factor for Kafka (default 3) |
| MEM_REQ_IDX_PRUNING | mem request in MiB for the optional index pruning cron job |
| MEM_LIM_IDX_PRUNING | mem limit in MiB for the optional index pruning cron job |
| CPU_REQ_IDX_PRUNING | cpu request in millis (1/1000 of a core) for the optional index pruning cron job |
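
Several of the defaults above are derived from the plan. The following sketch evaluates a few of those formulas for a given plan. The function name and the lowercase result keys are hypothetical, and treating partitions/8 as integer division is an assumption.

```python
def default_resources(partitions, plan_cpu, plan_mem, scale_factor=1):
    """Evaluate some documented default formulas for one instance plan."""
    per_eight = max(2, partitions // 8)  # max(2, partitions/8), assumed integer division
    mem_metrictank = plan_mem * scale_factor
    mem_tsdb = 250 + 100 * (scale_factor - 1)
    return {
        "mem_metrictank_request": mem_metrictank,
        "mem_metrictank_limit": mem_metrictank * 10,  # read/write limit is 10x the request
        "cpu_metrictank_request": plan_cpu * scale_factor,
        "mem_tsdb_request": mem_tsdb,
        "mem_tsdb_limit": mem_tsdb * 5,               # tsdb-gw limit is 5x the request
        "mem_graphite_request": 250 + 100 * (scale_factor - 1),
        "cpu_graphite_request": plan_cpu * scale_factor / 2,
        "nodes_tsdb": per_eight,
        "nodes_graphite": per_eight,
        "carbon_relay_mem_request": 100 * per_eight,  # CARBON_RELAY_MEM_BASE default is 100
        "carbon_relay_cpu_request": 8 * per_eight,    # CARBON_RELAY_CPU_BASE default is 8
    }


medium = default_resources(partitions=32, plan_cpu=100, plan_mem=1000)
small = default_resources(partitions=8, plan_cpu=50, plan_mem=500)
```

Setting any of the corresponding override keys (for example MEM_TSDB or NODES_GRAPHITE) replaces the derived value for that instance.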