Concepts

Read / Write Metrictank

HM-API creates separate Metrictank deployments for writing data and for reading data. The write Metrictanks never handle queries; they only consume data from Kafka and persist it to the backend store (Cassandra / BigTable). The read Metrictanks consume the same data from Kafka, but instead of persisting it they keep in memory only as much as might not have been written to the backend store yet. All user queries are handled by the read Metrictanks, which may query each other or fetch data from the backend store if they can’t serve a query from memory.

Colors

HM-API supports colored deployments of Metrictank read clusters. The two supported colors are called a and b. The colored deployments are completely separate Metrictank read clusters that can run in parallel and with different configuration values, but only one color can be active at any point in time (active means that it receives queries). This makes it possible to start a second cluster (another color) of read Metrictanks with a different configuration while the first one keeps running and stays available to handle queries. Once the second color is fully started and ready, it can be activated and will then receive all user queries. If the new color performs as expected, the old color can be shut down; otherwise, switching back to the old color is very fast.

To give the two colors different configurations, edit the corresponding instance overrides, overridesA and overridesB, in the instance config, which are documented here. Note that the same instance config map also has a color property, which shows the currently active color.

A common scenario would look like this:

  • Check which color is currently active by looking at the instance config. Let’s say it is a.
  • Edit the instance config and add overrides to the overridesB property to define a different configuration for the other color (b).
  • Call the standby-on endpoint to start the currently inactive color (b) as a standby; this endpoint is documented here. This creates new read Metrictank deployments, which get the configuration defined in overridesB. Note that this will consume additional resources on your Kubernetes cluster.
  • Wait for the Metrictanks of color b to become ready. Their status can be checked with kubectl -n metrictank get pods -l app=metrictank,color=b,instance=<instanceId>
  • Once the Metrictanks of color b are ready, call the HM-API endpoint to switch colors; the endpoint is documented here. This modifies the configuration of the corresponding Kubernetes service resources to send all user traffic to the b-colored Metrictanks.
  • If you are happy with how color b performs, color a can be shut down. Since the colors have been switched, color a is now the “standby”, so you can call the endpoint to shut down the standby, which is documented here.
  • If you are not happy with color b and want to switch back to color a, just call the switchcolor endpoint one more time.
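
The scenario above can be read as a small state machine. The following is a minimal sketch of that reading, not the HM-API implementation: the class ReadClusterColors and its method names are hypothetical and only mirror the standby-on, switch-color and shut-down-standby steps described in the list. It assumes that switching is only valid once the other color is running.

```python
from dataclasses import dataclass, field


def other(color: str) -> str:
    """Return the opposite color; only 'a' and 'b' are supported."""
    return "b" if color == "a" else "a"


@dataclass
class ReadClusterColors:
    """Toy model of the colored read clusters (hypothetical, for illustration)."""

    active: str = "a"
    running: set = field(init=False)

    def __post_init__(self) -> None:
        # Initially only the active color has read deployments.
        self.running = {self.active}

    def standby_on(self) -> None:
        # Start the inactive color; it runs in parallel but receives no queries.
        self.running.add(other(self.active))

    def switch_color(self) -> None:
        # Send all queries to the other color, which must already be running.
        target = other(self.active)
        if target not in self.running:
            raise RuntimeError(f"color {target} is not running")
        self.active = target

    def standby_off(self) -> None:
        # Shut down whichever color is currently not receiving queries.
        self.running.discard(other(self.active))


colors = ReadClusterColors(active="a")
colors.standby_on()    # b starts with the overridesB configuration
colors.switch_color()  # b now receives the queries, a becomes the standby
colors.standby_off()   # a is shut down
print(colors.active, sorted(colors.running))  # prints: b ['b']
```

Calling switch_color() a second time before standby_off() models the fast switch back to the old color, because both clusters are still running at that point.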

Rollout Concurrency

The rollout concurrency defines the maximum number of Metrictank deployments that may have pods in a non-ready state at the same time during a rollout. The default value is 2. For example, in a typical scenario with 4 read Metrictank deployments, the rollout first updates the ones with id 0 and 1, then waits until all pods of at least one of them are ready again before it continues updating more deployments. Once all the read Metrictanks have been updated, it does the same with the write ones. This makes the rollout slower, but it consumes fewer temporary resources on the Kubernetes cluster and puts less load on Kafka. Every time a new Metrictank starts up it needs to replay its backlog from Kafka, so updating all Metrictank deployments at once could overload Kafka and slow down data ingestion.

Every instance configuration has a property called rolloutConcurrency, which defaults to 2. When updating an instance, this property can be set to control the trade-off between deployment speed and smoothing the spike in resource usage on Kafka and the whole Kubernetes cluster. The property is documented here.
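
To make the effect of rolloutConcurrency concrete, here is a minimal simulation of the ordering described above. It is a sketch, not the actual rollout code: the function rollout_order is hypothetical, and it assumes that the deployment which started updating first is also the first to become ready again.

```python
from collections import deque


def rollout_order(deployment_ids, concurrency=2):
    """Return the sequence of ("update", id) and ("ready", id) events.

    Assumption: in-flight deployments become ready in the order they
    started updating. Real readiness depends on the Kafka backlog replay.
    """
    if concurrency < 1:
        raise ValueError("concurrency must be at least 1")
    pending = deque(deployment_ids)
    updating = deque()
    events = []
    while pending or updating:
        # Start updates until the concurrency limit is reached.
        while pending and len(updating) < concurrency:
            dep = pending.popleft()
            updating.append(dep)
            events.append(("update", dep))
        # Wait for one in-flight deployment to become ready again.
        events.append(("ready", updating.popleft()))
    return events


for event in rollout_order([0, 1, 2, 3], concurrency=2):
    print(event)
```

With 4 read deployments and a concurrency of 2, deployments 0 and 1 are updated first, and deployment 2 only starts once deployment 0 is ready. A higher value finishes the rollout sooner but lets more Metrictanks replay their Kafka backlog at the same time.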

Partitioning Policy

Each Metrictank handles multiple partitions. Partition assignment to Metrictanks is defined by the partitioning policy chosen for the instance.

A partitioning policy can choose to ignore the number of read/write Metrictank deployments requested via NODES_METRICTANK_READ/NODES_METRICTANK_WRITE.

Available partitioning policies are:

ConstantWithNoOverlap

Option               Explanation                                              Mandatory
partitionsPerReader  number of partitions per Metrictank read deployment      yes
partitionsPerWriter  number of partitions per Metrictank write deployment     yes
replicationFactor    number of read deployments that contain each partition   yes

By default, the ConstantWithNoOverlap policy is used with 8 partitions per Metrictank read deployment, 8 partitions per write deployment, and a replicationFactor of 2. The replicationFactor only affects read deployments and is what provides high availability.
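
As an illustration of how these three options interact, the sketch below derives a possible set of read and write deployments from a total partition count. This is an assumption-based reading of the option descriptions, not the policy's actual algorithm: the function names, the contiguous chunking of partitions, and the order in which replicas are listed are all hypothetical.

```python
def chunk_partitions(total_partitions, partitions_per_deployment):
    """Split partitions 0..total-1 into contiguous, non-overlapping chunks."""
    return [
        list(range(start, min(start + partitions_per_deployment, total_partitions)))
        for start in range(0, total_partitions, partitions_per_deployment)
    ]


def read_assignment(total_partitions, partitions_per_reader=8, replication_factor=2):
    """One entry per read deployment; each chunk appears replication_factor times."""
    chunks = chunk_partitions(total_partitions, partitions_per_reader)
    return [chunk for chunk in chunks for _ in range(replication_factor)]


def write_assignment(total_partitions, partitions_per_writer=8):
    """One entry per write deployment; writers are assumed not to be replicated."""
    return chunk_partitions(total_partitions, partitions_per_writer)


readers = read_assignment(16)
writers = write_assignment(16)
print(len(readers), len(writers))  # prints: 4 2
```

Under these assumptions, 16 partitions with the default values yield 4 read deployments (two per chunk of 8 partitions) and 2 write deployments, regardless of the deployment counts requested via NODES_METRICTANK_READ/NODES_METRICTANK_WRITE.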

Importer Utility

Node Selectors