<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Storage on Grafana Labs</title><link>https://grafana.com/docs/enterprise-logs/v1.9.x/loki/operations/storage/</link><description>Recent content in Storage on Grafana Labs</description><generator>Hugo -- gohugo.io</generator><language>en</language><atom:link href="/docs/enterprise-logs/v1.9.x/loki/operations/storage/index.xml" rel="self" type="application/rss+xml"/><item><title>Log Entry Deletion</title><link>https://grafana.com/docs/enterprise-logs/v1.9.x/loki/operations/storage/logs-deletion/</link><pubDate>Tue, 16 Jul 2024 15:42:20 +0000</pubDate><guid>https://grafana.com/docs/enterprise-logs/v1.9.x/loki/operations/storage/logs-deletion/</guid><content><![CDATA[&lt;h1 id=&#34;log-entry-deletion&#34;&gt;Log Entry Deletion&lt;/h1&gt;
&lt;p&gt;Grafana Loki supports the deletion of log entries from a specified stream.
Only log entries that fall within a specified time window and match an optional line filter are deleted.&lt;/p&gt;
&lt;p&gt;Log entry deletion is supported &lt;em&gt;only&lt;/em&gt; when the BoltDB Shipper is configured for the index store.&lt;/p&gt;
&lt;p&gt;The compactor component exposes REST &lt;a href=&#34;../../../api/#compactor&#34;&gt;endpoints&lt;/a&gt; that process delete requests.
A request to the endpoint specifies the streams and the time window.
The deletion of the log entries takes place after a configurable cancellation time period expires.&lt;/p&gt;
&lt;p&gt;Log entry deletion relies on configuration of the custom logs retention workflow as defined for the &lt;a href=&#34;../retention#compactor&#34;&gt;compactor&lt;/a&gt;. The compactor looks at unprocessed requests which are past their cancellation period to decide whether a chunk is to be deleted or not.&lt;/p&gt;
&lt;h2 id=&#34;configuration&#34;&gt;Configuration&lt;/h2&gt;
&lt;p&gt;Enable log entry deletion by setting &lt;code&gt;retention_enabled&lt;/code&gt; to true and &lt;code&gt;deletion_mode&lt;/code&gt; to &lt;code&gt;filter-only&lt;/code&gt; or &lt;code&gt;filter-and-delete&lt;/code&gt; in the compactor&amp;rsquo;s configuration.&lt;/p&gt;
&lt;p&gt;With &lt;code&gt;filter-only&lt;/code&gt;, log lines matching the query in the delete request are filtered out when querying Loki. They are not removed from storage.
With &lt;code&gt;filter-and-delete&lt;/code&gt;, log lines matching the query in the delete request are filtered out when querying Loki, and they are also removed from storage.&lt;/p&gt;
&lt;p&gt;A delete request may be canceled within a configurable cancellation period. Set the &lt;code&gt;delete_request_cancel_period&lt;/code&gt; in the compactor&amp;rsquo;s YAML configuration or on the command line when invoking Loki. Its default value is 24h.&lt;/p&gt;
&lt;p&gt;Access to the deletion API can be enabled per tenant via the &lt;code&gt;allow_deletes&lt;/code&gt; setting.&lt;/p&gt;
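&lt;p&gt;As a minimal sketch of the settings described above, the following fragment enables deletion with removal from storage and keeps the default cancellation period. The placement of &lt;code&gt;allow_deletes&lt;/code&gt; under &lt;code&gt;limits_config&lt;/code&gt; is an assumption; check the configuration reference for your version before relying on it.&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;compactor:
  # Deletion builds on the compactor retention workflow.
  retention_enabled: true
  # Use filter-only to hide matching lines at query time without removing them from storage.
  deletion_mode: filter-and-delete
  # Window in which a delete request can still be canceled (24h is the documented default).
  delete_request_cancel_period: 24h

# Assumption: allow_deletes is set in limits_config (or in a per-tenant override).
limits_config:
  allow_deletes: true&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;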
]]></content><description>&lt;h1 id="log-entry-deletion">Log Entry Deletion&lt;/h1>
&lt;p>Grafana Loki supports the deletion of log entries from a specified stream.
Log entries that fall within a specified time window and match an optional line filter are those that will be deleted.&lt;/p></description></item><item><title>Filesystem</title><link>https://grafana.com/docs/enterprise-logs/v1.9.x/loki/operations/storage/filesystem/</link><pubDate>Tue, 16 Jul 2024 15:42:20 +0000</pubDate><guid>https://grafana.com/docs/enterprise-logs/v1.9.x/loki/operations/storage/filesystem/</guid><content><![CDATA[&lt;h1 id=&#34;filesystem-object-store&#34;&gt;Filesystem Object Store&lt;/h1&gt;
&lt;p&gt;The filesystem object store is the easiest way to get started with Grafana Loki, but this approach has both pros and cons.&lt;/p&gt;
&lt;p&gt;It simply stores all the objects (chunks) in the specified directory:&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;YAML&lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;storage_config:
  filesystem:
    directory: /tmp/loki/&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;A folder is created for every tenant, and all the chunks for that tenant are stored in it.&lt;/p&gt;
&lt;p&gt;If Loki is run in single-tenant mode, all the chunks are put in a folder named &lt;code&gt;fake&lt;/code&gt;, which is the synthesized tenant name used for single-tenant mode.&lt;/p&gt;
&lt;h2 id=&#34;pros&#34;&gt;Pros&lt;/h2&gt;
&lt;p&gt;Very simple: no additional software is required to run Loki when paired with the BoltDB index store.&lt;/p&gt;
&lt;p&gt;Great for low-volume applications, proofs of concept, and just playing around with Loki.&lt;/p&gt;
&lt;h2 id=&#34;cons&#34;&gt;Cons&lt;/h2&gt;
&lt;h3 id=&#34;scaling&#34;&gt;Scaling&lt;/h3&gt;
&lt;p&gt;At some point there is a limit to how many chunks can be stored in a single directory, for example see &lt;a href=&#34;https://github.com/grafana/loki/issues/1502&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;issue #1502&lt;/a&gt; which explains how a Loki user ran into a strange error with about &lt;strong&gt;5.5 million chunk files&lt;/strong&gt; in their file store (and also a workaround for the problem).&lt;/p&gt;
&lt;p&gt;However, you can reduce the number of chunks flushed by keeping the number of streams low (remember that Loki writes a chunk per stream) and by tuning settings such as &lt;code&gt;chunk_target_size&lt;/code&gt; (around 1MB), &lt;code&gt;max_chunk_age&lt;/code&gt; (increase beyond 1h), and &lt;code&gt;chunk_idle_period&lt;/code&gt; (increase to match &lt;code&gt;max_chunk_age&lt;/code&gt;), at the cost of higher memory consumption.&lt;/p&gt;
&lt;p&gt;It&amp;rsquo;s still very possible to store terabytes of log data with the filestore, but realize there are limitations to how many files a filesystem will want to store in a single directory.&lt;/p&gt;
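&lt;p&gt;A hedged sketch of the chunk tuning mentioned above might look like the following. It assumes these settings live in the &lt;code&gt;ingester&lt;/code&gt; block and that &lt;code&gt;chunk_target_size&lt;/code&gt; is given in bytes; the values are illustrative, not recommendations.&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;ingester:
  # Roughly 1MB per chunk (assumed unit: bytes).
  chunk_target_size: 1048576
  # Keep chunks in memory longer than 1h before flushing (illustrative value).
  max_chunk_age: 2h
  # Match max_chunk_age so idle streams are not flushed early.
  chunk_idle_period: 2h&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;Larger and older chunks mean fewer files on disk, but more data held in ingester memory between flushes.&lt;/p&gt;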
&lt;h3 id=&#34;durability&#34;&gt;Durability&lt;/h3&gt;
&lt;p&gt;The durability of the objects is at the mercy of the filesystem itself, whereas other object stores like S3/GCS do a lot behind the scenes to offer extremely high durability for your data.&lt;/p&gt;
&lt;h3 id=&#34;high-availability&#34;&gt;High Availability&lt;/h3&gt;
&lt;p&gt;Running Loki as a cluster is not possible with the filesystem store unless the filesystem is shared in some fashion (NFS, for example). However, using a shared filesystem is likely to be a bad experience with Loki, just as it is for almost every other application.&lt;/p&gt;
]]></content><description>&lt;h1 id="filesystem-object-store">Filesystem Object Store&lt;/h1>
&lt;p>The filesystem object store is the easiest to get started with Grafana Loki but there are some pros/cons to this approach.&lt;/p>
&lt;p>Very simply it stores all the objects (chunks) in the specified directory:&lt;/p></description></item><item><title>Retention</title><link>https://grafana.com/docs/enterprise-logs/v1.9.x/loki/operations/storage/retention/</link><pubDate>Tue, 16 Jul 2024 15:42:20 +0000</pubDate><guid>https://grafana.com/docs/enterprise-logs/v1.9.x/loki/operations/storage/retention/</guid><content><![CDATA[&lt;h1 id=&#34;grafana-loki-storage-retention&#34;&gt;Grafana Loki Storage Retention&lt;/h1&gt;
&lt;p&gt;Retention in Grafana Loki is achieved either through the &lt;a href=&#34;#table-manager&#34;&gt;Table Manager&lt;/a&gt; or the &lt;a href=&#34;#compactor&#34;&gt;Compactor&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Retention through the &lt;a href=&#34;../table-manager/&#34;&gt;Table Manager&lt;/a&gt; is achieved by relying on the object store TTL feature, and will work for both &lt;a href=&#34;../boltdb-shipper&#34;&gt;boltdb-shipper&lt;/a&gt; store and chunk/index store. However retention through the &lt;a href=&#34;../boltdb-shipper#compactor&#34;&gt;Compactor&lt;/a&gt; is supported only with the &lt;a href=&#34;../boltdb-shipper&#34;&gt;boltdb-shipper&lt;/a&gt; store.&lt;/p&gt;
&lt;p&gt;The Compactor retention will become the default and have long term support. It supports more granular retention policies on per tenant and per stream use cases.&lt;/p&gt;
&lt;h2 id=&#34;compactor&#34;&gt;Compactor&lt;/h2&gt;
&lt;p&gt;The &lt;a href=&#34;../boltdb-shipper#compactor&#34;&gt;Compactor&lt;/a&gt; can deduplicate index entries. It can also apply granular retention. When applying retention with the Compactor, the &lt;a href=&#34;../table-manager/&#34;&gt;Table Manager&lt;/a&gt; is unnecessary.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Run the compactor as a singleton (a single instance).&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;Compaction and retention are idempotent. If the compactor restarts, it will continue from where it left off.&lt;/p&gt;
&lt;p&gt;The Compactor loops to apply compaction and retention at every &lt;code&gt;compaction_interval&lt;/code&gt;, or as soon as possible if running behind.&lt;/p&gt;
&lt;p&gt;The compactor&amp;rsquo;s algorithm to update the index:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;For each table within each day:
&lt;ul&gt;
&lt;li&gt;Compact the table into a single index file.&lt;/li&gt;
&lt;li&gt;Traverse the entire index. Use the tenant configuration to identify and mark chunks that need to be removed.&lt;/li&gt;
&lt;li&gt;Remove marked chunks from the index and save their reference in a file on disk.&lt;/li&gt;
&lt;li&gt;Upload the new modified index files.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The retention algorithm is applied to the index. Chunks are not deleted while applying the retention algorithm. The chunks will be deleted by the compactor asynchronously when swept.&lt;/p&gt;
&lt;p&gt;Marked chunks are deleted only after the configured &lt;code&gt;retention_delete_delay&lt;/code&gt; has expired, because:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;boltdb-shipper indexes are refreshed from the shared store on the components that use them (querier and ruler) at a specific interval. Deleting chunks instantly could therefore leave components holding references to chunks that no longer exist, causing queries to fail. The delay gives components time to refresh their store and gracefully drop their references to those chunks.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;It provides a short window of time in which to cancel chunk deletion in the case of a configuration mistake.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Marker files (containing chunks to delete) should be stored on a persistent disk, since the disk will be the sole reference to them.&lt;/p&gt;
&lt;h3 id=&#34;retention-configuration&#34;&gt;Retention Configuration&lt;/h3&gt;
&lt;p&gt;This compactor configuration example activates retention.&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;YAML&lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;compactor:
  working_directory: /data/retention
  shared_store: gcs
  compaction_interval: 10m
  retention_enabled: true
  retention_delete_delay: 2h
  retention_delete_worker_count: 150
schema_config:
    configs:
      - from: &amp;#34;2020-07-31&amp;#34;
        index:
            period: 24h
            prefix: loki_index_
        object_store: gcs
        schema: v11
        store: boltdb-shipper
storage_config:
    boltdb_shipper:
        active_index_directory: /data/index
        cache_location: /data/boltdb-cache
        shared_store: gcs
    gcs:
        bucket_name: loki&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;blockquote&gt;
&lt;p&gt;Note that retention is only available if the index period is 24h.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;Set &lt;code&gt;retention_enabled&lt;/code&gt; to true. Without this, the Compactor will only compact tables.&lt;/p&gt;
&lt;p&gt;Define &lt;code&gt;schema_config&lt;/code&gt; and &lt;code&gt;storage_config&lt;/code&gt; to access the storage.&lt;/p&gt;
&lt;p&gt;The index period must be 24h.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;working_directory&lt;/code&gt; is the directory where marked chunks and temporary tables will be saved.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;compaction_interval&lt;/code&gt; dictates how often compaction and/or retention is applied. If the Compactor falls behind, compaction and/or retention occur as soon as possible.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;retention_delete_delay&lt;/code&gt; is the delay after which the compactor will delete marked chunks.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;retention_delete_worker_count&lt;/code&gt; specifies the maximum quantity of goroutine workers instantiated to delete chunks.&lt;/p&gt;
&lt;h4 id=&#34;configuring-the-retention-period&#34;&gt;Configuring the retention period&lt;/h4&gt;
&lt;p&gt;Retention period is configured within the &lt;a href=&#34;./../../../configuration/#limits_config&#34;&gt;&lt;code&gt;limits_config&lt;/code&gt;&lt;/a&gt; configuration section.&lt;/p&gt;
&lt;p&gt;There are two ways of setting retention policies:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;retention_period&lt;/code&gt; which is applied globally.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;retention_stream&lt;/code&gt; which is only applied to chunks matching the selector.&lt;/li&gt;
&lt;/ul&gt;
&lt;blockquote&gt;
&lt;p&gt;The minimum retention period is 24h.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;This example configures global retention:&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;YAML&lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;...
limits_config:
  retention_period: 744h
  retention_stream:
  - selector: &amp;#39;{namespace=&amp;#34;dev&amp;#34;}&amp;#39;
    priority: 1
    period: 24h
  per_tenant_override_config: /etc/overrides.yaml
...&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;Per-tenant retention can be defined in the &lt;code&gt;/etc/overrides.yaml&lt;/code&gt; file. For example:&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;YAML&lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;overrides:
    &amp;#34;29&amp;#34;:
        retention_period: 168h
        retention_stream:
        - selector: &amp;#39;{namespace=&amp;#34;prod&amp;#34;}&amp;#39;
          priority: 2
          period: 336h
        - selector: &amp;#39;{container=&amp;#34;loki&amp;#34;}&amp;#39;
          priority: 1
          period: 72h
    &amp;#34;30&amp;#34;:
        retention_stream:
        - selector: &amp;#39;{container=&amp;#34;nginx&amp;#34;}&amp;#39;
          priority: 1
          period: 24h&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;A rule to apply is selected by choosing the first in this list that matches:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;If a per-tenant &lt;code&gt;retention_stream&lt;/code&gt; matches the current stream, the highest priority is picked.&lt;/li&gt;
&lt;li&gt;If a global &lt;code&gt;retention_stream&lt;/code&gt; matches the current stream, the highest priority is picked.&lt;/li&gt;
&lt;li&gt;If a per-tenant &lt;code&gt;retention_period&lt;/code&gt; is specified, it will be applied.&lt;/li&gt;
&lt;li&gt;The global &lt;code&gt;retention_period&lt;/code&gt; will be selected if nothing else matched.&lt;/li&gt;
&lt;li&gt;If no global &lt;code&gt;retention_period&lt;/code&gt; is specified, the default value of &lt;code&gt;744h&lt;/code&gt; (31 days) retention is used.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Stream matching uses the same syntax as Prometheus label matching:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;=&lt;/code&gt;: Select labels that are exactly equal to the provided string.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;!=&lt;/code&gt;: Select labels that are not equal to the provided string.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;=~&lt;/code&gt;: Select labels that regex-match the provided string.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;!~&lt;/code&gt;: Select labels that do not regex-match the provided string.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The example configurations will set these rules:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;For all tenants except &lt;code&gt;29&lt;/code&gt; and &lt;code&gt;30&lt;/code&gt;, streams in the &lt;code&gt;dev&lt;/code&gt; namespace will have a retention period of &lt;code&gt;24h&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;For all tenants except &lt;code&gt;29&lt;/code&gt; and &lt;code&gt;30&lt;/code&gt;, streams that are not in the &lt;code&gt;dev&lt;/code&gt; namespace will have a retention period of &lt;code&gt;744h&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;For tenant &lt;code&gt;29&lt;/code&gt;:
&lt;ul&gt;
&lt;li&gt;All streams except those in the container &lt;code&gt;loki&lt;/code&gt; or in the namespace &lt;code&gt;prod&lt;/code&gt; will have a retention period of &lt;code&gt;168h&lt;/code&gt; (1 week).&lt;/li&gt;
&lt;li&gt;All streams in the &lt;code&gt;prod&lt;/code&gt; namespace will have a retention period of &lt;code&gt;336h&lt;/code&gt; (2 weeks), even if the container label is &lt;code&gt;loki&lt;/code&gt;, since the priority of the &lt;code&gt;prod&lt;/code&gt; rule is higher.&lt;/li&gt;
&lt;li&gt;Streams that have the container label &lt;code&gt;loki&lt;/code&gt; but are not in the namespace &lt;code&gt;prod&lt;/code&gt; will have a &lt;code&gt;72h&lt;/code&gt; retention period.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;For tenant &lt;code&gt;30&lt;/code&gt;:
&lt;ul&gt;
&lt;li&gt;All streams except those having the container label &lt;code&gt;nginx&lt;/code&gt; will have the global retention period of &lt;code&gt;744h&lt;/code&gt;, since there is no override specified.&lt;/li&gt;
&lt;li&gt;Streams that have the container label &lt;code&gt;nginx&lt;/code&gt; will have a retention period of &lt;code&gt;24h&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;table-manager&#34;&gt;Table Manager&lt;/h2&gt;
&lt;p&gt;In order to enable the retention support, the Table Manager needs to be
configured to enable deletions and a retention period. Refer to the
&lt;a href=&#34;../../../configuration#table_manager&#34;&gt;&lt;code&gt;table_manager&lt;/code&gt;&lt;/a&gt;
section of the Loki configuration reference for all available options.
Alternatively, the &lt;code&gt;table-manager.retention-period&lt;/code&gt; and
&lt;code&gt;table-manager.retention-deletes-enabled&lt;/code&gt; command line flags can be used. The
provided retention period needs to be a duration represented as a string that
can be parsed using the Prometheus common model &lt;a href=&#34;https://pkg.go.dev/github.com/prometheus/common/model#ParseDuration&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;ParseDuration&lt;/a&gt;. Examples: &lt;code&gt;7d&lt;/code&gt;, &lt;code&gt;1w&lt;/code&gt;, &lt;code&gt;168h&lt;/code&gt;.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;WARNING&lt;/strong&gt;: The retention period must be a multiple of the index and chunks table
&lt;code&gt;period&lt;/code&gt;, configured in the &lt;a href=&#34;../../../configuration#period_config&#34;&gt;&lt;code&gt;period_config&lt;/code&gt;&lt;/a&gt;
block. See the &lt;a href=&#34;../table-manager#retention&#34;&gt;Table Manager&lt;/a&gt; documentation for
more information.&lt;/p&gt;&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;NOTE&lt;/strong&gt;: To avoid querying of data beyond the retention period,
&lt;code&gt;max_look_back_period&lt;/code&gt; config in &lt;a href=&#34;../../../configuration#chunk_store_config&#34;&gt;&lt;code&gt;chunk_store_config&lt;/code&gt;&lt;/a&gt; must be set to a value less than or equal to
what is set in &lt;code&gt;table_manager.retention_period&lt;/code&gt;.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;When using S3 or GCS, the bucket storing the chunks needs to have the expiry
policy set correctly. For more details check
&lt;a href=&#34;https://docs.aws.amazon.com/AmazonS3/latest/dev/object-lifecycle-mgmt.html&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;S3&amp;rsquo;s documentation&lt;/a&gt;
or
&lt;a href=&#34;https://cloud.google.com/storage/docs/managing-lifecycles&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;GCS&amp;rsquo;s documentation&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Currently, the retention policy can only be set globally. A per-tenant retention
policy with an API to delete ingested logs is still under development.&lt;/p&gt;
&lt;p&gt;Since a design goal of Loki is to make storing logs cheap, a volume-based
deletion API is deprioritized. Until this feature is released, if you suddenly
must delete ingested logs, you can delete old chunks in your object store. Note,
however, that this only deletes the log content and keeps the label index
intact; you will still be able to see related labels but will be unable to
retrieve the deleted log content.&lt;/p&gt;
&lt;p&gt;For further details on the Table Manager internals, refer to the
&lt;a href=&#34;../table-manager/&#34;&gt;Table Manager&lt;/a&gt; documentation.&lt;/p&gt;
&lt;h2 id=&#34;example-configuration&#34;&gt;Example Configuration&lt;/h2&gt;
&lt;p&gt;Example configuration using GCS with a 28-day retention:&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;YAML&lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;schema_config:
  configs:
  - from: 2018-04-15
    store: bigtable
    object_store: gcs
    schema: v11
    index:
      prefix: loki_index_
      period: 168h

storage_config:
  bigtable:
    instance: BIGTABLE_INSTANCE
    project: BIGTABLE_PROJECT
  gcs:
    bucket_name: GCS_BUCKET_NAME

chunk_store_config:
  max_look_back_period: 672h

table_manager:
  retention_deletes_enabled: true
  retention_period: 672h&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
]]></content><description>&lt;h1 id="grafana-loki-storage-retention">Grafana Loki Storage Retention&lt;/h1>
&lt;p>Retention in Grafana Loki is achieved either through the &lt;a href="#table-manager">Table Manager&lt;/a> or the &lt;a href="#compactor">Compactor&lt;/a>.&lt;/p>
&lt;p>Retention through the &lt;a href="../table-manager/">Table Manager&lt;/a> is achieved by relying on the object store TTL feature, and will work for both &lt;a href="../boltdb-shipper">boltdb-shipper&lt;/a> store and chunk/index store. However retention through the &lt;a href="../boltdb-shipper#compactor">Compactor&lt;/a> is supported only with the &lt;a href="../boltdb-shipper">boltdb-shipper&lt;/a> store.&lt;/p></description></item><item><title>Single Store (boltdb-shipper)</title><link>https://grafana.com/docs/enterprise-logs/v1.9.x/loki/operations/storage/boltdb-shipper/</link><pubDate>Tue, 16 Jul 2024 15:42:20 +0000</pubDate><guid>https://grafana.com/docs/enterprise-logs/v1.9.x/loki/operations/storage/boltdb-shipper/</guid><content><![CDATA[&lt;h1 id=&#34;single-store-loki-boltdb-shipper-index-type&#34;&gt;Single Store Loki (boltdb-shipper index type)&lt;/h1&gt;
&lt;p&gt;BoltDB Shipper lets you run Grafana Loki without any dependency on a NoSQL store for storing the index.
Instead, it stores the index locally in BoltDB files and keeps shipping those files to a shared object store, i.e., the same object store that is used for storing chunks.
It also keeps syncing BoltDB files from the shared object store to a configured local directory to get index entries created by other services of the same Loki cluster.
This lets you run Loki with one less dependency and also saves storage costs, since object stores are likely to be much cheaper than a hosted NoSQL store or a self-hosted instance of Cassandra.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; BoltDB Shipper works best with 24h periodic index files. An index period of 24h is required for either active or upcoming usage of boltdb-shipper.
If boltdb-shipper has already created index files with a 7-day period and you want to retain the previous data, add a new schema config that uses boltdb-shipper with a future date and the index file period set to 24h.&lt;/p&gt;
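&lt;p&gt;The migration described in the note could be sketched as follows. Both &lt;code&gt;from&lt;/code&gt; dates are hypothetical placeholders; the second entry must start on a date that is still in the future when you deploy it.&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;schema_config:
  configs:
    # Existing entry: index files were created with a 7-day (168h) period.
    - from: 2018-04-15
      store: boltdb-shipper
      object_store: gcs
      schema: v11
      index:
        prefix: loki_index_
        period: 168h
    # New entry: takes effect on a future date (hypothetical) with a 24h period.
    - from: 2030-01-01
      store: boltdb-shipper
      object_store: gcs
      schema: v11
      index:
        prefix: loki_index_
        period: 24h&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;Keeping the older entry in place means data written before the switch remains readable under its original period.&lt;/p&gt;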
&lt;h2 id=&#34;example-configuration&#34;&gt;Example Configuration&lt;/h2&gt;
&lt;p&gt;Example configuration with GCS:&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;YAML&lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;schema_config:
  configs:
    - from: 2018-04-15
      store: boltdb-shipper
      object_store: gcs
      schema: v11
      index:
        prefix: loki_index_
        period: 24h

storage_config:
  gcs:
    bucket_name: GCS_BUCKET_NAME

  boltdb_shipper:
    active_index_directory: /loki/index
    shared_store: gcs
    cache_location: /loki/boltdb-cache&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;This would run Loki with BoltDB Shipper storing BoltDB files locally at &lt;code&gt;/loki/index&lt;/code&gt; and chunks in the configured &lt;code&gt;GCS_BUCKET_NAME&lt;/code&gt;.
It would also keep shipping BoltDB files periodically to the same configured bucket,
and keep downloading BoltDB files that other ingesters uploaded to the shared bucket into the local &lt;code&gt;/loki/boltdb-cache&lt;/code&gt; folder.&lt;/p&gt;
&lt;h2 id=&#34;operational-details&#34;&gt;Operational Details&lt;/h2&gt;
&lt;p&gt;Loki can be configured to run as a single vertically scaled instance, as a cluster of horizontally scaled single-binary instances (each running all Loki services), or in microservices mode, with each instance running just one of the services.
For reads and writes, Ingesters write the index and chunks to the stores, and Queriers read the index and chunks from the stores to serve requests.&lt;/p&gt;
&lt;p&gt;Before we get into more details, it is important to understand how Loki manages the index in stores. Loki shards the index by the configured period, which defaults to seven days, i.e., with table-based stores like Bigtable/Cassandra/DynamoDB there would be a separate table per week containing the index for that week.
In the case of BoltDB Shipper, a table is defined by a collection of many smaller BoltDB files, each file storing just 15 minutes worth of index. Tables created per day are identified by a configured &lt;code&gt;prefix_&lt;/code&gt; &#43; &lt;code&gt;&amp;lt;period-number-since-epoch&amp;gt;&lt;/code&gt;.
Here, &lt;code&gt;&amp;lt;period-number-since-epoch&amp;gt;&lt;/code&gt; for boltdb-shipper is the day number since the epoch.
For example, if you have a prefix set to &lt;code&gt;loki_index_&lt;/code&gt; and a write request comes in on 20th April 2020, it would be stored in a table named &lt;code&gt;loki_index_18372&lt;/code&gt;, because &lt;code&gt;18372&lt;/code&gt; whole days have elapsed since the epoch at the start of that day (the Unix timestamp 1587340800 divided by 86400 seconds per day).
Since sharding the index creates multiple files when using BoltDB, BoltDB Shipper creates a folder per day, adds the files for that day to that folder, and names those files after the ingesters which created them.&lt;/p&gt;
&lt;p&gt;To reduce file size, which speeds up transfers and lowers storage costs, the files are compressed with gzip before they are stored.&lt;/p&gt;
&lt;p&gt;To show what BoltDB files in the shared object store look like, consider 2 ingesters named &lt;code&gt;ingester-0&lt;/code&gt; and &lt;code&gt;ingester-1&lt;/code&gt; running in a Loki cluster,
both having shipped files for days &lt;code&gt;18371&lt;/code&gt; and &lt;code&gt;18372&lt;/code&gt; with the prefix &lt;code&gt;loki_index_&lt;/code&gt;. The files would look like this:&lt;/p&gt;

&lt;div class=&#34;code-snippet code-snippet__mini&#34;&gt;&lt;div class=&#34;lang-toolbar__mini&#34;&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet code-snippet__border&#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-none&#34;&gt;└── index
    ├── loki_index_18371
    │   ├── ingester-0-1587254400.gz
    │   └── ingester-1-1587255300.gz
    │   ...
    └── loki_index_18372
        ├── ingester-0-1587254400.gz
        └── ingester-1-1587254400.gz
        ...&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; A timestamp is also added to each file name to randomize the names, which avoids overwriting files when Ingesters with the same name run without persistent storage.&lt;/p&gt;
&lt;p&gt;The following sections describe in more depth how Ingesters and Queriers work when running with BoltDB Shipper.&lt;/p&gt;
&lt;h3 id=&#34;ingesters&#34;&gt;Ingesters&lt;/h3&gt;
&lt;p&gt;Ingesters keep writing the index to BoltDB files in &lt;code&gt;active_index_directory&lt;/code&gt;, and BoltDB Shipper checks that directory for new and updated files every 15 minutes and uploads them to the shared object store.
When running Loki in clustered mode, multiple ingesters may be serving write requests, so each of them generates BoltDB files locally.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; To avoid losing index data when an Ingester crashes, we recommend running Ingesters as a StatefulSet (when using Kubernetes) with persistent storage for the index files.&lt;/p&gt;
&lt;p&gt;Another important detail is that flushed chunks are available for reads in the object store immediately, whereas their index is not, because BoltDB Shipper uploads index files only every 15 minutes.
To cover this gap, Ingesters expose an RPC that lets Queriers query the Ingester&amp;rsquo;s local index for chunks that were recently flushed but whose index might not yet be available to Queriers.
For all queries that require chunks to be read from the store, Queriers also query Ingesters over RPC for the IDs of recently flushed chunks, to avoid missing any logs from query results.&lt;/p&gt;
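&lt;p&gt;A minimal sketch of the ingester-side settings described above. The directory path and the choice of GCS are assumptions made for illustration; only the option names come from the Loki configuration.&lt;/p&gt;
&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;storage_config:
  boltdb_shipper:
    # Local directory where ingesters write BoltDB index files.
    # Assumed path; place it on persistent storage so a crash does not lose the index.
    active_index_directory: /loki/index
    # Object store the index files are shipped to (assumed to be GCS here).
    shared_store: gcs&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;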
&lt;h3 id=&#34;queriers&#34;&gt;Queriers&lt;/h3&gt;
&lt;p&gt;To avoid running Queriers as a StatefulSet with persistent storage, we recommend running an Index Gateway. An Index Gateway will download and synchronize the index, and it will serve it over gRPC to Queriers and Rulers.&lt;/p&gt;
&lt;p&gt;Queriers lazily load BoltDB files from the shared object store into the configured &lt;code&gt;cache_location&lt;/code&gt;.
When a querier receives a read request, the query range from the request is resolved to period numbers, and all the files for those period numbers are downloaded to &lt;code&gt;cache_location&lt;/code&gt; if they are not already there.
Once the files for a period have been downloaded, the querier keeps checking the shared object store for updates and downloads them every 5 minutes by default.
The frequency of these checks can be configured with &lt;code&gt;resync_interval&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;To avoid keeping downloaded index files forever, they have a TTL that defaults to 24 hours: if the index files for a period are not used for 24 hours, they are removed from the cache location.
The TTL can be configured with &lt;code&gt;cache_ttl&lt;/code&gt;.&lt;/p&gt;
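&lt;p&gt;The querier-side options mentioned above could be set together as in the following sketch. The cache path is an assumption; the two durations simply restate the defaults given in the text.&lt;/p&gt;
&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;storage_config:
  boltdb_shipper:
    cache_location: /loki/boltdb-cache   # assumed local path for downloaded index files
    resync_interval: 5m                  # how often to check the object store for updates
    cache_ttl: 24h                       # remove index files not used for this long&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;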
&lt;p&gt;Within Kubernetes, if you are not using an Index Gateway, we recommend running Queriers as a StatefulSet with persistent storage for downloading and querying index files. This will obtain better read performance, and it will avoid using node disk.&lt;/p&gt;
&lt;h3 id=&#34;index-gateway&#34;&gt;Index Gateway&lt;/h3&gt;
&lt;p&gt;An Index Gateway downloads and synchronizes the BoltDB index from the Object Storage in order to serve index queries to the Queriers and Rulers over gRPC.
This avoids running Queriers and Rulers with a disk for persistence. Disks can become costly in a big cluster.&lt;/p&gt;
&lt;p&gt;To run an Index Gateway, configure &lt;a href=&#34;../../../configuration/#storage_config&#34;&gt;StorageConfig&lt;/a&gt; and set the &lt;code&gt;-target&lt;/code&gt; CLI flag to &lt;code&gt;index-gateway&lt;/code&gt;.
To connect Queriers and Rulers to the Index Gateway, set the address (with gRPC port) of the Index Gateway with the &lt;code&gt;-boltdb.shipper.index-gateway-client.server-address&lt;/code&gt; CLI flag or its equivalent YAML value under &lt;a href=&#34;../../../configuration/#storage_config&#34;&gt;StorageConfig&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;When using the Index Gateway within Kubernetes, we recommend using a StatefulSet with persistent storage for downloading and querying index files. This can obtain better read performance, avoids &lt;a href=&#34;https://en.wikipedia.org/wiki/Cloud_computing_issues#Performance_interference_and_noisy_neighbors&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;noisy neighbor problems&lt;/a&gt; by not using the node disk, and avoids the time consuming index downloading step on startup after rescheduling to a new node.&lt;/p&gt;
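&lt;p&gt;As a sketch of the YAML equivalent of the &lt;code&gt;-boltdb.shipper.index-gateway-client.server-address&lt;/code&gt; flag, Queriers and Rulers might be pointed at the Index Gateway as follows. The host name, namespace, and port are hypothetical and must be replaced with the gRPC address of your own Index Gateway.&lt;/p&gt;
&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;storage_config:
  boltdb_shipper:
    index_gateway_client:
      # Hypothetical address; replace with the gRPC address and port of your Index Gateway.
      server_address: dns:///index-gateway.loki.svc.cluster.local:9095&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;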
&lt;h3 id=&#34;write-deduplication-disabled&#34;&gt;Write Deduplication disabled&lt;/h3&gt;
&lt;p&gt;Loki deduplicates writes of chunks and index entries using the Chunks and WriteDedupe caches respectively, configured with &lt;a href=&#34;../../../configuration/#chunk_store_config&#34;&gt;ChunkStoreConfig&lt;/a&gt;.
The problem with write deduplication when using &lt;code&gt;boltdb-shipper&lt;/code&gt; is that ingesters upload boltdb files only periodically to make them available to the other services, so there is a brief period in which some services have not yet received the updated index.
As a result, if the ingester that first wrote the chunks and index goes down, and all the other ingesters in the replication scheme skipped writing those chunks and index because of deduplication, those logs would be missing from query responses, since the only ingester that had the index is down.
This problem would occur even during rollouts, which are quite common.&lt;/p&gt;
&lt;p&gt;To avoid this, Loki disables deduplication of the index when the replication factor is greater than 1 and &lt;code&gt;boltdb-shipper&lt;/code&gt; is an active or upcoming index type.
When using &lt;code&gt;boltdb-shipper&lt;/code&gt;, avoid configuring the WriteDedupe cache: it is used purely for index deduplication, so it would not be used anyway.&lt;/p&gt;
&lt;h3 id=&#34;compactor&#34;&gt;Compactor&lt;/h3&gt;
&lt;p&gt;The Compactor is a BoltDB Shipper specific service that reduces the index size by deduplicating the index and merging all the files into a single file per table.
We recommend running a Compactor because a single Ingester creates 96 files per day (one every 15 minutes), which include many duplicate index entries, and querying multiple files per table adds to the overall query latency.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Run only one compactor instance at a time. Running more than one could create problems and may lead to data loss.&lt;/p&gt;
&lt;p&gt;Example compactor configuration with GCS:&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;YAML&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;compactor:
  working_directory: /loki/compactor
  shared_store: gcs

storage_config:
  gcs:
    bucket_name: GCS_BUCKET_NAME&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;h4 id=&#34;delete-permissions&#34;&gt;Delete Permissions&lt;/h4&gt;
&lt;p&gt;The compactor is an optional but suggested component that combines and deduplicates the boltdb-shipper index files. When compacting index files, the compactor writes a new file and deletes unoptimized files. Ensure that the compactor has appropriate permissions for deleting files, for example, s3:DeleteObject permission for AWS S3.&lt;/p&gt;
]]></content><description>&lt;h1 id="single-store-loki-boltdb-shipper-index-type">Single Store Loki (boltdb-shipper index type)&lt;/h1>
&lt;p>BoltDB Shipper lets you run Grafana Loki without any dependency on NoSQL stores for storing index.
Instead, it stores the index locally in BoltDB files and keeps shipping those files to a shared object store, that is, the same object store that is used for storing chunks.
It also keeps syncing BoltDB files from the shared object store to a configured local directory to get index entries created by other services of the same Loki cluster.
This helps run Loki with one less dependency and also saves costs in storage since object stores are likely to be much cheaper compared to cost of a hosted NoSQL store or running a self hosted instance of Cassandra.&lt;/p></description></item><item><title>Table manager</title><link>https://grafana.com/docs/enterprise-logs/v1.9.x/loki/operations/storage/table-manager/</link><pubDate>Tue, 16 Jul 2024 15:42:20 +0000</pubDate><guid>https://grafana.com/docs/enterprise-logs/v1.9.x/loki/operations/storage/table-manager/</guid><content><![CDATA[&lt;h1 id=&#34;table-manager&#34;&gt;Table Manager&lt;/h1&gt;
&lt;p&gt;Grafana Loki supports storing indexes and chunks in table-based data stores. When
such a storage type is used, multiple tables are created over time: each
table, also called a periodic table, contains the data for a specific time
range.&lt;/p&gt;
&lt;p&gt;This design brings two main benefits:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Schema config changes&lt;/strong&gt;: each table is bound to a schema config and
version, so that changes can be introduced over time and multiple schema
configs can coexist&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Retention&lt;/strong&gt;: retention is implemented by deleting entire tables, which
makes delete operations fast&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The &lt;strong&gt;Table Manager&lt;/strong&gt; is a Loki component which takes care of creating a
periodic table before its time period begins, and deleting it once its data
time range exceeds the retention period.&lt;/p&gt;
&lt;p&gt;The Table Manager supports the following backends:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Index store&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;../boltdb-shipper/&#34;&gt;Single Store (boltdb-shipper)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://aws.amazon.com/dynamodb&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;Amazon DynamoDB&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://cloud.google.com/bigtable&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;Google Bigtable&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://cassandra.apache.org&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;Apache Cassandra&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/boltdb/bolt&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;BoltDB&lt;/a&gt; (primarily used for local environments)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Chunk store&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://aws.amazon.com/dynamodb&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;Amazon DynamoDB&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://cloud.google.com/bigtable&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;Google Bigtable&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://cassandra.apache.org&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;Apache Cassandra&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Filesystem (primarily used for local environments)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The object stores that Loki supports for storing chunks, such as Amazon S3 and
Google Cloud Storage, are not managed by the Table Manager. Set a custom bucket
policy to delete old data in them.&lt;/p&gt;
&lt;p&gt;For detailed information on configuring the Table Manager, refer to the
&lt;a href=&#34;../../../configuration#table_manager&#34;&gt;&lt;code&gt;table_manager&lt;/code&gt;&lt;/a&gt;
section in the Loki configuration document.&lt;/p&gt;
&lt;h2 id=&#34;tables-and-schema-config&#34;&gt;Tables and schema config&lt;/h2&gt;
&lt;p&gt;A periodic table stores the index or chunk data for a specific period
of time. The duration of the time range stored in a single table and
its storage type are configured in the
&lt;a href=&#34;../../../configuration#schema_config&#34;&gt;&lt;code&gt;schema_config&lt;/code&gt;&lt;/a&gt; configuration
block.&lt;/p&gt;
&lt;p&gt;The &lt;a href=&#34;../../../configuration#schema_config&#34;&gt;&lt;code&gt;schema_config&lt;/code&gt;&lt;/a&gt; can contain
one or more &lt;code&gt;configs&lt;/code&gt;. Each config defines the storage used between the day
set in &lt;code&gt;from&lt;/code&gt; (in the format &lt;code&gt;yyyy-mm-dd&lt;/code&gt;) and the next config, or &amp;ldquo;now&amp;rdquo;
in the case of the last schema config entry.&lt;/p&gt;
&lt;p&gt;This makes it possible to have multiple non-overlapping schema configs over time, in
order to perform schema version upgrades or change storage settings (including
changing the storage type).&lt;/p&gt;
&lt;p&gt;&lt;img
  class=&#34;lazyload d-inline-block&#34;
  data-src=&#34;../table-manager-periodic-tables.png&#34;
  alt=&#34;periodic_tables&#34;/&gt;&lt;/p&gt;
&lt;p&gt;The write path hits the table that the log entry timestamp falls into (usually
the last table, except for short periods close to the end of one table and the
beginning of the next one), while the read path hits the tables containing data
for the query time range.&lt;/p&gt;
&lt;h3 id=&#34;schema-config-example&#34;&gt;Schema config example&lt;/h3&gt;
&lt;p&gt;For example, the following &lt;code&gt;schema_config&lt;/code&gt; defines two configurations: the first
one using the schema &lt;code&gt;v10&lt;/code&gt; and the current one using &lt;code&gt;v11&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;The first config stores data between &lt;code&gt;2019-01-01&lt;/code&gt; and &lt;code&gt;2019-04-14&lt;/code&gt; (inclusive).
A new config was then added to upgrade the schema version to &lt;code&gt;v11&lt;/code&gt;,
storing data using the &lt;code&gt;v11&lt;/code&gt; schema from &lt;code&gt;2019-04-15&lt;/code&gt; on.&lt;/p&gt;
&lt;p&gt;For each config, multiple tables are created, each one storing data for one
&lt;code&gt;period&lt;/code&gt; of time (168 hours = 7 days).&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;YAML&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;schema_config:
  configs:
    - from:   2019-01-01
      store:  dynamo
      schema: v10
      index:
        prefix: loki_
        period: 168h
    - from:   2019-04-15
      store:  dynamo
      schema: v11
      index:
        prefix: loki_
        period: 168h&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;h3 id=&#34;table-creation&#34;&gt;Table creation&lt;/h3&gt;
&lt;p&gt;The Table Manager creates new tables slightly ahead of their start period, in
order to make sure that the new table is ready once the end period of the current
table is reached.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;creation_grace_period&lt;/code&gt; property, in the
&lt;a href=&#34;../../../configuration#table_manager&#34;&gt;&lt;code&gt;table_manager&lt;/code&gt;&lt;/a&gt;
configuration block, defines how far ahead of its start period a table is created.&lt;/p&gt;
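&lt;p&gt;A minimal sketch of setting this property. The value shown is an assumption for illustration; choose one that gives your backend enough time to create the table.&lt;/p&gt;
&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;table_manager:
  # Assumed value: create each periodic table 10 minutes before its period starts.
  creation_grace_period: 10m&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;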
&lt;h2 id=&#34;retention&#34;&gt;Retention&lt;/h2&gt;
&lt;p&gt;Retention, as managed by the Table Manager, is disabled by default because of
its destructive nature. You can enable data retention by explicitly enabling
it in the configuration and setting a &lt;code&gt;retention_period&lt;/code&gt; greater than zero:&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;YAML&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;table_manager:
  retention_deletes_enabled: true
  retention_period: 336h&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;The Table Manager implements retention by deleting entire tables whose
data has exceeded the &lt;code&gt;retention_period&lt;/code&gt;. This design makes delete
operations fast, at the cost of a retention granularity controlled by the
table&amp;rsquo;s &lt;code&gt;period&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Given that each table contains data for one &lt;code&gt;period&lt;/code&gt; of time and that entire tables
are deleted, the Table Manager keeps the most recent tables alive using this formula:&lt;/p&gt;

&lt;div class=&#34;code-snippet code-snippet__mini&#34;&gt;&lt;div class=&#34;lang-toolbar__mini&#34;&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet code-snippet__border&#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-none&#34;&gt;number_of_tables_to_keep = floor(retention_period / table_period) &amp;#43; 1&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;&lt;img
  class=&#34;lazyload d-inline-block&#34;
  data-src=&#34;../table-manager-retention.png&#34;
  alt=&#34;retention&#34;/&gt;&lt;/p&gt;
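&lt;p&gt;As a worked instance of the formula, the sketch below combines the &lt;code&gt;period&lt;/code&gt; and &lt;code&gt;retention_period&lt;/code&gt; values used in the examples on this page. It is an illustration of the arithmetic only, not a recommended configuration.&lt;/p&gt;
&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;schema_config:
  configs:
    - from:   2019-04-15
      store:  dynamo
      schema: v11
      index:
        prefix: loki_
        period: 168h            # table_period = 7 days

table_manager:
  retention_deletes_enabled: true
  retention_period: 336h        # 14 days
# number_of_tables_to_keep = floor(336h / 168h) plus 1 = 2 plus 1 = 3&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;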
&lt;p&gt;It&amp;rsquo;s important to note that - due to the internal implementation - the table
&lt;code&gt;period&lt;/code&gt; and &lt;code&gt;retention_period&lt;/code&gt; &lt;strong&gt;must&lt;/strong&gt; be multiples of &lt;code&gt;24h&lt;/code&gt; in order to get
the expected behavior.&lt;/p&gt;
&lt;p&gt;For detailed information on configuring the retention, refer to the
&lt;a href=&#34;../retention/&#34;&gt;Loki Storage Retention&lt;/a&gt;
documentation.&lt;/p&gt;
&lt;h2 id=&#34;active--inactive-tables&#34;&gt;Active / inactive tables&lt;/h2&gt;
&lt;p&gt;A table can be active or inactive.&lt;/p&gt;
&lt;p&gt;A table is considered &lt;strong&gt;active&lt;/strong&gt; if the current time is within the range:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Table start period - &lt;a href=&#34;../../../configuration#table_manager&#34;&gt;&lt;code&gt;creation_grace_period&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Table end period &#43; max chunk age (hardcoded to &lt;code&gt;12h&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;img
  class=&#34;lazyload d-inline-block&#34;
  data-src=&#34;../table-manager-active-vs-inactive-tables.png&#34;
  alt=&#34;active_vs_inactive_tables&#34;/&gt;&lt;/p&gt;
&lt;p&gt;Currently, the difference between an active and inactive table &lt;strong&gt;only applies
to the DynamoDB storage&lt;/strong&gt; settings: capacity mode (on-demand or provisioned),
read/write capacity units and autoscaling.&lt;/p&gt;
&lt;section class=&#34;expand-table-wrapper&#34;&gt;&lt;div class=&#34;button-div&#34;&gt;
      &lt;button class=&#34;expand-table-btn&#34;&gt;Expand table&lt;/button&gt;
    &lt;/div&gt;&lt;div class=&#34;responsive-table-wrapper&#34;&gt;
    &lt;table&gt;
      &lt;thead&gt;
          &lt;tr&gt;
              &lt;th&gt;DynamoDB&lt;/th&gt;
              &lt;th&gt;Active table&lt;/th&gt;
              &lt;th&gt;Inactive table&lt;/th&gt;
          &lt;/tr&gt;
      &lt;/thead&gt;
      &lt;tbody&gt;
          &lt;tr&gt;
              &lt;td&gt;Capacity mode&lt;/td&gt;
              &lt;td&gt;&lt;code&gt;enable_ondemand_throughput_mode&lt;/code&gt;&lt;/td&gt;
              &lt;td&gt;&lt;code&gt;enable_inactive_throughput_on_demand_mode&lt;/code&gt;&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td&gt;Read capacity unit&lt;/td&gt;
              &lt;td&gt;&lt;code&gt;provisioned_read_throughput&lt;/code&gt;&lt;/td&gt;
              &lt;td&gt;&lt;code&gt;inactive_read_throughput&lt;/code&gt;&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td&gt;Write capacity unit&lt;/td&gt;
              &lt;td&gt;&lt;code&gt;provisioned_write_throughput&lt;/code&gt;&lt;/td&gt;
              &lt;td&gt;&lt;code&gt;inactive_write_throughput&lt;/code&gt;&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td&gt;Autoscaling&lt;/td&gt;
              &lt;td&gt;Enabled (if configured)&lt;/td&gt;
              &lt;td&gt;Always disabled&lt;/td&gt;
          &lt;/tr&gt;
      &lt;/tbody&gt;
    &lt;/table&gt;
  &lt;/div&gt;
&lt;/section&gt;&lt;h2 id=&#34;dynamodb-provisioning&#34;&gt;DynamoDB Provisioning&lt;/h2&gt;
&lt;p&gt;When configuring DynamoDB with the Table Manager, the default &lt;a href=&#34;https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/HowItWorks.ReadWriteCapacityMode.html&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;on-demand
provisioning&lt;/a&gt;
capacity units are set to 300 for reads and 3000 for writes. The
defaults can be overridden:&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;YAML&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;table_manager:
  index_tables_provisioning:
    provisioned_write_throughput: 10
    provisioned_read_throughput: 10
  chunk_tables_provisioning:
    provisioned_write_throughput: 10
    provisioned_read_throughput: 10&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;If Table Manager is not automatically managing DynamoDB, old data cannot easily
be erased and the index will grow indefinitely. Manual configurations should
ensure that the primary index key is set to &lt;code&gt;h&lt;/code&gt; (string) and the sort key is set
to &lt;code&gt;r&lt;/code&gt; (binary). The &amp;ldquo;period&amp;rdquo; attribute in the configuration YAML should be set
to &lt;code&gt;0&lt;/code&gt;.&lt;/p&gt;
&lt;h2 id=&#34;table-manager-deployment-mode&#34;&gt;Table Manager deployment mode&lt;/h2&gt;
&lt;p&gt;The Table Manager can be executed in two ways:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Implicitly executed when Loki runs in monolithic mode (single process)&lt;/li&gt;
&lt;li&gt;Explicitly executed when Loki runs in microservices mode&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 id=&#34;monolithic-mode&#34;&gt;Monolithic mode&lt;/h3&gt;
&lt;p&gt;When Loki runs in &lt;a href=&#34;../../../fundamentals/architecture#modes-of-operation&#34;&gt;monolithic mode&lt;/a&gt;,
the Table Manager is also started as a component of the entire stack.&lt;/p&gt;
&lt;h3 id=&#34;microservices-mode&#34;&gt;Microservices mode&lt;/h3&gt;
&lt;p&gt;When Loki runs in &lt;a href=&#34;../../../fundamentals/architecture#modes-of-operation&#34;&gt;microservices mode&lt;/a&gt;,
the Table Manager should be started as a separate service named &lt;code&gt;table-manager&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;You can check out a production grade deployment example at
&lt;a href=&#34;https://github.com/grafana/loki/tree/master/production/ksonnet/loki/table-manager.libsonnet&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;&lt;code&gt;table-manager.libsonnet&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;
]]></content><description>&lt;h1 id="table-manager">Table Manager&lt;/h1>
&lt;p>Grafana Loki supports storing indexes and chunks in table-based data stores. When
such a storage type is used, multiple tables are created over time: each
table, also called a periodic table, contains the data for a specific time
range.&lt;/p></description></item><item><title>Write Ahead Log</title><link>https://grafana.com/docs/enterprise-logs/v1.9.x/loki/operations/storage/wal/</link><pubDate>Mon, 14 Apr 2025 21:05:47 +0000</pubDate><guid>https://grafana.com/docs/enterprise-logs/v1.9.x/loki/operations/storage/wal/</guid><content><![CDATA[&lt;h1 id=&#34;write-ahead-log-wal&#34;&gt;Write Ahead Log (WAL)&lt;/h1&gt;
&lt;p&gt;Ingesters temporarily store data in memory. In the event of a crash, there could be data loss. The WAL helps fill this gap in reliability.&lt;/p&gt;
&lt;p&gt;The WAL in Grafana Loki records incoming data and stores it on the local file system in order to guarantee persistence of acknowledged data in the event of a process crash. Upon restart, Loki will &amp;ldquo;replay&amp;rdquo; all of the data in the log before registering itself as ready for subsequent writes. This allows Loki to maintain the performance &amp;amp; cost benefits of buffering data in memory &lt;em&gt;and&lt;/em&gt; durability benefits (it won&amp;rsquo;t lose data once a write has been acknowledged).&lt;/p&gt;
&lt;p&gt;This section will use Kubernetes as a reference deployment paradigm in the examples.&lt;/p&gt;
&lt;h2 id=&#34;disclaimer--wal-nuances&#34;&gt;Disclaimer &amp;amp; WAL nuances&lt;/h2&gt;
&lt;p&gt;The Write Ahead Log in Loki makes a few particular tradeoffs compared to other WALs you may be familiar with. The WAL aims to add additional durability guarantees, but &lt;em&gt;not at the expense of availability&lt;/em&gt;. In particular, there are two scenarios where the WAL sacrifices these guarantees.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Corruption/Deletion of the WAL prior to replaying it&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;In the event the WAL is corrupted or partially deleted, Loki will not be able to recover all of its data. In this case, Loki will attempt to recover any data it can, and the corruption will not prevent Loki from starting.&lt;/p&gt;
&lt;p&gt;Note: the Prometheus metric &lt;code&gt;loki_ingester_wal_corruptions_total&lt;/code&gt; can be used to track and alert when this happens.&lt;/p&gt;
&lt;ol start=&#34;2&#34;&gt;
&lt;li&gt;No space left on disk&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;In the event the underlying WAL disk is full, Loki will not fail incoming writes, but neither will it log them to the WAL. In this case, the persistence guarantees across process restarts will not hold.&lt;/p&gt;
&lt;p&gt;Note: the Prometheus metric &lt;code&gt;loki_ingester_wal_disk_full_failures_total&lt;/code&gt; can be used to track and alert when this happens.&lt;/p&gt;
&lt;h3 id=&#34;backpressure&#34;&gt;Backpressure&lt;/h3&gt;
&lt;p&gt;The WAL also includes a backpressure mechanism to allow a large WAL to be replayed within a smaller memory bound. This is helpful after bad scenarios (for example, an outage) when a WAL has grown past the point where it can be recovered in memory. In this case, the ingester tracks the amount of data being replayed and, once that amount passes the &lt;code&gt;ingester.wal-replay-memory-ceiling&lt;/code&gt; threshold, flushes to storage. When this happens, it&amp;rsquo;s likely that Loki&amp;rsquo;s attempt to deduplicate chunks via content addressable storage will suffer. We deemed this efficiency loss an acceptable tradeoff considering how it simplifies operation and that it should not occur during regular operation (rollouts, rescheduling), where the WAL can be replayed without triggering this threshold.&lt;/p&gt;
&lt;h3 id=&#34;metrics&#34;&gt;Metrics&lt;/h3&gt;
&lt;h2 id=&#34;changes-to-deployment&#34;&gt;Changes to deployment&lt;/h2&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Since ingesters need to have the same persistent volume across restarts and rollouts, all the ingesters should be run as a &lt;a href=&#34;https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;StatefulSet&lt;/a&gt; with fixed volumes.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The following flags need to be set:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;--ingester.wal-enabled&lt;/code&gt; to &lt;code&gt;true&lt;/code&gt; which enables writing to WAL during ingestion.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;--ingester.wal-dir&lt;/code&gt; to the directory where the WAL data should be stored and/or recovered from. Note that this should be on the mounted volume.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;--ingester.checkpoint-duration&lt;/code&gt; to the interval at which checkpoints should be created.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;--ingester.wal-replay-memory-ceiling&lt;/code&gt; (default 4GB) may be set higher or lower depending on your resource settings. It handles memory pressure during WAL replays, allowing a WAL many times larger than available memory to be replayed. This is provided to minimize reconciliation time after very bad situations (for example, an outage), and will likely not impact regular operations or rollouts &lt;em&gt;at all&lt;/em&gt;. We suggest setting this to a high percentage (~75%) of available memory.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
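&lt;p&gt;The flags above have YAML equivalents in the &lt;code&gt;wal&lt;/code&gt; block of the ingester configuration. The following is a minimal sketch; the directory path and checkpoint interval are assumptions for illustration and should be adapted to your deployment.&lt;/p&gt;
&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-yaml&#34;&gt;ingester:
  wal:
    enabled: true
    dir: /loki/wal              # assumed path; should be on the mounted volume
    checkpoint_duration: 5m     # assumed interval between checkpoints
    replay_memory_ceiling: 4GB  # the default; consider roughly 75% of available memory&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;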
&lt;h2 id=&#34;changes-in-lifecycle-when-wal-is-enabled&#34;&gt;Changes in lifecycle when WAL is enabled&lt;/h2&gt;
&lt;ol&gt;
&lt;li&gt;Flushing of data to chunk store during rollouts or scale down is disabled. This is because during a rollout of statefulset there are no ingesters that are simultaneously leaving and joining, rather the same ingester is shut down and brought back again with updated config. Hence flushing is skipped and the data is recovered from the WAL.&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id=&#34;disk-space-requirements&#34;&gt;Disk space requirements&lt;/h2&gt;
&lt;p&gt;Based on real-world tests:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Numbers are from an ingester with 5000 series ingesting ~5MB/s.&lt;/li&gt;
&lt;li&gt;The checkpoint period was 5 minutes.&lt;/li&gt;
&lt;li&gt;Disk utilization on a WAL-only disk was steady at ~10-15GB.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;You should not target 100% disk utilization.&lt;/p&gt;
&lt;h2 id=&#34;migrating-from-stateless-deployments&#34;&gt;Migrating from stateless deployments&lt;/h2&gt;
&lt;p&gt;The ingester &lt;em&gt;deployment without WAL&lt;/em&gt; and &lt;em&gt;statefulset with WAL&lt;/em&gt; should be scaled down and up respectively in sync without transfer of data between them to ensure that any ingestion after migration is reliable immediately.&lt;/p&gt;
&lt;p&gt;Let&amp;rsquo;s take an example of 4 ingesters. The migration would look something like this:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Bring up one stateful ingester &lt;code&gt;ingester-0&lt;/code&gt; and wait until it&amp;rsquo;s ready (accepting read and write requests).&lt;/li&gt;
&lt;li&gt;Scale down the old ingester deployment to 3 and wait until the leaving ingester flushes all the data to chunk store.&lt;/li&gt;
&lt;li&gt;Once that ingester has disappeared from &lt;code&gt;kubectl get pods ...&lt;/code&gt;, add another stateful ingester and wait until it&amp;rsquo;s ready. Now you have &lt;code&gt;ingester-0&lt;/code&gt; and &lt;code&gt;ingester-1&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Repeat step 2 to remove another ingester from the old deployment.&lt;/li&gt;
&lt;li&gt;Repeat step 3 to add another stateful ingester. Now you have &lt;code&gt;ingester-0 ingester-1 ingester-2&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Repeat steps 4 and 5, and you will finally have &lt;code&gt;ingester-0 ingester-1 ingester-2 ingester-3&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;
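&lt;p&gt;The alternating pattern in the steps above can be sketched as a small planning helper. This is only an illustration: the function name &lt;code&gt;migration_steps&lt;/code&gt; and the wording of each action are hypothetical, and the final scale-down of the old deployment to 0 is an assumption, since the steps above stop once &lt;code&gt;ingester-3&lt;/code&gt; is up.&lt;/p&gt;

```python
def migration_steps(total):
    """Return the ordered actions for moving `total` ingesters from a
    stateless deployment to a WAL-enabled statefulset, one at a time.

    Hypothetical helper for illustration; it only builds a plan and does
    not talk to Kubernetes.
    """
    steps = []
    for i in range(total):
        # Bring up the next stateful ingester first, so write capacity
        # never drops below the original ingester count.
        steps.append(
            f"scale statefulset to {i + 1} and wait for ingester-{i} to be ready"
        )
        # Then retire one old ingester and let it flush to the chunk store.
        # Assumption: the last iteration scales the old deployment to 0.
        steps.append(
            f"scale deployment to {total - i - 1} and wait for the leaving ingester to flush"
        )
    return steps


for step in migration_steps(4):
    print(step)
```

&lt;p&gt;Each pair of actions adds one stateful ingester before removing one stateless ingester, so no data needs to be transferred between the two sets.&lt;/p&gt;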
&lt;h2 id=&#34;how-to-scale-updown&#34;&gt;How to scale up/down&lt;/h2&gt;
&lt;h3 id=&#34;scale-up&#34;&gt;Scale up&lt;/h3&gt;
&lt;p&gt;Scaling up is the same as what you would do without WAL or statefulsets. Nothing changes here.&lt;/p&gt;
&lt;h3 id=&#34;scale-down&#34;&gt;Scale down&lt;/h3&gt;
&lt;p&gt;When scaling down, we must ensure that existing data on the leaving ingesters is flushed to storage instead of just the WAL. This is because we won&amp;rsquo;t be replaying the WAL on an ingester that will no longer exist, and we need to make sure the data is not orphaned.&lt;/p&gt;
&lt;p&gt;Suppose you have 4 ingesters &lt;code&gt;ingester-0 ingester-1 ingester-2 ingester-3&lt;/code&gt; and you want to scale down to 2 ingesters. According to statefulset rules, the ingesters that will be shut down are &lt;code&gt;ingester-3&lt;/code&gt; and then &lt;code&gt;ingester-2&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Hence, before actually scaling down in Kubernetes, port forward those ingesters and hit the &lt;a href=&#34;/docs/enterprise-logs/v1.9.x/loki/api/#post-ingesterflush_shutdown&#34;&gt;&lt;code&gt;/ingester/flush_shutdown&lt;/code&gt; endpoint&lt;/a&gt;. Each ingester will flush its chunks and remove itself from the ring, after which it will register as unready and may be deleted.&lt;/p&gt;
&lt;p&gt;After hitting the endpoint for &lt;code&gt;ingester-2 ingester-3&lt;/code&gt;, scale down the ingesters to 2.&lt;/p&gt;
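&lt;p&gt;To work out which pods need the flush before a scale-down, the statefulset ordering described above (highest ordinal removed first) can be sketched as follows. The helper name &lt;code&gt;ingesters_to_flush&lt;/code&gt; and the &lt;code&gt;prefix&lt;/code&gt; parameter are hypothetical; the sketch only computes pod names and does not call any endpoint.&lt;/p&gt;

```python
def ingesters_to_flush(current, target, prefix="ingester"):
    """List the pods a statefulset removes when scaling from `current`
    to `target` replicas, in removal order (highest ordinal first).

    Hypothetical helper for illustration only.
    """
    if target not in range(current + 1):
        raise ValueError("target must be between 0 and current")
    # Statefulsets terminate pods in reverse ordinal order.
    return [f"{prefix}-{i}" for i in range(current - 1, target - 1, -1)]


print(ingesters_to_flush(4, 2))  # ['ingester-3', 'ingester-2']
```

&lt;p&gt;Each name returned is a pod to port forward and flush before reducing the replica count.&lt;/p&gt;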
&lt;h2 id=&#34;additional-notes&#34;&gt;Additional notes&lt;/h2&gt;
&lt;h3 id=&#34;kubernetes-hacking&#34;&gt;Kubernetes hacking&lt;/h3&gt;
&lt;p&gt;Statefulsets are significantly more cumbersome to work with and upgrade. Much of this stems from immutable fields on the specification. For example, if you want to start using the WAL with single store Loki and want separate volume mounts for the WAL and the boltdb-shipper, you may see immutability errors when attempting to update the Kubernetes statefulsets.&lt;/p&gt;
&lt;p&gt;In this case, try &lt;code&gt;kubectl -n &amp;lt;namespace&amp;gt; delete sts ingester --cascade=false&lt;/code&gt; (newer kubectl versions deprecate this form in favor of &lt;code&gt;--cascade=orphan&lt;/code&gt;). This will leave the pods alive but delete the statefulset. Then you may recreate the (updated) statefulset and delete the &lt;code&gt;ingester-0&lt;/code&gt; through &lt;code&gt;ingester-n&lt;/code&gt; pods one by one, &lt;em&gt;in that order&lt;/em&gt;, allowing the statefulset to spin up new pods to replace them.&lt;/p&gt;
&lt;h3 id=&#34;non-kubernetes-or-baremetal-deployments&#34;&gt;Non-Kubernetes or baremetal deployments&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;When the ingester restarts for any reason (upgrade, crash, etc.), it should be able to attach to the same volume in order to recover the WAL and tokens.&lt;/li&gt;
&lt;li&gt;Two ingesters should not share the same volume/directory for the WAL.&lt;/li&gt;
&lt;li&gt;A rollout should bring down an ingester completely and then start the new ingester, not the other way around.&lt;/li&gt;
&lt;/ul&gt;
]]></content><description>&lt;h1 id="write-ahead-log-wal">Write Ahead Log (WAL)&lt;/h1>
&lt;p>Ingesters temporarily store data in memory. In the event of a crash, there could be data loss. The WAL helps fill this gap in reliability.&lt;/p></description></item></channel></rss>