ElasticSearch

ElasticSearch cluster stats

<h4 id="you-will-need-the-prometheus-exporter-plugin-for-elasticsearch----to-run-this-dashboard">You will need the Prometheus exporter plugin for ElasticSearch (<a href="https://github.com/justwatchcom/elasticsearch_exporter" target="_blank" rel="noopener noreferrer">https://github.com/justwatchcom/elasticsearch_exporter</a>) to run this dashboard.</h4>
<h3 id="how-to-make-elasticsearch_exporter-on-centos-7">How to build elasticsearch_exporter on CentOS 7:</h3>
<div class="code-snippet "><div class="lang-toolbar"> <span class="lang-toolbar__item lang-toolbar__item-active">sh</span> <div class="lang-toolbar__border"></div> </div><div class="code-snippet ">
<pre data-expanded="false"><code class="language-sh"># Install the Go toolchain
yum -y install golang
# Fetch and build the exporter; with GOPATH=/usr/local the binary lands in /usr/local/bin
GOPATH=/usr/local go get -u github.com/justwatchcom/elasticsearch_exporter</code></pre>
</div> </div>
<h3 id="run">Run:</h3>
<div class="code-snippet "><div class="lang-toolbar"> <span class="lang-toolbar__item lang-toolbar__item-active">sh</span> <div class="lang-toolbar__border"></div> </div><div class="code-snippet ">
<pre data-expanded="false"><code class="language-sh"># Write a systemd unit for the exporter
cat &lt;&lt; EOF &gt; /etc/systemd/system/elasticsearch_exporter.service
[Unit]
Description=Prometheus elasticsearch_exporter
After=local-fs.target network-online.target network.target
Wants=local-fs.target network-online.target network.target

[Service]
User=root
Nice=10
ExecStart=/usr/local/bin/elasticsearch_exporter -es.all -es.indices -es.timeout 20s
ExecStop=/usr/bin/killall elasticsearch_exporter

[Install]
WantedBy=default.target
EOF

# Load the new unit, enable it at boot, and start it now
systemctl daemon-reload
systemctl enable elasticsearch_exporter.service
systemctl start elasticsearch_exporter.service</code></pre>
</div> </div>
<h3 id="exampe-config-for-prometheusyml">Example config for prometheus.yml:</h3>
<div class="code-snippet "><div class="lang-toolbar"> <span class="lang-toolbar__item lang-toolbar__item-active">sh</span> <div class="lang-toolbar__border"></div> </div><div class="code-snippet ">
<pre data-expanded="false"><code class="language-sh">  - job_name: elasticsearch
    scrape_interval: 60s
    scrape_timeout: 30s
    metrics_path: "/metrics"
    static_configs:
      - targets:
          - elastic2.test.lan:9108
          - elastic-log2.prod.lan:9108
        labels:
          service: elasticsearch
    relabel_configs:
      # Strip the port so "instance" holds only the host name
      - source_labels: [__address__]
        regex: '(.*)\:9108'
        target_label: 'instance'
        replacement: '$1'
      # Take the environment (test, prod) from the host name
      - source_labels: [__address__]
        regex: '.*\.(.*)\.lan.*'
        target_label: 'environment'
        replacement: '$1'</code></pre>
</div> </div>
<h3 id="exampe-config-for--prometheus-alertsrules">Example config for Prometheus alerts.rules:</h3>
<div class="code-snippet "><div class="lang-toolbar"> <span class="lang-toolbar__item lang-toolbar__item-active">sh</span> <div class="lang-toolbar__border"></div> </div><div class="code-snippet ">
<pre data-expanded="false"><code class="language-sh">ALERT Elastic_UP
  IF elasticsearch_up{job="elasticsearch"} != 1
  FOR 120s
  LABELS { severity="alert", value = "{{$value}}" }
  ANNOTATIONS {
    summary = "Instance {{ $labels.instance }}: Elasticsearch instance status is not 1",
    description = "This server's Elasticsearch instance status has a value of {{ $value }}.",
  }

ALERT Elastic_Cluster_Health_RED
  IF elasticsearch_cluster_health_status{color="red"}==1
  FOR 300s
  LABELS { severity="alert", value = "{{$value}}" }
  ANNOTATIONS {
    summary = "Instance {{ $labels.instance }}: not all primary and replica shards are allocated in elasticsearch cluster {{ $labels.cluster }}",
    description = "Instance {{ $labels.instance }}: not all primary and replica shards are allocated in elasticsearch cluster {{ $labels.cluster }}.",
  }

ALERT Elastic_Cluster_Health_Yellow
  IF elasticsearch_cluster_health_status{color="yellow"}==1
  FOR 300s
  LABELS { severity="alert", value = "{{$value}}" }
  ANNOTATIONS {
    summary = "Instance {{ $labels.instance }}: not all primary and replica shards are allocated in elasticsearch cluster {{ $labels.cluster }}",
    description = "Instance {{ $labels.instance }}: not all primary and replica shards are allocated in elasticsearch cluster {{ $labels.cluster }}.",
  }

ALERT Elasticsearch_JVM_Heap_Too_High
  IF elasticsearch_jvm_memory_used_bytes{area="heap"} / elasticsearch_jvm_memory_max_bytes{area="heap"} &gt; 0.8
  FOR 15m
  LABELS { severity="alert", value = "{{$value}}" }
  ANNOTATIONS {
    summary = "ElasticSearch node {{ $labels.instance }} heap usage is high",
    description = "The heap in {{ $labels.instance }} is over 80% for 15m.",
  }

ALERT Elasticsearch_health_up
  IF elasticsearch_cluster_health_up !=1
  FOR 1m
  LABELS { severity="alert", value = "{{$value}}" }
  ANNOTATIONS {
    summary = "ElasticSearch node: {{ $labels.instance }} last scrape of the ElasticSearch cluster health failed",
    description = "ElasticSearch node: {{ $labels.instance }} last scrape of the ElasticSearch cluster health failed",
  }

ALERT Elasticsearch_Too_Few_Nodes_Running
  IF elasticsearch_cluster_health_number_of_nodes &lt; 3
  FOR 5m
  LABELS { severity="alert", value = "{{$value}}" }
  ANNOTATIONS {
    description="There are only {{$value}} &lt; 3 ElasticSearch nodes running",
    summary="ElasticSearch running on less than 3 nodes"
  }

ALERT Elasticsearch_Count_of_JVM_GC_Runs
  IF rate(elasticsearch_jvm_gc_collection_seconds_count{}[5m])&gt;5
  FOR 60s
  LABELS { severity="warning", value = "{{$value}}" }
  ANNOTATIONS {
    summary = "ElasticSearch node {{ $labels.instance }}: Count of JVM GC runs &gt; 5 per sec and has a value of {{ $value }}",
    description = "ElasticSearch node {{ $labels.instance }}: Count of JVM GC runs &gt; 5 per sec and has a value of {{ $value }}",
  }

ALERT Elasticsearch_GC_Run_Time
  IF rate(elasticsearch_jvm_gc_collection_seconds_sum[5m])&gt;0.3
  FOR 60s
  LABELS { severity="warning", value = "{{$value}}" }
  ANNOTATIONS {
    summary = "ElasticSearch node {{ $labels.instance }}: GC run time in seconds &gt; 0.3 sec and has a value of {{ $value }}",
    description = "ElasticSearch node {{ $labels.instance }}: GC run time in seconds &gt; 0.3 sec and has a value of {{ $value }}",
  }

ALERT Elasticsearch_json_parse_failures
  IF elasticsearch_cluster_health_json_parse_failures&gt;0
  FOR 60s
  LABELS { severity="warning", value = "{{$value}}" }
  ANNOTATIONS {
    summary = "ElasticSearch node {{ $labels.instance }}: json parse failures &gt; 0 and has a value of {{ $value }}",
    description = "ElasticSearch node {{ $labels.instance }}: json parse failures &gt; 0 and has a value of {{ $value }}",
  }

ALERT Elasticsearch_breakers_tripped
  IF rate(elasticsearch_breakers_tripped{}[5m])&gt;0
  FOR 60s
  LABELS { severity="warning", value = "{{$value}}" }
  ANNOTATIONS {
    summary = "ElasticSearch node {{ $labels.instance }}: breakers tripped &gt; 0 and has a value of {{ $value }}",
    description = "ElasticSearch node {{ $labels.instance }}: breakers tripped &gt; 0 and has a value of {{ $value }}",
  }

ALERT Elasticsearch_health_timed_out
  IF elasticsearch_cluster_health_timed_out&gt;0
  FOR 60s
  LABELS { severity="warning", value = "{{$value}}" }
  ANNOTATIONS {
    summary = "ElasticSearch node {{ $labels.instance }}: Number of cluster health checks timed out &gt; 0 and has a value of {{ $value }}",
    description = "ElasticSearch node {{ $labels.instance }}: Number of cluster health checks timed out &gt; 0 and has a value of {{ $value }}",
  }</code></pre>
</div> </div>
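The alert rules above use the Prometheus 1.x `ALERT ... IF ... FOR` syntax. Prometheus 2.0 and later read rules from YAML files instead, so on a newer server the same rules would need converting. The following is a minimal sketch of two of them in the YAML format, written with the same heredoc pattern as the systemd unit. The file name `elasticsearch.rules.yml` and the group name `elasticsearch` are assumptions made for illustration, and the `value` label is left out because a label that changes with the sample value changes the alert's identity between evaluations.

```shell
# Hypothetical file name; list it under rule_files: in prometheus.yml.
RULES_FILE=elasticsearch.rules.yml

# The quoted 'EOF' stops the shell from expanding $labels and $value.
cat << 'EOF' > "$RULES_FILE"
groups:
  - name: elasticsearch
    rules:
      - alert: Elastic_UP
        expr: elasticsearch_up{job="elasticsearch"} != 1
        for: 120s
        labels:
          severity: alert
        annotations:
          summary: "Instance {{ $labels.instance }}: Elasticsearch instance status is not 1"
          description: "This server's Elasticsearch instance status has a value of {{ $value }}."
      - alert: Elasticsearch_JVM_Heap_Too_High
        expr: elasticsearch_jvm_memory_used_bytes{area="heap"} / elasticsearch_jvm_memory_max_bytes{area="heap"} > 0.8
        for: 15m
        labels:
          severity: alert
        annotations:
          summary: "ElasticSearch node {{ $labels.instance }} heap usage is high"
          description: "The heap in {{ $labels.instance }} is over 80% for 15m."
EOF

# Optional syntax check, if promtool is installed:
# promtool check rules "$RULES_FILE"
```

The remaining rules should follow the same mapping: `IF` becomes `expr`, `FOR` becomes `for`, and the `LABELS` and `ANNOTATIONS` blocks become `labels` and `annotations` maps.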