1 SLS MySQL监控 v2020.08.13

Dashboard

MySQL指标监控
Last updated: 3 months ago

Downloads: 546

Reviews: 4

  • 1.1.png
    1.1.png
  • 1-1-1.png
    1-1-1.png

SLS MySQL监控仪表盘

SLS支持使用Telegraf采集MySQL监控数据并存储到SLS的时序库中,支持各种可视化方式以及告警规则。

问题反馈和技术支持请加钉钉群或者微信联系作者。

使用限制

  • 仅支持MySQL 5.5及以上版本。
  • 不支持Windows版本。

步骤1:创建Logtail采集配置

  • 登录日志服务控制台

  • 接入数据区域,选择MySQL监控

  • 选择日志空间页签中,选择目标Project和MetricStore,单击下一步。您也可以单击 立即创建,重新创建Project和MetricStore,详情请参见 创建Project创建MetricStore

  • 创建机器组页签中,创建机器组。

    • 如果您已有可用的机器组,请单击使用现有机器组
    • 如果您还没有可用的机器组,请执行以下操作(以ECS为例):
      1. 选择ECS实例安装Logtail,详情请参见安装Logtail(ECS实例)。如果已在ECS上安装Logtail,请直接单击 确认安装完毕

说明 如果是自建集群、其他云厂商服务器,需要手动安装Logtail,详情请参见 安装Logtail(Linux系统)

  1. 安装完成后,单击**确认安装完毕**。
  1. 创建机器组,详情请参见[创建IP地址机器组](https://help.aliyun.com/document_detail/28966.html#task-wc3-xn1-ry)或[创建用户自定义标识机器组](https://help.aliyun.com/document_detail/28983.html#concept-gyy-k3q-zdb)。
  • 机器组配置页签中,应用机器组。选择一个机器组,将该机器组从 源机器组移动到 应用机器组

  • Logtail配置页签中,配置配置名称插件配置inputs为Logtail采集配置,必选项,请根据您的数据源配置。

说明 一个 inputs中只允许配置一个类型的数据源。

{
    "inputs": [
        {
            "detail": {
                "Format": "influx",
                "Address": ":8476"
            },
            "type": "service_http_server"
        }
    ],
    "global": {
        "AlwaysOnline": true,
        "DelayStopSec": 500
    }
}
参数类型是否必选参数说明
typestring数据源类型,固定为service_http_server。
Formatstring数据类型,固定为influx。
Addressstring监听地址与端口,格式为ip:port
  • 单击下一步,完成配置。

步骤2:安装Telegraf

  • 登录服务器。
  • 安装Telegraf。
    • 如果您的服务器是CentOS或AliOS系统,请执行如下命令安装Telegraf。
wget https://dl.influxdata.com/influxdb/releases/influxdb-1.15.2.x86_64.rpm
yum localinstall influxdb-1.15.2.x86_64.rpm
systemctl enable --now influxdb
systemctl status influxd
  • 如果您的服务器是Ubuntu系统,请执行如下命令安装Telegraf。
wget https://dl.influxdata.com/telegraf/releases/telegraf_1.15.2-1_amd64.deb
dpkg -i telegraf_1.15.2-1_amd64.deb
systemctl enable --now influxdb
systemctl status influxd

步骤3:配置Telegraf

  • 登录服务器。
  • 配置Logtail端口。
    1. 打开/etc/telegraf/telegraf.conf文件。
    2. 将/etc/telegraf/telegraf.conf文件中的内容替换下如下脚本。其中 [[outputs.influxdb]]下的urls为您在步骤6中配置的Address地址,默认为"http://127.0.0.1:8476"。
# Global tags can be specified here in key="value" format.
[global_tags]
  # dc = "us-east-1" # will tag all metrics with dc=us-east-1
  # rack = "1a"
  ## Environment variables can be used as tags, and throughout the config file
  # user = "$USER"
# Configuration for telegraf agent
[agent]
  ## Default data collection interval for all inputs
  interval = "10s"
  ## Rounds collection interval to 'interval'
  ## ie, if interval="10s" then always collect on :00, :10, :20, etc.
  round_interval = true
  ## Telegraf will send metrics to outputs in batches of at most
  ## metric_batch_size metrics.
  ## This controls the size of writes that Telegraf sends to output plugins.
  metric_batch_size = 1000
  ## Maximum number of unwritten metrics per output.  Increasing this value
  ## allows for longer periods of output downtime without dropping metrics at the
  ## cost of higher maximum memory usage.
  metric_buffer_limit = 10000
  ## Collection jitter is used to jitter the collection by a random amount.
  ## Each plugin will sleep for a random time within jitter before collecting.
  ## This can be used to avoid many plugins querying things like sysfs at the
  ## same time, which can have a measurable effect on the system.
  collection_jitter = "0s"
  ## Default flushing interval for all outputs. Maximum flush_interval will be
  ## flush_interval + flush_jitter
  flush_interval = "10s"
  ## Jitter the flush interval by a random amount. This is primarily to avoid
  ## large write spikes for users running a large number of telegraf instances.
  ## ie, a jitter of 5s and interval 10s means flushes will happen every 10-15s
  flush_jitter = "0s"
  ## By default or when set to "0s", precision will be set to the same
  ## timestamp order as the collection interval, with the maximum being 1s.
  ##   ie, when interval = "10s", precision will be "1s"
  ##       when interval = "250ms", precision will be "1ms"
  ## Precision will NOT be used for service inputs. It is up to each individual
  ## service input to set the timestamp at the appropriate precision.
  ## Valid time units are "ns", "us" (or "µs"), "ms", "s".
  precision = ""
  ## Maximum number of rotated archives to keep, any older logs are deleted.
  ## If set to -1, no archives are removed.
  # logfile_rotation_max_archives = 5
  ## Override default hostname, if empty use os.Hostname()
  hostname = ""
  ## If set to true, do no set the "host" tag in the telegraf agent.
  omit_hostname = false
      
###############################################################################
#                            OUTPUT PLUGINS                                   #
###############################################################################
# Configuration for sending metrics to Logtail's InfluxDB receiver
[[outputs.influxdb]]
  ## The full HTTP Logtail listen address
  urls = ["http://127.0.0.1:8476"]
  ## Always be true
  skip_database_creation = true
  • 增加MySQL监控配置。
    1. 在/etc/telegraf/telegraf.d目录下创建mysql.conf配置文件。
    2. 在mysql.conf配置文件中填入如下MySQL监控相关信息,并根据实际情况替换。
[[inputs.mysql]]
  ## specify servers via a url matching:
  ##  [username[:password]@][protocol[(address)]]/[?tls=[true|false|skip-verify|custom]]
  ##  see https://github.com/go-sql-driver/mysql#dsn-data-source-name
  ##  e.g.
  ##    servers = ["user:passwd@tcp(127.0.0.1:3306)/?tls=false"]
  ##    servers = ["user@tcp(127.0.0.1:3306)/?tls=false"]
  #
  ## If no servers are specified, then localhost is used as the host.
  servers = ["user:passwd@tcp(127.0.0.1:3306)/?tls=false"]
  metric_version = 2
  ## if the list is empty, then metrics are gathered from all databasee tables
  table_schema_databases = []
  ## gather metrics from INFORMATION_SCHEMA.TABLES for databases provided above list
  gather_table_schema = false
  ## gather thread state counts from INFORMATION_SCHEMA.PROCESSLIST
  gather_process_list = false
  ## gather user statistics from INFORMATION_SCHEMA.USER_STATISTICS
  gather_user_statistics = false
  ## gather auto_increment columns and max values from information schema
  gather_info_schema_auto_inc = false
  ## gather metrics from INFORMATION_SCHEMA.INNODB_METRICS
  gather_innodb_metrics = true
  ## gather metrics from SHOW SLAVE STATUS command output
  gather_slave_status = false
  ## gather metrics from SHOW BINARY LOGS command output
  gather_binary_logs = false
  ## gather metrics from SHOW GLOBAL VARIABLES command output
  gather_global_variables = true
  ## gather metrics from PERFORMANCE_SCHEMA.TABLE_IO_WAITS_SUMMARY_BY_TABLE
  gather_table_io_waits = false
  ## gather metrics from PERFORMANCE_SCHEMA.TABLE_LOCK_WAITS
  gather_table_lock_waits = false
  ## gather metrics from PERFORMANCE_SCHEMA.TABLE_IO_WAITS_SUMMARY_BY_INDEX_USAGE
  gather_index_io_waits = false
  ## gather metrics from PERFORMANCE_SCHEMA.EVENT_WAITS
  gather_event_waits = false
  ## gather metrics from PERFORMANCE_SCHEMA.FILE_SUMMARY_BY_EVENT_NAME
  gather_file_events_stats = false
  ## gather metrics from PERFORMANCE_SCHEMA.EVENTS_STATEMENTS_SUMMARY_BY_DIGEST
  gather_perf_events_statements = false
  ## the limits for metrics form perf_events_statements
  perf_events_statements_digest_text_limit = 120
  perf_events_statements_limit = 250
  perf_events_statements_time_limit = 86400
  ## Some queries we may want to run less often (such as SHOW GLOBAL VARIABLES)
  ##   example: interval_slow = "30m"
  interval_slow = ""
  ## Optional TLS Config (will be used if tls=custom parameter specified in server uri)
  # tls_ca = "/etc/telegraf/ca.pem"
  # tls_cert = "/etc/telegraf/cert.pem"
  # tls_key = "/etc/telegraf/key.pem"
  ## Use TLS but skip chain & host verification
  # insecure_skip_verify = false omit_hostname = false
[[processors.strings]]
  namepass = ["mysql", "mysql_innodb"]
  [[processors.strings.replace]]
    tag = "server"
    old = "127.0.0.1:3306"
    new = "mysql-dev"
  [[processors.strings.replace]]
    tag = "server"
    old = "192.168.1.98:3306"
    new = "mysql-prod"
  • 重要参数说明如下表所示。
参数类型是否必选说明
servers数组监听的MySQL列表,格式为user:passwd@tcp(host:port)/?tls=true/false
processors.strings.replace对象默认情况下,上报的数据中Server字段为host:port格式。您可以通过processors.strings.replace,将host:port替换成易识别的值,例如mysql-prod。该对象支持设置多个。
  • 执行如下命令触发Telegraf重新加载配置。
systemctl reload telegraf

2020.08.28

  • 初始化版本

作者

  • 邮箱:davidzhang.zc@alibaba-inc.com
  • 微信:davidzhang-zc