Reference documentation
Installation

    wget https://dl.influxdata.com/telegraf/releases/telegraf-1.20.2-1.x86_64.rpm
    sudo yum localinstall telegraf-1.20.2-1.x86_64.rpm
Debug

    # Run with debug-level logging
    telegraf --debug
    # Gather metrics once, print them to stdout, and exit
    telegraf --test
Log

    # Set in the [agent] section of telegraf.conf
    logfile = "/var/log/telegraf/telegraf.log"
net_response
https://github.com/influxdata/telegraf/tree/master/plugins/inputs/net_response

    [[inputs.net_response]]
      protocol = "tcp"
      address = "10.194.99.9:8050"

    [[inputs.net_response]]
      protocol = "tcp"
      address = "10.194.99.10:8050"

    [[inputs.net_response]]
      protocol = "tcp"
      address = "10.194.99.11:8050"

    [[inputs.net_response]]
      protocol = "tcp"
      address = "10.194.99.12:8050"

    [[inputs.net_response]]
      protocol = "tcp"
      address = "10.194.99.13:8050"

    [[inputs.net_response]]
      protocol = "tcp"
      address = "10.194.99.14:8050"
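Since the six net_response blocks differ only in the host address, they can be generated rather than typed by hand. The following is a minimal sketch, not part of Telegraf: the helper name `render_net_response` is hypothetical, and it assumes the hosts are 10.194.99.9 through 10.194.99.14 on port 8050, as in the configuration above.

```python
def render_net_response(hosts, port, protocol="tcp"):
    """Return Telegraf TOML with one [[inputs.net_response]] block per host.

    Hypothetical helper for illustration; it only builds text and does not
    talk to Telegraf.
    """
    blocks = []
    for host in hosts:
        blocks.append(
            "[[inputs.net_response]]\n"
            f'  protocol = "{protocol}"\n'
            f'  address = "{host}:{port}"\n'
        )
    # Separate the blocks with a blank line, as in the hand-written config.
    return "\n".join(blocks)


# The six hosts from the configuration above: 10.194.99.9 through 10.194.99.14.
hosts = [f"10.194.99.{n}" for n in range(9, 15)]
config = render_net_response(hosts, 8050)
print(config)
```

The printed text can be pasted into telegraf.conf or a file under /etc/telegraf/telegraf.d/, then checked with `telegraf --test`.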
procstat
https://github.com/influxdata/telegraf/tree/master/plugins/inputs/procstat

    [[inputs.procstat]]
      exe = "mysqld"
      process_name = "10.194.99.2-mysqld"

    [[inputs.procstat]]
      exe = "mongod"
      process_name = "10.194.99.3-mongod"

    [[inputs.procstat]]
      pattern = "splash"
      process_name = "10.194.99.4-minio"
docker
https://github.com/influxdata/telegraf/tree/master/plugins/inputs/docker

    # Add the telegraf user to the docker group so it can read the Docker socket
    usermod -aG docker telegraf
haproxy
https://github.com/influxdata/telegraf/tree/master/plugins/inputs/haproxy
https://cbonte.github.io/haproxy-dconv/1.8/management.html#9.1

    [[inputs.haproxy]]
      servers = ["http://10.194.99.9:8036/","http://10.194.99.10:8036/","http://10.194.99.11:8036/","http://10.194.99.12:8036/","http://10.194.99.13:8036/","http://10.194.99.14:8036/"]
Email configuration

    [smtp]
    enabled = true
    host = smtp.163.com:25
    user = youraccount@163.com
    password = yourpass
    skip_verify = true
    from_address = youraccount@163.com
    from_name = Ops Alerts

    [emails]
    content_types = text/plain
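Before relying on Grafana to deliver alerts, it can help to confirm that the SMTP account works on its own. The sketch below uses Python's standard `smtplib` with the same placeholder values as the configuration above; the helper names `build_test_message` and `send_test_message` are hypothetical, the sender name is a placeholder, and whether the server offers STARTTLS on port 25 is an assumption, so the code only upgrades the connection if the server advertises it.

```python
import smtplib
import ssl
from email.message import EmailMessage
from email.utils import formataddr


def build_test_message(from_name, from_address, to_address):
    """Build a plain-text test mail (hypothetical helper for illustration)."""
    msg = EmailMessage()
    msg["From"] = formataddr((from_name, from_address))
    msg["To"] = to_address
    msg["Subject"] = "Grafana SMTP test"
    msg.set_content("Test message sent with the same SMTP settings as Grafana.")
    return msg


def send_test_message(msg, host, port, user, password, skip_verify=True):
    """Send msg over SMTP, upgrading to TLS only if the server offers STARTTLS."""
    context = ssl.create_default_context()
    if skip_verify:
        # Mirrors skip_verify = true: do not validate the server certificate.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    with smtplib.SMTP(host, port, timeout=10) as smtp:
        smtp.ehlo()
        if smtp.has_extn("starttls"):
            smtp.starttls(context=context)
            smtp.ehlo()
        smtp.login(user, password)
        smtp.send_message(msg)


# Placeholder account values, as in the configuration above.
msg = build_test_message("Ops Alerts", "youraccount@163.com", "youraccount@163.com")
# Uncomment to actually send the message (requires real credentials and network access):
# send_test_message(msg, "smtp.163.com", 25, "youraccount@163.com", "yourpass")
```

If the login or send step fails here, the same credentials are unlikely to work from Grafana either.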
Email template
/usr/share/grafana/public/emails/alert_notification.txt
    {{if ne .State "ok" }}Anomalies:{{range .EvalMatches}}
    {{.Metric}} : {{.Value}}
    {{end}}{{end}}
    {{.Message}}
    {{if ne .Error "" }}
    Error: {{.Error}}
    {{end}}
    ----------
    If you have any questions, please get in touch. Thank you.

    Best Regards.
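Alert parameter configuration
Client reporting interval: 10s (one point is reported every 10 seconds)
Alert evaluation interval: 10s (shorter is better, but too short an interval hurts server performance)
Alert evaluation duration: 1m (once the condition has held for 1 minute, the state changes from pending to alerting and an email is sent)
Alert sampling window: 5m (looks back 300 seconds, which is 30 points at the 10s reporting interval)
Alert sampling threshold: 0.93 (tolerates about 2 missing points out of 30)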
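The relationship between these parameters is simple arithmetic, sketched below. It assumes the alert condition compares the fraction of good points in the sampling window against the threshold; that reading of the threshold, and the helper names, are assumptions for illustration rather than anything Grafana defines.

```python
import math


def points_in_window(window_s, report_interval_s):
    """Number of reported points that fall inside the sampling window."""
    return window_s // report_interval_s


def allowed_missing(points, threshold):
    """Points that may be missing while the good fraction stays >= threshold."""
    # round() guards against floating-point noise such as 27.900000000000002.
    required = math.ceil(round(threshold * points, 9))
    return points - required


points = points_in_window(5 * 60, 10)   # 5m window, one point every 10s
missing = allowed_missing(points, 0.93)
evaluations = 60 // 10                  # 1m duration at a 10s evaluation interval
print(points, missing, evaluations)     # 30 2 6
```

With a 10s reporting interval, a 5m window holds 30 points and a 0.93 threshold leaves room for 2 of them to be missing; the 1m duration corresponds to roughly 6 consecutive evaluations before the alert fires.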