BlueXIII's Blog

Passionate about technology, always learning

Telegraf Alert Configuration

Reference documentation

Installation

wget https://dl.influxdata.com/telegraf/releases/telegraf-1.20.2-1.x86_64.rpm
sudo yum localinstall telegraf-1.20.2-1.x86_64.rpm

debug

telegraf --debug
telegraf --test

log

logfile = "/var/log/telegraf/telegraf.log"

net_response

https://github.com/influxdata/telegraf/tree/master/plugins/inputs/net_response

[[inputs.net_response]]
protocol = "tcp"
address = "10.194.99.9:8050"

[[inputs.net_response]]
protocol = "tcp"
address = "10.194.99.10:8050"

[[inputs.net_response]]
protocol = "tcp"
address = "10.194.99.11:8050"

[[inputs.net_response]]
protocol = "tcp"
address = "10.194.99.12:8050"

[[inputs.net_response]]
protocol = "tcp"
address = "10.194.99.13:8050"

[[inputs.net_response]]
protocol = "tcp"
address = "10.194.99.14:8050"
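
The six stanzas above differ only in the host address, so they could be generated instead of typed by hand. Below is a minimal sketch in Python, assuming the same port and protocol for every host; the function name `render_net_response` is hypothetical and not part of Telegraf.

```python
def render_net_response(hosts, port=8050, protocol="tcp"):
    """Return Telegraf TOML with one [[inputs.net_response]] stanza per host."""
    stanzas = []
    for host in hosts:
        stanzas.append(
            "[[inputs.net_response]]\n"
            f'  protocol = "{protocol}"\n'
            f'  address = "{host}:{port}"\n'
        )
    # A blank line between stanzas keeps the generated file readable.
    return "\n".join(stanzas)


if __name__ == "__main__":
    # Hosts 10.194.99.9 through 10.194.99.14, as in the config above.
    hosts = [f"10.194.99.{n}" for n in range(9, 15)]
    print(render_net_response(hosts), end="")
```

The output can be redirected into a separate config file, so that adding a host only means extending the list.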

procstat

https://github.com/influxdata/telegraf/tree/master/plugins/inputs/procstat

[[inputs.procstat]]
exe = "mysqld"
process_name = "10.194.99.2-mysqld"

[[inputs.procstat]]
exe = "mongod"
process_name = "10.194.99.3-mongod"

[[inputs.procstat]]
pattern = "splash"
process_name = "10.194.99.4-minio"

docker

https://github.com/influxdata/telegraf/tree/master/plugins/inputs/docker

[[inputs.docker]]

Add the telegraf user to the docker group so that it can read the Docker socket:

usermod -aG docker telegraf

haproxy

https://github.com/influxdata/telegraf/tree/master/plugins/inputs/haproxy
https://cbonte.github.io/haproxy-dconv/1.8/management.html#9.1

[[inputs.haproxy]]
servers = [
  "http://10.194.99.9:8036/",
  "http://10.194.99.10:8036/",
  "http://10.194.99.11:8036/",
  "http://10.194.99.12:8036/",
  "http://10.194.99.13:8036/",
  "http://10.194.99.14:8036/"
]

Email configuration (Grafana, grafana.ini)

[smtp]
enabled = true
host = smtp.163.com:25
user = youraccount@163.com
password = yourpass
skip_verify = true
from_address = youraccount@163.com
from_name = Ops Alerts

[emails]
content_types = text/plain

Email template

/usr/share/grafana/public/emails/alert_notification.txt

{{if ne .State "ok" }}Anomalies:{{range .EvalMatches}}
{{.Metric}} : {{.Value}}
{{end}}{{end}}
{{.Message}}
{{if ne .Error "" }}
Error details: {{.Error}}
{{end}}
----------

If you have any questions, please get in touch. Thank you.
Best Regards.

Alert parameter configuration

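  • Client reporting interval: 10s (one point is reported every 10 seconds)
  • Alert evaluation interval: 10s (shorter is better, but too short hurts server performance)
  • Alert evaluation duration: 1m (once the condition has held for 1 minute, the state changes from pending to alerting and an email is sent)
  • Alert sampling window: 5m (looks back 300 seconds: 300 points at one point per second, or 30 points at the 10s reporting interval above)
  • Alert sampling threshold: 0.93 (tolerates roughly 7% missing points: about 20 of 300, or 2 of 30)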
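
How these intervals interact is plain arithmetic, sketched below in Python. The sketch assumes that the alert condition is the fraction of expected points that actually arrived within the window, and that the alert fires only when that fraction drops below the threshold; the function names are hypothetical and not part of Telegraf or Grafana.

```python
def points_in_window(window_s: int, interval_s: int) -> int:
    """Number of points expected in the window at the given reporting interval."""
    return window_s // interval_s


def max_missing_points(points: int, threshold_percent: int) -> int:
    """Most points that can be missing while the arrival ratio stays >= threshold.

    Integer math (threshold given in percent) avoids floating-point rounding.
    """
    min_present = (points * threshold_percent + 99) // 100  # ceiling division
    return points - min_present


def evaluations_while_pending(for_s: int, eval_interval_s: int) -> int:
    """Evaluation cycles that fit into the pending duration."""
    return for_s // eval_interval_s


if __name__ == "__main__":
    # Values from the list above: 10s reporting, 5m window, 0.93 threshold.
    points = points_in_window(window_s=300, interval_s=10)
    print(points)
    print(max_missing_points(points, threshold_percent=93))
    # 1m pending duration evaluated every 10s.
    print(evaluations_while_pending(for_s=60, eval_interval_s=10))
```

With a 10s reporting interval the 5m window holds 30 points, so a 0.93 threshold tolerates 2 missing points; a third missing point drops the ratio to 0.9.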