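7. Monitoring Nginx and MySQL
7.1 Monitoring Nginx with zabbix
Script: nginx_status.sh
Template: nginx-template-magedu-jiege.xml
Monitor nginx runtime state, such as active connections and the current connection states.
Configuration example: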
location /nginx_status {
    stub_status;
    allow 172.31.0.0/16;
    allow 127.0.0.1;
    deny all;
}
The status page reports basic nginx status information.
Sample output:
Active connections: 291
server accepts handled requests
16630948 16630948 31070465
The three numbers above correspond to accepts, handled, and requests.
Reading: 6 Writing: 179 Waiting: 106
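Active connections: the current number of active client connections, including idle connections in the Waiting state.
accepts: cumulative counter; the total number of client connections nginx has accepted since it started.
handled: cumulative counter; the total number of connections nginx has handled since it started. It normally equals accepts, unless connections were dropped because a resource limit such as worker_connections was reached.
requests: cumulative counter; the total number of client requests received since nginx started.
Reading: current state; the number of connections where nginx is reading the request header.
Writing: current state; the number of connections where nginx is sending the response to the client.
Waiting: current state; the number of idle connections waiting for the client to send a request. With keep-alive enabled, this value equals active - (reading + writing).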
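The relationship between the fields can be checked with a short shell sketch. It parses the sample output shown above (hard-coded here, not fetched from a live server) and compares the reported Waiting value with Active - (Reading + Writing). The function name parse_status is an assumption made for this illustration.

```shell
#!/bin/bash
# Sample stub_status output copied from the example above.
sample='Active connections: 291
server accepts handled requests
 16630948 16630948 31070465
Reading: 6 Writing: 179 Waiting: 106'

# parse_status FIELD: read stub_status text on stdin and print one field.
# (Hypothetical helper name, used only in this sketch.)
parse_status() {
    awk -v field="$1" '
        /^Active connections:/ { v["active"] = $3 }
        NR == 3                { v["accepts"] = $1; v["handled"] = $2; v["requests"] = $3 }
        /^Reading:/            { v["reading"] = $2; v["writing"] = $4; v["waiting"] = $6 }
        END                    { print v[field] }
    '
}

active=$(printf '%s\n' "$sample" | parse_status active)
reading=$(printf '%s\n' "$sample" | parse_status reading)
writing=$(printf '%s\n' "$sample" | parse_status writing)
waiting=$(printf '%s\n' "$sample" | parse_status waiting)

# Prints: waiting=106 computed=106
echo "waiting=$waiting computed=$((active - reading - writing))"
```

On a live server the two values may differ briefly, because the counters keep changing while the page is being generated.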
7.1.1 Deploying Nginx on the zabbix-agent host
[root@zabbix-agent ~]#apt install nginx -y
[root@zabbix-agent ~]#vim /apps/nginx/conf/nginx.conf
......
server {
listen 80;
listen [::]:80;
server_name _;
root /usr/share/nginx/html;
    location / {
        root html;
        index index.html index.htm;
    }
    location /nginx_status {
        stub_status;
        allow 172.31.0.0/16;
        allow 127.0.0.1;
        deny all;
    }
........
[root@zabbix-agent ~]#systemctl restart zabbix-agent nginx
[root@zabbix-agent ~]#curl http://127.0.0.1:80/nginx_status
Active connections: 1
server accepts handled requests
4 4 4
Reading: 0 Writing: 1 Waiting: 0
7.1.2 Monitoring item script
The script lives on the zabbix-agent host.
On the zabbix-agent host:
[root@zabbix-agent ~]#cat nginx_status.sh
#!/bin/bash
# Print a single nginx stub_status metric.
# $1 = nginx port, $2 = metric name
nginx_status_fun(){
    NGINX_PORT=$1
    NGINX_COMMAND=$2
    nginx_active(){
        /usr/bin/curl "http://127.0.0.1:"$NGINX_PORT"/nginx_status/" 2> /dev/null | grep 'Active' | awk '{print $NF}'
    }
    nginx_reading(){
        /usr/bin/curl "http://127.0.0.1:"$NGINX_PORT"/nginx_status/" 2> /dev/null | grep 'Reading' | awk '{print $2}'
    }
    nginx_writing(){
        /usr/bin/curl "http://127.0.0.1:"$NGINX_PORT"/nginx_status/" 2> /dev/null | grep 'Writing' | awk '{print $4}'
    }
    nginx_waiting(){
        /usr/bin/curl "http://127.0.0.1:"$NGINX_PORT"/nginx_status/" 2> /dev/null | grep 'Waiting' | awk '{print $6}'
    }
    # The third line of the output holds accepts, handled, requests.
    nginx_accepts(){
        /usr/bin/curl "http://127.0.0.1:"$NGINX_PORT"/nginx_status/" 2> /dev/null | awk NR==3 | awk '{print $1}'
    }
    nginx_handled(){
        /usr/bin/curl "http://127.0.0.1:"$NGINX_PORT"/nginx_status/" 2> /dev/null | awk NR==3 | awk '{print $2}'
    }
    nginx_requests(){
        /usr/bin/curl "http://127.0.0.1:"$NGINX_PORT"/nginx_status/" 2> /dev/null | awk NR==3 | awk '{print $3}'
    }
    case $NGINX_COMMAND in
        active)
            nginx_active;
            ;;
        reading)
            nginx_reading;
            ;;
        writing)
            nginx_writing;
            ;;
        waiting)
            nginx_waiting;
            ;;
        accepts)
            nginx_accepts;
            ;;
        handled)
            nginx_handled;
            ;;
        requests)
            nginx_requests;
    esac
}
# $1 selects the check type; $2 and $3 are passed on as port and metric.
main (){
    case $1 in
        nginx_status)
            nginx_status_fun $2 $3;
            ;;
        *)
            echo $"Usage: $0 {nginx_status key}"
    esac
}
main $1 $2 $3
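The script above passes the port and the metric name straight into the curl URL and the case statement. One possible hardening step, not part of the original script, is to validate both arguments first. The helper names valid_port, valid_metric, and check_args below are hypothetical.

```shell
#!/bin/bash
# Hypothetical helpers: validate PORT and METRIC before querying nginx.

# Accept only the seven metric names that nginx_status.sh handles.
valid_metric() {
    case "$1" in
        active|reading|writing|waiting|accepts|handled|requests) return 0 ;;
        *) return 1 ;;
    esac
}

# Digits only, within the TCP port range. 10# forces base 10 so that
# a value such as 080 is not read as octal.
valid_port() {
    [[ "$1" =~ ^[0-9]+$ ]] && (( 10#$1 >= 1 && 10#$1 <= 65535 ))
}

# check_args PORT METRIC: return 0 when both arguments are usable.
check_args() {
    if ! valid_port "$1"; then
        echo "invalid port: $1" >&2
        return 1
    fi
    if ! valid_metric "$2"; then
        echo "invalid metric: $2" >&2
        return 1
    fi
}
```

A call such as check_args "$2" "$3" || exit 1 at the start of main would then reject unexpected input before any request is sent.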
Test on the zabbix-agent host (the one running nginx):
[root@zabbix-agent ~]#bash nginx_status.sh nginx_status 80 active
1
[root@zabbix-agent ~]#bash nginx_status.sh nginx_status 80 handled
4
[root@zabbix-agent ~]#bash nginx_status.sh nginx_status 80 writing
1
[root@zabbix-agent ~]#bash nginx_status.sh nginx_status 80 waiting
0
[root@zabbix-agent ~]#bash nginx_status.sh nginx_status 80 reading
0
Copy the script to /etc/zabbix/zabbix_agentd.d/ on the zabbix-agent host and make the copy executable:
[root@zabbix-agent ~]#cp nginx_status.sh /etc/zabbix/zabbix_agentd.d/
[root@zabbix-agent ~]# chmod +x /etc/zabbix/zabbix_agentd.d/nginx_status.sh
7.1.3 Adding custom monitoring items to the zabbix agent
[root@zabbix-agent ~]# vim /etc/zabbix/zabbix_agentd.conf
.......
UserParameter=nginx.status,systemctl status nginx | awk NR==5'{print $3}'| awk -F '(' '{print $2}'| awk -F')' '{print $1}'
UserParameter=nginx.status[*],/etc/zabbix/zabbix_agentd.d/nginx_status.sh "$1" "$2" "$3"
####### LOADABLE MODULES #######
.......
[root@zabbix-agent ~]#systemctl restart zabbix-agent nginx
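With a UserParameter defined as nginx.status[*], the comma-separated values inside the brackets of the requested key are passed to the command as $1, $2, and $3. The sketch below only illustrates that mapping; it is not how the agent is implemented, it ignores quoted parameters and nested brackets, and the function name split_key is made up for this example.

```shell
#!/bin/bash
# Illustration only: split a flexible item key into its parameters, the
# way $1, $2, $3 are filled in for a UserParameter defined with [*].
split_key() {
    local key=$1
    local name=${key%%\[*}      # text before the first '['
    local params=${key#*\[}     # text after the first '['
    params=${params%]}          # drop the trailing ']'
    local IFS=,
    local -a parts
    read -r -a parts <<< "$params"
    echo "key=$name"
    local i
    for i in "${!parts[@]}"; do
        echo "\$$((i + 1))=${parts[i]}"
    done
}

split_key 'nginx.status[nginx_status,80,active]'
# Prints:
# key=nginx.status
# $1=nginx_status
# $2=80
# $3=active
```

For the key above, the agent therefore ends up running nginx_status.sh with the arguments nginx_status, 80, and active.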
7.1.4 Testing the monitoring item from the zabbix server
Install zabbix-get:
[root@zabbix-server ~]# apt install zabbix-get
[root@zabbix-server ~]#zabbix_get -s 172.31.0.108 -p 10050 -k "nginx.status[nginx_status,80,active]"
1
[root@zabbix-server ~]#systemctl restart zabbix-server apache2
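To test every metric rather than only active, the same zabbix_get call can be wrapped in a loop. This is a minimal sketch that assumes the agent address and port used in this section; build_keys is a hypothetical helper name. When zabbix_get is not installed, the sketch only lists the keys it would query.

```shell
#!/bin/bash
AGENT_IP=172.31.0.108   # agent address used in this section
NGINX_PORT=80

# Print one item key per metric handled by nginx_status.sh.
build_keys() {
    local m
    for m in active reading writing waiting accepts handled requests; do
        echo "nginx.status[nginx_status,${NGINX_PORT},${m}]"
    done
}

# Query each key only when zabbix_get is available; otherwise list the keys.
if command -v zabbix_get >/dev/null 2>&1; then
    while read -r key; do
        printf '%s = ' "$key"
        zabbix_get -s "$AGENT_IP" -p 10050 -k "$key" </dev/null || echo "(query failed)"
    done < <(build_keys)
else
    build_keys
fi
```

A key that returns a number here should also return data once the template item with the same key is linked to the host.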
7.1.5 Adding the monitored host in the zabbix web UI
Open in a browser: http://172.31.0.101/zabbix/
7.1.6 Importing the template in the zabbix web UI
Configuration - Templates - Import
7.1.7 Linking the Nginx monitoring template to the host
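7.1.8 Verifying the monitoring data
Summary:
In this lab the graphs showed no data at first. The fixes were:
(1) Synchronize the clocks on all hosts.
(2) The key name nginx.status configured on the zabbix-agent must match the key name used in the template, and the key must return a value when tested from the zabbix-server.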
[root@zabbix-agent ~]#vim /etc/zabbix/zabbix_agentd.conf
UserParameter=nginx.status[*],/etc/zabbix/zabbix_agentd.d/nginx_status.sh "$1" "$2" "$3"
[root@zabbix-server ~]#zabbix_get -s 172.31.0.108 -p 10050 -k "nginx.status[nginx_status,80,active]"
1
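(3) Read the agent log; it shows where the problem is. Here the agent is trying to reach the server at 127.0.0.1:10051 and the connection is refused: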
[root@zabbix-agent /var/log/zabbix-agent]#tail zabbix_agentd.log
2761:20220827:195829.406 TLS support: YES
2761:20220827:195829.406 **************************
2761:20220827:195829.406 using configuration file: /etc/zabbix/zabbix_agentd.conf
2761:20220827:195829.407 agent #0 started [main process]
2773:20220827:195829.411 agent #5 started [active checks #1]
2773:20220827:195829.411 active check configuration update from [127.0.0.1:10051] started to fail (cannot connect to [[127.0.0.1]:10051]: [111] Connection refused)
2772:20220827:195829.412 agent #4 started [listener #3]
2771:20220827:195829.412 agent #3 started [listener #2]
2770:20220827:195829.412 agent #2 started [listener #1]
2769:20220827:195829.412 agent #1 started [collector]
(4) Update the zabbix-agent configuration file, then restart zabbix-server and zabbix-agent:
[root@zabbix-agent ~]#vim /etc/zabbix/zabbix_agentd.conf
ListenPort=10050
# zabbix-server IP
Server=172.31.0.101
StartAgents=3
# zabbix-server IP
ServerActive=172.31.0.101
# zabbix-agent IP
Hostname=172.31.0.108
[root@zabbix-server ~]#systemctl restart zabbix-server apache2
[root@zabbix-agent ~]#systemctl restart zabbix-agent nginx
Check the log again:
[root@zabbix-agent /var/log/zabbix-agent]#tail zabbix_agentd.log -f
3015:20220827:201547.054 TLS support: YES
3015:20220827:201547.054 **************************
3015:20220827:201547.054 using configuration file: /etc/zabbix/zabbix_agentd.conf
3015:20220827:201547.054 agent #0 started [main process]
3026:20220827:201547.055 agent #1 started [collector]
3030:20220827:201547.055 agent #5 started [active checks #1]
3029:20220827:201547.056 agent #4 started [listener #3]
3028:20220827:201547.056 agent #3 started [listener #2]
3027:20220827:201547.057 agent #2 started [listener #1]
3030:20220827:201550.056 active check configuration update from [172.31.0.101:10051] started to fail (ZBX_TCP_READ() timed out)
3030:20220827:201650.644 active check configuration update from [172.31.0.101:10051] is working again
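When reading the agent log as in step (3), it helps to pull out which server address the active checks failed against. The sketch below does this with sed on two lines copied from the logs above; failed_servers is a hypothetical helper name, and on a real host the function would be fed the agent log file instead of the sample text.

```shell
#!/bin/bash
# Two failure lines copied from the agent logs shown above.
sample_log='2773:20220827:195829.411 active check configuration update from [127.0.0.1:10051] started to fail (cannot connect to [[127.0.0.1]:10051]: [111] Connection refused)
3030:20220827:201550.056 active check configuration update from [172.31.0.101:10051] started to fail (ZBX_TCP_READ() timed out)'

# Print the server address of every failed active-check update read from stdin.
failed_servers() {
    sed -n 's/.*active check configuration update from \[\([^]]*\)] started to fail.*/\1/p'
}

printf '%s\n' "$sample_log" | failed_servers
# Prints:
# 127.0.0.1:10051
# 172.31.0.101:10051
```

The first address shows the agent still pointing at 127.0.0.1 before the configuration change; the second shows it reaching for the real server address afterwards, where the log then reports the check working again.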