I. Environment setup
See Section 1 of the post "prometheus email alerting":
https://blog.csdn.net/oToyix/article/details/120160633
II. Process monitoring with process-exporter
1. Download, configure, and start process-exporter
wget -c https://github.com/ncabatoff/process-exporter/releases/download/v0.7.5/process-exporter-0.7.5.linux-amd64.tar.gz
tar -xf process-exporter-0.7.5.linux-amd64.tar.gz -C /usr/local/
cd /usr/local/
ln -s process-exporter-0.7.5.linux-amd64 process-exporter
cd process-exporter
firewall-cmd --add-port=9256/tcp --permanent
firewall-cmd --reload
# Do not start the exporter yet: it is launched with its config file in the Start step.
Process configuration file
vim process-exporter.yaml
process_names:
  - name: "{{.Matches}}"
    cmdline:
      - 'mysqld'
  - name: "{{.Matches}}"
    cmdline:
      - 'nginx'
  - name: "{{.Matches}}"
    cmdline:
      - 'php-fpm.conf'
Start
nohup /usr/local/process-exporter/process-exporter -config.path=/usr/local/process-exporter/process-exporter.yaml &
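Once the exporter is running, it helps to confirm which process groups it actually reports before writing any alert expression. The following is a minimal sketch, not part of the original setup: `list_proc_groups` is a hypothetical helper name, and it assumes the exporter serves the Prometheus text format on the default `/metrics` path with a `groupname` label on `namedprocess_namegroup_num_procs`.

```shell
#!/bin/sh
# list_proc_groups (hypothetical helper): read Prometheus text-format metrics
# on stdin and print "<groupname> <count>" for every
# namedprocess_namegroup_num_procs sample. Other lines are ignored.
list_proc_groups() {
  awk '/^namedprocess_namegroup_num_procs[{]/ {
    if (match($0, /groupname="[^"]*"/)) {
      # Skip the 11 characters of groupname=" and drop the closing quote.
      name = substr($0, RSTART + 11, RLENGTH - 12)
      # The sample value is the last whitespace-separated field.
      print name, $NF
    }
  }'
}
```

For example: curl -s http://localhost:9256/metrics | list_proc_groups. The groupname values printed are whatever the name template in process-exporter.yaml renders to, so this is a quick way to see the real label values.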
2. Prometheus server configuration
Add a scrape job that uses file-based service discovery (this entry goes under scrape_configs)
vim prometheus.yml
  - job_name: "proess"
    file_sd_configs:
      - files:
          - targets/proess-*.yaml
        refresh_interval: 2m
cat targets/proess-all.yaml
- targets:
    - 192.168.0.63:9256
  labels:
    app: node-process
    job: process
Alert rule: fire when the process count of a group is 0
cat alert_rules/process_down.yaml
groups:
  - name: Allprocess
    rules:
      - alert: InproessDown
        expr: namedprocess_namegroup_num_procs == 0
        for: 1m
        annotations:
          title: "process down"
          description: 'process has been down for more than 1 minute.'
        labels:
          severity: 'critical'
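Before reloading Prometheus, the edited files can be validated with promtool, which ships in the Prometheus release tarball. This is only a sketch: `check_prometheus_files` is a hypothetical wrapper name, it assumes promtool is on PATH, and it assumes prometheus.yml already references the alert_rules/ directory under rule_files (that part is not shown in this post).

```shell
#!/bin/sh
# check_prometheus_files CONFIG RULES (hypothetical wrapper)
# Validate the main config file and one rule file with promtool.
# Returns non-zero as soon as either check fails.
check_prometheus_files() {
  promtool check config "$1" || return 1
  promtool check rules "$2" || return 1
}
```

For example, from the Prometheus directory: check_prometheus_files prometheus.yml alert_rules/process_down.yaml. Running both checks catches YAML indentation mistakes in the scrape job and in the rule group before they reach the running server.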