kube-prometheus is an open-source project from CoreOS for monitoring Kubernetes clusters.
yum install git -y
git clone https://github.com/coreos/kube-prometheus
[root@k8s-master guoxb]# cd kube-prometheus/manifests/
[root@k8s-master manifests]# ll
total 1652
-rw-r--r-- 1 root root     406 Mar 30 19:38 alertmanager-alertmanager.yaml
-rw-r--r-- 1 root root    1017 Mar 30 19:38 alertmanager-secret.yaml
-rw-r--r-- 1 root root     101 Mar 30 19:38 alertmanager-serviceAccount.yaml
-rw-r--r-- 1 root root     268 Mar 30 19:38 alertmanager-serviceMonitor.yaml
-rw-r--r-- 1 root root     287 Mar 30 19:38 alertmanager-service.yaml
-rw-r--r-- 1 root root     558 Mar 30 19:38 grafana-dashboardDatasources.yaml
-rw-r--r-- 1 root root 1391276 Mar 30 19:38 grafana-dashboardDefinitions.yaml
-rw-r--r-- 1 root root     475 Mar 30 19:38 grafana-dashboardSources.yaml
-rw-r--r-- 1 root root    7737 Mar 30 19:38 grafana-deployment.yaml
-rw-r--r-- 1 root root      91 Mar 30 19:38 grafana-serviceAccount.yaml
-rw-r--r-- 1 root root     220 Mar 30 19:38 grafana-serviceMonitor.yaml
-rw-r--r-- 1 root root     215 Mar 30 19:38 grafana-service.yaml
-rw-r--r-- 1 root root     390 Mar 30 19:38 kube-state-metrics-clusterRoleBinding.yaml
-rw-r--r-- 1 root root    1856 Mar 30 19:38 kube-state-metrics-clusterRole.yaml
-rw-r--r-- 1 root root    1980 Mar 30 19:38 kube-state-metrics-deployment.yaml
-rw-r--r-- 1 root root     199 Mar 30 19:38 kube-state-metrics-serviceAccount.yaml
-rw-r--r-- 1 root root     860 Mar 30 19:38 kube-state-metrics-serviceMonitor.yaml
-rw-r--r-- 1 root root     421 Mar 30 19:38 kube-state-metrics-service.yaml
-rw-r--r-- 1 root root     278 Mar 30 19:38 node-exporter-clusterRoleBinding.yaml
-rw-r--r-- 1 root root     300 Mar 30 19:38 node-exporter-clusterRole.yaml
-rw-r--r-- 1 root root    2833 Mar 30 19:38 node-exporter-daemonset.yaml
-rw-r--r-- 1 root root      97 Mar 30 19:38 node-exporter-serviceAccount.yaml
-rw-r--r-- 1 root root     739 Mar 30 19:38 node-exporter-serviceMonitor.yaml
-rw-r--r-- 1 root root     372 Mar 30 19:38 node-exporter-service.yaml
-rw-r--r-- 1 root root     305 Mar 30 19:38 prometheus-adapter-apiService.yaml
-rw-r--r-- 1 root root     414 Mar 30 19:38 prometheus-adapter-clusterRoleAggregatedMetricsReader.yaml
-rw-r--r-- 1 root root     316 Mar 30 19:38 prometheus-adapter-clusterRoleBindingDelegator.yaml
-rw-r--r-- 1 root root     293 Mar 30 19:38 prometheus-adapter-clusterRoleBinding.yaml
-rw-r--r-- 1 root root     199 Mar 30 19:38 prometheus-adapter-clusterRoleServerResources.yaml
-rw-r--r-- 1 root root     235 Mar 30 19:38 prometheus-adapter-clusterRole.yaml
-rw-r--r-- 1 root root    1314 Mar 30 19:38 prometheus-adapter-configMap.yaml
-rw-r--r-- 1 root root    1379 Mar 30 19:38 prometheus-adapter-deployment.yaml
-rw-r--r-- 1 root root     338 Mar 30 19:38 prometheus-adapter-roleBindingAuthReader.yaml
-rw-r--r-- 1 root root     102 Mar 30 19:38 prometheus-adapter-serviceAccount.yaml
-rw-r--r-- 1 root root     250 Mar 30 19:38 prometheus-adapter-service.yaml
-rw-r--r-- 1 root root     281 Mar 30 19:38 prometheus-clusterRoleBinding.yaml
-rw-r--r-- 1 root root     231 Mar 30 19:38 prometheus-clusterRole.yaml
-rw-r--r-- 1 root root     643 Mar 30 19:38 prometheus-operator-serviceMonitor.yaml
-rw-r--r-- 1 root root     785 Jun  3 20:51 prometheus-prometheus.yaml
-rw-r--r-- 1 root root     306 Mar 30 19:38 prometheus-roleBindingConfig.yaml
-rw-r--r-- 1 root root    1025 Mar 30 19:38 prometheus-roleBindingSpecificNamespaces.yaml
-rw-r--r-- 1 root root     200 Mar 30 19:38 prometheus-roleConfig.yaml
-rw-r--r-- 1 root root     871 Mar 30 19:38 prometheus-roleSpecificNamespaces.yaml
-rw-r--r-- 1 root root   72376 Mar 30 19:38 prometheus-rules.yaml
-rw-r--r-- 1 root root      98 Mar 30 19:38 prometheus-serviceAccount.yaml
-rw-r--r-- 1 root root    6903 Mar 30 19:38 prometheus-serviceMonitorApiserver.yaml
-rw-r--r-- 1 root root     414 Mar 30 19:38 prometheus-serviceMonitorCoreDNS.yaml
-rw-r--r-- 1 root root    6227 Mar 30 19:38 prometheus-serviceMonitorKubeControllerManager.yaml
-rw-r--r-- 1 root root    6855 Mar 30 19:38 prometheus-serviceMonitorKubelet.yaml
-rw-r--r-- 1 root root     365 Mar 30 19:38 prometheus-serviceMonitorKubeScheduler.yaml
-rw-r--r-- 1 root root     261 Mar 30 19:38 prometheus-serviceMonitor.yaml
-rw-r--r-- 1 root root     276 Mar 30 19:38 prometheus-service.yaml
drwxr-xr-x 2 root root    4096 Mar 30 19:38 setup
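Some images cannot be pulled from the overseas registry, so here we point the images for prometheus-operator, prometheus, alertmanager, kube-state-metrics, node-exporter, and prometheus-adapter at a mirror inside China. I use the USTC (University of Science and Technology of China) mirror.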
sed -i 's/quay.io/quay.mirrors.ustc.edu.cn/g' setup/prometheus-operator-deployment.yaml
sed -i 's/quay.io/quay.mirrors.ustc.edu.cn/g' prometheus-prometheus.yaml
sed -i 's/quay.io/quay.mirrors.ustc.edu.cn/g' alertmanager-alertmanager.yaml
sed -i 's/quay.io/quay.mirrors.ustc.edu.cn/g' kube-state-metrics-deployment.yaml
sed -i 's/quay.io/quay.mirrors.ustc.edu.cn/g' node-exporter-daemonset.yaml
sed -i 's/quay.io/quay.mirrors.ustc.edu.cn/g' prometheus-adapter-deployment.yaml
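The six sed calls above can also be written as one loop. The following is only a sketch, not part of the original procedure: the function name rewrite_registry is made up for illustration, it assumes GNU sed (in-place editing with a bare -i), and it escapes the dot so that only the literal hostname quay.io matches.

```shell
# Hypothetical helper (not shipped with kube-prometheus): point every
# quay.io image reference in the given manifests at the USTC mirror.
rewrite_registry() {
  local f
  for f in "$@"; do
    # "\." matches a literal dot; GNU sed edits the file in place.
    sed -i 's/quay\.io/quay.mirrors.ustc.edu.cn/g' "$f"
  done
}

# Usage, run from the manifests/ directory:
# rewrite_registry setup/prometheus-operator-deployment.yaml \
#   prometheus-prometheus.yaml alertmanager-alertmanager.yaml \
#   kube-state-metrics-deployment.yaml node-exporter-daemonset.yaml \
#   prometheus-adapter-deployment.yaml
```

Running it a second time changes nothing, because the mirror hostname no longer contains the string quay.io.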
To make prometheus, alertmanager, and grafana reachable from outside the cluster, we change the Service type of each of them to NodePort.
Modify the prometheus Service
[root@k8s-master manifests]# cat prometheus-service.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    prometheus: k8s
  name: prometheus-k8s
  namespace: monitoring
spec:
  type: NodePort  # added
  ports:
  - name: web
    port: 9090
    targetPort: web
    nodePort: 30090  # added
  selector:
    app: prometheus
    prometheus: k8s
  sessionAffinity: ClientIP
Modify the alertmanager Service
[root@k8s-master manifests]# cat alertmanager-service.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    alertmanager: main
  name: alertmanager-main
  namespace: monitoring
spec:
  type: NodePort  # added
  ports:
  - name: web
    port: 9093
    targetPort: web
    nodePort: 30093  # added
  selector:
    alertmanager: main
    app: alertmanager
  sessionAffinity: ClientIP
Modify the grafana Service
[root@k8s-master manifests]# cat grafana-service.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    app: grafana
  name: grafana
  namespace: monitoring
spec:
  type: NodePort  # added
  ports:
  - name: http
    port: 3000
    targetPort: http
    nodePort: 32000  # added
  selector:
    app: grafana
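The three nodePort values chosen above (30090, 30093, 32000) have to fall inside the cluster's NodePort range, which is 30000-32767 unless the API server was started with a different --service-node-port-range. A small sketch for checking a value before applying the manifests; the function name is hypothetical and the default range is assumed.

```shell
# Hypothetical helper: succeed if $1 lies in the default NodePort range.
# Assumes the API server keeps the default --service-node-port-range
# of 30000-32767; adjust the bounds if your cluster overrides it.
in_default_nodeport_range() {
  local port="$1"
  [ "$port" -ge 30000 ] && [ "$port" -le 32767 ]
}

for p in 30090 30093 32000; do
  if in_default_nodeport_range "$p"; then
    echo "$p ok"
  else
    echo "$p outside default range"
  fi
done
```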
Install the CRDs and prometheus-operator
[root@k8s-master manifests]# kubectl apply -f setup/
namespace/monitoring created
customresourcedefinition.apiextensions.k8s.io/alertmanagers.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/podmonitors.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/prometheuses.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/prometheusrules.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/servicemonitors.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/thanosrulers.monitoring.coreos.com created
clusterrole.rbac.authorization.k8s.io/prometheus-operator created
clusterrolebinding.rbac.authorization.k8s.io/prometheus-operator created
deployment.apps/prometheus-operator created
service/prometheus-operator created
serviceaccount/prometheus-operator created
[root@k8s-master manifests]#
Pulling the prometheus-operator image takes a few minutes. Wait until the prometheus-operator pod reaches the Running state.
[root@k8s-master manifests]# kubectl get pod -n monitoring
NAME                                  READY   STATUS    RESTARTS   AGE
prometheus-operator-c5c9679cd-4wwf7   2/2     Running   0          13s
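Rather than re-running kubectl get pod by hand, the wait can be scripted with kubectl rollout status, which blocks until the Deployment has finished rolling out. This is a sketch only: the wrapper name wait_for_operator and the 300s default timeout are choices made here, not something the original procedure prescribes. The Deployment name and namespace come from the apply output above.

```shell
# Block until the prometheus-operator Deployment has finished rolling out,
# or give up after the timeout (300s is an arbitrary default chosen here).
wait_for_operator() {
  kubectl -n monitoring rollout status deployment/prometheus-operator \
    --timeout="${1:-300s}"
}

# Usage:
# wait_for_operator         # wait up to 300s
# wait_for_operator 600s    # wait up to 10 minutes
```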
Install prometheus, alertmanager, grafana, kube-state-metrics, node-exporter, and the remaining resources
[root@k8s-master manifests]# kubectl apply -f .
alertmanager.monitoring.coreos.com/main created
secret/alertmanager-main created
service/alertmanager-main created
serviceaccount/alertmanager-main created
servicemonitor.monitoring.coreos.com/alertmanager created
secret/grafana-datasources created
configmap/grafana-dashboard-apiserver created
configmap/grafana-dashboard-cluster-total created
configmap/grafana-dashboard-controller-manager created
configmap/grafana-dashboard-k8s-resources-cluster created
configmap/grafana-dashboard-k8s-resources-namespace created
configmap/grafana-dashboard-k8s-resources-node created
configmap/grafana-dashboard-k8s-resources-pod created
configmap/grafana-dashboard-k8s-resources-workload created
configmap/grafana-dashboard-k8s-resources-workloads-namespace created
configmap/grafana-dashboard-kubelet created
configmap/grafana-dashboard-namespace-by-pod created
configmap/grafana-dashboard-namespace-by-workload created
configmap/grafana-dashboard-node-cluster-rsrc-use created
configmap/grafana-dashboard-node-rsrc-use created
configmap/grafana-dashboard-nodes created
configmap/grafana-dashboard-persistentvolumesusage created
configmap/grafana-dashboard-pod-total created
configmap/grafana-dashboard-prometheus-remote-write created
configmap/grafana-dashboard-prometheus created
configmap/grafana-dashboard-proxy created
configmap/grafana-dashboard-scheduler created
configmap/grafana-dashboard-statefulset created
configmap/grafana-dashboard-workload-total created
configmap/grafana-dashboards created
deployment.apps/grafana created
service/grafana created
serviceaccount/grafana created
servicemonitor.monitoring.coreos.com/grafana created
clusterrole.rbac.authorization.k8s.io/kube-state-metrics created
clusterrolebinding.rbac.authorization.k8s.io/kube-state-metrics created
deployment.apps/kube-state-metrics created
service/kube-state-metrics created
serviceaccount/kube-state-metrics created
servicemonitor.monitoring.coreos.com/kube-state-metrics created
clusterrole.rbac.authorization.k8s.io/node-exporter created
clusterrolebinding.rbac.authorization.k8s.io/node-exporter created
daemonset.apps/node-exporter created
service/node-exporter created
serviceaccount/node-exporter created
servicemonitor.monitoring.coreos.com/node-exporter created
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created
clusterrole.rbac.authorization.k8s.io/prometheus-adapter created
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
clusterrolebinding.rbac.authorization.k8s.io/prometheus-adapter created
clusterrolebinding.rbac.authorization.k8s.io/resource-metrics:system:auth-delegator created
clusterrole.rbac.authorization.k8s.io/resource-metrics-server-resources created
configmap/adapter-config created
deployment.apps/prometheus-adapter created
rolebinding.rbac.authorization.k8s.io/resource-metrics-auth-reader created
service/prometheus-adapter created
serviceaccount/prometheus-adapter created
clusterrole.rbac.authorization.k8s.io/prometheus-k8s created
clusterrolebinding.rbac.authorization.k8s.io/prometheus-k8s created
servicemonitor.monitoring.coreos.com/prometheus-operator created
prometheus.monitoring.coreos.com/k8s created
rolebinding.rbac.authorization.k8s.io/prometheus-k8s-config created
rolebinding.rbac.authorization.k8s.io/prometheus-k8s created
rolebinding.rbac.authorization.k8s.io/prometheus-k8s created
rolebinding.rbac.authorization.k8s.io/prometheus-k8s created
role.rbac.authorization.k8s.io/prometheus-k8s-config created
role.rbac.authorization.k8s.io/prometheus-k8s created
role.rbac.authorization.k8s.io/prometheus-k8s created
role.rbac.authorization.k8s.io/prometheus-k8s created
prometheusrule.monitoring.coreos.com/prometheus-k8s-rules created
service/prometheus-k8s created
serviceaccount/prometheus-k8s created
servicemonitor.monitoring.coreos.com/prometheus created
servicemonitor.monitoring.coreos.com/kube-apiserver created
servicemonitor.monitoring.coreos.com/coredns created
servicemonitor.monitoring.coreos.com/kube-controller-manager created
servicemonitor.monitoring.coreos.com/kube-scheduler created
servicemonitor.monitoring.coreos.com/kubelet created
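Pulling the images takes a while, so go make a cup of coffee and come back in half an hour. Then check the pod status in the monitoring namespace; once every pod there is Running, the installation is complete.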
[root@k8s-master manifests]# kubectl get pod -n monitoring
NAME                                  READY   STATUS    RESTARTS   AGE
alertmanager-main-0                   2/2     Running   0          6m30s
alertmanager-main-1                   2/2     Running   0          6m30s
alertmanager-main-2                   2/2     Running   0          6m30s
grafana-5c55845445-x8s96              1/1     Running   0          6m30s
kube-state-metrics-5848b95f69-jwtkj   3/3     Running   0          6m30s
node-exporter-82zqh                   2/2     Running   0          6m30s
node-exporter-j97g4                   2/2     Running   0          6m30s
node-exporter-tg7c9                   2/2     Running   0          6m30s
prometheus-adapter-7d68d6f886-wjjx4   1/1     Running   0          6m30s
prometheus-k8s-0                      3/3     Running   1          6m29s
prometheus-k8s-1                      3/3     Running   1          6m29s
prometheus-operator-c5c9679cd-4wwf7   2/2     Running   0          88m
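If you would rather not check back by hand, kubectl wait can block until every pod in the namespace reports the Ready condition. A sketch under stated assumptions: the wrapper name wait_for_monitoring_pods is hypothetical, and the 1800s default simply mirrors the half hour mentioned above rather than any measured value.

```shell
# Block until every pod in the monitoring namespace is Ready, or fail
# after the timeout. 1800s mirrors the "half an hour" estimate above.
wait_for_monitoring_pods() {
  kubectl wait --namespace monitoring --for=condition=Ready pod --all \
    --timeout="${1:-1800s}"
}

# Usage:
# wait_for_monitoring_pods && echo "monitoring stack is up"
```

Note that Ready is slightly stricter than the Running status shown in the listing: a pod is Ready only when all of its containers pass their readiness checks.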
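Open http://192.168.92.201:30090 in a browser to reach Prometheus (192.168.92.201 is the master node's IP).
Open http://192.168.92.201:30093 in a browser to reach Alertmanager.
Open http://192.168.92.201:32000 in a browser to reach Grafana.
Grafana's default username/password: admin/admin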
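The same three endpoints can be probed from a terminal before opening a browser. The sketch below uses the health endpoints each component exposes (/-/healthy for Prometheus and Alertmanager, /api/health for Grafana) together with the NodePorts configured earlier; the function name check_monitoring_ui and the 5-second timeout are assumptions made for illustration.

```shell
# Probe the health endpoint of each UI through its NodePort.
# $1 is a node IP; this article uses the master IP 192.168.92.201.
check_monitoring_ui() {
  local host="$1"
  local url
  for url in \
    "http://$host:30090/-/healthy" \
    "http://$host:30093/-/healthy" \
    "http://$host:32000/api/health"; do
    # -f turns HTTP errors into a non-zero exit, -sS hides progress but keeps errors.
    if curl -fsS -o /dev/null --max-time 5 "$url"; then
      echo "OK   $url"
    else
      echo "FAIL $url"
    fi
  done
}

# Usage:
# check_monitoring_ui 192.168.92.201
```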
https://github.com/coreos/kube-prometheus
https://www.jianshu.com/p/2fbbe767870d