Rancher is an open-source, enterprise-grade multi-cluster Kubernetes management platform. It centralizes the deployment and management of Kubernetes clusters across hybrid clouds and on-premises data centers, keeps clusters secure, and helps accelerate enterprise digital transformation.
More than 40,000 organizations use Rancher every day to innovate faster.
Official documentation: https://docs.rancher.cn/
Both Rancher and Kubernetes act as container scheduling and orchestration systems, but Rancher does more than manage application containers: crucially, it can also manage Kubernetes clusters themselves. Rancher 2.x is built on the Kubernetes scheduling engine, and through Rancher's abstractions users can deploy containers into a Kubernetes cluster without having to be familiar with Kubernetes concepts.
K8s cluster role | IP | Hostname | Version |
---|---|---|---|
Control-plane node | 192.168.40.180 | k8s-master1 | v1.20.6 |
Worker node | 192.168.40.181 | k8s-node1 | v1.20.6 |
Worker node | 192.168.40.182 | k8s-node2 | v1.20.6 |
Rancher server | 192.168.40.138 | rancher | v2.5.7 |
```bash
[root@k8s-master1 ~]# docker pull rancher/rancher-agent:v2.5.7
[root@rancher ~]# docker pull rancher/rancher:v2.5.7
# Note: unless-stopped always restarts the container when it exits, but does not
# restart containers that were already stopped when the Docker daemon starts
[root@rancher ~]# docker run -d --restart=unless-stopped -p 80:80 -p 443:443 --privileged --name rancher rancher/rancher:v2.5.7
[root@rancher ~]# docker ps -a | grep rancher
a893cc6d7bc3   rancher/rancher:v2.5.7   "entrypoint.sh"   3 seconds ago   Up 2 seconds   0.0.0.0:80->80/tcp, :::80->80/tcp, 0.0.0.0:443->443/tcp, :::443->443/tcp   rancher
```
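Once the container is up, the Rancher API should respond on the published HTTPS port. A minimal sanity check (only a sketch; `-k` is needed because the certificate is self-signed, and the `/ping` endpoint is assumed to behave as in other Rancher 2.x releases):

```bash
# Expect "pong" once the Rancher server has finished starting
curl -k https://192.168.40.138/ping
```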
Open the Rancher server's IP address in a browser. Because no trusted certificate is used, the browser shows a security warning; it can be ignored.
Set the UI language to Chinese:
Choose Add Cluster and import the existing cluster.
Run the command indicated by the arrow above on the Kubernetes control-plane node k8s-master1:
```bash
[root@k8s-master1 ~]# curl --insecure -sfL https://192.168.40.138/v3/import/7jzb5nnjjjpqnqnpv9g6p26z4j4c5qncgbttwlr8s2gfl2qk7th6x6_c-n5w99.yaml | kubectl apply -f -
error: no objects passed to apply
# Run it again:
[root@k8s-master1 ~]# curl --insecure -sfL https://192.168.40.138/v3/import/7jzb5nnjjjpqnqnpv9g6p26z4j4c5qncgbttwlr8s2gfl2qk7th6x6_c-n5w99.yaml | kubectl apply -f -
clusterrole.rbac.authorization.k8s.io/proxy-clusterrole-kubeapiserver created
clusterrolebinding.rbac.authorization.k8s.io/proxy-role-binding-kubernetes-master created
namespace/cattle-system created
serviceaccount/cattle created
clusterrolebinding.rbac.authorization.k8s.io/cattle-admin-binding created
secret/cattle-credentials-6539558 created
clusterrole.rbac.authorization.k8s.io/cattle-admin created
deployment.apps/cattle-cluster-agent created
[root@k8s-master1 ~]# kubectl get ns
NAME              STATUS   AGE
cattle-system     Active   7m4s
default           Active   5d1h
fleet-system      Active   5m34s
kube-node-lease   Active   5d1h
kube-public       Active   5d1h
kube-system       Active   5d1h
[root@k8s-master1 ~]# kubectl get pods -n cattle-system
NAME                                    READY   STATUS    RESTARTS   AGE
cattle-cluster-agent-6bdf9bfddd-77vtd   1/1     Running   0          6m5s
[root@k8s-master1 ~]# kubectl get pods -n fleet-system
NAME                           READY   STATUS    RESTARTS   AGE
fleet-agent-55bfc495bd-8xgsd   1/1     Running   0          3m55s
```
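Instead of polling `kubectl get pods`, you can also wait for the agent Deployment to finish rolling out (a small convenience step that is not part of the original instructions):

```bash
# Block until the Rancher cluster agent Deployment is fully rolled out
kubectl -n cattle-system rollout status deployment/cattle-cluster-agent
```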
Fixing the unhealthy component status:
```bash
# The cause of the problem
[root@k8s-master1 ~]# kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME                 STATUS      MESSAGE                                                                                       ERROR
controller-manager   Unhealthy   Get "http://127.0.0.1:10252/healthz": dial tcp 127.0.0.1:10252: connect: connection refused
scheduler            Unhealthy   Get "http://127.0.0.1:10251/healthz": dial tcp 127.0.0.1:10251: connect: connection refused
etcd-0               Healthy     {"health":"true"}

# Edit the kube-scheduler manifest
[root@k8s-master1 prometheus]# vim /etc/kubernetes/manifests/kube-scheduler.yaml
# Make the following changes:
# 1) Change --bind-address=127.0.0.1 to --bind-address=192.168.40.180   (192.168.40.180 is the control-plane node k8s-master1)
# 2) Under the httpGet: fields, change host from 127.0.0.1 to 192.168.40.180 (two places)
# 3) Delete --port=0

# Restart kubelet on each node
[root@k8s-node1 ~]# systemctl restart kubelet
[root@k8s-node2 ~]# systemctl restart kubelet

# The corresponding port is now listened on by the host
[root@k8s-master1 prometheus]# ss -antulp | grep :10251
tcp   LISTEN   0   128   :::10251   :::*   users:(("kube-scheduler",pid=36945,fd=7))

# Edit the kube-controller-manager manifest
[root@k8s-master1 prometheus]# vim /etc/kubernetes/manifests/kube-controller-manager.yaml
# Make the following changes:
# 1) Change --bind-address=127.0.0.1 to --bind-address=192.168.40.180   (192.168.40.180 is the control-plane node k8s-master1)
# 2) Under the httpGet: fields, change host from 127.0.0.1 to 192.168.40.180 (two places)
# 3) Delete --port=0

# Restart kubelet on each node
[root@k8s-node1 ~]# systemctl restart kubelet
[root@k8s-node2 ~]# systemctl restart kubelet

# Check the status again
[root@k8s-master1 prometheus]# kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME                 STATUS    MESSAGE             ERROR
scheduler            Healthy   ok
controller-manager   Healthy   ok
etcd-0               Healthy   {"health":"true"}

[root@k8s-master1 prometheus]# ss -antulp | grep :10252
tcp   LISTEN   0   128   :::10252   :::*   users:(("kube-controller",pid=41653,fd=7))
```
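The same manifest edits can also be scripted. This is only a sketch that assumes the default kubeadm v1.20 manifest layout; back up each file outside the manifests directory and review the result, since the kubelet restarts the static pods as soon as the files change:

```bash
# Apply the three edits described above to both static pod manifests
for f in kube-scheduler kube-controller-manager; do
  m=/etc/kubernetes/manifests/$f.yaml
  cp "$m" /root/$f.yaml.bak   # keep the backup outside /etc/kubernetes/manifests
  sed -i 's/--bind-address=127.0.0.1/--bind-address=192.168.40.180/' "$m"
  sed -i 's/host: 127.0.0.1/host: 192.168.40.180/' "$m"   # both httpGet probes
  sed -i '/--port=0/d' "$m"
done
```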
1) Enable Rancher cluster-level monitoring. Enabling monitoring can take a while; expect to wait 10-20 minutes.
Select monitoring component version 0.2.1, keep the other settings at their defaults, and click Enable Monitoring (a quick kubectl check is sketched after this list).
2) Cluster monitoring
3) Kubernetes component monitoring
4) Rancher log collection
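To confirm from the command line that the monitoring stack actually came up, you can watch its pods. This is only a sketch: Rancher's legacy (v1) monitoring normally deploys into the cattle-prometheus namespace, but verify the namespace in your own cluster.

```bash
# Watch the monitoring workloads start; the namespace is the usual one for
# Rancher v1 monitoring and may differ in your environment
kubectl get pods -n cattle-prometheus -w
kubectl get svc -n cattle-prometheus
```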
1) Create a namespace
2) Create a Deployment
3) Create a Service
4) Click the node port 30180/TCP to reach the Tomcat instance running inside the cluster (a roughly equivalent kubectl sketch follows below).
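For reference, the same namespace, Deployment, and NodePort Service can be created from the command line. This is only a sketch of roughly equivalent objects: the namespace name `tomcat-test`, the `tomcat:8.5-jre8` image, and the labels are assumptions, not values taken from the Rancher UI steps above.

```bash
# Hypothetical kubectl equivalent of the Rancher UI steps
kubectl create namespace tomcat-test

kubectl apply -f - <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tomcat-deploy
  namespace: tomcat-test
spec:
  replicas: 1
  selector:
    matchLabels:
      app: tomcat
  template:
    metadata:
      labels:
        app: tomcat
    spec:
      containers:
      - name: tomcat
        image: tomcat:8.5-jre8      # assumed image; use whatever was chosen in the UI
        ports:
        - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: tomcat-svc
  namespace: tomcat-test
spec:
  type: NodePort
  selector:
    app: tomcat
  ports:
  - port: 8080
    targetPort: 8080
    nodePort: 30180                 # matches the node port shown in the Rancher UI
EOF

# Then browse to http://<any-node-ip>:30180/
```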