Role | IP | Components |
---|---|---|
k8s-master1 | 192.168.80.45 | etcd, api-server, controller-manager, scheduler, docker |
k8s-node01 | 192.168.80.46 | etcd, kubelet, kube-proxy, docker |
k8s-node02 | 192.168.80.47 | etcd, kubelet, kube-proxy, docker |
Software versions:
Software | Version | Notes |
---|---|---|
OS | Ubuntu 16.04.6 LTS | |
Kubernetes | 1.19.11 | |
Etcd | v3.4.15 | |
Docker | 19.03.9 | |
    # 1. Set the hostname (run the matching command on each host)
    hostnamectl set-hostname k8s-master1
    hostnamectl set-hostname k8s-node01
    hostnamectl set-hostname k8s-node02

    # 2. Hostname resolution
    cat >> /etc/hosts <<EOF
    192.168.80.45 k8s-master1
    192.168.80.46 k8s-node01
    192.168.80.47 k8s-node02
    EOF

    # 3. Disable swap
    swapoff -a && sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab

    # 4. Pass bridged IPv4 traffic to the iptables chains
    cat > /etc/sysctl.d/k8s.conf << EOF
    net.bridge.bridge-nf-call-ip6tables = 1
    net.bridge.bridge-nf-call-iptables = 1
    EOF
    sysctl --system

    # 5. DNS resolution
    echo "nameserver 8.8.8.8" >> /etc/resolv.conf

    # 6. Time synchronization
    apt install ntpdate -y
    ntpdate ntp1.aliyun.com
    # Run crontab -e and add the entry below
    crontab -e
    */30 * * * * /usr/sbin/ntpdate -u ntp1.aliyun.com >> /var/log/ntpdate.log 2>&1

    # 7. Log directory
    mkdir -p /var/log/kubernetes
    mkdir -p $HOME/k8s-install && cd $HOME/k8s-install

    # 1. Download and install
    wget https://download.docker.com/linux/static/stable/x86_64/docker-19.03.9.tgz
    tar zxvf docker-19.03.9.tgz
    mv docker/* /usr/bin
    docker version

    # 2. Configure start on boot
    # \$MAINPID is escaped so the shell does not expand it inside the heredoc
    cat > /lib/systemd/system/docker.service << EOF
    [Unit]
    Description=Docker Application Container Engine
    Documentation=https://docs.docker.com
    After=network-online.target firewalld.service
    Wants=network-online.target

    [Service]
    Type=notify
    ExecStart=/usr/bin/dockerd
    ExecReload=/bin/kill -s HUP \$MAINPID
    LimitNOFILE=infinity
    LimitNPROC=infinity
    LimitCORE=infinity
    TimeoutStartSec=0
    Delegate=yes
    KillMode=process
    Restart=on-failure
    StartLimitBurst=3
    StartLimitInterval=60s

    [Install]
    WantedBy=multi-user.target
    EOF

    # 3. Start
    systemctl daemon-reload
    systemctl start docker
    systemctl status docker
    systemctl enable docker
    mkdir -p $HOME/k8s-install && cd $HOME/k8s-install
    wget https://pkg.cfssl.org/R1.2/cfssl_linux-amd64 -O /usr/local/bin/cfssl
    wget https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64 -O /usr/local/bin/cfssljson
    wget https://pkg.cfssl.org/R1.2/cfssl-certinfo_linux-amd64 -O /usr/bin/cfssl-certinfo
    chmod +x /usr/bin/cfssl* /usr/local/bin/cfssl*
The generated CA certificates and key files are as follows:
Component | Certificates | Keys | Notes |
---|---|---|---|
etcd | ca.pem, etcd.pem | etcd-key.pem | |
apiserver | ca.pem, apiserver.pem | apiserver-key.pem | |
controller-manager | ca.pem, kube-controller-manager.pem | ca-key.pem, kube-controller-manager-key.pem | kubeconfig |
scheduler | ca.pem, kube-scheduler.pem | kube-scheduler-key.pem | kubeconfig |
kubelet | ca.pem | | kubeconfig + token |
kube-proxy | ca.pem, kube-proxy.pem | kube-proxy-key.pem | kubeconfig |
kubectl | ca.pem, admin.pem | admin-key.pem | |
CA: Certificate Authority
    mkdir -p /root/ssl && cd /root/ssl

    # 1. CA configuration file
    cat > ca-config.json <<EOF
    {
      "signing": {
        "default": {
          "expiry": "87600h"
        },
        "profiles": {
          "kubernetes": {
            "usages": [
              "signing",
              "key encipherment",
              "server auth",
              "client auth"
            ],
            "expiry": "87600h"
          }
        }
      }
    }
    EOF

    # 2. CA certificate signing request (CSR) file
    cat > ca-csr.json <<EOF
    {
      "CN": "kubernetes",
      "key": { "algo": "rsa", "size": 2048 },
      "names": [
        { "C": "CN", "ST": "BeiJing", "L": "BeiJing", "O": "k8s", "OU": "System" }
      ],
      "ca": {
        "expiry": "87600h"
      }
    }
    EOF

    # 3. Generate the CA certificate and key
    cfssl gencert -initca ca-csr.json | cfssljson -bare ca
    ls ca*
    # ca-config.json  ca.csr  ca-csr.json  ca-key.pem  ca.pem
Note: the IP addresses in hosts are the IPs of the hosts in the etcd cluster.
    # 1. Certificate signing request (CSR) file
    cat > etcd-csr.json <<EOF
    {
      "CN": "etcd",
      "hosts": [
        "127.0.0.1",
        "localhost",
        "192.168.80.45",
        "192.168.80.46",
        "192.168.80.47"
      ],
      "key": { "algo": "rsa", "size": 2048 },
      "names": [
        { "C": "CN", "ST": "BeiJing", "L": "BeiJing", "O": "etcd", "OU": "System" }
      ]
    }
    EOF

    # 2. Generate the certificate
    cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes etcd-csr.json | cfssljson -bare etcd
Note: the IP addresses in hosts are the host IPs of the Kubernetes master cluster plus the service IP of the kubernetes Service (usually the first IP of the range given by kube-apiserver's service-cluster-ip-range, e.g. 10.254.0.1).
    # 1. Certificate signing request (CSR) file
    cat > apiserver-csr.json <<EOF
    {
      "CN": "kubernetes",
      "hosts": [
        "127.0.0.1",
        "localhost",
        "192.168.80.1",
        "192.168.80.2",
        "192.168.80.45",
        "192.168.80.46",
        "192.168.80.47",
        "10.254.0.1",
        "kubernetes",
        "kubernetes.default",
        "kubernetes.default.svc",
        "kubernetes.default.svc.cluster",
        "kubernetes.default.svc.cluster.local"
      ],
      "key": { "algo": "rsa", "size": 2048 },
      "names": [
        { "C": "CN", "ST": "BeiJing", "L": "BeiJing", "O": "k8s", "OU": "System" }
      ]
    }
    EOF

    # 2. Generate the certificate
    cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes apiserver-csr.json | cfssljson -bare apiserver
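The service IP mentioned in the note can be derived with a little shell arithmetic. This is a minimal sketch only: the function name `first_service_ip` is made up for illustration, and it assumes the network address ends in an octet below 255, as 10.254.0.0/16 does.

```shell
# Hypothetical helper: print the first usable IP of a service CIDR.
# Assumption: the last octet of the network address is below 255.
first_service_ip() {
  local network="${1%/*}"      # strip the /prefix part
  local head="${network%.*}"   # first three octets
  local last="${network##*.}"  # last octet
  echo "${head}.$(( last + 1 ))"
}

first_service_ip "10.254.0.0/16"
```

The printed address is the one that should appear in the hosts list of apiserver-csr.json.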
    # 1. Certificate signing request (CSR) file
    cat > kube-controller-manager-csr.json <<EOF
    {
      "CN": "system:kube-controller-manager",
      "hosts": [],
      "key": { "algo": "rsa", "size": 2048 },
      "names": [
        { "C": "CN", "ST": "BeiJing", "L": "BeiJing", "O": "system:masters", "OU": "System" }
      ]
    }
    EOF

    # 2. Generate the certificate
    cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes kube-controller-manager-csr.json | cfssljson -bare kube-controller-manager
    # 1. Certificate signing request (CSR) file
    cat > kube-scheduler-csr.json << EOF
    {
      "CN": "system:kube-scheduler",
      "hosts": [],
      "key": { "algo": "rsa", "size": 2048 },
      "names": [
        { "C": "CN", "ST": "BeiJing", "L": "BeiJing", "O": "system:masters", "OU": "System" }
      ]
    }
    EOF

    # 2. Generate the certificate
    cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes kube-scheduler-csr.json | cfssljson -bare kube-scheduler
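kube-apiserver later uses RBAC to authorize requests from clients (such as kubelet, kube-proxy, and Pods);
kube-apiserver predefines some bindings used by RBAC. For example, the ClusterRoleBinding cluster-admin binds the Group system:masters to the ClusterRole cluster-admin, which grants permission to call every kube-apiserver API;
O sets the certificate's Group to system:masters. When kubectl uses this certificate to access kube-apiserver, authentication succeeds because the certificate is signed by the CA, and because the certificate's group is the pre-authorized system:masters, it is granted access to all APIs;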
    # 1. Certificate signing request (CSR) file
    cat > admin-csr.json <<EOF
    {
      "CN": "admin",
      "hosts": [],
      "key": { "algo": "rsa", "size": 2048 },
      "names": [
        { "C": "CN", "ST": "BeiJing", "L": "BeiJing", "O": "system:masters", "OU": "System" }
      ]
    }
    EOF

    # 2. Generate the certificate
    cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes admin-csr.json | cfssljson -bare admin
    ls admin*
    # admin.csr  admin-csr.json  admin-key.pem  admin.pem
Once the Kubernetes cluster is up, running kubectl get clusterrolebinding cluster-admin -o yaml shows that the subject of the ClusterRoleBinding cluster-admin has kind Group and name system:masters, and that its roleRef is the ClusterRole cluster-admin. In other words, every user or serviceAccount in the system:masters Group holds the cluster-admin role, which is why kubectl has administrative rights over the whole cluster.
    kubectl get clusterrolebinding cluster-admin -o yaml
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      annotations:
        rbac.authorization.kubernetes.io/autoupdate: "true"
      creationTimestamp: 2017-04-11T11:20:42Z
      labels:
        kubernetes.io/bootstrapping: rbac-defaults
      name: cluster-admin
      resourceVersion: "52"
      selfLink: /apis/rbac.authorization.k8s.io/v1/clusterrolebindings/cluster-admin
      uid: e61b97b2-1ea8-11e7-8cd7-f4e9d49f8ed0
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: cluster-admin
    subjects:
    - apiGroup: rbac.authorization.k8s.io
      kind: Group
      name: system:masters
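Since Kubernetes takes the user from a certificate's CN and the group from its O, it can help to read those two fields back before relying on a certificate. The following is a minimal sketch: `subject_field` is a made-up helper name, the sample subject is typed in by hand to mirror admin-csr.json rather than read from a real file, and values containing escaped commas are not handled.

```shell
# Hypothetical helper: extract one field (CN, O, ...) from an RFC 2253
# subject string, such as the one printed by:
#   openssl x509 -in admin.pem -noout -subject -nameopt RFC2253
subject_field() {
  local field="$1" subject="$2"
  printf '%s\n' "$subject" \
    | tr ',' '\n' \
    | sed -n "s/^\(subject=\)\{0,1\} *${field}=//p" \
    | head -n 1
}

# Sample subject mirroring admin-csr.json (an assumption, not real output).
sample="subject=CN=admin,OU=System,O=system:masters,L=BeiJing,ST=BeiJing,C=CN"
echo "user:  $(subject_field CN "$sample")"
echo "group: $(subject_field O "$sample")"
```

On a real host, the second argument would be the output of the openssl command shown in the comment.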
CN sets the certificate's User to system:kube-proxy;
the ClusterRoleBinding system:node-proxier predefined by kube-apiserver binds the User system:kube-proxy to the ClusterRole system:node-proxier, which grants permission to call the proxy-related kube-apiserver APIs;
    # 1. Certificate signing request (CSR) file
    cat > kube-proxy-csr.json <<EOF
    {
      "CN": "system:kube-proxy",
      "hosts": [],
      "key": { "algo": "rsa", "size": 2048 },
      "names": [
        { "C": "CN", "ST": "BeiJing", "L": "BeiJing", "O": "k8s", "OU": "System" }
      ]
    }
    EOF

    # 2. Generate the certificate
    cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes kube-proxy-csr.json | cfssljson -bare kube-proxy
cfssl-certinfo -cert apiserver.pem { "subject": { "common_name": "kubernetes", "country": "CN", "organization": "k8s", "organizational_unit": "System", "locality": "BeiJing", "province": "BeiJing", "names": [ "CN", "BeiJing", "BeiJing", "k8s", "System", "kubernetes" ] }, "issuer": { "common_name": "kubernetes", "country": "CN", "organization": "k8s", "organizational_unit": "System", "locality": "BeiJing", "province": "BeiJing", "names": [ "CN", "BeiJing", "BeiJing", "k8s", "System", "kubernetes" ] }, "serial_number": "275867496157961939649344217740970264800633176866", "sans": [ "localhost", "kubernetes", "kubernetes.default", "kubernetes.default.svc", "kubernetes.default.svc.cluster", "kubernetes.default.svc.cluster.local", "127.0.0.1", "192.168.80.1", "192.168.80.2", "192.168.80.45", "192.168.80.46", "192.168.80.47", "10.254.0.1" ], "not_before": "2021-06-09T05:20:00Z", "not_after": "2031-06-07T05:20:00Z", "sigalg": "SHA256WithRSA", "authority_key_id": "", "subject_key_id": "E3:84:0F:9C:00:07:4A:8F:5C:B2:35:45:A0:50:4D:3E:9D:C0:B4:D0", "pem": "-----BEGIN 
CERTIFICATE-----\nMIIEezCCA2OgAwIBAgIUMFJTjEXe9sDDDpPXcAiUBt5+QyIwDQYJKoZIhvcNAQEL\nBQAwZTELMAkGA1UEBhMCQ04xEDAOBgNVBAgTB0JlaUppbmcxEDAOBgNVBAcTB0Jl\naUppbmcxDDAKBgNVBAoTA2s4czEPMA0GA1UECxMGU3lzdGVtMRMwEQYDVQQDEwpr\ndWJlcm5ldGVzMB4XDTIxMDYwOTA1MjAwMFoXDTMxMDYwNzA1MjAwMFowZTELMAkG\nA1UEBhMCQ04xEDAOBgNVBAgTB0JlaUppbmcxEDAOBgNVBAcTB0JlaUppbmcxDDAK\nBgNVBAoTA2s4czEPMA0GA1UECxMGU3lzdGVtMRMwEQYDVQQDEwprdWJlcm5ldGVz\nMIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAw0BpjZQNEd6Oqu8ubEWG\nhbdwJecOTCfdbY+VLIKEm0Tys8ZBlu7OrtZ8Rj5OAZTXil0ZJz+hvHo8YTNJJ16g\njHV88VSpfoXD5DE59PITSFwfY1lWHVctC3Ddo9CM9cU9Ty+Kf29XcrLbc/VNGZTB\ncvKXoM3b6NkBKOdKphVjUvafhKC6ls2ac5uub3uqZTpPgBs/1PvINKNZkP5U6lUV\noTBMAT+qbQ9aggA+bA+WegL3jHU78ngo1XMnsb1HfAjwKDOf66smNJ/K+YjD+Cul\ngjpyqOQKGlz5xqXUcBgIMO9djI4f5hvaMsSje1aSJ/oh5AfQbxQsGjajlS80ED08\nxwIDAQABo4IBITCCAR0wDgYDVR0PAQH/BAQDAgWgMB0GA1UdJQQWMBQGCCsGAQUF\nBwMBBggrBgEFBQcDAjAMBgNVHRMBAf8EAjAAMB0GA1UdDgQWBBTjhA+cAAdKj1yy\nNUWgUE0+ncC00DCBvgYDVR0RBIG2MIGzgglsb2NhbGhvc3SCCmt1YmVybmV0ZXOC\nEmt1YmVybmV0ZXMuZGVmYXVsdIIWa3ViZXJuZXRlcy5kZWZhdWx0LnN2Y4Iea3Vi\nZXJuZXRlcy5kZWZhdWx0LnN2Yy5jbHVzdGVygiRrdWJlcm5ldGVzLmRlZmF1bHQu\nc3ZjLmNsdXN0ZXIubG9jYWyHBH8AAAGHBMCoUAGHBMCoUAKHBMCoUC2HBMCoUC6H\nBMCoUC+HBAr+AAEwDQYJKoZIhvcNAQELBQADggEBAG+RUKp4cxz4EOqmAPiczkl2\nHciAg01RbCavoLoUWmoDDAQf7PIhQF2pLewFCwR5w6SwvCJAVdg+eHdefJ2MBtJr\nKQgbmEOBXd4Z5ZqBeSP6ViHvb1pKtRSldznZLfxjsVd0bN3na/JmS4TZ90SqLLtL\nN4CgGfTs2AfrtbtWIqewDMS9aWjBK8VePzLBmsdLddD4WYQOnl+QjdrX9bbqYRCG\nQo3CKvJ3JZqh6AJHcgKsm0702uMU/TCJwe1M8I8SpYrwA74uCBy3O9jXed1rZlrp\nRVURB6Ro7SMLjiadTJyf6AbLPMmZcPKHhZ1XG07q8Od2Kd+KVx1PxF3et6OOteE=\n-----END CERTIFICATE-----\n" }
All nodes
    mkdir -p /etc/kubernetes/pki
    cp *.pem /etc/kubernetes/pki
    tar cvf pki.tar /etc/kubernetes/pki
    scp pki.tar root@192.168.80.46:/root
    scp pki.tar root@192.168.80.47:/root

    # On the hosts that received pki.tar
    sudo -i
    cd / && mv /root/pki.tar / && tar xvf pki.tar && rm -f pki.tar
    mkdir -p $HOME/k8s-install && cd $HOME/k8s-install

    # 1. Download and install
    wget https://github.com/etcd-io/etcd/releases/download/v3.4.15/etcd-v3.4.15-linux-amd64.tar.gz
    tar zxvf etcd-v3.4.15-linux-amd64.tar.gz
    mv etcd-v3.4.15-linux-amd64/{etcd,etcdctl} /usr/bin/

    # 2. Configuration file
    mkdir -p /etc/etcd
    cat > /etc/etcd/etcd.conf << EOF
    #[Member]
    ETCD_NAME="etcd-1"
    ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
    ETCD_LISTEN_PEER_URLS="https://192.168.80.45:2380"
    ETCD_LISTEN_CLIENT_URLS="https://192.168.80.45:2379,https://127.0.0.1:2379"
    #[Clustering]
    ETCD_INITIAL_ADVERTISE_PEER_URLS="https://192.168.80.45:2380"
    ETCD_ADVERTISE_CLIENT_URLS="https://192.168.80.45:2379"
    ETCD_INITIAL_CLUSTER="etcd-1=https://192.168.80.45:2380,etcd-2=https://192.168.80.46:2380,etcd-3=https://192.168.80.47:2380"
    ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
    ETCD_INITIAL_CLUSTER_STATE="new"
    EOF

    # 3. Start on boot
    cat > /lib/systemd/system/etcd.service << EOF
    [Unit]
    Description=Etcd Server
    After=network.target
    After=network-online.target
    Wants=network-online.target

    [Service]
    Type=notify
    EnvironmentFile=-/etc/etcd/etcd.conf
    ExecStart=/usr/bin/etcd \
    --cert-file=/etc/kubernetes/pki/etcd.pem \
    --key-file=/etc/kubernetes/pki/etcd-key.pem \
    --peer-cert-file=/etc/kubernetes/pki/etcd.pem \
    --peer-key-file=/etc/kubernetes/pki/etcd-key.pem \
    --trusted-ca-file=/etc/kubernetes/pki/ca.pem \
    --peer-trusted-ca-file=/etc/kubernetes/pki/ca.pem \
    --logger=zap
    Restart=on-failure
    LimitNOFILE=65536

    [Install]
    WantedBy=multi-user.target
    EOF

    # 4. Prepare the files to clone to the other nodes
    tar cvf etcd-clone.tar /usr/bin/etcd* /etc/etcd /lib/systemd/system/etcd.service
    scp etcd-clone.tar root@192.168.80.46:/root
    scp etcd-clone.tar root@192.168.80.47:/root
    # 1. Extract the cloned files
    sudo -i
    cd / && mv /root/etcd-clone.tar / && tar xvf etcd-clone.tar && rm -f etcd-clone.tar

    # 2. Edit the configuration file
    vim /etc/etcd/etcd.conf
    #[Member]
    ETCD_NAME="etcd-2"    # change to the local value
    ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
    ETCD_LISTEN_PEER_URLS="https://192.168.80.46:2380"    # change to the local value
    ETCD_LISTEN_CLIENT_URLS="https://192.168.80.46:2379,https://127.0.0.1:2379"    # change to the local value
    #[Clustering]
    ETCD_INITIAL_ADVERTISE_PEER_URLS="https://192.168.80.46:2380"    # change to the local value
    ETCD_ADVERTISE_CLIENT_URLS="https://192.168.80.46:2379"    # change to the local value
    ETCD_INITIAL_CLUSTER="etcd-1=https://192.168.80.45:2380,etcd-2=https://192.168.80.46:2380,etcd-3=https://192.168.80.47:2380"
    ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
    ETCD_INITIAL_CLUSTER_STATE="new"
    # 1. Start, and enable on boot
    systemctl daemon-reload
    systemctl start etcd
    systemctl status etcd
    systemctl enable etcd

    # 2. Member status
    etcdctl member list --cacert=/etc/kubernetes/pki/ca.pem --cert=/etc/kubernetes/pki/etcd.pem --key=/etc/kubernetes/pki/etcd-key.pem --write-out=table
    +------------------+---------+--------+----------------------------+----------------------------+------------+
    |        ID        | STATUS  |  NAME  |         PEER ADDRS         |        CLIENT ADDRS        | IS LEARNER |
    +------------------+---------+--------+----------------------------+----------------------------+------------+
    | 46bc5ad35e418584 | started | etcd-1 | https://192.168.80.45:2380 | https://192.168.80.45:2379 |      false |
    | 8f347c1327049bc8 | started | etcd-3 | https://192.168.80.47:2380 | https://192.168.80.47:2379 |      false |
    | b01e7a29099f3eb8 | started | etcd-2 | https://192.168.80.46:2380 | https://192.168.80.46:2379 |      false |
    +------------------+---------+--------+----------------------------+----------------------------+------------+

    # 3. Health status
    etcdctl endpoint health --cacert=/etc/kubernetes/pki/ca.pem --cert=/etc/kubernetes/pki/etcd.pem --key=/etc/kubernetes/pki/etcd-key.pem --cluster --write-out=table
    +----------------------------+--------+-------------+-------+
    |          ENDPOINT          | HEALTH |    TOOK     | ERROR |
    +----------------------------+--------+-------------+-------+
    | https://192.168.80.47:2379 |   true | 20.973639ms |       |
    | https://192.168.80.46:2379 |   true | 29.842299ms |       |
    | https://192.168.80.45:2379 |   true | 30.564766ms |       |
    +----------------------------+--------+-------------+-------+

    # 4. Check the leader
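One possible way to check the leader is `etcdctl endpoint status`, whose table output includes an IS LEADER column. The sketch below only assembles and prints the command so that it can be read without a running cluster; `etcdctl_tls_flags` is a made-up helper name, and the certificate paths are the ones used earlier in this guide.

```shell
# Hypothetical helper: print the TLS flags shared by the etcdctl calls above.
etcdctl_tls_flags() {
  local pki="${1:-/etc/kubernetes/pki}"
  echo "--cacert=${pki}/ca.pem --cert=${pki}/etcd.pem --key=${pki}/etcd-key.pem"
}

# Print the command rather than running it, so the sketch works anywhere.
echo "etcdctl endpoint status --cluster --write-out=table $(etcdctl_tls_flags)"
```

Running the printed command on any etcd member should list all three endpoints, with IS LEADER set to true for exactly one of them.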
Kubernetes master node components:
kube-apiserver
kube-scheduler
kube-controller-manager
kubelet (not strictly required, but needed in practice)
kube-proxy (not strictly required, but needed in practice)
https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.20.md
    mkdir -p $HOME/k8s-install && cd $HOME/k8s-install
    wget https://dl.k8s.io/v1.19.11/kubernetes-server-linux-amd64.tar.gz
    tar zxvf kubernetes-server-linux-amd64.tar.gz
    cd kubernetes/server/bin
    cp kube-apiserver kube-scheduler kube-controller-manager kubectl kubelet kube-proxy /usr/bin
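Enable the TLS bootstrapping mechanism:
TLS bootstrapping: once the master apiserver enables TLS authentication, the kubelet and kube-proxy on each node must use a valid CA-signed certificate to communicate with kube-apiserver. With many nodes, issuing these client certificates by hand is a lot of work and makes scaling the cluster more complex. To simplify the process, Kubernetes introduced the TLS bootstrapping mechanism to issue client certificates automatically: the kubelet requests a certificate from the apiserver as a low-privilege user, and the kubelet's certificate is signed dynamically through the apiserver. This approach is strongly recommended on nodes. It is currently used mainly for the kubelet; kube-proxy still uses a single certificate that we issue centrally.
TLS bootstrapping workflow: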
    BOOTSTRAP_TOKEN=$(head -c 16 /dev/urandom | od -An -t x | tr -d ' ')

    # Format: token,user name,UID,user group
    cat > /etc/kubernetes/token.csv <<EOF
    ${BOOTSTRAP_TOKEN},kubelet-bootstrap,10001,"system:node-bootstrapper"
    EOF
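Before writing token.csv, it may be worth confirming that the generated token really is 32 hexadecimal characters. The sketch below repeats the generation step and adds two illustrative helpers; the names `is_valid_token` and `token_csv_line` are made up, and nothing is written to /etc/kubernetes.

```shell
# Generate a 32-character hex token the same way as above.
gen_bootstrap_token() {
  head -c 16 /dev/urandom | od -An -t x | tr -d ' \n'
}

# Hypothetical helper: succeed only for a 32-character lowercase hex string.
is_valid_token() {
  case "$1" in
    ''|*[!0-9a-f]*) return 1 ;;
  esac
  [ "${#1}" -eq 32 ]
}

# Hypothetical helper: format one token.csv line as token,user,uid,"group".
token_csv_line() {
  printf '%s,%s,%s,"%s"\n' "$1" "$2" "$3" "$4"
}

token="$(gen_bootstrap_token)"
if is_valid_token "$token"; then
  token_csv_line "$token" kubelet-bootstrap 10001 system:node-bootstrapper
fi
```

The printed line has the same shape as the entry written to /etc/kubernetes/token.csv.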
--service-cluster-ip-range=10.254.0.0/16: the Service IP range
    cat > /etc/kubernetes/kube-apiserver.conf << EOF
    KUBE_APISERVER_OPTS="--logtostderr=false \\
    --v=2 \\
    --log-dir=/var/log/kubernetes \\
    --etcd-servers=https://192.168.80.45:2379,https://192.168.80.46:2379,https://192.168.80.47:2379 \\
    --bind-address=192.168.80.45 \\
    --secure-port=6443 \\
    --advertise-address=192.168.80.45 \\
    --allow-privileged=true \\
    --service-cluster-ip-range=10.254.0.0/16 \\
    --enable-admission-plugins=NamespaceLifecycle,LimitRanger,ServiceAccount,ResourceQuota,NodeRestriction \\
    --authorization-mode=RBAC,Node \\
    --enable-bootstrap-token-auth=true \\
    --token-auth-file=/etc/kubernetes/token.csv \\
    --service-node-port-range=30000-32767 \\
    --kubelet-client-certificate=/etc/kubernetes/pki/apiserver.pem \\
    --kubelet-client-key=/etc/kubernetes/pki/apiserver-key.pem \\
    --tls-cert-file=/etc/kubernetes/pki/apiserver.pem \\
    --tls-private-key-file=/etc/kubernetes/pki/apiserver-key.pem \\
    --client-ca-file=/etc/kubernetes/pki/ca.pem \\
    --service-account-key-file=/etc/kubernetes/pki/ca-key.pem \\
    --service-account-issuer=api \\
    --service-account-signing-key-file=/etc/kubernetes/pki/apiserver-key.pem \\
    --etcd-cafile=/etc/kubernetes/pki/ca.pem \\
    --etcd-certfile=/etc/kubernetes/pki/etcd.pem \\
    --etcd-keyfile=/etc/kubernetes/pki/etcd-key.pem \\
    --requestheader-client-ca-file=/etc/kubernetes/pki/ca.pem \\
    --proxy-client-cert-file=/etc/kubernetes/pki/apiserver.pem \\
    --proxy-client-key-file=/etc/kubernetes/pki/apiserver-key.pem \\
    --requestheader-allowed-names=kubernetes \\
    --requestheader-extra-headers-prefix=X-Remote-Extra- \\
    --requestheader-group-headers=X-Remote-Group \\
    --requestheader-username-headers=X-Remote-User \\
    --enable-aggregator-routing=true \\
    --audit-log-maxage=30 \\
    --audit-log-maxbackup=3 \\
    --audit-log-maxsize=100 \\
    --audit-log-path=/var/log/kubernetes/k8s-audit.log"
    EOF
    # 1. systemd unit
    cat > /lib/systemd/system/kube-apiserver.service << EOF
    [Unit]
    Description=Kubernetes API Server
    Documentation=https://github.com/kubernetes/kubernetes

    [Service]
    EnvironmentFile=/etc/kubernetes/kube-apiserver.conf
    ExecStart=/usr/bin/kube-apiserver \$KUBE_APISERVER_OPTS
    Restart=on-failure

    [Install]
    WantedBy=multi-user.target
    EOF

    # 2. Start
    systemctl daemon-reload
    systemctl start kube-apiserver
    systemctl status kube-apiserver
    systemctl enable kube-apiserver
    KUBE_CONFIG="/etc/kubernetes/kube-controller-manager.kubeconfig"
    KUBE_APISERVER="https://192.168.80.45:6443"

    kubectl config set-cluster kubernetes \
      --certificate-authority=/etc/kubernetes/pki/ca.pem \
      --embed-certs=true \
      --server=${KUBE_APISERVER} \
      --kubeconfig=${KUBE_CONFIG}
    kubectl config set-credentials kube-controller-manager \
      --client-certificate=/etc/kubernetes/pki/kube-controller-manager.pem \
      --client-key=/etc/kubernetes/pki/kube-controller-manager-key.pem \
      --embed-certs=true \
      --kubeconfig=${KUBE_CONFIG}
    kubectl config set-context default \
      --cluster=kubernetes \
      --user=kube-controller-manager \
      --kubeconfig=${KUBE_CONFIG}
    kubectl config use-context default --kubeconfig=${KUBE_CONFIG}
--cluster-cidr=10.244.0.0/16: the Pod IP range
--service-cluster-ip-range=10.254.0.0/16: the Service IP range
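The Pod range and the Service range are expected to be distinct, so a quick overlap check before starting the components can catch a copy-paste mistake. This is a sketch under stated assumptions: the helper names `ip_to_int`, `cidr_range`, and `cidrs_overlap` are made up, only IPv4 is handled, and the shell is assumed to have 64-bit arithmetic.

```shell
# Hypothetical helper: convert a dotted IPv4 address to an integer.
ip_to_int() {
  local IFS=.
  set -- $1   # split the address on dots into $1..$4
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Hypothetical helper: print "first last" (as integers) for an IPv4 CIDR.
cidr_range() {
  local base size first
  base=$(ip_to_int "${1%/*}")
  size=$(( 1 << (32 - ${1#*/}) ))
  first=$(( base / size * size ))   # round down to the network address
  echo "$first $(( first + size - 1 ))"
}

# Hypothetical helper: succeed when the two CIDRs share at least one address.
cidrs_overlap() {
  set -- $(cidr_range "$1") $(cidr_range "$2")
  [ "$1" -le "$4" ] && [ "$3" -le "$2" ]
}

if cidrs_overlap "10.244.0.0/16" "10.254.0.0/16"; then
  echo "Pod and Service ranges overlap"
else
  echo "Pod and Service ranges are disjoint"
fi
```

With the values used in this guide, the ranges do not overlap.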
    cat > /etc/kubernetes/kube-controller-manager.conf << EOF
    KUBE_CONTROLLER_MANAGER_OPTS="--logtostderr=false \\
    --v=2 \\
    --log-dir=/var/log/kubernetes \\
    --leader-elect=true \\
    --kubeconfig=/etc/kubernetes/kube-controller-manager.kubeconfig \\
    --bind-address=127.0.0.1 \\
    --allocate-node-cidrs=true \\
    --cluster-cidr=10.244.0.0/16 \\
    --service-cluster-ip-range=10.254.0.0/16 \\
    --cluster-signing-cert-file=/etc/kubernetes/pki/ca.pem \\
    --cluster-signing-key-file=/etc/kubernetes/pki/ca-key.pem \\
    --root-ca-file=/etc/kubernetes/pki/ca.pem \\
    --service-account-private-key-file=/etc/kubernetes/pki/ca-key.pem \\
    --cluster-signing-duration=87600h0m0s"
    EOF
    cat > /lib/systemd/system/kube-controller-manager.service << EOF
    [Unit]
    Description=Kubernetes Controller Manager
    Documentation=https://github.com/kubernetes/kubernetes

    [Service]
    EnvironmentFile=-/etc/kubernetes/kube-controller-manager.conf
    ExecStart=/usr/bin/kube-controller-manager \$KUBE_CONTROLLER_MANAGER_OPTS
    Restart=on-failure

    [Install]
    WantedBy=multi-user.target
    EOF

    systemctl daemon-reload
    systemctl start kube-controller-manager
    systemctl status kube-controller-manager
    systemctl enable kube-controller-manager
    KUBE_CONFIG="/etc/kubernetes/kube-scheduler.kubeconfig"
    KUBE_APISERVER="https://192.168.80.45:6443"

    kubectl config set-cluster kubernetes \
      --certificate-authority=/etc/kubernetes/pki/ca.pem \
      --embed-certs=true \
      --server=${KUBE_APISERVER} \
      --kubeconfig=${KUBE_CONFIG}
    kubectl config set-credentials kube-scheduler \
      --client-certificate=/etc/kubernetes/pki/kube-scheduler.pem \
      --client-key=/etc/kubernetes/pki/kube-scheduler-key.pem \
      --embed-certs=true \
      --kubeconfig=${KUBE_CONFIG}
    kubectl config set-context default \
      --cluster=kubernetes \
      --user=kube-scheduler \
      --kubeconfig=${KUBE_CONFIG}
    kubectl config use-context default --kubeconfig=${KUBE_CONFIG}
    cat > /etc/kubernetes/kube-scheduler.conf << EOF
    KUBE_SCHEDULER_OPTS="--logtostderr=false \
    --v=2 \
    --log-dir=/var/log/kubernetes \
    --leader-elect \
    --kubeconfig=/etc/kubernetes/kube-scheduler.kubeconfig \
    --bind-address=127.0.0.1"
    EOF
    cat > /lib/systemd/system/kube-scheduler.service << EOF
    [Unit]
    Description=Kubernetes Scheduler
    Documentation=https://github.com/kubernetes/kubernetes

    [Service]
    EnvironmentFile=-/etc/kubernetes/kube-scheduler.conf
    ExecStart=/usr/bin/kube-scheduler \$KUBE_SCHEDULER_OPTS
    Restart=on-failure

    [Install]
    WantedBy=multi-user.target
    EOF

    systemctl daemon-reload
    systemctl start kube-scheduler
    systemctl enable kube-scheduler
    systemctl status kube-scheduler
    cat > /etc/kubernetes/kubelet-config.yml << EOF
    kind: KubeletConfiguration
    apiVersion: kubelet.config.k8s.io/v1beta1
    address: 0.0.0.0
    port: 10250
    readOnlyPort: 10255
    cgroupDriver: cgroupfs
    clusterDNS:
    - 10.254.0.2
    clusterDomain: cluster.local
    failSwapOn: false
    authentication:
      anonymous:
        enabled: false
      webhook:
        cacheTTL: 2m0s
        enabled: true
      x509:
        clientCAFile: /etc/kubernetes/pki/ca.pem
    authorization:
      mode: Webhook
      webhook:
        cacheAuthorizedTTL: 5m0s
        cacheUnauthorizedTTL: 30s
    evictionHard:
      imagefs.available: 15%
      memory.available: 100Mi
      nodefs.available: 10%
      nodefs.inodesFree: 5%
    maxOpenFiles: 1000000
    maxPods: 110
    EOF
    BOOTSTRAP_TOKEN=$(cat /etc/kubernetes/token.csv | awk -F, '{print $1}')
    KUBE_CONFIG="/etc/kubernetes/bootstrap.kubeconfig"
    KUBE_APISERVER="https://192.168.80.45:6443"

    # Generate the kubelet bootstrap kubeconfig file
    kubectl config set-cluster kubernetes \
      --certificate-authority=/etc/kubernetes/pki/ca.pem \
      --embed-certs=true \
      --server=${KUBE_APISERVER} \
      --kubeconfig=${KUBE_CONFIG}
    kubectl config set-credentials "kubelet-bootstrap" \
      --token=${BOOTSTRAP_TOKEN} \
      --kubeconfig=${KUBE_CONFIG}
    kubectl config set-context default \
      --cluster=kubernetes \
      --user="kubelet-bootstrap" \
      --kubeconfig=${KUBE_CONFIG}
    kubectl config use-context default --kubeconfig=${KUBE_CONFIG}
Note: the file given by --kubeconfig=/etc/kubernetes/kubelet.kubeconfig is generated automatically when the node joins the cluster.
    cat > /etc/kubernetes/kubelet.conf << EOF
    KUBELET_OPTS="--logtostderr=false \\
    --v=2 \\
    --log-dir=/var/log/kubernetes \\
    --hostname-override=k8s-master1 \\
    --network-plugin=cni \\
    --kubeconfig=/etc/kubernetes/kubelet.kubeconfig \\
    --bootstrap-kubeconfig=/etc/kubernetes/bootstrap.kubeconfig \\
    --config=/etc/kubernetes/kubelet-config.yml \\
    --cert-dir=/etc/kubernetes/pki \\
    --pod-infra-container-image=mirrorgooglecontainers/pause-amd64:3.1"
    EOF
The binding below prevents this error: failed to run Kubelet: cannot create certificate signing request: certificatesigningrequests.certificates.k8s.io is forbidden: User "kubelet-bootstrap" cannot create resource "certificatesigningrequests" in API group "certificates.k8s.io" at the cluster scope
    kubectl create clusterrolebinding kubelet-bootstrap \
      --clusterrole=system:node-bootstrapper \
      --user=kubelet-bootstrap
    cat > /lib/systemd/system/kubelet.service << EOF
    [Unit]
    Description=Kubernetes Kubelet
    After=docker.service

    [Service]
    EnvironmentFile=/etc/kubernetes/kubelet.conf
    ExecStart=/usr/bin/kubelet \$KUBELET_OPTS
    Restart=on-failure
    LimitNOFILE=65536

    [Install]
    WantedBy=multi-user.target
    EOF

    systemctl daemon-reload
    systemctl start kubelet
    systemctl enable kubelet
    systemctl status kubelet
    # List the kubelet certificate requests
    kubectl get csr
    NAME                                                   AGE   SIGNERNAME                                    REQUESTOR           CONDITION
    node-csr-qlwTsndFeZfb4r45MpY8b0fRyf6NnH-Y42cCuWCF2dk   25s   kubernetes.io/kube-apiserver-client-kubelet   kubelet-bootstrap   Pending

    # Approve the request
    kubectl certificate approve node-csr-qlwTsndFeZfb4r45MpY8b0fRyf6NnH-Y42cCuWCF2dk

    # List the requests again
    kubectl get csr
    NAME                                                   AGE   SIGNERNAME                                    REQUESTOR           CONDITION
    node-csr-qlwTsndFeZfb4r45MpY8b0fRyf6NnH-Y42cCuWCF2dk   53m   kubernetes.io/kube-apiserver-client-kubelet   kubelet-bootstrap   Approved,Issued

    # List the nodes (the node is NotReady because the network plugin is not deployed yet)
    kubectl get node
    NAME          STATUS     ROLES    AGE    VERSION
    k8s-master1   NotReady   <none>   4m8s   v1.19.11
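When several nodes join at once, copying each request name by hand gets tedious. As a sketch, the pending names can be filtered out of the `kubectl get csr` listing with awk; `pending_csrs` is a made-up helper name, and the sample rows below use shortened placeholder names rather than real requests.

```shell
# Hypothetical helper: read `kubectl get csr` output on stdin and print the
# names of requests whose last column (CONDITION) is exactly Pending.
pending_csrs() {
  awk '$NF == "Pending" { print $1 }'
}

# Sample input with placeholder request names (not real cluster output).
pending_csrs <<'EOF'
NAME           AGE   SIGNERNAME                                    REQUESTOR           CONDITION
node-csr-aaa   28s   kubernetes.io/kube-apiserver-client-kubelet   kubelet-bootstrap   Pending
node-csr-bbb   14m   kubernetes.io/kube-apiserver-client-kubelet   kubelet-bootstrap   Approved,Issued
EOF
```

On a real cluster the input would come from `kubectl get csr`, and each printed name could then be passed to `kubectl certificate approve` after checking that it belongs to an expected node.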
clusterCIDR: 10.244.0.0/16: the Pod IP range, which must match --cluster-cidr of kube-controller-manager (it is not the Service IP range given by --service-cluster-ip-range)
    cat > /etc/kubernetes/kube-proxy-config.yml << EOF
    kind: KubeProxyConfiguration
    apiVersion: kubeproxy.config.k8s.io/v1alpha1
    bindAddress: 0.0.0.0
    metricsBindAddress: 0.0.0.0:10249
    clientConnection:
      kubeconfig: /etc/kubernetes/kube-proxy.kubeconfig
    hostnameOverride: k8s-master1
    clusterCIDR: 10.244.0.0/16
    EOF
    KUBE_CONFIG="/etc/kubernetes/kube-proxy.kubeconfig"
    KUBE_APISERVER="https://192.168.80.45:6443"

    kubectl config set-cluster kubernetes \
      --certificate-authority=/etc/kubernetes/pki/ca.pem \
      --embed-certs=true \
      --server=${KUBE_APISERVER} \
      --kubeconfig=${KUBE_CONFIG}
    kubectl config set-credentials kube-proxy \
      --client-certificate=/etc/kubernetes/pki/kube-proxy.pem \
      --client-key=/etc/kubernetes/pki/kube-proxy-key.pem \
      --embed-certs=true \
      --kubeconfig=${KUBE_CONFIG}
    kubectl config set-context default \
      --cluster=kubernetes \
      --user=kube-proxy \
      --kubeconfig=${KUBE_CONFIG}
    kubectl config use-context default --kubeconfig=${KUBE_CONFIG}
    cat > /etc/kubernetes/kube-proxy.conf << EOF
    KUBE_PROXY_OPTS="--logtostderr=false \
    --v=2 \
    --log-dir=/var/log/kubernetes \
    --config=/etc/kubernetes/kube-proxy-config.yml"
    EOF
    cat > /lib/systemd/system/kube-proxy.service << EOF
    [Unit]
    Description=Kubernetes Proxy
    After=network.target

    [Service]
    EnvironmentFile=/etc/kubernetes/kube-proxy.conf
    ExecStart=/usr/bin/kube-proxy \$KUBE_PROXY_OPTS
    Restart=on-failure
    LimitNOFILE=65536

    [Install]
    WantedBy=multi-user.target
    EOF

    systemctl daemon-reload
    systemctl start kube-proxy
    systemctl enable kube-proxy
    systemctl status kube-proxy
Authorize the apiserver to access the kubelet
    mkdir -p $HOME/k8s-install && cd $HOME/k8s-install
    cat > apiserver-to-kubelet-rbac.yaml << EOF
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      annotations:
        rbac.authorization.kubernetes.io/autoupdate: "true"
      labels:
        kubernetes.io/bootstrapping: rbac-defaults
      name: system:kube-apiserver-to-kubelet
    rules:
      - apiGroups:
          - ""
        resources:
          - nodes/proxy
          - nodes/stats
          - nodes/log
          - nodes/spec
          - nodes/metrics
          - pods/log
        verbs:
          - "*"
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      name: system:kube-apiserver
      namespace: ""
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: system:kube-apiserver-to-kubelet
    subjects:
      - apiGroup: rbac.authorization.k8s.io
        kind: User
        name: kubernetes
    EOF

    kubectl apply -f apiserver-to-kubelet-rbac.yaml
    mkdir -p /root/.kube
    KUBE_CONFIG=/root/.kube/config
    KUBE_APISERVER="https://192.168.80.45:6443"

    kubectl config set-cluster kubernetes \
      --certificate-authority=/etc/kubernetes/pki/ca.pem \
      --embed-certs=true \
      --server=${KUBE_APISERVER} \
      --kubeconfig=${KUBE_CONFIG}
    kubectl config set-credentials cluster-admin \
      --client-certificate=/etc/kubernetes/pki/admin.pem \
      --client-key=/etc/kubernetes/pki/admin-key.pem \
      --embed-certs=true \
      --kubeconfig=${KUBE_CONFIG}
    kubectl config set-context default \
      --cluster=kubernetes \
      --user=cluster-admin \
      --kubeconfig=${KUBE_CONFIG}
    kubectl config use-context default --kubeconfig=${KUBE_CONFIG}
    kubectl config view
    apiVersion: v1
    clusters:
    - cluster:
        certificate-authority-data: DATA+OMITTED
        server: https://192.168.80.45:6443
      name: kubernetes
    contexts:
    - context:
        cluster: kubernetes
        user: cluster-admin
      name: default
    current-context: default
    kind: Config
    preferences: {}
    users:
    - name: cluster-admin
      user:
        client-certificate-data: REDACTED
        client-key-data: REDACTED
    kubectl get cs
    Warning: v1 ComponentStatus is deprecated in v1.19+
    NAME                 STATUS    MESSAGE             ERROR
    scheduler            Healthy   ok
    controller-manager   Healthy   ok
    etcd-1               Healthy   {"health":"true"}
    etcd-2               Healthy   {"health":"true"}
    etcd-0               Healthy   {"health":"true"}
    apt install -y bash-completion
    source /usr/share/bash-completion/bash_completion
    source <(kubectl completion bash)
    echo "source <(kubectl completion bash)" >> ~/.bashrc
Kubernetes worker node components:
kubelet
kube-proxy
    mkdir -p $HOME/k8s-install && cd $HOME/k8s-install
    tar cvf worker-node-clone.tar /usr/bin/{kubelet,kube-proxy} /lib/systemd/system/{kubelet,kube-proxy}.service /etc/kubernetes/kubelet* /etc/kubernetes/kube-proxy* /etc/kubernetes/pki /etc/kubernetes/bootstrap.kubeconfig
    scp worker-node-clone.tar root@192.168.80.46:/root
    scp worker-node-clone.tar root@192.168.80.47:/root
    cd / && mv /root/worker-node-clone.tar / && tar xvf worker-node-clone.tar && rm -f worker-node-clone.tar

    # Delete the files generated automatically after the certificate request was approved; they are regenerated later
    rm -f /etc/kubernetes/kubelet.kubeconfig
    rm -f /etc/kubernetes/pki/kubelet*

    # Log directory
    mkdir -p /var/log/kubernetes
Change the values to match the actual node name
    # kubelet
    vim /etc/kubernetes/kubelet.conf
    --hostname-override=k8s-node01

    # kube-proxy
    vim /etc/kubernetes/kube-proxy-config.yml
    hostnameOverride: k8s-node01
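The two manual edits can also be scripted, which helps when there are more than a couple of nodes. This is a sketch only: `set_node_name` is a made-up helper, it relies on GNU `sed -i`, and the demonstration runs against scratch files rather than the real ones under /etc/kubernetes.

```shell
# Hypothetical helper: set the node name in a kubelet.conf-style file and a
# kube-proxy-config.yml-style file. Relies on GNU sed for in-place editing.
set_node_name() {
  local name="$1" kubelet_conf="$2" proxy_conf="$3"
  sed -i "s/--hostname-override=[^ ]*/--hostname-override=${name}/" "$kubelet_conf"
  sed -i "s/^hostnameOverride:.*/hostnameOverride: ${name}/" "$proxy_conf"
}

# Demonstrate on scratch copies instead of the real configuration files.
tmp="$(mktemp -d)"
printf '%s\n' '--hostname-override=k8s-master1 \' > "$tmp/kubelet.conf"
printf '%s\n' 'hostnameOverride: k8s-master1' > "$tmp/kube-proxy-config.yml"
set_node_name k8s-node01 "$tmp/kubelet.conf" "$tmp/kube-proxy-config.yml"
cat "$tmp/kubelet.conf" "$tmp/kube-proxy-config.yml"
rm -rf "$tmp"
```

On a worker node the last two arguments would be /etc/kubernetes/kubelet.conf and /etc/kubernetes/kube-proxy-config.yml.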
    systemctl daemon-reload
    systemctl start kubelet kube-proxy
    systemctl enable kubelet kube-proxy
    systemctl status kubelet kube-proxy
    # 1. Node requests
    kubectl get csr
    NAME                                                   AGE   SIGNERNAME                                    REQUESTOR           CONDITION
    node-csr-j51DeSAxg95ZULzX0rm8RBIjUQU1O3d4gxBYcAsZkHk   28s   kubernetes.io/kube-apiserver-client-kubelet   kubelet-bootstrap   Pending
    node-csr-oK3jPn4eE3vsNrO88g4vSq2Z66k-8nhEJhDAKPgWZ5k   14s   kubernetes.io/kube-apiserver-client-kubelet   kubelet-bootstrap   Pending
    node-csr-qlwTsndFeZfb4r45MpY8b0fRyf6NnH-Y42cCuWCF2dk   14m   kubernetes.io/kube-apiserver-client-kubelet   kubelet-bootstrap   Approved,Issued

    # 2. Approve the join requests
    kubectl certificate approve node-csr-j51DeSAxg95ZULzX0rm8RBIjUQU1O3d4gxBYcAsZkHk
    kubectl certificate approve node-csr-oK3jPn4eE3vsNrO88g4vSq2Z66k-8nhEJhDAKPgWZ5k

    # 3. Cluster nodes
    kubectl get node
    NAME          STATUS     ROLES    AGE   VERSION
    k8s-master1   NotReady   <none>   45m   v1.19.11
    k8s-node01    NotReady   <none>   6s    v1.19.11
    k8s-node02    NotReady   <none>   10s   v1.19.11

    # 4. Set labels, which changes the node roles
    kubectl label node k8s-master1 node-role.kubernetes.io/master=
    kubectl label node k8s-node01 node-role.kubernetes.io/node=
    kubectl label node k8s-node02 node-role.kubernetes.io/node=

    kubectl get node
    NAME          STATUS     ROLES    AGE     VERSION
    k8s-master1   NotReady   master   49m     v1.19.11
    k8s-node01    NotReady   node     3m45s   v1.19.11
    k8s-node02    NotReady   node     3m49s   v1.19.11

    # 5. Set a taint so that ordinary Pods are not scheduled on the master node
    kubectl taint nodes k8s-master1 node-role.kubernetes.io/master=:NoSchedule

    kubectl describe node k8s-master1
    Taints:             node-role.kubernetes.io/master:NoSchedule
                        node.kubernetes.io/not-ready:NoSchedule
    # Node status
    kubectl get node
    NAME          STATUS     ROLES    AGE     VERSION
    k8s-master1   NotReady   master   49m     v1.19.11
    k8s-node01    NotReady   node     3m45s   v1.19.11
    k8s-node02    NotReady   node     3m49s   v1.19.11

    # Check the logs: the network plugin is not installed
    journalctl -u kubelet -f
    Jun 02 14:24:29 k8s-master1 kubelet[75636]: W0602 14:24:29.172144 75636 cni.go:239] Unable to update cni config: no networks found in /etc/cni/net.d
    Jun 02 14:24:32 k8s-master1 kubelet[75636]: E0602 14:24:32.958021 75636 kubelet.go:2129] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
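The log message points at an empty /etc/cni/net.d, so a quick check for a CNI configuration file can confirm the cause before and after installing a network plugin. This is a minimal sketch: `cni_configured` is a made-up helper, and the set of file extensions (.conf, .conflist, .json) reflects what the kubelet's CNI loader is understood to scan for.

```shell
# Hypothetical helper: succeed if the directory holds at least one CNI
# configuration file (.conf, .conflist, or .json).
cni_configured() {
  local dir="${1:-/etc/cni/net.d}" f
  for f in "$dir"/*.conf "$dir"/*.conflist "$dir"/*.json; do
    [ -f "$f" ] && return 0
  done
  return 1
}

if cni_configured; then
  echo "CNI config found"
else
  echo "no CNI config yet: install a network plugin"
fi
```

Until a plugin such as Calico or Flannel writes its configuration there, the nodes are expected to stay NotReady.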
The IP range involved must match --cluster-cidr of kube-controller-manager
Run on all nodes
    mkdir -p $HOME/k8s-install/network && cd $_
    wget https://github.com/containernetworking/plugins/releases/download/v0.9.1/cni-plugins-linux-amd64-v0.9.1.tgz
    mkdir -p /opt/cni/bin
    tar zxvf cni-plugins-linux-amd64-v0.9.1.tgz -C /opt/cni/bin
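Calico is a pure layer-3 data center networking solution and is currently a mainstream networking option for Kubernetes.
Note: if the images are stuck pending, pull them to the local host manually first.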
    mkdir -p $HOME/k8s-install/network && cd $HOME/k8s-install/network

    # 1. Download the manifest
    wget https://docs.projectcalico.org/manifests/calico.yaml

    # The CIDR value must match --cluster-cidr=10.244.0.0/16 of kube-controller-manager
    vi calico.yaml
                # The default IPv4 pool to create on startup if none exists. Pod IPs will be
                # chosen from this range. Changing this value after installation will have
                # no effect. This should fall within `--cluster-cidr`.
                - name: CALICO_IPV4POOL_CIDR
                  value: "10.244.0.0/16"

    # 2. Install the network plugin
    kubectl apply -f calico.yaml

    # 3. Check that it has started
    kubectl get pod -n kube-system
    NAME                                       READY   STATUS    RESTARTS   AGE
    calico-kube-controllers-7f4f5bf95d-tgklk   1/1     Running   0          2m7s
    calico-node-fwv5x                          1/1     Running   0          2m8s
    calico-node-ttt2c                          1/1     Running   0          2m8s
    calico-node-xjvjf                          1/1     Running   0          2m8s

    # 4. The nodes are now Ready
    kubectl get node
    NAME          STATUS   ROLES    AGE   VERSION
    k8s-master1   Ready    master   65m   v1.19.11
    k8s-node01    Ready    node     20m   v1.19.11
    k8s-node02    Ready    node     20m   v1.19.11
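Because the pool CIDR has to agree with `--cluster-cidr`, it can be read back from the edited manifest before applying it. The sketch below is an illustration under assumptions: `calico_pool_cidr` is a made-up helper, and it expects the `CALICO_IPV4POOL_CIDR` entry to be uncommented with its `value:` on the very next line, as in the excerpt above.

```shell
# Hypothetical helper: print the value that follows the uncommented
# CALICO_IPV4POOL_CIDR entry. Reads the named file, or stdin if none is given.
calico_pool_cidr() {
  awk '
    /^ *- name: CALICO_IPV4POOL_CIDR/ {
      getline              # move to the "value:" line
      gsub(/[" ]/, "")     # drop quotes and spaces
      sub(/^value:/, "")   # keep only the CIDR
      print
      exit
    }' "$@"
}

# Sample input copied from the excerpt above.
calico_pool_cidr <<'EOF'
            - name: CALICO_IPV4POOL_CIDR
              value: "10.244.0.0/16"
EOF
```

Against the downloaded file this would be `calico_pool_cidr calico.yaml`, and the result can be compared with the `--cluster-cidr` value by eye or in a script.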
Flannel is another network plugin option. Install either it or Calico, not both.
mkdir -p $HOME/k8s-install/network && cd $HOME/k8s-install/network

# Downloading this file may require a proxy
wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

# Check the pod network before applying; it must match --cluster-cidr
vim kube-flannel.yml
      "Network": "10.244.0.0/16",

kubectl apply -f kube-flannel.yml

kubectl get pod -n kube-system
NAME                    READY   STATUS    RESTARTS   AGE
kube-flannel-ds-8qnnx   1/1     Running   0          10s
kube-flannel-ds-979lc   1/1     Running   0          16m
kube-flannel-ds-kgmgg   1/1     Running   0          16m

kubectl get node
NAME          STATUS   ROLES    AGE   VERSION
k8s-master1   Ready    master   85m   v1.19.11
k8s-node01    Ready    node     40m   v1.19.11
k8s-node02    Ready    node     40m   v1.19.11
Source file:
cat <<EOF > kube-flannel.yml
---
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: psp.flannel.unprivileged
  annotations:
    seccomp.security.alpha.kubernetes.io/allowedProfileNames: docker/default
    seccomp.security.alpha.kubernetes.io/defaultProfileName: docker/default
    apparmor.security.beta.kubernetes.io/allowedProfileNames: runtime/default
    apparmor.security.beta.kubernetes.io/defaultProfileName: runtime/default
spec:
  privileged: false
  volumes:
    - configMap
    - secret
    - emptyDir
    - hostPath
  allowedHostPaths:
    - pathPrefix: "/etc/cni/net.d"
    - pathPrefix: "/etc/kube-flannel"
    - pathPrefix: "/run/flannel"
  readOnlyRootFilesystem: false
  # Users and groups
  runAsUser:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  # Privilege Escalation
  allowPrivilegeEscalation: false
  defaultAllowPrivilegeEscalation: false
  # Capabilities
  allowedCapabilities: ['NET_ADMIN']
  defaultAddCapabilities: []
  requiredDropCapabilities: []
  # Host namespaces
  hostPID: false
  hostIPC: false
  hostNetwork: true
  hostPorts:
  - min: 0
    max: 65535
  # SELinux
  seLinux:
    # SELinux is unused in CaaSP
    rule: 'RunAsAny'
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: flannel
rules:
  - apiGroups: ['extensions']
    resources: ['podsecuritypolicies']
    verbs: ['use']
    resourceNames: ['psp.flannel.unprivileged']
  - apiGroups:
      - ""
    resources:
      - pods
    verbs:
      - get
  - apiGroups:
      - ""
    resources:
      - nodes
    verbs:
      - list
      - watch
  - apiGroups:
      - ""
    resources:
      - nodes/status
    verbs:
      - patch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: flannel
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: flannel
subjects:
- kind: ServiceAccount
  name: flannel
  namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: flannel
  namespace: kube-system
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: kube-flannel-cfg
  namespace: kube-system
  labels:
    tier: node
    app: flannel
data:
  cni-conf.json: |
    {
      "name": "cbr0",
      "cniVersion": "0.3.1",
      "plugins": [
        {
          "type": "flannel",
          "delegate": {
            "hairpinMode": true,
            "isDefaultGateway": true
          }
        },
        {
          "type": "portmap",
          "capabilities": {
            "portMappings": true
          }
        }
      ]
    }
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "vxlan"
      }
    }
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube-flannel-ds-amd64
  namespace: kube-system
  labels:
    tier: node
    app: flannel
spec:
  selector:
    matchLabels:
      app: flannel
  template:
    metadata:
      labels:
        tier: node
        app: flannel
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: beta.kubernetes.io/os
                    operator: In
                    values:
                      - linux
                  - key: beta.kubernetes.io/arch
                    operator: In
                    values:
                      - amd64
      hostNetwork: true
      tolerations:
      - operator: Exists
        effect: NoSchedule
      serviceAccountName: flannel
      initContainers:
      - name: install-cni
        image: quay.io/coreos/flannel:v0.11.0-amd64
        command:
        - cp
        args:
        - -f
        - /etc/kube-flannel/cni-conf.json
        - /etc/cni/net.d/10-flannel.conflist
        volumeMounts:
        - name: cni
          mountPath: /etc/cni/net.d
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      containers:
      - name: kube-flannel
        image: quay.io/coreos/flannel:v0.11.0-amd64
        command:
        - /opt/bin/flanneld
        args:
        - --ip-masq
        - --kube-subnet-mgr
        resources:
          requests:
            cpu: "100m"
            memory: "50Mi"
          limits:
            cpu: "100m"
            memory: "50Mi"
        securityContext:
          privileged: false
          capabilities:
            add: ["NET_ADMIN"]
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        volumeMounts:
        - name: run
          mountPath: /run/flannel
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      volumes:
        - name: run
          hostPath:
            path: /run/flannel
        - name: cni
          hostPath:
            path: /etc/cni/net.d
        - name: flannel-cfg
          configMap:
            name: kube-flannel-cfg
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube-flannel-ds-arm64
  namespace: kube-system
  labels:
    tier: node
    app: flannel
spec:
  selector:
    matchLabels:
      app: flannel
  template:
    metadata:
      labels:
        tier: node
        app: flannel
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: beta.kubernetes.io/os
                    operator: In
                    values:
                      - linux
                  - key: beta.kubernetes.io/arch
                    operator: In
                    values:
                      - arm64
      hostNetwork: true
      tolerations:
      - operator: Exists
        effect: NoSchedule
      serviceAccountName: flannel
      initContainers:
      - name: install-cni
        image: quay.io/coreos/flannel:v0.11.0-arm64
        command:
        - cp
        args:
        - -f
        - /etc/kube-flannel/cni-conf.json
        - /etc/cni/net.d/10-flannel.conflist
        volumeMounts:
        - name: cni
          mountPath: /etc/cni/net.d
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      containers:
      - name: kube-flannel
        image: quay.io/coreos/flannel:v0.11.0-arm64
        command:
        - /opt/bin/flanneld
        args:
        - --ip-masq
        - --kube-subnet-mgr
        resources:
          requests:
            cpu: "100m"
            memory: "50Mi"
          limits:
            cpu: "100m"
            memory: "50Mi"
        securityContext:
          privileged: false
          capabilities:
            add: ["NET_ADMIN"]
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        volumeMounts:
        - name: run
          mountPath: /run/flannel
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      volumes:
        - name: run
          hostPath:
            path: /run/flannel
        - name: cni
          hostPath:
            path: /etc/cni/net.d
        - name: flannel-cfg
          configMap:
            name: kube-flannel-cfg
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube-flannel-ds-arm
  namespace: kube-system
  labels:
    tier: node
    app: flannel
spec:
  selector:
    matchLabels:
      app: flannel
  template:
    metadata:
      labels:
        tier: node
        app: flannel
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: beta.kubernetes.io/os
                    operator: In
                    values:
                      - linux
                  - key: beta.kubernetes.io/arch
                    operator: In
                    values:
                      - arm
      hostNetwork: true
      tolerations:
      - operator: Exists
        effect: NoSchedule
      serviceAccountName: flannel
      initContainers:
      - name: install-cni
        image: quay.io/coreos/flannel:v0.11.0-arm
        command:
        - cp
        args:
        - -f
        - /etc/kube-flannel/cni-conf.json
        - /etc/cni/net.d/10-flannel.conflist
        volumeMounts:
        - name: cni
          mountPath: /etc/cni/net.d
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      containers:
      - name: kube-flannel
        image: quay.io/coreos/flannel:v0.11.0-arm
        command:
        - /opt/bin/flanneld
        args:
        - --ip-masq
        - --kube-subnet-mgr
        resources:
          requests:
            cpu: "100m"
            memory: "50Mi"
          limits:
            cpu: "100m"
            memory: "50Mi"
        securityContext:
          privileged: false
          capabilities:
            add: ["NET_ADMIN"]
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        volumeMounts:
        - name: run
          mountPath: /run/flannel
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      volumes:
        - name: run
          hostPath:
            path: /run/flannel
        - name: cni
          hostPath:
            path: /etc/cni/net.d
        - name: flannel-cfg
          configMap:
            name: kube-flannel-cfg
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube-flannel-ds-ppc64le
  namespace: kube-system
  labels:
    tier: node
    app: flannel
spec:
  selector:
    matchLabels:
      app: flannel
  template:
    metadata:
      labels:
        tier: node
        app: flannel
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: beta.kubernetes.io/os
                    operator: In
                    values:
                      - linux
                  - key: beta.kubernetes.io/arch
                    operator: In
                    values:
                      - ppc64le
      hostNetwork: true
      tolerations:
      - operator: Exists
        effect: NoSchedule
      serviceAccountName: flannel
      initContainers:
      - name: install-cni
        image: quay.io/coreos/flannel:v0.11.0-ppc64le
        command:
        - cp
        args:
        - -f
        - /etc/kube-flannel/cni-conf.json
        - /etc/cni/net.d/10-flannel.conflist
        volumeMounts:
        - name: cni
          mountPath: /etc/cni/net.d
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      containers:
      - name: kube-flannel
        image: quay.io/coreos/flannel:v0.11.0-ppc64le
        command:
        - /opt/bin/flanneld
        args:
        - --ip-masq
        - --kube-subnet-mgr
        resources:
          requests:
            cpu: "100m"
            memory: "50Mi"
          limits:
            cpu: "100m"
            memory: "50Mi"
        securityContext:
          privileged: false
          capabilities:
            add: ["NET_ADMIN"]
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        volumeMounts:
        - name: run
          mountPath: /run/flannel
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      volumes:
        - name: run
          hostPath:
            path: /run/flannel
        - name: cni
          hostPath:
            path: /etc/cni/net.d
        - name: flannel-cfg
          configMap:
            name: kube-flannel-cfg
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube-flannel-ds-s390x
  namespace: kube-system
  labels:
    tier: node
    app: flannel
spec:
  selector:
    matchLabels:
      app: flannel
  template:
    metadata:
      labels:
        tier: node
        app: flannel
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: beta.kubernetes.io/os
                    operator: In
                    values:
                      - linux
                  - key: beta.kubernetes.io/arch
                    operator: In
                    values:
                      - s390x
      hostNetwork: true
      tolerations:
      - operator: Exists
        effect: NoSchedule
      serviceAccountName: flannel
      initContainers:
      - name: install-cni
        image: quay.io/coreos/flannel:v0.11.0-s390x
        command:
        - cp
        args:
        - -f
        - /etc/kube-flannel/cni-conf.json
        - /etc/cni/net.d/10-flannel.conflist
        volumeMounts:
        - name: cni
          mountPath: /etc/cni/net.d
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      containers:
      - name: kube-flannel
        image: quay.io/coreos/flannel:v0.11.0-s390x
        command:
        - /opt/bin/flanneld
        args:
        - --ip-masq
        - --kube-subnet-mgr
        resources:
          requests:
            cpu: "100m"
            memory: "50Mi"
          limits:
            cpu: "100m"
            memory: "50Mi"
        securityContext:
          privileged: false
          capabilities:
            add: ["NET_ADMIN"]
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        volumeMounts:
        - name: run
          mountPath: /run/flannel
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      volumes:
        - name: run
          hostPath:
            path: /run/flannel
        - name: cni
          hostPath:
            path: /etc/cni/net.d
        - name: flannel-cfg
          configMap:
            name: kube-flannel-cfg
EOF
CoreDNS resolves Service names inside the cluster.
Installation method 1:
Troubleshooting reference: https://blog.51cto.com/hexiaoshuai/2812394
Official repository: https://github.com/coredns/deployment/tree/master/kubernetes
apt install jq -y

mkdir -p $HOME/k8s-install/coredns && cd $HOME/k8s-install/coredns
git clone https://github.com/coredns/deployment.git
cd deployment/kubernetes

export CLUSTER_DNS_SVC_IP="10.254.0.2"
export CLUSTER_DNS_DOMAIN="cluster.local"

# Edit coredns.yaml.sed and remove the "loop" line

# Deploy
./deploy.sh -i ${CLUSTER_DNS_SVC_IP} -d ${CLUSTER_DNS_DOMAIN} | kubectl apply -f -
Installation method 2:
# Download the manifest
wget https://storage.googleapis.com/kubernetes-the-hard-way/coredns.yaml

# Change the cluster IP in the file
clusterIP: 10.254.0.2

# Apply the manifest
kubectl apply -f coredns.yaml

# Check the status
kubectl get pods -n kube-system | grep coredns
coredns-7bb48b4bc5-42j9n   1/1   Running   0   3m36s
coredns-7bb48b4bc5-8ppbl   1/1   Running   0   3m36s

# Check the status
kubectl get pods -n kube-system | grep coredns
coredns-746fcb4bc5-nts2k   1/1   Running   0   6m2s

# Verify (busybox 1.28.4 was found to have problems)
kubectl run -it --rm dns-test --image=busybox:1.28.4 /bin/sh
If you don't see a command prompt, try pressing enter.
/ # nslookup kubernetes
Server:    10.254.0.2
Address:   10.254.0.2:53

Name:      kubernetes.default.svc.cluster.local
Address:   10.0.0.1
DNS troubleshooting:
# DNS service
kubectl get svc -n kube-system
NAME       TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                  AGE
kube-dns   ClusterIP   10.254.0.2   <none>        53/UDP,53/TCP,9153/TCP   13m

# Check that the endpoints are healthy
kubectl get endpoints kube-dns -n kube-system
NAME       ENDPOINTS                                               AGE
kube-dns   10.244.85.194:53,10.244.85.194:53,10.244.85.194:9153   13m

# Enable query logging in CoreDNS
CoreDNS configuration parameters:
errors: writes errors to the console.
health: exposes a health check at http://localhost:8080/health; if it reports unhealthy, the Pod is restarted.
ready: once all plugins have finished loading, an endpoint on port 8181 returns HTTP status 200.
kubernetes: CoreDNS answers DNS queries based on the IPs of Kubernetes services and pods.
prometheus: enables the CoreDNS metrics endpoint when configured; the address is http://localhost:9153/metrics
forward: queries for any domain outside the Kubernetes cluster are forwarded to the predefined resolvers (/etc/resolv.conf).
cache: enables caching with a 30-second TTL.
loop: detects simple forwarding loops and halts the CoreDNS process if one is found.
reload: watches the CoreDNS configuration and reloads it when it changes.
loadbalance: DNS load balancer, round_robin by default.

# Edit the coredns configuration
kubectl edit configmap coredns -n kube-system
apiVersion: v1
data:
  Corefile: |
    .:53 {
        log    # newly added
        errors
        health {
          lameduck 5s
        }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
          fallthrough in-addr.arpa ip6.arpa
        }
        prometheus :9153
        forward . /etc/resolv.conf {
          max_concurrent 1000
        }
        cache 30
        loop
        reload
        loadbalance
    }
kind: ConfigMap
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","data":{"Corefile":".:53 {\n errors\n health {\n lameduck 5s\n }\n ready\n kubernetes cluster.local in-addr.arpa ip6.arpa {\n fallthrough in-addr.arpa ip6.arpa\n }\n prometheus :9153\n forward . /etc/resolv.conf {\n max_concurrent 1000\n }\n cache 30\n loop\n reload\n loadbalance\n}\n"},"kind":"ConfigMap","metadata":{"annotations":{},"name":"coredns","namespace":"kube-system"}}
  creationTimestamp: "2021-05-13T11:57:45Z"
  name: coredns
  namespace: kube-system
  resourceVersion: "38460"
  selfLink: /api/v1/namespaces/kube-system/configmaps/coredns
  uid: c62a856d-1fc3-4fe9-b5f1-3ca0dbeb39c1
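The `log` line can also be added without an interactive editor. This is only a sketch under a few assumptions: the ConfigMap prints its Corefile as a YAML block with `.:53 {` on its own line, as shown above, and GNU sed is available (Ubuntu ships it). The function name `add_log_plugin` is hypothetical.

```shell
#!/usr/bin/env bash
# add_log_plugin (hypothetical helper): read a ConfigMap (YAML) on stdin and
# write it to stdout with a "log" line inserted right after the ".:53 {"
# opener, indented four spaces deeper than the opener. Input that already has
# a bare "log" line is passed through unchanged. Requires GNU sed, which
# accepts \n in the replacement text.
add_log_plugin() {
  local yaml
  yaml=$(cat)
  if printf '%s\n' "$yaml" | grep -qE '^[[:space:]]*log[[:space:]]*$'; then
    printf '%s\n' "$yaml"
  else
    printf '%s\n' "$yaml" \
      | sed -E 's/^([[:space:]]*)\.:53 \{[[:space:]]*$/&\n\1    log/'
  fi
}
```

One way to use it is `kubectl get configmap coredns -n kube-system -o yaml | add_log_plugin | kubectl apply -f -`. With the `reload` plugin enabled, CoreDNS should pick up the change on its own after a short delay.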
Rollback (requires Internet access; applies to installation method 1):
wget https://raw.githubusercontent.com/coredns/deployment/master/kubernetes/rollback.sh
chmod +x rollback.sh

export CLUSTER_DNS_SVC_IP="10.254.0.2"
export CLUSTER_DNS_DOMAIN="cluster.local"

# Internet access is recommended for this step
./rollback.sh -i ${CLUSTER_DNS_SVC_IP} -d ${CLUSTER_DNS_DOMAIN} | kubectl apply -f -
kubectl delete --namespace=kube-system deployment coredns
GitHub: https://github.com/kubernetes/dashboard/blob/master/aio/deploy/recommended.yaml
If the images cannot be pulled automatically, pull them by hand with docker pull <image name>.
mkdir -p $HOME/k8s-install/dashboard && cd $HOME/k8s-install/dashboard

# 1. Download and install (use the raw file URL; the github.com/.../blob/... URL returns an HTML page)
wget https://raw.githubusercontent.com/kubernetes/dashboard/v2.5.1/aio/deploy/recommended.yaml
kubectl apply -f recommended.yaml

# 2. Check the pod status
kubectl get pods -n kubernetes-dashboard -o wide
NAME                                         READY   STATUS              RESTARTS   AGE   IP       NODE          NOMINATED NODE   READINESS GATES
dashboard-metrics-scraper-5b8896d7fc-58fgt   0/1     ContainerCreating   0          7s    <none>   k8s-node01    <none>           <none>
kubernetes-dashboard-7b5d774449-tn7hk        0/1     ContainerCreating   0          7s    <none>   k8s-master1   <none>           <none>

# 3. Check the service status
kubectl get svc -n kubernetes-dashboard -o wide
NAME                        TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE   SELECTOR
dashboard-metrics-scraper   ClusterIP   10.254.14.1      <none>        8000/TCP   24m   k8s-app=dashboard-metrics-scraper
kubernetes-dashboard        ClusterIP   10.254.219.125   <none>        443/TCP    24m   k8s-app=kubernetes-dashboard

# 4. Change the service to NodePort
kubectl edit svc kubernetes-dashboard -n kubernetes-dashboard
type: ClusterIP => type: NodePort

kubectl get svc -n kubernetes-dashboard -o wide
NAME                        TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)         AGE     SELECTOR
dashboard-metrics-scraper   ClusterIP   10.254.14.1      <none>        8000/TCP        3h30m   k8s-app=dashboard-metrics-scraper
kubernetes-dashboard        NodePort    10.254.219.125   <none>        443:31639/TCP   3h30m   k8s-app=kubernetes-dashboard

# 5. Create a service account and bind it to the built-in cluster-admin cluster role
kubectl create serviceaccount dashboard-admin -n kube-system
kubectl create clusterrolebinding dashboard-admin --clusterrole=cluster-admin --serviceaccount=kube-system:dashboard-admin

# 6. Get the access token
kubectl describe secrets -n kube-system $(kubectl -n kube-system get secret | awk '/dashboard-admin/{print $1}')
Name:         dashboard-admin-token-xwd72
Namespace:    kube-system
Labels:       <none>
Annotations:  kubernetes.io/service-account.name: dashboard-admin
              kubernetes.io/service-account.uid: 013e9f84-827f-4dc7-81b3-874a28bfebc6

Type:  kubernetes.io/service-account-token

Data
====
ca.crt:     1310 bytes
namespace:  11 bytes
token:      eyJhbGciOiJSUzI1NiIsImtpZCI6InNQRElCQTlPRUZ5SU54STQ1QWllLXlKMTFCcmZieG0wVTJnRlpzYlBNLXcifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJkYXNoYm9hcmQtYWRtaW4tdG9rZW4teHdkNzIiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC5uYW1lIjoiZGFzaGJvYXJkLWFkbWluIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQudWlkIjoiMDEzZTlmODQtODI3Zi00ZGM3LTgxYjMtODc0YTI4YmZlYmM2Iiwic3ViIjoic3lzdGVtOnNlcnZpY2VhY2NvdW50Omt1YmUtc3lzdGVtOmRhc2hib2FyZC1hZG1pbiJ9.O-DI-0IlLFP2pDRKzQYJrZeDAnVvW1IjU-iVwGzvwID7BH0v6kXfWnti07qm8VkuGFJtpuQsmrf6v4sUeRDhr95kZlEVV8Rxnes6oixrkXdk3fR4xreh4lh6ZgCzbER6xI8pMG-j9KNjTRdY6gQPJuOThtI9ab13dpTT5AYpggA2O98DFfgcJ_DzD05hhk6TghOdoro00msHRSUrsEiH0CYa_3PiyPlkvmmY3MlJPsBTdO2pCDzcrjQ2L5EaJAvSh6OodkRY6ymOwfcbfPs3WwSocCEfwkogYOCAQhMC4NU3Jea_hoeFqzLdS1PK5R2rPT-wqemwjDKn0E6jUv6juw

# 7. Open the dashboard
https://192.168.80.45:31639
Role | IP | Components | Notes |
---|---|---|---|
k8s-master1 | 192.168.80.45 | etcd, api-server, controller-manager, scheduler, kubelet, kube-proxy, docker | |
k8s-node01 | 192.168.80.46 | etcd, kubelet, kube-proxy, docker | |
k8s-node02 | 192.168.80.47 | etcd, kubelet, kube-proxy, docker | |
k8s-master2 | 192.168.80.49 | etcd, api-server, controller-manager, scheduler, kubelet, kube-proxy, docker | New node |
If the new node's IP address is not already listed in the apiserver certificate, regenerate the certificate as follows:
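Whether regeneration is needed can be checked by looking at the Subject Alternative Name entries of the current certificate. A minimal sketch follows; the function name `san_has_ip` is hypothetical, and the certificate file name in the usage note (`apiserver.pem` under `/etc/kubernetes/pki`) is inferred from the copy commands in this guide.

```shell
#!/usr/bin/env bash
# san_has_ip (hypothetical helper): read `openssl x509 -noout -text` output on
# stdin and succeed only if the given IPv4 address appears as an
# "IP Address:" entry in the certificate.
san_has_ip() {
  local ip_regex=${1//./\\.}   # escape dots so they match literally
  # Require a comma, whitespace, or end of line after the address so that
  # 192.168.80.4 does not match 192.168.80.45.
  grep -Eq "IP Address:${ip_regex}(,|[[:space:]]|\$)"
}
```

For example: `openssl x509 -in /etc/kubernetes/pki/apiserver.pem -noout -text | san_has_ip 192.168.80.49 || echo "certificate must be regenerated"`.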
mkdir -p /root/ssl && cd /root/ssl

# 1. Certificate signing request file
cat > apiserver-csr.json <<EOF
{
  "CN": "kubernetes",
  "hosts": [
    "127.0.0.1",
    "localhost",
    "192.168.80.1",
    "192.168.80.2",
    "192.168.80.3",
    "192.168.80.45",
    "192.168.80.46",
    "192.168.80.47",
    "192.168.80.48",
    "192.168.80.49",
    "10.254.0.1",
    "kubernetes",
    "kubernetes.default",
    "kubernetes.default.svc",
    "kubernetes.default.svc.cluster",
    "kubernetes.default.svc.cluster.local"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "BeiJing",
      "L": "BeiJing",
      "O": "k8s",
      "OU": "System"
    }
  ]
}
EOF

# 2. Generate the certificate
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes apiserver-csr.json | cfssljson -bare apiserver

# 3. Distribute the new certificate
cp apiserver*.pem /etc/kubernetes/pki
scp apiserver*.pem root@192.168.80.46:/root
scp apiserver*.pem root@192.168.80.47:/root

# 4. Install the certificate on the other nodes (run on each node)
chown root:root /root/apiserver*.pem
mv /root/apiserver*.pem /etc/kubernetes/pki

# 5. Restart the apiserver
systemctl restart kube-apiserver
systemctl status kube-apiserver
Run on k8s-master1, k8s-node01, and k8s-node02:
echo '192.168.80.49 k8s-master2' >> /etc/hosts
# 1. Set the hostname
hostnamectl set-hostname k8s-master2

# 2. Hostname resolution
cat >> /etc/hosts <<EOF
192.168.80.45 k8s-master1
192.168.80.46 k8s-node01
192.168.80.47 k8s-node02
192.168.80.49 k8s-master2
EOF

# 3. Disable swap
swapoff -a && sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab

# 4. Pass bridged IPv4 traffic to the iptables chains
cat > /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sysctl --system

# 5. DNS resolver
echo "nameserver 8.8.8.8" >> /etc/resolv.conf

# 6. Time synchronization
apt install ntpdate -y
ntpdate ntp1.aliyun.com

# Add the following entry with crontab -e
crontab -e
*/30 * * * * /usr/sbin/ntpdate -u ntp1.aliyun.com >> /var/log/ntpdate.log 2>&1

# 7. Log directory
mkdir -p /var/log/kubernetes
# 1. Run on k8s-master1
mkdir -p $HOME/k8s-install && cd $HOME/k8s-install
tar zcvf master-node-clone.tar.gz /usr/bin/kube* /lib/systemd/system/kube*.service /etc/kubernetes /root/.kube/config /usr/bin/docker* /usr/bin/runc /usr/bin/containerd* /usr/bin/ctr /etc/docker /lib/systemd/system/docker.service
scp master-node-clone.tar.gz root@192.168.80.49:/root

# 2. Run on k8s-master2
cd / && mv /root/master-node-clone.tar.gz / && tar zxvf master-node-clone.tar.gz && rm -f master-node-clone.tar.gz

# Remove the kubelet credentials copied from k8s-master1 so that k8s-master2 requests its own
rm -f /etc/kubernetes/kubelet.kubeconfig
rm -f /etc/kubernetes/pki/kubelet*
# Point the apiserver at this host's address
vim /etc/kubernetes/kube-apiserver.conf
--bind-address=192.168.80.49 \
--advertise-address=192.168.80.49 \

# Replace the hostname and apiserver address in the copied configuration
sed -i 's#k8s-master1#k8s-master2#' /etc/kubernetes/*
sed -i 's#192.168.80.45:6443#192.168.80.49:6443#' /etc/kubernetes/*

vi /root/.kube/config
server: https://192.168.80.49:6443
systemctl daemon-reload
systemctl start docker kube-apiserver kube-controller-manager kube-scheduler kubelet kube-proxy
systemctl status docker kube-apiserver kube-controller-manager kube-scheduler kubelet kube-proxy
systemctl enable docker kube-apiserver kube-controller-manager kube-scheduler kubelet kube-proxy
kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME                 STATUS    MESSAGE             ERROR
controller-manager   Healthy   ok
scheduler            Healthy   ok
etcd-2               Healthy   {"health":"true"}
etcd-1               Healthy   {"health":"true"}
etcd-0               Healthy   {"health":"true"}
kubectl get csr
NAME                                                   AGE     SIGNERNAME                                    REQUESTOR           CONDITION
node-csr-HfzAqSEc7sIIG9QFHip4vGFnFZhyZnYjBVGWQyGpz54   7m49s   kubernetes.io/kube-apiserver-client-kubelet   kubelet-bootstrap   Pending

# Approve the request
kubectl certificate approve node-csr-HfzAqSEc7sIIG9QFHip4vGFnFZhyZnYjBVGWQyGpz54

kubectl get node
NAME          STATUS     ROLES    AGE   VERSION
k8s-master1   Ready      master   27h   v1.19.11
k8s-master2   NotReady   <none>   11s   v1.19.11
k8s-node01    Ready      node     27h   v1.19.11
k8s-node02    Ready      node     27h   v1.19.11
# Set the label
kubectl label node k8s-master2 node-role.kubernetes.io/master=

# Set a taint so that ordinary pods are not scheduled on the master node
kubectl taint nodes k8s-master2 node-role.kubernetes.io/master=:NoSchedule

# Node information
kubectl get nodes --show-labels
NAME          STATUS   ROLES    AGE     VERSION    LABELS
k8s-master1   Ready    master   2d17h   v1.19.11   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8s-master1,kubernetes.io/os=linux,node-role.kubernetes.io/master=
k8s-master2   Ready    master   2m33s   v1.19.11   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8s-master2,kubernetes.io/os=linux,node-role.kubernetes.io/master=
k8s-node01    Ready    node     2d17h   v1.19.11   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8s-node01,kubernetes.io/os=linux,node-role.kubernetes.io/node=
k8s-node02    Ready    node     2d17h   v1.19.11   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8s-node02,kubernetes.io/os=linux,node-role.kubernetes.io/node=
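Nginx: a mainstream web server and reverse proxy. Here its layer-4 (stream) proxying is used to load-balance the apiservers.
Keepalived: a mainstream high-availability tool that provides hot standby between two servers by binding a VIP. Keepalived decides whether to fail over (move the VIP) based on the state of Nginx. For example, when Nginx on the primary node goes down, the VIP is automatically bound to the backup Nginx node, so the VIP stays reachable and Nginx remains highly available.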
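To see a failover happen, it helps to check which load balancer currently holds the VIP. The sketch below parses `ip -o -4 addr show` output; the function name `holds_vip` is hypothetical, and the address 192.168.80.100 in the usage note is the one set in `virtual_ipaddress` in the Keepalived configuration of this guide.

```shell
#!/usr/bin/env bash
# holds_vip (hypothetical helper): read `ip -o -4 addr show` output on stdin
# and succeed if the given address is bound on this host, whatever the
# prefix length.
holds_vip() {
  awk -v vip="$1" '
    {
      for (i = 1; i < NF; i++)
        if ($i == "inet") {
          split($(i + 1), a, "/")      # "192.168.80.100/24" -> "192.168.80.100"
          if (a[1] == vip) found = 1
        }
    }
    END { exit found ? 0 : 1 }'
}
```

Running `ip -o -4 addr show | holds_vip 192.168.80.100 && echo "VIP is here"` on both load balancers, then stopping Nginx on the one that holds the address and running it again, should show the VIP moving to the other host.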
Server plan:
Role | IP | Components |
---|---|---|
k8s-master1 | 192.168.80.45 | kube-apiserver |
k8s-master2 | 192.168.80.49 | kube-apiserver |
k8s-loadbalancer1 | 192.168.80.2 | nginx, keepalived |
k8s-loadbalancer2 | 192.168.80.3 | nginx, keepalived |
VIP | 192.168.80.100 | Virtual IP |
apt install nginx keepalived -y
sudo useradd nginx -G www-data
Fixing the stream module problem: https://blog.csdn.net/qq_39043100/article/details/89644264
cat > /etc/nginx/nginx.conf << "EOF"
load_module /usr/lib/nginx/modules/ngx_stream_module.so;
user nginx;
worker_processes auto;
error_log /var/log/nginx/error.log;
pid /run/nginx.pid;

include /usr/share/nginx/modules/*.conf;

events {
    worker_connections 1024;
}

stream {
    log_format  main  '$remote_addr $upstream_addr - [$time_local] $status $upstream_bytes_sent';
    access_log  /var/log/nginx/k8s-access.log  main;

    upstream k8s-apiserver {
        server 192.168.80.45:6443;   # Master1 APISERVER IP:PORT
        server 192.168.80.49:6443;   # Master2 APISERVER IP:PORT
    }

    server {
        listen 16443;
        proxy_pass k8s-apiserver;
    }
}

http {
    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    access_log  /var/log/nginx/access.log  main;

    sendfile            on;
    tcp_nopush          on;
    tcp_nodelay         on;
    keepalive_timeout   65;
    types_hash_max_size 2048;

    include             /etc/nginx/mime.types;
    default_type        application/octet-stream;

    server {
        listen       80 default_server;
        server_name  _;

        location / {
        }
    }
}
EOF
# Primary load balancer (state MASTER)
cat > /etc/keepalived/keepalived.conf << EOF
global_defs {
   notification_email {
     acassen@firewall.loc
     failover@firewall.loc
     sysadmin@firewall.loc
   }
   notification_email_from Alexandre.Cassen@firewall.loc
   smtp_server 127.0.0.1
   smtp_connect_timeout 30
   router_id NGINX_MASTER
}

# Health check script
vrrp_script check_nginx {
    script "/etc/keepalived/check_nginx.sh"
}

vrrp_instance VI_1 {
    state MASTER
    interface ens33          # change to the actual NIC name
    virtual_router_id 51     # VRRP router ID; must be the same on the primary and backup of this instance
    priority 100             # priority; set 90 on the backup server
    advert_int 1             # VRRP advertisement (heartbeat) interval, 1 second by default
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    # Virtual IP
    virtual_ipaddress {
        192.168.80.100/24
    }
    track_script {
        check_nginx
    }
}
EOF
# Backup load balancer (state BACKUP)
cat > /etc/keepalived/keepalived.conf << EOF
global_defs {
   notification_email {
     acassen@firewall.loc
     failover@firewall.loc
     sysadmin@firewall.loc
   }
   notification_email_from Alexandre.Cassen@firewall.loc
   smtp_server 127.0.0.1
   smtp_connect_timeout 30
   router_id NGINX_BACKUP
}

# Health check script
vrrp_script check_nginx {
    script "/etc/keepalived/check_nginx.sh"
}

vrrp_instance VI_1 {
    state BACKUP
    interface ens33          # change to the actual NIC name
    virtual_router_id 51     # VRRP router ID; must be the same on the primary and backup of this instance
    priority 90              # priority; the backup server uses 90
    advert_int 1             # VRRP advertisement (heartbeat) interval, 1 second by default
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    # Virtual IP
    virtual_ipaddress {
        192.168.80.100/24
    }
    track_script {
        check_nginx
    }
}
EOF
cat > /etc/keepalived/check_nginx.sh << "EOF"
#!/bin/bash
count=$(ss -antp |grep 16443 |egrep -cv "grep|$$")

if [ "$count" -eq 0 ];then
    exit 1
else
    exit 0
fi
EOF
chmod +x /etc/keepalived/check_nginx.sh
systemctl daemon-reload
systemctl start nginx keepalived
systemctl enable nginx keepalived

# Commands for uninstalling nginx
sudo apt-get remove nginx nginx-common        # removes everything except the configuration files
sudo apt-get purge nginx nginx-common         # removes everything, including the configuration files
sudo apt-get autoremove                       # run after the commands above; removes nginx dependencies that are no longer used
sudo apt-get remove nginx-full nginx-common   # removes the two main packages
ip addr

curl -k https://192.168.80.100:16443/version
{
  "major": "1",
  "minor": "19",
  "gitVersion": "v1.19.11",
  "gitCommit": "c6a2f08fc4378c5381dd948d9ad9d1080e3e6b33",
  "gitTreeState": "clean",
  "buildDate": "2021-05-12T12:19:22Z",
  "goVersion": "go1.15.12",
  "compiler": "gc",
  "platform": "linux/amd64"
}
On k8s-master1, append the VIP address to the hosts list of the certificate:
mkdir -p /root/ssl && cd /root/ssl

# 1. Certificate signing request file
cat > apiserver-csr.json <<EOF
{
  "CN": "kubernetes",
  "hosts": [
    "127.0.0.1",
    "localhost",
    "192.168.80.1",
    "192.168.80.2",
    "192.168.80.3",
    "192.168.80.45",
    "192.168.80.46",
    "192.168.80.47",
    "192.168.80.48",
    "192.168.80.49",
    "192.168.80.100",
    "10.254.0.1",
    "kubernetes",
    "kubernetes.default",
    "kubernetes.default.svc",
    "kubernetes.default.svc.cluster",
    "kubernetes.default.svc.cluster.local"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "BeiJing",
      "L": "BeiJing",
      "O": "k8s",
      "OU": "System"
    }
  ]
}
EOF

# 2. Generate the certificate
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes apiserver-csr.json | cfssljson -bare apiserver

# 3. Distribute the new certificate
cp apiserver*.pem /etc/kubernetes/pki
scp apiserver*.pem root@192.168.80.46:/root
scp apiserver*.pem root@192.168.80.47:/root
scp apiserver*.pem root@192.168.80.49:/root

# 4. Install the certificate on the other nodes (run on each node)
chown root:root /root/apiserver*.pem
mv /root/apiserver*.pem /etc/kubernetes/pki

# 5. Restart the apiserver
systemctl restart kube-apiserver
systemctl status kube-apiserver
On the worker nodes:
sed -i 's#192.168.80.45:6443#192.168.80.100:16443#' /etc/kubernetes/*
sed -i 's#192.168.80.45:6443#192.168.80.100:16443#' /etc/kubernetes/pki/*

systemctl restart kubelet kube-proxy

grep '192.168.80.' /etc/kubernetes/*

kubectl get node
NAME          STATUS   ROLES    AGE     VERSION
k8s-master1   Ready    master   3d17h   v1.19.11
k8s-master2   Ready    master   2d16h   v1.19.11
k8s-node01    Ready    node     3d15h   v1.19.11
k8s-node02    Ready    node     3d15h   v1.19.11
# 1. On k8s-master2, stop the kubelet process
systemctl stop kubelet

# 2. Check whether k8s-master2 has gone offline
kubectl get nodes
NAME          STATUS     ROLES    AGE   VERSION
k8s-master1   Ready      master   40h   v1.19.11
k8s-master2   NotReady   master   12h   v1.19.11
k8s-node01    Ready      node     40h   v1.19.11
k8s-node02    Ready      node     40h   v1.19.11

# 3. Drain the node
kubectl drain k8s-master2
node/k8s-master2 cordoned
error: unable to drain node "k8s-master2", aborting command...

There are pending nodes to be drained:
 k8s-master2
error: cannot delete DaemonSet-managed Pods (use --ignore-daemonsets to ignore): kube-system/calico-node-lwj2r

# 4. Drain again, ignoring DaemonSet-managed pods
kubectl drain k8s-master2 --ignore-daemonsets
node/k8s-master2 already cordoned
WARNING: ignoring DaemonSet-managed Pods: kube-system/calico-node-lwj2r
node/k8s-master2 drained

# 5. Status after draining
kubectl get nodes
NAME          STATUS                     ROLES    AGE   VERSION
k8s-master1   Ready                      master   40h   v1.19.11
k8s-master2   Ready,SchedulingDisabled   master   12h   v1.19.11
k8s-node01    Ready                      node     39h   v1.19.11
k8s-node02    Ready                      node     39h   v1.19.11

# 6. Restore the node (if necessary)
kubectl uncordon k8s-master2
node/k8s-master2 uncordoned

kubectl get nodes
NAME          STATUS   ROLES    AGE   VERSION
k8s-master1   Ready    master   40h   v1.19.11
k8s-master2   Ready    master   12h   v1.19.11
k8s-node01    Ready    node     39h   v1.19.11
k8s-node02    Ready    node     39h   v1.19.11

# 7. Remove the node completely
kubectl delete node k8s-master2

kubectl get nodes
NAME          STATUS   ROLES    AGE   VERSION
k8s-master1   Ready    master   41h   v1.19.11
k8s-node01    Ready    node     40h   v1.19.11
k8s-node02    Ready    node     40h   v1.19.11
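The drain and delete steps can be wrapped into one small function so the order is always the same. This is a sketch only: the function name `remove_node` and the `KUBECTL` override variable are hypothetical conveniences, not part of kubectl.

```shell
#!/usr/bin/env bash
# remove_node (hypothetical helper): drain a node, ignoring DaemonSet-managed
# pods, and then delete it from the cluster. The delete is skipped if the
# drain fails. Set KUBECTL=echo to print the commands as a dry run.
remove_node() {
  local node=$1
  local kubectl=${KUBECTL:-kubectl}
  "$kubectl" drain "$node" --ignore-daemonsets || return 1
  "$kubectl" delete node "$node"
}
```

`KUBECTL=echo remove_node k8s-master2` prints the two commands without touching the cluster, and `remove_node k8s-master2` runs them. Stopping kubelet on the node itself, as in step 1, still has to be done on that host.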