Adding a new node (k8s-node) to a cluster that has been running for a while

2022-01-21 20:04:07 · Source: Internet

Tags: node, reset, kubernetes, gz, kubelet, k8s, root, nodes


Preface: adding a node to a freshly deployed k8s cluster only takes a `kubeadm join`. But when a cluster has been running for a while before a node needs to be added, the token and the CA certificate's SHA-256 hash were usually never written down, so they have to be looked up again.

1 View the nodes in the existing cluster

root@gz-gpu101:~# kubectl get node
NAME                 STATUS   ROLES    AGE   VERSION
gz-cpu031   Ready    node     24d   v1.14.1-2
gz-cpu032   Ready    node     24d   v1.14.1-2
gz-cpu033   Ready    node     24d   v1.14.1-2
gz-gpu101   Ready    master   24d   v1.14.1-2
root@gz-gpu101:~#

2 View the token

By default a token is valid for 24 hours; once it has expired it can no longer be used, and running `kubeadm token create` on the master node creates a fresh one.

As it turns out, the token in this cluster never expires, so the create step can be skipped here.

root@gz-gpu101:~# kubeadm token list
TOKEN                     TTL         EXPIRES   USAGES                   DESCRIPTION   EXTRA GROUPS
hwfep2.iw7q7ltqdwxbati5   <forever>   <never>   authentication,signing   <none>        system:bootstrappers:kubeadm:default-node-token
root@gz-gpu101:~#
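Had the token expired, a new one could be created on the master. kubeadm can also print the complete join command in one step (the `--print-join-command` flag is available in this kubeadm version), which saves computing the CA hash by hand:

```shell
# Create a new bootstrap token (24h TTL by default)
kubeadm token create

# Or let kubeadm print the full join command, token and CA hash included
kubeadm token create --print-join-command
```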

3 Get the SHA-256 hash of the CA certificate

root@gz-gpu101:~# openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | awk '{print $2}'
72acd5b1545b4488ab6385bc9511854557d3460cc013b6595b1e307d8c88f060
root@gz-gpu101:~#
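The pipeline above extracts the public key from the CA certificate, DER-encodes it, and takes the SHA-256 digest of the result. As a self-contained illustration, the same pipeline can be run against a throwaway self-signed certificate (the `/tmp` paths here are placeholders; on a real master you would point at `/etc/kubernetes/pki/ca.crt`):

```shell
# Generate a throwaway self-signed certificate just to demonstrate the pipeline
openssl req -x509 -newkey rsa:2048 -nodes -keyout /tmp/demo-ca.key \
    -out /tmp/demo-ca.crt -days 1 -subj "/CN=demo-ca" 2>/dev/null

# Extract the public key -> DER-encode -> SHA-256 hex digest (same as for ca.crt)
hash=$(openssl x509 -pubkey -in /tmp/demo-ca.crt \
    | openssl rsa -pubin -outform der 2>/dev/null \
    | openssl dgst -sha256 -hex | awk '{print $2}')

# A valid discovery hash is 64 lowercase hex characters
echo "$hash"
```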

4 Join the new node to the cluster

Every node that is to join the k8s cluster must first be initialized (firewall disabled, swap turned off, and so on), have docker, kubeadm and kubelet installed, and have the kubelet service started.

For the details, see: installing and deploying k8s with kubeadm.
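As a rough sketch of that preparation (Ubuntu-style commands shown as an example; package names and firewall tooling differ between distributions, and the kubeadm/kubelet versions should match the cluster):

```shell
# Turn off swap now and keep it off across reboots
swapoff -a
sed -i '/ swap / s/^/#/' /etc/fstab

# Stop the firewall (ufw on Ubuntu; use the firewalld equivalents on CentOS)
ufw disable

# Install the container runtime and kubernetes components, then start kubelet
apt-get install -y docker.io kubeadm kubelet
systemctl enable --now kubelet
```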

Run the kubeadm join command on the new node (qa-gpu082):

kubeadm join 172.18.12.23:6443 --token hwfep2.iw7q7ltqdwxbati5 --discovery-token-ca-cert-hash sha256:72acd5b1545b4488ab6385bc9511854557d3460cc013b6595b1e307d8c88f060

5 Check again

root@gz-gpu101:~# kubectl get node
NAME                 STATUS   ROLES    AGE     VERSION
gz-cpu031   Ready    node     24d     v1.14.1-2
gz-cpu032   Ready    node     24d     v1.14.1-2
gz-cpu033   Ready    node     24d     v1.14.1-2
gz-gpu101   Ready    master   24d     v1.14.1-2
qa-gpu082   Ready    <none>   2m18s   v1.14.1
root@gz-gpu101:~#

6 Miscellaneous

6.1 Fixing the ROLES column

The newly joined node shows `<none>` in its ROLES column, which looks untidy, so let's fix it. All of the following commands are run on the k8s master.

# List the node labels
root@gz-gpu101:~# kubectl get node --show-labels|grep role

# Add the label to the new node (base the key and value on the previous command's output)
root@gz-gpu101:~# kubectl label node qa-gpu082 kubernetes.io/role=node
node/qa-gpu082 labeled

# View the node list again
root@gz-gpu101:~# kubectl get node
NAME                 STATUS   ROLES    AGE   VERSION
gz-cpu031   Ready    node     24d   v1.14.1-2
gz-cpu032   Ready    node     24d   v1.14.1-2
gz-cpu033   Ready    node     24d   v1.14.1-2
gz-gpu101   Ready    master   24d   v1.14.1-2
qa-gpu082   Ready    node     14m   v1.14.1
root@gz-gpu101:~# 
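Note that `kubernetes.io/role` is the older label form; kubectl also derives the ROLES column from labels of the form `node-role.kubernetes.io/<role>`, so an equivalent alternative (the role name `node` here is just an example) would be:

```shell
# Alternative: the node-role.kubernetes.io/<role> label form
kubectl label node qa-gpu082 node-role.kubernetes.io/node=

# To remove a label later, append a minus sign to the key
kubectl label node qa-gpu082 node-role.kubernetes.io/node-
```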

6.2 Error handling

The error message:

root@qa-gpu082:~# kubeadm join 172.18.12.23:6443 --token hwfep2.iw7q7ltqdwxbati5 --discovery-token-ca-cert-hash sha256:72acd5b1545b4488ab6385bc9511854557d3460cc013b6595b1e307d8c88f060
[preflight] Running pre-flight checks
	[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
	[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 20.10.7. Latest validated version: 18.09
error execution phase preflight: [preflight] Some fatal errors occurred:
	[ERROR FileAvailable--etc-kubernetes-kubelet.conf]: /etc/kubernetes/kubelet.conf already exists
	[ERROR FileAvailable--etc-kubernetes-bootstrap-kubelet.conf]: /etc/kubernetes/bootstrap-kubelet.conf already exists
	[ERROR Port-10250]: Port 10250 is in use
	[ERROR FileAvailable--etc-kubernetes-pki-ca.crt]: /etc/kubernetes/pki/ca.crt already exists
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
root@qa-gpu082:~#

This machine had apparently been part of a cluster before, so it needs a `kubeadm reset` first.

The fix:

root@qa-gpu082:~# kubeadm reset
[reset] WARNING: Changes made to this host by 'kubeadm init' or 'kubeadm join' will be reverted.
[reset] Are you sure you want to proceed? [y/N]: y
[preflight] Running pre-flight checks
W0120 10:00:56.461284  684201 reset.go:234] [reset] No kubeadm config, using etcd pod spec to get data directory
[reset] No etcd config found. Assuming external etcd
[reset] Please manually reset etcd to prevent further issues
[reset] Stopping the kubelet service
[reset] unmounting mounted directories in "/var/lib/kubelet"
[reset] Deleting contents of stateful directories: [/var/lib/kubelet /etc/cni/net.d /var/lib/dockershim /var/run/kubernetes]
[reset] Deleting contents of config directories: [/etc/kubernetes/manifests /etc/kubernetes/pki]
[reset] Deleting files: [/etc/kubernetes/admin.conf /etc/kubernetes/kubelet.conf /etc/kubernetes/bootstrap-kubelet.conf /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf]

The reset process does not reset or clean up iptables rules or IPVS tables.
If you wish to reset iptables, you must do so manually.
For example:
iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X

If your cluster was setup to utilize IPVS, run ipvsadm --clear (or similar)
to reset your system's IPVS tables.

root@qa-gpu082:~#


# Run join again
root@qa-gpu082:~# kubeadm join 172.18.12.23:6443 --token hwfep2.iw7q7ltqdwxbati5 --discovery-token-ca-cert-hash sha256:72acd5b1545b4488ab6385bc9511854557d3460cc013b6595b1e307d8c88f060
[preflight] Running pre-flight checks
	[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
	[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 20.10.7. Latest validated version: 18.09
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
	[WARNING RequiredIPVSKernelModulesAvailable]:

The IPVS proxier may not be used because the following required kernel modules are not loaded: [ip_vs_rr ip_vs_wrr ip_vs_sh ip_vs]
or no builtin kernel IPVS support was found: map[ip_vs:{} ip_vs_rr:{} ip_vs_sh:{} ip_vs_wrr:{} nf_conntrack:{}].
However, these modules may be loaded automatically by kube-proxy if they are available on your system.
To verify IPVS support:

   Run "lsmod | grep 'ip_vs|nf_conntrack'" and verify each of the above modules are listed.

If they are not listed, you can use the following methods to load them:

1. For each missing module run 'modprobe $modulename' (e.g., 'modprobe ip_vs', 'modprobe ip_vs_rr', ...)
2. If 'modprobe $modulename' returns an error, you will need to install the missing module support for your kernel.

[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.14" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Activating the kubelet service
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

root@qa-gpu082:~#
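The RequiredIPVSKernelModulesAvailable warning in the transcript above is harmless when kube-proxy runs in the default iptables mode. If IPVS mode is wanted, the modules can be loaded immediately and made persistent via modules-load.d (a common convention on systemd hosts; this is a sketch, adjust to your init system):

```shell
# Load the IPVS-related modules immediately
for m in ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh nf_conntrack; do
    modprobe "$m"
done

# Persist across reboots (systemd reads /etc/modules-load.d/*.conf at boot)
cat > /etc/modules-load.d/ipvs.conf <<'EOF'
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack
EOF

# Verify that the modules are listed
lsmod | grep -E 'ip_vs|nf_conntrack'
```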

Source: https://blog.csdn.net/weixin_56752399/article/details/122628393
