Deploying a k8s Cluster on Alibaba Cloud

2021-10-14 11:31:09

Tags: kube, --, kubelet, Alibaba, cluster, io, docker, k8s


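First, my environment and configuration: an Alibaba Cloud instance with 1 core and 2 GB of memory running Ubuntu 18.04 (2 cores would be better, because kubeadm checks that the master has at least 2 CPUs). The node is also 1 core and 2 GB.

Now let's get started.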

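1. Update the package sources

If the mirror that ships with the system is hosted overseas, downloads will be slow. You can open /etc/apt/sources.list and replace it with a mirror in China, then refresh the package index.

apt update

2. Upgrade the packages

Upgrade the installed system packages to the latest stable versions.

apt upgrade

Sample output when a package is fetched against a stale index:
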
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following additional packages will be installed:
  libcurl4
The following packages will be upgraded:
  curl libcurl4
2 upgraded, 0 newly installed, 0 to remove and 46 not upgraded.
Need to get 378 kB of archives.
After this operation, 0 B of additional disk space will be used.
Do you want to continue? [Y/n] y
Ign:1 http://mirrors.cloud.aliyuncs.com/ubuntu bionic-updates/main amd64 curl amd64 7.58.0-2ubuntu3.14
Ign:2 http://mirrors.cloud.aliyuncs.com/ubuntu bionic-updates/main amd64 libcurl4 amd64 7.58.0-2ubuntu3.14
Err:1 http://mirrors.cloud.aliyuncs.com/ubuntu bionic-updates/main amd64 curl amd64 7.58.0-2ubuntu3.14
  404  Not Found [IP: 100.100.2.148 80]
Err:2 http://mirrors.cloud.aliyuncs.com/ubuntu bionic-updates/main amd64 libcurl4 amd64 7.58.0-2ubuntu3.14
  404  Not Found [IP: 100.100.2.148 80]
E: Failed to fetch http://mirrors.cloud.aliyuncs.com/ubuntu/pool/main/c/curl/curl_7.58.0-2ubuntu3.14_amd64.deb  404  Not Found [IP: 100.100.2.148 80]
E: Failed to fetch http://mirrors.cloud.aliyuncs.com/ubuntu/pool/main/c/curl/libcurl4_7.58.0-2ubuntu3.14_amd64.deb  404  Not Found [IP: 100.100.2.148 80]
E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?

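You will run into this problem if you skip apt update, so remember to run it. The last line of the output above already gives the hint: run apt-get update or try with --fix-missing.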
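
As a rough sketch of the mirror replacement mentioned in step 1, the function below rewrites the host in each deb line. It assumes the stock entries point at archive.ubuntu.com or security.ubuntu.com and that mirrors.aliyun.com is the mirror you want; both are assumptions, so check your own sources.list first. The function name is made up, and it prints the result instead of editing the file in place.

```shell
# Hypothetical helper: print a sources.list with the Ubuntu hosts swapped
# for a mirror. Assumes the stock hosts are archive.ubuntu.com and
# security.ubuntu.com; adjust both patterns to match your file.
switch_mirror() {
    # $1: path to a sources.list file, $2: mirror host name
    sed -e "s#//archive\.ubuntu\.com#//$2#g" \
        -e "s#//security\.ubuntu\.com#//$2#g" "$1"
}

# Demo on a scratch file, so the real /etc/apt/sources.list is untouched.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
deb http://archive.ubuntu.com/ubuntu bionic main restricted
deb http://security.ubuntu.com/ubuntu bionic-security main restricted
EOF
switch_mirror "$tmp" mirrors.aliyun.com
rm -f "$tmp"
```

Once the printed result looks right, it can be redirected to a new file and copied over /etc/apt/sources.list, followed by apt update.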

3. Install Docker

You can also follow a different installation guide.

apt-get install docker.io

To have Docker start at boot, run the following commands

systemctl enable docker

systemctl start docker

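To configure a Docker registry mirror, open /etc/docker/daemon.json and add or modify registry-mirrors so that it includes https://registry.docker-cn.com. You can also use the mirror addresses provided by Alibaba Cloud, Tencent Cloud, and others.

Example

{
	"registry-mirrors": [
		"https://registry.docker-cn.com"
	]
}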

Restart Docker so that the configuration takes effect

sudo systemctl daemon-reload

sudo systemctl restart docker

4. Install K8S

Run the following commands to install the https tools and k8s.

apt-get update && apt-get install -y apt-transport-https curl
apt-get install -y kubelet kubeadm kubectl --allow-unauthenticated
    
# Common commands
# Restart the kubelet service:
systemctl daemon-reload
systemctl restart kubelet
# Stop, enable at boot, and start the kubelet service:
sudo systemctl stop kubelet
sudo systemctl enable kubelet
sudo systemctl start kubelet

Run the following command to check that the installation works

kubeadm version

# Sample output
kubeadm version: &version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.2", GitCommit:"8b5a19147530eaac9476b0ab82980b4088bbc1b2", GitTreeState:"clean", BuildDate:"2021-09-15T21:37:34Z", GoVersion:"go1.16.8", Compiler:"gc", Platform:"linux/amd64"}

Disable swap

# Turn swap off temporarily
swapoff -a
# Disable swap permanently
swapoff -a && sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
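
To preview what the sed command above would do before touching /etc/fstab, a small sketch like this can help. The helper name and the sample fstab are made up for illustration. Unlike the one-liner, it skips lines that are already commented, so running it twice does not stack # characters.

```shell
# Hypothetical helper: print an fstab with active swap entries commented out.
# Lines that already start with '#' are left alone.
comment_swap() {
    # $1: path to an fstab-style file
    sed -E '/^[[:space:]]*#/!s/^(.*[[:space:]]swap[[:space:]].*)$/#\1/' "$1"
}

# Demo on a scratch file rather than the real /etc/fstab.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
UUID=aaaa / ext4 defaults 0 1
/swapfile none swap sw 0 0
#/dev/sdb1 none swap sw 0 0
EOF
comment_swap "$tmp"
rm -f "$tmp"
```

Only the /swapfile line gains a leading #; the other two lines are printed unchanged.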

Pass bridged IPv4 and IPv6 traffic to iptables:

cat >/etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sysctl --system

While running apt-get update and installing curl in this step, you may hit the following problem

The following signatures couldn't be verified because the public key is not available: NO_PUBKEY FEEA9169307EA071 NO_PUBKEY 8B57C5C2836F4BEB
Reading package lists... Done
W: GPG error: https://mirrors.aliyun.com/kubernetes/apt kubernetes-xenial InRelease: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY FEEA9169307EA071 NO_PUBKEY 8B57C5C2836F4BEB
E: The repository 'https://mirrors.aliyun.com/kubernetes/apt kubernetes-xenial InRelease' is not signed.
N: Updating from such a repository can't be done securely, and is therefore disabled by default.
N: See apt-secure(8) manpage for repository creation and user configuration details.

Just run the following (the key is the value after NO_PUBKEY; replace it with your own)

 apt-key adv --keyserver keyserver.ubuntu.com --recv-keys  FEEA9169307EA071
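
The warning above lists two key IDs, and the command imports only one of them. As a sketch, the IDs can be pulled out of the apt output and turned into one command per key. The helper name is made up, and the loop only prints the commands so nothing is imported until you have looked at them.

```shell
# Hypothetical helper: list the distinct key IDs that follow NO_PUBKEY
# in apt output read from stdin.
missing_keys() {
    grep -o 'NO_PUBKEY [0-9A-F]*' | awk '{print $2}' | sort -u
}

# Demo with the warning text shown above; in practice you would pipe in
# the output of `apt-get update 2>&1`.
msg="The following signatures couldn't be verified because the public key is not available: NO_PUBKEY FEEA9169307EA071 NO_PUBKEY 8B57C5C2836F4BEB"
for key in $(printf '%s\n' "$msg" | missing_keys); do
    # Only print the command; drop the echo to actually import the key.
    echo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys "$key"
done
```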

If the following appears during installation, the system's package sources do not contain the k8s packages.

No apt package "kubeadm", but there is a snap with that name.
Try "snap install kubeadm"


No apt package "kubectl", but there is a snap with that name.
Try "snap install kubectl"


No apt package "kubelet", but there is a snap with that name.
Try "snap install kubelet"

Open /etc/apt/sources.list and add this line

deb https://mirrors.aliyun.com/kubernetes/apt kubernetes-xenial main

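Run the k8s installation command again.

If you see

The following signatures couldn't be verified because the public key is not available

run the following command to add the key.

curl https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | sudo apt-key add -

The commands above install kubelet, kubeadm, and kubectl. kubelet is the k8s service that runs on every machine, kubectl is the k8s management client, and kubeadm is the deployment tool.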

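If this machine is only going to be a node, you can stop here.

To join another Alibaba Cloud instance to the cluster, just run the following (where this command comes from is explained further down; read to the end and then come back to this step)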

kubeadm join 39.96.46.96:6443 --token 9vbzuf.vtzj1w5vefjlwi0t         --discovery-token-ca-cert-hash sha256:b6e6fffb6b0e11d2db374ce21f6d86de3e09e1e13075e1bf01055130c2c5e060
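
The token and hash in that command belong to the cluster in this post. As a hedged sketch, the helper below (a made-up name, not part of kubeadm) checks that a token matches kubeadm's [a-z0-9]{6}.[a-z0-9]{16} format and that the hash is sha256: followed by 64 hex digits, then prints the join command. It only checks the text and never contacts the cluster. Bootstrap tokens expire (after 24 hours by default), so if yours is gone, running kubeadm token create --print-join-command on the master prints a fresh command.

```shell
# Hypothetical helper: validate the pieces of a join command and print it.
# It does not contact the cluster; it only checks the text format.
build_join_cmd() {
    # $1: endpoint (host:port), $2: bootstrap token, $3: CA cert hash
    printf '%s\n' "$2" | grep -Eq '^[a-z0-9]{6}\.[a-z0-9]{16}$' || {
        echo "bad token format: $2" >&2; return 1; }
    printf '%s\n' "$3" | grep -Eq '^sha256:[0-9a-f]{64}$' || {
        echo "bad hash format: $3" >&2; return 1; }
    echo "kubeadm join $1 --token $2 --discovery-token-ca-cert-hash $3"
}

# Demo with the values from this post.
build_join_cmd 39.96.46.96:6443 9vbzuf.vtzj1w5vefjlwi0t \
    sha256:b6e6fffb6b0e11d2db374ce21f6d86de3e09e1e13075e1bf01055130c2c5e060
```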

Errors you may run into

[kubelet-check] Initial timeout of 40s passed.
error execution phase kubelet-start: error uploading crisocket: timed out waiting for the condition
To see the stack trace of this error execute with --v=5 or higher

# Fix:
swapoff -a
kubeadm reset
systemctl daemon-reload
systemctl restart kubelet
iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X  
# Run the join command again; the node joins successfully
This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.


root@ubuntu:~# kubectl get nodes
NAME      STATUS   ROLES                  AGE   VERSION
master    Ready    control-plane,master   26m   v1.22.2
node      Ready    <none>                 15s   v1.22.2

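5. Initialization

Run the command below to initialize; it automatically downloads the required Docker images from the network.

This command deploys the master node.

Run kubeadm version to check the version; the value in GitVersion:"v1.22.2" is the version number.

Run the following command to initialize (remember to replace the IP with your own)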

kubeadm init --pod-network-cidr=10.244.0.0/16 --ignore-preflight-errors=NumCPU --apiserver-advertise-address=39.96.46.96

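--ignore-preflight-errors=NumCPU is needed when the machine has only one CPU, for example a 1G1M student server.

However, the images are hosted by Google (k8s.gcr.io), so the download may fail.

You can list the images that need to be pulled with kubeadm config images list and pull them manually with Docker. This is tedious, and the image names also have to be changed by hand.

To pull an image: docker pull {image name}

Google is not reachable, but the required images are mirrored on DockerHub.

The mirrorgooglecontainers repository mirrors the corresponding images. Unfortunately, the mirrored images are not always the latest. The google_containers repository on Alibaba Cloud should be the most up to date.

For example, if the following images are needed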

k8s.gcr.io/kube-apiserver:v1.22.2
k8s.gcr.io/kube-controller-manager:v1.22.2
k8s.gcr.io/kube-scheduler:v1.22.2
k8s.gcr.io/kube-proxy:v1.22.2
k8s.gcr.io/pause:3.5
k8s.gcr.io/etcd:3.5.0-0
k8s.gcr.io/coredns:1.8.4

then pull the corresponding images

docker pull mirrorgooglecontainers/kube-apiserver:v1.22.2
docker pull mirrorgooglecontainers/kube-controller-manager:v1.22.2
docker pull mirrorgooglecontainers/kube-scheduler:v1.22.2
docker pull mirrorgooglecontainers/kube-proxy:v1.22.2
docker pull mirrorgooglecontainers/pause:3.5
docker pull mirrorgooglecontainers/etcd:3.5.0-0
docker pull coredns/coredns:1.8.4

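Use docker tag {old name:version} {new name:version} to rename the images.

Given the various situations and problems that can come up, here is a one-step script written by someone else that handles this whole step.

touch pullk8s.sh	# create the script file
nano pullk8s.sh		# edit the script

Then paste in the following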

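#!/bin/sh
# Pull each image kubeadm needs from the Alibaba Cloud mirror,
# retag it under its k8s.gcr.io name, then drop the mirror tag.
for i in $(kubeadm config images list); do
    imageName=${i#k8s.gcr.io/}    # strip the registry prefix
    docker pull "registry.aliyuncs.com/google_containers/$imageName"
    docker tag "registry.aliyuncs.com/google_containers/$imageName" "k8s.gcr.io/$imageName"
    docker rmi "registry.aliyuncs.com/google_containers/$imageName"
done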

Save the file

Ctrl + O
Enter
Ctrl + X

Make the script executable

chmod +x pullk8s.sh

Run the script

sh pullk8s.sh

Then run docker images to check that all the required images are ready.

root@ubuntu:~# docker images
REPOSITORY                           TAG                 IMAGE ID            CREATED             SIZE
k8s.gcr.io/kube-proxy                v1.22.2             cba2a99699bd        2 weeks ago         116MB
k8s.gcr.io/kube-apiserver            v1.22.2             41ef50a5f06a        2 weeks ago         171MB
k8s.gcr.io/kube-controller-manager   v1.22.2             da5fd66c4068        2 weeks ago         161MB
k8s.gcr.io/kube-scheduler            v1.22.2             f52d4c527ef2        2 weeks ago         94.4MB
k8s.gcr.io/coredns                   1.8.4               70f311871ae1        3 months ago        41.6MB
k8s.gcr.io/etcd                      3.5.0-0             303ce5db0e90        3 months ago        288MB
k8s.gcr.io/pause                     3.5                 da86e6ba6ca1        2 years ago         742kB

The script may also fail on an image; if it does, pull that image manually

Error response from daemon: pull access denied for registry.aliyuncs.com/google_containers/coredns/coredns, repository does not exist or may require 'docker login': denied: requested access to the resource is denied
Error response from daemon: No such image: registry.aliyuncs.com/google_containers/coredns/coredns:v1.8.4
Error: No such image: registry.aliyuncs.com/google_containers/coredns/coredns:v1.8.4
docker pull coredns/coredns:1.8.4

# Format of the image rename command:
docker  tag  old-image-name  new-image-name
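
The error above comes from the nested name k8s.gcr.io/coredns/coredns: stripping only the k8s.gcr.io/ prefix leaves coredns/coredns, which the mirror does not have. A possible workaround, sketched below, is to keep only the last path component when building the mirror name. That the Alibaba Cloud repository stores coredns under a flat name with this exact tag is an assumption here; verify it with docker pull before relying on it. The helper name is also made up.

```shell
# Hypothetical helper: map an upstream image name to its name on the mirror.
# Assumes the mirror uses flat names (no nested coredns/coredns path).
MIRROR=registry.aliyuncs.com/google_containers

mirror_name() {
    # $1: upstream image, e.g. k8s.gcr.io/coredns/coredns:v1.8.4
    echo "$MIRROR/${1##*/}"
}

# Print the pull/tag pairs instead of running them, so this is safe to try.
for image in k8s.gcr.io/kube-proxy:v1.22.2 k8s.gcr.io/coredns/coredns:v1.8.4; do
    echo "docker pull $(mirror_name "$image")"
    echo "docker tag $(mirror_name "$image") $image"
done
```

The tag target keeps the full upstream name, which is the name the preflight check asks for.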

Finally, run the initialization command from the start of this step.

kubeadm init --pod-network-cidr=10.244.0.0/16 --ignore-preflight-errors=NumCPU --apiserver-advertise-address=39.96.46.96

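Because the public IP is not configured on the Alibaba Cloud ECS instance itself, etcd cannot bind to it and fails to start, so kubeadm fails during initialization with a "timeout" error.

Fix:

1. Open two SSH sessions (for example, two tabs in your SSH client): one to run the initialization, the other to edit a configuration file while the initialization is running. It has to be while it is running: every kubeadm init run regenerates the etcd configuration file, so a change made beforehand is overwritten and has no effect.

2. Run the "kubeadm init ..." command above. It will hang at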

Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed

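3. Once the command is running, kubeadm has started initializing the master node, but etcd cannot start because its configuration file is wrong, so that file has to be edited.
File path: "/etc/kubernetes/manifests/etcd.yaml".

# Before: the two lines to change
--listen-client-urls=https://127.0.0.1:2379,https://39.96.46.96:2379
--listen-peer-urls=https://39.96.46.96:2380
# After
--listen-client-urls=https://127.0.0.1:2379
--listen-peer-urls=https://127.0.0.1:2380

4. Here 39.96.46.96 is the public IP. The two flags to look at are "--listen-client-urls" and "--listen-peer-urls": remove the public IP from "--listen-client-urls" and change "--listen-peer-urls" to the local address.

After a short wait, the master node initialization completes.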
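
Editing the file by hand in the second SSH session is a race against kubeadm. The two changes can be sketched as a sed filter. The function name is made up, and the sample lines only mimic the two flags discussed above. To apply it for real, you would redirect the output to a temporary file and move it over /etc/kubernetes/manifests/etcd.yaml while kubeadm init is waiting.

```shell
# Hypothetical helper: print an etcd manifest with both listen URLs
# pointed at 127.0.0.1. Other flags are left untouched.
fix_etcd_listen_urls() {
    # $1: path to etcd.yaml
    sed -e 's#\(--listen-client-urls=\).*#\1https://127.0.0.1:2379#' \
        -e 's#\(--listen-peer-urls=\).*#\1https://127.0.0.1:2380#' "$1"
}

# Demo on a scratch file containing only a few sample flags.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
    - --listen-client-urls=https://127.0.0.1:2379,https://39.96.46.96:2379
    - --listen-peer-urls=https://39.96.46.96:2380
    - --name=ubuntu
EOF
fix_etcd_listen_urls "$tmp"
rm -f "$tmp"
```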

Problems you may run into

[kubelet] Creating a ConfigMap "kubelet-config-1.22" in namespace kube-system with the configuration for the kubelets in the cluster
error execution phase upload-config/kubelet: Error writing Crisocket information for the control-plane node: timed out waiting for the condition
To see the stack trace of this error execute with --v=5 or higher

# Run this command
swapoff -a && kubeadm reset  && systemctl daemon-reload && systemctl restart kubelet  && iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X
# Then run the initialization again
kubeadm init --pod-network-cidr=10.244.0.0/16 --ignore-preflight-errors=NumCPU --apiserver-advertise-address=39.96.46.96

Problems you may run into

error execution phase preflight: [preflight] Some fatal errors occurred:
        [ERROR ImagePull]: failed to pull image k8s.gcr.io/coredns/coredns:v1.8.4: output: Error response from daemon: Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
, error: exit status 1

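Open https://www.ipaddress.com/, search for k8s.gcr.io to get its IP address (142.250.113.82 at the time), and add it to the hosts file. On Linux, open the file with vim /etc/hosts and add the hostname and IP in the form below; if you are not root, remember to use sudo.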

142.250.113.82  k8s.gcr.io
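
If you would rather script the hosts change than edit the file in vim, something like the following sketch works. The helper name is made up, and the IP is simply the one looked up above, which may change over time. The demo writes to a scratch file; pointing it at /etc/hosts requires root.

```shell
# Hypothetical helper: append "ip  name" to a hosts file unless the name
# is already mapped there.
add_host_entry() {
    # $1: hosts file, $2: IP address, $3: host name
    if awk -v n="$3" '{ for (i = 2; i <= NF; i++) if ($i == n) f = 1 }
                      END { exit !f }' "$1"; then
        echo "$3 already present in $1"
    else
        printf '%s  %s\n' "$2" "$3" >> "$1"
    fi
}

tmp=$(mktemp)
add_host_entry "$tmp" 142.250.113.82 k8s.gcr.io
add_host_entry "$tmp" 142.250.113.82 k8s.gcr.io   # second call adds nothing
cat "$tmp"
rm -f "$tmp"
```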

If there is still a problem

[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.

This happens because Docker and Kubernetes (the kubelet) use different cgroup drivers.

Fix
Change Docker's configuration file

cat > /etc/docker/daemon.json <<EOF
{"exec-opts": ["native.cgroupdriver=systemd"]}
EOF

Restart Docker

systemctl restart docker
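
Note that the cat command above replaces the whole file, so a registry-mirrors entry added in step 3 would be lost. A sketch that writes both settings together follows. The helper name is made up, the output path is a parameter so you can inspect the result before copying it to /etc/docker/daemon.json, and the mirror URL is just the one used earlier in this post.

```shell
# Hypothetical helper: write a daemon.json holding both the registry
# mirror and the systemd cgroup driver setting.
write_daemon_json() {
    # $1: output path, $2: registry mirror URL
    cat > "$1" <<EOF
{
  "registry-mirrors": ["$2"],
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
}

# Demo: write to a scratch file and show the result.
tmp=$(mktemp)
write_daemon_json "$tmp" https://registry.docker-cn.com
cat "$tmp"
rm -f "$tmp"
```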

More problems will show up after that, but they are simple: fix whatever error is reported (the ones I ran into are listed here)

[ERROR FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml]: /etc/kubernetes/manifests/kube-apiserver.yaml already exists

# Fix (a few similar errors are omitted here; the steps are the same)
cd /etc/kubernetes/manifests/
rm kube-apiserver.yaml

[ERROR Port-10250]: Port 10250 is in use

# Fix (a few similar errors are omitted here; the steps are the same)
lsof -i:10250
COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
kubelet 22055 root   27u  IPv6 773301      0t0  TCP *:10250 (LISTEN)
kill -9 22055

[ERROR DirAvailable--var-lib-etcd]: /var/lib/etcd is not empty

# Fix
cd /var/lib/etcd/
rm -r member/

Run the initialization command again and it will succeed

# Output on success
Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 39.96.46.96:6443 --token 9vbzuf.vtzj1w5vefjlwi0t \
        --discovery-token-ca-cert-hash sha256:b6e6fffb6b0e11d2db374ce21f6d86de3e09e1e13075e1bf01055130c2c5e060

Run the following on the master node

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
# Check the master
kubectl get nodes
root@ubuntu:~# kubectl get nodes
NAME         STATUS      ROLES                  AGE   VERSION
k8s-master   NotReady    control-plane,master   26h   v1.22.2
node         Ready       <none>                 15s   v1.22.2

# Add the network plugin
sudo kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
# Output
podsecuritypolicy.policy/psp.flannel.unprivileged created
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
serviceaccount/flannel created
configmap/kube-flannel-cfg created
daemonset.apps/kube-flannel-ds created

kubectl get pods --all-namespaces
# If the output looks like this, with some Pods in Pending state
NAMESPACE     NAME                             READY   STATUS     RESTARTS      AGE
kube-system   coredns-78fcd69978-fkkmh         0/1     Pending    0             17m
kube-system   coredns-78fcd69978-qrx2c         0/1     Pending    0             17m
kube-system   etcd-ubuntu                      1/1     Running    0             17m
kube-system   kube-apiserver-ubuntu            1/1     Running    1 (19m ago)   17m
kube-system   kube-controller-manager-ubuntu   1/1     Running    2 (20m ago)   17m
kube-system   kube-flannel-ds-g97gm            0/1     Init:0/1   0             80s
kube-system   kube-proxy-f6ctf                 1/1     Running    0             17m
kube-system   kube-scheduler-ubuntu            1/1     Running    2 (19m ago)   17m

# Just add  185.199.111.133 raw.githubusercontent.com  to the hosts file and apply the plugin again
NAMESPACE     NAME                             READY   STATUS    RESTARTS      AGE
kube-system   coredns-78fcd69978-fkkmh         1/1     Running   0             28m
kube-system   coredns-78fcd69978-qrx2c         1/1     Running   0             28m
kube-system   etcd-ubuntu                      1/1     Running   0             28m
kube-system   kube-apiserver-ubuntu            1/1     Running   1 (30m ago)   28m
kube-system   kube-controller-manager-ubuntu   1/1     Running   2 (31m ago)   28m
kube-system   kube-flannel-ds-g97gm            1/1     Running   0             11m
kube-system   kube-proxy-f6ctf                 1/1     Running   0             28m
kube-system   kube-scheduler-ubuntu            1/1     Running   2 (30m ago)   28m

kubectl get nodes
NAME         STATUS   ROLES                  AGE   VERSION
k8s-master   Ready    control-plane,master   26h   v1.22.2
node         Ready    <none>                 15s   v1.22.2
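
To check the node list from a script instead of by eye, a small filter over the kubectl get nodes output is enough. This is only a sketch with a made-up function name. It treats any STATUS other than exactly "Ready" as not ready, so a cordoned node (Ready,SchedulingDisabled) would be reported too.

```shell
# Hypothetical helper: read `kubectl get nodes` output on stdin and print
# the names of nodes whose STATUS column is not exactly "Ready".
not_ready_nodes() {
    awk 'NR > 1 && $2 != "Ready" { print $1 }'
}

# Demo with the listing captured before flannel was applied.
not_ready_nodes <<'EOF'
NAME         STATUS      ROLES                  AGE   VERSION
k8s-master   NotReady    control-plane,master   26h   v1.22.2
node         Ready       <none>                 15s   v1.22.2
EOF
```

On a live cluster the same filter would be used as kubectl get nodes | not_ready_nodes; empty output means every node reports Ready.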

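That completes the basic k8s cluster installation. I have no need for the dashboard at the moment, so I have not installed it; I will come back and update this post when I do, haha.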

Source: https://www.cnblogs.com/godsyu/p/15405885.html
