Quick tip: diagnosing and fixing an etcd container that keeps restarting during k8s cluster deployment

Symptoms

While installing Kubernetes 1.26, after initializing the cluster with kubeadm, running kubectl commands failed with the following error:

The connection to the server localhost:8080 was refused - did you specify the right host or port?

Checking whether the kubelet was healthy showed that it could not connect to the apiserver on port 6443.

Dec 21 09:36:03 k8s-master kubelet[7127]: E1221 09:36:03.015089    7127 kubelet_node_status.go:540] "Error updating node status, will retry" err="error getting node \"k8s-master\": Get \"https://192.168.2.200:6443/api/v1/nodes/k8s-master?timeout=10s\": dial tcp 192.168.2.200:6443: connect: connection refused"
Dec 21 09:36:03 k8s-master kubelet[7127]: E1221 09:36:03.015445    7127 kubelet_node_status.go:540] "Error updating node status, will retry" err="error getting node \"k8s-master\": Get \"https://192.168.2.200:6443/api/v1/nodes/k8s-master?timeout=10s\": dial tcp 192.168.2.200:6443: connect: connection refused"
Dec 21 09:36:03 k8s-master kubelet[7127]: E1221 09:36:03.015654    7127 kubelet_node_status.go:540] "Error updating node status, will retry" err="error getting node \"k8s-master\": Get \"https://192.168.2.200:6443/api/v1/nodes/k8s-master?timeout=10s\": dial tcp 192.168.2.200:6443: connect: connection refused"
Dec 21 09:36:03 k8s-master kubelet[7127]: E1221 09:36:03.015818    7127 kubelet_node_status.go:540] "Error updating node status, will retry" err="error getting node \"k8s-master\": Get \"https://192.168.2.200:6443/api/v1/nodes/k8s-master?timeout=10s\": dial tcp 192.168.2.200:6443: connect: connection refused"
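
The kubelet check can be sketched roughly as follows. This is a minimal sketch, assuming the kubelet runs as a systemd unit named kubelet (the usual kubeadm setup); count_refused is a hypothetical helper name, not part of any tool.

```shell
# Count kubelet log lines (read from stdin) that report a refused
# connection to the given host:port endpoint.
# count_refused is a hypothetical helper introduced for illustration.
count_refused() {
  awk -v pat="dial tcp $1: connect: connection refused" \
    'index($0, pat) { n++ } END { print n + 0 }'
}

# Usage on the affected node (assumes a systemd-managed kubelet):
#   systemctl status kubelet --no-pager
#   journalctl -u kubelet --since "10 minutes ago" --no-pager | count_refused 192.168.2.200:6443
```

A steadily growing count suggests the apiserver is down or unreachable rather than briefly unavailable.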

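Next, check the state of the apiserver container. Because this cluster uses containerd as its container runtime and kubectl is unavailable at this point, the crictl ps -a command can be used to list all containers, including exited ones.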

root@k8s-master:~/k8s/calico# crictl ps -a
CONTAINER           IMAGE               CREATED             STATE               NAME                      ATTEMPT             POD ID              POD
395b45b1cb733       a31e1d84401e6       50 seconds ago      Exited              kube-apiserver            28                  e87800ae06ff5       kube-apiserver-k8s-master
b5c7e2a07bf1b       5d7c5dfd3ba18       3 minutes ago       Running             kube-controller-manager   32                  6b7cc9dd07f1d       kube-controller-manager-k8s-master
944aa31862613       556768f31eb1d       4 minutes ago       Exited              kube-proxy                27                  ccb6557c6f629       kube-proxy-ctjjq
c097332b6f416       fce326961ae2d       4 minutes ago       Exited              etcd                      30                  079d491eb9925       etcd-k8s-master
b8103090322c4       dafd8ad70b156       6 minutes ago       Exited              kube-scheduler            32                  48f9544c9798c       kube-scheduler-k8s-master
a14b969e8ad05       5d7c5dfd3ba18       12 minutes ago      Exited              kube-controller-manager   31                  5576806b4e142       kube-controller-manager-k8s-master
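
With this many rows, it can help to filter the listing down to the exited containers and their restart counts. The sketch below assumes the column layout shown in the output above (STATE, NAME, ATTEMPT, POD ID, POD as the last five fields); other crictl versions may print additional columns. exited_containers is a hypothetical helper name.

```shell
# Print "NAME ATTEMPT" for every container in the Exited state,
# highest attempt count first. Reads the `crictl ps -a` table on stdin.
# Fields are counted from the right because the CREATED column
# ("50 seconds ago", "About a minute ago") has a variable word count.
exited_containers() {
  awk 'NR > 1 && $(NF-4) == "Exited" { print $(NF-3), $(NF-2) }' |
    sort -k2,2nr
}

# Usage on the node:
#   crictl ps -a | exited_containers
```

High attempt counts across several control-plane containers at once, as seen here, hint at a shared cause rather than a fault in a single component.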

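The kube-apiserver container has exited, so check its logs for anything abnormal. The logs show that kube-apiserver cannot connect to etcd on port 2379, which points to etcd as the source of the problem.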

W1221 07:00:20.392868       1 logging.go:59] [core] [Channel #1 SubChannel #2] grpc: addrConn.createTransport failed to connect to {
"Addr": "127.0.0.1:2379",
"ServerName": "127.0.0.1",
"Attributes": null,
"BalancerAttributes": null,
"Type": 0,
"Metadata": null
}. Err: connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:2379: connect: connection refused"
W1221 07:00:21.391330       1 logging.go:59] [core] [Channel #4 SubChannel #6] grpc: addrConn.createTransport failed to connect to {
"Addr": "127.0.0.1:2379",
"ServerName": "127.0.0.1",
"Attributes": null,
"BalancerAttributes": null,
"Type": 0,
"Metadata": null
}. Err: connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:2379: connect: connection refused"
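
One way to pull logs like these without kubectl is to look up the container ID by name and pass it to crictl logs. This is a sketch, not the author's exact commands: latest_logs is a hypothetical helper, and it assumes crictl lists the most recently created container first, as in the listing earlier in this article.

```shell
# Print the last N log lines (default 50) of the newest container
# whose name matches $1. latest_logs is a hypothetical helper name.
latest_logs() {
  # -a includes exited containers; -q prints container IDs only.
  cid=$(crictl ps -a --name "$1" -q | head -n 1)
  if [ -z "$cid" ]; then
    echo "no container found matching: $1" >&2
    return 1
  fi
  crictl logs --tail "${2:-50}" "$cid"
}

# Usage on the node:
#   latest_logs kube-apiserver 20
#   latest_logs etcd
```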

The etcd container is also restarting continuously, yet its logs contain no error-level messages.

{"level":"info","ts":"2022-12-21T10:29:00.740Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"d975d9ebc69964b3 is starting a new election at term 2"}
{"level":"info","ts":"2022-12-21T10:29:00.740Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"d975d9ebc69964b3 became pre-candidate at term 2"}
{"level":"info","ts":"2022-12-21T10:29:00.740Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"d975d9ebc69964b3 received MsgPreVoteResp from d975d9ebc69964b3 at term 2"}
{"level":"info","ts":"2022-12-21T10:29:00.740Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"d975d9ebc69964b3 became candidate at term 3"}
{"level":"info","ts":"2022-12-21T10:29:00.740Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"d975d9ebc69964b3 received MsgVoteResp from d975d9ebc69964b3 at term 3"}
{"level":"info","ts":"2022-12-21T10:29:00.740Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"d975d9ebc69964b3 became leader at term 3"}
{"level":"info","ts":"2022-12-21T10:29:00.740Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"raft.node: d975d9ebc69964b3 elected leader d975d9ebc69964b3 at term 3"}
{"level":"info","ts":"2022-12-21T10:29:00.742Z","caller":"etcdserver/server.go:2054","msg":"published local member to cluster through raft","local-member-id":"d975d9ebc69964b3","local-member-attributes":"{Name:k8s-master ClientURLs:[https://192.168.2.200:2379]}","request-path":"/0/members/d975d9ebc69964b3/attributes","cluster-id":"f88ac1c8c4bab6","publish-timeout":"7s"}
{"level":"info","ts":"2022-12-21T10:29:00.742Z","caller":"embed/serve.go:100","msg":"ready to serve client requests"}
{"level":"info","ts":"2022-12-21T10:29:00.742Z","caller":"embed/serve.go:100","msg":"ready to serve client requests"}
{"level":"info","ts":"2022-12-21T10:29:00.743Z","caller":"etcdmain/main.go:44","msg":"notifying init daemon"}
{"level":"info","ts":"2022-12-21T10:29:00.743Z","caller":"etcdmain/main.go:50","msg":"successfully notified init daemon"}
{"level":"info","ts":"2022-12-21T10:29:00.744Z","caller":"embed/serve.go:198","msg":"serving client traffic securely","address":"192.168.2.200:2379"}
{"level":"info","ts":"2022-12-21T10:29:00.745Z","caller":"embed/serve.go:198","msg":"serving client traffic securely","address":"127.0.0.1:2379"}
{"level":"info","ts":"2022-12-21T10:30:20.624Z","caller":"osutil/interrupt_unix.go:64","msg":"received signal; shutting down","signal":"terminated"}
{"level":"info","ts":"2022-12-21T10:30:20.624Z","caller":"embed/etcd.go:373","msg":"closing etcd server","name":"k8s-master","data-dir":"/var/lib/etcd","advertise-peer-urls":["https://192.168.2.200:2380"],"advertise-client-urls":["https://192.168.2.200:2379"]}
{"level":"info","ts":"2022-12-21T10:30:20.636Z","caller":"etcdserver/server.go:1465","msg":"skipped leadership transfer for single voting member cluster","local-member-id":"d975d9ebc69964b3","current-leader-member-id":"d975d9ebc69964b3"}
{"level":"info","ts":"2022-12-21T10:30:20.637Z","caller":"embed/etcd.go:568","msg":"stopping serving peer traffic","address":"192.168.2.200:2380"}
{"level":"info","ts":"2022-12-21T10:30:20.639Z","caller":"embed/etcd.go:573","msg":"stopped serving peer traffic","address":"192.168.2.200:2380"}
{"level":"info","ts":"2022-12-21T10:30:20.639Z","caller":"embed/etcd.go:375","msg":"closed etcd server","name":"k8s-master","data-dir":"/var/lib/etcd","advertise-peer-urls":["https://192.168.2.200:2380"],"advertise-client-urls":["https://192.168.2.200:2379"]}

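However, one line in the log shows that etcd received a shutdown signal, so it was stopped from outside rather than exiting abnormally.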

{"level":"info","ts":"2022-12-21T10:30:20.624Z","caller":"osutil/interrupt_unix.go:64","msg":"received signal; shutting down","signal":"terminated"}
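
Because etcd writes one JSON object per line, the shutdown notice can be picked out of a long log with standard text tools. The sketch below assumes the field names and message text shown in the line above; shutdown_signals is a hypothetical helper name.

```shell
# Print "timestamp signal" for each shutdown notice found in etcd
# JSON logs on stdin. shutdown_signals is a hypothetical helper name.
shutdown_signals() {
  grep -F '"msg":"received signal; shutting down"' |
    sed -E 's/.*"ts":"([^"]*)".*"signal":"([^"]*)".*/\1 \2/'
}

# Usage on the node (the container ID is whatever crictl ps -a reports for etcd):
#   crictl logs <etcd-container-id> 2>&1 | shutdown_signals
```

A signal value of terminated corresponds to SIGTERM, which is consistent with etcd being stopped by something outside the process rather than crashing on its own.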

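Solution

The problem was caused by an incorrect cgroup setting. kubeadm configures the kubelet to use the systemd cgroup driver by default, so containerd's runc runtime has to use it as well. In the containerd configuration file /etc/containerd/config.toml, change SystemdCgroup to true.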

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
BinaryName = ""
CriuImagePath = ""
CriuPath = ""
CriuWorkPath = ""
IoGid = 0
IoUid = 0
NoNewKeyring = false
NoPivotRoot = false
Root = ""
ShimCgroup = ""
SystemdCgroup = true
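
If editing the file by hand is inconvenient, the change can be scripted. The following is a minimal sketch, assuming the file already contains a line of the form SystemdCgroup = false; enable_systemd_cgroup is a hypothetical helper name, and the .bak backup suffix is an arbitrary choice.

```shell
# Flip "SystemdCgroup = false" to "SystemdCgroup = true" in a containerd
# config file (default: the path used in this article), keeping a backup.
# enable_systemd_cgroup is a hypothetical helper introduced for illustration.
enable_systemd_cgroup() {
  cfg="${1:-/etc/containerd/config.toml}"
  # Keep the original next to the file, then rewrite from the backup.
  cp "$cfg" "$cfg.bak" || return 1
  sed 's/^\([[:space:]]*SystemdCgroup[[:space:]]*=[[:space:]]*\)false/\1true/' \
    "$cfg.bak" > "$cfg"
  # Show the resulting setting so the change can be checked by eye.
  grep -n 'SystemdCgroup' "$cfg"
}

# Usage on the node (as root):
#   enable_systemd_cgroup
```

If /etc/containerd/config.toml does not contain the setting at all, a full default file can be generated first with containerd config default and then edited.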

Restart the containerd service:

systemctl restart containerd

The etcd container no longer restarts, the other containers return to normal, and the problem is resolved.
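
To confirm that both sides now agree, the kubelet's cgroup driver can be compared with containerd's setting. This is a sketch under assumptions not stated in the original article: it assumes the kubelet configuration lives at /var/lib/kubelet/config.yaml (where kubeadm writes it) with a cgroupDriver key, and that SystemdCgroup is written with spaces around the equals sign. cgroup_drivers_match is a hypothetical helper name.

```shell
# Report whether the kubelet and containerd agree on the cgroup driver.
# $1: kubelet config file, $2: containerd config file (defaults assumed).
# cgroup_drivers_match is a hypothetical helper introduced for illustration.
cgroup_drivers_match() {
  kubelet_cfg="${1:-/var/lib/kubelet/config.yaml}"
  containerd_cfg="${2:-/etc/containerd/config.toml}"
  # First cgroupDriver value in the kubelet YAML, e.g. "systemd".
  driver=$(awk '$1 == "cgroupDriver:" { print $2; exit }' "$kubelet_cfg")
  # First SystemdCgroup value in the containerd TOML, e.g. "true".
  systemd_cgroup=$(awk '$1 == "SystemdCgroup" { print $3; exit }' "$containerd_cfg")
  echo "kubelet cgroupDriver=${driver:-unset} containerd SystemdCgroup=${systemd_cgroup:-unset}"
  if [ "$driver" = "systemd" ] && [ "$systemd_cgroup" = "true" ]; then
    return 0
  fi
  if [ "$driver" = "cgroupfs" ] && [ "$systemd_cgroup" != "true" ]; then
    return 0
  fi
  return 1
}

# Usage on the node:
#   cgroup_drivers_match && echo "cgroup drivers match" || echo "cgroup drivers differ"
```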


Original article: https://blog.csdn.net/ldjjbzh626/article/details/128400797
