Preface:
In this article we will back up etcd, the core datastore component of a Kubernetes cluster, and then restore that same backup on a Kubernetes cluster with one control-plane node and one worker node. The steps of the experiment and the verification of the result follow below.
Step 1: Install the etcd Client
Install the etcd CLI client, which is used to manage the etcd cluster. Here we install it on an Ubuntu system.
apt install etcd-client
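To confirm the client is available, you can print its version (a quick sanity check; the version reported depends on your Ubuntu release):

ETCDCTL_API=3 etcdctl version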
Step 2: Create an Nginx Deployment
We will create an nginx deployment with multiple replicas; these replicas will later be used to verify that the etcd data was restored.
kubectl create deployment nginx --image=nginx --replicas=5
Verify that the newly deployed Pods are in the Running state.
controlplane $ kubectl get pods
NAME                     READY   STATUS    RESTARTS   AGE
nginx-77b4fdf86c-6m8gl   1/1     Running   0          50s
nginx-77b4fdf86c-bfcsr   1/1     Running   0          50s
nginx-77b4fdf86c-bqmqk   1/1     Running   0          50s
nginx-77b4fdf86c-nkh7j   1/1     Running   0          50s
nginx-77b4fdf86c-x946x   1/1     Running   0          50s
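If you would rather block until every replica is up instead of re-running get pods, kubectl can wait on the rollout for you:

kubectl rollout status deployment/nginx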
Step 3: Back Up the etcd Cluster
Create a directory for the etcd backup:

mkdir etcd-backup

Then run the following command to take the etcd backup.
ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  snapshot save ./etcd-backup/etcdbackup.db
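If you take such snapshots regularly, a small wrapper script keeps them timestamped. This is a minimal sketch, assuming the same kubeadm defaults as above (local etcd on https://127.0.0.1:2379 and certificates under /etc/kubernetes/pki/etcd/):

#!/usr/bin/env bash
# etcd-snapshot.sh - take a timestamped etcd snapshot (sketch, kubeadm defaults)
set -euo pipefail

BACKUP_DIR=${1:-./etcd-backup}   # optional first argument overrides the target dir
mkdir -p "$BACKUP_DIR"

ETCDCTL_API=3 etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  snapshot save "$BACKUP_DIR/etcdbackup-$(date +%Y%m%d-%H%M%S).db"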
Note that you do not need to memorize the certificate paths used in the command above; you can read them from the etcd pod running in the kube-system namespace. List the pods by running the following command.
controlplane $ kubectl get pods -n kube-system
NAME                                       READY   STATUS    RESTARTS      AGE
calico-kube-controllers-784cc4bcb7-xk6q7   1/1     Running   4             38d
canal-9nszc                                2/2     Running   0             42m
canal-brzd7                                2/2     Running   0             42m
coredns-5d769bfcf4-5mwkn                   1/1     Running   0             38d
coredns-5d769bfcf4-w4xs7                   1/1     Running   0             38d
etcd-controlplane                          1/1     Running   0             38d
kube-apiserver-controlplane                1/1     Running   2             38d
kube-controller-manager-controlplane       1/1     Running   3 (41m ago)   38d
kube-proxy-5b8sx                           1/1     Running   0             38d
kube-proxy-5qlc5                           1/1     Running   0             38d
kube-scheduler-controlplane                1/1     Running   3 (41m ago)   38d
Now run a get pods -o yaml command to see the etcd pod's container command.
kubectl get pods etcd-controlplane -o yaml -n kube-system
Its output contains the container command, from which you can read all of the certificate paths.
containers:
- command:
  - etcd
  - --advertise-client-urls=
  - --cert-file=/etc/kubernetes/pki/etcd/server.crt
  - --client-cert-auth=true
  - --data-dir=/var/lib/etcd
  - --experimental-initial-corrupt-check=true
  - --experimental-watch-progress-notify-interval=5s
  - --initial-advertise-peer-urls=
  - --initial-cluster=controlplane=
  - --key-file=/etc/kubernetes/pki/etcd/server.key
  - --listen-client-urls=
  - --listen-metrics-urls=
  - --listen-peer-urls=
  - --name=controlplane
  - --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt
  - --peer-client-cert-auth=true
  - --peer-key-file=/etc/kubernetes/pki/etcd/peer.key
  - --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
  - --snapshot-count=10000
  - --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
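If you only want these flags and not the whole manifest, a jsonpath query pulls just the container command (a one-liner sketch, assuming the pod name etcd-controlplane as above):

kubectl get pod etcd-controlplane -n kube-system \
  -o jsonpath='{.spec.containers[0].command}' | tr ',' '\n'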
Step 4: Verify the Backup Data
Run the following command to get the key count and other details of the new backup:

ETCDCTL_API=3 etcdctl --write-out=table snapshot status ./etcd-backup/etcdbackup.db
controlplane $ ETCDCTL_API=3 etcdctl --write-out=table snapshot status ./etcd-backup/etcdbackup.db
+---------+----------+------------+------------+
|  HASH   | REVISION | TOTAL KEYS | TOTAL SIZE |
+---------+----------+------------+------------+
| cb4c04c |     4567 |       1346 |     6.0 MB |
+---------+----------+------------+------------+
Step 5: Restore the Backup to the Cluster
Here we will delete the nginx deployment created earlier and then restore the backup, which should bring the nginx deployment back.
A. Delete the nginx deployment
controlplane $ kubectl delete deploy nginx
deployment.apps "nginx" deleted
B. Restore the data from the backup
ETCDCTL_API=3 etcdctl snapshot restore etcd-backup/etcdbackup.db
This creates a folder named default.etcd. While restoring the backup you may run into an error like the following:
controlplane $ ETCDCTL_API=3 etcdctl snapshot restore etcd-backup/etcdbackup.db
Error: expected sha256 [253 81 3 207 182 43 249 52 218 166 71 135 221 106 6 216 216 21 183 250 36 126 187 251 171 98 91 69 113 40 229 2], got [63 25 34 167 139 91 18 135 249 179 157 115 214 138 237 35 161 237 175 12 61 31 141 130 204 146 143 177 132 241 193 15]
To get around this, add the --skip-hash-check=true flag to the restore command above; the default.etcd folder should then appear in the current path.
controlplane $ ETCDCTL_API=3 etcdctl snapshot restore etcd-backup/etcdbackup.db --skip-hash-check=true
2023-06-28 15:35:36.180956 I | etcdserver/membership: added member 8e9e05c52164694d [] to cluster cdf818194e3a8c32
controlplane $ ls
default.etcd  etcd-backup  filesystem
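As a side note, etcdctl snapshot restore also accepts a --data-dir flag, so you can restore straight into a directory of your choice rather than default.etcd. A sketch, assuming a fresh target path (the restore fails if the directory already exists):

ETCDCTL_API=3 etcdctl snapshot restore etcd-backup/etcdbackup.db \
  --skip-hash-check=true \
  --data-dir=/var/lib/etcd-restored   # target directory must not exist yet

You would then either point etcd's --data-dir at that path or move its contents into /var/lib/etcd, as we do below.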
C. Now we need to stop all of the running Kubernetes control-plane components so that the etcd data can be replaced. The manifest files of these components live in the /etc/kubernetes/manifests/ folder; we will temporarily move them out of that path, and the kubelet will automatically tear the pods down.
controlplane $ ls /etc/kubernetes/manifests/
etcd.yaml  kube-apiserver.yaml  kube-controller-manager.yaml  kube-scheduler.yaml
controlplane $ kubectl get pods -n kube-system
NAME                                       READY   STATUS    RESTARTS   AGE
calico-kube-controllers-784cc4bcb7-xk6q7   1/1     Running   4          38d
canal-5lxjg                                2/2     Running   0          28m
canal-zv77t                                2/2     Running   0          28m
coredns-5d769bfcf4-5mwkn                   1/1     Running   0          38d
coredns-5d769bfcf4-w4xs7                   1/1     Running   0          38d
etcd-controlplane                          1/1     Running   0          38d
kube-apiserver-controlplane                1/1     Running   2          38d
kube-controller-manager-controlplane       1/1     Running   2          38d
kube-proxy-5b8sx                           1/1     Running   0          38d
kube-proxy-5qlc5                           1/1     Running   0          38d
kube-scheduler-controlplane                1/1     Running   2          38d
controlplane $ mkdir temp_yaml_files
controlplane $ mv /etc/kubernetes/manifests/* temp_yaml_files/
controlplane $ kubectl get pods -n kube-system
The connection to the server 172.30.1.2:6443 was refused - did you specify the right host or port?
You can see above that as soon as the files are moved out of the manifests path, the api-server pod is terminated and you can no longer reach the cluster. You can also check whether the containers of these components have been killed or are still running. Before the files are moved, the containers are running.
controlplane $ crictl ps
CONTAINER       IMAGE           CREATED          STATE     NAME                      ATTEMPT   POD ID          POD
6a2bce359c15b   6f6e73fa8162b   3 seconds ago    Running   kube-apiserver            0         fe1be6aa651dd   kube-apiserver-controlplane
a26534b2e6244   c6b5118178229   4 seconds ago    Running   kube-controller-manager   0         38fb48a4ebb62   kube-controller-manager-controlplane
58ac164968ec3   86b6af7dd652c   4 seconds ago    Running   etcd                      0         170af0e603a02   etcd-controlplane
e98ef4185206b   6468fa8f98696   4 seconds ago    Running   kube-scheduler            0         0bd26fd661a2c   kube-scheduler-controlplane
7a03436be6ce6   f9c3c1813269c   23 seconds ago   Running   calico-kube-controllers   7         6da32eed5e939   calico-kube-controllers-784cc4bcb7-xk6q7
1edf2a857f1d4   e6ea68648f0cd   31 minutes ago   Running   kube-flannel              0         3dac4c0c5960d   canal-5lxjg
e249d3e4b2b51   75392e3500e36   31 minutes ago   Running   calico-node               0         3dac4c0c5960d   canal-5lxjg
039999604ba8c   ead0a4a53df89   5 weeks ago      Running   coredns                   0         f8b31a08b4907   coredns-5d769bfcf4-5mwkn
26d7a0bc1b1b9   1780fa6665ff0   5 weeks ago      Running   local-path-provisioner    0         1913e8d9cb757   local-path-provisioner-bf548cc96-fchvw
c86359e6bf649   fbe39e5d66b6a   5 weeks ago      Running
Once the files are moved, these containers are terminated.
controlplane $ mv /etc/kubernetes/manifests/* temp_yaml_files/
controlplane $ crictl ps
CONTAINER       IMAGE           CREATED          STATE     NAME                      ATTEMPT   POD ID          POD
7a03436be6ce6   f9c3c1813269c   2 minutes ago    Running   calico-kube-controllers   7         6da32eed5e939   calico-kube-controllers-784cc4bcb7-xk6q7
1edf2a857f1d4   e6ea68648f0cd   34 minutes ago   Running   kube-flannel              0         3dac4c0c5960d   canal-5lxjg
e249d3e4b2b51   75392e3500e36   34 minutes ago   Running   calico-node               0         3dac4c0c5960d   canal-5lxjg
039999604ba8c   ead0a4a53df89   5 weeks ago      Running   coredns                   0         f8b31a08b4907   coredns-5d769bfcf4-5mwkn
26d7a0bc1b1b9   1780fa6665ff0   5 weeks ago      Running   local-path-provisioner    0         1913e8d9cb757   local-path-provisioner-bf548cc96-fchvw
c86359e6bf649   fbe39e5d66b6a   5 weeks ago      Running   kube-proxy                0         d69f1cd083173   kube-proxy-5b8sx
D. Now that the api-server, controller-manager and kube-scheduler are terminated, we will move the data from the default.etcd folder into etcd's data-dir. We know this path from Step 3, where the command running inside the etcd pod had the data-dir set via --data-dir=/var/lib/etcd.
controlplane $ cd default.etcd/
controlplane $ ls
member
controlplane $ ls /var/lib/etcd
member
We will first rename the existing member folder in the default /var/lib/etcd/ directory to /var/lib/etcd/member.bak as a safety copy, and then move the member folder from the restored backup into /var/lib/etcd/.
controlplane $ cd default.etcd/
controlplane $ ls
member
controlplane $ mv /var/lib/etcd/member/ /var/lib/etcd/member.bak
controlplane $ mv member/ /var/lib/etcd/
controlplane $ ls /var/lib/etcd
member  member.bak
E. Now that our data is restored, we will stop the kubelet service and move the yaml files back into the manifests folder.
controlplane $ systemctl stop kubelet
controlplane $ systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
     Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
    Drop-In: /etc/systemd/system/kubelet.service.d
             └─10-kubeadm.conf
     Active: inactive (dead) since Wed 2023-06-28 16:03:32 UTC; 6s ago
       Docs:
    Process: 25011 ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS (code=exited, stat>
   Main PID: 25011 (code=exited, status=0/SUCCESS)

Jun 28 16:03:30 controlplane kubelet[25011]: E0628 16:03:30.524978 25011 controller.go:146] "Failed to ensure lease exists, will retry" err="Get \"htt>
Jun 28 16:03:31 controlplane kubelet[25011]: I0628 16:03:31.195933 25011 status_manager.go:809] "Failed to get status for pod" podUID=4ad6dc12-6828-45>
Jun 28 16:03:31 controlplane kubelet[25011]: E0628 16:03:31.196843 25011 mirror_client.go:138] "Failed deleting a mirror pod" err="Delete \";
Jun 28 16:03:31 controlplane kubelet[25011]: E0628 16:03:31.197110 25011 mirror_client.go:138] "Failed deleting a mirror pod" err="Delete \";
Jun 28 16:03:31 controlplane kubelet[25011]: E0628 16:03:31.197392 25011 mirror_client.go:138] "Failed deleting a mirror pod" err="Delete \";
Jun 28 16:03:31 controlplane kubelet[25011]: E0628 16:03:31.197721 25011 mirror_client.go:138] "Failed deleting a mirror pod" err="Delete \";
Jun 28 16:03:32 controlplane systemd[1]: Stopping kubelet: The Kubernetes Node Agent...
Jun 28 16:03:32 controlplane kubelet[25011]: I0628 16:03:32.098579 25011 dynamic_cafile_content.go:171] "Shutting down controller" name="client-ca-bun>
Jun 28 16:03:32 controlplane systemd[1]: kubelet.service: Succeeded.
Jun 28 16:03:32 controlplane systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
controlplane $ mv temp_yaml_files/* /etc/kubernetes/manifests/
controlplane $ ls /etc/kubernetes/manifests/
etcd.yaml  kube-apiserver.yaml  kube-controller-manager.yaml  kube-scheduler.yaml
Once the files are back in place, we start the kubelet service so that it picks them up and redeploys the components.
controlplane $ systemctl start kubelet
controlplane $ systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
     Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
    Drop-In: /etc/systemd/system/kubelet.service.d
             └─10-kubeadm.conf
     Active: active (running) since Wed 2023-06-28 16:05:56 UTC; 3s ago
       Docs:
   Main PID: 60741 (kubelet)
      Tasks: 9 (limit: 2339)
     Memory: 70.5M
     CGroup: /system.slice/kubelet.service
             └─60741 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/>

Jun 28 16:05:57 controlplane kubelet[60741]: W0628 16:05:57.729886 60741 reflector.go:533] vendor/k8s.io/client-go/informers/factory.go:150: failed to>
Jun 28 16:05:57 controlplane kubelet[60741]: E0628 16:05:57.729952 60741 reflector.go:148] vendor/k8s.io/client-go/informers/factory.go:150: Failed to>
Jun 28 16:05:57 controlplane kubelet[60741]: W0628 16:05:57.831598 60741 reflector.go:533] vendor/k8s.io/client-go/informers/factory.go:150: failed to>
Jun 28 16:05:57 controlplane kubelet[60741]: E0628 16:05:57.832204 60741 reflector.go:148] vendor/k8s.io/client-go/informers/factory.go:150: Failed to>
Jun 28 16:05:58 controlplane kubelet[60741]: W0628 16:05:58.130322 60741 reflector.go:533] vendor/k8s.io/client-go/informers/factory.go:150: failed to>
Jun 28 16:05:58 controlplane kubelet[60741]: E0628 16:05:58.130397 60741 reflector.go:148] vendor/k8s.io/client-go/informers/factory.go:150: Failed to>
Jun 28 16:05:58 controlplane kubelet[60741]: E0628 16:05:58.274435 60741 controller.go:146] "Failed to ensure lease exists, will retry" err="Get \"htt>
Jun 28 16:05:58 controlplane kubelet[60741]: I0628 16:05:58.360755 60741 kubelet_node_status.go:70] "Attempting to register node" node="controlplane"
Jun 28 16:05:58 controlplane kubelet[60741]: E0628 16:05:58.361160 60741 kubelet_node_status.go:92] "Unable to register node with API server" err="Pos>
Jun 28 16:05:59 controlplane kubelet[60741]: I0628 16:05:59.962674 60741 kubelet_node_status.go:70] "Attempting to register node" node="controlplane"
You can now see that the containers are running again; note that it may take a few minutes before kubectl commands start working.
crictl ps
CONTAINER       IMAGE           CREATED          STATE     NAME                      ATTEMPT   POD ID          POD
688cfa2890b4f   f9c3c1813269c   23 seconds ago   Running   calico-kube-controllers   12        6da32eed5e939   calico-kube-controllers-784cc4bcb7-xk6q7
db1797e3e2e83   6468fa8f98696   28 seconds ago   Running   kube-scheduler            0         307a1600b4346   kube-scheduler-controlplane
1dc176c2a599e   c6b5118178229   28 seconds ago   Running   kube-controller-manager   0         f9efc6c4c8d91   kube-controller-manager-controlplane
f70e2103ec1e0   6f6e73fa8162b   29 seconds ago   Running   kube-apiserver            0         32f49c141ea69   kube-apiserver-controlplane
2e274f5176656   86b6af7dd652c   29 seconds ago   Running   etcd                      0         9c561113f9fcd   etcd-controlplane
1edf2a857f1d4   e6ea68648f0cd   47 minutes ago   Running   kube-flannel              0         3dac4c0c5960d   canal-5lxjg
e249d3e4b2b51   75392e3500e36   47 minutes ago   Running   calico-node               0         3dac4c0c5960d   canal-5lxjg
039999604ba8c   ead0a4a53df89   5 weeks ago      Running   coredns                   0         f8b31a08b4907   coredns-5d769bfcf4-5mwkn
26d7a0bc1b1b9   1780fa6665ff0   5 weeks ago      Running   local-path-provisioner    0         1913e8d9cb757   local-path-provisioner-bf548cc96-fchvw
c86359e6bf649   fbe39e5d66b6a   5 weeks ago      Running
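Since the api-server can take a few minutes to come back, a simple retry loop is a convenient way to wait it out (a sketch; it just polls until kubectl succeeds):

until kubectl get nodes >/dev/null 2>&1; do sleep 5; done
kubectl get nodes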
You can now verify that our nginx deployment, which we deleted after taking the backup, has been restored by running the get pods command.
controlplane $ kubectl get pods
NAME                     READY   STATUS    RESTARTS   AGE
nginx-77b4fdf86c-8n7kg   1/1     Running   0          40m
nginx-77b4fdf86c-gmbjm   1/1     Running   0          40m
nginx-77b4fdf86c-pjpnr   1/1     Running   0          40m
nginx-77b4fdf86c-qjxmd   1/1     Running   0          40m
nginx-77b4fdf86c-zhvnv   1/1     Running   0          40m
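You can also confirm the replica count on the Deployment object itself:

kubectl get deployment nginx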
Congratulations!!! You have now successfully restored the etcd data.