Problem Background
To verify whether a particular bug has been fixed in the latest Kubernetes release, I needed to stand up a k8s environment quickly. This article uses the kubeasz tool from reference [1] and records the deployment process and the problems hit along the way.
Deployment Process
First, download the tool script, the kubeasz code, the binaries, and the default container images.
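For reference, this download step roughly follows the kubeasz README; the pinned release number below is an assumption and should match the version you intend to install:

# hedged sketch of the download step, per the kubeasz README
# (the release number is an assumption):
export release=3.5.0
wget https://github.com/easzlab/kubeasz/releases/download/${release}/ezdown
chmod +x ./ezdown
./ezdown -D    # fetch the kubeasz code, binaries, and default images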
Then start the installation with the following command:
[root@node01 k8s]# ./ezdown -S
2023-03-22 13:39:40 INFO Action begin: start_kubeasz_docker
2023-03-22 13:39:41 INFO try to run kubeasz in a container
2023-03-22 13:39:41 DEBUG get host IP: 10.10.11.49
2023-03-22 13:39:41 DEBUG generate ssh key pair
# 10.10.11.49 SSH-2.0-OpenSSH_6.6.1
f1b442b7fdaf757c7787536b17d12d76208a2dd7884d56fbd1d35817dc2e94ca
2023-03-22 13:39:41 INFO Action successed: start_kubeasz_docker
[root@node01 k8s]# docker ps
CONTAINER ID   IMAGE                   COMMAND         CREATED          STATUS          PORTS   NAMES
f1b442b7fdaf   easzlab/kubeasz:3.5.0   "sleep 36000"   15 seconds ago   Up 14 seconds           kubeasz
From this output it is hard to tell whether the step succeeded or failed. Following the documentation, enter the container and run the setup manually:
[root@node01 ~]# docker exec -it kubeasz ezctl start-aio
2023-03-22 06:15:05 INFO get local host ipadd: 10.10.11.49
2023-03-22 06:15:05 DEBUG generate custom cluster files in /etc/kubeasz/clusters/default
2023-03-22 06:15:05 DEBUG set versions
2023-03-22 06:15:05 DEBUG disable registry mirrors
2023-03-22 06:15:05 DEBUG cluster default: files successfully created.
2023-03-22 06:15:05 INFO next steps 1: to config '/etc/kubeasz/clusters/default/hosts'
2023-03-22 06:15:05 INFO next steps 2: to config '/etc/kubeasz/clusters/default/config.yml'
ansible-playbook -i clusters/default/hosts -e @clusters/default/config.yml playbooks/90.setup.yml
2023-03-22 06:15:05 INFO cluster:default setup step:all begins in 5s, press any key to abort:

PLAY [kube_master,kube_node,etcd,ex_lb,chrony] **********************

TASK [Gathering Facts] **********************
fatal: [10.10.11.49]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: root@10.10.11.49: Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).", "unreachable": true}

PLAY RECAP **********************
10.10.11.49 : ok=0 changed=0 unreachable=1 failed=0 skipped=0 rescued=0 ignored=0
The log reports a permission problem, yet a hands-on test shows that SSH key-based login can be set up normally:
bash-5.1# ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
/root/.ssh/id_rsa already exists.
Overwrite (y/n)?
bash-5.1# ssh-copy-id root@10.10.11.49
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
expr: warning: '^ERROR: ': using '^' as the first character
of a basic regular expression is not portable; it is ignored
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@10.10.11.49's password:
Number of key(s) added: 1
Now try logging into the machine, with: "ssh 'root@10.10.11.49'"
and check to make sure that only the key(s) you wanted were added.
bash-5.1# ssh root@10.10.11.49
root@10.10.11.49's password:
The relevant configuration files also have the expected permissions:
[root@node01 kubeasz]# ll ~/.ssh
total 16
-rw------- 1 root root 1752 Mar 22 14:25 authorized_keys
-rw------- 1 root root 2602 Mar 22 14:25 id_rsa
-rw-r--r-- 1 root root  567 Mar 22 14:25 id_rsa.pub
-rw-r--r-- 1 root root 1295 Mar 22 13:39 known_hosts
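To dig deeper, verbose output from the failing connection usually reveals which authentication method is being rejected. A hedged debugging sketch (it assumes the kubeasz image ships an ssh client, which the key generation above suggests):

# re-run the failing connection with maximum verbosity from inside the container:
docker exec -it kubeasz ssh -vvv root@10.10.11.49 true

# or let ansible itself print the full ssh invocation and error:
docker exec -it kubeasz ansible -i /etc/kubeasz/clusters/default/hosts all -m ping -vvv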
Since it was still unclear exactly where the problem lay, I followed reference [2] and switched to username/password authentication instead.
Configure the user and password inside the container; the connectivity check passes:
bash-5.1# vi /etc/ansible/hosts
[webservers]
10.10.11.49

[webservers:vars]
ansible_ssh_pass='******'
ansible_ssh_user='root'

bash-5.1# ansible webservers -m ping
10.10.11.49 | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python"
    },
    "changed": false,
    "ping": "pong"
}
Modify the clusters/default/hosts file that the cluster installation depends on, adding the same user/password settings:
[etcd]
10.10.11.49

[etcd:vars]
ansible_ssh_pass='******'
ansible_ssh_user='root'

# master node(s)
[kube_master]
10.10.11.49

[kube_master:vars]
ansible_ssh_pass='******'
ansible_ssh_user='root'

# work node(s)
[kube_node]
10.10.11.49

[kube_node:vars]
ansible_ssh_pass='******'
ansible_ssh_user='root'
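Since the same two variables are repeated for every group, standard Ansible inventory precedence would allow declaring them once; a hedged, less repetitive equivalent (assuming the kubeasz playbooks don't set conflicting per-group values):

# declare the credentials once for all hosts instead of per group:
[all:vars]
ansible_ssh_user='root'
ansible_ssh_pass='******'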
Running the setup now fails, complaining that the sshpass tool is missing:
[root@node01 kubeasz]# docker exec -it kubeasz ezctl setup default all
ansible-playbook -i clusters/default/hosts -e @clusters/default/config.yml playbooks/90.setup.yml
2023-03-22 07:35:46 INFO cluster:default setup step:all begins in 5s, press any key to abort:

PLAY [kube_master,kube_node,etcd,ex_lb,chrony] **********************

TASK [Gathering Facts] **********************
fatal: [10.10.11.4]: FAILED! => {"msg": "to use the 'ssh' connection type with passwords, you must install the sshpass program"}

PLAY RECAP **********************
10.10.11.49 : ok=0 changed=0 unreachable=0 failed=1 skipped=0 rescued=0 ignored=0
Install the sshpass package:
bash-5.1# apk add sshpass
fetch
Installing sshpass (1.09-r0)
Executing busybox-1.35.0-r17.trigger
OK: 21 MiB in 47 packages
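Note that this installs the package only into the running kubeasz container, so it disappears if the container is ever recreated by ezdown. The same thing can be done non-interactively from the host (assuming the image is Alpine-based, as the apk output suggests):

# reinstall sshpass into the kubeasz container from the host:
docker exec kubeasz apk add --no-cache sshpass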
Run the setup again:
[root@node01 kubeasz]# docker exec -it kubeasz ezctl setup default all
ansible-playbook -i clusters/default/hosts -e @clusters/default/config.yml playbooks/90.setup.yml
2023-03-22 07:36:37 INFO cluster:default setup step:all begins in 5s, press any key to abort:
...
TASK [kube-node : 轮询等待kube-proxy启动] **********************
changed: [10.10.11.49]
FAILED - RETRYING: 轮询等待kubelet启动 (4 retries left).
FAILED - RETRYING: 轮询等待kubelet启动 (3 retries left).
FAILED - RETRYING: 轮询等待kubelet启动 (2 retries left).
FAILED - RETRYING: 轮询等待kubelet启动 (1 retries left).
TASK [kube-node : 轮询等待kubelet启动] **********************
fatal: [10.10.11.49]: FAILED! => {"attempts": 4, "changed": true, "cmd": "systemctl is-active kubelet.service", "delta": "0:00:00.014621", "end": "2023-03-22 15:42:07.230186", "msg": "non-zero return code", "rc": 3, "start": "2023-03-22 15:42:07.215565", "stderr": "", "stderr_lines": [], "stdout": "activating", "stdout_lines": ["activating"]}

PLAY RECAP **********************
10.10.11.49 : ok=85 changed=78 unreachable=0 failed=1 skipped=123 rescued=0 ignored=0
localhost : ok=33 changed=30 unreachable=0 failed=0 skipped=11 rescued=0 ignored=0
The kubelet step fails; check the kubelet service:
[root@node01 log]# service kubelet status -l
Redirecting to /bin/systemctl status -l kubelet.service
● kubelet.service - Kubernetes Kubelet
   Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: disabled)
   Active: activating (auto-restart) (Result: exit-code) since Wed 2023-03-22 15:56:31 CST; 1s ago
     Docs:
  Process: 147581 ExecStart=/opt/kube/bin/kubelet --config=/var/lib/kubelet/config.yaml --container-runtime-endpoint=unix:///run/containerd/containerd.sock --hostname-override=10.10.11.49 --kubeconfig=/etc/kubernetes/kubelet.kubeconfig --root-dir=/var/lib/kubelet --v=2 (code=exited, status=1/FAILURE)
 Main PID: 147581 (code=exited, status=1/FAILURE)

Mar 22 15:56:31 node01 kubelet[147581]: I0322 15:56:31.719832  147581 manager.go:228] Version: {KernelVersion:3.10.0-862.11.6.el7.x86_64 ContainerOsVersion:CentOS Linux 7 (Core) DockerVersion: DockerAPIVersion: CadvisorVersion: CadvisorRevision:}
Mar 22 15:56:31 node01 kubelet[147581]: I0322 15:56:31.720896  147581 server.go:659] "--cgroups-per-qos enabled, but --cgroup-root was not specified. defaulting to /"
Mar 22 15:56:31 node01 kubelet[147581]: I0322 15:56:31.721939  147581 container_manager_linux.go:267] "Container manager verified user specified cgroup-root exists" cgroupRoot=[]
Mar 22 15:56:31 node01 kubelet[147581]: I0322 15:56:31.722392  147581 container_manager_linux.go:272] "Creating Container Manager object based on Node Config" nodeConfig={RuntimeCgroupsName: SystemCgroupsName: KubeletCgroupsName:
Mar 22 15:56:31 node01 kubelet[147581]: I0322 15:56:31.722503  147581 topology_manager.go:134] "Creating topology manager with policy per scope" topologyPolicyName="none" topologyScopeName="container"
Mar 22 15:56:31 node01 kubelet[147581]: I0322 15:56:31.722609  147581 container_manager_linux.go:308] "Creating device plugin manager"
Mar 22 15:56:31 node01 kubelet[147581]: I0322 15:56:31.722689  147581 manager.go:125] "Creating Device Plugin manager" path="/var/lib/kubelet/device-plugins/kubelet.sock"
Mar 22 15:56:31 node01 kubelet[147581]: I0322 15:56:31.722763  147581 server.go:66] "Creating device plugin registration server" version="v1beta1" socket="/var/lib/kubelet/device-plugins/kubelet.sock"
Mar 22 15:56:31 node01 kubelet[147581]: I0322 15:56:31.722905  147581 state_mem.go:36] "Initialized new in-memory state store"
Mar 22 15:56:31 node01 kubelet[147581]: E0322 15:56:31.726502  147581 run.go:74] "command failed" err="failed to run Kubelet: validate service connection: CRI v1 runtime API is not implemented for endpoint \"unix:///run/containerd/containerd.sock\": rpc error: code = Unimplemented desc = unknown service runtime.v1.RuntimeService"
Based on the error in the log and reference [3], deleting the /etc/containerd/config.toml file and restarting containerd fixes this:
mv /etc/containerd/config.toml /root/config.toml.bak
systemctl restart containerd
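The stale config.toml apparently predates the CRI v1 runtime API that this kubelet requires; with it out of the way, containerd starts with its built-in defaults. A hedged way to confirm the CRI endpoint now responds (it assumes crictl is present on the node):

# check that the CRI v1 runtime API is reachable:
crictl --runtime-endpoint unix:///run/containerd/containerd.sock version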
Run the setup again. This time calico-node fails to start; its events look like this:
Events:
  Type     Reason     Age                From               Message
  ----     ------     ----               ----               -------
  Normal   Scheduled  41s                default-scheduler  Successfully assigned kube-system/calico-node-rqpjm to 10.10.11.49
  Normal   Pulling    20s (x2 over 31s)  kubelet            Pulling image "easzlab.io.local:5000/calico/cni:v3.23.5"
  Warning  Failed     19s (x2 over 31s)  kubelet            Failed to pull image "easzlab.io.local:5000/calico/cni:v3.23.5": rpc error: code = Unknown desc = failed to pull and unpack image "easzlab.io.local:5000/calico/cni:v3.23.5": failed to resolve reference "easzlab.io.local:5000/calico/cni:v3.23.5": failed to do request: Head "https://easzlab.io.local:5000/...": http: server gave HTTP response to HTTPS client
  Warning  Failed     19s (x2 over 31s)  kubelet            Error: ErrImagePull
  Normal   BackOff    5s (x2 over 30s)   kubelet            Back-off pulling image "easzlab.io.local:5000/calico/cni:v3.23.5"
  Warning  Failed     5s (x2 over 30s)   kubelet            Error: ImagePullBackOff
The Docker-level configuration looks correct, and pulling the image through Docker works:
[root@node01 ~]# cat /etc/docker/daemon.json
{
  "max-concurrent-downloads": 10,
  "insecure-registries": ["easzlab.io.local:5000"],
  "log-driver": "json-file",
  "log-level": "warn",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  },
  "data-root": "/var/lib/docker"
}
[root@node01 log]# docker pull easzlab.io.local:5000/calico/cni:v3.23.5
v3.23.5: Pulling from calico/cni
Digest: sha256:9c5055a2b5bc0237ab160aee058135ca9f2a8f3c3eee313747a02edcec482f29
Status: Image is up to date for easzlab.io.local:5000/calico/cni:v3.23.5
easzlab.io.local:5000/calico/cni:v3.23.5
At the containerd level, pulling the image also works:
[root@node01 log]# ctr image pull --plain-http=true easzlab.io.local:5000/calico/cni:v3.23.5
easzlab.io.local:5000/calico/cni:v3.23.5: resolved
manifest-sha256:9c5055a2b5bc0237ab160aee058135ca9f2a8f3c3eee313747a02edcec482f29: done
layer-sha256:cc0e45adf05a30a90384ba7024dbabdad9ae0bcd7b5a535c28dede741298fea3: done
layer-sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1: done
layer-sha256:47c5dbbec31222325790ebad8c07d270a63689bd10dc8f54115c65db7c30ad1f: done
layer-sha256:8efc3d73e2741a93be09f68c859da466f525b9d0bddb1cd2b2b633f14f232941: done
config-sha256:1c979d623de9aef043cb4ff489da5636d61c39e30676224af0055240e1816382: done
layer-sha256:4c98a4f67c5a7b1058111d463051c98b23e46b75fc943fc2535899a73fc0c9f1: done
layer-sha256:51729c6e2acda05a05e203289f5956954814d878f67feb1a03f9941ec5b4008b: done
layer-sha256:050b055d5078c5c6ad085d106c232561b0c705aa2173edafd5e7a94a1e908fc5: done
layer-sha256:7430548aa23e56c14da929bbe5e9a2af0f9fd0beca3bd95e8925244058b83748: done
elapsed: 3.1 s  total: 103.0 (33.2 MiB/s)
unpacking linux/amd64 sha256:9c5055a2b5bc0237ab160aee058135ca9f2a8f3c3eee313747a02edcec482f29...
done: 6.82968396s
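Both of these checks, however, bypass the path kubelet actually takes: Docker honors its own insecure-registries setting, and ctr was explicitly told --plain-http. The pull kubelet performs goes through the CRI, which can be reproduced directly (assuming crictl points at the containerd socket); it should fail with the same HTTPS error until the registry is configured for the CRI plugin:

# reproduce the pull exactly as kubelet does it, through the CRI:
crictl pull easzlab.io.local:5000/calico/cni:v3.23.5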
Following reference [4], generate containerd's default configuration and add an entry for the private registry:
[root@node01 ~]# containerd config default > /etc/containerd/config.toml
[root@node01 ~]# vim /etc/containerd/config.toml
    [plugins."io.containerd.grpc.v1.cri".registry]
      config_path = ""

      [plugins."io.containerd.grpc.v1.cri".registry.auths]

      [plugins."io.containerd.grpc.v1.cri".registry.configs]

      [plugins."io.containerd.grpc.v1.cri".registry.headers]

      [plugins."io.containerd.grpc.v1.cri".registry.mirrors]

        [plugins."io.containerd.grpc.v1.cri".registry.mirrors."easzlab.io.local:5000"]
          endpoint = ["http://easzlab.io.local:5000"]
[root@node01 ~]# service containerd restart
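As a side note, newer containerd releases deprecate inline mirrors in favor of the config_path / hosts.toml mechanism; a hedged equivalent for containerd 1.5+ (set config_path = "/etc/containerd/certs.d" in config.toml first):

# describe the insecure registry in a hosts.toml instead of inline mirrors:
mkdir -p "/etc/containerd/certs.d/easzlab.io.local:5000"
cat > "/etc/containerd/certs.d/easzlab.io.local:5000/hosts.toml" <<'EOF'
server = "http://easzlab.io.local:5000"

[host."http://easzlab.io.local:5000"]
  capabilities = ["pull", "resolve"]
EOF
systemctl restart containerd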
Check the pod status; now several pods are stuck in ContainerCreating:
[root@node01 ~]# kubectl get pod -A
NAMESPACE     NAME                                         READY   STATUS              RESTARTS   AGE
kube-system   calico-kube-controllers-89b744d6c-klzwh      1/1     Running             0          5m35s
kube-system   calico-node-wmvff                            1/1     Running             0          5m35s
kube-system   coredns-6665999d97-mp7xc                     0/1     ContainerCreating   0          5m35s
kube-system   dashboard-metrics-scraper-57566685b4-8q5fm   0/1     ContainerCreating   0          5m35s
kube-system   kubernetes-dashboard-57db9bfd5b-h6jp4        0/1     ContainerCreating   0          5m35s
kube-system   metrics-server-6bd9f986fc-njpnj              0/1     ContainerCreating   0          5m35s
kube-system   node-local-dns-wz9bg                         1/1     Running             0          5m31s
Pick one and describe it:
Events:
  Type     Reason                  Age                   From               Message
  ----     ------                  ----                  ----               -------
  Warning  FailedScheduling        6m7s                  default-scheduler  0/1 nodes are available: 1 node(s) had untolerated taint {node.kubernetes.io/not-ready: }. preemption: 0/1 nodes are available: 1 Preemption is not helpful for scheduling..
  Normal   Scheduled               5m47s                 default-scheduler  Successfully assigned kube-system/coredns-6665999d97-mp7xc to 10.10.11.49
  Warning  FailedCreatePodSandBox  5m46s                 kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "072c164d79f4874a8d851d36115ea04b75a2155dae3cecdc764e923c9f38f86b": plugin type="calico" failed (add): failed to find plugin "calico" in path [/opt/cni/bin]
  Normal   SandboxChanged          33s (x25 over 5m46s)  kubelet            Pod sandbox changed, it will be killed and re-created.
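The sandbox failure means the calico CNI plugin binaries are missing from /opt/cni/bin. One hedged way to restore them, underlying the manual copy below, is to extract the binaries from the calico/cni image the local registry already serves (the in-image path is an assumption based on how calico's install container works):

# hypothetical recovery: copy the CNI plugins out of the calico/cni image
# (the /opt/cni/bin path inside the image is an assumption):
docker create --name cni-src easzlab.io.local:5000/calico/cni:v3.23.5
docker cp cni-src:/opt/cni/bin/. /opt/cni/bin/
docker rm cni-src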
After copying the plugins into /opt/cni/bin manually and marking them executable, the pods recover:
[root@node01 bin]# cd /opt/cni/bin/
[root@node01 bin]# chmod +x *
[root@node01 bin]# ll -h
total 186M
-rwxr-xr-x 1 root root 3.7M Mar 22 17:46 bandwidth
-rwxr-xr-x 1 root root  56M Mar 22 17:46 calico
-rwxr-xr-x 1 root root  56M Mar 22 17:46 calico-ipam
-rwxr-xr-x 1 root root 2.4M Mar 22 17:46 flannel
-rwxr-xr-x 1 root root 3.1M Mar 22 17:46 host-local
-rwxr-xr-x 1 root root  56M Mar 22 17:46 install
-rwxr-xr-x 1 root root 3.2M Mar 22 17:46 loopback
-rwxr-xr-x 1 root root 3.6M Mar 22 17:46 portmap
-rwxr-xr-x 1 root root 3.3M Mar 22 17:46 tuning
[root@node01 bin]# kubectl get pod -A
NAMESPACE     NAME                                         READY   STATUS    RESTARTS   AGE
kube-system   calico-kube-controllers-89b744d6c-mpfgq      1/1     Running   0          37m
kube-system   calico-node-h9sm2                            1/1     Running   0          37m
kube-system   coredns-6665999d97-8pdbd                     1/1     Running   0          37m
kube-system   dashboard-metrics-scraper-57566685b4-c2l8w   1/1     Running   0          37m
kube-system   kubernetes-dashboard-57db9bfd5b-74lmb        1/1     Running   0          37m
kube-system   metrics-server-6bd9f986fc-d9crl              1/1     Running   0          37m
kube-system   node-local-dns-kvgv6                         1/1     Running   0          37m
Deployment complete.
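As a final check, a quick smoke test confirms the cluster can actually schedule work (the nginx image here is an arbitrary choice):

# minimal smoke test of the freshly deployed cluster:
kubectl get node -o wide
kubectl run smoke-test --image=nginx --restart=Never
kubectl get pod smoke-test -w    # wait for Running, then clean up
kubectl delete pod smoke-test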
References