
Rancher Tutorial Rancher Server is not Coming Up #208

Open
khan-belal opened this issue May 30, 2023 · 1 comment

@khan-belal

Hello,

I'm attempting to follow the Rancher tutorial; however, the Rancher local cluster does not come up.

I'm able to access the Rancher dashboard and can log in using the bootstrapped password, but I can't provision any new clusters because critical services are not available.

Looking at the Docker logs, I can see multiple errors installing Helm charts. A small subset of the errors is below:

2023/05/30 18:26:16 [ERROR] Failed to install system chart fleet-crd: failed to install , pod cattle-system/helm-operation-6glxv exited 128
2023/05/30 18:26:21 [ERROR] Failed to install system chart fleet: failed to install , pod cattle-system/helm-operation-xjlsw exited 128
2023/05/30 18:26:27 [ERROR] Failed to install system chart fleet-crd: failed to install , pod cattle-system/helm-operation-n2kq4 exited 128
2023/05/30 18:26:35 [ERROR] Failed to install system chart rancher-webhook: failed to install , pod cattle-system/helm-operation-2hmm4 exited 128
2023/05/30 18:26:42 [ERROR] Failed to install system chart fleet: failed to install , pod cattle-system/helm-operation-m64ch exited 128
2023/05/30 18:26:47 [ERROR] Failed to install system chart fleet-crd: failed to install , pod cattle-system/helm-operation-8blhb exited 128
2023/05/30 18:26:55 [ERROR] Failed to install system chart fleet: failed to install , pod cattle-system/helm-operation-98v28 exited 128
2023/05/30 18:27:11 [ERROR] Failed to install system chart fleet-crd: failed to install , pod cattle-system/helm-operation-j4bm6 exited 128
2023/05/30 18:27:22 [ERROR] Failed to install system chart fleet: failed to install , pod cattle-system/helm-operation-77996 exited 128
2023/05/30 18:27:29 [ERROR] Failed to install system chart fleet-crd: failed to install , pod cattle-system/helm-operation-s52v2 exited 128
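
(For reference, I'm pulling these entries with something along the lines of the command below; the container name is just the one I gave it at docker run, so adjust as needed.)

docker logs rancher-server 2>&1 | grep ERROR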

I also cannot get the kubectl console from the dashboard to connect to the cluster. The only way I can access kubectl is by opening a shell inside the Docker container for Rancher. When running a get pods command in the cattle-system namespace, there are multiple start errors that correspond to the Docker log entries above.
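
In case it's useful, this is roughly how I'm getting that shell (the container name matches what I passed to docker run; kubectl is already available and configured inside the Rancher container):

docker exec -it rancher-server /bin/bash
kubectl get po -n cattle-system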

c2b5407dd247:/var/lib/rancher # kubectl get po -n cattle-system
NAME                   READY   STATUS       RESTARTS   AGE
helm-operation-2qbdj   0/2     StartError   0          4m11s
helm-operation-2x5ms   0/2     StartError   0          47m
helm-operation-42z5m   0/2     StartError   0          50m
helm-operation-4q9vr   0/2     StartError   0          39m
helm-operation-5dks8   0/2     StartError   0          24m
helm-operation-5h5h9   0/2     StartError   0          34m
helm-operation-5kjmr   0/2     StartError   0          51m
helm-operation-5l9rf   0/2     StartError   0          14m
helm-operation-74q8f   0/2     StartError   0          49m
helm-operation-74sqn   0/2     StartError   0          39m
helm-operation-7lhjd   0/2     StartError   0          34m
helm-operation-7nzrn   0/2     StartError   0          44m
helm-operation-8c2p8   0/2     StartError   0          49m
helm-operation-8gpjw   0/2     StartError   0          52m
helm-operation-8kdg5   0/2     StartError   0          14m
helm-operation-9b4h6   0/2     StartError   0          24m
helm-operation-9vdg2   0/2     StartError   0          4m21s
helm-operation-9wzp7   0/2     StartError   0          50m
helm-operation-9xsst   0/2     StartError   0          14m
helm-operation-bmt8l   0/2     StartError   0          51m
helm-operation-c2j4g   0/2     StartError   0          14m
helm-operation-c2pkw   0/2     StartError   0          53m
helm-operation-c4c7t   0/2     StartError   0          29m
helm-operation-c5mjv   0/2     StartError   0          48m
helm-operation-c8z4b   0/2     StartError   0          39m
helm-operation-ccs46   0/2     StartError   0          14m
helm-operation-cl2sq   0/2     StartError   0          53m
helm-operation-dglr8   0/2     StartError   0          51m
helm-operation-dhtkg   0/2     StartError   0          4m16s
helm-operation-dlmtp   0/2     StartError   0          19m
helm-operation-dsrvc   0/2     StartError   0          4m27s
helm-operation-dw9jx   0/2     StartError   0          34m
helm-operation-fk84b   0/2     StartError   0          9m37s
helm-operation-g47sz   0/2     StartError   0          34m
helm-operation-glrgs   0/2     StartError   0          29m
helm-operation-gnvvl   0/2     StartError   0          9m26s
helm-operation-gwgnr   0/2     StartError   0          9m20s
helm-operation-hdnpr   0/2     StartError   0          52m
helm-operation-hszxc   0/2     StartError   0          50m
helm-operation-htf48   0/2     StartError   0          9m14s
helm-operation-hzp2h   0/2     StartError   0          48m
helm-operation-j7rg7   0/2     StartError   0          29m
helm-operation-j7rgq   0/2     StartError   0          24m
helm-operation-j7rkm   0/2     StartError   0          49m
helm-operation-jflkb   0/2     StartError   0          24m
helm-operation-jwf5k   0/2     StartError   0          54m
helm-operation-jxqtv   0/2     StartError   0          14m
helm-operation-kp6p2   0/2     StartError   0          28m
helm-operation-mlhzg   0/2     StartError   0          24m
helm-operation-mmxks   0/2     StartError   0          54m
helm-operation-mxvgt   0/2     StartError   0          29m
helm-operation-n2kp6   0/2     StartError   0          34m
helm-operation-ndpxd   0/2     StartError   0          48m
helm-operation-nhg8w   0/2     StartError   0          44m
helm-operation-nj4fx   0/2     StartError   0          9m10s
helm-operation-q96kx   0/2     StartError   0          4m33s
helm-operation-qkt59   0/2     StartError   0          48m
helm-operation-rn58j   0/2     StartError   0          50m
helm-operation-rvtqn   0/2     StartError   0          49m
helm-operation-s9fp2   0/2     StartError   0          54m
helm-operation-sgv5j   0/2     StartError   0          44m
helm-operation-shd4k   0/2     StartError   0          19m
helm-operation-sxm4q   0/2     StartError   0          50m
helm-operation-t4pzk   0/2     StartError   0          48m
helm-operation-tkdds   0/2     StartError   0          48m
helm-operation-tr7vd   0/2     StartError   0          39m
helm-operation-v9c4v   0/2     StartError   0          34m
helm-operation-vg5gf   0/2     StartError   0          44m
helm-operation-vmmcq   0/2     StartError   0          39m
helm-operation-vmw94   0/2     StartError   0          52m
helm-operation-vxnqp   0/2     StartError   0          19m
helm-operation-w4mx2   0/2     StartError   0          44m
helm-operation-w5h58   0/2     StartError   0          19m
helm-operation-wfjmh   0/2     StartError   0          29m
helm-operation-wl5lt   0/2     StartError   0          54m
helm-operation-wmbrb   0/2     StartError   0          9m32s
helm-operation-wxcpf   0/2     StartError   0          4m39s
helm-operation-xjz9n   0/2     StartError   0          39m
helm-operation-xlgkp   0/2     StartError   0          19m
helm-operation-xtxvz   0/2     StartError   0          19m
helm-operation-z9kwc   0/2     StartError   0          24m
helm-operation-zf52q   0/2     StartError   0          47m
helm-operation-zgpqg   0/2     StartError   0          51m
helm-operation-zgrkz   0/2     StartError   0          43m

Looking at an individual pod, I see the error below:

c2b5407dd247:/var/lib/rancher # kubectl describe po -n cattle-system helm-operation-zgrkz
Name:         helm-operation-zgrkz
Namespace:    cattle-system
Priority:     0
Node:         local-node/172.17.0.2
Start Time:   Tue, 30 May 2023 19:39:23 +0000
Labels:       pod-impersonation.cattle.io/token=kghdvfmchthnndsdddzhc47t4gpwhwjccb9qxhqlxhw4r8gbltsjqb
Annotations:  pod-impersonation.cattle.io/cluster-role: pod-impersonation-helm-op-f7zh7
Status:       Failed
IP:           10.42.0.167
IPs:
  IP:  10.42.0.167
Containers:
  helm:
    Container ID:  containerd://7ef5cd0c5e0916c876adb38e602bfdae9ac7e1902643fb94282a052feab706e4
    Image:         rancher/shell:v0.1.14
    Image ID:      docker.io/rancher/shell@sha256:9c33c0e58ceb0b3cb6a85d2a6349b1f7fe818e383e6a3cb46671558fbb2f7781
    Port:          <none>
    Host Port:     <none>
    Command:
      helm-cmd
    State:          Terminated
      Reason:       StartError
      Message:      failed to create containerd task: OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: rootfs_linux.go:76: mounting "/var/lib/rancher/k3s/agent/containerd/io.containerd.grpc.v1.cri/sandboxes/6cd0ead3753fdc537e0510794237751eaa4e6f684b9813682bb5055c6376b2c9/resolv.conf" to rootfs at "/etc/resolv.conf" caused: open /run/k3s/containerd/io.containerd.runtime.v2.task/k8s.io/7ef5cd0c5e0916c876adb38e602bfdae9ac7e1902643fb94282a052feab706e4/rootfs/etc/resolv.conf: no such file or directory: unknown
      Exit Code:    128
      Started:      Thu, 01 Jan 1970 00:00:00 +0000
      Finished:     Tue, 30 May 2023 19:39:32 +0000
    Ready:          False
    Restart Count:  0
    Environment:
      KUBECONFIG:  /home/shell/.kube/config
    Mounts:
      /home/shell/.kube/config from user-kubeconfig (ro,path="config")
      /home/shell/helm from data (ro)
  proxy:
    Container ID:  containerd://edd37b9494686084dcafd7c2cef92ee313bb9d5bd8d1e43fc9d11a38ca2be9ed
    Image:         rancher/shell:v0.1.14
    Image ID:      docker.io/rancher/shell@sha256:9c33c0e58ceb0b3cb6a85d2a6349b1f7fe818e383e6a3cb46671558fbb2f7781
    Port:          <none>
    Host Port:     <none>
    Command:
      sh
      -c
      kubectl proxy --disable-filter || true
    State:          Terminated
      Reason:       StartError
      Message:      failed to create containerd task: OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: rootfs_linux.go:76: mounting "/var/lib/rancher/k3s/agent/containerd/io.containerd.grpc.v1.cri/sandboxes/6cd0ead3753fdc537e0510794237751eaa4e6f684b9813682bb5055c6376b2c9/resolv.conf" to rootfs at "/etc/resolv.conf" caused: open /run/k3s/containerd/io.containerd.runtime.v2.task/k8s.io/edd37b9494686084dcafd7c2cef92ee313bb9d5bd8d1e43fc9d11a38ca2be9ed/rootfs/etc/resolv.conf: no such file or directory: unknown
      Exit Code:    128
      Started:      Thu, 01 Jan 1970 00:00:00 +0000
      Finished:     Tue, 30 May 2023 19:39:34 +0000
    Ready:          False
    Restart Count:  0
    Environment:
      KUBECONFIG:  /root/.kube/config
    Mounts:
      /root/.kube/config from admin-kubeconfig (ro,path="config")
      /var/run/secrets/kubernetes.io/serviceaccount from pod-impersonation-helm-op-s2tkm-token-k6b6q (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  data:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  helm-operation-z6wbr
    Optional:    false
  admin-kubeconfig:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      impersonation-helm-op-admin-kubeconfig-whwhn
    Optional:  false
  user-kubeconfig:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      impersonation-helm-op-user-kubeconfig-sjsw7
    Optional:  false
  pod-impersonation-helm-op-s2tkm-token-k6b6q:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  pod-impersonation-helm-op-s2tkm-token-k6b6q
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  kubernetes.io/os=linux
Tolerations:     cattle.io/os=linux:NoSchedule
                 node-role.kubernetes.io/controlplane=true:NoSchedule
                 node-role.kubernetes.io/etcd=true:NoExecute
                 node.cloudprovider.kubernetes.io/uninitialized=true:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age   From               Message
  ----     ------     ----  ----               -------
  Normal   Scheduled  44m   default-scheduler  Successfully assigned cattle-system/helm-operation-zgrkz to local-node
  Normal   Pulled     44m   kubelet            Container image "rancher/shell:v0.1.14" already present on machine
  Normal   Created    44m   kubelet            Created container helm
  Warning  Failed     44m   kubelet            Error: failed to create containerd task: OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: rootfs_linux.go:76: mounting "/var/lib/rancher/k3s/agent/containerd/io.containerd.grpc.v1.cri/sandboxes/6cd0ead3753fdc537e0510794237751eaa4e6f684b9813682bb5055c6376b2c9/resolv.conf" to rootfs at "/etc/resolv.conf" caused: open /run/k3s/containerd/io.containerd.runtime.v2.task/k8s.io/helm/rootfs/etc/resolv.conf: no such file or directory: unknown
  Normal   Pulled     44m   kubelet            Container image "rancher/shell:v0.1.14" already present on machine
  Normal   Created    44m   kubelet            Created container proxy
  Warning  Failed     44m   kubelet            Error: failed to create containerd task: OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: rootfs_linux.go:76: mounting "/var/lib/rancher/k3s/agent/containerd/io.containerd.grpc.v1.cri/sandboxes/6cd0ead3753fdc537e0510794237751eaa4e6f684b9813682bb5055c6376b2c9/resolv.conf" to rootfs at "/etc/resolv.conf" caused: open /run/k3s/containerd/io.containerd.runtime.v2.task/k8s.io/proxy/rootfs/etc/resolv.conf: no such file or directory: unknown

I'm running Docker Desktop 4.20.0 on Windows 10 21H2.

I have tried starting up the Rancher server through Docker on multiple machines and experienced the same issue. I've been troubleshooting this for a few hours now and am not sure how to proceed.

Any assistance would be appreciated.

Thanks

@khan-belal (Author)

I looked into this some more and have narrowed the issue down to the local directory being mounted into the container for persistent storage.

When running the docker command without persistent storage:

docker run -d --name rancher-server --restart=unless-stopped -p 80:80 -p 443:443 --privileged rancher/rancher:latest

Rancher comes up normally and I'm able to create a cluster, etc. For some reason, when the local directory is mounted, Rancher can't locate the config files it needs to get the management cluster up and running.
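
For comparison, the variant with a bind mount that fails for me looks roughly like the following (the host path here is only a placeholder; the actual directory I'm mounting behaves the same way):

docker run -d --name rancher-server --restart=unless-stopped -p 80:80 -p 443:443 --privileged -v /opt/rancher:/var/lib/rancher rancher/rancher:latest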

Any ideas?
