# longhorn-storage
Are your pods still stuck? Could you show the output of this command?
kubectl get pods --namespace longhorn-system
And could you check the logs of the other pods?
This is the log of the second
$ kubectl --namespace longhorn-system logs longhorn-conversion-webhook-55cf57895c-cn68m
time="2023-03-27T06:58:35Z" level=info msg="Starting longhorn conversion webhook server"
W0327 06:58:35.776898       1 client_config.go:617] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
time="2023-03-27T06:58:35Z" level=warning msg="Failed to init Kubernetes secret: secrets \"longhorn-webhook-tls\" not found"
time="2023-03-27T06:58:36Z" level=info msg="generated self-signed CA certificate CN=dynamiclistener-ca,O=dynamiclistener-org: notBefore=2023-03-27 06:58:36.345554557 +0000 UTC notAfter=2033-03-24 06:58:36.345554557 +0000 UTC"
time="2023-03-27T06:58:38Z" level=info msg="Listening on :9443"
time="2023-03-27T06:58:38Z" level=info msg="certificate CN=dynamic,O=dynamic signed by CN=dynamiclistener-ca,O=dynamiclistener-org: notBefore=2023-03-27 06:58:36 +0000 UTC notAfter=2024-03-26 06:58:38 +0000 UTC"
time="2023-03-27T06:58:38Z" level=info msg="Creating new TLS secret for longhorn-webhook-tls (count: 1): map[<|> <]|]>"
time="2023-03-27T06:58:39Z" level=info msg="Active TLS secret longhorn-webhook-tls (ver=10272129) (count 1): map[<|> <]|]>"
time="2023-03-27T06:58:41Z" level=info msg="Starting <|>, Kind=APIService controller"
time="2023-03-27T06:58:41Z" level=info msg="Starting /v1, Kind=Secret controller"
time="2023-03-27T06:58:41Z" level=info msg="Starting <|>, Kind=CustomResourceDefinition controller"
time="2023-03-27T06:58:41Z" level=info msg="Building conversion rules..."
time="2023-03-27T06:58:41Z" level=info msg="Updating TLS secret for longhorn-webhook-tls (count: 1): map[<|> <]|]>"
time="2023-03-27T06:58:41Z" level=info msg="Update CRD for <|>"
time="2023-03-27T06:58:41Z" level=info msg="Active TLS secret longhorn-webhook-tls (ver=10272145) (count 1): map[<|> <]|]>"
time="2023-03-27T06:58:42Z" level=info msg="Update CRD for <|>"
time="2023-03-27T06:58:42Z" level=info msg="Update CRD for <|>"
time="2023-03-27T06:58:42Z" level=info msg="Update CRD for <|>"
time="2023-03-27T06:58:42Z" level=info msg="Building conversion rules..."
kubectl get pods --namespace longhorn-system
remains the same, still stuck
Which Longhorn version did you install? Did you install Longhorn by Rancher?
I have updated both nodes (one manager and one worker) to K3s v1.26.4 and Longhorn 1.4.1 using
$ kubectl get pods --namespace longhorn-system --watch
NAME                                           READY   STATUS     RESTARTS      AGE
longhorn-admission-webhook-66bdd6674b-62tzp    0/1     Init:0/1   0             44s
longhorn-admission-webhook-66bdd6674b-lvqdm    1/1     Running    0             44s
longhorn-conversion-webhook-7b9fd878b6-jsl7p   1/1     Running    0             45s
longhorn-conversion-webhook-7b9fd878b6-p8vvq   1/1     Running    0             45s
longhorn-driver-deployer-7fddbc7f5c-sm9fp      0/1     Init:0/1   0             48s
longhorn-manager-6zv76                         0/1     Init:0/1   0             48s
longhorn-manager-bf48z                         0/1     Error      1 (20s ago)   48s
longhorn-recovery-backend-7d686cd599-47d2g     1/1     Running    0             47s
longhorn-recovery-backend-7d686cd599-vf7pd     1/1     Running    0             47s
longhorn-ui-59fdc4c7b9-8d2p8                   1/1     Running    0             46s
longhorn-ui-59fdc4c7b9-hpdp9                   1/1     Running    1 (21s ago)   46s
longhorn-ui-59fdc4c7b9-hpdp9                   0/1     Error      1 (22s ago)   47s
longhorn-manager-bf48z                         0/1     CrashLoopBackOff   1 (15s ago)   54s
longhorn-manager-bf48z                         0/1     Running            2 (16s ago)   55s
longhorn-ui-59fdc4c7b9-hpdp9                   0/1     CrashLoopBackOff   1 (13s ago)   59s
longhorn-ui-59fdc4c7b9-hpdp9                   1/1     Running            2 (14s ago)   60s
longhorn-manager-bf48z                         0/1     Error              2 (26s ago)   65s
longhorn-manager-bf48z                         0/1     CrashLoopBackOff   2 (15s ago)   79s
longhorn-ui-59fdc4c7b9-hpdp9                   0/1     Error              2 (34s ago)   80s
longhorn-manager-bf48z                         0/1     Running            3 (27s ago)   91s
longhorn-ui-59fdc4c7b9-hpdp9                   0/1     CrashLoopBackOff   2 (16s ago)   95s
longhorn-manager-bf48z                         0/1     Error              3 (38s ago)   102s
longhorn-ui-59fdc4c7b9-hpdp9                   1/1     Running            3 (30s ago)   109s
longhorn-manager-bf48z                         0/1     CrashLoopBackOff   3 (13s ago)   114s
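A long listing like the one above is easier to read when reduced to the pods that are not fully ready. The following is a minimal sketch, not part of the original thread: `not_ready_pods` is a hypothetical helper name, and it assumes the default table layout of `kubectl get pods` (NAME, READY, STATUS as the first three columns).

```shell
# Hypothetical helper: reads `kubectl get pods` table output on stdin and
# prints "NAME STATUS" for every row whose READY count is below its total.
not_ready_pods() {
  awk '$1 != "NAME" {
    split($2, ready, "/")            # READY column looks like "0/1"
    if (ready[1] != ready[2]) print $1, $3
  }'
}
```

Usage would be along the lines of `kubectl get pods --namespace longhorn-system | not_ready_pods`.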
$ kubectl --namespace longhorn-system logs longhorn-manager-bf48z
Defaulted container "longhorn-manager" out of: longhorn-manager, wait-longhorn-admission-webhook (init)
W0328 11:25:21.547992       1 client_config.go:617] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
time="2023-03-28T11:25:21Z" level=info msg="cannot list the content of the src directory /var/lib/rancher/longhorn/engine-binaries for the copy, will do nothing: failed to execute: nsenter [--mount=/host/proc/1/ns/mnt --net=/host/proc/1/ns/net bash -c ls /var/lib/rancher/longhorn/engine-binaries/*], output , stderr ls: cannot access '/var/lib/rancher/longhorn/engine-binaries/*': No such file or directory\n: exit status 2"
I0328 11:25:21.600950       1 leaderelection.go:248] attempting to acquire leader lease longhorn-system/longhorn-manager-upgrade-lock...
I0328 11:25:21.628618       1 leaderelection.go:258] successfully acquired lease longhorn-system/longhorn-manager-upgrade-lock
time="2023-03-28T11:25:21Z" level=info msg="Start upgrading"
time="2023-03-28T11:25:21Z" level=info msg="setting default-engine-image not found"
time="2023-03-28T11:25:31Z" level=error msg="Upgrade failed: upgrade API version failed: cannot create CRDAPIVersionSetting: Internal error occurred: failed calling webhook \"<|>\": failed to call webhook: Post \"<https://longhorn-admission-webhook.longhorn-system.svc:9443/v1/webhook/validaton?timeout=10s>\": context deadline exceeded"
time="2023-03-28T11:25:31Z" level=info msg="Upgrade leader lost: geo-node1"
time="2023-03-28T11:25:31Z" level=fatal msg="Error starting manager: upgrade API version failed: cannot create CRDAPIVersionSetting: Internal error occurred: failed calling webhook \"<|>\": failed to call webhook: Post \"<https://longhorn-admission-webhook.longhorn-system.svc:9443/v1/webhook/validaton?timeout=10s>\": context deadline exceeded"
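To pull only the failure lines out of a log like this one, a small filter can help. This is a sketch under an assumption: it matches the `level=error` and `level=fatal` fields of the `time=... level=... msg=...` lines seen above and ignores the klog-style lines (those starting with `I0328` or `W0328`). `log_errors` is a hypothetical name.

```shell
# Hypothetical helper: keeps only error- and fatal-level lines from a
# Longhorn-style log read on stdin.
log_errors() {
  grep -E 'level=(error|fatal)'
}
```

For example: `kubectl --namespace longhorn-system logs longhorn-manager-bf48z | log_errors`.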
Can you check if your DNS works well and check if the
is accessible?
not sure how to check that, but if I curl from the host or another pod I get
curl: (6) Could not resolve host: longhorn-admission-webhook.longhorn-system.svc
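One possible way to check in-cluster DNS is to resolve the Service name from a throwaway pod. The sketch below is not from the thread and rests on assumptions: the cluster domain is the default `cluster.local`, and the pod name `dns-check`, the `busybox:1.28` image, and both function names are illustrative choices.

```shell
# Build the fully qualified in-cluster DNS name of a Service.
# Assumes the default cluster domain "cluster.local" unless $3 is given.
svc_fqdn() {
  printf '%s.%s.svc.%s\n' "$1" "$2" "${3:-cluster.local}"
}

# Resolve a Service name from inside the cluster using a temporary pod
# that is deleted again when nslookup exits.
check_svc_dns() {
  kubectl run dns-check --rm -i --restart=Never --image=busybox:1.28 \
    -- nslookup "$(svc_fqdn "$1" "$2")"
}
```

It could be called as `check_svc_dns longhorn-admission-webhook longhorn-system`. Note that a curl from the host usually cannot resolve `.svc` names, because the host's resolver does not query the cluster DNS, so a failure there is not conclusive on its own. A failure from inside a pod is the more telling result.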
Could you show the output of these commands?
kubectl -n longhorn-system get endpoints
kubectl -n longhorn-system get svc
$ kubectl -n longhorn-system get endpoints
NAME                          ENDPOINTS                          AGE
longhorn-admission-webhook                    95m
longhorn-backend                                                 95m
longhorn-conversion-webhook,   95m
longhorn-engine-manager       <none>                             95m
longhorn-frontend   ,   95m
longhorn-recovery-backend,   95m
longhorn-replica-manager      <none>                             95m
$ kubectl -n longhorn-system get svc
NAME                          TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
longhorn-admission-webhook    ClusterIP   <none>        9443/TCP   95m
longhorn-backend              ClusterIP   <none>        9500/TCP   95m
longhorn-conversion-webhook   ClusterIP   <none>        9443/TCP   95m
longhorn-engine-manager       ClusterIP   None            <none>        <none>     95m
longhorn-frontend             ClusterIP    <none>        80/TCP     95m
longhorn-recovery-backend     ClusterIP    <none>        9600/TCP   95m
longhorn-replica-manager      ClusterIP   None            <none>        <none>     95m
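When reading the endpoints listing, the interesting rows are the Services with no backing pod address. A minimal sketch, assuming the default `kubectl get endpoints` table layout where an empty ENDPOINTS column is printed as `<none>`; `services_without_endpoints` is a hypothetical name.

```shell
# Hypothetical helper: reads `kubectl get endpoints` table output on stdin
# and prints the names of Services whose ENDPOINTS column is "<none>".
services_without_endpoints() {
  awk '$1 != "NAME" && $2 == "<none>" { print $1 }'
}
```

For example: `kubectl -n longhorn-system get endpoints | services_without_endpoints`. In the output above this would list longhorn-engine-manager and longhorn-replica-manager.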
Could you test this command in the longhorn-manager pod?
# telnet longhorn-admission-webhook.longhorn-system.svc 9443
the manager always crashes, tried the command from the admission-webhook
I have no name!@longhorn-admission-webhook-66bdd6674b-lvqdm:/> telnet longhorn-admission-webhook.longhorn-system.svc 9443
Connected to longhorn-admission-webhook.longhorn-system.svc.
Escape character is '^]'.
sorry, from the conversion-webhook
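If telnet is not available in a container, the same reachability test can be approximated with bash alone. This is a sketch, assuming the image ships `bash` and `timeout`; `tcp_open` is a hypothetical name, and the third argument (timeout in seconds, default 5) is an illustrative choice.

```shell
# Hypothetical helper: returns 0 if a TCP connection to host $1, port $2
# can be opened within ${3:-5} seconds, and non-zero otherwise.
# Uses bash's /dev/tcp pseudo-device, so bash must be installed.
tcp_open() {
  timeout "${3:-5}" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}
```

For example: `tcp_open longhorn-admission-webhook.longhorn-system.svc 9443 && echo reachable`.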
Do you turn on the
Following the tutorial
could you provide the link?
is not running.
is inactive on the manager and active on the worker
Have you run the scripts in
section? And do you have Rancher installed in the cluster as well?
Yes, I ran the script and Rancher is installed. I will run the script again to see if that gives some indication.
Could you also show the logs of
$ bash 
[INFO]  Required dependencies 'kubectl jq mktemp' are installed.
[INFO]  Hostname uniqueness check is passed.
[INFO]  Waiting for longhorn-environment-check pods to become ready (0/2)...
[INFO]  All longhorn-environment-check pods are ready (2/2).
[INFO]  Required packages are installed.
[WARN]  multipathd is running on localhost.
[WARN]  multipathd is running on geo-node1.
[WARN]  multipathd would probably result in the Longhorn volume mount failure. Please refer to <> for more information.
[INFO]  MountPropagation is enabled.
[INFO]  Cleaning up longhorn-environment-check pods...
[INFO]  Cleanup completed.
And please help check the kubelet log. Thank you.
Having exactly the same error. Any solution?