# k3s
c
`manifests` is the right place for anything you want deployed to the cluster, regardless of what’s in the manifest
`static` is just a place to put files that you can then access from within the cluster; we use it to host the traefik tarball
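For reference, a sketch of where those two directories live on a default k3s server install (paths are the defaults; a custom `--data-dir` changes them):

```shell
# Files dropped here are auto-applied to the cluster by the deploy controller:
ls /var/lib/rancher/k3s/server/manifests

# Files here are just served over HTTP for in-cluster access, not deployed:
ls /var/lib/rancher/k3s/server/static
```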
w
ah ok, thanks @creamy-pencil-82913! Now, I did already put the manifest there, and it did the deploy; however (sadly) it is not working and actually seems to be affecting the whole cluster state… Removing the manifest file from there doesn’t stop the deployment of the chart/pod… How can I delete this chart install now?
(tried killing the “helm-install-…” pod and although that works, the system just spins another right up…)
c
what do the logs of the helm install pod say?
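A hedged sketch of how to pull those logs (the pod name is a placeholder; the helm-install job pods run in `kube-system` by default):

```shell
kubectl get pods -n kube-system | grep helm-install
kubectl logs -n kube-system <helm-install-pod-name>
```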
if it’s affecting your cluster, that suggests it’s getting deployed but perhaps isn’t configured correctly…
find your file with
kubectl get addon -A
and then delete it
that should also remove the HelmChart manifest that was in it
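Putting those steps together, something like this (the addon name and namespace here are assumptions based on the chart in question; check the `get` output for the real values):

```shell
# List all Addon objects across namespaces to find yours:
kubectl get addon -A

# Addons are namespaced, so the delete needs -n:
kubectl delete addon -n kube-system nvidia-device-plugin
```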
w
Pod is stuck in ContainerCreating, other k3s pods have status Unknown
It’s the nvidia-device-plugin chart, to enable GPU compute pods
c
oh. You don’t really need to do that.
w
am trying
kubectl delete addon nvidia-device-plugin
all you need to do is install the runtime packages on the node and restart k3s, it will add them to the containerd config automatically
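A minimal sketch of that, assuming a Debian/Ubuntu node with the NVIDIA package repo already configured (package names vary by distro and NVIDIA repo version):

```shell
# Install the NVIDIA container runtime packages on the GPU node:
sudo apt-get install -y nvidia-container-toolkit

# Restart k3s so it regenerates the containerd config with the nvidia runtime:
sudo systemctl restart k3s        # on an agent node: sudo systemctl restart k3s-agent
```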
w
but it’s giving an addons.k3s.cattle.io “nvidia-device-plug-in” not found error
c
did you specify a namespace?
w
ah right, thx
huh, even tho the addon is gone now, deleting the pod works, but it’s still spinning up a replacement…
c
you may need to delete the helmchart as well
w
how do I do that?
c
same way you deleted the addon
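That is, the same pattern against the HelmChart resource (the name and namespace are assumptions based on this thread; verify with the `get` first):

```shell
kubectl get helmchart -A
kubectl delete helmchart -n kube-system nvidia-device-plugin
```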
w
ok, figured it out, thx
Thx for the gpu-enablement instructions, a lot of conflicting/old info out there…
Sorry to keep bothering you, but… now I have a “helm-delete-…” pod stuck in ContainerCreating…
If I do a describe on the pod, it shows a FailedCreatePodSandbox event
c
yeah because the gpu operator hosed your containerd config
you might need to nuke this install and start over
w
OK, happily I saved the orig containerd config.toml, and replacing the modified one with the original got things back to (seemingly) normal…
The bottom of it has this:
that’s what I need?
c
yeah that adds another runtime
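For reference, the extra runtime block k3s generates typically looks something like this (a sketch; the exact plugin section paths depend on the containerd version k3s bundles, and `BinaryName` depends on where the packages installed the runtime):

```toml
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes."nvidia"]
  runtime_type = "io.containerd.runc.v2"

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes."nvidia".options]
  BinaryName = "/usr/bin/nvidia-container-runtime"
```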
w
@creamy-pencil-82913 Sweet, it works great according to your GitHub issue instructions… Really appreciate the assist! Cheers!
n
I really need to add this to the K3s docs
c
yeah both the nvidia stuff and some of the intricacies of the helm/deploy controller could use a bit better coverage
w
oh yes, the Helm stuff too… since k3s doesn’t bundle a “helm” command, better docs there would be super helpful