# k3s
c
`manifests` is the right place for anything you want deployed to the cluster, regardless of what’s in the manifest
`static` is just a place to put files that you can then access from within the cluster; we use it to host the traefik tarball
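For reference, a sketch of where those two directories live on a default k3s server install (paths are the defaults; a custom `--data-dir` changes them):

```shell
# Files dropped here are auto-applied to the cluster by the deploy controller:
ls /var/lib/rancher/k3s/server/manifests

# Files here are just served over HTTP for in-cluster access, not deployed:
ls /var/lib/rancher/k3s/server/static
```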
w
ah ok, thanks @creamy-pencil-82913! Now, I did already put the manifest there, and it did the deploy; however (sadly) it is not working and actually seems to be affecting the whole cluster state… Removing the manifest file from there doesn’t stop the deployment of the chart/pod… How can I delete this chart install now?
(tried killing the “helm-install-…” pod and although that works, the system just spins another right up…)
c
what do the logs of the helm install pod say?
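A hedged sketch of how to pull those logs (the pod name is a placeholder; the helm-install job pods run in `kube-system` by default):

```shell
kubectl get pods -n kube-system | grep helm-install
kubectl logs -n kube-system <helm-install-pod-name>
```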
if it’s affecting your cluster, that suggests it’s getting deployed but perhaps isn’t configured correctly…
find your file with
kubectl get addon -A
and then delete it
that should also remove the HelmChart manifest that was in it
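Putting those steps together, something like this (the addon name and namespace here are assumptions based on the chart in question; check the `get` output for the real values):

```shell
# List all Addon objects across namespaces to find yours:
kubectl get addon -A

# Addons are namespaced, so the delete needs -n:
kubectl delete addon -n kube-system nvidia-device-plugin
```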
w
Pod is stuck in ContainerCreating, other k3s pods have status Unknown
It’s the nvidia-device-plugin chart, to enable GPU compute pods
c
oh. You don’t really need to do that.
w
am trying
kubectl delete addon nvidia-device-plugin
all you need to do is install the runtime packages on the node and restart k3s, it will add them to the containerd config automatically
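A minimal sketch of that, assuming a Debian/Ubuntu node with the NVIDIA package repo already configured (package names vary by distro and NVIDIA repo version):

```shell
# Install the NVIDIA container runtime packages on the GPU node:
sudo apt-get install -y nvidia-container-toolkit

# Restart k3s so it regenerates the containerd config with the nvidia runtime:
sudo systemctl restart k3s        # on an agent node: sudo systemctl restart k3s-agent
```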
w
but it’s giving an addons.k3s.cattle.io “nvidia-device-plug-in” not found error
c
did you specify a namespace?
w
ah right, thx
huh, even tho the addon is gone now, deleting the pod works, but it’s still spinning up a replacement…
c
you may need to delete the helmchart as well
w
how do I do that?
c
same way you deleted the addon
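That is, the same pattern against the HelmChart resource (the name and namespace are assumptions based on this thread; verify with the `get` first):

```shell
kubectl get helmchart -A
kubectl delete helmchart -n kube-system nvidia-device-plugin
```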
w
ok, figured it out, thx
Thx for the gpu-enablement instructions, a lot of conflicting/old info out there…
Sorry to keep bothering you, but… now I have a “helm-delete-…” pod stuck in ContainerCreating…
If I do a describe on the pod, it shows a FailedCreatePodSandbox event
c
yeah because the gpu operator hosed your containerd config
you might need to nuke this install and start over
w
OK, happily I saved the orig containerd config.toml, and replacing the modified one with the original got things back to (seemingly) normal…
The bottom of it has this:
that’s what I need?
c
yeah that adds another runtime
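For reference, the extra runtime block k3s generates typically looks something like this (a sketch; the exact plugin section paths depend on the containerd version k3s bundles, and `BinaryName` depends on where the packages installed the runtime):

```toml
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes."nvidia"]
  runtime_type = "io.containerd.runc.v2"

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes."nvidia".options]
  BinaryName = "/usr/bin/nvidia-container-runtime"
```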
w
@creamy-pencil-82913 Sweet, it works great according to your GitHub issue instructions… Really appreciate the assist! Cheers!
n
I really need to add this to the K3s docs
c
yeah both the nvidia stuff and some of the intricacies of the helm/deploy controller could use a bit better coverage
w
oh yes, the Helm stuff too… since k3s doesn’t bundle a “helm” command, better docs there would be super helpful