# rancher-desktop
w
yeah it will be a bit more complicated, as i believe CUDA is built against glibc, and Alpine, which I thought is what RD uses for its container host, is musl-based. nvidia has resisted releasing a musl option, and jamming glibc into the host OS may cause other issues.
https://learn.microsoft.com/en-us/windows/ai/directml/gpu-cuda-in-wsl has more information on prereqs for running CUDA specifically.
q
One quick doubt: RD runs on K3s, and K3s is able to use the GPU capability. Is there any difference between the two?
w
for your specific needs i would say stick with Docker Desktop, as it would be cheaper than moving to ROCm, and i wouldn't hold your breath on nvidia supporting Alpine in the near term. Adding a Debian option to RD would also take a bit of time, so the easiest path if you need CUDA would be to stick with DD.
it would be the same container host, so are you sure it's actually hitting CUDA?
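easiest way to tell is to poke at the GPU from inside the container; a rough sketch (assumes the image has Python, and the torch part only applies if PyTorch happens to be installed):

```python
# Quick check of whether a CUDA-capable GPU is actually visible inside the
# container, instead of the workload silently falling back to CPU.
import shutil
import subprocess

if shutil.which("nvidia-smi"):
    # nvidia-smi lists the GPUs the container can see (if the runtime passed any through)
    subprocess.run(["nvidia-smi"], check=False)
else:
    print("nvidia-smi not found in this container")

try:
    import torch
    print("torch.cuda.is_available():", torch.cuda.is_available())
except ImportError:
    print("PyTorch not installed; skipping the CUDA runtime check")
```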
q
We have tested K3s on an edge server, not locally. RD we tried on a local machine that has a GPU.
w
yeah k3s is just an orchestrator?
the container host needs to support the kernel drivers, which is the problem. you have musl libc support and not glibc, and nvidia's drivers depend on glibc from what i know.
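if you want to confirm which libc the container host is built on, a quick sketch (gnu_get_libc_version() is glibc-only, so an AttributeError is a strong hint you're on musl):

```python
# Detect whether the running system's libc is glibc or something else (e.g. musl).
# gnu_get_libc_version() is a glibc-only symbol, so its absence suggests musl.
import ctypes

libc = ctypes.CDLL(None)  # load the C library already linked into this process
try:
    get_version = libc.gnu_get_libc_version
    get_version.restype = ctypes.c_char_p
    print("glibc", get_version().decode())
except AttributeError:
    print("not glibc -- likely musl (e.g. Alpine)")
```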
q
Okay, need to check on that.
Thank you @wide-mechanic-33041
w
you might be able to hack something in, but it will be super fragile, and performance and functionality will be different. I say "there be dragons", so keep DD around for these users: https://wiki.alpinelinux.org/wiki/NVIDIA
q
can test on one machine, need to check on it. Will the Open WebUI extension for RD also have the same issue?
The hack appears to be challenging since it involves working with drivers.
w
well, open webui is just an interface, right? what are you using for inference?
q
well, the team was checking via the Rancher Desktop UI.. other than that, VS Code to run the container that already has the GPU enabled
w
no, RD would be your orchestrator. what are you using to host the actual models used for inference?
many support CPU-based inference, which should work perfectly fine. AVX2 isn't a barn burner, but you can get OK prompt-processing (pp) and token-generation (tg) throughput
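for example, something like llama-cpp-python runs entirely on CPU; a rough sketch (assumes the package is installed and a GGUF model sits at the placeholder path):

```python
# Minimal CPU-only inference sketch with llama-cpp-python.
# "model.gguf" is a placeholder path -- point it at whatever quantized model you use.
from llama_cpp import Llama

llm = Llama(model_path="model.gguf", n_threads=8)  # CPU threads; no GPU offload requested
out = llm("Explain what musl libc is in one sentence.", max_tokens=64)
print(out["choices"][0]["text"])
```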
i know webui tends to recommend ollama given its adaptability, but it can use any OpenAI-compatible server
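so anything that speaks that API works; for instance (a sketch assuming Ollama is serving its OpenAI-compatible endpoint on the default port 11434 and a model named "llama3" has been pulled):

```python
# Talk to any OpenAI-compatible inference server; here pointed at a local Ollama.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # Ollama's OpenAI-compatible endpoint (default port assumed)
    api_key="ollama",  # Ollama ignores the key, but the client requires a non-empty string
)

resp = client.chat.completions.create(
    model="llama3",  # placeholder model name; use whatever you have pulled
    messages=[{"role": "user", "content": "Say hello"}],
)
print(resp.choices[0].message.content)
```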
q
I need to check with the DS team on this part.. the update I got is that it is a local desktop with a GPU. The interface part I will have to check with them.
w
well RD won't be able to use a local nvidia GPU because of the lack of musl-compatible drivers; that just isn't an option. so you will need to look at other options, like sticking with Docker Desktop where the moby VM is glibc, getting an AMD card which should support musl linking, or using CPU inference
technically you could also use external inference, but it sounded like you wanted this all to be local
f
The Open WebUI extension will be included in Rancher Desktop 1.17. It will run Ollama on the host, to get access to the GPU, and the rest of the app (webui, search) inside containers.
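For reference, a container-side piece could reach a host-side Ollama over its normal HTTP API roughly like this (a sketch assuming Ollama's default port 11434 and that the container can reach the host as host.docker.internal; the extension may wire this up differently):

```python
# Sketch of a container-side call to an Ollama instance running on the host.
# host.docker.internal is a common Docker convention for "the host"; the actual
# hostname depends on the container runtime's network setup.
import json
import urllib.request

req = urllib.request.Request(
    "http://host.docker.internal:11434/api/generate",  # Ollama's native generate endpoint
    data=json.dumps({
        "model": "llama3",   # placeholder; whichever model the host has pulled
        "prompt": "Say hello",
        "stream": False,     # ask for a single JSON response instead of a stream
    }).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```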