# rancher-setup
a
This message was deleted.
πŸ‘ 1
πŸ‘€ 1
n
This didn't fully work. It failed on the driver install, so there must be more to it. I tried both the default and the latest drivers, and neither enabled time-sharing via the driver.
For now, I create the primary node pool via Rancher, then go into GKE, use Add Node Pool with the GPU resources, and set the driver to latest there.
👀 1
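For anyone who would rather script the GKE side of that workaround than click through Add Node Pool, here is a rough sketch that builds the matching `gcloud container node-pools create` command. The cluster name, pool name, zone, GPU type, and machine type are all placeholders (assumptions, not values from this thread); only the time-sharing strategy, the 20 shared clients, and the latest driver come from the discussion here.

```python
import shlex


def gpu_node_pool_command(cluster, pool, zone, gpu_type="nvidia-l4",
                          gpu_count=1, max_clients=20,
                          machine_type="g2-standard-4"):
    """Build the argv list for creating a time-shared GPU node pool in GKE.

    All default values are illustrative assumptions; adjust them to your
    own cluster before running the command.
    """
    # The --accelerator flag carries the GPU type, the sharing settings,
    # and the driver version in one comma-separated value.
    accelerator = ",".join([
        f"type={gpu_type}",
        f"count={gpu_count}",
        "gpu-sharing-strategy=time-sharing",
        f"max-shared-clients-per-gpu={max_clients}",
        "gpu-driver-version=latest",
    ])
    return [
        "gcloud", "container", "node-pools", "create", pool,
        f"--cluster={cluster}",
        f"--zone={zone}",
        f"--machine-type={machine_type}",
        f"--accelerator={accelerator}",
    ]


# Placeholder names; print the command so it can be reviewed before running.
cmd = gpu_node_pool_command("my-cluster", "gpu-pool", "us-central1-a")
print(shlex.join(cmd))
```

The script only prints the command. Review it, then paste it into a shell where `gcloud` is authenticated against the right project.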
m
I will be tackling this next with an HP DL380a node w/ 8 NVIDIA L4 GPUs in my lab. Thanks for blazing the path. 🙏
n
@microscopic-diamond-41007 I never fully got the labels to work. There is a file, nvidia_config, that is supposed to get written during the driver install, based on these settings:
cloud.google.com/gke-gpu-sharing-strategy=time-sharing
cloud.google.com/gke-max-shared-clients-per-gpu=20
It looks like the labels are not the only thing needed, because those entries are missing from nvidia_config when the nodes are deployed via Rancher. So what I did was deploy my primary nodes, which don't have a GPU, via Rancher, and then deploy the GPU nodes from GCP. Those GPU nodes get added to the cluster, and you can in fact control them from Rancher after that.
🙌 1
👍 1