# harvester
w
Does anyone by chance have an example or walkthrough from creating a VM to using that VM snapshot with the harvester driver in Rancher? My nodes are being created as VMs in harvester, but they never connect. I believe I'm missing some instructions related to cloud-init or other setup for the nodes themselves.
b
If it never connects it sounds like the VM network isn't working.
I'd make a normal VM first and verify it can connect back to your Rancher instance.
w
VM has access to the RMS, but no luck
b
I don't know what you mean by that.
w
rancher management server
I connected Harvester to Rancher
I created a new cluster using the harvester driver (created credential, etc)
the cluster created, and my nodes created in harvester as VMs
if I shell into those VMs, they can curl Rancher
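The "they can curl Rancher" check can be made repeatable from inside a node VM with something like the sketch below. It assumes Rancher's /ping health endpoint, which answers with pong. The function name check_rancher and the URL rancher.example.com are placeholders, not names from this thread. TLS verification is left on, so a certificate problem shows up as a failure rather than being skipped.

```shell
#!/bin/sh
# check_rancher URL: report whether the Rancher server at URL answers /ping.
# Assumption: the server exposes /ping and replies with the literal text "pong".
check_rancher() {
  # -f: fail on HTTP errors, -sS: quiet but still print errors, 10 s timeout
  reply=$(curl -fsS --max-time 10 "$1/ping") || {
    echo "unreachable: $1"
    return 1
  }
  if [ "$reply" = "pong" ]; then
    echo "ok: $1"
  else
    echo "unexpected reply from $1: $reply"
    return 1
  fi
}

# Usage on the node (placeholder URL):
#   check_rancher https://rancher.example.com
```

A passing check only shows that the network path and certificate are fine; it says nothing about whether the provisioning agent on the node ever started.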
I don't know that anything is actually executing, or if there's something being installed, or what to debug. It's just a bit of a black hole right now since I can't find documentation for that part of the process
it's presented as though it should just work
b
Your SSH user is correct? It should connect back to the rancher address automatically. Are you using a DNS name or an IP?
w
ssh user is correct
dns name with valid certificates (but that configuration seems automagic as part of the cluster setup in Rancher anyways)
I did not adjust the Add-On config, not sure if I'm supposed to
the cloud-init has qemu-guest-agent and everything else it looks like it's supposed to have
b
What do the provisioning logs and events say?
w
harvester or rancher side (or both)?
b
Rancher
under cluster management
w
[INFO ] waiting for infrastructure ready
[INFO ] waiting for viable init node
[INFO ] waiting for all etcd machines to be deleted
[INFO ] waiting for at least one control plane, etcd, and worker node to be registered
[INFO ] waiting for viable init node
b
check under the local cluster for pods that might be crashlooping
I think it's in one of the fleet namespaces
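The same check can be done from a terminal instead of the UI. This is a minimal sketch, assuming kubectl is pointed at the local (Rancher management) cluster. The helper name not_running is made up for illustration. It relies on the column order that kubectl prints with --all-namespaces --no-headers, where the fourth column is the pod status.

```shell
#!/bin/sh
# not_running: read `kubectl get pods --all-namespaces --no-headers` output on
# stdin and keep only rows whose STATUS column (field 4) is neither Running
# nor Completed, e.g. CrashLoopBackOff, Error, Pending.
not_running() {
  awk '$4 != "Running" && $4 != "Completed"'
}

# Usage against the local cluster:
#   kubectl get pods --all-namespaces --no-headers | not_running
```

Empty output means every pod in every namespace is Running or Completed, which matches what the UI filter for all namespaces should show.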
w
looking
all pods and deployments look stable
b
Just making sure the filter on top is for all namespaces and not just user namespaces?
w
correct, all namespaces
I tried selecting all of the fleet-related namespaces just to be certain too
b
You never get anything that says configuring bootstrap node?
w
i'm not certain
I can create another cluster and watch for it
also, just noting, I don't see anything out of sorts in the embedded Rancher dashboard in the harvester cluster
(i'm also happy to screenshare btw)
do I need to change anything here?
b
You need iptables if it's not in your cloud image
w
I set up an ubuntu image and went through the installer, then created an image from a snapshot
do I need to run through all of this? or should that get taken care of through the rancher-harvester integration? https://docs.harvesterhci.io/v1.3/rancher/cloud-provider#deploying-to-the-k3s-cluster-with-harvester-node-driver-experimental
b
You should be able to add iptables in the cloud-init between lines 4 and 5, below the qemu-guest-agent line.
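For reference, the resulting user data might look like the sketch below. The surrounding lines are an assumption reconstructed from the line numbers mentioned here (a cloud-config whose fourth line is the qemu-guest-agent package entry); only the iptables entry is the change being suggested. The script writes the fragment to a scratch file (the file name is arbitrary) so it can be pasted into the user data field.

```shell
#!/bin/sh
# Write an example cloud-config to a scratch file and print it.
# Assumption: everything except the iptables line stands in for the existing
# template; compare against what your own cluster form actually shows.
USER_DATA_FILE="${USER_DATA_FILE:-./user-data.yaml}"

cat > "$USER_DATA_FILE" <<'EOF'
#cloud-config
package_update: true
packages:
  - qemu-guest-agent
  - iptables
runcmd:
  - - systemctl
    - enable
    - --now
    - qemu-guest-agent.service
EOF

cat "$USER_DATA_FILE"
```

Installing the package through cloud-init needs the VM to reach a package repository on first boot; baking iptables into the image avoids that dependency.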
w
👍
is there a particular place on the node (VM) that I should check for logs?
b
I'd also suggest setting the container network to canal, which is the usual default but not what's actually preselected when you set up a Harvester provider cluster.
you can follow journalctl
If that's what ubuntu uses.
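A rough sketch of what checking the journal on a node could look like follows. The unit names are assumptions about what gets installed during provisioning (rancher-system-agent, then rke2-server or k3s depending on the distribution chosen), and the helper name show_unit_log is made up. It prints the last 50 lines instead of following, so it returns immediately.

```shell
#!/bin/sh
# show_unit_log UNIT: print the last 50 journal lines for a systemd unit,
# or say so if journalctl is not available on this image.
show_unit_log() {
  if command -v journalctl >/dev/null 2>&1; then
    journalctl -u "$1" --no-pager -n 50
  else
    echo "journalctl not found; skipping $1"
  fi
}

# Usage on the node (unit names are assumptions, adjust to your setup):
#   show_unit_log rancher-system-agent
#   show_unit_log rke2-server      # or: show_unit_log k3s
# cloud-init keeps its own log outside the journal:
#   sudo tail -n 50 /var/log/cloud-init-output.log
```

If the first unit does not exist at all, the install script delivered through user data most likely never ran, which points back at cloud-init rather than at Rancher.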
I stick to suse images because I figure there's probably better test coverage with that than with other distros.
w
I typically do too, just happened to use ubuntu this time
couldn't get sle-micro to work either
b
If you didn't add iptables in the image then that might be why
w
do you run through the installer first, or do you just use one of the images with only the cloud-init?
b
It depends on the installer
You have to have the cloud-init packages to make it cloud based image
Most distros have cloud images you can just download/import
no need to run through an installer
w
which sle-micro base image do you start with?
b
w
NoCloud?
do you add a user to the cloud init then specify the same user as the SSH User?
some progress
log output with NoCloud image:
Downloading driver from <https://dnac.krum.io/assets/docker-machine-driver-harvester>
2024-06-06T23:53:54.669943266Z Doing /etc/rancher/ssl
2024-06-06T23:53:57.931343509Z docker-machine-driver-harvester
2024-06-06T23:53:57.931370169Z docker-machine-driver-harvester: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, stripped
2024-06-06T23:53:58.079787051Z Trying to access option  which does not exist
2024-06-06T23:53:58.079800211Z THIS ***WILL*** CAUSE UNEXPECTED BEHAVIOR
2024-06-06T23:53:58.080971221Z Type assertion did not go smoothly to string for key 
2024-06-06T23:53:58.089574310Z Running pre-create checks...
2024-06-06T23:53:58.286226953Z Creating machine...
Waiting for machine to be running, this may take a few minutes...
Custom install script was sent via userdata, provisioning complete...
output with Ubuntu:
Downloading driver from <https://dnac.krum.io/assets/docker-machine-driver-harvester>
2024-06-06T23:53:30.559586420Z Doing /etc/rancher/ssl
2024-06-06T23:53:33.979308689Z docker-machine-driver-harvester
2024-06-06T23:53:33.979325820Z docker-machine-driver-harvester: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, stripped
2024-06-06T23:53:34.106963507Z Trying to access option  which does not exist
2024-06-06T23:53:34.106979658Z THIS ***WILL*** CAUSE UNEXPECTED BEHAVIOR
2024-06-06T23:53:34.106985588Z Type assertion did not go smoothly to string for key 
2024-06-06T23:53:34.112327654Z Running pre-create checks...
2024-06-06T23:53:34.304019166Z Creating machine...
bootstrap appears to be running on leap
🎉
so what in the world am I missing on the ubuntu side? 😭
thanks again for the help
r
Try migrating the servers while it's stuck at "waiting". Rancher can't connect to a node that has the VIP attached to it.