# rancher-desktop
c
Hey Semih, how about enabling qemu-guest-agent on the machine where Rancher Desktop is running? I saw a comment suggesting that in the issue you linked
b
I saw that as well, but I'm not sure how I should set this up on macOS 😕
Is it something that always runs behind the scenes?
c
Is it running directly on your macOS laptop, or on a VM on top of your laptop?
b
How should I install and enable this? Via Docker or a package manager?
c
If you’re running Rancher Desktop on a VM, then yes, you should install it there if that applies. If RD is installed directly on your macOS laptop, then there is no need; it should be something else
b
RD is running on my macOS laptop
So will it work if I pull the image and run the container on my macOS?
c
It should. That would be a good test to see whether it’s a Rancher Desktop issue or a system-wide issue
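For a rough sanity check you could compare the host clock against a container clock directly, e.g. (assuming the docker CLI that ships with RD):
date -u; docker run --rm alpine date -u
If those two are seconds apart, the VM clock itself is off; if they match, the skew is coming from somewhere else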
b
Which Docker image should I use for this 😕
I am using an M1 Mac btw 😕
c
What image do you want to use? If you’re referring to qemu, it doesn’t apply on macOS, so it should be something else
b
OK, I am confused 😕
So to fix the clock skew I need to enable the qemu agent on my macOS, right?
c
No, the qemu agent should only be enabled if you’re running RD on a VM, not directly on your macOS laptop
b
I see. So what should I do in this case 😕
c
How did you detect the time skew? Is it a one-second skew or something longer?
b
After a couple of hours the spans are getting weird durations, which shouldn’t be possible
If I reset the cluster and redeploy everything, it works fine at the beginning, but later it breaks
c
Hmm, that’s weird, it hasn’t happened to me yet. Can you confirm the RD and k8s versions?
b
Did you have a similar setup like this locally before?
Rancher Desktop 1.5.1
kubectl version
Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.3", GitCommit:"816c97ab8cff8a1c72eccca1026f7820e93e0d25", GitTreeState:"clean", BuildDate:"2022-01-25T21:25:17Z", GoVersion:"go1.17.6", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.4+k3s1", GitCommit:"c3f830e9b9ed8a4d9d0e2aa663b4591b923a296e", GitTreeState:"clean", BuildDate:"2022-08-25T03:46:35Z", GoVersion:"go1.18.1", Compiler:"gc", Platform:"linux/arm64"}
c
I don’t have anything like that running right now with so many proxies around
Can you go to the logs and post some of them here?
b
I’ll try. Which ones do you need exactly?
c
k3s, <machineName>_serial, <machineName>.ha.stdout
Maybe there is something relevant there
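If I remember right, on macOS they end up under ~/Library/Logs/rancher-desktop by default, so something like this should list them:
ls ~/Library/Logs/rancher-desktop/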
b
It gets worse during macOS sleep, btw
Probably when I wake up tomorrow and try the same thing, I won’t even be able to see any spans
c
Go to the Rancher Desktop preferences, under Virtual Machine; maybe you need more resources there to prevent skews
By default it uses 4 GB and 2 CPUs; depending on how loaded your cluster is, maybe it’s a lack-of-resources issue
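You can also double-check what the VM actually got with something like this (assuming the usual busybox applets are in the VM):
rdctl shell nproc
rdctl shell free -m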
b
It is already at 24 GB and 8 CPUs
c
It’s odd, I’ll look into it tomorrow, thanks Semih
🙌 1
f
If you are on an M1 Mac, you need macOS 12.4+ to be able to use more than 3 GiB. If you are on an older release, then Lima will automatically reduce the memory to 3 GiB to avoid a macOS kernel bug with qemu 7+.
If you see more than a couple of seconds of time drift, please open a new bug for it. Lima tries to sync the VM clock back to the host clock. It uses a somewhat crude mechanism, so it may drift by a few seconds, but that should always be corrected within a minute or so. There may be larger drift directly after waking from a sleep state, but it should always catch up quickly.
b
@fast-garage-66093 it’s already on macOS Monterey 12.5.1
f
Ok, then it should be able to use the 24GB you have configured for it
b
During this time, for instance, I created another trace; as you can see in the attachment, it took 54m 34s, which is impossible 😕
And even when I trigger a new one, it shows as 1 hour ago
f
I can't tell where these times come from. Can you run this command to check for time drift between host and VM:
$ TZ=UTC date;rdctl shell date
Wed 21 Sep 2022 22:00:21 UTC
Wed Sep 21 22:00:20 UTC 2022
b
host is 00:01 now
f
Yes, but you are probably in a different timezone than UTC
b
yeah UTC+2
f
Your screenshot shows that the VM and host have the same time within 1s
b
Sorry GMT+2
f
So whatever you see in time drift from Prom has to be due to other issues
If I had to guess I would say it is using ICMP to measure something, and that doesn't work properly within qemu on macOS
b
So what should I do in this case?
I am also pretty sure it’s not related to Prom
f
idk, I have no idea how Tempo is generating its span metrics
I would go to the Grafana Slack and see if you can find some help there
b
I asked already
f
Or at least an explanation on how the metrics are generated
b
They said it’s probably a clock skew issue
I might test this without RD to see how it goes
f
Yeah, but I've shown you that the clocks are pretty much in sync
It may very well be something about the VM, but without knowing what Tempo does, it is hard to figure out
I have never used Tempo, so I don't know how it works. It looks like it ingests traces, so the first step is to understand how these traces are generated, and if they have the correct timing information.
b
let me ask
Tempo uses the timestamps provided in the spans it ingests. For example, these are the fields when using OTel proto
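In the OTel trace proto these should be the per-span fields, if I’m reading it right (both nanoseconds since the Unix epoch):
start_time_unix_nano
end_time_unix_nano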
f
Yeah, but what/where is the call that collects the timestamps? If you can show a Linux command or small C/Go program that demonstrates that the VM or container clocks are significantly out of sync with the host clock, then please file a bug. But we won't have time to debug the tempo app or trace collector ourselves; we need a small repro case that demonstrates the issue directly.
Lima has "Add hwclock sync loop to the guestagent" (lima-vm/lima#490) to keep the clocks in sync, but maybe it doesn't always work.
This is how I checked if the RTC and regular clock are in sync:
$ rdctl shell sudo -i sh -c 'date;hwclock'
Thu Sep 22 16:13:22 UTC 2022
Thu Sep 22 16:13:23 2022  0.000000 seconds
b
rdctl shell sudo -i sh -c 'date;hwclock'
Thu Sep 22 16:23:38 UTC 2022
Thu Sep 22 16:23:39 2022  0.000000 seconds
f
Yeah, they seem to match pretty closely
b
seems so 😕
f
I've been trying to run hwclock inside a container, but neither alpine nor ubuntu seems to have an RTC device configured
Anyways, this is just poking at things in the dark; we need to know how the timestamps in the traces are retrieved to look into this further
b
I don’t know what I should do, to be honest
f
You will have to go back to the team that produces the traces and find out how they capture the timestamps. Since they use nanosecond resolution, they aren't using the regular time call, which has only full-second granularity.
b
I am the one who does 🙂 but in the instrumentation there is nothing related to timestamps
f
Once you know which API they use, you can create a small test program to query the time yourself, to see if there is any clock drift relative to the system or hardware clocks. If there is, then we can take a look. But if there isn't, then I would claim it is a bug in Tempo...
There must be some code that fills in the start and end times in the trace packets; you need to find where that is done, and how.
Or you could start writing sample programs that use 64-bit time interfaces and call them, to see if you see any problems there.
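As a rough sketch of such a repro (assuming python3 is available on the Mac, e.g. via the Xcode command line tools), you could compare the host clock against a container clock at nanosecond resolution:
# host clock, nanoseconds since the Unix epoch
python3 -c 'import time; print("host:", time.time_ns())'
# container clock (containers share the VM clock); container startup adds noise,
# so only a drift well beyond a second or two is meaningful here
docker run --rm python:3-alpine python3 -c 'import time; print("vm:  ", time.time_ns())'
If those stay within a second or so of each other, even after a sleep/wake cycle, the bad timestamps are probably being produced by the instrumentation rather than by the VM clock.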