# k3s
m
There are some flags to adjust logging: https://docs.k3s.io/cli/server?_highlight=logs#logging
q
I'm not setting any of these flags, and the defaults don't look like they would enable trace-level logging. 🤔
I have now tried explicitly setting `debug: false` and `v: 1`, but I still have loads of Trace logging from both server and agent. Is this something that only happens in my cluster?
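For context, the `v` setting maps onto the klog verbosity flag used by the embedded Kubernetes components. A minimal sketch of what that level gates, assuming `k8s.io/klog/v2`; the messages are placeholders:

```go
package main

import (
	"flag"

	"k8s.io/klog/v2"
)

func main() {
	// Register klog's flags and set the same verbosity as `v: 1` in the k3s config.
	klog.InitFlags(nil)
	flag.Set("v", "1")
	defer klog.Flush()

	// V-gated debug logging: only emitted when -v is at least this level.
	klog.V(4).Info("suppressed at -v=1")

	// Plain Info logging (the 'I' lines in the journal, which is how the
	// apiserver's request traces are written) is emitted regardless of -v.
	klog.Info("always emitted")
}
```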
c
There are some trace messages that Kubernetes will always emit if the correct conditions occur, mostly around high latency requests. What specifically are you seeing?
q
Some other issues caused me to scrap the cluster and recreate it, so I don't have the logs available anymore. There were a lot of them, but I guess they could have been high-latency requests. 🤔
How high is "high latency" in this context, btw? In other words, how sensitive is k3s to latency towards the datastore? My database is not under any kind of stress in terms of CPU, memory or storage, but network-wise it is somewhat distant from the cluster itself, so network latency might be considerable. However, it has worked fine for quite some time with the current setup.
c
Without knowing what specific messages you were getting I can't say much. It's not k3s itself so much as etcd and the apiserver that have latency thresholds that will trigger warnings.
q
I've started setting up a new cluster from scratch, beginning with two Raspberry Pi 4Bs running as servers and an external Postgres database. Just those two nodes so far, and I'm already seeing lots of Trace logging. The database is a managed Postgres running in DigitalOcean Amsterdam (ams), while the servers are in my basement in Oslo, Norway. I'm guessing the distance is causing problems, although the same setup has been working fine for several months. Example trace logs:
```
2024-03-02T22:28:28.848135+01:00 rpi4b01 k3s[72637]: I0302 22:28:28.847506   72637 trace.go:236] Trace[656851478]: "Update" accept:application/vnd.kubernetes.protobuf, */*,audit-id:04915b51-4239-48bb-ad57-434de8cb0469,client:127.0.0.1,protocol:HTTP/2.0,resource:customresourcedefinitions,scope:resource,url:/apis/apiextensions.k8s.io/v1/customresourcedefinitions/certificates.cert-manager.io/status,user-agent:k3s/v1.28.7+k3s1 (linux/arm64) kubernetes/051b14b,verb:PUT (02-Mar-2024 22:28:27.877) (total time: 969ms):
2024-03-02T22:28:28.848534+01:00 rpi4b01 k3s[72637]: Trace[656851478]: ["GuaranteedUpdate etcd3" audit-id:04915b51-4239-48bb-ad57-434de8cb0469,key:/apiextensions.k8s.io/customresourcedefinitions/certificates.cert-manager.io,type:*apiextensions.CustomResourceDefinition,resource:customresourcedefinitions.apiextensions.k8s.io 957ms (22:28:27.890)]
2024-03-02T22:28:28.848943+01:00 rpi4b01 k3s[72637]: Trace[656851478]: ---"Write to database call succeeded" len:26009 338ms (22:28:28.787)
2024-03-02T22:28:28.849137+01:00 rpi4b01 k3s[72637]: Trace[656851478]: ---"Writing http response done" 60ms (22:28:28.847)
2024-03-02T22:28:28.849314+01:00 rpi4b01 k3s[72637]: Trace[656851478]: [969.799324ms] [969.799324ms] END

2024-03-02T22:28:28.951518+01:00 rpi4b01 k3s[72637]: I0302 22:28:28.951150   72637 trace.go:236] Trace[1148973905]: "Update" accept:application/vnd.kubernetes.protobuf, */*,audit-id:167cc613-4677-4680-97fd-bbbe655addb0,client:127.0.0.1,protocol:HTTP/2.0,resource:customresourcedefinitions,scope:resource,url:/apis/apiextensions.k8s.io/v1/customresourcedefinitions/challenges.acme.cert-manager.io/status,user-agent:k3s/v1.28.7+k3s1 (linux/arm64) kubernetes/051b14b,verb:PUT (02-Mar-2024 22:28:28.181) (total time: 769ms):
2024-03-02T22:28:28.951877+01:00 rpi4b01 k3s[72637]: Trace[1148973905]: ---"Conversion done" 16ms (22:28:28.209)
2024-03-02T22:28:28.952094+01:00 rpi4b01 k3s[72637]: Trace[1148973905]: ---"Write to database call succeeded" len:67789 366ms (22:28:28.942)
2024-03-02T22:28:28.952286+01:00 rpi4b01 k3s[72637]: Trace[1148973905]: [769.790591ms] [769.790591ms] END

2024-03-02T22:28:29.087303+01:00 rpi4b01 k3s[72637]: I0302 22:28:29.086774   72637 trace.go:236] Trace[1948471463]: "List(recursive=true) etcd3" audit-id:,key:/cert-manager.io/clusterissuers,resourceVersion:,resourceVersionMatch:,limit:0,continue: (02-Mar-2024 22:28:28.551) (total time: 535ms):
2024-03-02T22:28:29.087704+01:00 rpi4b01 k3s[72637]: Trace[1948471463]: [535.050934ms] [535.050934ms] END

2024-03-02T22:28:29.161524+01:00 rpi4b01 k3s[72637]: I0302 22:28:29.160869   72637 trace.go:236] Trace[558258468]: "Update" accept:application/vnd.kubernetes.protobuf, */*,audit-id:27286bea-d669-4f35-9987-a5c04897dd61,client:127.0.0.1,protocol:HTTP/2.0,resource:customresourcedefinitions,scope:resource,url:/apis/apiextensions.k8s.io/v1/customresourcedefinitions/issuers.cert-manager.io/status,user-agent:k3s/v1.28.7+k3s1 (linux/arm64) kubernetes/051b14b,verb:PUT (02-Mar-2024 22:28:28.575) (total time: 585ms):
2024-03-02T22:28:29.161909+01:00 rpi4b01 k3s[72637]: Trace[558258468]: ---"Conversion done" 23ms (22:28:28.611)
2024-03-02T22:28:29.163682+01:00 rpi4b01 k3s[72637]: Trace[558258468]: ["GuaranteedUpdate etcd3" audit-id:27286bea-d669-4f35-9987-a5c04897dd61,key:/apiextensions.k8s.io/customresourcedefinitions/issuers.cert-manager.io,type:*apiextensions.CustomResourceDefinition,resource:customresourcedefinitions.apiextensions.k8s.io 546ms (22:28:28.613)
2024-03-02T22:28:29.164000+01:00 rpi4b01 k3s[72637]: Trace[558258468]:  ---"About to Encode" 293ms (22:28:28.921)
2024-03-02T22:28:29.164224+01:00 rpi4b01 k3s[72637]: Trace[558258468]:  ---"Txn call completed" 179ms (22:28:29.118)]
2024-03-02T22:28:29.164467+01:00 rpi4b01 k3s[72637]: Trace[558258468]: [585.053044ms] [585.053044ms] END
```
c
Yep, those are from the apiserver. If the total request time is over 500ms, you get a trace message breaking down the processing phases so you can figure out what was slow.
That happens regardless of the log level.
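For reference, these lines come from the apiserver's trace helper, `k8s.io/utils/trace` (the `trace.go` entries in the logs above). A minimal sketch of how it behaves; the handler, step names and sleeps below are invented for illustration:

```go
package main

import (
	"time"

	utiltrace "k8s.io/utils/trace"
)

// fakeUpdate stands in for an apiserver request handler; the sleeps are
// placeholders for real work such as the datastore write.
func fakeUpdate() {
	trace := utiltrace.New("Update",
		utiltrace.Field{Key: "resource", Value: "customresourcedefinitions"})
	// Only emit the trace if the whole request took longer than the
	// threshold (the apiserver uses 500ms). The output is plain Info
	// logging, so it is not suppressed by a low -v level.
	defer trace.LogIfLong(500 * time.Millisecond)

	time.Sleep(600 * time.Millisecond) // e.g. a slow datastore round trip
	trace.Step("Write to database call succeeded")

	time.Sleep(50 * time.Millisecond) // e.g. writing the response
	trace.Step("Writing http response done")
}

func main() {
	fakeUpdate()
}
```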
Looks like most of the time is coming from etcd writes. You mentioned high latency between the nodes and the database; that would do it.
You might look for messages about datastore compact times to confirm that's working properly.
q
Seems to be compacting at regular intervals. I'll see if I can run some benchmarks of the database, but I guess this could just be a case of "the internet between Oslo and Amsterdam has changed" and now it takes 100ms more, or something. Too bad that's where the free Postgres is ... 🤷 It's funny how running a low-budget home cluster makes you encounter entirely different problems than running professional clusters in major cloud providers hosting thousands of apps 😅
☝️ 1
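A rough way to put a number on that datastore round trip is to time a trivial query from one of the servers against the managed Postgres. A minimal Go sketch, assuming the `github.com/lib/pq` driver; the DSN is a placeholder:

```go
package main

import (
	"database/sql"
	"fmt"
	"log"
	"time"

	_ "github.com/lib/pq" // Postgres driver
)

func main() {
	// Placeholder DSN; substitute the managed database's real connection string.
	dsn := "postgres://user:password@db.example.com:25060/k3s?sslmode=require"

	db, err := sql.Open("postgres", dsn)
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Establish the connection up front so the loop measures only round trips.
	if err := db.Ping(); err != nil {
		log.Fatal(err)
	}

	// Every datastore read/write from the apiserver pays at least one of
	// these round trips, so this is a rough lower bound on its latency.
	for i := 1; i <= 5; i++ {
		start := time.Now()
		if _, err := db.Exec("SELECT 1"); err != nil {
			log.Fatal(err)
		}
		fmt.Printf("round trip %d: %v\n", i, time.Since(start))
	}
}
```

If a bare `SELECT 1` already takes tens of milliseconds between the basement and Amsterdam, the slow writes in the traces above are likely dominated by that network path rather than by load on the database itself.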