# general
Quick update: I did ultimately figure this out. The problem was that the `/openapi/v2` endpoint of the managed cluster was returning `503` for any API calls, whereas the `/openapi/v3` endpoint was working fine. Most tools seem perfectly happy hitting the `/openapi/v3` endpoint, but I suspect Rancher calls both versions for backwards-compatibility reasons. The root cause is a behaviour in `kube-apiserver`: when two (or more) CRDs are nearly identical but not exactly the same, the `/openapi/v2` endpoint may be unable to represent them correctly in its aggregated output and returns a `503` instead. Similar to the linked issue, ours was seemingly also caused by adding two different Kyverno components to the cluster that each contributed similar-but-not-identical versions of a given CRD at different times. We worked around this by purging all Kyverno components (CRDs, deployments, etc.), redeploying everything from a clean slate, and then restarting `cattle-cluster-agent` so that it could correctly scrape the `/openapi/v2` endpoint and rebuild its cache. Everything seems fine now!
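For anyone who runs into this later, the check and the final restart boil down to roughly the kubectl calls below (a sketch; it assumes the agent runs as the usual `cattle-cluster-agent` deployment in the `cattle-system` namespace of the downstream cluster):
```sh
# Compare the two OpenAPI endpoints on the managed cluster's API server.
# On the affected cluster, /openapi/v2 returned a 503 while /openapi/v3 responded normally.
kubectl get --raw /openapi/v2 > /dev/null
kubectl get --raw /openapi/v3 > /dev/null

# After purging and redeploying the Kyverno components, bounce the agent so it
# re-scrapes /openapi/v2 and rebuilds its cache.
kubectl -n cattle-system rollout restart deployment/cattle-cluster-agent
```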