adamant-kite-43734
04/14/2025, 5:06 PM
fierce-australia-70779
05/02/2025, 5:30 PM
The /openapi/v2 endpoint of the managed cluster was returning 503 for any API calls, whereas the /openapi/v3 endpoint was working fine. Most tools seem to be just fine with hitting the /openapi/v3 endpoint, but I suspect Rancher calls both versions of the endpoint for backwards compatibility reasons.
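For anyone wanting to check for the same symptom, here is a rough sketch that probes both endpoints through `kubectl get --raw`. It assumes `kubectl` is on the PATH and pointed at the managed cluster; the helper names `kubectl_raw_ok` and `probe_openapi` are made up for illustration.

```python
import subprocess
from typing import Callable, Dict, Iterable


def kubectl_raw_ok(path: str) -> bool:
    """Return True if `kubectl get --raw <path>` exits successfully.

    Assumes kubectl is installed and the current context is the
    managed cluster being debugged.
    """
    result = subprocess.run(
        ["kubectl", "get", "--raw", path],
        stdout=subprocess.DEVNULL,  # the OpenAPI document itself is large
        stderr=subprocess.PIPE,
        text=True,
    )
    return result.returncode == 0


def probe_openapi(
    check: Callable[[str], bool] = kubectl_raw_ok,
    paths: Iterable[str] = ("/openapi/v2", "/openapi/v3"),
) -> Dict[str, bool]:
    """Map each OpenAPI path to whether the check succeeded for it."""
    return {path: check(path) for path in paths}


# Usage (needs a reachable cluster):
#   for path, ok in probe_openapi().items():
#       print(path, "ok" if ok else "FAILED")
```

In our situation this would have reported /openapi/v2 as failing and /openapi/v3 as fine. The `check` parameter is injectable only so the logic can be exercised without a cluster.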
The root cause is a behaviour in kube-apiserver: when there are two (or more?) CRDs that are nearly identical but not exactly the same, the /openapi/v2 endpoint might not be able to represent these CRDs correctly in its API output, and it returns a 503 as a result. Similar to the linked issue, ours was seemingly also caused by adding two different Kyverno components to the cluster, which contributed two similar-but-not-the-same versions of a given CRD at different times.
We worked around this by purging all Kyverno components (CRDs, deployments, etc.) and redeploying everything from a clean slate, then restarting cattle-cluster-agent so that it could correctly scrape the /openapi/v2 endpoint and rebuild its cache. Everything seems fine now!
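If it helps, a small sketch for listing the Kyverno CRDs before purging them. It takes the parsed output of `kubectl get crd -o json` and picks CRDs by API group. Matching on groups ending in `kyverno.io` is an assumption on my part, and `kyverno_crd_names` is a hypothetical helper; your install may ship CRDs under other groups that need adding.

```python
from typing import Any, Dict, List


def kyverno_crd_names(
    crd_list: Dict[str, Any], group_suffix: str = "kyverno.io"
) -> List[str]:
    """Return sorted names of CRDs whose spec.group is the suffix or a subdomain of it.

    `crd_list` is the parsed JSON from `kubectl get crd -o json`.
    The default suffix is an assumption; adjust it for your install.
    """
    names = []
    for item in crd_list.get("items", []):
        group = item.get("spec", {}).get("group", "")
        if group == group_suffix or group.endswith("." + group_suffix):
            names.append(item["metadata"]["name"])
    return sorted(names)


# Usage (needs a reachable cluster):
#   import json, subprocess
#   raw = subprocess.run(["kubectl", "get", "crd", "-o", "json"],
#                        capture_output=True, text=True, check=True).stdout
#   print(kyverno_crd_names(json.loads(raw)))
```

After the redeploy, the agent restart can be done with `kubectl -n cattle-system rollout restart deployment/cattle-cluster-agent`, assuming the agent runs as a deployment in the `cattle-system` namespace as in a default Rancher setup.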