CPU and memory don't seem to be the bottleneck, and neither do the disks (everything runs on NVMe), so I took a look at networking, and it looks like the speed inside the cluster is "slow".
To be clear: the fast cluster is a "cloud managed" bare metal instance, so I'm not sure whether they installed any special kernel module or similar,
while on my home hosts I installed a vanilla Ubuntu Server.
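
For anyone who wants to put a number on "slow", here is a minimal sketch of a TCP throughput check in Python. It is only an illustration, not what was actually run here: the names `measure_throughput` and `_sink` are made up for this example, and as written both ends run on the same host over loopback. To compare the two clusters, the receiving side would have to run on a different node or pod, and a dedicated tool such as iperf3 would normally give more reliable numbers.

```python
import socket
import threading
import time


def _sink(server_sock, result):
    """Accept one connection and count bytes until the sender closes."""
    conn, _ = server_sock.accept()
    total = 0
    with conn:
        while True:
            chunk = conn.recv(65536)
            if not chunk:  # empty read means the sender closed the socket
                break
            total += len(chunk)
    result["bytes"] = total


def measure_throughput(host="127.0.0.1", total_bytes=50 * 1024 * 1024):
    """Send roughly total_bytes over TCP and return (bytes_received, Mbit/s).

    Assumption: sender and receiver share this process, so this only
    exercises the local network stack unless the sink is moved elsewhere.
    """
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind((host, 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]

    result = {}
    receiver = threading.Thread(target=_sink, args=(server, result))
    receiver.start()

    payload = b"\0" * 65536
    sent = 0
    start = time.perf_counter()
    with socket.create_connection((host, port)) as client:
        while sent < total_bytes:
            client.sendall(payload)
            sent += len(payload)
    receiver.join()  # wait until the receiver has drained everything
    elapsed = time.perf_counter() - start
    server.close()

    mbit_per_s = result["bytes"] * 8 / elapsed / 1e6
    return result["bytes"], mbit_per_s
```

Calling `measure_throughput()` returns the number of bytes received and the measured rate in Mbit/s. Running the same check between two nodes on each cluster would show whether the gap really is in the network path.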