# rancher-desktop
e
For example, dig +tcp @192.168.127.1 google.com times out, but dig @192.168.127.1 google.com works fine.
c
Hi Nathan, I’m trying to repro this on my end but unfortunately can’t 😐
/ # dig +tcp @192.168.127.1 google.com

; <<>> DiG 9.18.13 <<>> +tcp @192.168.127.1 google.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 5134
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;google.com.                    IN      A

;; ANSWER SECTION:
google.com.             41      IN      A       142.250.191.174

;; Query time: 89 msec
;; SERVER: 192.168.127.1#53(192.168.127.1) (TCP)
;; WHEN: Tue May 16 16:09:09 UTC 2023
;; MSG SIZE  rcvd: 55

/ # dig @192.168.127.1 google.com

; <<>> DiG 9.18.13 <<>> @192.168.127.1 google.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 5221
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;google.com.                    IN      A

;; ANSWER SECTION:
google.com.             0       IN      A       142.250.190.78

;; Query time: 50 msec
;; SERVER: 192.168.127.1#53(192.168.127.1) (UDP)
;; WHEN: Tue May 16 16:09:19 UTC 2023
;; MSG SIZE  rcvd: 54
It resolves fine in either case.
The new network stack forwards Ethernet frames to the host, and all resolution happens on the host, so I’m not sure why there would be any difference between UDP and TCP at the Ethernet-frame level
e
ah interesting, so maybe something is configured wrong on my host side. I'll keep looking into it
👍 1
TCP resolution works from my host side. Did you issue the dig commands from rancher-desktop wsl or a docker image?
c
I issued it from the network namespace, you can get in the namespace using the
rdctl shell
but issuing it from inside the container should have the same effect, I imagine, because it would ultimately cascade to the same DNS server and get forwarded to the host.
👍 1
e
Any tips for troubleshooting this further? From a container inside k3s I get errors going through CoreDNS about TCP timeouts to the forwarding server. And if I do nc -v 192.168.127.1 53 it refuses the connection, but port 80 is open and working. What is 192.168.127.1 as it's resolved in a container?
c
Let me try to repro this on my end, do you have any steps I can run?
also, are you using any proxies, etc.? Or are you trying with just the network tunnel enabled?
e
Yeah my setup is OOB for the most part on the 1.9.0 preview and unfortunately I've been fighting a corporate proxy. I've set the http_proxy, https_proxy env variables as previously mentioned in another thread in my rc.conf, I also have no_proxy set to 127.0.0.0/8,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16,.cluster.local
And from there i've deployed a debian based pod image with dnsutils and netcat installed to debug why i'm unable to resolve certain domains and narrowed it down to this
c
Thanks, I will try it as soon as I’m out of my meeting
e
Thank you, I appreciate it. I was assuming that since nc -v 192.168.127.1 80 was working the proxy wasn't interfering, but 443 also doesn't seem to be open. I'm not really sure how it's all being forwarded, though. Is the VM-switch process what's listening at 192.168.127.1, which then forwards over gvisor?
c
I’ve set the http_proxy, https_proxy env variables as previously mentioned in another thread in my rc.conf, I also have no_proxy set to 127.0.0.0/8,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16,.cluster.local
Out of curiosity, I assume you used the new experimental proxy introduced by @nice-toddler-37804? I don’t think moproxy supports proxy environment variables, but I could be wrong.
e
I did, I was the one who ran into issues with certificate trust on that version, whereas setting these variables and not using moproxy seems fine
👍 2
Any luck on reproducing or are you thinking it's related to my vpn/proxy specifics?
r
Are you using allowed images functionality?
e
no
c
@echoing-carpenter-38694 unfortunately I had no luck repro-ing this despite many attempts. You shouldn’t get connection refused to
192.168.127.1
that address is the IP address of the gateway; it’s assigned to host-switch (a process that runs on the Windows host) for the virtual network.
e
hmm, yeah that makes sense. Are you running off a newer build of host-switch? I see you added some new rules to iptables a few weeks back that my build doesn't contain https://github.com/rancher-sandbox/rancher-desktop-networking/commit/eb2df541a053f3c1f916e35f0342042e35acc347 but yeah, I doubt it would affect a netcat check
c
I used 1.9.0 but I will try to create a release tomorrow, do you want me to send a link to the build artifacts so that you can try with that to see if it makes a difference?
e
yeah if you don't mind
👍 1
@calm-sugar-3169 I re-built the host-switch process locally and captured the pcap file it produced, and also ran Wireshark against my VPN interface. The pcap file created by host-switch proves the host is seeing the packets. I'm not 100% sure, but it seems that for DNS lookups over UDP the host-switch changes the destination to my host's configured DNS server, whereas over TCP it still sends to 192.168.127.1 out the external VPN interface, which times out. This lines up with what I'm seeing in the gvisor-tap-vsock library: the gateway only hosts a DNS server on port 53 UDP, which then does the DNS lookup within the process. Port 53 TCP just gets forwarded to 192.168.127.1, which won't work https://github.com/containers/gvisor-tap-vsock/blob/ee065017f1640987ca360d63036640f8e6813a94/pkg/virtualnetwork/services.go#L60 thoughts?
c
That’s interesting, it looks as though host-switch only creates a DNS server for UDP. However, I’m still wondering how I can do resolutions over both TCP and UDP; I would need to investigate this further. Meanwhile, are you able to run your tests from the host directly? Can you successfully resolve the DNS query over both TCP and UDP? You would need to use the DNS server associated with the interface with the lowest route metric, which in your case would be your VPN interface, I think.
e
Yeah against my real DNS server I can do Resolve-DnsName google.com -TcpOnly and it works fine.
c
It’s relatively easy to enable the TCP listener on the DNS server: https://github.com/miekg/dns/blob/730c265d18c3f51b447f6a4c61fe3cde336585d8/server.go#L198 but I need to understand why it works for me 🤔
Do you have a lot of confidential stuff in the packet capture file, since you were capturing the VPN interface? Or is it possible to share? If not, I totally understand; I can figure something out on my end.
e
Yeah, that's where I'm confused also. Are you able to see where the host-switch is forwarding it to? Maybe there's a listener there that I don't have
I'm not able to, unfortunately; I often have to hard-code my corp credentials in my proxy configs
c
Ok, I understand 😄
do you also get a timeout when you run the dns queries from inside the network namespace? that is if you did
rdctl shell
and run those queries again, do you get a timeout then too?
e
hmm, having trouble even doing an apk add dnsutils there due to a timeout now
c
can you disconnect from your VPN and try again to see if that works?
e
No, they force the VPN to have a connection. I was able to get it to work by exporting the http_proxy & https_proxy env variables in rdctl shell.
dig +tcp google.com indeed times out there
For kicks, I cloned gvisor-tap-vsock and added a replace directive to the host-switch go.mod. Attaching a gonet.ListenTCP on the gateway IP, port 53, within the network stack and handing the listener to the DNS server causes k3s startup errors: "no default routes found in /proc/net/route .... "
Edit: disregard, I just had to apply the same change against tag 0.5.0 instead of the main branch. Testing DNS now
didn't seem to work, but ran out of time to look into why, could be doing something wrong
@calm-sugar-3169 circling back on this, I was able to get DNS working over TCP this morning after fixing my mistake and correctly exposing the TCP listener in the gvisor-tap library. I have no idea how you were able to get DNS over TCP working without this. One thing I did notice: if you issue "dig google.com" and then "dig +tcp google.com", the second one will return a cached result
c
@echoing-carpenter-38694 that’s a great find, thanks for sharing. I’m interested to learn more about your fix. Can you elaborate on how you exposed the TCP listener in the gvisor-tap library? I want to make sure this issue isn't being overlooked in RD. Thanks
e
https://github.com/containers/gvisor-tap-vsock/blob/ee065017f1640987ca360d63036640f8e6813a94/pkg/virtualnetwork/services.go#L60 I basically added a second DNS server that listens on TCP 53 alongside that UDP server
👀 1
Yeah, it was pretty straightforward to expose in services.go
func dnsServer(configuration *types.Configuration, s *stack.Stack) error {
	udpConn, err := gonet.DialUDP(s, &tcpip.FullAddress{
		NIC:  1,
		Addr: tcpip.Address(net.ParseIP(configuration.GatewayIP).To4()),
		Port: uint16(53),
	}, nil, ipv4.ProtocolNumber)
	if err != nil {
		return err
	}
	tcpLn, err := gonet.ListenTCP(s, tcpip.FullAddress{
		NIC:  1,
		Addr: tcpip.Address(net.ParseIP(configuration.GatewayIP).To4()),
		Port: uint16(53),
	}, ipv4.ProtocolNumber)
	if err != nil {
		log.Error(err)
		return err
	}
	go func() {
		if err := dns.Serve(udpConn, configuration.DNS); err != nil {
			log.Error(err)
		}
	}()
	go func() {
		if err := dns.ServeTCP(tcpLn, configuration.DNS); err != nil {
			log.Error(err)
		}
	}()
	return nil
}
then in dns.go it's literally just a second server, but one that serves over TCP
func ServeTCP(tcpListener net.Listener, zones []types.Zone) error {
	handler := &dnsHandler{zones: zones}
	tcpMux := dns.NewServeMux()
	tcpMux.HandleFunc(".", handler.handleTCP)
	tcpSrv := &dns.Server{
		Listener: tcpListener,
		Handler:  tcpMux,
	}
	return tcpSrv.ActivateAndServe()
}
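A side note on why a separate TCP path is needed at all (background, not from the thread): DNS over TCP prefixes every message with a two-byte big-endian length field (RFC 1035 §4.2.2), so a handler that reads bare UDP datagrams can't simply be pointed at a TCP stream; the stream has to be de-framed first, which is what the dedicated TCP server handles. A minimal stdlib-only sketch of that framing (the helper names are mine, not from gvisor-tap-vsock):

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// frameTCP wraps a raw DNS message in the two-byte big-endian
// length prefix required for DNS over TCP (RFC 1035 §4.2.2).
func frameTCP(msg []byte) []byte {
	out := make([]byte, 2+len(msg))
	binary.BigEndian.PutUint16(out, uint16(len(msg)))
	copy(out[2:], msg)
	return out
}

// unframeTCP strips the length prefix and returns the embedded message.
func unframeTCP(stream []byte) []byte {
	n := binary.BigEndian.Uint16(stream)
	return stream[2 : 2+n]
}

func main() {
	msg := []byte{0x12, 0x34, 0x01, 0x00} // fake header bytes, illustration only
	framed := frameTCP(msg)
	fmt.Printf("framed:   % x\n", framed)             // framed:   00 04 12 34 01 00
	fmt.Printf("unframed: % x\n", unframeTCP(framed)) // unframed: 12 34 01 00
}
```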