# rancher-desktop
h
I did a docker swarm init and my nodes look like this:
Copy code
$ docker node ls
ID                            HOSTNAME               STATUS    AVAILABILITY   MANAGER STATUS   ENGINE VERSION
1vdqq80ak1g4s0nnph4xjsean *   lima-rancher-desktop   Ready     Active         Leader           20.10.18
and created a network, and my network looks like this:
Copy code
$ docker network ls
NETWORK ID     NAME              DRIVER    SCOPE
34aa9f60baee   bridge            bridge    local
7cd7acb2040c   docker_gwbridge   bridge    local
b928699d5d6c   host              host      local
1qd6e2d7zqrg   ingress           overlay   swarm
4583883d6b30   none              null      local
tw4bolgoata9   svc_default       overlay   swarm
and did a docker stack deploy co -c docker-compose.yaml
and the docker-compose.yaml looks like this:
Copy code
version: "3.8"
services:
  co-svc:
    image: co-test:1.0
    command:
      - serve
    deploy:
      replicas: 2
    ports:
      - mode: ingress
        protocol: tcp
        published: 8080
        target: 8080
    networks:
      - svc_default
    environment:
      JAVA_OPTS: -Dcoherence.wka=tasks.co_co-svc

networks:
  svc_default:
    external: true
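(For context: since the compose file marks svc_default as external, the network would have been created before the stack deploy with something along these lines - the exact flags are an assumption, though the inspect output later in the thread shows an attachable, swarm-scoped overlay:)
Copy code
# create the external overlay network the stack expects, then deploy the stack
docker network create --driver overlay --attachable svc_default
docker stack deploy co -c docker-compose.yaml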
I can see my service running:
Copy code
$ docker service ls
ID             NAME        MODE         REPLICAS   IMAGE         PORTS
6ygzdu8w6f7s   co_co-svc   replicated   2/2        co-test:1.0   *:8080->8080/tcp
according to the logs it's healthy
Copy code
$ docker service logs co_co-svc -n 5
co_co-svc.1.vhj6hny4nxpb@lima-rancher-desktop    | 13:24:06.005 [main] DEBUG io.smallrye.openapi.jaxrs - SROAP10001: Processing JaxRs method: io.example.entity.Task updateTask(java.util.UUID id, io.example.entity.Task task)
co_co-svc.1.vhj6hny4nxpb@lima-rancher-desktop    | 13:24:06.006 [main] DEBUG io.smallrye.openapi.jaxrs - SROAP10000: Processing a JAX-RS resource class: CoherenceResource
co_co-svc.1.vhj6hny4nxpb@lima-rancher-desktop    | 13:24:06.006 [main] DEBUG io.smallrye.openapi.jaxrs - SROAP10001: Processing JaxRs method: javax.ws.rs.core.Response get()
co_co-svc.1.vhj6hny4nxpb@lima-rancher-desktop    | 13:24:06.016 [features-thread] INFO io.helidon.common.HelidonFeatures - Helidon MP 2.5.4 features: [CDI, Config, Fault Tolerance, Health, JAX-RS, Metrics, Open API, REST Client, Security, Server, Tracing]
co_co-svc.1.vhj6hny4nxpb@lima-rancher-desktop    | 13:24:06.134 [Logger@9255624 22.06.2] INFO coherence - (thread=PagedTopic:Topic, member=2, up=9.698): Partition ownership has stabilized with 2 nodes
co_co-svc.2.724kxhhi8f3d@lima-rancher-desktop    | 13:24:05.795 [main] DEBUG io.smallrye.openapi.jaxrs - SROAP10001: Processing JaxRs method: javax.ws.rs.core.Response get()
co_co-svc.2.724kxhhi8f3d@lima-rancher-desktop    | 13:24:05.841 [Logger@9242415 22.06.2] INFO coherence - (thread=PagedTopic:Topic, member=1, up=9.399): Transferring primary PartitionSet{0..127} to member 2 requesting 128
co_co-svc.2.724kxhhi8f3d@lima-rancher-desktop    | 13:24:05.851 [features-thread] INFO io.helidon.common.HelidonFeatures - Helidon MP 2.5.4 features: [CDI, Config, Fault Tolerance, Health, JAX-RS, Metrics, Open API, REST Client, Security, Server, Tracing]
co_co-svc.2.724kxhhi8f3d@lima-rancher-desktop    | 13:24:06.026 [Logger@9242415 22.06.2] INFO coherence - (thread=PagedTopic:Topic, member=1, up=9.584): Partition ownership has stabilized with 2 nodes
co_co-svc.2.724kxhhi8f3d@lima-rancher-desktop    | 13:24:06.128 [Logger@9242415 22.06.2] INFO coherence - (thread=PagedTopic:Topic, member=1, up=9.686): Partition ownership has stabilized with 2 nodes
but when I try to access it via curl, it just... hangs:
Copy code
$ curl -vvv http://localhost:8080/health
*   Trying 127.0.0.1:8080...
* Connected to localhost (127.0.0.1) port 8080 (#0)
> GET /health HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/7.79.1
> Accept: */*
>
this goes on forever...
I have no idea how to even debug this, so any help would be greatly appreciated - please don't come at me with "move to k8s" because that's fundamentally unhelpful.
fwiw -- from inside the swarm I can see everything just fine:
Copy code
$ docker exec -ti 24011efe1863 bash
bash-4.4# curl http://co-svc:8080/health
{"outcome":"UP","status":"UP","checks":[{"name":"deadlock","state":"UP","status":"UP"},{"name":"diskSpace","state":"UP","status":"UP","data":{"free":"90.58 GB","freeBytes":97263853568,"percentFree":"92.55%","total":"97.87 GB","totalBytes":105088212992}},{"name":"heapMemory","state":"UP","status":"UP","data":{"free":"54.82 MB","freeBytes":57482184,"max":"1.45 GB","maxBytes":1553989632,"percentFree":"97.29%","total":"95.00 MB","totalBytes":99614720}}]}
bash-4.4# curl http://co-svc:8080/coherence
{"cluster":"root's cluster","members":[{"name":null,"address":"/10.0.4.7"},{"name":null,"address":"tasks.co_co-svc/10.0.4.6"}]}
bash-4.4#
k
Hi @hundreds-sandwich-7373, another user reported a network issue on M1 yesterday at https://rancher-users.slack.com/archives/C0200L1N1MM/p1666044276563699 where they had to disable their VPN/firewall to fix it. Is that the case here as well?
h
I don't think so; when I start it with docker run:
Copy code
docker run --rm -ti -p "8080:8080" -e "JAVA_OPTS=-Dcoherence.localhost=127.0.0.1" co-test:1.0 serve
it works fine.
@kind-iron-72902 no, this is entirely different. I managed to pull and build images just fine.
k
Thanks. Someone with access to an M1 machine is going to look at this.
Also, could you file an issue at https://github.com/rancher-sandbox/rancher-desktop/issues/new/choose? That makes it easier for us to manage problems.
c
@hundreds-sandwich-7373 I’m trying to test this, but I’m unable to make much progress without your co-test:1.0 image. Can you replicate this with another public image? (I’ll continue to look for a possible test image as well.)
… and did this work under earlier versions of Rancher Desktop?
Also, can you provide the output of:
docker network inspect svc_default
Finally (for tonight), can you check and see if your app is listening on IPv4 or IPv6 in the VM? Run rdctl shell, then sudo netstat -nlp | grep 8080. In my testing, there seems to be a disconnect between IPv4 and IPv6, but it could easily be due to my test configuration.
h
@creamy-laptop-9977 you can use any image at all - an nginx serving helloworld.html would be functionally equivalent for these purposes.
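(For anyone else trying to reproduce this without the private co-test image, a trimmed-down stack along these lines should be functionally equivalent - the file name, stack name and nginx image here are just placeholders:)
Copy code
# repro.yaml - minimal swarm stack publishing a port through the ingress routing mesh
version: "3.8"
services:
  web:
    image: nginx:alpine
    deploy:
      replicas: 2
    ports:
      - mode: ingress
        protocol: tcp
        published: 8080
        target: 80
Deploying it with docker stack deploy repro -c repro.yaml and then running curl http://localhost:8080/ from the host should, if the routing mesh is the culprit, hang in the same way.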
network inspect:
Copy code
$ docker network inspect svc_default
[
    {
        "Name": "svc_default",
        "Id": "tw4bolgoata9gpp785ql59jwb",
        "Created": "2022-10-18T09:05:29.876631272Z",
        "Scope": "swarm",
        "Driver": "overlay",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "10.0.4.0/24",
                    "Gateway": "10.0.4.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": true,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": null,
        "Options": {
            "com.docker.network.driver.overlay.vxlanid_list": "4100"
        },
        "Labels": null
    }
]
Copy code
$ rdctl shell
lima-rancher-desktop:/Users/lcoughli$ sudo netstat -nlp | grep 8080
tcp        0      0 :::8080                 :::*                    LISTEN      2860/dockerd
I have no idea if it worked under a previous version - this is my first go-around with rancher-desktop. It has worked under docker-desktop.
it does look like an IPv6/IPv4 thing, maybe? That's fascinating.
fwiw:
Copy code
$ curl -g -6 "http://[::1]:8080/"
curl: (7) Failed to connect to ::1 port 8080 after 12 ms: Connection refused
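(One more host-side check that would help confirm or rule out the IPv4/IPv6 angle is forcing curl onto IPv4 explicitly - again, just a diagnostic suggestion:)
Copy code
# force IPv4 so the request cannot land on the IPv6 loopback
curl -4 -vvv http://127.0.0.1:8080/health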
Also, when you list the ports in the compose section, it's actually the ingress network that is responsible for the mapping, I think, and that's picked up appropriately, as shown below:
Copy code
$ docker network inspect ingress
[
    {
        "Name": "ingress",
        "Id": "1qd6e2d7zqrgwsnsus1mn0nx0",
        "Created": "2022-10-18T13:22:14.42613455Z",
        "Scope": "swarm",
        "Driver": "overlay",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "10.0.0.0/24",
                    "Gateway": "10.0.0.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Ingress": true,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {
            "02fc1a3f9c335feb51d51c7ae87610401cd3befb98650bf9652e2884d9eb3c28": {
                "Name": "co_co-svc.2.lwvctygwa616q7up8zozq4mwg",
                "EndpointID": "eb482388ddded26424d09a2385cbc47f3bf9214bf086116cfd0938a757f1a2fd",
                "MacAddress": "02:42:0a:00:00:0a",
                "IPv4Address": "10.0.0.10/24",
                "IPv6Address": ""
            },
            "1976e6b0b7f6a9dba6a8dd0d43cfc9e38b8381ac175ae09ffbd3967f605dbcdd": {
                "Name": "co_co-svc.1.clatiekuuz08610tosp328c50",
                "EndpointID": "b16ba537248fe430698660c2b6a2b3863e0538a4357dc3a1ee6a565cad24b292",
                "MacAddress": "02:42:0a:00:00:09",
                "IPv4Address": "10.0.0.9/24",
                "IPv6Address": ""
            },
            "ingress-sbox": {
                "Name": "ingress-endpoint",
                "EndpointID": "82b7925d1c6e6d2afb74889bf156544284007a343ac6f38a2ea6e648ed24ca89",
                "MacAddress": "02:42:0a:00:00:02",
                "IPv4Address": "10.0.0.2/24",
                "IPv6Address": ""
            }
        },
        "Options": {
            "com.docker.network.driver.overlay.vxlanid_list": "4096"
        },
        "Labels": {},
        "Peers": [
            {
                "Name": "e49e33aed00b",
                "IP": "192.168.5.15"
            }
        ]
    }
]
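(If the ingress routing mesh itself turns out to be the problem, one common way to take it out of the equation while testing is to publish the port in host mode instead of ingress mode. A sketch of a trimmed-down compose file, not a fix - the environment section from the original is omitted:)
Copy code
version: "3.8"
services:
  co-svc:
    image: co-test:1.0
    command:
      - serve
    deploy:
      replicas: 1   # host-mode publishing cannot share one host port across replicas on a single node
    ports:
      - mode: host  # bind directly on the node, bypassing the swarm routing mesh
        protocol: tcp
        published: 8080
        target: 8080
    networks:
      - svc_default

networks:
  svc_default:
    external: true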
c
I suspect the IPv6 thing might be a red herring. I managed to repro this on a non-M1 Mac, and I can see that the service is actually bound and responds to IPv4 within the VM, and there is actually a corresponding rule for it in the iptables:
Copy code
Chain DOCKER-INGRESS (2 references)
 pkts bytes target     prot opt in     out     source               destination
    5   300 DNAT       tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            tcp dpt:8080 to:172.18.0.2:8080
    0     0 RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0
As for why netstat shows IPv6, this might be the explanation: https://github.com/moby/moby/issues/2174#issuecomment-237439515
Now, this might be one of those situations where the request comes in but has no idea how to get out 🤷
OK, I can confirm that this behavior is the same on 1.5.1, so it is not a regression. Below is my observation: it seems as though the application listens on both IPv4 and IPv6; you can confirm this by curling the application from within the VM:
Copy code
curl 127.0.0.1:8080
However, curling with IPv6:
Copy code
curl -g -6 'http://localhost:8080'
hangs. This could be due to this open issue: https://github.com/moby/moby/issues/24379 In addition, looking at the tcpdump within the VM, I can see that the curl request destined for port 8080 from the client (the host, in this case) is always forced to IPv6:
Copy code
20:02:55.367583 lo    In  IP6 ::1.54490 > ::1.8080: Flags [S], seq 3666321179, win 65476, options [mss 65476,sackOK,TS val 869371798 ecr 0,nop,wscale 7], length 0
20:02:55.367593 lo    In  IP6 ::1.8080 > ::1.54490: Flags [S.], seq 3999623390, ack 3666321180, win 65464, options [mss 65476,sackOK,TS val 869371798 ecr 869371798,nop,wscale 7], length 0
20:02:55.367602 lo    In  IP6 ::1.54490 > ::1.8080: Flags [.], ack 1, win 512, options [nop,nop,TS val 869371798 ecr 869371798], length 0
20:02:55.368230 lo    In  IP6 ::1.54490 > ::1.8080: Flags [P.], seq 1:79, ack 1, win 512, options [nop,nop,TS val 869371798 ecr 869371798], length 78: HTTP: GET / HTTP/1.1
20:02:55.368235 lo    In  IP6 ::1.8080 > ::1.54490: Flags [.], ack 79, win 511, options [nop,nop,TS val 869371798 ecr 869371798], length 0
20:02:59.929836 lo    In  IP6 ::1.54490 > ::1.8080: Flags [F.], seq 79, ack 1, win 512, options [nop,nop,TS val 869376360 ecr 869371798], length 0
20:02:59.977814 lo    In  IP6 ::1.8080 > ::1.54490: Flags [.], ack 80, win 511, options [nop,nop,TS val 869376408 ecr 869376360], length 0
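(The exact capture invocation isn't shown in the thread; something like the following inside the VM would produce a trace like the one above:)
Copy code
sudo tcpdump -i lo -nn port 8080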
This could potentially be related to the open issue mentioned above, or perhaps a misconfiguration in Lima’s underlying switch that always forces IPv6 with the swarm’s overlay networks. However, it is worth noting that when a container with a published port is deployed:
Copy code
docker run -p 8081:80 -td nginx
everything works as expected, and looking at the tcpdump in the VM, the requests are made over IPv4 as expected:
Copy code
19:34:54.386979 lo    In  IP 127.0.0.1.35410 > 127.0.0.1.8081: Flags [S], seq 3673555498, win 65495, options [mss 65495,sackOK,TS val 1782487875 ecr 0,nop,wscale 7], length 0
19:34:54.387035 lo    In  IP 127.0.0.1.8081 > 127.0.0.1.35410: Flags [S.], seq 228590965, ack 3673555499, win 65483, options [mss 65495,sackOK,TS val 1782487875 ecr 1782487875,nop,wscale 7], length 0
19:34:54.387092 lo    In  IP 127.0.0.1.35410 > 127.0.0.1.8081: Flags [.], ack 1, win 512, options [nop,nop,TS val 1782487875 ecr 1782487875], length 0
19:34:54.387974 lo    In  IP 127.0.0.1.35410 > 127.0.0.1.8081: Flags [P.], seq 1:79, ack 1, win 512, options [nop,nop,TS val 1782487876 ecr 1782487875], length 78
19:34:54.388017 lo    In  IP 127.0.0.1.8081 > 127.0.0.1.35410: Flags [.], ack 79, win 511, options [nop,nop,TS val 1782487876 ecr 1782487876], length 0
19:34:54.388655 lo    In  IP 127.0.0.1.8081 > 127.0.0.1.35410: Flags [P.], seq 1:854, ack 79, win 512, options [nop,nop,TS val 1782487877 ecr 1782487876], length 853
19:34:54.388703 lo    In  IP 127.0.0.1.35410 > 127.0.0.1.8081: Flags [.], ack 854, win 506, options [nop,nop,TS val 1782487877 ecr 1782487877], length 0
19:34:54.389693 lo    In  IP 127.0.0.1.35410 > 127.0.0.1.8081: Flags [F.], seq 79, ack 854, win 512, options [nop,nop,TS val 1782487878 ecr 1782487877], length 0
19:34:54.390404 lo    In  IP 127.0.0.1.8081 > 127.0.0.1.35410: Flags [F.], seq 854, ack 80, win 512, options [nop,nop,TS val 1782487879 ecr 1782487878], length 0
19:34:54.390534 lo    In  IP 127.0.0.1.35410 > 127.0.0.1.8081: Flags [.], ack 855, win 512, options [nop,nop,TS val 1782487879 ecr 1782487879], length 0
To summarize, this could be a combination of the open issue mentioned above and a misconfiguration of Lima’s underlying switch (vde_vmnet), or the misbehaviour could be driven entirely by the open issue here: https://github.com/moby/moby/issues/24379
Also, I’ve created the corresponding issue to keep track of this: https://github.com/rancher-sandbox/rancher-desktop/issues/3221
c
Thanks for continuing to dig into this, @calm-sugar-3169! Interesting situation indeed!
h
Thank you so much for this follow-up, I really appreciate it. Unfortunately, it looks like I decanned the worms here.
🤣 2