# rke2
m
After setting up the private registry, I restarted rke2-server and noticed the following message in journalctl:
level=info msg="Using private registry config file at /etc/rancher/rke2/registries.yaml"
FYI: I copied `/etc/rancher/rke2/registries.yaml` to all agent nodes and restarted the service.
c
I don’t think you shared the actual error message anywhere?
m
@creamy-pencil-82913
Events:
        Type     Reason     Age                    From               Message
        ----     ------     ----                   ----               -------
        Normal   Scheduled  39m                    default-scheduler  Successfully assigned default/ecr-pod to sv10
        Normal   Pulling    38m (x4 over 39m)      kubelet            Pulling image "2341234234.dkr.ecr.yyyyy.amazonaws.com/image-name:v1"
        Warning  Failed     38m (x4 over 39m)      kubelet            Failed to pull image "2341234234.dkr.ecr.yyyyy.amazonaws.com/image-name:v1": rpc error: code = Unknown desc = failed to pull and unpack image "2341234234.dkr.ecr.yyyyy.amazonaws.com/image-name:v1": failed to resolve reference "2341234234.dkr.ecr.yyyyy.amazonaws.com/image-name:v1": pulling from host 2341234234.dkr.ecr.yyyyy.amazonaws.com failed with status code [manifests slim]: 401 Unauthorized
        Warning  Failed     38m (x4 over 39m)      kubelet            Error: ErrImagePull
        Warning  Failed     38m (x6 over 39m)      kubelet            Error: ImagePullBackOff
        Normal   BackOff    4m35s (x151 over 39m)  kubelet            Back-off pulling image "2341234234.dkr.ecr.yyyyy.amazonaws.com/image-name:v1"
c
What are you using for the username and password in there? ECR is “fun” because the creds that `aws ecr login` gives you expire very frequently.
And you can’t just use your token and secret as username and password; you have to actually go through the ECR login workflow to get the basic auth creds.
m
```
configs:
  "2341234234.dkr.ecr.yyyyy.amazonaws.com":
    auth:
      username: xxxxx
      password: xxxxx
    tls:
      insecure_skip_verify: true
```
in /etc/rancher/rke2/registries.yaml
c
Yes, but what are you using for the username and password?
m
IAM console username & password
c
Is it just your AWS key ID and secret, is it the output of `aws ecr login`, or what?
Yeah that’s not how ECR works
This command retrieves and displays an authentication token using the GetAuthorizationToken API that you can use to authenticate to an Amazon ECR registry. You can pass the authorization token to the login command of the container client of your preference, such as the Docker CLI. After you have authenticated to an Amazon ECR registry with this command, you can use the client to push and pull images from that registry as long as your IAM principal has access to do so until the token expires. The authorization token is valid for 12 hours.
m
yeah I came across this..
c
You can’t just give it your IAM login credentials
ECR is significantly more difficult to use because you have to reauthenticate every 12 hours.
Normally this is handled with EC2 IAM roles on the nodes, and a credential helper
m
So the username is going to be AWS then?
c
I believe that’s what it shows on that page, yes
m
So after updating `/etc/rancher/rke2/registries.yaml` with the new token as the password, we need to restart rke2-server too, right?
c
correct
it’s not really a long-term solution
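A minimal sketch of that manual refresh, assuming the AWS CLI is installed on the node and the IAM principal is allowed to call GetAuthorizationToken. The helper name `render_registries_yaml` is hypothetical; `aws ecr get-login-password` is the AWS CLI command that prints the 12-hour token, and the username is the literal string `AWS`.

```shell
#!/bin/sh
# Hypothetical helper: print a registries.yaml for one ECR host.
# $1 = registry host, $2 = token from `aws ecr get-login-password`.
render_registries_yaml() {
  cat <<EOF
configs:
  "$1":
    auth:
      username: AWS
      password: "$2"
EOF
}

# Example usage (region and host are the placeholders used in this thread):
#   token="$(aws ecr get-login-password --region yyyyy)"
#   render_registries_yaml 2341234234.dkr.ecr.yyyyy.amazonaws.com "$token" \
#     > /etc/rancher/rke2/registries.yaml
#   systemctl restart rke2-server   # rke2-agent on agent nodes
```

Because the token expires, this would have to be rerun on every node within 12 hours, which is why it is only a stopgap.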
m
yeah that was in my mind too..
I tried `crictl pull --creds` with `AWS:<token>` and it pulled the image, but with the same credentials in `/etc/rancher/rke2/registries.yaml`, rke2 failed.
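For reference, a sketch of that manual pull test. The helper `ecr_creds` is hypothetical, and the crictl binary and config paths are assumptions about a default rke2 install; `--creds` takes a `USERNAME[:PASSWORD]` string.

```shell
#!/bin/sh
# Hypothetical helper: build the "USERNAME:PASSWORD" string for `crictl pull --creds`.
# For ECR the username is the literal string "AWS"; $1 is the token.
ecr_creds() {
  printf 'AWS:%s' "$1"
}

# Example usage on an rke2 node (paths assumed, adjust to your install):
#   export CRI_CONFIG_FILE=/var/lib/rancher/rke2/agent/etc/crictl.yaml
#   token="$(aws ecr get-login-password --region yyyyy)"
#   /var/lib/rancher/rke2/bin/crictl pull --creds "$(ecr_creds "$token")" \
#     2341234234.dkr.ecr.yyyyy.amazonaws.com/image-name:v1
```

If this pull succeeds while pods still fail, the token itself is fine and the problem is more likely in how `registries.yaml` is written or distributed.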
c
did you update the same file on all the nodes?
m
Sorry, I had made a mistake in registries.yaml; it works now.
It's a real pain 😞
Also, it would have been good if rke2-server propagated `/etc/rancher/rke2/registries.yaml` to all worker nodes, not only for ECR but for any registry.
thanks @creamy-pencil-82913
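Since rke2 does not distribute the file itself, one way to check that every node has the same copy is to compare digests. This is only a sketch: the helper `file_digest` is hypothetical, and the node name `sv11` and root SSH access are assumptions.

```shell
#!/bin/sh
# Hypothetical helper: print the SHA-256 digest of a file (first field of sha256sum output).
file_digest() {
  sha256sum "$1" | cut -d ' ' -f 1
}

# Example usage (node names and root SSH access are assumptions):
#   local_digest="$(file_digest /etc/rancher/rke2/registries.yaml)"
#   for node in sv10 sv11; do
#     remote_digest="$(ssh "root@${node}" sha256sum /etc/rancher/rke2/registries.yaml | cut -d ' ' -f 1)"
#     [ "$local_digest" = "$remote_digest" ] || echo "registries.yaml differs on ${node}"
#   done
```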
@creamy-pencil-82913, I have two rke2 clusters. The first cluster, with `/var/lib/rancher/rke2/agent/etc/containerd/config.toml` and `/etc/rancher/rke2/registries.yaml` in the following format, is able to pull images from the private registry, and when we run `crictl info` we can see the private registry.
config.toml
[plugins.opt]
  path = "/var/lib/rancher/rke2/agent/containerd"
[plugins.cri]
  stream_server_address = "127.0.0.1"
  stream_server_port = "10010"
  enable_selinux = false
  sandbox_image = "index.docker.io/rancher/pause:3.5"
[plugins.cri.containerd]
  disable_snapshot_annotations = true
  snapshotter = "overlayfs"
[plugins.cri.registry.mirrors]
[plugins.cri.registry.mirrors."harbor-local.org"]
  endpoint = ["https://harbor-local.org/"]
registry
......
registries.yaml
mirrors:
  harbor-local.org:
    endpoint:
      - "https://harbor-local.org"
configs:
  "harbor-local.org":
    auth:
      username: xxxx
      password: xxxx
    tls:
      insecure_skip_verify: true
.....
The other rke2 cluster, with `/var/lib/rancher/rke2/agent/etc/containerd/config.toml` and `/etc/rancher/rke2/registries.yaml` in the following format, was able to connect to the private registry, but when we run `crictl info` we are not able to see the private registry.
config.toml
```
version = 2
[plugins]
  [plugins."io.containerd.grpc.v1.cri"]
    enable_selinux = false
    sandbox_image = "index.docker.io/rancher/pause:3.2"
    stream_server_address = "127.0.0.1"
    stream_server_port = "10010"
    [plugins."io.containerd.grpc.v1.cri".containerd]
      default_runtime_name = "nvidia"
      disable_snapshot_annotations = true
      snapshotter = "overlayfs"
```
registries.yaml
```
mirrors:
  harbor-local.org:
    endpoint:
      - "https://harbor-local.org"
configs:
  "harbor-local.org":
    auth:
      username: xxxx
      password: xxxx
    tls:
      insecure_skip_verify: true
```
Looking for guidance. The RKE2 version in both clusters is:
rke2 version v1.21.5+rke2r2 (9e4acdc6018ae74c36523c99af25ab861f3884da)
go version go1.16.6b7
Sorry, please ignore; it's done. Note: copy the contents of `/etc/rancher/rke2/registries.yaml` to all nodes, and on the controller node update the containerd template, like `/var/lib/rancher/rke2/agent/etc/containerd/config.toml`, to include the private registry details (see https://github.com/containerd/cri/blob/release/1.4/docs/registry.md). After these steps, make sure to restart the rke2 services.
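After the restart, a quick way to confirm that containerd picked up the registry is to search the `crictl info` output for the host name. A minimal sketch, with a hypothetical helper `registry_listed` that does a plain text match rather than parsing the JSON:

```shell
#!/bin/sh
# Hypothetical helper: read `crictl info` output on stdin and succeed
# if the given registry host ($1) appears anywhere in it (fixed-string match).
registry_listed() {
  grep -q -F -- "$1"
}

# Example usage on a node:
#   if crictl info | registry_listed harbor-local.org; then
#     echo "registry is configured"
#   else
#     echo "registry is missing from the containerd config"
#   fi
```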