# longhorn-storage
f
This was in one of the instance managers
This seemed a little suspect, but I'm honestly not sure
```
[pvc-a52f5d04-76e6-4058-abbc-34d8b20331da-e-2565aa66] go-iscsi-helper: tgtd is already running
```
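Before digging further, a couple of node-side sanity checks I can run (assuming open-iscsi is installed on the host, which Longhorn requires):
```
# confirm the iSCSI initiator daemon is running on the node
systemctl status iscsid

# list active iSCSI sessions ("No active sessions" just means nothing is logged in yet)
iscsiadm -m session
```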
I also noticed this on the underlying system...
```
# audit2why < /var/log/audit/audit.log
type=AVC msg=audit(1695262967.474:1524): avc:  denied  { read } for  pid=48846 comm="iscsiadm" path="/dev/null" dev="tmpfs" ino=72 scontext=system_u:system_r:iscsid_t:s0 tcontext=system_u:object_r:container_runtime_tmpfs_t:s0 tclass=chr_file permissive=0

	Was caused by:
		Missing type enforcement (TE) allow rule.

		You can use audit2allow to generate a loadable module to allow this access.
```
which would translate to this
```
# audit2allow < /var/log/audit/audit.log

#============= iscsid_t ==============
allow iscsid_t container_runtime_tmpfs_t:chr_file read;
```
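If that denial does turn out to matter, audit2allow can build that rule into a local policy module and semodule can load it (the module name below is just a placeholder I picked):
```
# build a local policy module from the logged denials and load it
audit2allow -M longhorn-iscsid < /var/log/audit/audit.log
semodule -i longhorn-iscsid.pp
```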
Again, not sure if that's a red herring
Looking at the timestamp, it looks like that issue happened during system provisioning and hasn't occurred since
This is what I'm seeing in the volume event logs...
I'll poke at it tomorrow, but if anyone has any ideas where else I should look, please let me know
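For anyone following along, the places I'm planning to check next (assuming the default longhorn-system namespace):
```
# Longhorn's own view of the volume, engine, and replicas
kubectl -n longhorn-system get volumes.longhorn.io
kubectl -n longhorn-system get engines.longhorn.io,replicas.longhorn.io

# recent longhorn-manager logs usually surface attach/detach errors
kubectl -n longhorn-system logs -l app=longhorn-manager --tail=200
```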
c
You might try with 1.27 or 1.26, I don't know if LH has been tested on 1.28 yet.
f
I'll probably give that a try. Before I tear the thing down, is there someone I can send a support bundle to, in case it helps with testing/supporting 1.28?
I rebuilt the cluster from scratch using k3s v1.27.6+k3s1 and am running into the same / similar problem. Any ideas on what to check next?
The pod gets created and scheduled, then it fails, and the StatefulSet deletes and recreates it. The PVC shows as "Bound". Pod status:
```
Events:
  Type     Reason              Age                From                     Message
  ----     ------              ----               ----                     -------
  Normal   Scheduled           51s                default-scheduler        Successfully assigned logging-system/logging-system-fluentd-0 to kdevdev2-compute1.macc.ns.internet2.edu
  Warning  FailedAttachVolume  42s (x5 over 52s)  attachdetach-controller  AttachVolume.Attach failed for volume "pvc-bf64b67b-c1ed-4f45-a408-21412a8503f6" : rpc error: code = DeadlineExceeded desc = volume pvc-bf64b67b-c1ed-4f45-a408-21412a8503f6 failed to attach to node kdevdev2-compute1.macc.ns.internet2.edu with attachmentID csi-ccc1ab4902965eeeb91a846c3604db8461757657f681720a5009ccd97fbfe592
```
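Side note: I believe that attachmentID corresponds to a cluster-scoped VolumeAttachment object, so it can be inspected directly (name copied from the event above):
```
kubectl get volumeattachments
kubectl describe volumeattachment csi-ccc1ab4902965eeeb91a846c3604db8461757657f681720a5009ccd97fbfe592
```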
PVC Describe:
```
Name:          logging-system-fluentd-buffer-logging-system-fluentd-0
Namespace:     logging-system
StorageClass:  longhorn
Status:        Bound
Volume:        pvc-bf64b67b-c1ed-4f45-a408-21412a8503f6
Labels:        app.kubernetes.io/component=fluentd
               app.kubernetes.io/managed-by=logging-system
               app.kubernetes.io/name=fluentd
Annotations:   pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
               volume.beta.kubernetes.io/storage-provisioner: driver.longhorn.io
               volume.kubernetes.io/storage-provisioner: driver.longhorn.io
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      20Gi
Access Modes:  RWO
VolumeMode:    Filesystem
Used By:       logging-system-fluentd-0
Events:
  Type    Reason                 Age                From                                                                                      Message
  ----    ------                 ----               ----                                                                                      -------
  Normal  ExternalProvisioning   10m (x2 over 10m)  persistentvolume-controller                                                               waiting for a volume to be created, either by external provisioner "driver.longhorn.io" or manually created by system administrator
  Normal  Provisioning           10m                driver.longhorn.io_csi-provisioner-65cb5cc4ff-k2xmk_90d4fb49-de20-4aab-9a8f-86a1c1af5514  External provisioner is provisioning volume for claim "logging-system/logging-system-fluentd-buffer-logging-system-fluentd-0"
  Normal  ProvisioningSucceeded  10m                driver.longhorn.io_csi-provisioner-65cb5cc4ff-k2xmk_90d4fb49-de20-4aab-9a8f-86a1c1af5514  Successfully provisioned volume pvc-bf64b67b-c1ed-4f45-a408-21412a8503f6
```
I also created my own Deployment/Pod with my own PVC and it's having the same issue, so I don't think it's specific to the logging operator. Nevertheless, I can't figure out what's going wrong here
I did a fresh install with k3s v1.27.6+k3s1 and Longhorn v1.4.3 and it's still exhibiting the same behavior
I'm kind of at a loss here
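In case it helps anyone spot something, this is roughly where I'm pulling the CSI-side logs from (label selectors taken from the stock Longhorn manifests, so adjust if yours differ):
```
# external attacher sidecar (handles the ControllerPublishVolume / attach calls)
kubectl -n longhorn-system logs -l app=csi-attacher --tail=100

# Longhorn CSI node plugin on the affected node
kubectl -n longhorn-system logs -l app=longhorn-csi-plugin -c longhorn-csi-plugin --tail=100
```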
c
it sounds like you’re running this on selinux-enabled nodes?
f
I am...
c
f
I have not, is there a "quick fix"? or is it just to turn off SELinux? 😆
F*... That was it, disabling SELinux worked
c
there is some discussion of policies in that issue, I’m not sure where it ended up
f
There was a work-around that @eager-orange-89771 posted near the end of that ticket. https://raw.githubusercontent.com/longhorn/longhorn/master/deploy/prerequisite/longhorn-iscsi-selinux-workaround.yaml
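For anyone else who lands here, applying that manifest is just (assuming cluster-admin on the cluster):
```
kubectl apply -f https://raw.githubusercontent.com/longhorn/longhorn/master/deploy/prerequisite/longhorn-iscsi-selinux-workaround.yaml
```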
I hate disabling SELinux to get around issues, but I'm flipping it into permissive mode for now
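For the record, "flipping it into permissive" here means the runtime toggle plus making it persistent across reboots:
```
# switch SELinux to permissive immediately (does not survive a reboot)
sudo setenforce 0

# make it persistent by setting SELINUX=permissive in /etc/selinux/config
sudo sed -i 's/^SELINUX=enforcing/SELINUX=permissive/' /etc/selinux/config
```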