adamant-kite-43734
07/23/2023, 3:23 PM

aloof-hair-13897
07/24/2023, 2:52 AM

crooked-cat-21365
07/24/2023, 7:50 AM

crooked-cat-21365
07/24/2023, 7:56 AM

crooked-cat-21365
07/24/2023, 7:57 AM

aloof-hair-13897
07/24/2023, 7:58 AM

crooked-cat-21365
07/24/2023, 8:08 AM

crooked-cat-21365
07/24/2023, 8:17 AM

aloof-hair-13897
07/25/2023, 12:59 AM

crooked-cat-21365
07/25/2023, 5:52 AM

crooked-cat-21365
07/25/2023, 6:53 AM
{hdunkel@dpcl082:~ 08:50:42 (local) 1002} kubectl oomd -A
NAMESPACE POD CONTAINER REQUEST LIMIT TERMINATION TIME
longhorn-system engine-image-ei-87057037-k52s9 engine-image-ei-87057037 0 0 2023-07-24 13:18:10 +0200 CEST
longhorn-system engine-image-ei-87057037-trxfw engine-image-ei-87057037 0 0 2023-06-28 12:00:19 +0200 CEST
longhorn-system engine-image-ei-ef01bf86-gjmzd engine-image-ei-ef01bf86 0 0 2023-07-24 13:18:08 +0200 CEST
longhorn-system engine-image-ei-ef01bf86-spq6b engine-image-ei-ef01bf86 0 0 2023-06-28 12:00:19 +0200 CEST
The same containers were killed by oomd again, which implies they were restarted before.
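
To confirm that those containers really were restarted after each kill, the restart counters and the last termination reason can be read with plain kubectl (the namespace and pod name below are taken from the oomd listing above):

% kubectl get pods -n longhorn-system | grep engine-image
# the RESTARTS column counts how often each container came back

% kubectl describe pod -n longhorn-system engine-image-ei-87057037-k52s9
# "Last State: Terminated" with "Reason: OOMKilled" confirms the kill came from the OOM killer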

aloof-hair-13897
07/25/2023, 7:46 AM
The engine-image-ei pod only runs a few commands, so it should not trigger an OOM kill.
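
For reference, what that container actually executes can be read straight out of the pod spec with plain kubectl (a single-container pod is assumed, matching the oomd listing above):

% kubectl get pod -n longhorn-system engine-image-ei-87057037-k52s9 \
    -o jsonpath='{.spec.containers[0].command}{"\n"}{.spec.containers[0].args}{"\n"}'
# prints the command and args of the engine-image container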

aloof-hair-13897
07/26/2023, 1:35 AM
Was engine-image-ei-xxx killed, or could you observe the memory usage of the engine-image pods?
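
One way to observe that, assuming metrics-server is installed in the cluster, is to sample per-container memory usage with kubectl top:

% kubectl top pod -n longhorn-system --containers | grep engine-image
# MEMORY(bytes) column, one line per container

% watch -n 30 'kubectl top pod -n longhorn-system --containers | grep engine-image'
# repeat every 30 s to see whether usage grows over time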

crooked-cat-21365
07/26/2023, 6:44 AM
% kubectl oomd -A
NAMESPACE POD CONTAINER REQUEST LIMIT TERMINATION TIME
longhorn-system engine-image-ei-87057037-k52s9 engine-image-ei-87057037 0 0 2023-07-24 13:18:10 +0200 CEST
longhorn-system engine-image-ei-87057037-pfxx8 engine-image-ei-87057037 0 0 2023-07-26 07:35:24 +0200 CEST
longhorn-system engine-image-ei-87057037-trxfw engine-image-ei-87057037 0 0 2023-06-28 12:00:19 +0200 CEST
longhorn-system engine-image-ei-ef01bf86-5wsb8 engine-image-ei-ef01bf86 0 0 2023-07-25 11:10:48 +0200 CEST
longhorn-system engine-image-ei-ef01bf86-gjmzd engine-image-ei-ef01bf86 0 0 2023-07-24 13:18:08 +0200 CEST
longhorn-system engine-image-ei-ef01bf86-spq6b engine-image-ei-ef01bf86 0 0 2023-06-28 12:00:19 +0200 CEST
The oomd seems to be an internal Kubernetes thing. There are "real" OOMs listed in kernel.log as well (not related to Longhorn), but kubectl oomd doesn't list those.
AFAICT oomd kills pods when resources get tight, and pods that don't request any resources are killed first.
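
That matches how Kubernetes treats such pods: containers without requests or limits fall into the BestEffort QoS class, which gets the highest oom_score_adj and is therefore the first candidate when memory runs short. Both the QoS class and the kernel-level kills can be checked directly (pod name taken from the listing above):

% kubectl get pod -n longhorn-system engine-image-ei-87057037-k52s9 -o jsonpath='{.status.qosClass}{"\n"}'
# prints BestEffort when neither requests nor limits are set

% journalctl -k | grep -iE 'out of memory|oom-kill'
# kernel-side OOM kills, i.e. the "real" OOMs mentioned above that kubectl oomd does not show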

billowy-painting-56466
07/31/2023, 3:13 AM

crooked-cat-21365
08/04/2023, 1:02 PM