belegdol Posted January 4, 2020

@Igor, was the lockup after soft reset in combination with CGROUPS caused by too new of a compiler? Kudos for figuring this out!

Igor Posted January 4, 2020

58 minutes ago, belegdol said:
> was the lockup after soft reset in combination with CGROUPS caused by too new of a compiler? Kudos for figuring this out!

I don't think so. That's unrelated; CGROUPS was left out by mistake, I didn't pay attention to that. I recreated the images in the download section. If you can, try them and check whether they work fine for you as well. Later in the evening we can do one more update with CGROUPS enabled. (I am out of office until then.)

Igor Posted January 4, 2020

The pids cgroup can't be enabled on this kernel: you enable it, but the setting doesn't stay. Try to find out why, or move to kernel 5.4.y.

grunlab (Author) Posted January 6, 2020

Hi Igor, belegdol ... happy new year to both of you!

I see that a testing version of Armbian buster based on kernel 5.4.y is now available. I have a spare node in the cluster that can be used to test this image. Before I reinstall the node, could you check/confirm whether the pids cgroup is enabled on this testing image?

Igor Posted January 7, 2020

8 hours ago, grunlab said:
> I have a spare node in the cluster that can be used to test this image. Before reinstalling the node, could you check/confirm if pids cgroup is enabled on this testing image

It's enabled on kernel 4.14.y, which is OK now (we also had some stability issues). You need to download the normal stable images; they have been updated, but the (apt) update is not yet available since it needs more testing. 5.4.y is unstable at the moment.

grunlab (Author) Posted January 7, 2020

I've downloaded the latest stable image based on kernel 4.14.y, this one: https://dl.armbian.com/odroidxu4/archive/Armbian_19.11.6_Odroidxu4_buster_legacy_4.14.161.7z (build date: 2020/01/04), and reinstalled one of the cluster nodes (worker-02).

The pids cgroup does not appear to be enabled:

    adrien@worker-02:~$ uname -a
    Linux worker-02 4.14.161-odroidxu4 #40 SMP PREEMPT Sat Jan 4 12:41:15 CET 2020 armv7l GNU/Linux

    adrien@worker-02:~$ cat /proc/cgroups
    #subsys_name  hierarchy  num_cgroups  enabled
    cpuset        2          14           1
    cpu           7          97           1
    cpuacct       7          97           1
    blkio         8          97           1
    memory        6          203          1
    devices       5          97           1
    freezer       3          14           1
    net_cls       4          14           1
    net_prio      4          14           1

    adrien@bilbon:~/git/k8s$ kubectl get events -n default | grep worker-02 | grep pids
    14s Warning FailedNodeAllocatableEnforcement node/worker-02 Failed to update Node Allocatable Limits ["kubepods"]: failed to set supported cgroup subsystems for cgroup [kubepods]: failed to find subsystem mount for required subsystem: pids

Did I miss something?

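For anyone who wants to script the check above across several nodes, here is a minimal sketch in Python. It parses text in the `/proc/cgroups` format and reports which required controllers are missing or disabled. The function names and the default `required` tuple are hypothetical, chosen only for illustration; the sample text is the worker-02 output quoted in this post.

```python
def parse_proc_cgroups(text):
    """Parse /proc/cgroups-style text into {subsystem_name: enabled_bool}."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the "#subsys_name ..." header line.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) != 4:
            # Not a "name hierarchy num_cgroups enabled" row; ignore it.
            continue
        name, _hierarchy, _num_cgroups, enabled = fields
        result[name] = enabled == "1"
    return result


def missing_controllers(text, required=("pids",)):
    """Return the required controllers that are absent or not enabled.

    The default `required` tuple is an assumption for this example; it
    only names the controller the kubelet warning complains about.
    """
    cgroups = parse_proc_cgroups(text)
    return [name for name in required if not cgroups.get(name, False)]


# Sample copied from the worker-02 output in this thread (kernel 4.14.161).
SAMPLE = """\
#subsys_name hierarchy num_cgroups enabled
cpuset 2 14 1
cpu 7 97 1
cpuacct 7 97 1
blkio 8 97 1
memory 6 203 1
devices 5 97 1
freezer 3 14 1
net_cls 4 14 1
net_prio 4 14 1
"""

if __name__ == "__main__":
    print("missing:", missing_controllers(SAMPLE))
```

On a real node you would pass the contents of `/proc/cgroups` instead of `SAMPLE`, for example `missing_controllers(open("/proc/cgroups").read())`.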
Igor Posted January 7, 2020

1 hour ago, grunlab said:
> pids cgroup looks not enabled

I know. I did enable it, but the configuration is ignored for no apparent reason. Something must be wrong or missing in the kernel code or configuration ... which means someone needs to look deeper into the problem.

grunlab (Author) Posted January 8, 2020

On 1/7/2020 at 7:33 AM, Igor said:
> Its enabled on kernel 4.14.y which is o.k. now (we also had some stability issues) - you need to download normal stable images - they have been updated

Are you sure the normal stable image I downloaded for the test was built with the pids cgroup enabled in the kernel config file?

- The pids cgroup was enabled in the kernel config on 2019/12/13: https://github.com/armbian/build/commit/755388147d65a12d39b31898be29434f999700c8#diff-c793c3265bc32ca486ae790ba3c0ca59
- But the config was rolled back on 2020/01/04: https://github.com/armbian/build/commit/c1fbf0ab087e1be71a4b8edcf4736f5a6ffe211a#diff-c793c3265bc32ca486ae790ba3c0ca59

23 hours ago, grunlab said:
> I've downloaded the latest stable image based on kernel 4.14.y ... this one : https://dl.armbian.com/odroidxu4/archive/Armbian_19.11.6_Odroidxu4_buster_legacy_4.14.161.7z (build date: 2020/01/04) and reinstalled one of the cluster node

So this would explain why the pids cgroup was not enabled in the image I downloaded, no?

FYI, I'm currently trying (first time for me!) to build an image following this doc: https://docs.armbian.com/Developer-Guide_Building-with-Docker/

I will let you know ...

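When comparing a config file before and after a build, it can help to check the option state programmatically. The sketch below reads kernel `.config`-style text and reports the state of one option. It assumes the option in question is `CONFIG_CGROUP_PIDS` (the upstream Kconfig symbol for the pids controller) and relies on the usual `.config` conventions: `CONFIG_FOO=y` for an enabled option and `# CONFIG_FOO is not set` for a disabled one. The function name and the two sample fragments are hypothetical and are not taken from the Armbian commits linked above.

```python
import re


def config_option_state(config_text, option):
    """Return the value of `option` in kernel .config text.

    Returns the value string (e.g. "y" or "m") when the option is set,
    "not set" when it is explicitly disabled, and None when the option
    does not appear at all. If the option appears more than once, the
    last occurrence wins.
    """
    set_re = re.compile(r"^%s=(.*)$" % re.escape(option))
    unset_re = re.compile(r"^# %s is not set$" % re.escape(option))
    state = None
    for line in config_text.splitlines():
        line = line.strip()
        match = set_re.match(line)
        if match:
            state = match.group(1)
        elif unset_re.match(line):
            state = "not set"
    return state


# Hypothetical fragments for illustration only.
ENABLED_FRAGMENT = """\
CONFIG_CGROUPS=y
CONFIG_CGROUP_PIDS=y
CONFIG_CGROUP_FREEZER=y
"""

DISABLED_FRAGMENT = """\
CONFIG_CGROUPS=y
# CONFIG_CGROUP_PIDS is not set
CONFIG_CGROUP_FREEZER=y
"""

if __name__ == "__main__":
    for label, text in (("enabled", ENABLED_FRAGMENT), ("disabled", DISABLED_FRAGMENT)):
        print(label, "->", config_option_state(text, "CONFIG_CGROUP_PIDS"))
```

Running the same check on the config file in the build tree and on the `.config` the build actually produces would show whether the option is dropped somewhere in between.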
Igor Posted January 8, 2020

17 minutes ago, grunlab said:
> Are you sure the normal stable image i've downloaded for the test was built with the pids cgroup enabled in the kernel config file ?

I am sure it was not. What I am trying to tell you is that the kernel does not accept this feature even when it is selected from the menu. It's a bug in the kernel code which I have no time to fix.

grunlab (Author) Posted January 8, 2020

OK, OK ... no problem :-)

I've been living without this kernel feature for a while and haven't noticed any real impact at the k8s level, apart from those warnings in the logs ...

I will try to investigate on my side ...

belegdol Posted January 9, 2020

8 hours ago, Igor said:
> I am sure they are not. What I am trying to tell you is that kernel does not want to accept this feature even select from a menu. It's a bug in the kernel code which I have no time to fix.

That, plus the fact that enabling the feature seems to cause a lockup on reboot, according to the bisect run I did.

grunlab (Author) Posted March 11, 2020

FYI:

- I've reinstalled one of the cluster nodes using the latest image, Armbian_20.02.1_Odroidxu4_buster_current_5.4.19_minimal.img:

    adrien@bastion:~ $ kc get node worker-03 -o wide
    NAME        STATUS   ROLES    AGE   VERSION   INTERNAL-IP     EXTERNAL-IP   OS-IMAGE                       KERNEL-VERSION     CONTAINER-RUNTIME
    worker-03   Ready    <none>   11m   v1.17.3   192.168.0.106   <none>        Debian GNU/Linux 10 (buster)   5.4.19-odroidxu4   docker://19.3.8

- The pids cgroup is enabled on this image:

    adrien@worker-03:~$ cat /proc/cgroups | grep pids
    pids 3 97

- There are no more warnings in the Kubernetes logs, and the node looks stable for the moment :-)

I think this topic can be closed now.
