popnukem Posted July 22, 2021

Hi,

I have ten Orange Pi PC Plus boards with Ubuntu 20.04.2 LTS with Linux 5.10.43-sunxi burned onto eMMC, and twenty Orange Pi PC Plus boards with Focal 21.05.6 with Linux 5.10.43-sunxi burned onto eMMC.

In an attempt to provoke the same issues we have with Xenial on these boxes, I have all of them rebooting (shutdown -r now) once every 10 minutes. They have all been under test for several days. They are all connected to a LAN backbone, but the LAN is not connected to the internet, so whilst they may be attempting to do unattended upgrades, for example, those attempts all fail. This is a deliberate part of the test.

Of the twenty Focal 21 boxes, none have failed. Of the ten Focal 20 boxes, three halted instead of rebooting within 24 hours, having rebooted successfully for some time before that.

I can find nothing in any of the logs (syslog, dmesg, kern, armbian, unattended-upgrades or auth) to indicate why they chose to halt. I know they halted because I have a screen plugged into the HDMI port and the last few lines are to do with halting and not rebooting (see attached screen shot - sorry about the quality), and the green LED is off.

My questions are:

Has anyone seen this behaviour before, whether on Focal 20 or otherwise? If so, did you find out why?
Is it possible that the "shutdown -r now" command is occasionally being misinterpreted as "shutdown now"?
Is there anything else in the operating system that could be causing the halt?

many thanks
andy
Igor Posted July 23, 2021

11 hours ago, popnukem said:
Has anyone seen this behaviour before whether on Focal 20 or otherwise? If so did you find out why?

"Focal" has almost no bearing on the stability of the system. It has to be some other difference. Since you have provided no logs, I can't tell. In particular, we need to see which u-boot you are using on each set of boards.
popnukem Posted July 25, 2021

Hi Igor,

Apologies for mixing my terminology, I kind of started off on the right track. "Ubuntu 20.04.2 LTS with Linux 5.10.43-sunxi" had the 'halt' issue, "Armbian 21.05.6 Focal with Linux 5.10.43-sunxi" did not. Having said that, they have all been continuously rebooting over the weekend and there hasn't been a single failure. None of them are attached to the Internet.

Please see attached logs. I've edited the armbian logs because they were too big to attach. Each contains one complete monitor cycle at the beginning and end of the period.

thanks
andy

pi-pc-plus-logs.tgz
popnukem Posted August 8, 2021

Hi Guys,

I believe I have found the culprit. It seems I have been looking at the wrong log files: the files in /var/log appear to get overwritten on startup, whereas the files in /var/log.hdd appear to remain intact on startup and have preserved the traces from the previous shutdown.

In auth.log I found:

Aug 5 11:20:01 sd1-v18-F210506-516 CRON[24971]: pam_unix(cron:session): session opened for user root by (uid=0)
Aug 5 11:20:01 sd1-v18-F210506-516 CRON[24972]: pam_unix(cron:session): session opened for user root by (uid=0)
Aug 5 11:20:01 sd1-v18-F210506-516 CRON[24972]: pam_unix(cron:session): session closed for user root
Aug 5 11:20:02 sd1-v18-F210506-516 CRON[24971]: pam_unix(cron:session): session closed for user root
Aug 5 11:23:55 sd1-v18-F210506-516 systemd-logind[1593]: Power key pressed.
Aug 5 11:23:55 sd1-v18-F210506-516 systemd-logind[1593]: Powering Off...
Aug 5 11:23:55 sd1-v18-F210506-516 systemd-logind[1593]: System is powering down.
Aug 5 11:24:08 sd1-v18-F210506-516 CRON[1562]: pam_unix(cron:session): session opened for user root by (uid=0)
Aug 5 11:24:09 sd1-v18-F210506-516 systemd-logind[1605]: New seat seat0.
Aug 5 11:24:09 sd1-v18-F210506-516 systemd-logind[1605]: Watching system buttons on /dev/input/event0 (r_gpio_keys)
Aug 5 11:24:11 sd1-v18-F210506-516 sshd[1768]: Server listening on 0.0.0.0 port 22.
Aug 5 11:24:11 sd1-v18-F210506-516 sshd[1768]: Server listening on :: port 22.
Aug 5 11:24:26 sd1-v18-F210506-516 CRON[1562]: pam_unix(cron:session): session closed for user root
Aug 5 11:25:01 sd1-v18-F210506-516 CRON[1905]: pam_unix(cron:session): session opened for user root by (uid=0)
Aug 5 11:25:01 sd1-v18-F210506-516 CRON[1904]: pam_unix(cron:session): session opened for user root by (uid=0)
Aug 5 11:25:01 sd1-v18-F210506-516 CRON[1905]: pam_unix(cron:session): session closed for user root
Aug 5 11:25:01 sd1-v18-F210506-516 CRON[1904]: pam_unix(cron:session): session closed for user root

I had expected to find a similar indication in syslog, kern.log or dmesg, but they contain nothing other than the shutdown messages.

I'm sure nobody actually pressed the power button, but these buttons are incredibly sensitive on the Orange Pi PC Plus, especially when the board is housed in one of Orange Pi's plastic cases. It may be that this particular board is installed a bit too close to the little plastic flap that activates the button on the board, so that the smallest vibration causes the button to activate, but this is speculation.

I have deactivated the button with "rmmod gpio_keys" in rc.local, so we'll see if this fixes the issue in the longer term.

Incidentally, this button was not activated in our version of Xenial, and therefore the 'halting' issue has not materialised on any of the two hundred or so Orange Pis we have in the field.

Happy to provide further info if anybody is interested.

cheers
andy
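For anyone wanting to check a batch of boards for the same symptom, the search described above can be sketched in Python. This is only a minimal sketch, assuming the auth.log lines have the same shape as the excerpt in this post; the function names `find_power_key_events` and `scan_file` are made up for illustration and are not part of Armbian or systemd.

```python
import re

# Text that systemd-logind wrote in the auth.log excerpt quoted in this thread.
POWER_KEY = "Power key pressed"

# Assumed line shape: "<Mon> <day> <hh:mm:ss> <host> systemd-logind[<pid>]: <message>"
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+"
    r"systemd-logind\[\d+\]:\s+(?P<msg>.*)$"
)


def find_power_key_events(lines):
    """Return (timestamp, host) for each logind 'Power key pressed' entry."""
    events = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group("msg").startswith(POWER_KEY):
            events.append((match.group("stamp"), match.group("host")))
    return events


def scan_file(path):
    """Scan one log file, e.g. the preserved copy under /var/log.hdd."""
    with open(path, errors="replace") as handle:
        return find_power_key_events(handle)
```

Calling `scan_file("/var/log.hdd/auth.log")` as root on an affected board should list one entry per unexpected power-key event, which can then be compared against the times the board was found halted.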
Igor Posted August 9, 2021

We know that all Allwinner 32-bit boards are affected when using the latest u-boot. I have pushed the old u-boot; try this:
popnukem Posted August 15, 2021

Hi Igor,

I have thirty Orange Pi PC Plus boards running "Armbian 21.05.6 Focal with Linux 5.10.43-sunxi" burned onto eMMC, all rebooting every ten minutes. They all have the gpio_keys module removed in rc.local and none have failed since 11 August. That's 5 days, or some 720 reboots each. For this reason I've decided not to replace the current u-boot 2020.10.

I'm now pretty convinced the halting issue is mechanical. I have one box without gpio_keys removed, where I can provoke the halt by pulling and replacing the LAN cable, which is located at the opposite end of the board to the power button. I think pushing in the LAN cable is enough to move the board inside the case, which nudges the power button up against the little plastic flap in the case.

Many thanks for your attention. I'll take a look at the outstanding Armbian projects to see if there's anything I can help out with, or if you have any suggestions I'll be happy to try.

thanks
andy
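When repeating this test across many boards, it may help to confirm after each boot that the rc.local step really removed the module. The following is a rough sketch, assuming gpio_keys is built as a loadable module (as the rmmod above implies) and so appears in /proc/modules while loaded; the helper names are hypothetical.

```python
def loaded_modules(proc_modules_text):
    """Return the set of module names in /proc/modules-style text.

    Each line of /proc/modules starts with the module name, followed by
    size, reference count and other fields separated by whitespace.
    """
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            names.add(fields[0])
    return names


def is_module_loaded(name, proc_modules_text):
    """True if a module with exactly this name is currently listed."""
    return name in loaded_modules(proc_modules_text)


def read_proc_modules(path="/proc/modules"):
    """Read the kernel's list of loaded modules (Linux only)."""
    with open(path) as handle:
        return handle.read()
```

On a board, `is_module_loaded("gpio_keys", read_proc_modules())` returning False would suggest the power button is no longer being watched; a driver compiled into the kernel would not show up this way, so that case needs a different check.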