jscax - Posted July 1, 2018

I had an OPI0 running perfectly, then it suddenly froze. All I can find in /var/log/syslog is:

    Jul 1 10:35:10 localhost CRON[1553]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
    Jul 1 10:36:08 localhost rsyslogd-2007: action 'action 10' suspended, next retry is Sun Jul 1 10:36:38 2018 [v8.16.0 try http://www.rsyslog.com/e/2007 ]
    Jul 1 10:37:07 localhost rsyslogd-2007: action 'action 10' suspended, next retry is Sun Jul 1 10:37:37 2018 [v8.16.0 try http://www.rsyslog.com/e/2007 ]
    Jul 1 10:38:07 localhost rsyslogd-2007: action 'action 10' suspended, next retry is Sun Jul 1 10:38:37 2018 [v8.16.0 try http://www.rsyslog.com/e/2007 ]
    Jul 1 10:39:06 localhost rsyslogd-2007: action 'action 10' suspended, next retry is Sun Jul 1 10:39:36 2018 [v8.16.0 try http://www.rsyslog.com/e/2007 ]
    Jul 1 10:40:06 localhost rsyslogd-2007: action 'action 10' suspended, next retry is Sun Jul 1 10:40:36 2018 [v8.16.0 try http://www.rsyslog.com/e/2007 ]
    Jul 1 10:41:05 localhost rsyslogd-2007: action 'action 10' suspended, next retry is Sun Jul 1 10:41:35 2018 [v8.16.0 try http://www.rsyslog.com/e/2007 ]
    Jul 1 10:42:04 localhost rsyslogd-2007: action 'action 10' suspended, next retry is Sun Jul 1 10:42:34 2018 [v8.16.0 try http://www.rsyslog.com/e/2007 ]
    Jul 1 10:43:04 localhost rsyslogd-2007: action 'action 10' suspended, next retry is Sun Jul 1 10:44:04 2018 [v8.16.0 try http://www.rsyslog.com/e/2007 ]
    Jul 1 10:45:01 localhost rsyslogd-2007: action 'action 10' suspended, next retry is Sun Jul 1 10:46:01 2018 [v8.16.0 try http://www.rsyslog.com/e/2007 ]
    Jul 1 10:45:01 localhost CRON[2103]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)

Googling, I found this, but it doesn't seem to have solved the problem for anyone with the same issue. Any hints? Thank you.
jscax (Author) - Posted July 1, 2018

https://www.raspberrypi.org/forums/viewtopic.php?f=63&t=134971#p898539

Maybe I found the solution: commenting out the last 4 lines of /etc/rsyslog.d/50-default.conf. These are the last 4:

    # NOTE: adjust the list below, or you'll go crazy if you have a reasonably
    # busy site..
    #
    daemon.*;mail.*;\
        news.err;\
        *.=debug;*.=info;\
        *.=notice;*.=warn |/dev/xconsole

Then: service rsyslog restart

Let's see if the issue shows up again.
jscax (Author) - Posted July 3, 2018

It's happening again. There's something in the rsyslog configuration which crashes the Orange Pi Zero! I can see there's a new rule which writes everything ufw (the firewall) catches to /var/log/syslog:

    :msg,contains,"[UFW " /var/log/ufw.log

I don't know if this makes things worse, but the issue here is that I'd like to get rid of this bad crash. After the crash I have to reboot several times, and I can only do it manually, by removing +5V from the board. log2ram seems to be disabled.

Can anyone help? This is having a bad impact. Thank you.
Igor - Posted July 3, 2018

Random crashes are usually related to DRAM clocked too high. Your DRAM configuration might be below average quality and ... the second option is DVFS, which is also not done properly yet. Setting a fixed CPU speed might help.

For the first, you need to recompile u-boot with a lower DRAM speed; for the second, you need to adjust /etc/default/cpufrequtils.
tkaiser - Posted July 3, 2018

On 7/1/2018 at 10:55 AM, jscax said:
    hints??

If a board crashes or freezes, nothing will be written to syslog, so searching for strings that appeared in the logfile long before the crash/freeze happened is pretty much useless. We provide diagnosis tools: check the output of 'armbianmonitor -m' yourself (maybe post a few lines after an hour of operation) and please post the output from 'armbianmonitor -u' (since without it no one has a clue which branch and kernel you're running and so on).

@Igor: Is this patch not applied any more? https://github.com/armbian/build/blob/master/patch/u-boot/u-boot-sunxi/adjust-default-dram-clockspeeds.patch#L244
Igor - Posted July 3, 2018

9 minutes ago, tkaiser said:
    Is this patch not applied any more?

It is applied. Perhaps in some rare cases even this adjustment might not be enough? And powering with microUSB?
tkaiser - Posted July 3, 2018

9 minutes ago, Igor said:
    Perhaps in some rare cases, even this adjustment might not be enough?

If @jscax is using a default/legacy image, it's pretty easy to lower DRAM clockspeed with 'h3consumption -D'. And yes, undervoltage might be the culprit (the Micro USB nightmare).
jscax (Author) - Posted July 14, 2018

It happened again and I can't find a pattern or a cause. armbianmonitor -u shares too much information, so I attached a manually cleaned (I hope) version:

2018_07_04_armbianmonitor.txt

There's something strange at the beginning of the attached report about wrong SD partitioning. I don't know if this has something to do with the crash.

After this crash I had to try to boot 2 times without success; the third attempt succeeded. During the failing attempts I had to pull the 5V to force a shutdown of the OPI0.

The OPI0 is powered by an oversized 15W PSU. I'm not using microUSB; I'm powering it through the GPIO pins (Pin2, for instance).

    uname -r
    4.14.18-sunxi

It's quite important for me to avoid these crashes because they completely block the functionality of the OPI0 until I manually power it off several times. Thank you very much.
Igor - Posted July 14, 2018

13 minutes ago, jscax said:
    uname -r 4.14.18-sunxi

Now repeat the whole thing with the most recent kernel 4.17.y/u-boot 2018.05 combo from beta.armbian.com ...

armbian-config -> system -> switch to nightly automated build -> reboot
jscax (Author) - Posted July 14, 2018

14 hours ago, Igor said:
    Now repeat the whole thing with most recent kernel 4.17.y/u-boot 2018.05 combo from beta.armbian.com ... armbian-config -> system -> switch to nightly automated build -> reboot

    uname -r
    4.17.6-sunxi

Hmm, now the CPU seems stuck at 1200MHz. Frequency scaling is not working anymore.

    armbianmonitor -m
    Stop monitoring using [ctrl]-[c]
    Time        CPU    load %cpu %sys %usr %nice %io %irq   CPU  C.St.
    22:18:48: 1200MHz  0.65  22%   7%  13%   0%   1%   0% 47.1°C  0/8
    22:18:53: 1200MHz  0.60   0%   0%   0%   0%   0%   0% 46.2°C  0/8
    22:18:58: 1200MHz  0.55   0%   0%   0%   0%   0%   0% 46.1°C  0/8

    cat /etc/default/cpufrequtils
    # WARNING: this file will be replaced on board support package (linux-root-...) upgrade
    ENABLE=true
    MIN_SPEED=240000
    MAX_SPEED=1200000
    GOVERNOR=ondemand

Any suggestion? Thank you.
Igor - Posted July 15, 2018

9 hours ago, jscax said:
    Any suggestion?

One problem at a time. Does it crash?
jscax (Author) - Posted July 15, 2018

4 hours ago, Igor said:
    One problem at the time. Does it crash?

Sadly, it crashed again after about 16 hours of uptime with the new kernel. In my case it has to be on 24/7.

I'm attaching the armbianmonitor -u file. I think it has something to do with log rotation/cron/log2ram, but I can't spot anything in the logs. This time it crashed at 12 PM and I had to force a reboot, which succeeded at 12:31. Thank you.

armbianlog.txt
Igor - Posted July 15, 2018

3 minutes ago, jscax said:
    Sadly it crashed again after like 16 hours of uptime with the new kernel. In my case it must be on 24h/7d Attaching the armbianmonitor -u file I think it has something to do with logrotation/cron/log2ram something But I can't spot anything from logs. Now it crashed at 12PM and had to force reboot, which succeded at 12:31 thank you armbianlog.txt

Nothing suspicious to me. Try limiting the CPU to 960000 and double-check the voltage at the board. Another option is to lower the DRAM speed, but you will need to rebuild u-boot with lower settings. It's worth trying.
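For readers who want to script the CPU limit, here is a minimal Python sketch of the edit to /etc/default/cpufrequtils. It assumes the file uses the plain KEY=value layout posted earlier in the thread and that MAX_SPEED is given in kHz, so 960000 means 960 MHz. The function name set_max_speed is hypothetical; the sketch only transforms text and leaves reading and writing the real file (as root) to the caller.

```python
import re


def set_max_speed(config_text: str, max_khz: int) -> str:
    """Return config_text with its MAX_SPEED line set to max_khz (in kHz).

    Hypothetical helper: assumes the KEY=value layout of
    /etc/default/cpufrequtils as shown earlier in the thread.
    """
    new_line = f"MAX_SPEED={max_khz}"
    # Replace an existing MAX_SPEED line, whatever value it currently holds.
    updated, count = re.subn(
        r"^MAX_SPEED=.*$", new_line, config_text, flags=re.MULTILINE
    )
    if count == 0:
        # No MAX_SPEED line present: append one, keeping lines separated.
        sep = "" if config_text == "" or config_text.endswith("\n") else "\n"
        updated = config_text + sep + new_line + "\n"
    return updated
```

Calling set_max_speed on the four-line file from the thread with 960000 changes only the MAX_SPEED line. The new value takes effect only once cpufrequtils re-reads the file, for example after a reboot, and the file's own header warns that it is replaced on board support package upgrades.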
tkaiser - Posted July 15, 2018

11 minutes ago, jscax said:
    But I can't spot anything from logs

Again: you won't find anything in any log if you suffer from the usual hardware problems (they cause freezes/crashes, especially on 'el cheapo' SBCs and especially when powered by crappy Micro USB). Apart from that, logging with the latest Armbian is broken anyway (at least shutdown logging: the relevant service is not executed at the shutdown/halt/reboot target).
jscax (Author) - Posted July 15, 2018

3 minutes ago, tkaiser said:
    Again: you won't find anything in any log if you suffer from usual hardware problems (they cause freezes/crashes, especially on 'el cheap' SBC and especially when powered by crappy Micro USB). Apart from that logging with latest Armbian is broken anyway (at least shutdown logging -- the relevant service is not executed at shutdown/halt/reboot target)

In my experience it's not completely impossible to spot something in logs written before a kernel panic, but this time there's no trace. log2ram is probably not my friend in this case. How does it work? It logs to RAM, then a kernel panic occurs and poof... the logs are gone because nothing wrote them to the SD card?

On the power supply side: I'm not using Micro USB. I have a 15W PSU (and I'm drawing about 3 watts) connected directly to the GPIO pins. I'll have a chance to use another power supply in a few days and will see if that's the issue.
jscax (Author) - Posted July 15, 2018

26 minutes ago, Igor said:
    Nothing suspicious to me. Try with limit CPU down to 960000, double check voltage at the board, Another option is to lower DRAM speed, but you will need to rebuild u-boot with lower settings. It's worth trying.

OK, I'm now limiting the CPU to 960MHz. Do you have a link for the DRAM thing?

I think I'm a bit out of the loop... where can I keep up to date with Armbian news? I think there's a lot I'm missing. Thank you.
tkaiser - Posted July 15, 2018

11 minutes ago, jscax said:
    log2ram should not be my friend in this case. how does it work? it logs in ram, then a kernel panic occurs and puff... logs are gone because no one wrote them on sd?

Yes. And simply deactivating it won't help: you also need to modify /etc/fstab, since our default settings use a commit interval of 10 minutes. So removing the commit setting and adding sync might provide some insights in the log... but bear in mind that such changes reduce the lifetime of your SD card.
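The /etc/fstab change described above can be sketched in Python as a pure text transformation. This is a rough illustration, not Armbian tooling: the helper names make_sync_options and rewrite_root_entry are hypothetical, and it assumes the root filesystem entry uses the standard whitespace-separated fstab fields with a commit=N mount option (a 10-minute interval would be commit=600).

```python
def make_sync_options(options: str) -> str:
    """Drop any commit=N option and add sync to a comma-separated option string."""
    kept = [
        opt for opt in options.split(",")
        if opt and not opt.startswith("commit=")
    ]
    if "sync" not in kept:
        kept.append("sync")
    return ",".join(kept)


def rewrite_root_entry(fstab_line: str) -> str:
    """Apply make_sync_options to the root ("/") entry; return other lines unchanged."""
    fields = fstab_line.split()
    # fstab fields: device, mount point, fs type, options, dump, pass
    is_entry = len(fields) >= 4 and not fields[0].startswith("#")
    if is_entry and fields[1] == "/":
        fields[3] = make_sync_options(fields[3])
        return " ".join(fields)
    return fstab_line
```

Applied to each line of a copy of /etc/fstab, this changes only the root entry. Keeping a backup of the original file seems prudent, both because a broken fstab can prevent booting and because the sync option increases SD card wear, as noted above.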
jscax (Author) - Posted July 24, 2018

So an update came in through apt upgrade, and it updated the max frequency config file too (/etc/default/cpufrequtils).

    Welcome to ARMBIAN 5.53.180722 nightly Ubuntu 16.04.5 LTS 4.17.9-sunxi

After the update, the max frequency was restored to the original default value (MAX_SPEED 1200MHz) and I was barely able to boot my OPI0 again. I went through several boot attempts; then, after leaving it turned off for 30 minutes, I was able to boot again and change MAX_SPEED back to 960MHz.

From my experience, the 1200MHz frequency is the cause of the random OPI0 freezes. I have had no freezes since I set MAX_SPEED to 960MHz. Do you know if there's some undervolting going on? Maybe there's a wrong value at 1200MHz?

Moreover, after the update the OPI0 is now scaling its frequency (before, it was locked at max frequency). Is it possible to link temperature with frequency scaling? If temp > 50 °C ==> max freq = xx
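The "if temp > 50 °C then cap the max frequency" idea can be sketched in Python using the generic Linux sysfs interfaces. Several assumptions are baked in and should be checked on the actual board: that /sys/class/thermal/thermal_zone0/temp exists and reports millidegrees Celsius, that all cores share the cpu0 cpufreq policy, and that the lower cap of 816000 kHz (chosen here purely as a placeholder) is a frequency the kernel accepts. The function names are hypothetical, and the kernel may already throttle on its own, so this is an illustration rather than a recommended setup.

```python
from pathlib import Path

# Assumed sysfs locations; verify that they exist on your kernel.
TEMP_PATH = Path("/sys/class/thermal/thermal_zone0/temp")  # millidegrees C
MAX_FREQ_PATH = Path("/sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq")  # kHz

THRESHOLD_C = 50.0     # threshold taken from the question in the thread
COOL_MAX_KHZ = 960000  # the cap that was stable in the thread
HOT_MAX_KHZ = 816000   # hypothetical lower cap, stands in for "xx"


def pick_max_freq(temp_c, threshold_c=THRESHOLD_C,
                  cool_khz=COOL_MAX_KHZ, hot_khz=HOT_MAX_KHZ):
    """Return the max frequency (kHz) to use at the given temperature."""
    return hot_khz if temp_c > threshold_c else cool_khz


def read_temp_c(path=TEMP_PATH):
    """Read the SoC temperature in degrees Celsius (assumes millidegrees in sysfs)."""
    return int(path.read_text().strip()) / 1000.0


def apply_once():
    """Read the temperature and write the matching cap; needs root privileges."""
    target = pick_max_freq(read_temp_c())
    MAX_FREQ_PATH.write_text(str(target))
    return target
```

Calling apply_once() periodically as root, for example from a cron job or a small loop with a sleep, would re-evaluate the cap. A single threshold can make the cap flip back and forth around 50 °C, so a real setup would probably want some hysteresis.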
Igor - Posted July 25, 2018

On 7/24/2018 at 11:53 AM, jscax said:
    So an update came into apt upgrade and it updated max freq config file too

@tkaiser Shall we rework this to set MIN, MAX and governor with armbian-hardware-optimization, with our variables going into /etc/defaults/armbian-cpu-config?
tkaiser - Posted July 25, 2018

1 hour ago, Igor said:
    Shall we rework this to set MIN, MAX and governor with armbian-hardware-optimization and our variables go to /etc/defaults/armbian-cpu-config?

I would prefer to use upstream mechanisms: altering the contents of /etc/default/cpufrequtils when creating images, but not overwriting this file afterwards as part of updates. Possible?
Igor - Posted July 25, 2018

8 minutes ago, tkaiser said:
    I would prefer to use upstream mechanisms: Altering contents of /etc/defaults/cpufrequtils when creating images but not overwriting this file afterwards as part of updates. Possible?

OK, then rather this way.