rsyslog causing Orange Pi Zero crash and freeze


jscax


So I have an OPI0 that was running perfectly, then suddenly it freezes.

All I can find in /var/log/syslog:

Jul  1 10:35:10 localhost CRON[1553]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
Jul  1 10:36:08 localhost rsyslogd-2007: action 'action 10' suspended, next retry is Sun Jul  1 10:36:38 2018 [v8.16.0 try http://www.rsyslog.com/e/2007 ]
Jul  1 10:37:07 localhost rsyslogd-2007: action 'action 10' suspended, next retry is Sun Jul  1 10:37:37 2018 [v8.16.0 try http://www.rsyslog.com/e/2007 ]
Jul  1 10:38:07 localhost rsyslogd-2007: action 'action 10' suspended, next retry is Sun Jul  1 10:38:37 2018 [v8.16.0 try http://www.rsyslog.com/e/2007 ]
Jul  1 10:39:06 localhost rsyslogd-2007: action 'action 10' suspended, next retry is Sun Jul  1 10:39:36 2018 [v8.16.0 try http://www.rsyslog.com/e/2007 ]
Jul  1 10:40:06 localhost rsyslogd-2007: action 'action 10' suspended, next retry is Sun Jul  1 10:40:36 2018 [v8.16.0 try http://www.rsyslog.com/e/2007 ]
Jul  1 10:41:05 localhost rsyslogd-2007: action 'action 10' suspended, next retry is Sun Jul  1 10:41:35 2018 [v8.16.0 try http://www.rsyslog.com/e/2007 ]
Jul  1 10:42:04 localhost rsyslogd-2007: action 'action 10' suspended, next retry is Sun Jul  1 10:42:34 2018 [v8.16.0 try http://www.rsyslog.com/e/2007 ]
Jul  1 10:43:04 localhost rsyslogd-2007: action 'action 10' suspended, next retry is Sun Jul  1 10:44:04 2018 [v8.16.0 try http://www.rsyslog.com/e/2007 ]
Jul  1 10:45:01 localhost rsyslogd-2007: action 'action 10' suspended, next retry is Sun Jul  1 10:46:01 2018 [v8.16.0 try http://www.rsyslog.com/e/2007 ]
Jul  1 10:45:01 localhost CRON[2103]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)

 

Googling I found this, but it doesn't seem to solve the problem.

 

anyone with the same issue?

hints??

 

thank you


https://www.raspberrypi.org/forums/viewtopic.php?f=63&t=134971#p898539 maybe I found the solution

 

Commenting out the last 4 lines of /etc/rsyslog.d/50-default.conf, these ones:


# NOTE: adjust the list below, or you'll go crazy if you have a reasonably
#      busy site..
#
daemon.*;mail.*;\
       news.err;\
       *.=debug;*.=info;\
       *.=notice;*.=warn       |/dev/xconsole

then service rsyslog restart
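
For reference, a minimal sketch of that edit as a shell filter. It assumes the rule starts with a line reading exactly `daemon.*;mail.*;\` and ends at the line that names `/dev/xconsole`, as in the excerpt above. `comment_xconsole_rule` is a made-up helper name, not something rsyslog ships.

```shell
# comment_xconsole_rule: hypothetical helper (not part of rsyslog).
# Reads a config on stdin and prefixes '#' to every line from the one that
# starts the rule ("daemon.*;mail.*;\") through the one naming /dev/xconsole.
# Caveat: if the start line matches but no later line names /dev/xconsole,
# sed keeps commenting until the end of the input.
comment_xconsole_rule() {
    sed '/^daemon\.\*;mail\.\*;\\$/,/\/dev\/xconsole/ s/^/#/'
}
```

Running it as `comment_xconsole_rule < /etc/rsyslog.d/50-default.conf > /tmp/50-default.conf` and then comparing the two files with `diff` shows the change before anything is overwritten.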

 

Let's see if the issue shows up again.


it's happening again.

 

There's something in the rsyslog configuration which crashes the Orange Pi Zero!

 

I can see there's a new rule for everything ufw (the firewall) catches, which I see being written to /var/log/syslog:

:msg,contains,"[UFW " /var/log/ufw.log

 

I don't know if this makes things worse, but the point is that I'd like to get rid of this bad crash.

After this crash I have to reboot several times, and I can only do it manually, by removing +5V from the board.

 

log2ram seems to be disabled

 

Can anyone help? This is having a bad impact

 

thank you
 


Random crashes are usually related to DRAM clocked too high. Your DRAM configuration might be of below-average quality and ... the second option is DVFS, which is also not done properly yet. Setting a fixed CPU speed might help. For the first you need to recompile u-boot with a lower DRAM speed; for the second, you need to adjust /etc/default/cpufrequtils.


On 7/1/2018 at 10:55 AM, jscax said:

hints??

 

If a board crashes or freezes, nothing will be written to syslog, so searching for strings that appeared in the logfile long before the crash/freeze happened is pretty much useless.

 

We provide diagnostic tools: check the output of 'armbianmonitor -m' yourself (maybe post a few lines after an hour of operation) and please post the output from 'armbianmonitor -u' (since without it no one has a clue which branch and kernel you're running and so on).

 

@Igor: Is this patch not applied any more? https://github.com/armbian/build/blob/master/patch/u-boot/u-boot-sunxi/adjust-default-dram-clockspeeds.patch#L244


9 minutes ago, tkaiser said:

Is this patch not applied any more?


It is applied. Perhaps in some rare cases, even this adjustment might not be enough? And powering via Micro USB? :)


9 minutes ago, Igor said:

Perhaps in some rare cases, even this adjustment might not be enough?

 

If @jscax is using a default/legacy image it's pretty easy to lower DRAM clockspeed with 'h3consumption -D'. And yes, undervoltage might be the culprit (the Micro USB nightmare).


It happened again and I can't find a pattern or a cause.

 

armbianmonitor -u shares too much information, so I attached a manually cleaned (I hope) version.

2018_07_04_armbianmonitor.txt

 

There's something strange at the beginning of the attached report about wrong SD partitioning. I don't know if this has something to do with the crash.

In this crash event I had to try to boot again 2 times, then at the 3rd attempt it succeeded. During the failing attempts I had to pull out 5V to force a shutdown of the OPI0.

 

The OPI0 is powered by an oversized 15W PSU and I'm not using Micro USB; I'm powering it through the GPIO pins (pin 2, for instance).

 

uname -r
4.14.18-sunxi
 

It's quite important for me to avoid those crashes because they completely block the functionality of the OPI0 until I manually shut it down several times.

 

thank you very much

 


13 minutes ago, jscax said:

uname -r
4.14.18-sunxi


Now repeat the whole thing with most recent kernel 4.17.y/u-boot 2018.05 combo from beta.armbian.com ... armbian-config -> system -> switch to nightly automated build -> reboot


14 hours ago, Igor said:


Now repeat the whole thing with most recent kernel 4.17.y/u-boot 2018.05 combo from beta.armbian.com ... armbian-config -> system -> switch to nightly automated build -> reboot

 

 

uname -r
4.17.6-sunxi

Mmh now the CPU seems stuck at 1200MHz. Frequency scaling is not working anymore.

armbianmonitor -m
Stop monitoring using [ctrl]-[c]
Time        CPU    load %cpu %sys %usr %nice %io %irq   CPU  C.St.

22:18:48: 1200MHz  0.65  22%   7%  13%   0%   1%   0% 47.1°C  0/8
22:18:53: 1200MHz  0.60   0%   0%   0%   0%   0%   0% 46.2°C  0/8
22:18:58: 1200MHz  0.55   0%   0%   0%   0%   0%   0% 46.1°C  0/8

cat /etc/default/cpufrequtils
# WARNING: this file will be replaced on board support package (linux-root-...) upgrade
ENABLE=true
MIN_SPEED=240000
MAX_SPEED=1200000
GOVERNOR=ondemand
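
To see what the kernel is actually applying (as opposed to what the config file asks for), the standard cpufreq sysfs attributes can be read directly. A small sketch follows; `show_cpufreq` is a made-up helper name, and the default path assumes cpu0 is representative of the policy used on this board.

```shell
# show_cpufreq [DIR]: hypothetical helper that prints a few cpufreq sysfs
# attributes. DIR defaults to cpu0's cpufreq directory (an assumption).
# Typical call on the board: show_cpufreq
show_cpufreq() {
    dir=${1:-/sys/devices/system/cpu/cpu0/cpufreq}
    for f in scaling_governor scaling_min_freq scaling_max_freq scaling_cur_freq; do
        if [ -r "$dir/$f" ]; then
            printf '%s: %s\n' "$f" "$(cat "$dir/$f")"
        else
            printf '%s: (not readable)\n' "$f"
        fi
    done
}
```

If `scaling_governor` reports something other than the GOVERNOR set in /etc/default/cpufrequtils, the file is probably not being applied at boot.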

 

Any suggestion?

Thank you


4 hours ago, Igor said:

 

One problem at the time. Does it crash?

Sadly it crashed again after about 16 hours of uptime with the new kernel. In my case it must be on 24/7.

 

Attaching the armbianmonitor -u file

 

I think it has something to do with log rotation/cron/log2ram or something like that.

But I can't spot anything from logs.

 

Now it crashed at 12PM and I had to force a reboot, which succeeded at 12:31

 

thank you

armbianlog.txt


3 minutes ago, jscax said:

Sadly it crashed again after about 16 hours of uptime with the new kernel. In my case it must be on 24/7.

 

Attaching the armbianmonitor -u file

 

I think it has something to do with log rotation/cron/log2ram or something like that.

But I can't spot anything from logs.

 

Now it crashed at 12PM and I had to force a reboot, which succeeded at 12:31

 

thank you

armbianlog.txt

 

Nothing suspicious to me. Try limiting the CPU to 960000 and double-check the voltage at the board. Another option is to lower the DRAM speed, but you will need to rebuild u-boot with lower settings. It's worth trying.


11 minutes ago, jscax said:

But I can't spot anything from logs

 

Again: you won't find anything in any log if you suffer from the usual hardware problems (they cause freezes/crashes, especially on 'el cheapo' SBCs and especially when powered by crappy Micro USB).

 

Apart from that logging with latest Armbian is broken anyway (at least shutdown logging -- the relevant service is not executed at shutdown/halt/reboot target)


3 minutes ago, tkaiser said:

 

Again: you won't find anything in any log if you suffer from the usual hardware problems (they cause freezes/crashes, especially on 'el cheapo' SBCs and especially when powered by crappy Micro USB).

 

Apart from that logging with latest Armbian is broken anyway (at least shutdown logging -- the relevant service is not executed at shutdown/halt/reboot target)

it's not completely impossible to spot something on logs written before a kernel panic, that's my experience, but this time there's no trace.

log2ram should not be my friend in this case. how does it work? it logs in ram, then a kernel panic occurs and puff... logs are gone because no one wrote them on sd?

 

On the power supply side: I'm not using Micro USB. I have a 15W PSU (and I'm using like 3 watts) connected directly to the GPIO pins.

I'll have a chance to use another power supply in a few days and will see if that's the issue.


 

26 minutes ago, Igor said:

 

Nothing suspicious to me. Try limiting the CPU to 960000 and double-check the voltage at the board. Another option is to lower the DRAM speed, but you will need to rebuild u-boot with lower settings. It's worth trying.

ok now limiting CPU @ 960MHz
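
For anyone following along, a sketch of what the capped file might look like. Only MAX_SPEED differs from the /etc/default/cpufrequtils posted earlier; that 960000 kHz is an available operating point on this particular board is an assumption.

```shell
# /etc/default/cpufrequtils (sketch; values other than MAX_SPEED are the
# ones shown earlier in this thread)
ENABLE=true
MIN_SPEED=240000
MAX_SPEED=960000   # assumption: 960000 kHz is a valid step on this board
GOVERNOR=ondemand
```

The change presumably needs a restart of the cpufrequtils service or a reboot to take effect, and, as the warning at the top of the original file says, a board support package upgrade will replace the file again.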

 

Do you have a link for the DRAM thing? I think I'm a little bit out of the loop... where can I keep myself updated with Armbian news? I think there's a lot I'm missing. Thank you


11 minutes ago, jscax said:

log2ram should not be my friend in this case. how does it work? it logs in ram, then a kernel panic occurs and puff... logs are gone because no one wrote them on sd?

 

Yes. And simply deactivating it won't help: you also need to modify /etc/fstab, since our default settings use a commit interval of 10 minutes. So removing the commit setting and adding sync might provide some insights in the log... but be aware that such changes reduce the lifetime of your SD card.
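
A rough sketch of that fstab change, shown on a made-up entry. The device name and the other mount options below are placeholders, not taken from this board; `commit=600` stands for the 10-minute interval, and `drop_commit_add_sync` is a hypothetical helper name.

```shell
# drop_commit_add_sync: hypothetical helper. Rewrites fstab lines on stdin,
# replacing a "commit=N" mount option with "sync".
drop_commit_add_sync() {
    sed 's/commit=[0-9][0-9]*/sync/'
}

# Made-up example entry; the real line on the SD card will differ.
echo '/dev/mmcblk0p1 / ext4 defaults,noatime,commit=600,errors=remount-ro 0 1' | drop_commit_add_sync
# prints: /dev/mmcblk0p1 / ext4 defaults,noatime,sync,errors=remount-ro 0 1
```

It is probably safest to apply this by hand in an editor and keep a copy of the original /etc/fstab, since a broken entry can prevent the board from booting.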


So an update came in via apt upgrade, and it also overwrote the max frequency config file (/etc/default/cpufrequtils).

 

Welcome to ARMBIAN 5.53.180722 nightly Ubuntu 16.04.5 LTS 4.17.9-sunxi

After the update the max frequency was restored to the original default value (MAX_SPEED 1200MHz) and I was barely able to boot my OPI0 again.

 

I went through several boot attempts, then after leaving it turned off for 30 minutes I was able to boot again and change back the MAX_SPEED to 960MHz.

 

In my experience the 1200MHz frequency is the cause of the random OPI0 freezes: I have had no freezes since I changed MAX_SPEED to 960MHz.

Do you know if there's some undervolting going on? Maybe there's a wrong value @1200MHz?

 

Moreover: after the update now the OPI0 is scaling frequency (before it was locked at max freq).

Is it possible to link temperature with freq scaling?

If temp > 50 °C ==> max freq = xx
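
As a rough illustration of that idea (not an existing Armbian feature), something like the script below could be run periodically as root. It assumes thermal_zone0 is the CPU sensor and reports millidegrees Celsius, that writing cpu0's scaling_max_freq is enough to cap the whole SoC, and it uses 50 °C, 960000 and 1200000 purely as example values. `pick_max_freq` is a made-up name.

```shell
#!/bin/sh
# Sketch only: lower the CPU frequency cap when the SoC runs hot.
# Assumptions (not from this thread): thermal_zone0 is the CPU sensor and
# reports millidegrees Celsius; the thresholds below are example values.
TEMP_FILE=/sys/class/thermal/thermal_zone0/temp
MAX_FREQ_FILE=/sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq

# pick_max_freq MILLIDEG: print the frequency cap in kHz for a temperature.
pick_max_freq() {
    if [ "$1" -gt 50000 ]; then
        echo 960000
    else
        echo 1200000
    fi
}

# Only write to sysfs when called with --apply (needs root).
if [ "${1:-}" = "--apply" ]; then
    pick_max_freq "$(cat "$TEMP_FILE")" > "$MAX_FREQ_FILE"
fi
```

Without `--apply` the script changes nothing, which makes it easy to check the threshold logic first, for example with `pick_max_freq 52000`.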


1 hour ago, Igor said:

Shall we rework this to set MIN, MAX and governor with armbian-hardware-optimization and our variables go to /etc/defaults/armbian-cpu-config?

 

I would prefer to use upstream mechanisms: altering the contents of /etc/default/cpufrequtils when creating images, but not overwriting this file afterwards as part of updates. Possible?


8 minutes ago, tkaiser said:

I would prefer to use upstream mechanisms: altering the contents of /etc/default/cpufrequtils when creating images, but not overwriting this file afterwards as part of updates. Possible?


OK, then rather this way.


This topic is now closed to further replies.