Random system freezes


Hello

 

For quite some time I have been experiencing system freezes. I already measured the voltages on the board: 12V and 5V are okay on both connectors.

Attached you will find the armbianmonitor -U output.

 

I tried to capture kernel logs using information from some other thread.

 

sudo dmesg -n 7
sudo dmesg -w

 

But I could not capture anything useful.
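One way to keep whatever the kernel printed last before a freeze is to append each line to a file and flush it to disk straight away. This is a minimal sketch, not a verified fix: the function name `log_kernel_messages` and the log path in the usage comment are made up for illustration, and a hard freeze can still lose the final lines.

```shell
# Append every line read from stdin to the file given as $1 and
# force it to disk immediately, so it survives a later power cycle.
log_kernel_messages() {
    out="$1"
    while IFS= read -r line; do
        printf '%s\n' "$line" >> "$out"
        sync    # flush buffers after each line (slow, but safer before a freeze)
    done
}

# Usage (hypothetical log path):
#   sudo dmesg -n 7
#   sudo dmesg -w | log_kernel_messages /root/dmesg-freeze.log
```

Running this over SSH or in a terminal multiplexer and checking the file after the next power cycle shows how far the kernel got.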

Today the system froze while checking the RAID (the filesystem was not mounted).

 

[  168.224361] md: data-check of RAID array md0

 

Is there anything else I can do to shed some light?

 

 

Thanks and regards

-kratz00

armbianmonitor.log


Hi,

 

By system freeze, do you mean the system hangs and you need to manually reset / power cycle it?

 

Is the watchdog service running? systemctl status watchdog.service

 

What is the temperature of the SoC under load? cat /dev/thermal-cpu/temp1_input

Just trying to rule out any thermal issue first.

 

Regards,


Hi gprovost

3 hours ago, gprovost said:

By system freeze, do you mean the system hangs and you need to manually reset / power cycle it?

Exactly. Not reachable over the network. Does not respond via serial console.

 

3 hours ago, gprovost said:

Is the watchdog service running? systemctl status watchdog.service

Seems it is missing:

kratz00@helios4:~$ systemctl status watchdog.service
Unit watchdog.service could not be found.

 

3 hours ago, gprovost said:

What is the temperature of the SoC under load? cat /dev/thermal-cpu/temp1_input

Just trying to rule out any thermal issue first.

The RAID check has been running for nearly an hour, load is high, and the temperature is stable at around 55°C:

root@helios4:~# uptime
 06:53:44 up 21 min,  1 user,  load average: 2.00, 1.92, 1.27
root@helios4:~# cat /dev/thermal-cpu/temp1_input 
55122
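The raw value is in millidegrees Celsius, so 55122 corresponds to about 55.1°C. A small sketch of the conversion, assuming the sensor path shown above; the helper name `millideg_to_c` is made up for illustration:

```shell
# Convert a raw millidegree reading (e.g. 55122) to degrees Celsius.
millideg_to_c() {
    LC_ALL=C awk -v raw="$1" 'BEGIN { printf "%.1f\n", raw / 1000 }'
}

# Sensor path as used in this thread; skipped on machines without it.
sensor=/dev/thermal-cpu/temp1_input
if [ -r "$sensor" ]; then
    millideg_to_c "$(cat "$sensor")"
fi
```

Calling this in a loop with `sleep` during a RAID check would show whether the temperature climbs before a freeze.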

 

Regards

-kratz00


@Mangix, try 5.4.66-mvebu #20.08.3. After some updates I also had random reboots under heavy NFS load again. I had this problem previously, but after reinstalling the system from scratch on a spare SD card it was rock solid. I have now inserted the spare card again and the system is stable again, but I have no time at the moment to test where the problem is.


Short update: I am running 5.4.66-mvebu now, and it froze after just a couple of minutes of running the RAID check.

I am officially out of ideas.

The Helios4 had been running fine for many years since I got it after the successful Kickstarter campaign.

I cannot really pinpoint when it started freezing (I think it started after October 2019).

I am also not sure if it is a hardware or a software problem.

 
