
NanoPi M4V2 randomly crashes


Pedro Lamas


My NanoPi M4V2 is currently being used completely headless just to run a few Docker containers (AdGuard Home, Home Assistant, Zigbee2Mqtt, etc.) and I've been noticing more and more crashes happening... sometimes they occur while I have an open SSH terminal, and I get a glimpse of a "kernel panic" type error...

 

Here's the output from armbian-config: http://ix.io/2vFX

 

Any ideas on what this is or how to fix it?

 

Thanks in advance!


On 8/30/2020 at 11:27 AM, Pedro Lamas said:

and I get a glimpse of a "kernel panic" type error...


A screenshot of the kernel panic on a console might help; our default logs don't show anything suspicious. I hope your power supply is of proper quality?


8 hours ago, Igor said:

I hope your power supply is of proper quality?

I'm powering it with a 5V 4A PoE adapter. I'm using a 256GB Sabrent M.2 NVME SSD as the main disk (configured via armbian-config), so the Samsung SD card is for initial booting only.

 

I'll try to get a screenshot of the error and post it here the next time I get that error.


2 hours ago, Pedro Lamas said:

not sure what else can I do to fix it

At this point I do not recommend running the M4V2 with mainline Linux.

The reason for these instabilities, and the remedy, are yet to be discovered :(

The board runs stable with legacy though.


Just got yet another of those errors while using docker:

 

pedro@nanopim4v2:~/docker$ docker-compose down
Stopping docker_mariadb_1       ...
Stopping docker_homeassistant_1 ...
Stopping docker_nginx_1         ...
Stopping docker_zigbee2mqtt_1   ... done
Stopping docker_esphome_1       ...
Stopping docker_telegraf_1      ... done
Stopping docker_acme.sh_1       ... done
Stopping docker_mosquitto_1     ...
Stopping docker_portainer_1     ...
Stopping docker_vscode_1        ...
Stopping docker_grafana_1       ...
Stopping docker_adguardhome_1   ...

Message from syslogd@localhost at Sep 19 16:10:14 ...
 kernel:[108790.564123] Kernel panic - not syncing: bad mode

 

System stalled after that and I had to manually restart it.


A few more errors this morning:

Message from syslogd@localhost at Sep 20 10:50:09 ...
 kernel:[59823.225830] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP

Message from syslogd@localhost at Sep 20 10:50:09 ...
 kernel:[59823.252559] Code: f9401bf7 17ffff7d a9025bf5 f9001bf7 (d4210000)
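
A note on reading the "Code:" line above: on arm64, the word in parentheses is the instruction the CPU was executing when the oops was raised, and "Oops - BUG" corresponds to a `brk` instruction that the kernel's BUG() macro plants. Here is a minimal Python sketch for picking that word out and decoding a BRK immediate. The function names are made up for illustration, and the 0x800 constant is my reading of the kernel's arm64 BUG immediate, so treat it as an assumption. It only says that a kernel sanity check fired, not why.

```python
import re

BUG_BRK_IMM = 0x800  # assumption: immediate used by the arm64 kernel for BUG()


def faulting_word(code_line):
    """Return the parenthesised instruction word from an oops 'Code:' line."""
    match = re.search(r"\(([0-9a-fA-F]{8})\)", code_line)
    return int(match.group(1), 16) if match else None


def brk_immediate(word):
    """Return the 16-bit immediate if word is an A64 BRK instruction, else None."""
    # BRK is encoded as 0xD4200000 with imm16 placed in bits 20..5.
    if word & 0xFFE0001F != 0xD4200000:
        return None
    return (word >> 5) & 0xFFFF


if __name__ == "__main__":
    line = "Code: f9401bf7 17ffff7d a9025bf5 f9001bf7 (d4210000)"
    imm = brk_immediate(faulting_word(line))
    print("BUG() trap" if imm == BUG_BRK_IMM else "not a BUG() trap")
```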

 


Hello, 

I found a problem with the CPU GOVERNOR.

ENABLE=true
MIN_SPEED=600000
#MAX_SPEED=2016000
MAX_SPEED=1800000
# performance i firefox preferences i 6 
# GOVERNOR=conservative
# GOVERNOR=ondemand
GOVERNOR=performance

When I set the GOVERNOR to "performance", the NanoPi appears to run stable.

With "ondemand" and "conservative", the board crashes after a few hours.

I do not have enough experience with the GOVERNOR parameters to find a stable setting in "userspace" mode.


Not sure what the governor is or how I can change it (granted, haven't looked into the docs yet) but if there's any setting I can change to test I'll be willing to do just that!

 

At the moment, the best I've seen my nanopi m4v2 work without crashing was about 3 days, but on average I have to reboot it daily...
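
For anyone else wondering the same thing: the governor is the kernel policy that decides which CPU frequency to run at, and the current setting can be read from sysfs. Below is a minimal Python sketch, assuming the standard Linux cpufreq layout under /sys/devices/system/cpu/cpufreq. The function name `read_policies` is just an illustration, and any file that does not exist on a given kernel is reported as None.

```python
from pathlib import Path

# Standard Linux cpufreq location; one policyN directory per frequency domain.
CPUFREQ_ROOT = Path("/sys/devices/system/cpu/cpufreq")

FIELDS = {
    "cpus": "related_cpus",
    "governor": "scaling_governor",
    "min_khz": "scaling_min_freq",
    "max_khz": "scaling_max_freq",
    "available": "scaling_available_governors",
}


def read_value(path):
    """Return the stripped file contents, or None if the file is missing."""
    return path.read_text().strip() if path.exists() else None


def read_policies(root=CPUFREQ_ROOT):
    """Collect governor and frequency limits for every policy under root."""
    policies = {}
    for policy_dir in sorted(Path(root).glob("policy[0-9]*")):
        policies[policy_dir.name] = {
            key: read_value(policy_dir / filename)
            for key, filename in FIELDS.items()
        }
    return policies


if __name__ == "__main__":
    for name, info in read_policies().items():
        print(name, info)
```

Reading these files does not need root. On a board with two clusters I would expect two policy directories, but that is something to confirm from the output rather than assume.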


On 9/26/2020 at 12:09 PM, Werner said:

Interesting. So there might be an issue with clock changes (DVFS)?

Not sure how relevant this might be, but after looking through previous reports of governor issues on the NanoPi M4V2, I found this post, which mentions that "ondemand" is causing issues. Further down, I see bug reports that are quite similar to my own experience.

 

My system is currently set to "ondemand" (I never changed it, so I assume this is the default). I might set it to "performance" or some other value and test with that.


On 9/28/2020 at 4:44 AM, Werner said:

You could also try setting userspace as the governor and putting the min and max frequency at the same value. performance basically means both are set to the highest possible value.

 

I've changed it to 1008000 yesterday, so far all good (though obviously a bit slower):

pedro@nanopim4v2:~$ cat /etc/default/cpufrequtils
ENABLE=true
MIN_SPEED=1008000
MAX_SPEED=1008000
GOVERNOR=userspace
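
Since the file above is just a list of KEY=value lines, it is easy to sanity-check before rebooting. Here is a rough Python sketch that parses such a file and reports whether the frequency is pinned (min equal to max). The helper names `parse_cpufrequtils` and `describe` are made up for this post, and the parser only handles simple assignments and # comments, not full shell syntax.

```python
def parse_cpufrequtils(text):
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments, full-line or trailing
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip().strip('"')
    return settings


def describe(settings):
    """Summarise the governor and frequency range (values are in kHz)."""
    low = int(settings["MIN_SPEED"])
    high = int(settings["MAX_SPEED"])
    if low > high:
        raise ValueError("MIN_SPEED is higher than MAX_SPEED")
    governor = settings.get("GOVERNOR", "unset")
    if low == high:
        return f"{governor}: pinned at {low // 1000} MHz"
    return f"{governor}: {low // 1000} to {high // 1000} MHz"


if __name__ == "__main__":
    example = "ENABLE=true\nMIN_SPEED=1008000\nMAX_SPEED=1008000\nGOVERNOR=userspace\n"
    print(describe(parse_cpufrequtils(example)))  # userspace: pinned at 1008 MHz
```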

 


7 hours ago, Pedro Lamas said:

I've changed it to 1008000 yesterday, so far all good (though obviously a bit slower):

With one of my boards I have had good results with min set to 1008000 and max to 2016000 (ondemand governor). You could also try that range.

However the other one is still unstable in this scenario but runs stable with performance governor (meaning 2016000 all the time).

 

 


15 hours ago, hexdump said:

maybe that helps, but running this board at only 1 ghz sounds a bit strange :)

 

 

Indeed, but I'm just doing this as a test to see how things go. I want to take full advantage of this board, but something must be going terribly wrong for it to crash randomly, and maybe this is a step in the right direction to find out what it is and how to mitigate it!

 

10 hours ago, piter75 said:

With one of my boards I have had good results with min set to 1008000 and max to 2016000 (ondemand governor). You could also try that range.

However the other one is still unstable in this scenario but runs stable with performance governor (meaning 2016000 all the time).

 

 

 
 

 

Those are the settings I had originally (min 1008000, max 2016000, ondemand). Unfortunately, I know those make my board crash randomly!


@Werner just a thought, but the NanoPi M4V2 has an RK3399 SoC, so it has a dual-core Cortex-A72 (up to 2.0GHz) and a quad-core Cortex-A53 (up to 1.5GHz)... does the "ondemand" governor handle these two CPUs separately?

 

The datasheet does mention they work with different voltages: http://opensource.rock-chips.com/images/d/d7/Rockchip_RK3399_Datasheet_V2.1-20200323.pdf


13 minutes ago, Pedro Lamas said:

@Werner just a thought, but the NanoPi M4V2 has an RK3399 SoC, so it has a dual-core Cortex-A72 (up to 2.0GHz) and a quad-core Cortex-A53 (up to 1.5GHz)... does the "ondemand" governor handle these two CPUs separately?

 

The datasheet does mention they work with different voltages: http://opensource.rock-chips.com/images/d/d7/Rockchip_RK3399_Datasheet_V2.1-20200323.pdf

No idea. Sorry


23 minutes ago, Pedro Lamas said:

does the "ondemand" governor handle these two CPUs separately?

Yes. They are treated as separate groups/clusters when it comes to scaling and they also have separate regulators assigned to them.

 

cpufrequtils, however, cannot configure their limits separately: the same limits are applied to both clusters.
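
If separate limits per cluster are wanted for testing, one option is to bypass cpufrequtils and write to each cluster's cpufreq policy directory in sysfs. The Python sketch below is only an illustration under several assumptions: that policy0 is the Cortex-A53 cluster and policy4 the Cortex-A72 cluster (check related_cpus in each directory), that the example frequencies are ones your kernel actually offers, and that the script runs as root. The names `set_limits`, `apply_limits` and `EXAMPLE_LIMITS` are hypothetical, and nothing here survives a reboot.

```python
from pathlib import Path

CPUFREQ_ROOT = Path("/sys/devices/system/cpu/cpufreq")

# Example limits in kHz. Policy numbering and values are assumptions;
# compare against related_cpus and the frequencies your kernel lists.
EXAMPLE_LIMITS = {
    "policy0": (1008000, 1416000),  # assumed: Cortex-A53 cluster
    "policy4": (1008000, 1800000),  # assumed: Cortex-A72 cluster
}


def set_limits(policy_dir, min_khz, max_khz):
    """Write min/max limits for one policy, keeping min <= max at every step."""
    if min_khz > max_khz:
        raise ValueError("min_khz must not exceed max_khz")
    policy_dir = Path(policy_dir)
    min_file = policy_dir / "scaling_min_freq"
    max_file = policy_dir / "scaling_max_freq"
    current_max = int(max_file.read_text())
    if min_khz > current_max:
        # Raise the ceiling first so the new floor never sits above it.
        max_file.write_text(str(max_khz))
        min_file.write_text(str(min_khz))
    else:
        min_file.write_text(str(min_khz))
        max_file.write_text(str(max_khz))


def apply_limits(limits, root=CPUFREQ_ROOT):
    """Apply a {policy_name: (min_khz, max_khz)} mapping under root."""
    for policy, (min_khz, max_khz) in limits.items():
        set_limits(Path(root) / policy, min_khz, max_khz)
```

Calling `apply_limits(EXAMPLE_LIMITS)` as root would apply both ranges. If cpufrequtils is still enabled, it may apply its own limits again at boot.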

 


I just noticed something while reading the RK3399 datasheet: the recommended maximum frequency of the A72 is actually 1.8GHz, not 2.0GHz as stated on the FriendlyARM website and wiki!

The RK3399K, however, does indicate a recommended maximum of 2.0GHz, but that is not the version used on the NanoPi M4V2.

The Rock Pi 4 uses the same RK3399 SoC, and they specifically say the frequency of the A72 is 1.8GHz.

I even found a commit in the Armbian codebase for the Helios64 (another board with the same RK3399 SoC) where the maximum is set to 1.8GHz: https://github.com/armbian/build/pull/2191

I will leave my board for a couple more days with the "userspace" governor and min and max set to 1008000, and if there are no crashes, I will try the "ondemand" governor with min set to 1008000 and max set to 1800000.


@piter75 @Pedro Lamas

The Helios64 also encounters random crashes. Yesterday we tried redefining the OPP table to just 408 MHz and 1.4/1.8 GHz, and we no longer see any random crashes.

It seems to be a DVFS problem similar to the one discussed in this thread. One of our customers then pointed us to an ODROID-N1 issue at

https://forum.odroid.com/viewtopic.php?t=30303

Maybe you can give it a try on the NanoPi M4V2.

 

We are still testing on the Helios64 (with value 40000); so far, rebooting and power cycling have not triggered any kernel crash.

