Last effort regarding RockPro64 kernel panics



This is a last-ditch effort to figure out how we can continue to build our project on top of Armbian.

 

Goal

We are building an appliance on the RockPro64 board, with eMMC and a PCIe SSD attached, using a custom-built Armbian Ubuntu 18.04 image with extensive post-configuration (packages, applications, overlayroot...) during the boot process. The image is then processed by Mender to create a true dual-root filesystem over-the-air update system with fallback (we invested a lot of resources to extend Mender for Armbian and RockPro64). This build process works perfectly and is exactly what we need.

 

Issue

The reliability, however, is bad. In an older topic of mine, I documented our issues with regular device reboots due to kernel panics and infrequent freezes. We never really figured out what causes these crashes. It depends on the individual hardware, but on average reboots happen every few days, and freezes that need a manual power cycle happen every one or two weeks. This is also the case when using the official images.

 

Workaround

When we use the exact same custom-built Armbian image but replace the kernel with an Ayufan kernel, the images are stable. The devices run for weeks without reboots or freezes, independent of the individual hardware.

Quote

dpkg --install linux-firmware-image-4.4.202-1237-rockchip-ayufan-gfd4492386213_4.4.202-1237-rockchip-ayufan_arm64.deb
dpkg --install linux-image-4.4.202-1237-rockchip-ayufan-gfd4492386213_4.4.202-1237-rockchip-ayufan_arm64.deb

ln -sf /boot/vmlinuz-4.4.202-1237-rockchip-ayufan-4.4.202-1237-rockchip-ayufan /boot/Image
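
After rebooting, it may be worth confirming that the replaced kernel is really the one running. The following is a minimal sketch; the helper names `is_ayufan_kernel` and `boot_link_ok` are hypothetical, and it assumes the Ayufan kernel reports a release string containing `-ayufan`, as the package names above suggest.

```shell
# Returns 0 if the given kernel release string looks like an Ayufan build.
# Assumption: Ayufan release strings contain "-ayufan".
is_ayufan_kernel() {
    case "$1" in
        *-ayufan*) return 0 ;;
        *) return 1 ;;
    esac
}

# Returns 0 if the path is a symlink that resolves to an existing regular file
# (a dangling symlink fails the -f test).
boot_link_ok() {
    [ -L "$1" ] && [ -f "$1" ]
}

# Usage on the device after a reboot (commented out so the sketch has no side effects):
# is_ayufan_kernel "$(uname -r)" && boot_link_ok /boot/Image && echo "Ayufan kernel active"
```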

 

Options

This is obviously a very hacky way of creating a Linux image. So we have two options:

  1. (preferred) continue to use Armbian, but with a stable kernel
  2. switch to Ayufan and eat all the costs: implement new build system, redesign post-configuration, extend Mender solution

 

The way forward

I am not a Linux kernel expert. I am also not sure how the two projects, Armbian and Ayufan, are related exactly, but it seems that Armbian uses some of Ayufan's resources in the build process. For me the question then becomes where the differences originate.

 

  • Is this something that can be figured out, and are people willing to do so?
  • How can I support that by running tests, providing logs, or assisting with other non-source-code tasks?

 

Any help is appreciated.


10 hours ago, Meier said:

This is a last-ditch effort to figure out how we can continue to build our project on top of Armbian.

Hmm, nobody apart from you suffers ;)

 

So, you want to analyze what happens at exactly the moment it breaks or stops, right? But your device dies, so there is no information.

Write a document listing the steps you are going to take to analyze this, and write down the results per test. For example:

How about a serial connection to a second device that does nothing but collect as much data as possible: kernel log, other logs, voltage?

Additionally, a graphical interface like RPi-Monitor, which may still be in armbian-config; otherwise see https://rpi-experiences.blogspot.com/

And see if it still happens when removing hardware from the board, one piece after the other, step by step.
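
The serial-logging idea could look roughly like this on the second device. This is only a sketch under assumptions: the helper name `stamp_lines`, the device node `/dev/ttyUSB0`, the log file name, and the 1500000 baud rate are not from this thread and need to be checked against your own setup.

```shell
# Prefix every line read from stdin with a UTC timestamp, so the last
# messages before a freeze can be correlated with other observations.
stamp_lines() {
    while IFS= read -r line; do
        printf '%s %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$line"
    done
}

# Usage on the logging device (commented out; device node, baud rate and
# log file name are assumptions):
# stty -F /dev/ttyUSB0 1500000 raw -echo
# stamp_lines < /dev/ttyUSB0 >> rockpro64-console.log
```

Appending to a plain file on the second device means the log survives even when the board under test freezes without writing anything to its own storage.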

 

16 hours ago, Meier said:

We are building an appliance on the RockPro64 board, with eMMC and a PCIe SSD attached, using a custom-built Armbian Ubuntu 18.04 image with extensive post-configuration (packages, applications, overlayroot...) during the boot process. The image is then processed by Mender to create a true dual-root filesystem over-the-air update system with fallback (we invested a lot of resources to extend Mender for Armbian and RockPro64). This build process works perfectly and is exactly what we need.

 

Are you using the Armbian build tools to build your image at all, or are you just taking an existing image and modifying it?

 

By taking advantage of the userpatches functionality, you could configure the builder to use kernel sources of your choosing: https://docs.armbian.com/Developer-Guide_User-Configurations/#user-provided-sources-config-overrides

21 hours ago, Meier said:

but it seems that Armbian uses some of Ayufan's resources in the build process.


Yes, we use the same kernel sources, just with a few more patches and a different kernel config. There are also additional drivers, which can't have an effect on this.

 

Try this:

  • Remove all patches from 
    patch/kernel/rockchip64-legacy

     

  • Use a kernel config from an Ayufan image
  • Place that config in
    userpatches/linux-rockchip64-legacy.config

 

Rebuild the kernel with the additional parameters WIREGUARD="no" EXTRAWIFI="no" and report back. This will narrow down where to look for the trouble.
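
The steps above could be scripted roughly as follows. This is only a sketch: the function name `prepare_kernel_test`, the choice to move the patches aside instead of deleting them, and the paths and extra `compile.sh` parameters in the usage comment are assumptions, not something confirmed in this thread.

```shell
# Prepare an Armbian build tree for a test build without Armbian's
# rockchip64-legacy patches and with a kernel config taken from an Ayufan image.
#   $1: path to the Armbian build tree
#   $2: path to the kernel config file copied from an Ayufan image
prepare_kernel_test() {
    build_dir=$1
    ayufan_config=$2
    patch_dir="$build_dir/patch/kernel/rockchip64-legacy"

    [ -f "$ayufan_config" ] || { echo "config not found: $ayufan_config" >&2; return 1; }
    [ -e "$patch_dir.disabled" ] && { echo "patches already moved aside" >&2; return 1; }

    # Move the patches aside (easy to restore later) and leave an empty directory.
    if [ -d "$patch_dir" ]; then
        mv "$patch_dir" "$patch_dir.disabled" || return 1
        mkdir -p "$patch_dir" || return 1
    fi

    # Put the Ayufan config where the suggestion above says to place it.
    mkdir -p "$build_dir/userpatches" || return 1
    cp "$ayufan_config" "$build_dir/userpatches/linux-rockchip64-legacy.config"
}

# Usage (commented out; all paths and the BOARD/BRANCH/KERNEL_ONLY values are
# assumptions to adapt to your own build setup):
# prepare_kernel_test ~/build /path/to/ayufan/kernel.config
# cd ~/build && ./compile.sh BOARD=rockpro64 BRANCH=legacy KERNEL_ONLY=yes WIREGUARD="no" EXTRAWIFI="no"
```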


Thanks for all the hints, will try this out.

 

To be honest, I can't imagine being the only one having these issues. Maybe it's just that our usage of the RockPro64 board with a PCIe NVMe M.2 SSD is not very common? We suspected the PCIe / SSD combination to be the cause for a long time, but cannot really confirm this. The issues are consistent over dozens of RockPro64 boards and many SSD brands. And I've heard from users that the Pinebook Pro (which uses similar hardware) also freezes under heavy load when run with an SSD.
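
One way to test the PCIe / SSD suspicion would be to generate sustained disk load on purpose and see whether it triggers a freeze sooner. The following is a minimal sketch, not a proven reproducer: the function name `io_stress`, the mount point in the usage comment, and the sizes are assumptions.

```shell
# Repeatedly write random data to a file in the given directory, flush it,
# read it back, and compare checksums.
#   $1: directory on the disk under test
#   $2: approximate size per round in MiB
#   $3: number of rounds
io_stress() {
    dir=$1 size_mb=$2 rounds=$3
    file="$dir/io_stress.bin"
    round=1
    while [ "$round" -le "$rounds" ]; do
        # Checksum the data stream while it is being written to the file.
        written=$(dd if=/dev/urandom bs=1M count="$size_mb" 2>/dev/null \
            | tee "$file" | sha256sum | cut -d' ' -f1)
        sync
        # Note: the read-back may be served from the page cache unless the
        # file is larger than RAM, so this mainly exercises the write path.
        readback=$(sha256sum < "$file" | cut -d' ' -f1)
        if [ "$written" != "$readback" ]; then
            echo "round $round: MISMATCH"
            return 1
        fi
        echo "round $round: ok"
        round=$((round + 1))
    done
    rm -f "$file"
}

# Usage (commented out; /mnt/ssd is a hypothetical mount point of the SSD):
# io_stress /mnt/ssd 1024 100
```

Running this while a serial console log is being captured elsewhere would show how far the load got before a panic or freeze.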


Hi, 

 

I am having issues similar to those reported by Meier; the difference is that I am running a NanoPi M4v2. I tried all available images (server only), resulting in the same behavior:
kernel panics, system freezes, reboots etc. For me this happens more frequently than described by Meier; sometimes the system panics during boot or during 'heavy' disk activity.


It always happens between 30 seconds and 2 minutes into watching something on Plex.
I don't have any logs, as nothing is logged when this happens, but I have a screenshot of a kernel panic if needed.


Right now I am using FriendlyCore Bionic and the system has been running stable for several weeks.

 

My Setup:
- NanoPi M4v2
- SATA Hat
- 12V 10A power supply (through 4 pin power conn on sata hat)
- 16GB eMMC
- 3x 3.5" HDDs
- Buster/Bionic server (2020/01/25) with 4.4/5.4 (tried all versions)
- Clean firmware image with no additional config done.
- Latest Plex Media Server installed.
- No other software or changes were made to the system.

 

I ruled out hardware issues (well, as much as possible), as I have tried 3 different NanoPi M4v2 boards, 2 different SATA Hats, 3 different eMMC modules, and different hard drives.
It seems that what Meier and I have in common is disk access over PCIe; I am not sure if this is where the issue is.

 

I would really like to use Armbian for my projects, as I have been using it successfully for years on other boards, where it has always run perfectly.

 

Thanks!

Edited by MikeInNs
Spelling....

Hi,

 

I have a RockPro64 too, and I have the same problems as you all.

I had an Ayufan image, then I tried Armbian in order to install OMV 5.

I also have a PCIe to SATA card with 1 HDD.

When I run a cp command, the system freezes.

When I use the Resetperms plugin on a folder containing data, the system freezes.

Sometimes it freezes without me doing anything.


Rock64, USB3.0 to SATA with 2 HDDs using USB3.0 hub

 

I have been using Ayufan's build for a couple of years. Very stable. However, his build has not been updated since last October. I tried Armbian 4.4.y legacy in December and again yesterday, thinking a newer kernel build might be worth the trouble. Same thing: it freezes a few times an hour. This was at an early stage of configuring the server, so I don't think any specific program is causing the freeze. I was installing Docker or OMV, or just running 'ls -la', at the time of the freeze.

 

On the other hand, my Orange Pi Zero on Armbian is very stable.  It doesn't have anything attached to it though.

 

On 3/7/2020 at 5:03 AM, test011b said:

Rock64, USB3.0 to SATA with 2 HDDs using USB3.0 hub

 

I have been using Ayufan's build for a couple of years. Very stable. However, his build has not been updated since last October. I tried Armbian 4.4.y legacy in December and again yesterday, thinking a newer kernel build might be worth the trouble. Same thing: it freezes a few times an hour. This was at an early stage of configuring the server, so I don't think any specific program is causing the freeze. I was installing Docker or OMV, or just running 'ls -la', at the time of the freeze.

 

On the other hand, my Orange Pi Zero on Armbian is very stable.  It doesn't have anything attached to it though.

 

I have the same problem on my Rock64. I found that after freezing Armbian firmware and armbian-config updates, the Rock64 becomes stable.


@Meier @aldrick I do have a RockPro64. I have only experienced kernel freezes with some crap SATA PCIe cards, and yes, the card from the Pine64 store is one of them.

I run Armbian Buster with kernel 5.8.13. I use OMV5 with no problems whatsoever. I updated the SPI bootloader from Ayufan and installed Armbian on an SSD.

It boots from and runs on the SSD, with 4 disks attached (1x 480GB SSD, 2x 3.5", 2x 2.5"). I wrote 4TB to one of the disks yesterday. No problems.

As for the ROCK64, I have not used that board for a long time; the USB ports act like my girlfriend when she is mad.

On 10/12/2020 at 9:05 AM, soerenderfor said:

@Meier @aldrick I do have a RockPro64. I have only experienced kernel freezes with some crap SATA PCIe cards, and yes, the card from the Pine64 store is one of them.

I run Armbian Buster with kernel 5.8.13. I use OMV5 with no problems whatsoever. I updated the SPI bootloader from Ayufan and installed Armbian on an SSD.

It boots from and runs on the SSD, with 4 disks attached (1x 480GB SSD, 2x 3.5", 2x 2.5"). I wrote 4TB to one of the disks yesterday. No problems.

As for the ROCK64, I have not used that board for a long time; the USB ports act like my girlfriend when she is mad.

 

 

Sorry to hijack this thread, but I tried exactly what you did, with kernel 5.8.13 and OMV5. As you say, it does discover the Marvell 88SE9230 PCIe card and the drives successfully after adding a udev rule that binds the driver to the card. However, on a 5.x kernel the fan runs continuously. If you are running this setup, did you find a way to control the fan on the newer kernel?
