Stability issues on 20.08.21



Continuing the discussion from here

 

On a clean install of 20.08.21 I'm able to crash the box within a few hours of putting it under load.

It appears as if the optimisations are being applied:

root@helios64:~# cat /proc/sys/net/core/rps_sock_flow_entries
32768

 

The suggestion @ShadowDance made to switch to the performance governor hasn't helped.

 

Anecdotally, I think I remember the crashes always mentioning page faults, and early on there was some discussion about memory timing. Is it possible this continues to be that issue?

 

Edited by jbergler
spelling and some extra details

 

On 11/11/2020 at 4:08 PM, jbergler said:

I also tried the suggestion to set a performance governor, and for shits and giggles I reduced the max cpu frequency, but that hasn’t made a difference.

System still locks up within a few hours.

What was the max cpu freq you set?

Could you try with the performance governor at 1.2 GHz and at 816 MHz?

How did you load the system?

 

 

Did you encounter a kernel crash on 20.08.10?
 

On 11/13/2020 at 5:31 PM, aprayoga said:

Did you encounter a kernel crash on 20.08.10?

 

It's hard to say for sure. I never quite had a stable system, but back then I also wasn't generating the kind of load I am now.

 

On 11/13/2020 at 5:31 PM, aprayoga said:

What was the max cpu freq you set?

 

 

I had only reduced it by one step. I'm trying again now with the settings you suggested.

 

root@helios64:~# cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor | uniq
performance
root@helios64:~# cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_min_freq | uniq
816000
root@helios64:~# cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_max_freq | uniq
1200000
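
For anyone wanting to reproduce these settings, here is a minimal sketch of how they could be applied through sysfs. The function name `set_cpufreq` and the `SYSFS_CPU_ROOT` override are made up for illustration; only the `cpufreq` file names come from the output above. Writing to the real sysfs tree needs root, and the values do not survive a reboot.

```shell
# Sketch only: apply one governor and one frequency window to every CPU.
# SYSFS_CPU_ROOT is a hypothetical override so the function can be tried
# against a scratch directory instead of the live /sys tree.
set_cpufreq() {
    governor=$1
    min_khz=$2
    max_khz=$3
    root=${SYSFS_CPU_ROOT:-/sys/devices/system/cpu}
    for dir in "$root"/cpu[0-9]*/cpufreq; do
        # Skip CPUs that expose no cpufreq directory.
        [ -d "$dir" ] || continue
        echo "$governor" > "$dir/scaling_governor"
        echo "$min_khz"  > "$dir/scaling_min_freq"
        echo "$max_khz"  > "$dir/scaling_max_freq"
    done
}
```

Calling `set_cpufreq performance 816000 1200000` as root would aim for the values shown above (frequencies are in kHz).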

 

The load I'm generating is a ZFS scrub on a 37TB pool across all five disks.

 


After about an hour of the ZFS scrub the "bad PC value" error happened again, however this time the system didn't hard lock.

A decent number of ZFS-related processes are stuck in uninterruptible I/O, I can't export the pool, etc.
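
A quick way to list the stuck tasks is to filter `ps` output on the state column. This is only a sketch: `filter_dstate` is a hypothetical helper name, and it assumes the state is the second column, as it is with `ps -eo pid,stat,comm`.

```shell
# Sketch only: keep the header row plus every row whose STAT column
# starts with "D" (uninterruptible sleep, usually waiting on I/O).
filter_dstate() {
    awk 'NR == 1 || $2 ~ /^D/'
}
```

Usage would be `ps -eo pid,stat,comm | filter_dstate`.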

 

I did see the system crash like this occasionally without the cpufreq tweaks, so I'm not sure it tells us anything new.

I will try again.

 

Note: the relatively high uptime is from the system sitting idle for ~5 days before I put it under load again.

 

Spoiler

[433046.690213] Unable to handle kernel paging request at virtual address f9ff8000091f3190
[433046.690218] Internal error: SP/PC alignment exception: 8a000000 [#1] PREEMPT SMP

[433046.690224] Modules linked in: xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_filter iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 bpfilter br_netfilter bridge rfkill governor_performance zfs(POE) zunicode(POE) zzstd(OE) zlua(OE) zcommon(POE) znvpair(POE) zavl(POE) icp(POE) spl(OE) r8152 snd_soc_hdmi_codec snd_soc_rockchip_i2s snd_soc_core snd_pcm_dmaengine snd_pcm snd_timer panfrost snd gpu_sched soundcore leds_pwm gpio_charger pwm_fan rockchip_rga videobuf2_dma_sg hantro_vpu(C) rockchip_vdec(C) v4l2_h264 videobuf2_dma_contig videobuf2_vmalloc v4l2_mem2mem videobuf2_memops videobuf2_v4l2 videobuf2_common videodev mc fusb30x(C) zstd sg gpio_beeper cpufreq_dt zram sch_fq_codel lm75 ip_tables x_tables autofs4 raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx raid1 raid0 multipath linear md_mod realtek rockchipdrm analogix_dp dw_hdmi dwmac_rk dw_mipi_dsi stmmac_platform drm_kms_helper cec stmmac rc_core

[433046.690323] mdio_xpcs
[433046.690976] Mem abort info:
[433046.691593] drm drm_panel_orientation_quirks adc_keys
[433046.699701]   ESR = 0x86000004
[433046.700155] CPU: 5 PID: 248302 Comm: z_rd_int Tainted: P         C OE     5.8.17-rockchip64 #20.08.21
[433046.700433]   EC = 0x21: IABT (current EL), IL = 32 bits
[433046.701245] Hardware name: Helios64 (DT)
[433046.701718]   SET = 0, FnV = 0
[433046.702073] pstate: 40000005 (nZcv daif -PAN -UAO BTYPE=--)
[433046.702373]   EA = 0, S1PTW = 0
[433046.702850] pc : 0xb
[433046.703132] [f9ff8000091f3190] address between user and kernel address ranges
[433046.703334] lr : 0xb
[433046.704168] sp : ffff800019d53a40
[433046.704469] x29: ffff0000b604c000 x28: ffff0000f6c03a00
[433046.704946] x27: ffff000045281600 x26: 000000000000000b
[433046.705421] x25: ffff800011a10000 x24: 0000000000000000
[433046.705897] x23: 0000000000000000 x22: 0080000000000000
[433046.706374] x21: 0000000000042c00 x20: ffff000092ff8d88
[433046.706849] x19: ffff000045281600 x18: 00001e1e0a99c21b
[433046.707326] x17: 00000030510320ae x16: 000000fe01cf8d4b
[433046.707801] x15: 0000000000000000 x14: 0000000000000000
[433046.708277] x13: 0000000000000008 x12: ffff0000d8f2ea28
[433046.708753] x11: 0000000000000020 x10: 0000000000000001
[433046.709229] x9 : 0000000000000000 x8 : ffff00006fb62b00
[433046.709705] x7 : 0000000000000000 x6 : 000000000000003f
[433046.710181] x5 : 0000000000000040 x4 : 0000000000000000
[433046.710657] x3 : 0000000000000004 x2 : 0000000000000000
[433046.711133] x1 : ffff000000000000 x0 : ffff00006fb62a00
[433046.711610] Call trace:
[433046.711837] 0xb
[433046.712016] Code: bad PC value
[433046.712298] ---[ end trace ac904cdd631dd942 ]---
[433046.712714] Internal error: Oops: 86000004 [#2] PREEMPT SMP

[433046.713212] Modules linked in: xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_filter iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 bpfilter br_netfilter bridge rfkill governor_performance zfs(POE) zunicode(POE) zzstd(OE) zlua(OE) zcommon(POE) znvpair(POE) zavl(POE) icp(POE) spl(OE) r8152 snd_soc_hdmi_codec snd_soc_rockchip_i2s snd_soc_core snd_pcm_dmaengine snd_pcm snd_timer panfrost snd gpu_sched soundcore leds_pwm gpio_charger pwm_fan rockchip_rga videobuf2_dma_sg hantro_vpu(C) rockchip_vdec(C) v4l2_h264 videobuf2_dma_contig videobuf2_vmalloc v4l2_mem2mem videobuf2_memops videobuf2_v4l2 videobuf2_common videodev mc fusb30x(C) zstd sg gpio_beeper cpufreq_dt zram sch_fq_codel lm75 ip_tables x_tables autofs4 raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx raid1 raid0 multipath linear md_mod realtek rockchipdrm analogix_dp dw_hdmi dwmac_rk dw_mipi_dsi stmmac_platform drm_kms_helper cec stmmac rc_core

[433046.713298] mdio_xpcs drm drm_panel_orientation_quirks adc_keys
[433046.721466] CPU: 4 PID: 248273 Comm: z_rd_int Tainted: P      D  C OE     5.8.17-rockchip64 #20.08.21
[433046.722281] Hardware name: Helios64 (DT)
[433046.722637] pstate: 80000005 (Nzcv daif -PAN -UAO BTYPE=--)
[433046.723135] pc : 0xf9ff8000091f3190
[433046.723464] lr : avl_find+0x68/0xc8 [zavl]
[433046.723833] sp : ffff800019c73a40
[433046.724134] x29: ffff800019c73a40 x28: ffff000080c2afa8
[433046.724611] x27: ffff0000b604c9a8 x26: ffff0000b604c9c8
[433046.725088] x25: ffff0000b6743090 x24: 0000000000000000
[433046.725565] x23: ffff800019c73af0 x22: ffff8000091f40d8
[433046.726041] x21: ffff00005c0be900 x20: ffff000056059e00
[433046.726517] x19: ffff000056059e00 x18: 000021ba4d598e5d
[433046.726994] x17: 0000003fa86bd6e8 x16: 0000014c01a9b0f1
[433046.727470] x15: 0000000000000000 x14: 0000000000000000
[433046.727946] x13: 0000000000000008 x12: ffff0000e5b3b028
[433046.728422] x11: 0000000000000100 x10: 0000000000000001
[433046.728898] x9 : 0000000000000000 x8 : 000000000023e0e8
[433046.729373] x7 : 000000000023e120 x6 : 0000000000000001
[433046.729849] x5 : 0000000000000001 x4 : 0000000000000000
[433046.730325] x3 : 0000000000000000 x2 : 0000000000000100
[433046.730801] x1 : 0000000000000000 x0 : 00000000ffffffff
[433046.731277] Call trace:
[433046.731503] 0xf9ff8000091f3190
[433046.731948] dsl_scan_prefetch+0x1a8/0x228 [zfs]
[433046.732490] dsl_scan_prefetch_dnode+0x8c/0x110 [zfs]
[433046.733068] dsl_scan_prefetch_cb+0x21c/0x268 [zfs]
[433046.733630] arc_read_done+0x20c/0x3f8 [zfs]
[433046.734140] zio_done+0x254/0xd40 [zfs]
[433046.734634] zio_execute+0xac/0x110 [zfs]
[433046.735016] taskq_thread+0x298/0x440 [spl]
[433046.735402] kthread+0x118/0x150
[433046.735700] ret_from_fork+0x10/0x34
[433046.736031] Code: bad PC value
[433046.736315] ---[ end trace ac904cdd631dd943 ]---
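
For reading the `ESR` values in traces like this one, here is a small sketch that splits the register into the fields the kernel prints. It assumes the usual arm64 syndrome layout (exception class in bits 31:26, instruction length in bit 25, fault status code in the low six bits for aborts); `decode_esr` is a hypothetical helper, not an existing tool.

```shell
# Sketch only: decode the top-level fields of an arm64 ESR value.
# EC  = exception class (0x21 instruction abort, 0x25 data abort,
#       0x22 PC alignment fault, all taken from the current EL)
# IL  = 1 means a 32-bit instruction
# FSC = fault status code, only meaningful for instruction/data aborts
decode_esr() {
    esr=$(( $1 ))
    ec=$(( (esr >> 26) & 0x3f ))
    il=$(( (esr >> 25) & 0x1 ))
    fsc=$(( esr & 0x3f ))
    printf 'EC=0x%02x IL=%d FSC=0x%02x\n' "$ec" "$il" "$fsc"
}
```

With the value from the log, `decode_esr 0x86000004` gives `EC=0x21 IL=1 FSC=0x04`, which matches the `EC = 0x21: IABT (current EL)` line printed by the kernel.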

 


I've been testing my Helios64 as well. I'm running Armbian 20.08.21 Focal, but I also downloaded the kernel build script from GitHub and built linux-image-current-rockchip64-20.11.0-trunk, which is a 5.9.9 kernel. I installed that, then built OpenZFS 2.0.0-rc6. I then used syncoid to send 2.15TB of snapshots to it while also running a scrub, and was able to get the load average up to 10+. The machine ran through the night, so I think it might be stable. A few more days of testing will validate this.

 

schu

Edited by akschu
speling

I see a lot of posts about stability issues around this board. Do we know if this issue is purely kernel-related, as stated here: https://blog.kobol.io/2020/10/27/helios64-software-issue/?

Or is this maybe a combination of things, such as ZFS and the latest kernel? My Helios64 was delivered today, and I plan on a RAID setup but not with ZFS. So I guess I will see for myself soon. :)


I'll defer to the Kobol folks. In the previous mega thread, the statement was made that the issues should have been fixed in a new version that ensured the hardware tweaks were applied correctly. For me, things have never been properly stable, even on a vanilla install. The only semi-stable solution has been to reduce the clock speed, which is fine for now.


5.9.9 with Armbian patches is working well for me so far. I've scrubbed the pool 5-6 times and run syncoid from my hypervisor every hour for the last two days. I'm mostly just looking for a stable backup system that supports ZFS, and it looks like this will work.


@jbergler

5.8.x & 5.9.x are working here as well, but I'm not using ZFS, just plain vanilla mdadm RAID and LVM2 formatted as XFS.

If you have an extra set of HDDs, could you try building a new data pool with mdadm or LVM2 to test your setup?

Since you're getting memory-related errors, is there a way for you to run a memory test on your board?

Have you checked if the heatsink is seated properly over the components of the board?

 


Did more testing over the weekend on 5.9.9. I was able to benchmark with FIO on top of a ZFS dataset for hours with the load average hitting 10+ while scrubbing the datastore. No issues. Right now the uptime is 3 days.

 

I'm actually a little surprised at the performance.  It's very decent for what it is. 

 

I wonder if the fact that I'm running ZFS and 5.9.9 while others are using mdadm and 5.8 is the difference. I'm not really planning on going backwards on either. If 5.9.9 works then no need to build another kernel, and you would have to pry ZFS out of my cold dead hands. I've spent enough of my life layering encryption/compression on top of partitions on top of volume management on top of partitions on top of disks. ZFS is just better, and having performance-penalty-free snapshots that I can replicate to other hosts over SSH is the icing on the cake.

 

3 hours ago, akschu said:

I'm not really planning on going backwards on either. If 5.9.9 works then no need to build another kernel, and you would have to pry ZFS out of my cold dead hands. I've spent enough of my life layering encryption/compression on top of partitions on top of volume management on top of partitions on top of disks. ZFS is just better, and having performance-penalty-free snapshots that I can replicate to other hosts over SSH is the icing on the cake.

 

Amen!

 

I have been following this forum with great interest and suspect it's only a matter of time until I buy one of these devices (or maybe wait for an ECC one).

 

Thanks to everyone testing and contributing feedback toward getting these devices stable, I for one certainly appreciate it (I am sure others do/will as well).


@jbergler Could you try the attached u-boot? It contains updated Rockchip blobs (DDR driver & ATF).

Install it with:

 

dpkg -i linux-u-boot-current-helios64_20.11.0-trunk_arm64.deb

After that, run armbian-config > System > Install > 5 Install/Update the bootloader on SD/eMMC

 

If you are using an SD card, make sure to wipe the bootloader on the eMMC. You can run:

dd if=/dev/zero of=/dev/mmcblk1 seek=64 count=30000

 

Power cycle the system. It should boot with the new bootloader.

 

Spoiler

DDR Version 1.24 20191016 RevNocRL
In
channel 0
CS = 0
MR0=0x18
MR4=0x1
MR5=0x1
MR8=0x10
MR12=0x72
MR14=0x72
MR18=0x0
MR19=0x0
MR24=0x8
MR25=0x0
channel 1
CS = 0
MR0=0x18
MR4=0x1
MR5=0x1
MR8=0x10
MR12=0x72
MR14=0x72
MR18=0x0
MR19=0x0
MR24=0x8
MR25=0x0
channel 0 training pass!
channel 1 training pass!
change freq to 416MHz 0,1
Channel 0: LPDDR4,416MHz
Bus Width=32 Col=10 Bank=8 Row=16 CS=1 Die Bus-Width=16 Size=2048MB
Channel 1: LPDDR4,416MHz
Bus Width=32 Col=10 Bank=8 Row=16 CS=1 Die Bus-Width=16 Size=2048MB
256B stride
channel 0
CS = 0
MR0=0x18
MR4=0x1
MR5=0x1
MR8=0x10
MR12=0x72
MR14=0x72
MR18=0x0
MR19=0x0
MR24=0x8
MR25=0x0
channel 1
CS = 0
MR0=0x18
MR4=0x1
MR5=0x1
MR8=0x10
MR12=0x72
MR14=0x72
MR18=0x0
MR19=0x0
MR24=0x8
MR25=0x0
channel 0 training pass!
channel 1 training pass!
channel 0, cs 0, advanced training done
channel 1, cs 0, advanced training done
change freq to 856MHz 1,0
ch 0 ddrconfig = 0x101, ddrsize = 0x40
ch 1 ddrconfig = 0x101, ddrsize = 0x40
pmugrf_os_reg[2] = 0x32C1F2C1, stride = 0xD
ddr_set_rate to 328MHZ
ddr_set_rate to 666MHZ
ddr_set_rate to 416MHZ, ctl_index 0
ddr_set_rate to 856MHZ, ctl_index 1
support 416 856 328 666 MHz, current 856MHz
OUT
Boot1 Release Time: May 29 2020 17:36:36, version: 1.26
CPUId = 0x0
ChipType = 0x10, 352
SdmmcInit=2 0
BootCapSize=100000
UserCapSize=14910MB
FwPartOffset=2000 , 100000
mmc0:cmd5,20
SdmmcInit=0 0
BootCapSize=0
UserCapSize=15103MB
FwPartOffset=2000 , 0
StorageInit ok = 67151
SecureMode = 0
SecureInit read PBA: 0x4
SecureInit read PBA: 0x404
SecureInit read PBA: 0x804
SecureInit read PBA: 0xc04
SecureInit read PBA: 0x1004
SecureInit read PBA: 0x1404
SecureInit read PBA: 0x1804
SecureInit read PBA: 0x1c04
SecureInit ret = 0, SecureMode = 0
atags_set_bootdev: ret:(0)
GPT 0x3335db8 signature is wrong
recovery gpt...
GPT 0x3335db8 signature is wrong
recovery gpt fail!
Trust Addr:0x4000, 0x58334c42
No find bl30.bin
No find bl32.bin
Load uboot, ReadLba = 2000
Load OK, addr=0x200000, size=0xdd6b0
RunBL31 0x40000 @ 191346 us
NOTICE:  BL31: v1.3(debug):2803a2c8a
NOTICE:  BL31: Built : 14:31:03, May 19 2020
NOTICE:  BL31: Rockchip release version: v1.1
INFO:    GICv3 with legacy support detected. ARM GICV3 driver initialized in EL3
INFO:    Using opteed sec cpu_context!
INFO:    boot cpu mask: 0
INFO:    plat_rockchip_pmu_init(1191): pd status 3e
INFO:    BL31: Initializing runtime services
WARNING: No OPTEE provided by BL2 boot loader, Booting device without OPTEE initialization. SMC`s destined for OPTEE will return SMC_UNK
ERROR:   Error initializing runtime service opteed_fast
INFO:    BL31: Preparing for EL3 exit to normal world
INFO:    Entry point address = 0x200000
INFO:    SPSR = 0x3c9


U-Boot 2020.07-armbian (Nov 25 2020 - 07:14:05 +0700)

SoC: Rockchip rk3399
Reset cause: POR
DRAM:  3.9 GiB
PMIC:  RK808
SF: Detected w25q128 with page size 256 Bytes, erase size 4 KiB, total 16 MiB
MMC:   mmc@fe320000: 1, sdhci@fe330000: 0
Loading Environment from MMC... *** Warning - bad CRC, using default environment

In:    serial
Out:   serial
Err:   serial
Model: Helios64
Revision: 1.2 - 4GB non ECC
Net:   eth0: ethernet@fe300000
scanning bus for devices...
Hit any key to stop autoboot:  0
switch to partitions #0, OK
mmc1 is current device
Scanning mmc 1:1...
Found U-Boot script /boot/boot.scr
3185 bytes read in 6 ms (517.6 KiB/s)
## Executing script at 00500000
Boot script loaded from mmc 1
166 bytes read in 5 ms (32.2 KiB/s)
14091886 bytes read in 600 ms (22.4 MiB/s)
27331072 bytes read in 1157 ms (22.5 MiB/s)
79946 bytes read in 13 ms (5.9 MiB/s)
2698 bytes read in 10 ms (262.7 KiB/s)
Applying kernel provided DT fixup script (rockchip-fixup.scr)
## Executing script at 09000000
## Loading init Ramdisk from Legacy Image at 06000000 ...
   Image Name:   uInitrd
   Image Type:   AArch64 Linux RAMDisk Image (gzip compressed)
   Data Size:    14091822 Bytes = 13.4 MiB
   Load Address: 00000000
   Entry Point:  00000000
   Verifying Checksum ... OK
## Flattened Device Tree blob at 01f00000
   Booting using the fdt blob at 0x1f00000
   Loading Ramdisk to f5176000, end f5ee662e ... OK
   Loading Device Tree to 00000000f50fa000, end 00000000f5175fff ... OK

Starting kernel ...

 

Please take note of the binary versions:

DDR Version 1.24 20191016 RevNocRL
NOTICE:  BL31: Built : 14:31:03, May 19 2020
U-Boot 2020.07-armbian (Nov 25 2020 - 07:14:05 +0700)

 

Try to trigger the kernel crash.

 

---

If you want to restore the original u-boot you can run

apt install linux-u-boot-helios64-current=20.08.21

and update the u-boot using armbian-config

---

 

There is a built-in memory tester in the Linux kernel.

Just add this line to /boot/armbianEnv.txt:

extraargs=memtest=10

You can change the number of passes (10). The test takes quite some time to run.

You can see the result using dmesg.
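
If you would rather script the change than edit the file by hand, here is a minimal sketch. `enable_memtest` is a hypothetical helper; it assumes `armbianEnv.txt` holds one `key=value` pair per line and that an existing `extraargs=` line should be extended rather than duplicated.

```shell
# Sketch only: make sure memtest=<passes> is present in extraargs.
# Defaults: /boot/armbianEnv.txt and 10 passes.
enable_memtest() {
    env_file=${1:-/boot/armbianEnv.txt}
    passes=${2:-10}
    if grep -q '^extraargs=' "$env_file"; then
        # Extend the existing line unless memtest is already configured.
        if ! grep -q '^extraargs=.*memtest=' "$env_file"; then
            sed "s/^extraargs=.*/& memtest=$passes/" "$env_file" > "$env_file.tmp" &&
                mv "$env_file.tmp" "$env_file"
        fi
    else
        echo "extraargs=memtest=$passes" >> "$env_file"
    fi
}
```

After a reboot, something like `dmesg | grep -iE 'early_memtest|bad mem addr'` should narrow the output down to the test results. Those message strings are quoted from memory of the kernel source, so treat them as an assumption and fall back to reading the full dmesg if nothing matches.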

 

 

 

 

linux-u-boot-current-helios64_20.11.0-trunk_arm64.deb


An initial attempt with the new u-boot, and with the cpufreq tweaks removed, results in a new panic:

Spoiler

[588872.135762] reboot: Restarting system
DDR Version 1.24 20191016 RevNocRL
In
soft reset
SRX
channel 0
CS = 0
MR0=0x18
MR4=0x1
MR5=0x1
MR8=0x10
MR12=0x72
MR14=0x72
MR18=0x0
MR19=0x0
MR24=0x8
MR25=0x0
channel 1
CS = 0
MR0=0x18
MR4=0x1
MR5=0x1
MR8=0x10
MR12=0x72
MR14=0x72
MR18=0x0
MR19=0x0
MR24=0x8
MR25=0x0
channel 0 training pass!
channel 1 training pass!
change freq to 416MHz 0,1
Channel 0: LPDDR4,416MHz
Bus Width=32 Col=10 Bank=8 Row=16 CS=1 Die Bus-Width=16 Size=2048MB
Channel 1: LPDDR4,416MHz
Bus Width=32 Col=10 Bank=8 Row=16 CS=1 Die Bus-Width=16 Size=2048MB
256B stride
channel 0
CS = 0
MR0=0x18
MR4=0x1
MR5=0x1
MR8=0x10
MR12=0x72
MR14=0x72
MR18=0x0
MR19=0x0
MR24=0x8
MR25=0x0
channel 1
CS = 0
MR0=0x18
MR4=0x1
MR5=0x1
MR8=0x10
MR12=0x72
MR14=0x72
MR18=0x0
MR19=0x0
MR24=0x8
MR25=0x0
channel 0 training pass!
channel 1 training pass!
channel 0, cs 0, advanced training done
channel 1, cs 0, advanced training done
change freq to 856MHz 1,0
ch 0 ddrconfig = 0x101, ddrsize = 0x40
ch 1 ddrconfig = 0x101, ddrsize = 0x40
pmugrf_os_reg[2] = 0x32C1F2C1, stride = 0xD
ddr_set_rate to 328MHZ
ddr_set_rate to 666MHZ
ddr_set_rate to 416MHZ, ctl_index 0
ddr_set_rate to 856MHZ, ctl_index 1
support 416 856 328 666 MHz, current 856MHz
OUT

Boot1 Release Time: May 29 2020 17:36:36, version: 1.26
CPUId = 0x0
ChipType = 0x10, 447
SdmmcInit=2 0
BootCapSize=100000
UserCapSize=14910MB
FwPartOffset=2000 , 100000
mmc0:cmd8,20
mmc0:cmd5,20
mmc0:cmd55,20
mmc0:cmd1,20
mmc0:cmd8,20
mmc0:cmd5,20
mmc0:cmd55,20
mmc0:cmd1,20
mmc0:cmd8,20
mmc0:cmd5,20
mmc0:cmd55,20
mmc0:cmd1,20
SdmmcInit=0 1
StorageInit ok = 69105
SecureMode = 0
SecureInit read PBA: 0x4
SecureInit read PBA: 0x404
SecureInit read PBA: 0x804
SecureInit read PBA: 0xc04
SecureInit read PBA: 0x1004
SecureInit read PBA: 0x1404
SecureInit read PBA: 0x1804
SecureInit read PBA: 0x1c04
SecureInit ret = 0, SecureMode = 0
atags_set_bootdev: ret:(0)
GPT 0x3335db8 signature is wrong
recovery gpt...
GPT 0x3335db8 signature is wrong
recovery gpt fail!
Trust Addr:0x4000, 0x58334c42
No find bl30.bin
No find bl32.bin
Load uboot, ReadLba = 2000
Load OK, addr=0x200000, size=0xdd6b0
RunBL31 0x40000 @ 96897 us
NOTICE:  BL31: v1.3(debug):2803a2c8a
NOTICE:  BL31: Built : 14:31:03, May 19 2020
NOTICE:  BL31: Rockchip release version: v1.1
INFO:    GICv3 with legacy support detected. ARM GICV3 driver initialized in EL3
INFO:    Using opteed sec cpu_context!
INFO:    boot cpu mask: 0
INFO:    plat_rockchip_pmu_init(1191): pd status 3e
INFO:    BL31: Initializing runtime services
WARNING: No OPTEE provided by BL2 boot loader, Booting device without OPTEE initialization. SMC`s destined for OPTEE will return SMC_UNK
ERROR:   Error initializing runtime service opteed_fast
INFO:    BL31: Preparing for EL3 exit to normal world
INFO:    Entry point address = 0x200000
INFO:    SPSR = 0x3c9

 

U-Boot 2020.07-armbian (Nov 25 2020 - 07:14:05 +0700)

SoC: Rockchip rk3399
Reset cause: RST
DRAM:  3.9 GiB
PMIC:  RK808
SF: Detected w25q128 with page size 256 Bytes, erase size 4 KiB, total 16 MiB
MMC:   mmc@fe320000: 1, sdhci@fe330000: 0
Loading Environment from MMC... *** Warning - bad CRC, using default environment

In:    serial
Out:   serial
Err:   serial
Model: Helios64
Revision: 1.2 - 4GB non ECC
Net:   eth0: ethernet@fe300000
scanning bus for devices...
Hit any key to stop autoboot:  0
Card did not respond to voltage select!
switch to partitions #0, OK
mmc0(part 0) is current device
Scanning mmc 0:1...
Found U-Boot script /boot/boot.scr
3185 bytes read in 18 ms (171.9 KiB/s)
## Executing script at 00500000
Boot script loaded from mmc 0
193 bytes read in 15 ms (11.7 KiB/s)
16364137 bytes read in 1576 ms (9.9 MiB/s)
27331072 bytes read in 2614 ms (10 MiB/s)
79946 bytes read in 40 ms (1.9 MiB/s)
2698 bytes read in 32 ms (82 KiB/s)

Applying kernel provided DT fixup script (rockchip-fixup.scr)
## Executing script at 09000000
## Loading init Ramdisk from Legacy Image at 06000000 ...
   Image Name:   uInitrd
   Image Type:   AArch64 Linux RAMDisk Image (gzip compressed)
   Data Size:    16364073 Bytes = 15.6 MiB
   Load Address: 00000000
   Entry Point:  00000000
   Verifying Checksum ... OK
## Flattened Device Tree blob at 01f00000
   Booting using the fdt blob at 0x1f00000
   Loading Ramdisk to f4f4b000, end f5ee6229 ... OK
   Loading Device Tree to 00000000f4ecf000, end 00000000f4f4afff ... OK

Starting kernel ...

[   16.090622] OF: graph: no port node found in /syscon@ff770000/usb2-phy@e450/otg-port
[   16.637382] r8152 2-1.4:1.0 (unnamed net_device) (uninitialized): netif_napi_add() called with weight 256
[   24.585805] Unable to handle kernel NULL pointer dereference at virtual address 00000000000005cc
[   24.586591] Mem abort info:
[   24.586844]   ESR = 0x96000004
[   24.587120]   EC = 0x25: DABT (current EL), IL = 32 bits
[   24.587591]   SET = 0, FnV = 0
[   24.587865]   EA = 0, S1PTW = 0
[   24.588145] Data abort info:
[   24.588404]   ISV = 0, ISS = 0x00000004
[   24.588746]   CM = 0, WnR = 0
[   24.589014] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000040
[   24.589786] Mem abort info:
[   24.590038]   ESR = 0x96000004
[   24.590312]   EC = 0x25: DABT (current EL), IL = 32 bits
[   24.590781]   SET = 0, FnV = 0
[   24.591055]   EA = 0, S1PTW = 0
[   24.591336] Data abort info:
[   24.591594]   ISV = 0, ISS = 0x00000004
[   24.591934]   CM = 0, WnR = 0
[   24.592201] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000040
[   24.592973] Mem abort info:
[   24.593225]   ESR = 0x96000004
[   24.593499]   EC = 0x25: DABT (current EL), IL = 32 bits
[   24.593968]   SET = 0, FnV = 0
[   24.594241]   EA = 0, S1PTW = 0
[   24.594522] Data abort info:
[   24.594780]   ISV = 0, ISS = 0x00000004
[   24.595121]   CM = 0, WnR = 0
[   24.595388] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000040
[   24.596161] Mem abort info:
[   24.596412]   ESR = 0x96000004
[   24.596686]   EC = 0x25: DABT (current EL), IL = 32 bits
[   24.597155]   SET = 0, FnV = 0
[   24.597428]   EA = 0, S1PTW = 0
[   24.597709] Data abort info:
[   24.597967]   ISV = 0, ISS = 0x00000004
[   24.598308]   CM = 0, WnR = 0
[   24.598576] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000040
[   24.599349] Mem abort info:
[   24.599601]   ESR = 0x96000004
[   24.599874]   EC = 0x25: DABT (current EL), IL = 32 bits
[   24.600343]   SET = 0, FnV = 0
[   24.600616]   EA = 0, S1PTW = 0
[   24.600896] Data abort info:
[   24.601154]   ISV = 0, ISS = 0x00000004
[   24.601495]   CM = 0, WnR = 0
[   24.601762] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000040
[   24.602534] Mem abort info:
[   24.602784]   ESR = 0x96000004
[   24.603058]   EC = 0x25: DABT (current EL), IL = 32 bits
[   24.603527]   SET = 0, FnV = 0
[   24.603800]   EA = 0, S1PTW = 0
[   24.604081] Data abort info:
[   24.604339]   ISV = 0, ISS = 0x00000004
[   24.604680]   CM = 0, WnR = 0
[   24.604946] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000040
[   24.605718] Mem abort info:
[   24.605968]   ESR = 0x96000004
[   24.606242]   EC = 0x25: DABT (current EL), IL = 32 bits
[   24.606711]   SET = 0, FnV = 0
[   24.606984]   EA = 0, S1PTW = 0
[   24.607263] Data abort info:
[   24.607521]   ISV = 0, ISS = 0x00000004
[   24.607863]   CM = 0, WnR = 0
[   24.608129] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000040
[   24.608901] Mem abort info:
[   24.609153]   ESR = 0x96000004
[   24.609426]   EC = 0x25: DABT (current EL), IL = 32 bits
[   24.609895]   SET = 0, FnV = 0
[   24.610168]   EA = 0, S1PTW = 0
[   24.610448] Data abort info:
[   24.610706]   ISV = 0, ISS = 0x00000004
[   24.611048]   CM = 0, WnR = 0
[   24.611314] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000040
[   24.612086] Mem abort info:
[   24.612336]   ESR = 0x96000004
[   24.612610]   EC = 0x25: DABT (current EL), IL = 32 bits
[   24.613079]   SET = 0, FnV = 0
[   24.613352]   EA = 0, S1PTW = 0
[   24.613633] Data abort info:
[   24.613891]   ISV = 0, ISS = 0x00000004
[   24.614232]   CM = 0, WnR = 0
[   24.614500] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000040
[   24.615272] Mem abort info:
[   24.615522]   ESR = 0x96000004
[   24.615796]   EC = 0x25: DABT (current EL), IL = 32 bits
[   24.616265]   SET = 0, FnV = 0
[   24.616538]   EA = 0, S1PTW = 0
[   24.616819] Data abort info:
[   24.617077]   ISV = 0, ISS = 0x00000004
[   24.617418]   CM = 0, WnR = 0
[   24.617684] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000040
[   24.618457] Mem abort info:
[   24.618707]   ESR = 0x96000004
[   24.618980]   EC = 0x25: DABT (current EL), IL = 32 bits
[   24.619450]   SET = 0, FnV = 0
[   24.619723]   EA = 0, S1PTW = 0
[   24.620004] Data abort info:
[   24.620262]   ISV = 0, ISS = 0x00000004
[   24.620603]   CM = 0, WnR = 0
[   24.620871] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000040
[   24.621644] Mem abort info:
[   24.621893]   ESR = 0x96000004
[   24.622167]   EC = 0x25: DABT (current EL), IL = 32 bits
[   24.622637]   SET = 0, FnV = 0
[   24.622910]   EA = 0, S1PTW = 0
[   24.623191] Data abort info:
[   24.623448]   ISV = 0, ISS = 0x00000004
[   24.623790]   CM = 0, WnR = 0
[   24.624056] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000040
[   24.624828] Mem abort info:
[   24.625080]   ESR = 0x96000004
[   24.625354]   EC = 0x25: DABT (current EL), IL = 32 bits
[   24.625823]   SET = 0, FnV = 0
[   24.626096]   EA = 0, S1PTW = 0
[   24.626377] Data abort info:
[   24.626635]   ISV = 0, ISS = 0x00000004
[   24.626976]   CM = 0, WnR = 0
[   24.627244] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000040
[   24.628016] Mem abort info:
[   24.628266]   ESR = 0x96000004
[   24.628539]   EC = 0x25: DABT (current EL), IL = 32 bits
[   24.629009]   SET = 0, FnV = 0
[   24.629282]   EA = 0, S1PTW = 0
[   24.629563] Data abort info:
[   24.629821]   ISV = 0, ISS = 0x00000004
[   24.630162]   CM = 0, WnR = 0
[   24.630428] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000040
[   24.631201] Mem abort info:
[   24.631451]   ESR = 0x96000004
[   24.631724]   EC = 0x25: DABT (current EL), IL = 32 bits
[   24.632193]   SET = 0, FnV = 0
[   24.632466]   EA = 0, S1PTW = 0
[   24.632747] Data abort info:
[   24.633005]   ISV = 0, ISS = 0x00000004
[   24.633346]   CM = 0, WnR = 0
[   24.633612] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000040
[   24.634384] Mem abort info:
[   24.634634]   ESR = 0x96000004
[   24.634908]   EC = 0x25: DABT (current EL), IL = 32 bits
[   24.635377]   SET = 0, FnV = 0
[   24.635650]   EA = 0, S1PTW = 0
[   24.635931] Data abort info:
[   24.636189]   ISV = 0, ISS = 0x00000004
[   24.636530]   CM = 0, WnR = 0
[   24.636798] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000040
[   24.637570] Mem abort info:
[   24.637820]   ESR = 0x96000004
[   24.638094]   EC = 0x25: DABT (current EL), IL = 32 bits
[   24.638563]   SET = 0, FnV = 0
[   24.638836]   EA = 0, S1PTW = 0
[   24.639116] Data abort info:
[   24.639375]   ISV = 0, ISS = 0x00000004
[   24.639715]   CM = 0, WnR = 0
[   24.639982] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000040
[   24.640754] Mem abort info:
[   24.641004]   ESR = 0x96000004
[   24.641277]   EC = 0x25: DABT (current EL), IL = 32 bits
[   24.641746]   SET = 0, FnV = 0
[   24.642019]   EA = 0, S1PTW = 0
[   24.642300] Data abort info:
[   24.642558]   ISV = 0, ISS = 0x00000004
[   24.642899]   CM = 0, WnR = 0
[   24.643165] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000040
[   24.643938] Mem abort info:
[   24.644188]   ESR = 0x96000004
[   24.644461]   EC = 0x25: DABT (current EL), IL = 32 bits
[   24.644930]   SET = 0, FnV = 0
[   24.645203]   EA = 0, S1PTW = 0
[   24.645484] Data abort info:
[   24.645742]   ISV = 0, ISS = 0x00000004
[   24.646083]   CM = 0, WnR = 0
[   24.646351] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000040
[   24.647123] Mem abort info:
[   24.647373]   ESR = 0x96000004
[   24.647646]   EC = 0x25: DABT (current EL), IL = 32 bits
[   24.648115]   SET = 0, FnV = 0
[   24.648388]   EA = 0, S1PTW = 0
[   24.648668] Data abort info:
[   24.648926]   ISV = 0, ISS = 0x00000004
[   24.649267]   CM = 0, WnR = 0
[   24.649533] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000040
[   24.650305] Mem abort info:
[   24.650555]   ESR = 0x96000004
[   24.650829]   EC = 0x25: DABT (current EL), IL = 32 bits
[   24.651298]   SET = 0, FnV = 0
[   24.651571]   EA = 0, S1PTW = 0
[   24.651852] Data abort info:
[   24.652109]   ISV = 0, ISS = 0x00000004
[   24.652451]   CM = 0, WnR = 0
[   24.652717] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000040
[   24.653489] Mem abort info:
[   24.653739]   ESR = 0x96000004
[   24.654013]   EC = 0x25: DABT (current EL), IL = 32 bits
[   24.654482]   SET = 0, FnV = 0
[   24.654755]   EA = 0, S1PTW = 0
[   24.655036] Data abort info:
[   24.655294]   ISV = 0, ISS = 0x00000004
[   24.655635]   CM = 0, WnR = 0
[   24.656017] Insufficient stack space to handle exception!
[   24.656021] ESR: 0x96000047 -- DABT (current EL)
[   24.656022] FAR: 0xffff800011b9fff0
[   24.656024] Task stack:     [0xffff800011ba0000..0xffff800011ba4000]
[   24.656026] IRQ stack:      [0xffff800011ad8000..0xffff800011adc000]
[   24.656028] Overflow stack: [0xffff0000f77932b0..0xffff0000f77942b0]
[   24.656031] CPU: 4 PID: 0 Comm: swapper/4 Tainted: P         C OE     5.8.17-rockchip64 #20.08.21
[   24.656032] Hardware name: Helios64 (DT)
[   24.656034] pstate: 80000085 (Nzcv daIf -PAN -UAO BTYPE=--)
[   24.656036] pc : format_decode+0x4/0x4a8
[   24.656037] lr : vsnprintf+0x8c/0x728
[   24.656039] sp : ffff800011ba0020
[   24.656040] x29: ffff800011ba0020 x28: ffff8000111db6b8
[   24.656045] x27: ffff800011a1d238 x26: 0000000000000020
[   24.656049] x25: 0000000000000000 x24: 00000000000003e0
[   24.656053] x23: 00000000ffffffc8 x22: ffff800010ecb890
[   24.656056] x21: ffff800011ba0350 x20: ffff800011a1d238
[   24.656060] x19: ffff800011a1d618 x18: 0000000000000010
[   24.656064] x17: 0000000000000001 x16: 0000000000000019
[   24.656068] x15: ffff0000f6ea5ba8 x14: 0720072007200720
[   24.656072] x13: 0720072007200720 x12: 0720072007200720
[   24.656075] x11: ffff800011ba0350 x10: ffff800011ba0350
[   24.656079] x9 : ffff800011ba0350 x8 : ffff800011ba0350
[   24.656083] x7 : ffff800011ba0350 x6 : ffff800011ba0350
[   24.656086] x5 : 0000000000000000 x4 : ffff0000f6ea5700
[   24.656090] x3 : ffff800011ba00d0 x2 : ffff8000111db6b8
[   24.656094] x1 : ffff800011ba00a0 x0 : ffff8000111db6b8
[   24.656098] Kernel panic - not syncing: kernel stack overflow
[   24.656100] SMP: stopping secondary CPUs
[   24.656102] Kernel Offset: disabled
[   24.656103] CPU features: 0x240022,2000600c
[   24.656105] Memory Limit: none
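
A note on reading the panic above: the reported FAR (0xffff800011b9fff0) is just below the low end of the task stack range, and sp (ffff800011ba0020) is only just above it. That would be consistent with the task stack running out while the repeated NULL pointer reports were being formatted (pc is in format_decode, lr in vsnprintf). Below is a minimal Python sketch of that check, using only values copied from the log. The helper name `locate` is made up for illustration and is not part of any kernel tooling.

```python
# Values copied from the panic output above.
FAR = 0xffff800011b9fff0
SP = 0xffff800011ba0020
STACKS = {
    "task": (0xffff800011ba0000, 0xffff800011ba4000),
    "irq": (0xffff800011ad8000, 0xffff800011adc000),
    "overflow": (0xffff0000f77932b0, 0xffff0000f77942b0),
}


def locate(addr):
    """Return the name of the stack whose [low, high) range contains addr, else None."""
    for name, (low, high) in STACKS.items():
        if low <= addr < high:
            return name
    return None


task_low = STACKS["task"][0]
print("FAR is in:", locate(FAR))
print("SP is in:", locate(SP))
print("FAR is", task_low - FAR, "bytes below the task stack")
print("SP is", SP - task_low, "bytes above the bottom of the task stack")
```

The faulting address falls outside all three stacks the kernel lists, which is why it reports "Insufficient stack space to handle exception!" rather than an ordinary oops.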

 

 

And trying again, this time it fails differently (a bad PC value rather than a stack overflow):

[   19.133928] Unable to handle kernel paging request at virtual address ffff80000ee0257c
[   19.134640] Mem abort info:
[   19.134892]   ESR = 0x86000006
[   19.135169]   EC = 0x21: IABT (current EL), IL = 32 bits
[   19.135639]   SET = 0, FnV = 0
[   19.135913]   EA = 0, S1PTW = 0
[   19.136197] swapper pgtable: 4k pages, 48-bit VAs, pgdp=00000000035ec000
[   19.136789] [ffff80000ee0257c] pgd=00000000f7fff003, p4d=00000000f7fff003, pud=00000000f7ffe003, pmd=0000000000000000
[   19.137730] Internal error: Oops: 86000006 [#1] PREEMPT SMP
[   19.138224] Modules linked in: zfs(POE) zunicode(POE) zzstd(OE) zlua(OE) zcommon(POE) znvpair(POE) zavl(POE) icp(POE) spl(OE) r8152 snd_soc_hdmi_codec panfrost snd_soc_rockchip_i2s gpu_sched snd_soc_core leds_pwm snd_pcm_dmaengine pwm_fan gpio_charger snd_pcm hantro_vpu(C) snd_timer rockchip_rga rockchip_vdec(C) snd videobuf2_dma_sg soundcore v4l2_h264 videobuf2_dma_contig videobuf2_vmalloc fusb30x(C) v4l2_mem2mem videobuf2_memops videobuf2_v4l2 videobuf2_common videodev mc sg gpio_beeper cpufreq_dt sch_fq_codel nfsd auth_rpcgss nfs_acl lockd grace lm75 sunrpc ip_tables x_tables autofs4 raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx raid1 raid0 multipath linear md_mod realtek rockchipdrm analogix_dp dw_hdmi dw_mipi_dsi drm_kms_helper cec rc_core dwmac_rk stmmac_platform drm stmmac mdio_xpcs drm_panel_orientation_quirks adc_keys
[   19.144940] CPU: 4 PID: 0 Comm: swapper/4 Tainted: P         C OE     5.8.17-rockchip64 #20.08.21
[   19.145721] Hardware name: Helios64 (DT)
[   19.146073] pstate: 80000085 (Nzcv daIf -PAN -UAO BTYPE=--)
[   19.146572] pc : 0xffff80000ee0257c
[   19.146894] lr : _raw_spin_lock_irqsave+0x28/0xa0
[   19.147313] sp : ffff800011adbeb0
[   19.147609] x29: ffff800011adbeb0 x28: 0000000000000001
[   19.148083] x27: ffff0000f6ea5700 x26: ffff800011adc000
[   19.148555] x25: ffff800011501d20 x24: 0000000000000000
[   19.149027] x23: 0000000000000000 x22: ffff0000f6ea5700
[   19.149498] x21: ffff0000f77a8b40 x20: 0000000000000080
[   19.149970] x19: ffff0000f77a8b40 x18: 0000000000000000
[   19.150441] x17: 0000000000000001 x16: 0000000000000019
[   19.150912] x15: 0000000000000006 x14: 000010670d7edc0e
[   19.151384] x13: 00000000000003fd x12: 0000000000000006
[   19.151855] x11: 0000000000000001 x10: 0000000000000a20
[   19.152327] x9 : ffff800011ba3e70 x8 : ffff0000f6ea6180
[   19.152798] x7 : 00000000ffffffff x6 : 00000000351d78da
[   19.153270] x5 : 00ffffffffffffff x4 : 002b646607bcf500
[   19.153741] x3 : 0000000000000000 x2 : 0000000000000001
[   19.154212] x1 : 0000000000000000 x0 : 0000000000000000
[   19.154684] Call trace:
[   19.154906]  0xffff80000ee0257c
[   19.155196]  sched_ttwu_pending+0x58/0x168
[   19.155566]  flush_smp_call_function_queue+0xec/0x258
[   19.156018]  generic_smp_call_function_single_interrupt+0x14/0x20
[   19.156561]  handle_IPI+0x258/0x3e8
[   19.156876]  gic_handle_irq+0x154/0x158
[   19.157220]  el1_irq+0xb8/0x180
[   19.157505]  arch_cpu_idle+0x28/0x218
[   19.157836]  default_idle_call+0x1c/0x44
[   19.158188]  do_idle+0x210/0x288
[   19.158478]  cpu_startup_entry+0x28/0x68
[   19.158830]  secondary_start_kernel+0x140/0x178
[   19.159239] Code: bad PC value
[   19.159524] ---[ end trace 99042d0e071b2912 ]---
[   19.159936] Kernel panic - not syncing: Fatal exception in interrupt
[   19.160500] SMP: stopping secondary CPUs
[   20.327519] SMP: failed to stop secondary CPUs 3-5
[   20.327945] Kernel Offset: disabled
[   20.328257] CPU features: 0x240022,2000600c
[   20.328629] Memory Limit: none
[   20.328915] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
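
For anyone comparing the ESR values across these crashes (0x96000004, 0x96000047 and 0x86000006), here is a small Python sketch that splits them into the fields the kernel prints. It assumes the standard AArch64 ESR_EL1 layout (EC in bits [31:26], IL in bit 25, ISS in bits [24:0], with the fault status code in the low six bits of ISS for aborts). The function names `decode_esr` and `describe` are made up for illustration, and the name table only covers the two exception classes that appear in this thread.

```python
def decode_esr(esr):
    """Split an AArch64 ESR_EL1 value into the fields the kernel prints."""
    ec = (esr >> 26) & 0x3F      # exception class, bits [31:26]
    il = (esr >> 25) & 0x1       # instruction length bit, 1 = 32-bit
    iss = esr & 0x1FFFFFF        # instruction specific syndrome, bits [24:0]
    fsc = iss & 0x3F             # fault status code, bits [5:0] for aborts
    return {"ec": ec, "il": il, "iss": iss, "fsc": fsc}


# Only the two classes seen in this thread; labels as printed by the kernel.
EC_NAMES = {0x21: "IABT (current EL)", 0x25: "DABT (current EL)"}


def describe(esr):
    """Return a one-line summary of an abort syndrome value."""
    f = decode_esr(esr)
    name = EC_NAMES.get(f["ec"], "other")
    # FSC 0b0001LL is a translation fault at page-table level LL.
    if f["fsc"] >> 2 == 0b0001:
        fault = f"translation fault, level {f['fsc'] & 0x3}"
    else:
        fault = f"FSC 0x{f['fsc']:02x}"
    return f"EC=0x{f['ec']:02x} {name}, ISS=0x{f['iss']:08x}, {fault}"


for value in (0x96000004, 0x96000047, 0x86000006):
    print(hex(value), "->", describe(value))
```

Under those assumptions, 0x86000006 decodes as an instruction abort with a level 2 translation fault, which lines up with the empty pmd entry in the page table dump above: the CPU tried to fetch an instruction from an address that is not mapped at all.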

 

Edited by jbergler
more details