Kernel panic in 5.8.17 20.08.21



I'm still experiencing random kernel panics with the latest build. Sometimes it boots, sometimes it doesn't, which is very frustrating when operating from a remote location.

 


DDR Version 1.24 20191016
In
channel 0
CS = 0
MR0=0x18
MR4=0x1
MR5=0x1
MR8=0x10
MR12=0x72
MR14=0x72
MR18=0x0
MR19=0x0
MR24=0x8
MR25=0x0
channel 1
CS = 0
MR0=0x18
MR4=0x1
MR5=0x1
MR8=0x10
MR12=0x72
MR14=0x72
MR18=0x0
MR19=0x0
MR24=0x8
MR25=0x0
channel 0 training pass!
channel 1 training pass!
change freq to 416MHz 0,1
Channel 0: LPDDR4,416MHz
Bus Width=32 Col=10 Bank=8 Row=16 CS=1 Die Bus-Width=16 Size=2048MB
Channel 1: LPDDR4,416MHz
Bus Width=32 Col=10 Bank=8 Row=16 CS=1 Die Bus-Width=16 Size=2048MB
256B stride
channel 0
CS = 0
MR0=0x18
MR4=0x1
MR5=0x1
MR8=0x10
MR12=0x72
MR14=0x72
MR18=0x0
MR19=0x0
MR24=0x8
MR25=0x0
channel 1
CS = 0
MR0=0x18
MR4=0x1
MR5=0x1
MR8=0x10
MR12=0x72
MR14=0x72
MR18=0x0
MR19=0x0
MR24=0x8
MR25=0x0
channel 0 training pass!
channel 1 training pass!
channel 0, cs 0, advanced training done
channel 1, cs 0, advanced training done
change freq to 856MHz 1,0
ch 0 ddrconfig = 0x101, ddrsize = 0x40
ch 1 ddrconfig = 0x101, ddrsize = 0x40
pmugrf_os_reg[2] = 0x32C1F2C1, stride = 0xD
ddr_set_rate to 328MHZ
ddr_set_rate to 666MHZ
ddr_set_rate to 928MHZ
channel 0, cs 0, advanced training done
channel 1, cs 0, advanced training done
ddr_set_rate to 416MHZ, ctl_index 0
ddr_set_rate to 856MHZ, ctl_index 1
support 416 856 328 666 928 MHz, current 856MHz
OUT
Boot1: 2019-03-14, version: 1.19
CPUId = 0x0
ChipType = 0x10, 254
SdmmcInit=2 0
BootCapSize=100000
UserCapSize=14910MB
FwPartOffset=2000 , 100000
mmc0:cmd5,20
SdmmcInit=0 0
BootCapSize=0
UserCapSize=30436MB
FwPartOffset=2000 , 0
StorageInit ok = 79856
SecureMode = 0
SecureInit read PBA: 0x4
SecureInit read PBA: 0x404
SecureInit read PBA: 0x804
SecureInit read PBA: 0xc04
SecureInit read PBA: 0x1004
SecureInit read PBA: 0x1404
SecureInit read PBA: 0x1804
SecureInit read PBA: 0x1c04
SecureInit ret = 0, SecureMode = 0
atags_set_bootdev: ret:(0)
GPT 0x3380ec0 signature is wrong
recovery gpt...
GPT 0x3380ec0 signature is wrong
recovery gpt fail!
LoadTrust Addr:0x4000
No find bl30.bin
No find bl32.bin
Load uboot, ReadLba = 2000
Load OK, addr=0x200000, size=0xdd6b0
RunBL31 0x40000
NOTICE:  BL31: v1.3(debug):42583b6
NOTICE:  BL31: Built : 07:55:13, Oct 15 2019
NOTICE:  BL31: Rockchip release version: v1.1
INFO:    GICv3 with legacy support detected. ARM GICV3 driver initialized in EL3
INFO:    Using opteed sec cpu_context!
INFO:    boot cpu mask: 0
INFO:    plat_rockchip_pmu_init(1190): pd status 3e
INFO:    BL31: Initializing runtime services
WARNING: No OPTEE provided by BL2 boot loader, Booting device without OPTEE initialization. SMC`s destined for OPTEE will return SMC_UNK
ERROR:   Error initializing runtime service opteed_fast
INFO:    BL31: Preparing for EL3 exit to normal world
INFO:    Entry point address = 0x200000
INFO:    SPSR = 0x3c9


U-Boot 2020.07-armbian (Oct 31 2020 - 08:21:38 +0100)

SoC: Rockchip rk3399
Reset cause: POR
DRAM:  3.9 GiB
PMIC:  RK808
SF: Detected w25q128 with page size 256 Bytes, erase size 4 KiB, total 16 MiB
MMC:   mmc@fe320000: 1, sdhci@fe330000: 0
Loading Environment from MMC... *** Warning - bad CRC, using default environment

In:    serial
Out:   serial
Err:   serial
Model: Helios64
Revision: 1.2 - 4GB non ECC
Net:   eth0: ethernet@fe300000
scanning bus for devices...
Hit any key to stop autoboot:  0
switch to partitions #0, OK
mmc1 is current device
Scanning mmc 1:1...
Found U-Boot script /boot/boot.scr
3185 bytes read in 6 ms (517.6 KiB/s)
## Executing script at 00500000
Boot script loaded from mmc 1
166 bytes read in 5 ms (32.2 KiB/s)
16003186 bytes read in 683 ms (22.3 MiB/s)
27331072 bytes read in 1160 ms (22.5 MiB/s)
79946 bytes read in 12 ms (6.4 MiB/s)
2698 bytes read in 8 ms (329.1 KiB/s)
Applying kernel provided DT fixup script (rockchip-fixup.scr)
## Executing script at 09000000
## Loading init Ramdisk from Legacy Image at 06000000 ...
   Image Name:   uInitrd
   Image Type:   AArch64 Linux RAMDisk Image (gzip compressed)
   Data Size:    16003122 Bytes = 15.3 MiB
   Load Address: 00000000
   Entry Point:  00000000
   Verifying Checksum ... OK
## Flattened Device Tree blob at 01f00000
   Booting using the fdt blob at 0x1f00000
   Loading Ramdisk to f4fa3000, end f5ee6032 ... OK
   Loading Device Tree to 00000000f4f27000, end 00000000f4fa2fff ... OK

Starting kernel ...

[   26.392050] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[   26.392551] Modules linked in: r8152 snd_soc_hdmi_codec snd_soc_rockchip_i2s snd_soc_core snd_pcm_dmaengine snd_pcm snd_timer snd panfrost leds_pwm gpio_charger pwm_fan gpu_sched rockchipdrm soundcore rockchip_vdec(C) dw_mipi_dsi hantro_vpu(C) dw_hdmi v4l2_h264 rockchip_rga videobuf2_dma_contig analogix_dp videobuf2_vmalloc videobuf2_dma_sg v4l2_mem2mem videobuf2_memops videobuf2_v4l2 drm_kms_helper videobuf2_common videodev sg cec rc_core mc fusb30x(C) drm drm_panel_orientation_quirks gpio_beeper cpufreq_dt zfs(POE) zunicode(POE) zavl(POE) icp(POE) zlua(POE) nfsd auth_rpcgss zcommon(POE) nfs_acl znvpair(POE) lockd spl(OE) grace lm75 sunrpc ip_tables x_tables autofs4 raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx raid1 raid0 multipath linear md_mod realtek dwmac_rk stmmac_platform stmmac mdio_xpcs adc_keys
[   26.399102] CPU: 4 PID: 0 Comm: swapper/4 Tainted: P         C OE     5.8.17-rockchip64 #20.08.21
[   26.399884] Hardware name: Helios64 (DT)
[   26.400237] pstate: 00000085 (nzcv daIf -PAN -UAO BTYPE=--)
[   26.400740] pc : do_undefinstr+0x2ec/0x310
[   26.401107] lr : do_undefinstr+0x1e0/0x310
[   26.401472] sp : ffff800011adbcd0
[   26.401769] x29: ffff800011adbcd0 x28: ffff0000f6ea5700
[   26.402242] x27: ffff0000f6ea5700 x26: ffff800011adc000
[   26.402714] x25: ffff800011501d20 x24: 0000000000000000
[   26.403186] x23: 0000000060000085 x22: ffff800010df9fc0
[   26.403658] x21: ffff800011adbe80 x20: ffff0000f6ea5700
[   26.404130] x19: ffff800011adbd40 x18: 0000000000000000
[   26.404601] x17: 00018f3ebe947358 x16: 000142208873be78
[   26.405073] x15: 0000000000000006 x14: 000000000000021c
[   26.405544] x13: 000000000000029a x12: 00000000000002a4
[   26.406015] x11: 0000000000000001 x10: 0000000000000a20
[   26.406487] x9 : ffff800011ba3e70 x8 : ffff0000f6ea6180
[   26.406958] x7 : 00000000ffffffff x6 : 000000000000001f
[   26.407430] x5 : 0000000000000000 x4 : ffff800011816118
[   26.407901] x3 : 0000000000000005 x2 : 0000000000010002
[   26.408373] x1 : ffff0000f6ea5700 x0 : 0000000060000085
[   26.408845] Call trace:
[   26.409068]  do_undefinstr+0x2ec/0x310
[   26.409406]  el1_sync_handler+0x88/0x110
[   26.409757]  el1_sync+0x7c/0x100
[   26.410051]  check_preemption_disabled+0x18/0x108
[   26.410470]  debug_smp_processor_id+0x20/0x30
[   26.410862]  sched_ttwu_pending+0x34/0x168
[   26.411230]  flush_smp_call_function_queue+0xec/0x258
[   26.411682]  generic_smp_call_function_single_interrupt+0x14/0x20
[   26.412223]  handle_IPI+0x258/0x3e8
[   26.412539]  gic_handle_irq+0x154/0x158
[   26.412882]  el1_irq+0xb8/0x180
[   26.413166]  arch_cpu_idle+0x28/0x218
[   26.413496]  default_idle_call+0x1c/0x44
[   26.413847]  do_idle+0x210/0x288
[   26.414137]  cpu_startup_entry+0x28/0x68
[   26.414490]  secondary_start_kernel+0x140/0x178
[   26.414898] Code: f9401bf7 17ffff7d a9025bf5 f9001bf7 (d4210000)
[   26.415448] ---[ end trace 720e27ef39d9569d ]---
[   26.415860] Kernel panic - not syncing: Fatal exception in interrupt
[   26.416425] SMP: stopping secondary CPUs
[   26.416779] Kernel Offset: disabled
[   26.417091] CPU features: 0x240022,2000600c
[   26.417461] Memory Limit: none
[   26.417746] Rebooting in 90 seconds..

 

 

Edited by TRS-80
put long output inside code block inside spoiler

Additional crash

 


Starting kernel ...

[   24.230268] Internal error: Oops: 96000004 [#1] PREEMPT SMP
[   24.230776] Modules linked in: snd_soc_hdmi_codec r8152 snd_soc_rockchip_i2s snd_soc_core rockchip_vdec(C) hantro_vpu(C) snd_pcm_dmaengine pwm_fan leds_pwm gpio_charger rockchip_rga snd_pcm v4l2_h264 videobuf2_dma_contig rockchipdrm snd_timer v4l2_mem2mem videobuf2_dma_sg videobuf2_vmalloc snd panfrost dw_mipi_dsi videobuf2_memops videobuf2_v4l2 dw_hdmi soundcore gpu_sched videobuf2_common analogix_dp videodev drm_kms_helper fusb30x(C) cec mc rc_core sg drm drm_panel_orientation_quirks zfs(POE) gpio_beeper cpufreq_dt zunicode(POE) zavl(POE) icp(POE) zlua(POE) nfsd auth_rpcgss zcommon(POE) nfs_acl znvpair(POE) lockd grace spl(OE) lm75 sunrpc ip_tables x_tables autofs4 raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx raid1 raid0 multipath linear md_mod realtek dwmac_rk stmmac_platform stmmac mdio_xpcs adc_keys
[   24.237322] CPU: 5 PID: 0 Comm: swapper/5 Tainted: P         C OE     5.8.17-rockchip64 #20.08.21
[   24.238105] Hardware name: Helios64 (DT)
[   24.238459] pstate: 80000085 (Nzcv daIf -PAN -UAO BTYPE=--)
[   24.238963] pc : sched_clock+0x3c/0x90
[   24.239301] lr : sched_clock_cpu+0x14/0x28
[   24.239664] sp : ffff800011ae3e60
[   24.239961] x29: 0000000000000100 x28: 0000000000000001
[   24.240434] x27: ffff0000f6ea6580 x26: ffff800011ae4000
[   24.240907] x25: ffff800011501d20 x24: 0000000000000168
[   24.241378] x23: ffff800011836f08 x22: 0000000000000004
[   24.241850] x21: ffff815c18a74108 x20: ffff800011836f00
[   24.242322] x19: ffff015c2a2ab010 x18: 0000000000000000
[   24.242793] x17: 00022f77f6ad6250 x16: ffff0000e0453fa8
[   24.243265] x15: 0000000000000006 x14: 00001401d89ead14
[   24.243736] x13: 0000000000000339 x12: 000000000000033f
[   24.244207] x11: 0000000000000001 x10: 0000000000000a20
[   24.244679] x9 : ffff800011babe70 x8 : ffff0000f6ea7000
[   24.245150] x7 : 00000000ffffffff x6 : 00000000388e7b38
[   24.245622] x5 : 00ffffffffffffff x4 : 003305860a2eef00
[   24.246093] x3 : 0000000000000000 x2 : 0000000000000080
[   24.246565] x1 : 0000000000000004 x0 : 0000000000000005
[   24.247036] Call trace:
[   24.247259]  sched_clock+0x3c/0x90
[   24.247570] Code: d50339bf 120002d5 9bb87eb5 8b1502f3 (f9400e60)
[   24.248120] ---[ end trace af056133bccc9297 ]---
[   24.248531] Kernel panic - not syncing: Fatal exception in interrupt
[   24.249096] SMP: stopping secondary CPUs
[   25.416116] SMP: failed to stop secondary CPUs 0-1,5
[   25.416556] Kernel Offset: disabled
[   25.416869] CPU features: 0x240022,2000600c
[   25.417241] Memory Limit: none
[   25.417527] Rebooting in 90 seconds..

 

 


@SymbiosisSystems

Since your system is crashing often, my guess is that you're not using it in a production environment yet.

Have you tried the test builds at the bottom of the downloads page with the newer kernel? You might have better luck with those. I'm using the test build from Nov. 13 and it has been very stable.

I'm not using OMV, just the OS and a few packages that I've configured manually to provide SMB, DLNA and iSCSI services.


So good news / bad news. The good news is that the 5.9.x kernel does indeed seem to have resolved the boot-time kernel panics I was experiencing, and I can now reboot the board consistently without getting stuck in the previous kernel panic / retry boot loops. The bad news is that I can't get ZFS working: 0.8.4 in the backports repo doesn't support kernels after 5.6, and 0.8.5 isn't compatible with the zfsutils-linux package!
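For anyone scripting around this, here is a rough sketch of a guard that compares a kernel release string against the newest series a given ZFS build supports. The 5.6 ceiling is simply the figure quoted above for 0.8.4, and the function name kernel_at_most is made up for illustration. It relies on sort -V (version sort), which GNU coreutils provides.

```shell
#!/bin/sh
# Return 0 if kernel release $1 belongs to series $2 (major.minor) or older.
# kernel_at_most is a hypothetical helper name, not part of any package.
kernel_at_most() {
    # Reduce e.g. "5.8.17-rockchip64" to "5.8"
    release_series=$(printf '%s\n' "$1" | cut -d. -f1,2 | sed 's/[^0-9.].*//')
    # Version-sort the two series; the last line is the newer one
    newest=$(printf '%s\n%s\n' "$release_series" "$2" | sort -V | tail -n 1)
    [ "$newest" = "$2" ]
}

# Assumption: 5.6 is the last supported series for ZFS 0.8.4, as stated in this thread
if kernel_at_most "$(uname -r)" 5.6; then
    echo "running kernel is within the 5.6 ceiling"
else
    echo "running kernel is newer than 5.6"
fi
```

With the kernels from this thread, both 5.8.17-rockchip64 and any 5.9.x release fall on the "newer than 5.6" side.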


@SymbiosisSystems I take it you have a set of 0.8.5 modules built? They work fine with the 0.8.4 zfsutils-linux package, but that package requires zfs-dkms, which will fail to build. We can work around this by installing a dummy package that provides zfs-dkms, so that we can then go ahead and install zfsutils-linux / zfs-zed / etc. from backports.

 

Here's how you can create a dummy package:

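# equivs builds small placeholder .deb packages from a control file
apt-get install --yes equivs

mkdir zfs-dkms-dummy && cd zfs-dkms-dummy

# Control file: the Provides line is what satisfies the zfs-dkms dependency
cat <<EOF >zfs-dkms
Section: misc
Priority: optional
Standards-Version: 3.9.2

Package: zfs-dkms-dummy
Version: 0.8.4
Maintainer: Me <me@localhost>
Provides: zfs-dkms
Architecture: all
Description: Dummy zfs-dkms package for when using built kmod
EOF

# Build the dummy package, then install it
equivs-build zfs-dkms
dpkg -i zfs-dkms-dummy_0.8.4_all.deb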

 

After this, you can go ahead and install (if not already installed) the 0.8.5 modules (kmod-zfs-5.*-rockchip64_0.8.5-1_arm64.deb) and zfsutils-linux.
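If the dummy package doesn't satisfy the dependency, a typo in the control file is a likely cause. Below is a small sketch for reading a field back out of the control file before building. The control_field helper is hypothetical (not part of equivs or dpkg) and assumes simple single-line "Field: value" entries like the ones in the zfs-dkms file above.

```shell
#!/bin/sh
# Print the value of one field from a simple Debian-style control file.
# control_field is a hypothetical helper; it only handles single-line fields.
control_field() {
    # $1 = field name, $2 = path to the control file
    awk -v field="$1" '
        index($0, field ":") == 1 {
            # Strip the field name, the colon and any following whitespace
            sub(/^[^:]*:[ \t]*/, "")
            print
            exit
        }' "$2"
}

# Possible usage, run from the zfs-dkms-dummy directory:
#   [ "$(control_field Provides zfs-dkms)" = "zfs-dkms" ] && equivs-build zfs-dkms
```

The same helper can read back Package and Version, which together determine the name of the .deb that equivs-build writes.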
