
[RK3399 OPI4-LTS, 5.15-6.0] An NVMe drive has a 50% chance of causing a kernel panic on boot



Posted (edited)

I'm using a kernel built from source, since the GitHub sources fix another bug.

The kernel bundled with the OS image does not panic. (However, that image only manages to mount an NVMe drive about 5% of the time.)

Output:
 


[   22.057154] Kernel Offset: disabled
[   22.057155] CPU features: 0x800820f1,20000846
[   22.057159] Memory Limit: none
[   22.081651] ---[ end Kernel panic - not syncing: Asynchronous SError Interrupt ]---
[   22.521737] ------------[ cut here ]------------
[   22.5217] WARNING: CPU: 4 PID: 1559 at kernel/sched/core.c:3027 set_task_cpu+0x168/0x230
[   22.521748] Modules linked in: hci_uart btqca btrtl btbcm btintel bluetooth dw_hdmi_i2s_audio dw_hdmi_cec snd_soc_hdmi_codec snd_soc_rockchip_i2s snd_soc_rockchip_pcm hantro_vpu rockchip_vdec(C) rockchip_iep v4l2_h264 rockchip_rga videobuf2_dma_contig snd_soc_simple_card snd_soc_es8316 snd_soc_simple_card_utils videobuf2_vmalloc v4l2_mem2mem videobuf2_dma_sg videobuf2_memops fusb302 tcpm typec videobuf2_v4l2 snd_soc_core videobuf2_common videodev snd_pcm_dmaengine snd_pcm mc cpufreq_dt snd_timer snd soundcore tcp_bbr binfmt_misc sch_fq ads7846 fbtft(C) sprdwl_ng cfg80211 auth_rpcgss nfs_acl lockd sprdbt_tty grace rfkill sunrpc ramoops reed_solomon dwmac_rk spidev stmmac_platform stmmac pcs_xpcs adc_keys pwm_bl
[   22.521832] CPU: 4 PID: 1559 Comm: smartd Tainted: G         C        5.15.79-rockchip64 #trunk
[   22.521836] Hardware name: OrangePi 4 LTS (DT)
[   22.521841] pstate: (… -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[   22.521841] pc : set_task_cpu+0x168/0x230
[   22.521845] lr : try_to_wake_up+0x1a4/0x584
[   22.521849] sp : ffff800009dfbd80
[   22.521850] x29: ffff800009df80 x28: ffff0000f77832c0 x27: ff00bb3b7e0
[   22.521855] x26: ffff80000973d008 x25: ffff800009aadbf0 x24: 0000000000000005
[   22.521859] x23: ffff0000061c24a x22: 00000000000000c0 x21: 0000000000000020
[   22.521864] x20: 0000000000000005 x19: ffff0000061c1d80 x18: fffffffffffee708
[   22.521868] x17: ffff8000ee041000 x16: ffff800009dfc000 x15: 00000000004000
[   22.521872] x14: 797341203a676e69 x13: 2d2d2d5d20747075 x12: 727265746e492072
[   22.521876] x11: 000000000000000 x10: ffff8000ee041000 x9 : 0000000000000004
[   22.521881] x8 : 0000000000000020 x7 : ffffffffffffe0 x6 : 0000000000000005
[   22.521885] x5 : 00000000000004 x4 : 0000000000000001 x3 : 000000000000003f
[   22.521889] x2 : ffff0000061c1d80 x1 : ffff80000999e8 x0 : 0000000000000000
[   22.521893] Call trace:
[   22.521894]  set_task_cpu+0x168/0x230
[   22.521899]  try_to_wake_up+0x1a4/0x584
[   22.521902]  wake_up_process+0x18/0x2c
[   22.521]  hrtimer_wakeup+0x20/0x40
[   22.521913]  __hrtimer_run_queues+0x17c/0x340
[   22.521915]  hrtimer_interrupt+0xf4/0x250
[   22.521918]  arch_timer_handler_phys+0x34/0x44
[   22.521923]  handle_percpu_devid_irq+0xa4/0x240
[   22.521929]  handle_domain_irq+0x98/0xe4
[   22.521932]  gic_handle_irq+0x54/0x130
[   22.521936]  call_on_irq_stack+0x28/0x5c
[   22.521940]  do_interrupt_handler+0x54/0x60
[   22.521943]  el1_interrupt+0x30/0x80
[   22.521947]  el1h_64_irq_handler+0x18/0x24
[   22.521951]  el1h_64_irq+0x78/0x7c
[   22.521953]  __delay+0x90/0xb0
[   22.521958]  __const_udelay+0x2c/0x40
[   22.521961]  panic+0x324/0x35c
[   22.521]  nmi_panic+0x8c/0x90
[   22.521966]  arm64_serror_panic+0x64/0x74
[   22.521969]  do_serror+0x58/0x60
[   22.521971]  el1h_64_error_handler+0x30/0x50
[   22.521976]  el1h_64_error+0x78/0x7c
[   22.521978]  preempt_count_sub+0x44/0xdc
[   22.521981]  _raw_spin_unlock+0x1c/0x44
[   22.521984]  nvme_submit_cmd+0xf0/0x110
[   22.521987]  nvme_queue_rq+0x380/0x8f0
[   22.521990]  blk_mq_dispatch_rq_list+0x124/0x7ec
[   22.521992]  __blk_mq_sched_dispatch_requests+0xb4/0x154
[   22.521995]  blk_mq_sched_dispatch_requests+0x38/0x74
[   22.521998]  __blk_mq_run_hw_queue+0x5/0x90
[   22.522003]  __blk_mq_delay_run_hw_queue+0x1bc/0x1e0
[   22.522007]  blk_mq_run_hw_queue+0x94/0xfc
[   22.522011]  blk_mq_sched_insert_request+0xf4/0x120
[   22.522014]  blk_execute_rq_nowait+0x5c/0x90
[   22.522018]  blk_execute_rq+0x48/0xf0
[   22.522021]  nvme_execute_passthru_rq+0x6c/0x2e0
[   22.522025]  nvme_submit_user_cmd+0x23c/0x3d0
[   22.522029]  nvme_user_cmd+0x13c/0x240
[   22.522032]  nvme_dev_ioctl+0xf0/0x260
[   22.522036]  __arm64_sys_ioctl+0xac/0xf0
[   22.522041]  invoke_syscall+0x48/0x114
[   22.522044]  el0_svc_common.constprop.0+0x44/0xec
[   22.522048]  do_el0_svc+0x24/0x90
[   22.522052]  el0_svc+0x20/0x60
[   22.522055]  el0t_64_sync_handler+0xe8/0x114
[   22.522060]  el0t_64_sync+0x1a0/0x1a4
[   22.522062] ---[ end trace 66009970b353284 ]---


 

Dump from 'armbianmonitor -U'

https://tmpfiles.org/dl/314994/dump.txt.tar

 

(The Armbian self-hosted upload service seems to be a bit moody about when it wants to work.)
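In case the upload link rots again, the panic can usually be re-captured locally after the next crash: the module list above shows ramoops loaded, so pstore may have persisted the oops across the reboot. A minimal sketch (the pstore mount point and file names are assumptions about the stock setup, not verified Armbian specifics):

```shell
# Hedged sketch: if ramoops/pstore persisted the last oops, the traces appear
# as files under /sys/fs/pstore after reboot. Copy them somewhere safe before
# anything clears them. Paths/file names here are assumptions, adjust as needed.
mkdir -p "$HOME/panic-logs"
ls /sys/fs/pstore/ 2>/dev/null || true
cp /sys/fs/pstore/dmesg-ramoops-* "$HOME/panic-logs/" 2>/dev/null || true
```

That gives a local copy of the trace to paste inline, independent of the file host.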

Edited by Turbine
Posted

I don't have an NVMe SSD around to reproduce this, and I still don't understand what you mean by "github sources". Also, your dump link is broken.
