Posted
Armbianmonitor:

Has anybody seen this message at boot:

[    5.352826] CPU4: failed to come online
[    5.352835] CPU4: failed in unknown state : 0x0
[   10.481328] CPU5: failed to come online
[   10.481336] CPU5: failed in unknown state : 0x0
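To spot this condition quickly in a saved boot log, the messages above can be scanned with a small script. This is only a sketch: the function name `failed_cpus` is made up for illustration, and the pattern is based solely on the wording of the four lines quoted above.

```python
import re

# Matches kernel lines such as "[    5.352826] CPU4: failed to come online".
# The wording is taken from the log excerpt in this post.
FAILED_CPU = re.compile(r"CPU(\d+): failed to come online")


def failed_cpus(dmesg_text):
    """Return the sorted CPU numbers the kernel reported as failing to start."""
    return sorted({int(m.group(1)) for m in FAILED_CPU.finditer(dmesg_text)})


sample = """\
[    5.352826] CPU4: failed to come online
[    5.352835] CPU4: failed in unknown state : 0x0
[   10.481328] CPU5: failed to come online
[   10.481336] CPU5: failed in unknown state : 0x0
"""

print(failed_cpus(sample))
```

The same function can be fed the saved output of `dmesg` to compare a warm reboot against a cold boot.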

 

Then, the A72 cores don't show up (unsurprisingly since these are the two cores that somehow did not come online). On top of that, the kernel throws a backtrace (related to sound if I understand correctly):

------------[ cut here ]------------
[   13.014782] WARNING: CPU: 3 PID: 1 at kernel/irq/manage.c:1990 request_threaded_irq+0x144/0x180
[   13.014784] Modules linked in:
[   13.014793] CPU: 3 PID: 1 Comm: swapper/0 Not tainted 5.4.26-rockchip64 #20.02.5
[   13.014795] Hardware name: Pine64 RockPro64 (DT)
[   13.014799] pstate: a0000005 (NzCv daif -PAN -UAO)
[   13.014804] pc : request_threaded_irq+0x144/0x180
[   13.014808] lr : request_threaded_irq+0x6c/0x180
[   13.014810] sp : ffff80001004b9b0
[   13.014813] x29: ffff80001004b9b0 x28: 0000000000000000
[   13.014817] x27: ffff0000ef78c0c0 x26: ffff8000111c8d98
[   13.014822] x25: 0000000000000000 x24: 0000000000000007
[   13.014826] x23: ffff0000f0914870 x22: ffff800010b2dce0
[   13.014830] x21: ffff0000f142a000 x20: 0000000000000000
[   13.014834] x19: ffff80001141bee0 x18: 0000000000000001
[   13.014838] x17: ffff800011188d00 x16: ffff800011188d08
[   13.014843] x15: ffffffffffffffff x14: ffff80001137b508
[   13.014847] x13: ffff00016f1e14b7 x12: ffff0000ef1e14c3
[   13.014851] x11: ffff0000f67ac268 x10: 0000000000000040
[   13.014855] x9 : ffff80001139f028 x8 : ffff80001139f020
[   13.014859] x7 : ffff0000f10002a8 x6 : 0000000000000000
[   13.014863] x5 : ffff0000f1000248 x4 : 0000000000000000
[   13.014867] x3 : 0000000000000000 x2 : 0000000000000000
[   13.014871] x1 : 0000000000000007 x0 : 0000000000031600
[   13.014875] Call trace:
[   13.014880]  request_threaded_irq+0x144/0x180
[   13.014887]  snd_mtpav_probe+0x15c/0x3d8
[   13.014893]  platform_drv_probe+0x50/0xa0
[   13.014899]  really_probe+0xd8/0x300
[   13.014902]  driver_probe_device+0x54/0xe8
[   13.014906]  __device_attach_driver+0x80/0xb8
[   13.014910]  bus_for_each_drv+0x78/0xc8
[   13.014915]  __device_attach+0xd4/0x130
[   13.014918]  device_initial_probe+0x10/0x18
[   13.014922]  bus_probe_device+0x90/0x98
[   13.014927]  device_add+0x3c4/0x5f0
[   13.014930]  platform_device_add+0x10c/0x230
[   13.014934]  platform_device_register_full+0xc8/0x140
[   13.014940]  alsa_card_mtpav_init+0x74/0xd0
[   13.014945]  do_one_initcall+0x74/0x1b0
[   13.014950]  kernel_init_freeable+0x194/0x22c
[   13.014957]  kernel_init+0x10/0xfc
[   13.014961]  ret_from_fork+0x10/0x18
[   13.014969] ---[ end trace 34ce35f0c45c0a90 ]---

 

Mathias

Posted

I have installed the latest stable kernel from Armbian (5.4.27) and it does exactly the same... I will try to power down the system, leave it off for a few seconds and then restart, just in case...

Posted

After a cold boot, I don't have the CPU errors anymore (on 5.4.27). I waited ~30s before restarting the system. I still get a kernel warning with a backtrace, but the kernel recovers (see http://ix.io/2fDO):

[   41.902116] ------------[ cut here ]------------
[   41.902135] WARNING: CPU: 4 PID: 1 at kernel/irq/manage.c:1990 request_threaded_irq+0x144/0x180
[   41.902138] Modules linked in:
[   41.902149] CPU: 4 PID: 1 Comm: swapper/0 Not tainted 5.4.27-rockchip64 #20.02.6
[   41.902153] Hardware name: Pine64 RockPro64 (DT)
[   41.902158] pstate: a0000005 (NzCv daif -PAN -UAO)
[   41.902165] pc : request_threaded_irq+0x144/0x180
[   41.902171] lr : request_threaded_irq+0x6c/0x180
[   41.902174] sp : ffff80001004b9b0
[   41.902178] x29: ffff80001004b9b0 x28: 0000000000000000
[   41.902185] x27: ffff0000ef2428c0 x26: ffff8000111c8d98
[   41.902190] x25: 0000000000000000 x24: 0000000000000007
[   41.902195] x23: ffff0000f0d77870 x22: ffff800010b2dd80
[   41.902201] x21: ffff0000f142a000 x20: 0000000000000000
[   41.902206] x19: ffff80001141bee0 x18: 0000000000000001
[   41.902211] x17: ffff0000f0d75a00 x16: ffff800010aa6aa0
[   41.902216] x15: ffffffffffffffff x14: ffff80001137b508
[   41.902222] x13: ffff00016f291b37 x12: ffff0000ef291b43
[   41.902227] x11: ffff0000f67c2268 x10: 0000000000000040
[   41.902232] x9 : ffff80001139f028 x8 : ffff80001139f020
[   41.902238] x7 : ffff0000f10002a8 x6 : 0000000000000000
[   41.902243] x5 : ffff0000f1000248 x4 : 0000000000000000
[   41.902248] x3 : 0000000000000000 x2 : 0000000000000000
[   41.902253] x1 : 0000000000000007 x0 : 0000000000031600
[   41.902258] Call trace:
[   41.902265]  request_threaded_irq+0x144/0x180
[   41.902274]  snd_mtpav_probe+0x15c/0x3d8
[   41.902281]  platform_drv_probe+0x50/0xa0
[   41.902288]  really_probe+0xd8/0x300
[   41.902293]  driver_probe_device+0x54/0xe8
[   41.902297]  __device_attach_driver+0x80/0xb8
[   41.902303]  bus_for_each_drv+0x78/0xc8
[   41.902309]  __device_attach+0xd4/0x130
[   41.902313]  device_initial_probe+0x10/0x18
[   41.902319]  bus_probe_device+0x90/0x98
[   41.902324]  device_add+0x3c4/0x5f0
[   41.902329]  platform_device_add+0x10c/0x230
[   41.902334]  platform_device_register_full+0xc8/0x140
[   41.902341]  alsa_card_mtpav_init+0x74/0xd0
[   41.902348]  do_one_initcall+0x74/0x1b0
[   41.902354]  kernel_init_freeable+0x194/0x22c
[   41.902361]  kernel_init+0x10/0xfc
[   41.902367]  ret_from_fork+0x10/0x18
[   41.902374] ---[ end trace f53d3c1ec0afdd56 ]---

 

Mathias

Posted

For me, it seems to be working perfectly fine. I've just transferred more than 250G of backups back on my sata drive, no issues.

Posted
On 3/30/2020 at 3:07 PM, Mathias said:

For me, it seems to be working perfectly fine. I've just transferred more than 250G of backups back on my sata drive, no issues.

Thanks.

Posted

From what I can read in the backtrace, the issue appears while invoking the 'probe' function of the snd-mtpav driver, so...

 

If you still have the same issue, could you try to blacklist snd-mtpav (add "blacklist snd-mtpav" to a .conf file under /etc/modprobe.d/, for example /etc/modprobe.d/blacklist.conf)? Of course, you might have no sound during the test.
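The reading of the backtrace described in this post can be sketched in a few lines: collect the function names listed after "Call trace:" and pick the innermost one whose name ends in `_probe`. The helper names `call_trace` and `first_probe` are hypothetical, and the frame pattern is assumed from the format of the traces posted in this thread.

```python
import re

# A frame looks like "[   13.014887]  snd_mtpav_probe+0x15c/0x3d8".
FRAME = re.compile(r"\]\s+([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")


def call_trace(log_text):
    """Return the function names listed after 'Call trace:', innermost first."""
    frames, in_trace = [], False
    for line in log_text.splitlines():
        if "Call trace:" in line:
            in_trace = True
            continue
        if in_trace:
            match = FRAME.search(line)
            if not match:
                break  # the first non-frame line ends the trace
            frames.append(match.group(1))
    return frames


def first_probe(frames):
    """Return the innermost frame whose name ends in '_probe', or None."""
    return next((name for name in frames if name.endswith("_probe")), None)


sample = """\
[   13.014804] pc : request_threaded_irq+0x144/0x180
[   13.014875] Call trace:
[   13.014880]  request_threaded_irq+0x144/0x180
[   13.014887]  snd_mtpav_probe+0x15c/0x3d8
[   13.014893]  platform_drv_probe+0x50/0xa0
[   13.014899]  really_probe+0xd8/0x300
[   13.014969] ---[ end trace 34ce35f0c45c0a90 ]---
"""

frames = call_trace(sample)
print(frames)
print(first_probe(frames))
```

On the excerpt above, the innermost probe frame is `snd_mtpav_probe`, which is what points at the snd-mtpav driver.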

Posted

I've tried to blacklist snd-mtpav, and although the module is no longer loaded (lsmod confirms it), I still get the warning at boot... It does not make sense to me...
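One possible explanation, offered as an assumption rather than a confirmed diagnosis: a blacklist entry only affects loadable modules, and the trace above shows "Modules linked in:" empty with `alsa_card_mtpav_init` called from `do_one_initcall` under PID 1, which suggests the driver is compiled into the kernel image. The sketch below checks a `modules.builtin` listing for a module name. The helper `is_builtin` and the sample listing are hypothetical; on a real system the listing would be read from `/lib/modules/<kernel release>/modules.builtin`.

```python
from pathlib import PurePosixPath


def normalize(name):
    """Module names treat '-' and '_' as equivalent."""
    return name.replace("-", "_")


def is_builtin(module, builtin_listing):
    """Return True if `module` appears in the text of a modules.builtin file."""
    wanted = normalize(module)
    for line in builtin_listing.splitlines():
        entry = line.strip()
        if not entry:
            continue
        # Entries are paths such as "kernel/sound/drivers/snd-mtpav.ko".
        stem = PurePosixPath(entry).name
        if stem.endswith(".ko"):
            stem = stem[: -len(".ko")]
        if normalize(stem) == wanted:
            return True
    return False


# Illustrative listing only; it is NOT taken from the kernel in this thread.
sample_listing = """\
kernel/sound/core/snd.ko
kernel/sound/drivers/snd-mtpav.ko
"""

print(is_builtin("snd-mtpav", sample_listing))
print(is_builtin("snd_dummy", sample_listing))
```

If the driver does turn out to be built in, the blacklist file cannot prevent its probe function from running at boot, which would be consistent with the warning persisting.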

Posted

Yeah, the cause of the CPUs not coming online and the cause of the WARNING message are completely different.

 

It seems that CPUs 4 and 5 belong to a different GIC PPI partition than CPUs 0 to 3:

 

[ 0.000000] GICv3: GIC: PPI partition interrupt-partition-0[0] { /cpus/cpu@0[0] /cpus/cpu@1[1] /cpus/cpu@2[2] /cpus/cpu@3[3] }

 

[ 0.000000] GICv3: GIC: PPI partition interrupt-partition-1[1] { /cpus/cpu@100[4] /cpus/cpu@101[5] }

 

So I *guess* that something is happening during the initialization of interrupt-partition-1[1], bringing both CPUs down the drain.
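The grouping in the two GICv3 lines above can be extracted mechanically, which helps when comparing logs from different kernels. This is a sketch under two assumptions: the line format is exactly as quoted in this post, and the number in brackets after each `/cpus/cpu@...` node is the logical CPU number, as read above. The function name `ppi_partitions` is made up for illustration.

```python
import re

# Matches "PPI partition interrupt-partition-0[0] { ... }".
PARTITION = re.compile(r"PPI partition (\S+?)\[(\d+)\] \{([^}]*)\}")
# Matches one member such as "/cpus/cpu@100[4]".
CPU_NODE = re.compile(r"(/cpus/cpu@[0-9a-f]+)\[(\d+)\]")


def ppi_partitions(dmesg_text):
    """Map each partition name to the list of CPU numbers it contains."""
    result = {}
    for name, _index, body in PARTITION.findall(dmesg_text):
        result[name] = [int(cpu) for _node, cpu in CPU_NODE.findall(body)]
    return result


sample = """\
[ 0.000000] GICv3: GIC: PPI partition interrupt-partition-0[0] { /cpus/cpu@0[0] /cpus/cpu@1[1] /cpus/cpu@2[2] /cpus/cpu@3[3] }
[ 0.000000] GICv3: GIC: PPI partition interrupt-partition-1[1] { /cpus/cpu@100[4] /cpus/cpu@101[5] }
"""

partitions = ppi_partitions(sample)
for name, cpus in partitions.items():
    print(name, cpus)
```

With the two lines quoted here, CPUs 4 and 5 (the ones that failed to come online) are the only members of interrupt-partition-1.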

 

However, your first logs seem to suggest that the 5.5 kernel was able to bring the CPUs online... Hmm...

 

Does the problem happen on every boot? Only after a cold boot? Only after a reboot?

If it only happens from time to time, that might be a timing issue...

 

But I don't really see any patch to the GICv3 driver that could have fixed this issue directly:

https://github.com/torvalds/linux/commits/master/drivers/irqchip/irq-gic-v3.c

 

As far as ARM64-specific patches go, they only fixed a few issues that mattered to Cavium boards.
