RSS Bot
Posted February 6

Description

The stmmac ethernet driver is used on all Rockchip SoCs. Its statistics system was revised in kernel 6.6, but something is not working correctly on 32-bit platforms: lock contention happens very often when a userland process tries to read the statistics of the ethernet device. One of the most common victims is the vnstat daemon, which often gets stuck hogging a CPU core and sending the load average skyrocketing. This also makes the whole kernel angry: RCU complains, processes hang, and so on.

The patch provides a horrible workaround (removing the mutual exclusion in the reader) to fix the problem temporarily. This may break the ethernet statistics, which can then report freaky numbers, but at least it fixes the problem for the time being.

This is the kernel dump from the hung task:

[ 696.614056] Sending NMI from CPU 0 to CPUs 2:
[ 696.614071] NMI backtrace for cpu 2
[ 696.614079] CPU: 2 PID: 24 Comm: migration/2 Tainted: G C 6.6.12-current-rockchip #25
[ 696.614088] Hardware name: Generic DT based system
[ 696.614092] Stopper: multi_cpu_stop+0x0/0x128 <- stop_machine_cpuslocked+0x118/0x180
[ 696.614113] PC is at rcu_momentary_dyntick_idle+0x44/0xc4
[ 696.614122] LR is at multi_cpu_stop+0xe4/0x128
[ 696.614130] pc : [<b0197d74>] lr : [<b01e03b4>] psr: 60070113
[ 696.614135] sp : f0889ed0 ip : 00000000 fp : f0889edc
[ 696.614140] r10: 00000001 r9 : 00000000 r8 : 00000000
[ 696.614143] r7 : b1505058 r6 : f093dddc r5 : 00000001 r4 : f093ddf0
[ 696.614148] r3 : ee6b7598 r2 : 2aaeeb3c r1 : 00000000 r0 : 3d234000
[ 696.614153] Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
[ 696.614160] Control: 10c5387d Table: 6f38806a DAC: 00000051
[ 696.614163] Backtrace:
[ 696.614167] rcu_momentary_dyntick_idle from multi_cpu_stop+0xe4/0x128
[ 696.614181] multi_cpu_stop from cpu_stopper_thread+0x78/0x148
[ 696.614200] r10:f093ddf4 r9:ee6b606c r8:ee6b6074 r7:f093dddc r6:b01e02d0 r5:b9322b00
[ 696.614204] r4:ee6b6068
[ 696.614207] cpu_stopper_thread from smpboot_thread_fn+0xc0/0x15c
[ 696.614223] r10:00000000 r9:00000002 r8:b15dd06c r7:00000001 r6:b9322b00 r5:b927dcc0
[ 696.614228] r4:00000000
[ 696.614231] smpboot_thread_fn from kthread+0xe8/0x104
[ 696.614246] r10:00000000 r9:f0821d8c r8:b927df00 r7:b927dcc0 r6:b014e190 r5:b9322b00
[ 696.614250] r4:b927de00 r3:00000000
[ 696.614254] kthread from ret_from_fork+0x14/0x28
[ 696.614263] Exception stack(0xf0889fb0 to 0xf0889ff8)
[ 696.614269] 9fa0: 00000000 00000000 00000000 00000000
[ 696.614277] 9fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 696.614283] 9fe0: 00000000 00000000 00000000 00000000 00000013 00000000
[ 696.614292] r9:00000000 r8:00000000 r7:00000000 r6:00000000 r5:b0149778 r4:b927de00
[ 696.615072] CPU: 0 PID: 1258 Comm: vnstatd Tainted: G C 6.6.12-current-rockchip #25
[ 696.615084] Hardware name: Generic DT based system
[ 696.615091] PC is at stmmac_get_stats64+0x38/0x1a0
[ 696.615104] LR is at 0x1
[ 696.615115] pc : [<b098c188>] lr : [<00000001>] psr: 20000013
[ 696.615123] sp : f2d6dbf4 ip : f2d6dc20 fp : f2d6dc1c
[ 696.615130] r10: 01000001 r9 : b96a0000 r8 : b633e4c8
[ 696.615138] r7 : 00000001 r6 : 00000000 r5 : b098c150 r4 : b96a3000
[ 696.615146] r3 : 000370b7 r2 : b96a2e48 r1 : f2d6dcc8 r0 : b96a0000
[ 696.615155] Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
[ 696.615164] Control: 10c5387d Table: 6f08006a DAC: 00000051
[ 696.615170] Backtrace:
[ 696.615180] stmmac_get_stats64 from dev_get_stats+0x44/0x168
[ 696.615203] r10:01000001 r9:b96a0000 r8:b633e4c8 r7:b1087af8 r6:b96a0000 r5:b098c150
[ 696.615210] r4:f2d6dcc8
[ 696.615216] dev_get_stats from dev_seq_printf_stats+0x34/0x178
[ 696.615241] r9:b96a0000 r8:b633e4c8 r7:f2d6de20 r6:00000000 r5:b633e4b0 r4:b96a0000
[ 696.615247] dev_seq_printf_stats from dev_seq_show+0x18/0x34
[ 696.615264] r5:00000000 r4:b633e4b0
[ 696.615271] dev_seq_show from seq_read_iter+0x3cc/0x50c
[ 696.615288] seq_read_iter from seq_read+0x8c/0xbc
[ 696.615310] r10:f2d6df68 r9:00000400 r8:00000000 r7:00004004 r6:00000400 r5:b2f38600
[ 696.615319] r4:f2d6df68
[ 696.615326] seq_read from proc_reg_read+0xb4/0xd8
[ 696.615347] r8:01c70fa0 r7:b2f38600 r6:00000000 r5:b0343274 r4:b9448c00
[ 696.615355] proc_reg_read from vfs_read+0xb8/0x2dc
[ 696.615374] r10:b0395cb4 r9:b56ba040 r8:01c70fa0 r7:f2d6df68 r6:b2f38600 r5:00000400
[ 696.615381] r4:00000000 r3:f2d6df68
[ 696.615388] vfs_read from ksys_read+0x68/0xec
[ 696.615406] r10:00000003 r9:b56ba040 r8:b01002c8 r7:00000000 r6:00000000 r5:b2f38600
[ 696.615415] r4:b2f38600
[ 696.615420] ksys_read from sys_read+0x10/0x14
[ 696.615436] r7:00000003 r6:a6f425a0 r5:000005e8 r4:01c571a8
[ 696.615443] sys_read from __sys_trace_return+0x0/0x10
[ 696.615457] Exception stack(0xf2d6dfa8 to 0xf2d6dff0)
[ 696.615468] dfa0: 01c571a8 000005e8 00000004 01c70fa0 00000400 00000000
[ 696.615480] dfc0: 01c571a8 000005e8 a6f425a0 00000003 0000000a aed87604 00000000 00000000
[ 696.615489] dfe0: 00000003 aed874f0 a6d339a7 a6cadb06

Important notes: this issue may affect other 32-bit platforms as well! The stmmac driver is used in most SoCs supported by Armbian, with the curious exception of the Allwinner H3. Meson8/Meson8b (S802/S805) also use it and may be affected by the same problem. 64-bit platforms are not affected, since the locking mechanism is a no-op on 64-bit archs.

Jira reference number AR-2048 - Celebrate 2^11th ticket :smile_cat:

How Has This Been Tested?
[x] Kernel 6.6 compiled and tested on live system
[x] Kernel 6.7 compiled

Checklist:
[x] My code follows the style guidelines of this project
[x] I have performed a self-review of my own code
[x] I have commented my code, particularly in hard-to-understand areas
[x] I have made corresponding changes to the documentation
[x] My changes generate no new warnings
[x] Any dependent changes have been merged and published in downstream modules
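
To illustrate why a statistics reader can spin forever on 32-bit while 64-bit is unaffected, here is a minimal userspace sketch of a sequence-counter scheme of the kind the kernel's u64_stats_sync helpers use on 32-bit builds. It is not the stmmac code: every name below (stats_sync, writer_add, reader_fetch, reader_fetch_unlocked) is hypothetical, the model is single-threaded, and it leaves out the memory barriers a real implementation needs. It assumes the driver's reader follows the usual "retry until the sequence is stable" pattern, which is what the description of the workaround suggests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: a 64-bit counter kept as two 32-bit halves,
 * guarded by a sequence number that is odd while a write is in flight. */
struct stats_sync {
    unsigned int seq;
    uint32_t lo, hi;
};

static void writer_begin(struct stats_sync *s)
{
    s->seq++;                      /* sequence becomes odd */
}

static void writer_end(struct stats_sync *s)
{
    s->seq++;                      /* sequence becomes even again */
    assert((s->seq & 1u) == 0);    /* begin/end must be paired */
}

static void writer_add(struct stats_sync *s, uint64_t n)
{
    uint64_t v = (((uint64_t)s->hi << 32) | s->lo) + n;

    writer_begin(s);
    s->lo = (uint32_t)v;
    s->hi = (uint32_t)(v >> 32);
    writer_end(s);
}

/* Retrying reader: only accepts a value read under a stable, even
 * sequence. The real pattern loops without a bound; max_tries exists
 * only so this sketch can report a stuck sequence instead of hanging. */
static bool reader_fetch(const struct stats_sync *s, uint64_t *out,
                         unsigned int max_tries)
{
    for (unsigned int i = 0; i < max_tries; i++) {
        unsigned int start = s->seq;

        if (start & 1u)
            continue;              /* write in progress, try again */

        uint64_t v = ((uint64_t)s->hi << 32) | s->lo;

        if (s->seq == start) {     /* nothing changed while reading */
            *out = v;
            return true;
        }
    }
    return false;                  /* sequence never settled */
}

/* Workaround-style reader: no sequence check at all. It never spins,
 * but it can return a torn value if a write is half done. */
static uint64_t reader_fetch_unlocked(const struct stats_sync *s)
{
    return ((uint64_t)s->hi << 32) | s->lo;
}
```

In this model, if the sequence is left odd (for example by an unpaired writer_begin), reader_fetch never succeeds, which mirrors a reader stuck in stmmac_get_stats64. Dropping the check, as reader_fetch_unlocked does, removes the spin at the cost of occasionally mixing an old half with a new half; that is the kind of "freaky numbers" trade-off the patch accepts. On 64-bit, a counter can be read in a single load, so no sequence is needed in the first place.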