Aurimas Posted January 16, 2019

After upgrading from 4.14.70-sunxi64 to 4.19.13-sunxi64, my Pine64+ board started to misbehave after running for some time. dmesg shows:

Quote
saus. 14 03:26:23 pine64 kernel: rcu: INFO: rcu_sched self-detected stall on CPU
saus. 14 03:26:23 pine64 kernel: rcu: 2-...0: (286 GPs behind) idle=ad6/0/0x1 softirq=380346/380347 fqs=2
saus. 14 03:26:23 pine64 kernel: rcu: (t=60094 jiffies g=291549 q=3)
saus. 14 03:26:23 pine64 kernel: Task dump for CPU 2:
saus. 14 03:26:23 pine64 kernel: swapper/2 R running task 0 0 1 0x0000002a
saus. 14 03:26:23 pine64 kernel: Call trace:
saus. 14 03:26:23 pine64 kernel: dump_backtrace+0x0/0x1c0
saus. 14 03:26:23 pine64 kernel: show_stack+0x14/0x20
saus. 14 03:26:23 pine64 kernel: sched_show_task+0x160/0x198
saus. 14 03:26:23 pine64 kernel: dump_cpu_task+0x40/0x50
saus. 14 03:26:23 pine64 kernel: rcu_dump_cpu_stacks+0xc0/0x100
saus. 14 03:26:23 pine64 kernel: rcu_check_callbacks+0x594/0x780
saus. 14 03:26:23 pine64 kernel: update_process_times+0x2c/0x58
saus. 14 03:26:23 pine64 kernel: tick_sched_handle.isra.5+0x30/0x48
saus. 14 03:26:23 pine64 kernel: tick_sched_timer+0x48/0x98
saus. 14 03:26:23 pine64 kernel: __hrtimer_run_queues+0xe4/0x1f8
saus. 14 03:26:23 pine64 kernel: hrtimer_interrupt+0xf4/0x2b0
saus. 14 03:26:23 pine64 kernel: arch_timer_handler_phys+0x28/0x40
saus. 14 03:26:23 pine64 kernel: handle_percpu_devid_irq+0x80/0x138
saus. 14 03:26:23 pine64 kernel: generic_handle_irq+0x24/0x38
saus. 14 03:26:23 pine64 kernel: __handle_domain_irq+0x5c/0xb0
saus. 14 03:26:23 pine64 kernel: gic_handle_irq+0x58/0xa8
saus. 14 03:26:23 pine64 kernel: el1_irq+0xb0/0x140
saus. 14 03:26:23 pine64 kernel: arch_cpu_idle+0x10/0x18
saus. 14 03:26:23 pine64 kernel: do_idle+0x1d4/0x298
saus. 14 03:26:23 pine64 kernel: cpu_startup_entry+0x24/0x28
saus. 14 03:26:23 pine64 kernel: secondary_start_kernel+0x18c/0x1c8
saus. 14 03:26:23 pine64 kernel: Task dump for CPU 3:
saus. 14 03:26:23 pine64 kernel: tor R running task 0 2964 1 0x00000802
saus. 14 03:26:23 pine64 kernel: Call trace:
saus. 14 03:26:23 pine64 kernel: __switch_to+0x94/0xd8
saus. 14 03:26:23 pine64 kernel: do_mem_abort+0x54/0x100
saus. 14 03:26:23 pine64 kernel: el0_da+0x20/0x24

Armbian-release:
BOARD=pine64
BOARD_NAME="Pine64"
BOARDFAMILY=sun50iw1
VERSION=5.70
LINUXFAMILY=sunxi64
BRANCH=next
ARCH=arm64
IMAGE_TYPE=stable
BOARD_TYPE=conf
INITRD_ARCH=arm64
KERNEL_IMAGE_TYPE=Image

With the Armbian Pine64 3.10 kernel, the board ran without hanging for 2+ years. I also tried the 4.18.xx-sunxi64 kernel from the dev branch, with the same problem.
Aurimas (Author) Posted January 16, 2019

The armbianmonitor -u output is here: http://ix.io/1ytO
Aurimas (Author) Posted January 30, 2019

This morning I upgraded to 4.19.17-sunxi64 (5.73); it stalled after ~7.5 hours of runtime:

Quote
[26338.303225] mmc0: Card stuck in programming state! mmc_do_erase
[26375.451469] mmc0: Card stuck in programming state! mmc_do_erase
[26380.881967] mmc0: Card stuck in programming state! mmc_do_erase
[26404.645290] mmc0: Card stuck in programming state! mmc_do_erase
[26409.085215] mmc0: Card stuck in programming state! mmc_do_erase
[26412.041056] mmc0: Card stuck in programming state! mmc_do_erase
[26412.047006] print_req_error: I/O error, dev mmcblk0, sector 2830472
[26412.055674] sunxi-mmc 1c0f000.mmc: data error, sending stop command
[26412.062057] sunxi-mmc 1c0f000.mmc: send stop command failed
[27219.332955] rcu: INFO: rcu_sched self-detected stall on CPU
[27219.338555] rcu: 1-...0: (1 GPs behind) idle=bde/1/0x4000000000000002 softirq=1744556/1744557 fqs=2577
[27219.348024] rcu: (t=5250 jiffies g=1620509 q=13602)
[27219.353075] Task dump for CPU 1:
[27219.353078] systemd-journal R running task 0 297 1 0x00000802
[27219.353087] Call trace:
[27219.353103] dump_backtrace+0x0/0x1c0
[27219.353109] show_stack+0x14/0x20
[27219.353115] sched_show_task+0x160/0x198
[27219.353121] dump_cpu_task+0x40/0x50
[27219.353128] rcu_dump_cpu_stacks+0xc0/0x100
[27219.353132] rcu_check_callbacks+0x594/0x780
[27219.353138] update_process_times+0x2c/0x58
[27219.353144] tick_sched_handle.isra.5+0x30/0x48
[27219.353149] tick_sched_timer+0x48/0x98
[27219.353154] __hrtimer_run_queues+0xe4/0x1f8
[27219.353158] hrtimer_interrupt+0xf4/0x2b0
[27219.353166] arch_timer_handler_phys+0x28/0x40
[27219.353172] handle_percpu_devid_irq+0x80/0x138
[27219.353178] generic_handle_irq+0x24/0x38
[27219.353183] __handle_domain_irq+0x5c/0xb0
[27219.353187] gic_handle_irq+0x58/0xa8
[27219.353191] el1_irq+0xb0/0x140
[27219.353198] __seccomp_filter+0x20/0x428
[27219.353202] __secure_computing+0x38/0xc8
[27219.353207] syscall_trace_enter+0x98/0x110
[27219.353215] el0_svc_common+0xc0/0x100
[27219.353219] el0_svc_handler+0x24/0x80
[27219.353223] el0_svc+0x8/0xc
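To compare how long each board ran before stalling, the bracketed dmesg timestamps can be converted to hours. This is only a rough sketch: it assumes the bracketed number is seconds since boot, and the function name stall_uptimes_hours is made up for illustration.

```python
import re

# Matches a dmesg line such as:
# [27219.332955] rcu: INFO: rcu_sched self-detected stall on CPU
# Assumption: the bracketed value is seconds since boot.
STALL = re.compile(
    r"^\[\s*(\d+\.\d+)\]\s+rcu: INFO: rcu_sched self-detected stall"
)


def stall_uptimes_hours(dmesg_text):
    """Return the uptime in hours of every rcu_sched stall report found."""
    hours = []
    for line in dmesg_text.splitlines():
        match = STALL.match(line.strip())
        if match:
            hours.append(float(match.group(1)) / 3600.0)
    return hours


if __name__ == "__main__":
    sample = (
        "[26412.062057] sunxi-mmc 1c0f000.mmc: send stop command failed\n"
        "[27219.332955] rcu: INFO: rcu_sched self-detected stall on CPU\n"
    )
    for h in stall_uptimes_hours(sample):
        print(f"stall after {h:.2f} hours of uptime")
```

Feeding it the output of dmesg from a stalled board gives one value per stall report, which lines up with the "~7.5 hours" figure quoted in the post.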
Gene Liu Posted February 19, 2019

Same here. Any suggestions or workarounds would be appreciated.
Magnets Posted May 4, 2019

I have a similar issue on an Orange Pi PC2 with 4.19.25-sunxi64 and a DVB-T2 USB dongle. The problem is solved by locking the frequency scaling. E.g. I have:

# WARNING: this file will be replaced on board support package (linux-root-...) upgrade
ENABLE=true
MIN_SPEED=120000
MAX_SPEED=1400000
#MAX_SPEED=500000
GOVERNOR=conservative

If MIN_SPEED and MAX_SPEED are the same, you don't get any problems. The performance governor works OK (it locks to the max frequency). ondemand or interactive don't fix it unless you lock the frequency. I can lock the frequency at 420 MHz and it works fine, so it's not related to the actual frequency. If I set MIN=800 and MAX=1300, it still gives problems.

My error:

[ 6382.756348] rcu: INFO: rcu_sched self-detected stall on CPU
[ 6382.756892] rcu: 1-....: (5601 ticks this GP) idle=d8a/1/0x4000000000000004 softirq=227944/227947 fqs=2034
[ 6382.756957] rcu: (t=5250 jiffies g=486985 q=1835)
[ 6382.757208] Task dump for CPU 1:
[ 6382.757310] kworker/1:2 R running task 0 4002 2 0x0000002a
[ 6382.757904] Workqueue: events dbs_work_handler
[ 6382.758057] Call trace:
[ 6382.758316] dump_backtrace+0x0/0x1c0
[ 6382.758526] show_stack+0x14/0x20
[ 6382.758703] sched_show_task+0x160/0x198
[ 6382.758900] dump_cpu_task+0x40/0x50
[ 6382.759103] rcu_dump_cpu_stacks+0xc0/0x100
[ 6382.759281] rcu_check_callbacks+0x594/0x780
[ 6382.759455] update_process_times+0x2c/0x58
[ 6382.759664] tick_sched_handle.isra.5+0x30/0x48
[ 6382.759844] tick_sched_timer+0x48/0x98
[ 6382.760013] __hrtimer_run_queues+0xe4/0x1f8
[ 6382.760172] hrtimer_interrupt+0xf4/0x2b0
[ 6382.760378] arch_timer_handler_phys+0x28/0x40
[ 6382.760563] handle_percpu_devid_irq+0x80/0x138
[ 6382.760765] generic_handle_irq+0x24/0x38
[ 6382.760960] __handle_domain_irq+0x5c/0xb0
[ 6382.761107] gic_handle_irq+0x58/0xa8
[ 6382.761249] el1_irq+0xb0/0x140
[ 6382.761466] __usb_hcd_giveback_urb+0x98/0x148
[ 6382.761649] usb_giveback_urb_bh+0xdc/0x170
[ 6382.761821] tasklet_action_common.isra.3+0x7c/0x168
[ 6382.761961] tasklet_action+0x24/0x30
[ 6382.762107] __do_softirq+0x10c/0x200
[ 6382.762238] irq_exit+0xac/0xc0
[ 6382.762431] __handle_domain_irq+0x60/0xb0
[ 6382.762567] gic_handle_irq+0x58/0xa8
[ 6382.762706] el1_irq+0xb0/0x140
[ 6382.762915] clk_propagate_rate_change+0x30/0xe0
[ 6382.763088] clk_propagate_rate_change+0x90/0xe0
[ 6382.763262] clk_propagate_rate_change+0x90/0xe0
[ 6382.763465] clk_core_set_rate_nolock+0x1c0/0x1f8
[ 6382.763649] clk_set_rate+0x38/0xa8
[ 6382.763807] dev_pm_opp_set_rate+0x1f0/0x540
[ 6382.764083] set_target+0x40/0x70 [cpufreq_dt]
[ 6382.764276] __cpufreq_driver_target+0x184/0x5b0
[ 6382.764463] od_dbs_update+0x140/0x1a0
[ 6382.764660] dbs_work_handler+0x3c/0x70
[ 6382.764858] process_one_work+0x1e4/0x360
[ 6382.765042] worker_thread+0x48/0x4b0
[ 6382.765195] kthread+0x128/0x130
[ 6382.765367] ret_from_fork+0x10/0x1c

Some others with the issue: https://github.com/armbian/build/issues/1000
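The workaround described in this post (making MIN_SPEED and MAX_SPEED equal) could be sketched roughly as below. This is not an Armbian tool: the helper names lock_frequency and is_locked are hypothetical, the config is handled as plain text, and the values are assumed to be in kHz (so 420 MHz would be 420000).

```python
import re

# Active (uncommented) MIN_SPEED / MAX_SPEED assignments only.
ASSIGNMENT = re.compile(r"^(MIN_SPEED|MAX_SPEED)=(\d+)[ \t]*$")


def lock_frequency(config_text, khz):
    """Return config_text with active MIN_SPEED and MAX_SPEED both set to khz.

    Commented-out lines such as '#MAX_SPEED=500000' are left untouched.
    """
    out = []
    for line in config_text.splitlines():
        match = ASSIGNMENT.match(line)
        out.append(f"{match.group(1)}={khz}" if match else line)
    return "\n".join(out) + "\n"


def is_locked(config_text):
    """True when both active values are present and equal."""
    values = {}
    for line in config_text.splitlines():
        match = ASSIGNMENT.match(line)
        if match:
            values[match.group(1)] = int(match.group(2))
    return (
        "MIN_SPEED" in values
        and "MAX_SPEED" in values
        and values["MIN_SPEED"] == values["MAX_SPEED"]
    )


if __name__ == "__main__":
    cfg = (
        "ENABLE=true\n"
        "MIN_SPEED=120000\n"
        "MAX_SPEED=1400000\n"
        "#MAX_SPEED=500000\n"
        "GOVERNOR=conservative\n"
    )
    # Assumption: 420000 kHz corresponds to the 420 MHz mentioned in the post.
    print(lock_frequency(cfg, 420000))
```

The sketch only transforms text; writing the result back to the real config file and restarting the frequency-scaling service is left out on purpose, since the warning comment in the file says it is replaced on package upgrade.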
Aurimas (Author) Posted July 11, 2019

4.19.57-sunxi64 (5.90): still hangs with the same rcu_sched self-detected stall on CPU.