
Pine64+ on 4.19.13-sunxi64: rcu_sched self-detected stall on CPU


Posted

After upgrading from 4.14.70-sunxi64 to 4.19.13-sunxi64, my Pine64+ board started to misbehave after running for some time.

Dmesg shows:

Quote

saus. 14 03:26:23 pine64 kernel: rcu: INFO: rcu_sched self-detected stall on CPU
saus. 14 03:26:23 pine64 kernel: rcu:         2-...0: (286 GPs behind) idle=ad6/0/0x1 softirq=380346/380347 fqs=2
saus. 14 03:26:23 pine64 kernel: rcu:          (t=60094 jiffies g=291549 q=3)
saus. 14 03:26:23 pine64 kernel: Task dump for CPU 2:
saus. 14 03:26:23 pine64 kernel: swapper/2       R  running task        0     0      1 0x0000002a
saus. 14 03:26:23 pine64 kernel: Call trace:
saus. 14 03:26:23 pine64 kernel:  dump_backtrace+0x0/0x1c0
saus. 14 03:26:23 pine64 kernel:  show_stack+0x14/0x20
saus. 14 03:26:23 pine64 kernel:  sched_show_task+0x160/0x198
saus. 14 03:26:23 pine64 kernel:  dump_cpu_task+0x40/0x50
saus. 14 03:26:23 pine64 kernel:  rcu_dump_cpu_stacks+0xc0/0x100
saus. 14 03:26:23 pine64 kernel:  rcu_check_callbacks+0x594/0x780
saus. 14 03:26:23 pine64 kernel:  update_process_times+0x2c/0x58
saus. 14 03:26:23 pine64 kernel:  tick_sched_handle.isra.5+0x30/0x48
saus. 14 03:26:23 pine64 kernel:  tick_sched_timer+0x48/0x98
saus. 14 03:26:23 pine64 kernel:  __hrtimer_run_queues+0xe4/0x1f8
saus. 14 03:26:23 pine64 kernel:  hrtimer_interrupt+0xf4/0x2b0
saus. 14 03:26:23 pine64 kernel:  arch_timer_handler_phys+0x28/0x40
saus. 14 03:26:23 pine64 kernel:  handle_percpu_devid_irq+0x80/0x138
saus. 14 03:26:23 pine64 kernel:  generic_handle_irq+0x24/0x38
saus. 14 03:26:23 pine64 kernel:  __handle_domain_irq+0x5c/0xb0
saus. 14 03:26:23 pine64 kernel:  gic_handle_irq+0x58/0xa8
saus. 14 03:26:23 pine64 kernel:  el1_irq+0xb0/0x140
saus. 14 03:26:23 pine64 kernel:  arch_cpu_idle+0x10/0x18
saus. 14 03:26:23 pine64 kernel:  do_idle+0x1d4/0x298
saus. 14 03:26:23 pine64 kernel:  cpu_startup_entry+0x24/0x28
saus. 14 03:26:23 pine64 kernel:  secondary_start_kernel+0x18c/0x1c8
saus. 14 03:26:23 pine64 kernel: Task dump for CPU 3:
saus. 14 03:26:23 pine64 kernel: tor             R  running task        0  2964      1 0x00000802
saus. 14 03:26:23 pine64 kernel: Call trace:
saus. 14 03:26:23 pine64 kernel:  __switch_to+0x94/0xd8
saus. 14 03:26:23 pine64 kernel:  do_mem_abort+0x54/0x100
saus. 14 03:26:23 pine64 kernel:  el0_da+0x20/0x24

 

Armbian-release:

BOARD=pine64
BOARD_NAME="Pine64"
BOARDFAMILY=sun50iw1
VERSION=5.70
LINUXFAMILY=sunxi64
BRANCH=next
ARCH=arm64
IMAGE_TYPE=stable
BOARD_TYPE=conf
INITRD_ARCH=arm64
KERNEL_IMAGE_TYPE=Image

 

With the Armbian Pine64 3.10 kernel, the board ran without hanging for more than two years.

I also tried the 4.18.xx-sunxi64 kernel from the dev branch, with the same result.

Posted

This morning I upgraded to 4.19.17-sunxi64 (5.73); it stalled after about 7.5 hours of runtime:

 

Quote

[26338.303225] mmc0: Card stuck in programming state! mmc_do_erase
[26375.451469] mmc0: Card stuck in programming state! mmc_do_erase                      
[26380.881967] mmc0: Card stuck in programming state! mmc_do_erase                        
[26404.645290] mmc0: Card stuck in programming state! mmc_do_erase
[26409.085215] mmc0: Card stuck in programming state! mmc_do_erase
[26412.041056] mmc0: Card stuck in programming state! mmc_do_erase
[26412.047006] print_req_error: I/O error, dev mmcblk0, sector 2830472
[26412.055674] sunxi-mmc 1c0f000.mmc: data error, sending stop command
[26412.062057] sunxi-mmc 1c0f000.mmc: send stop command failed         
[27219.332955] rcu: INFO: rcu_sched self-detected stall on CPU
[27219.338555] rcu:     1-...0: (1 GPs behind) idle=bde/1/0x4000000000000002 softirq=1744556/1744557 fqs=2577
[27219.348024] rcu:      (t=5250 jiffies g=1620509 q=13602)          
[27219.353075] Task dump for CPU 1:                            
[27219.353078] systemd-journal R  running task        0   297      1 0x00000802
[27219.353087] Call trace:                    
[27219.353103]  dump_backtrace+0x0/0x1c0                 
[27219.353109]  show_stack+0x14/0x20        
[27219.353115]  sched_show_task+0x160/0x198   
[27219.353121]  dump_cpu_task+0x40/0x50                      
[27219.353128]  rcu_dump_cpu_stacks+0xc0/0x100
[27219.353132]  rcu_check_callbacks+0x594/0x780
[27219.353138]  update_process_times+0x2c/0x58
[27219.353144]  tick_sched_handle.isra.5+0x30/0x48
[27219.353149]  tick_sched_timer+0x48/0x98
[27219.353154]  __hrtimer_run_queues+0xe4/0x1f8
[27219.353158]  hrtimer_interrupt+0xf4/0x2b0
[27219.353166]  arch_timer_handler_phys+0x28/0x40
[27219.353172]  handle_percpu_devid_irq+0x80/0x138
[27219.353178]  generic_handle_irq+0x24/0x38
[27219.353183]  __handle_domain_irq+0x5c/0xb0
[27219.353187]  gic_handle_irq+0x58/0xa8
[27219.353191]  el1_irq+0xb0/0x140
[27219.353198]  __seccomp_filter+0x20/0x428
[27219.353202]  __secure_computing+0x38/0xc8
[27219.353207]  syscall_trace_enter+0x98/0x110
[27219.353215]  el0_svc_common+0xc0/0x100
[27219.353219]  el0_svc_handler+0x24/0x80
[27219.353223]  el0_svc+0x8/0xc

 

Posted

I have a similar issue on an Orange Pi PC2 with 4.19.25-sunxi64 and a DVB-T2 USB dongle. Locking the CPU frequency (so that no frequency scaling happens) solves the problem.

For example, my configuration:

 

# WARNING: this file will be replaced on board support package (linux-root-...) upgrade
ENABLE=true
MIN_SPEED=120000
MAX_SPEED=1400000
#MAX_SPEED=500000
GOVERNOR=conservative

If MIN_SPEED and MAX_SPEED are the same, the problem does not occur.

The performance governor works fine (it locks to the maximum frequency).

ondemand or interactive do not fix it unless the frequency is locked.

I can lock the frequency at 420 MHz and it works fine, so the issue is not related to the actual frequency value.

If I set MIN=800 and MAX=1300, it still causes problems.
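
To illustrate the workaround described above, here is a minimal sketch that pins MIN_SPEED and MAX_SPEED to one value in a cpufrequtils-style config. Assumptions on my side: the helper name `lock_frequency` is made up, 816000 kHz is only a placeholder value, and the file is probably /etc/default/cpufrequtils on Armbian (the thread does not name it). Pick a value your board actually supports.

```python
import re

def lock_frequency(config_text: str, freq_khz: int) -> str:
    """Return the config with MIN_SPEED and MAX_SPEED both set to freq_khz.

    Hypothetical helper for illustration; it only rewrites text and does
    not touch any file or sysfs entry.
    """
    out = []
    for line in config_text.splitlines():
        # Only active assignments are changed; commented-out lines are kept.
        m = re.match(r"^(MIN_SPEED|MAX_SPEED)=", line)
        if m:
            out.append(f"{m.group(1)}={freq_khz}")
        else:
            out.append(line)
    return "\n".join(out) + "\n"

# Config from the post above; 816000 is a placeholder frequency in kHz.
sample = """ENABLE=true
MIN_SPEED=120000
MAX_SPEED=1400000
#MAX_SPEED=500000
GOVERNOR=conservative
"""
locked = lock_frequency(sample, 816000)
print(locked)
```

The governor line is left alone on purpose: with both limits equal, the governor has nothing left to scale, which matches the observation that any governor works once the frequency is locked.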

 

My error:

[ 6382.756348] rcu: INFO: rcu_sched self-detected stall on CPU
[ 6382.756892] rcu:     1-....: (5601 ticks this GP) idle=d8a/1/0x4000000000000004 softirq=227944/227947 fqs=2034
[ 6382.756957] rcu:      (t=5250 jiffies g=486985 q=1835)
[ 6382.757208] Task dump for CPU 1:
[ 6382.757310] kworker/1:2     R  running task        0  4002      2 0x0000002a
[ 6382.757904] Workqueue: events dbs_work_handler
[ 6382.758057] Call trace:
[ 6382.758316]  dump_backtrace+0x0/0x1c0
[ 6382.758526]  show_stack+0x14/0x20
[ 6382.758703]  sched_show_task+0x160/0x198
[ 6382.758900]  dump_cpu_task+0x40/0x50
[ 6382.759103]  rcu_dump_cpu_stacks+0xc0/0x100
[ 6382.759281]  rcu_check_callbacks+0x594/0x780
[ 6382.759455]  update_process_times+0x2c/0x58
[ 6382.759664]  tick_sched_handle.isra.5+0x30/0x48
[ 6382.759844]  tick_sched_timer+0x48/0x98
[ 6382.760013]  __hrtimer_run_queues+0xe4/0x1f8
[ 6382.760172]  hrtimer_interrupt+0xf4/0x2b0
[ 6382.760378]  arch_timer_handler_phys+0x28/0x40
[ 6382.760563]  handle_percpu_devid_irq+0x80/0x138
[ 6382.760765]  generic_handle_irq+0x24/0x38
[ 6382.760960]  __handle_domain_irq+0x5c/0xb0
[ 6382.761107]  gic_handle_irq+0x58/0xa8
[ 6382.761249]  el1_irq+0xb0/0x140
[ 6382.761466]  __usb_hcd_giveback_urb+0x98/0x148
[ 6382.761649]  usb_giveback_urb_bh+0xdc/0x170
[ 6382.761821]  tasklet_action_common.isra.3+0x7c/0x168
[ 6382.761961]  tasklet_action+0x24/0x30
[ 6382.762107]  __do_softirq+0x10c/0x200
[ 6382.762238]  irq_exit+0xac/0xc0
[ 6382.762431]  __handle_domain_irq+0x60/0xb0
[ 6382.762567]  gic_handle_irq+0x58/0xa8
[ 6382.762706]  el1_irq+0xb0/0x140
[ 6382.762915]  clk_propagate_rate_change+0x30/0xe0
[ 6382.763088]  clk_propagate_rate_change+0x90/0xe0
[ 6382.763262]  clk_propagate_rate_change+0x90/0xe0
[ 6382.763465]  clk_core_set_rate_nolock+0x1c0/0x1f8
[ 6382.763649]  clk_set_rate+0x38/0xa8
[ 6382.763807]  dev_pm_opp_set_rate+0x1f0/0x540
[ 6382.764083]  set_target+0x40/0x70 [cpufreq_dt]
[ 6382.764276]  __cpufreq_driver_target+0x184/0x5b0
[ 6382.764463]  od_dbs_update+0x140/0x1a0
[ 6382.764660]  dbs_work_handler+0x3c/0x70
[ 6382.764858]  process_one_work+0x1e4/0x360
[ 6382.765042]  worker_thread+0x48/0x4b0
[ 6382.765195]  kthread+0x128/0x130
[ 6382.765367]  ret_from_fork+0x10/0x1c
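
The trace above was interrupted inside dbs_work_handler -> __cpufreq_driver_target -> clk_set_rate, which fits the frequency-scaling observation. A rough sketch for checking other stall dumps for the same pattern follows. The function name `scaling_frames` and the list of substrings are my own guesses, not anything from the kernel or Armbian.

```python
import re

# Matches a frame such as "  dev_pm_opp_set_rate+0x1f0/0x540" and captures
# the function name. Works for both log formats quoted in this thread.
FRAME_RE = re.compile(r"\s([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")

# Assumed substrings that suggest a frame belongs to the frequency-scaling
# path (governor, cpufreq core, OPP, clock framework). Adjust as needed.
SCALING_HINTS = ("cpufreq", "dbs_", "clk_", "dev_pm_opp")

def scaling_frames(dmesg_text: str) -> list:
    """Return the call-trace function names that look scaling-related."""
    frames = FRAME_RE.findall(dmesg_text)
    return [f for f in frames if any(h in f for h in SCALING_HINTS)]

# A few frames copied from the trace above.
sample = """[ 6382.763807]  dev_pm_opp_set_rate+0x1f0/0x540
[ 6382.764083]  set_target+0x40/0x70 [cpufreq_dt]
[ 6382.764276]  __cpufreq_driver_target+0x184/0x5b0
[ 6382.764463]  od_dbs_update+0x140/0x1a0
[ 6382.765195]  kthread+0x128/0x130
"""
print(scaling_frames(sample))
```

An empty result does not rule out scaling as the cause; the first two traces in this thread show no such frames, since the dump only captures what the CPU was doing when the stall was detected.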

Others have reported the same issue:

 

https://github.com/armbian/build/issues/1000
