Aurimas

Pine64+ on 4.19.13-sunxi64: rcu_sched self-detected stall on CPU


After upgrading from 4.14.70-sunxi64 to 4.19.13-sunxi64, my Pine64+ board started to misbehave after running for some time.

Dmesg shows:


saus. 14 03:26:23 pine64 kernel: rcu: INFO: rcu_sched self-detected stall on CPU
saus. 14 03:26:23 pine64 kernel: rcu:         2-...0: (286 GPs behind) idle=ad6/0/0x1 softirq=380346/380347 fqs=2
saus. 14 03:26:23 pine64 kernel: rcu:          (t=60094 jiffies g=291549 q=3)
saus. 14 03:26:23 pine64 kernel: Task dump for CPU 2:
saus. 14 03:26:23 pine64 kernel: swapper/2       R  running task        0     0      1 0x0000002a
saus. 14 03:26:23 pine64 kernel: Call trace:
saus. 14 03:26:23 pine64 kernel:  dump_backtrace+0x0/0x1c0
saus. 14 03:26:23 pine64 kernel:  show_stack+0x14/0x20
saus. 14 03:26:23 pine64 kernel:  sched_show_task+0x160/0x198
saus. 14 03:26:23 pine64 kernel:  dump_cpu_task+0x40/0x50
saus. 14 03:26:23 pine64 kernel:  rcu_dump_cpu_stacks+0xc0/0x100
saus. 14 03:26:23 pine64 kernel:  rcu_check_callbacks+0x594/0x780
saus. 14 03:26:23 pine64 kernel:  update_process_times+0x2c/0x58
saus. 14 03:26:23 pine64 kernel:  tick_sched_handle.isra.5+0x30/0x48
saus. 14 03:26:23 pine64 kernel:  tick_sched_timer+0x48/0x98
saus. 14 03:26:23 pine64 kernel:  __hrtimer_run_queues+0xe4/0x1f8
saus. 14 03:26:23 pine64 kernel:  hrtimer_interrupt+0xf4/0x2b0
saus. 14 03:26:23 pine64 kernel:  arch_timer_handler_phys+0x28/0x40
saus. 14 03:26:23 pine64 kernel:  handle_percpu_devid_irq+0x80/0x138
saus. 14 03:26:23 pine64 kernel:  generic_handle_irq+0x24/0x38
saus. 14 03:26:23 pine64 kernel:  __handle_domain_irq+0x5c/0xb0
saus. 14 03:26:23 pine64 kernel:  gic_handle_irq+0x58/0xa8
saus. 14 03:26:23 pine64 kernel:  el1_irq+0xb0/0x140
saus. 14 03:26:23 pine64 kernel:  arch_cpu_idle+0x10/0x18
saus. 14 03:26:23 pine64 kernel:  do_idle+0x1d4/0x298
saus. 14 03:26:23 pine64 kernel:  cpu_startup_entry+0x24/0x28
saus. 14 03:26:23 pine64 kernel:  secondary_start_kernel+0x18c/0x1c8
saus. 14 03:26:23 pine64 kernel: Task dump for CPU 3:
saus. 14 03:26:23 pine64 kernel: tor             R  running task        0  2964      1 0x00000802
saus. 14 03:26:23 pine64 kernel: Call trace:
saus. 14 03:26:23 pine64 kernel:  __switch_to+0x94/0xd8
saus. 14 03:26:23 pine64 kernel:  do_mem_abort+0x54/0x100
saus. 14 03:26:23 pine64 kernel:  el0_da+0x20/0x24

 

Armbian-release:

BOARD=pine64
BOARD_NAME="Pine64"
BOARDFAMILY=sun50iw1
VERSION=5.70
LINUXFAMILY=sunxi64
BRANCH=next
ARCH=arm64
IMAGE_TYPE=stable
BOARD_TYPE=conf
INITRD_ARCH=arm64
KERNEL_IMAGE_TYPE=Image

 

With the Armbian Pine64 3.10 kernel, the board ran for 2+ years without hanging.

I also tried the 4.18.xx-sunxi64 kernel from the dev branch and hit the same problem.


This morning I upgraded to 4.19.17-sunxi64 (5.73). The board stalled after ~7.5 hours of runtime:

 


[26338.303225] mmc0: Card stuck in programming state! mmc_do_erase
[26375.451469] mmc0: Card stuck in programming state! mmc_do_erase                      
[26380.881967] mmc0: Card stuck in programming state! mmc_do_erase                        
[26404.645290] mmc0: Card stuck in programming state! mmc_do_erase
[26409.085215] mmc0: Card stuck in programming state! mmc_do_erase
[26412.041056] mmc0: Card stuck in programming state! mmc_do_erase
[26412.047006] print_req_error: I/O error, dev mmcblk0, sector 2830472
[26412.055674] sunxi-mmc 1c0f000.mmc: data error, sending stop command
[26412.062057] sunxi-mmc 1c0f000.mmc: send stop command failed         
[27219.332955] rcu: INFO: rcu_sched self-detected stall on CPU
[27219.338555] rcu:     1-...0: (1 GPs behind) idle=bde/1/0x4000000000000002 softirq=1744556/1744557 fqs=2577
[27219.348024] rcu:      (t=5250 jiffies g=1620509 q=13602)          
[27219.353075] Task dump for CPU 1:                            
[27219.353078] systemd-journal R  running task        0   297      1 0x00000802
[27219.353087] Call trace:                    
[27219.353103]  dump_backtrace+0x0/0x1c0                 
[27219.353109]  show_stack+0x14/0x20        
[27219.353115]  sched_show_task+0x160/0x198   
[27219.353121]  dump_cpu_task+0x40/0x50                      
[27219.353128]  rcu_dump_cpu_stacks+0xc0/0x100
[27219.353132]  rcu_check_callbacks+0x594/0x780
[27219.353138]  update_process_times+0x2c/0x58
[27219.353144]  tick_sched_handle.isra.5+0x30/0x48
[27219.353149]  tick_sched_timer+0x48/0x98
[27219.353154]  __hrtimer_run_queues+0xe4/0x1f8
[27219.353158]  hrtimer_interrupt+0xf4/0x2b0
[27219.353166]  arch_timer_handler_phys+0x28/0x40
[27219.353172]  handle_percpu_devid_irq+0x80/0x138
[27219.353178]  generic_handle_irq+0x24/0x38
[27219.353183]  __handle_domain_irq+0x5c/0xb0
[27219.353187]  gic_handle_irq+0x58/0xa8
[27219.353191]  el1_irq+0xb0/0x140
[27219.353198]  __seccomp_filter+0x20/0x428
[27219.353202]  __secure_computing+0x38/0xc8
[27219.353207]  syscall_trace_enter+0x98/0x110
[27219.353215]  el0_svc_common+0xc0/0x100
[27219.353219]  el0_svc_handler+0x24/0x80
[27219.353223]  el0_svc+0x8/0xc

 


I have a similar issue on an Orange Pi PC2 with 4.19.25-sunxi64 and a DVB-T2 USB dongle. Locking the CPU frequency, so that no frequency scaling takes place, solves the problem.

For example, this is my current configuration:

 

# WARNING: this file will be replaced on board support package (linux-root-...) upgrade
ENABLE=true
MIN_SPEED=120000
MAX_SPEED=1400000
#MAX_SPEED=500000
GOVERNOR=conservative

If MIN_SPEED and MAX_SPEED are the same, you don't get any problems.

The performance governor works OK (it locks to the maximum frequency).

The ondemand and interactive governors don't fix it unless you lock the frequency.

I can lock the frequency at 420 MHz and it works fine, so it's not related to the actual frequency.

If I set MIN_SPEED to 800 MHz and MAX_SPEED to 1300 MHz, it still gives problems.
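To make the workaround concrete, here is a minimal Python sketch that rewrites a config like the one above so that MIN_SPEED and MAX_SPEED are equal. The function name `lock_frequency` is made up for illustration, and the sketch assumes the values are in kHz, as the numbers in the config suggest. It only transforms text; writing the result back to the real file and restarting the service is left out on purpose.

```python
import re

def lock_frequency(config_text: str, khz: int) -> str:
    """Return config_text with MIN_SPEED and MAX_SPEED both set to khz.

    Hypothetical helper: assumes one KEY=VALUE assignment per line and
    speeds expressed in kHz.
    """
    out = []
    for line in config_text.splitlines():
        # Only active assignments are changed; commented-out lines are kept.
        m = re.match(r"^(MIN_SPEED|MAX_SPEED)=", line)
        out.append(f"{m.group(1)}={khz}" if m else line)
    return "\n".join(out) + "\n"

# The config from the post above, used as sample input.
example = """\
# WARNING: this file will be replaced on board support package (linux-root-...) upgrade
ENABLE=true
MIN_SPEED=120000
MAX_SPEED=1400000
#MAX_SPEED=500000
GOVERNOR=conservative
"""

if __name__ == "__main__":
    print(lock_frequency(example, 1400000), end="")
```

With both values equal, the governor has only one frequency to choose from, which matches the observation that the choice of governor stops mattering once the frequency is locked.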

 

My error:

[ 6382.756348] rcu: INFO: rcu_sched self-detected stall on CPU
[ 6382.756892] rcu:     1-....: (5601 ticks this GP) idle=d8a/1/0x4000000000000004 softirq=227944/227947 fqs=2034
[ 6382.756957] rcu:      (t=5250 jiffies g=486985 q=1835)
[ 6382.757208] Task dump for CPU 1:
[ 6382.757310] kworker/1:2     R  running task        0  4002      2 0x0000002a
[ 6382.757904] Workqueue: events dbs_work_handler
[ 6382.758057] Call trace:
[ 6382.758316]  dump_backtrace+0x0/0x1c0
[ 6382.758526]  show_stack+0x14/0x20
[ 6382.758703]  sched_show_task+0x160/0x198
[ 6382.758900]  dump_cpu_task+0x40/0x50
[ 6382.759103]  rcu_dump_cpu_stacks+0xc0/0x100
[ 6382.759281]  rcu_check_callbacks+0x594/0x780
[ 6382.759455]  update_process_times+0x2c/0x58
[ 6382.759664]  tick_sched_handle.isra.5+0x30/0x48
[ 6382.759844]  tick_sched_timer+0x48/0x98
[ 6382.760013]  __hrtimer_run_queues+0xe4/0x1f8
[ 6382.760172]  hrtimer_interrupt+0xf4/0x2b0
[ 6382.760378]  arch_timer_handler_phys+0x28/0x40
[ 6382.760563]  handle_percpu_devid_irq+0x80/0x138
[ 6382.760765]  generic_handle_irq+0x24/0x38
[ 6382.760960]  __handle_domain_irq+0x5c/0xb0
[ 6382.761107]  gic_handle_irq+0x58/0xa8
[ 6382.761249]  el1_irq+0xb0/0x140
[ 6382.761466]  __usb_hcd_giveback_urb+0x98/0x148
[ 6382.761649]  usb_giveback_urb_bh+0xdc/0x170
[ 6382.761821]  tasklet_action_common.isra.3+0x7c/0x168
[ 6382.761961]  tasklet_action+0x24/0x30
[ 6382.762107]  __do_softirq+0x10c/0x200
[ 6382.762238]  irq_exit+0xac/0xc0
[ 6382.762431]  __handle_domain_irq+0x60/0xb0
[ 6382.762567]  gic_handle_irq+0x58/0xa8
[ 6382.762706]  el1_irq+0xb0/0x140
[ 6382.762915]  clk_propagate_rate_change+0x30/0xe0
[ 6382.763088]  clk_propagate_rate_change+0x90/0xe0
[ 6382.763262]  clk_propagate_rate_change+0x90/0xe0
[ 6382.763465]  clk_core_set_rate_nolock+0x1c0/0x1f8
[ 6382.763649]  clk_set_rate+0x38/0xa8
[ 6382.763807]  dev_pm_opp_set_rate+0x1f0/0x540
[ 6382.764083]  set_target+0x40/0x70 [cpufreq_dt]
[ 6382.764276]  __cpufreq_driver_target+0x184/0x5b0
[ 6382.764463]  od_dbs_update+0x140/0x1a0
[ 6382.764660]  dbs_work_handler+0x3c/0x70
[ 6382.764858]  process_one_work+0x1e4/0x360
[ 6382.765042]  worker_thread+0x48/0x4b0
[ 6382.765195]  kthread+0x128/0x130
[ 6382.765367]  ret_from_fork+0x10/0x1c
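The trace above shows the stall hitting while a cpufreq worker was changing the CPU clock (`dbs_work_handler` down to `clk_set_rate`). For comparing reports from different boards, a rough Python sketch like the following could flag which stall reports in a saved log contain those function names. The function `stall_reports` and the marker list are my own choices for illustration, not an established tool, and a match only shows that the names appear in the report, not that cpufreq caused the stall.

```python
import re

# Function names copied from the trace above; their presence suggests a
# CPU clock change was in progress. The selection is an assumption.
CPUFREQ_MARKERS = (
    "dbs_work_handler",
    "__cpufreq_driver_target",
    "dev_pm_opp_set_rate",
    "clk_set_rate",
)

def stall_reports(log_text):
    """Return one dict per RCU stall report found in log_text.

    Each dict holds the first CPU named in a "Task dump" line and whether
    any cpufreq marker appears before the next stall report begins.
    """
    reports = []
    current = None
    for line in log_text.splitlines():
        if "self-detected stall on CPU" in line:
            current = {"cpu": None, "cpufreq": False}
            reports.append(current)
        elif current is not None:
            m = re.search(r"Task dump for CPU (\d+):", line)
            if m and current["cpu"] is None:
                current["cpu"] = int(m.group(1))
            if any(marker in line for marker in CPUFREQ_MARKERS):
                current["cpufreq"] = True
    return reports

# Four lines taken from the log above, used as sample input.
sample = """\
[ 6382.756348] rcu: INFO: rcu_sched self-detected stall on CPU
[ 6382.757208] Task dump for CPU 1:
[ 6382.757904] Workqueue: events dbs_work_handler
[ 6382.763807]  dev_pm_opp_set_rate+0x1f0/0x540
"""

if __name__ == "__main__":
    for report in stall_reports(sample):
        print(report)
```

Run over the two earlier logs in this thread, the same check would report no cpufreq markers, since those traces show the idle loop and a seccomp path instead.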

Some others with the same issue:

 

https://github.com/armbian/build/issues/1000
