sunzone

OPi Zero: xradio_wlan driver Kernel hang


ARMBIAN 5.75 stable Ubuntu 18.04.2 LTS 4.19.38-sunxi
Orange Pi Zero Board

Please refer to armbianmonitor.log for the full output of the armbianmonitor -U command.

 

I have configured the onboard wlan chip of my Orange Pi Zero (xradio_wlan driver) to work in AP mode and client mode simultaneously.

When both the AP and client interfaces are up and working, and the wifi channel of the network the client interface is connected to changes, the kernel hangs.

 

Error message

[20148.872296] rcu: INFO: rcu_sched self-detected stall on CPU
[20148.877893] rcu:     1-....: (4945709 ticks this GP) idle=8aa/1/0x40000002 softirq=9640/9640 fqs=2447771
[20148.887182] rcu:      (t=4967445 jiffies g=15521 q=274313)
[20148.892407] NMI backtrace for cpu 1
[20148.895897] CPU: 1 PID: 396 Comm: xradio_bh Tainted: G      D W         4.19.20-sunxi #5.75
[20148.904234] Hardware name: Allwinner sun8i Family
[20148.908959] [<c010dbbd>] (unwind_backtrace) from [<c010a7b1>] (show_stack+0x11/0x14)
[20148.916701] [<c010a7b1>] (show_stack) from [<c08c8dc1>] (dump_stack+0x69/0x78)
[20148.923921] [<c08c8dc1>] (dump_stack) from [<c08cd03b>] (nmi_cpu_backtrace+0x8f/0x90)
[20148.931746] [<c08cd03b>] (nmi_cpu_backtrace) from [<c08cd0eb>] (nmi_trigger_cpumask_backtrace+0xaf/0xe0)
[20148.941216] [<c08cd0eb>] (nmi_trigger_cpumask_backtrace) from [<c016c183>] (rcu_dump_cpu_stacks+0x7b/0x98)
[20148.950861] [<c016c183>] (rcu_dump_cpu_stacks) from [<c016b7b1>] (rcu_check_callbacks+0x4f5/0x6c8)
[20148.959815] [<c016b7b1>] (rcu_check_callbacks) from [<c0170af3>] (update_process_times+0x2b/0x48)
[20148.968681] [<c0170af3>] (update_process_times) from [<c017e96b>] (tick_sched_timer+0x37/0x74)
[20148.977277] [<c017e96b>] (tick_sched_timer) from [<c0171355>] (__hrtimer_run_queues+0x105/0x254)
[20148.986055] [<c0171355>] (__hrtimer_run_queues) from [<c0171e15>] (hrtimer_interrupt+0xb5/0x200)
[20148.994835] [<c0171e15>] (hrtimer_interrupt) from [<c0780ab1>] (arch_timer_handler_phys+0x25/0x28)
[20149.003787] [<c0780ab1>] (arch_timer_handler_phys) from [<c0162d9b>] (handle_percpu_devid_irq+0x57/0x19c)
[20149.013346] [<c0162d9b>] (handle_percpu_devid_irq) from [<c015f1f5>] (generic_handle_irq+0x1d/0x28)
[20149.022383] [<c015f1f5>] (generic_handle_irq) from [<c015f695>] (__handle_domain_irq+0x45/0x84)
[20149.031076] [<c015f695>] (__handle_domain_irq) from [<c059a475>] (gic_handle_irq+0x39/0x68)
[20149.039421] [<c059a475>] (gic_handle_irq) from [<c0101a65>] (__irq_svc+0x65/0x94)
[20149.046892] Exception stack(0xd66c9de8 to 0xd66c9e30)
[20149.051941] 9de0:                   c9ae0e4c 00000000 0000287c 0000287b 00000000 d6480e00
[20149.060110] 9e00: c0d04d48 c9ae0d18 d648100c 00000080 d64811fc 00000000 ce9bc61c d66c9e38
[20149.068276] 9e20: bfb2d42d c08db396 20070033 ffffffff
[20149.073328] [<c0101a65>] (__irq_svc) from [<c08db396>] (_raw_spin_lock+0x26/0x34)
[20149.080830] [<c08db396>] (_raw_spin_lock) from [<bfb2d42d>] (wsm_handle_rx+0x828/0xc7c [xradio_wlan])
[20149.090066] [<bfb2d42d>] (wsm_handle_rx [xradio_wlan]) from [<bfb29e05>] (xradio_bh_exchange+0x27c/0x588 [xradio_wlan])
[20149.100862] [<bfb29e05>] (xradio_bh_exchange [xradio_wlan]) from [<bfb2a239>] (xradio_bh+0x128/0x270 [xradio_wlan])
[20149.111301] [<bfb2a239>] (xradio_bh [xradio_wlan]) from [<c0132cb1>] (kthread+0xfd/0x104)
[20149.119470] [<c0132cb1>] (kthread) from [<c01010f9>] (ret_from_fork+0x11/0x38)
[20149.126680] Exception stack(0xd66c9fb0 to 0xd66c9ff8)
[20149.131726] 9fa0:                                     00000000 00000000 00000000 00000000
[20149.139894] 9fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[20149.148061] 9fe0: 00000000 00000000 00000000 00000000 00000013 00000000
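
As a rough sanity check on the stall report above, the t= value can be converted to wall-clock time. This is only a minimal sketch: the numbers are copied from the log, but the kernel tick rate (HZ) is not shown anywhere in this thread, so any value plugged in for it is an assumption (check CONFIG_HZ on the actual kernel). As far as I understand, t= is the number of jiffies since the stalled grace period started.

```python
# Values copied from the "rcu: (t=... jiffies ...)" line of the stall report.
STALL_JIFFIES = 4967445          # the "t=" field
LOG_TIMESTAMP_S = 20148.887182   # seconds since boot printed on that line


def stall_seconds(jiffies: int, hz: int) -> float:
    """Convert a jiffies count to seconds for a given tick rate (HZ)."""
    return jiffies / hz


def minimum_hz(jiffies: int, uptime_s: float) -> float:
    """Smallest tick rate for which the stall still fits inside the uptime."""
    return jiffies / uptime_s
```

With an assumed HZ of 250, stall_seconds(STALL_JIFFIES, 250) comes out at roughly 19870 s, about 5.5 hours, which would put the start of the stall a few minutes after boot. minimum_hz(STALL_JIFFIES, LOG_TIMESTAMP_S) is about 246.5, so a lower tick rate is ruled out because the stall cannot be longer than the uptime.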

ifconfig output (wlan0_ap: AP interface, wlan0_station: client interface)

root@m6das:~# ifconfig
eth0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        ether 02:42:93:7e:50:70  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
        device interrupt 39

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 592  bytes 73021 (73.0 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 592  bytes 73021 (73.0 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

wlan0_ap: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.10.10.1  netmask 255.255.255.0  broadcast 10.10.10.255
        inet6 fe80::1042:93ff:fe7e:5071  prefixlen 64  scopeid 0x20<link>
        ether 12:42:93:7e:50:71  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 69  bytes 11042 (11.0 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

wlan0_station: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.0.118  netmask 255.255.255.0  broadcast 192.168.0.255
        inet6 fe80::1042:93ff:fe7e:5070  prefixlen 64  scopeid 0x20<link>
        ether 12:42:93:7e:50:70  txqueuelen 1000  (Ethernet)
        RX packets 16880  bytes 8748161 (8.7 MB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 15659  bytes 19025140 (19.0 MB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

NetworkManager manages all interfaces

root@m6das:~# nmcli c s
NAME           UUID                                  TYPE      DEVICE
wlan0_ap       f26e29d0-d1bc-4eae-b6bc-c3d02ded2bf1  wifi      wlan0_ap
wlan0_station  3466adf1-ddbd-45fa-b9e4-8a11c265d5a8  wifi      wlan0_station
eth0           64db1031-abac-33fd-8055-2a757fe08d1a  ethernet  --

Can anyone help me stop the kernel hang? Any advice?

 

Thank You.

Sanju.

 


Thanks, @Tido for the links.

 

[20149.090066] [<bfb2d42d>] (wsm_handle_rx [xradio_wlan]) from [<bfb29e05>] (xradio_bh_exchange+0x27c/0x588 [xradio_wlan])

Judging by the error log, execution stalled in wsm_handle_rx in xradio_wlan.
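
To see which frames of the backtrace belong to the driver, the trace can be filtered by module name. Below is a minimal Python sketch; the function name module_frames and the regular expression are my own, for illustration only, and the sample lines are copied from the log in the first post.

```python
import re
from typing import List

# Matches the callee part of an ARM backtrace line, e.g.
#   "[<bfb2d42d>] (wsm_handle_rx [xradio_wlan]) from ..."
# Group 1 is the function name, group 2 the module name (if any).
FRAME_RE = re.compile(r"\]\s+\((\w+)(?:\s+\[(\w+)\])?\)\s+from")

SAMPLE = """\
[20149.073328] [<c0101a65>] (__irq_svc) from [<c08db396>] (_raw_spin_lock+0x26/0x34)
[20149.080830] [<c08db396>] (_raw_spin_lock) from [<bfb2d42d>] (wsm_handle_rx+0x828/0xc7c [xradio_wlan])
[20149.090066] [<bfb2d42d>] (wsm_handle_rx [xradio_wlan]) from [<bfb29e05>] (xradio_bh_exchange+0x27c/0x588 [xradio_wlan])
[20149.100862] [<bfb29e05>] (xradio_bh_exchange [xradio_wlan]) from [<bfb2a239>] (xradio_bh+0x128/0x270 [xradio_wlan])
[20149.111301] [<bfb2a239>] (xradio_bh [xradio_wlan]) from [<c0132cb1>] (kthread+0xfd/0x104)
"""


def module_frames(log_text: str, module: str) -> List[str]:
    """Return the functions of `module` found in the trace, innermost first."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match and match.group(2) == module:
            frames.append(match.group(1))
    return frames
```

On the sample, module_frames(SAMPLE, "xradio_wlan") returns wsm_handle_rx, xradio_bh_exchange and xradio_bh, innermost first. The frame directly above wsm_handle_rx in the log is _raw_spin_lock, which would be consistent with the xradio_bh thread spinning on a lock inside wsm_handle_rx, although the trace alone does not say which lock.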

 

I found that the official xradio_wlan driver is the one from fifteenhex:

http://linux-sunxi.org/Wifi

[Attached image: driver.png]

 

 

I searched the fifteenhex driver code for "wsm_handle_rx" and found this line in wsm.c:

[Attached image: wsm.png]

https://github.com/fifteenhex/xradio/blob/master/wsm.c

 

It looks as if the kernel hang is triggered on purpose.

 

After commenting out this section, I recompiled the xradio_wlan.ko driver module and retested.

I ran 11 Orange Pi Zeros with the new xradio driver, connected to a router that occasionally changes its channel when multiple devices are connected. 5 of the OPis stopped responding.

I ran the test overnight, with all OPis connected through their serial ports (COM) so I could watch them.

 

[Attached image: Testing.jpg]

 

@zador.blood.stained @Tido @martinayotte any idea how I can further test the driver to stop the kernel hang?

P.S. I am OK with the driver dropping packets.

 

Thank You.

 

Sanju.

 

 

On 6/3/2019 at 5:10 PM, Tido said:

any luck so far ?

I was not familiar with IRC. However, I just asked about this problem on the #linux-sunxi channel.

Let's see what replies I get.

 

In the meantime, I wrote a program to reboot the system if a channel change is detected.

But still, sometimes the device doesn't boot after calling 'reboot -f'.
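
For anyone who wants to try the same workaround, a watchdog of this kind could look roughly like the sketch below. This is not my actual program, only a minimal illustration in Python. It assumes the iw tool is installed and that the output of `iw dev <interface> link` contains a "freq:" line in MHz while connected; the function names and the 5 second polling interval are made up for the example.

```python
import re
import subprocess
import time
from typing import Optional

IFACE = "wlan0_station"  # client interface name used in this thread


def parse_freq_mhz(link_output: str) -> Optional[int]:
    """Extract the frequency in MHz from `iw dev <iface> link` output."""
    match = re.search(r"^\s*freq:\s*(\d+)", link_output, re.MULTILINE)
    return int(match.group(1)) if match else None


def channel_changed(previous: Optional[int], current: Optional[int]) -> bool:
    """True only when both readings are known and they differ."""
    return previous is not None and current is not None and previous != current


def read_link(iface: str) -> str:
    """Run `iw dev <iface> link` and return its standard output."""
    result = subprocess.run(["iw", "dev", iface, "link"],
                            capture_output=True, text=True, check=False)
    return result.stdout


def watch(iface: str = IFACE, interval_s: float = 5.0) -> None:
    """Poll the link and force a reboot as soon as the frequency changes."""
    last = None
    while True:
        freq = parse_freq_mhz(read_link(iface))
        if channel_changed(last, freq):
            subprocess.run(["reboot", "-f"], check=False)
            return
        if freq is not None:
            last = freq  # remember the last known frequency
        time.sleep(interval_s)
```

Calling watch() as root starts the polling loop. Readings taken while the interface is disconnected are ignored, so a temporary drop of the link does not count as a channel change by itself.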

 

Another option is to use a USB wifi dongle as the client interface. I already tried this.

But then again, sometimes the USB wifi module is not detected at boot and has to be replugged, which is not ideal.


Just a follow-up.

After changing the driver code in several places and testing, without any in-depth understanding of this chip, I was not able to stop the kernel hang in the xradio wifi driver.

 

I noticed problems with xradio in 2 modes.

1) Using xradio in AP mode: kernel hang, and 'reboot -f' does not work

 

2) Using xradio in concurrent mode (AP and client): kernel hang

 

In both cases, the problem involved AP mode.

I decided to stop using the xradio for good after all the testing.

 

I stopped the xradio module from loading at boot by commenting out 'xradio_wlan' in the /etc/modules file.

I blacklisted the xradio driver by adding 'blacklist xradio_wlan' to the /etc/modprobe.d/blacklist.conf file.
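
The two file changes above can be sketched as follows. This is only an illustration in Python that works on the file contents as strings; the helper names are made up, and reading and writing the real files (as root, with a backup) is left out on purpose.

```python
def comment_out_module(modules_text: str, module: str) -> str:
    """Comment out every line of /etc/modules that names `module`."""
    out = []
    for line in modules_text.splitlines():
        if line.strip() == module:
            out.append("# " + line)  # disabled, but still visible in the file
        else:
            out.append(line)
    return "\n".join(out) + "\n"


def add_blacklist(conf_text: str, module: str) -> str:
    """Append 'blacklist <module>' to a modprobe.d file unless already there."""
    entry = f"blacklist {module}"
    lines = conf_text.splitlines()
    if entry not in (line.strip() for line in lines):
        lines.append(entry)
    return "\n".join(lines) + "\n"
```

Both helpers can be run repeatedly without piling up duplicate entries: a line that is already commented out no longer matches the bare module name, and the blacklist entry is only appended when it is missing.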

 

After that there were no kernel hangs or 'reboot -f' problems.

 

I use an external wifi module now. I have tested about 8 wifi modules so far, and I will make another post with my results and experience with them.

Cheers!
