sunzone Posted May 27, 2019

ARMBIAN 5.75 stable Ubuntu 18.04.2 LTS 4.19.38-sunxi
Orange Pi Zero board

Please refer to armbianmonitor.log for the armbianmonitor -U command output details.

I have configured my Orange Pi Zero's onboard WLAN chip (xradio_wlan driver) to work in AP mode and client mode simultaneously. When both the AP and client interfaces are up and working, and the Wi-Fi channel that the client interface is connected on changes, the kernel hangs.

Error message:

[20148.872296] rcu: INFO: rcu_sched self-detected stall on CPU
[20148.877893] rcu: 1-....: (4945709 ticks this GP) idle=8aa/1/0x40000002 softirq=9640/9640 fqs=2447771
[20148.887182] rcu: (t=4967445 jiffies g=15521 q=274313)
[20148.892407] NMI backtrace for cpu 1
[20148.895897] CPU: 1 PID: 396 Comm: xradio_bh Tainted: G D W 4.19.20-sunxi #5.75
[20148.904234] Hardware name: Allwinner sun8i Family
[20148.908959] [<c010dbbd>] (unwind_backtrace) from [<c010a7b1>] (show_stack+0x11/0x14)
[20148.916701] [<c010a7b1>] (show_stack) from [<c08c8dc1>] (dump_stack+0x69/0x78)
[20148.923921] [<c08c8dc1>] (dump_stack) from [<c08cd03b>] (nmi_cpu_backtrace+0x8f/0x90)
[20148.931746] [<c08cd03b>] (nmi_cpu_backtrace) from [<c08cd0eb>] (nmi_trigger_cpumask_backtrace+0xaf/0xe0)
[20148.941216] [<c08cd0eb>] (nmi_trigger_cpumask_backtrace) from [<c016c183>] (rcu_dump_cpu_stacks+0x7b/0x98)
[20148.950861] [<c016c183>] (rcu_dump_cpu_stacks) from [<c016b7b1>] (rcu_check_callbacks+0x4f5/0x6c8)
[20148.959815] [<c016b7b1>] (rcu_check_callbacks) from [<c0170af3>] (update_process_times+0x2b/0x48)
[20148.968681] [<c0170af3>] (update_process_times) from [<c017e96b>] (tick_sched_timer+0x37/0x74)
[20148.977277] [<c017e96b>] (tick_sched_timer) from [<c0171355>] (__hrtimer_run_queues+0x105/0x254)
[20148.986055] [<c0171355>] (__hrtimer_run_queues) from [<c0171e15>] (hrtimer_interrupt+0xb5/0x200)
[20148.994835] [<c0171e15>] (hrtimer_interrupt) from [<c0780ab1>] (arch_timer_handler_phys+0x25/0x28)
[20149.003787] [<c0780ab1>] (arch_timer_handler_phys) from [<c0162d9b>] (handle_percpu_devid_irq+0x57/0x19c)
[20149.013346] [<c0162d9b>] (handle_percpu_devid_irq) from [<c015f1f5>] (generic_handle_irq+0x1d/0x28)
[20149.022383] [<c015f1f5>] (generic_handle_irq) from [<c015f695>] (__handle_domain_irq+0x45/0x84)
[20149.031076] [<c015f695>] (__handle_domain_irq) from [<c059a475>] (gic_handle_irq+0x39/0x68)
[20149.039421] [<c059a475>] (gic_handle_irq) from [<c0101a65>] (__irq_svc+0x65/0x94)
[20149.046892] Exception stack(0xd66c9de8 to 0xd66c9e30)
[20149.051941] 9de0: c9ae0e4c 00000000 0000287c 0000287b 00000000 d6480e00
[20149.060110] 9e00: c0d04d48 c9ae0d18 d648100c 00000080 d64811fc 00000000 ce9bc61c d66c9e38
[20149.068276] 9e20: bfb2d42d c08db396 20070033 ffffffff
[20149.073328] [<c0101a65>] (__irq_svc) from [<c08db396>] (_raw_spin_lock+0x26/0x34)
[20149.080830] [<c08db396>] (_raw_spin_lock) from [<bfb2d42d>] (wsm_handle_rx+0x828/0xc7c [xradio_wlan])
[20149.090066] [<bfb2d42d>] (wsm_handle_rx [xradio_wlan]) from [<bfb29e05>] (xradio_bh_exchange+0x27c/0x588 [xradio_wlan])
[20149.100862] [<bfb29e05>] (xradio_bh_exchange [xradio_wlan]) from [<bfb2a239>] (xradio_bh+0x128/0x270 [xradio_wlan])
[20149.111301] [<bfb2a239>] (xradio_bh [xradio_wlan]) from [<c0132cb1>] (kthread+0xfd/0x104)
[20149.119470] [<c0132cb1>] (kthread) from [<c01010f9>] (ret_from_fork+0x11/0x38)
[20149.126680] Exception stack(0xd66c9fb0 to 0xd66c9ff8)
[20149.131726] 9fa0: 00000000 00000000 00000000 00000000
[20149.139894] 9fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[20149.148061] 9fe0: 00000000 00000000 00000000 00000000 00000013 00000000

ifconfig output (wlan0_ap: AP interface, wlan0_station: client interface):

root@m6das:~# ifconfig
eth0: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
        ether 02:42:93:7e:50:70 txqueuelen 1000 (Ethernet)
        RX packets 0 bytes 0 (0.0 B)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 0 bytes 0 (0.0 B)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
        device interrupt 39

lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
        inet 127.0.0.1 netmask 255.0.0.0
        inet6 ::1 prefixlen 128 scopeid 0x10<host>
        loop txqueuelen 1000 (Local Loopback)
        RX packets 592 bytes 73021 (73.0 KB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 592 bytes 73021 (73.0 KB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

wlan0_ap: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        inet 10.10.10.1 netmask 255.255.255.0 broadcast 10.10.10.255
        inet6 fe80::1042:93ff:fe7e:5071 prefixlen 64 scopeid 0x20<link>
        ether 12:42:93:7e:50:71 txqueuelen 1000 (Ethernet)
        RX packets 0 bytes 0 (0.0 B)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 69 bytes 11042 (11.0 KB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

wlan0_station: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        inet 192.168.0.118 netmask 255.255.255.0 broadcast 192.168.0.255
        inet6 fe80::1042:93ff:fe7e:5070 prefixlen 64 scopeid 0x20<link>
        ether 12:42:93:7e:50:70 txqueuelen 1000 (Ethernet)
        RX packets 16880 bytes 8748161 (8.7 MB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 15659 bytes 19025140 (19.0 MB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

NetworkManager manages all interfaces:

root@m6das:~# nmcli c s
NAME           UUID                                  TYPE      DEVICE
wlan0_ap       f26e29d0-d1bc-4eae-b6bc-c3d02ded2bf1  wifi      wlan0_ap
wlan0_station  3466adf1-ddbd-45fa-b9e4-8a11c265d5a8  wifi      wlan0_station
eth0           64db1031-abac-33fd-8055-2a757fe08d1a  ethernet  --

Can anyone help me stop the kernel hang? Any advice?

Thank you.
Sanju.
Tido Posted May 27, 2019

2 hours ago, sunzone said:
Orange Pi Zero onboard wlan chip

Please read - buggy driver:
https://forum.armbian.com/topic/10237-orange-pi-zero-wifi-issues-with-ar9280-ap-device-cnx/
https://forum.armbian.com/topic/9179-failed-to-start-load-kernel-modules-xradio_wlan/
https://forum.armbian.com/topic/8697-wifi-installation-in-orange-pi-pc-in-mainline-kernel/
sunzone (Author) Posted May 29, 2019

Thanks, @Tido, for the links.

[20149.090066] [<bfb2d42d>] (wsm_handle_rx [xradio_wlan]) from [<bfb29e05>] (xradio_bh_exchange+0x27c/0x588 [xradio_wlan])

In the error logs, it seems the program stalled in wsm_handle_rx in xradio_wlan. I found that the official xradio_wlan driver is the fifteenhex one: http://linux-sunxi.org/Wifi

I checked the fifteenhex driver code for "wsm_handle_rx" and found this line in wsm.c:
https://github.com/fifteenhex/xradio/blob/master/wsm.c

It seems like the kernel hang is done on purpose. After commenting out this section, I recompiled the xradio_wlan.ko driver module and retested.

I ran 11 Orange Pi Zeros with the new xradio driver, connected to a router that occasionally changes its channel when multiple devices are connected. 5 of the OPis stopped responding. I ran the test overnight, with all OPis connected through the serial port (COM).

@zador.blood.stained @Tido @martinayotte any idea how I can further test the driver to stop the kernel hang?

P.S.: I am OK with the driver dropping packets.

Thank you.
Sanju.
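For anyone reading the trace the same way, here is a minimal Python sketch (not part of the driver or of any Armbian tool) that pulls the frames belonging to one module out of a pasted backtrace. The function name module_frames and the regular expression are assumptions based only on the frame format visible in the log from the first post.

```python
import re

# Matches call-site frames such as "(wsm_handle_rx+0x828/0xc7c [xradio_wlan])".
# The pattern is an assumption derived from the log format in this thread.
FRAME_RE = re.compile(r"\((\w+)\+0x[0-9a-f]+/0x[0-9a-f]+ \[(\w+)\]\)")


def module_frames(log_text, module):
    """Return function names from `module`, in the order they first appear."""
    seen = []
    for func, mod in FRAME_RE.findall(log_text):
        if mod == module and func not in seen:
            seen.append(func)
    return seen
```

Fed the backtrace above with module "xradio_wlan", the first name returned is the innermost driver function, wsm_handle_rx. The kernel frame directly above it in that trace is _raw_spin_lock, which would be consistent with the xradio_bh thread spinning on a lock, although the log alone does not show which lock or what holds it.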
Tido Posted May 29, 2019

If you are familiar with IRC, I would carefully try to get some hints over here: https://linux-sunxi.org/IRC
Tido Posted June 3, 2019

On 5/29/2019 at 2:16 AM, sunzone said:
test the driver to stop the kernel hang

Any luck so far?
sunzone (Author) Posted June 5, 2019

On 6/3/2019 at 5:10 PM, Tido said:
any luck so far ?

I was not familiar with IRC. However, I just asked about this problem on the #linux-sunxi channel. Let's see what replies I get.

In the meantime, I wrote a program to reboot the system if a channel change is detected. But still, sometimes the device doesn't boot after calling 'reboot -f'.

Another option is to use a Wi-Fi dongle as the client interface. I already tried this. But then again, the USB Wi-Fi module is sometimes not detected by the device at boot and has to be replugged, which is not ideal.
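The watchdog program itself was not posted. As an illustration only, a channel-change watcher along those lines might look like the Python sketch below. It assumes the iw tool is installed and that `iw dev <iface> info` prints a line of the form "channel 6 (2437 MHz), ..."; the interface name comes from the ifconfig output earlier in the thread, while the polling interval and all function names are hypothetical.

```python
import re
import subprocess
import time

IFACE = "wlan0_station"  # client interface name from the ifconfig output
POLL_SECONDS = 5         # hypothetical polling interval

# Assumed format of the channel line printed by `iw dev <iface> info`.
CHANNEL_RE = re.compile(r"^\s*channel\s+(\d+)\s+\(", re.MULTILINE)


def parse_channel(iw_info):
    """Return the channel number found in `iw dev <iface> info` output, or None."""
    match = CHANNEL_RE.search(iw_info)
    return int(match.group(1)) if match else None


def channel_changed(previous, current):
    """True only when both readings are known and they differ."""
    return previous is not None and current is not None and previous != current


def read_channel(iface):
    """Run iw and parse the current channel; None if it cannot be read."""
    result = subprocess.run(
        ["iw", "dev", iface, "info"],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        universal_newlines=True,
    )
    return parse_channel(result.stdout)


def watch(iface=IFACE):
    """Poll the channel and force a reboot on the first detected change."""
    last = read_channel(iface)
    while True:
        time.sleep(POLL_SECONDS)
        now = read_channel(iface)
        if channel_changed(last, now):
            subprocess.run(["reboot", "-f"])
            return
        if now is not None:
            last = now
```

Calling watch() as root would start the loop. Polling can only react after the channel has already changed, so a sketch like this may well lose the race against the hang, which fits the observation that the forced reboot did not always help.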
sunzone (Author) Posted July 24, 2019

Just a follow-up.

After changing the driver code in a few places and testing, without any in-depth understanding of this chip, I was not able to stop the kernel hang in the xradio Wi-Fi driver.

I noticed problems with xradio in 2 modes:
1) Using xradio in AP mode: kernel hang, and 'reboot -f' does not work
2) Using xradio in concurrent mode: kernel hang
In both cases, AP mode was involved in the problem.

After all the testing, I decided to stop using the xradio for good.
I stopped the xradio module from loading at boot by commenting out 'xradio_wlan' in the /etc/modules file.
I blacklisted the xradio driver by adding 'blacklist xradio_wlan' to the /etc/modprobe.d/blacklist.conf file.

After that there were no kernel hangs or 'reboot -f' problems.

I use an external Wi-Fi module now. I have tested about 8 Wi-Fi modules so far, and I will make another post with my results and experience with them.

Cheers!
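The two file edits described above can be sketched as plain text transformations. The Python below is a hedged illustration, not the exact procedure used in the thread: the helper names are hypothetical, and it works on file contents passed in as strings, so writing the results back to /etc/modules and /etc/modprobe.d/blacklist.conf (as root) is left to the reader.

```python
def comment_out_module(modules_text, module):
    """Comment out any line of /etc/modules content that names `module`."""
    out = []
    for line in modules_text.splitlines():
        if line.strip() == module:
            out.append("#" + line)
        else:
            out.append(line)
    return "".join(line + "\n" for line in out)


def add_blacklist(blacklist_text, module):
    """Append 'blacklist <module>' to blacklist.conf content if it is missing."""
    entry = "blacklist " + module
    lines = blacklist_text.splitlines()
    if entry not in (line.strip() for line in lines):
        lines.append(entry)
    return "".join(line + "\n" for line in lines)
```

Both helpers leave the text unchanged when run a second time, so applying them repeatedly should not stack up comment markers or duplicate blacklist entries.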