vzoltan Posted June 23, 2020

Armbianmonitor: http://ix.io/2pVH

I'm not sure what is cause and what is effect here, but what I saw today is weird.

OS:
Welcome to Armbian buster with Linux 4.9.219-meson64
Linux odroidn2 4.9.219-meson64 #1 SMP PREEMPT Wed Jun 3 12:42:50 CEST 2020 aarch64 GNU/Linux

Symptom: I recently purchased a Seagate external HDD and wanted to use it with the N2. I started replicating files from my other server over SFTP, and at the same time I started Midnight Commander to check the HDD content. As soon as I tried to change directory to /mnt/seagate, my console became unresponsive, the SFTP connection failed, and there wasn't really any way to get out of this situation.

For other reasons I had previously applied quirks to the drive, so it is NOT in UAS mode:

Bus 002 Device 006: ID 0bc2:331a Seagate RSS LLC

root@odroidn2:/home/vz# lsusb -t
/:  Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci-hcd/1p, 5000M
    |__ Port 1: Dev 2, If 0, Class=Hub, Driver=hub/4p, 5000M
        ...
        |__ Port 3: Dev 6, If 0, Class=Mass Storage, Driver=usb-storage, 5000M

(* Other times I managed to "repro" this stuck behavior just by changing directory and executing a fairly simple ls -l from the command prompt. I really don't know what's up with the USB storage handling...)

Then I started a new SSH session and executed armbianmonitor -u. There I saw only CPU0 working, the other cores were idling: http://ix.io/2pVH

Alright, somehow I managed to unmount the drive (I needed to use the -f switch) and mount it again, SFTP reconnected, and the file transfer started. There is an active transfer even right now, approx. 8MB/s, but still only CPU0 is working:

root@odroidn2:/home/vz# armbianmonitor -n
runtime network statistics: odroidn2
network interface: eth0
[tap 'd' to display column headings]
[tap 'z' to reset counters]
[use <ctrl-c> to exit]
[bps: bits/s, Mbps: megabits/s, pps: packets/s, MB: megabytes]

eth0   rx.stats__________________________________________  tx.stats__________________________________________
count       bps   Mbps  ư.Mbps    pps  ư.pps    Ʃ.MB          bps   Mbps  ư.Mbps    pps  ư.pps    Ʃ.MB
    1  69363392  69.36   69.36   5853   5853    8.26       991072    .99     .99   1827   1827     .11
    2  68955424  68.95   69.15   5827   5840   16.48       964480    .96     .97   1776   1801     .23

root@odroidn2:/home/vz# cat /proc/interrupts
            CPU0       CPU1       CPU2       CPU3       CPU4       CPU5
  10:          0          0          0          0          0          0  GIC-0  29 Level  arch_timer
  11:    6819558   33868946     983894     427047    2782528     266967  GIC-0  30 Level  arch_timer
  14:          0          0          0          0          0          0  GIC-0  92 Edge   Meson TimerF
  15:          3          0          0          0          0          0  GIC-0 192 Level  ffe40000.bifrost
  16:          0          0          0          0          0          0  GIC-0 193 Level  ffe40000.bifrost
  17:          0          0          0          0          0          0  GIC-0 194 Level  ffe40000.bifrost
  20:          3          0          0          0          0          0  GIC-0 241 Edge
  21:          1          0          0          0          0          0  GIC-0 242 Edge
  22:   62991952          0          0          0          0          0  GIC-0  40 Edge   eth0
  23:    1336993          0          0          0          0          0  GIC-0  62 Level  xhci-hcd:usb1
  24:          1          0          0          0          0          0  GIC-0  48 Level  amlogic_botg_detect
  25:          0          0          0          0          0          0  GIC-0  63 Level  dwc_otg, dwc_otg_pcd
  26:          0          0          0          0          0          0  GIC-0 232 Edge   meson-g12a-saradc
  27:          0          0          0          0          0          0  GIC-0  67 Edge   ff634800.p_tsensor
  28:          0          0          0          0          0          0  GIC-0  68 Edge   ff634c00.d_tsensor
  29:          0          0          0          0          0          0  GIC-0 247 Edge   ffd1d000.i2c
  31:      10078          0          0          0          0          0  GIC-0  71 Edge   ffd1c000.i2c
  33:          0          0          0          0          0          0  GIC-0 225 Edge   meson_uart
  41:          0          0          0          0          0          0  GIC-0  89 Edge   hdmitx
  43:          0          0          0          0          0          0  GIC-0 235 Edge   hdmi_aocecb
  45:          0          0          0          0          0          0  GIC-0 178 Edge   ge2d
  46:   17579444          0          0          0          0          0  GIC-0  35 Edge   vsync, osd-vsync
  47:          0          0          0          0          0          0  GIC-0 176 Edge   gdc
  54:   17579444          0          0          0          0          0  GIC-0 121 Edge   rdma
  55:          0          0          0          0          0          0  GIC-0  88 Edge   osd-vsync-viu2
  56:          5          0          0          0          0          0  GIC-0 223 Edge   meson-aml-mmc
  57:       4121          0          0          0          0          0  GIC-0 222 Edge   meson-aml-mmc
  59:          0          0          0          0          0          0  GIC-0  83 Edge   dmc_monitor
  69:          0          0          0          0          0          0  GIC-0  33 Edge   audiolocker
  70:          0          0          0          0          0          0  GIC-0  78 Edge   pre_di
  71:          0          0          0          0          0          0  GIC-0  72 Edge   post_di
  72:        255          0          0          0          0          0  GIC-0 228 Edge   ir-meson
IPI0:   84717127   63361081   17705172   10069318    6700169    4492021  Rescheduling interrupts
IPI1:      27291      52009      49225      28653      26197      24202  Function call interrupts
IPI2:          0          0          0          0          0          0  CPU stop interrupts
IPI3:          0          0          0          0          0          0  Timer broadcast interrupts
IPI4:    1238898      56645     442118     106174    1830596      58141  IRQ work interrupts
IPI5:          0          0          0          0          0          0  CPU wake-up interrupts
 Err:          0

How is that possible? eth0 is definitely under load (alright, far away from the gigabit limit, but still), and USB is also under load to write the received bytes to the disk... But it really seems like only CPU0 cares. Basically I'm a Windows guy for a looong time, but this is like unimaginable there, all the CPU cores do work and distribute the load... Maybe I just don't understand how this should work, but I thought it's better to report. Worst case I'll get educated on Linux.
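To put a number on the observation in the post above, here is a minimal Python sketch that reads text in the /proc/interrupts layout and reports how much of each interrupt was handled on CPU0. The helper names parse_interrupts and cpu_share are made up for this illustration, and the sample is a four-row excerpt of the listing in the post, not the full table.

```python
from typing import Dict, List


def parse_interrupts(text: str) -> Dict[str, List[int]]:
    """Map each IRQ label (e.g. '22' or 'IPI0') to its per-CPU counts."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    n_cpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    table: Dict[str, List[int]] = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        counts = rest.split()[:n_cpus]
        # Rows such as 'Err:' carry a single total, not per-CPU columns.
        if len(counts) == n_cpus and all(c.isdigit() for c in counts):
            table[label.strip()] = [int(c) for c in counts]
    return table


def cpu_share(counts: List[int], cpu: int = 0) -> float:
    """Fraction of one interrupt's total count that was handled on `cpu`."""
    total = sum(counts)
    return counts[cpu] / total if total else 0.0


# Excerpt of the output posted above (illustrative subset only).
SAMPLE = """\
            CPU0       CPU1       CPU2       CPU3       CPU4       CPU5
  11:    6819558   33868946     983894     427047    2782528     266967  GIC-0  30 Level  arch_timer
  22:   62991952          0          0          0          0          0  GIC-0  40 Edge   eth0
  23:    1336993          0          0          0          0          0  GIC-0  62 Level  xhci-hcd:usb1
 Err:          0
"""

if __name__ == "__main__":
    for irq, counts in parse_interrupts(SAMPLE).items():
        print(f"IRQ {irq}: {cpu_share(counts):.1%} handled on CPU0")
```

On a live system the same function could be fed the contents of /proc/interrupts instead of SAMPLE. For the excerpt, the eth0 and xhci-hcd rows come out entirely on CPU0, while the per-CPU arch_timer row is spread across all six cores.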
Igor Posted June 23, 2020

2 hours ago, vzoltan said: "Basically I'm a Windows guy for a looong time, but this is like unimaginable there, all the CPU cores do work and distribute the load..."

On Linux we have a daemon for this, but it doesn't work properly on ARM. Perhaps now it does ... would need to check. That's why Armbian has this: https://github.com/armbian/build/blob/master/packages/bsp/common/usr/lib/armbian/armbian-hardware-optimization but it seems it doesn't work.
vzoltan Posted June 29, 2020

On 6/23/2020 at 4:26 PM, Igor said: "but it seems it doesn't work."

Hello @Igor thank you for your comment, however I'm afraid I don't follow you: do you say this should work (i.e. all CPU cores handling interrupts) and for some reason buggy with Armbian's N2 distro?
Igor Posted June 29, 2020

30 minutes ago, vzoltan said: "I don't follow you"

Armbian is essentially a build system that makes Linux, which is distributed in a Debian / Ubuntu flavor.

31 minutes ago, vzoltan said: "do you say this should work (i.e. all CPU cores handling interrupts)"

AFAIK this feature works well on x86 hardware, while on ARM it doesn't work at all, it is broken (perhaps not anymore, IDK, need to check). Most distributions don't deal with hardware functions: they distribute Linux (the hw layer) as is under a different logo, with a changed desktop, ... We have covered this feature since the early days, but everything needs to be maintained to stay operational. Maintenance at current levels is already insanely expensive, and most users do not even notice that ...

24 minutes ago, vzoltan said: "for some reason buggy with Armbian's N2 distro?"

Did you see this problem fixed elsewhere?
tkaiser Posted June 30, 2020

18 hours ago, vzoltan said: "do you say this should work (i.e. all CPU cores handling interrupts) and for some reason buggy with Armbian's N2 distro?"

The reason is that nobody at Armbian cares any more about such low level stuff. The string 'meson-g12b' (N2's board family) is missing from the case construct in https://github.com/armbian/build/blob/master/packages/bsp/common/usr/lib/armbian/armbian-hardware-optimization so what you see is what is to be expected: all IRQ handling lands on cpu0, which makes a nice bottleneck. Some people think irqbalance would help, but at least in the past it was common knowledge that for stuff like storage or networking static IRQ affinity is the way to go.

BTW: You have massive filesystem corruption on the /var/log partition. As for your storage issues a simple web search for 'odroid n2 usb issues' might help.
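As a rough illustration of what static IRQ affinity means in practice: the kernel exposes a per-IRQ CPU bitmask at /proc/irq/<n>/smp_affinity, and writing a hex mask there (as root) pins that interrupt to the selected cores. The Python sketch below builds such a mask and writes it. The helper names affinity_mask and pin_irq and the PLAN mapping are hypothetical, the IRQ numbers 22 (eth0) and 23 (xhci-hcd:usb1) are simply the ones from the listing earlier in this thread, and the choice of target cores is arbitrary. This is not what armbian-hardware-optimization does line by line, only the underlying mechanism.

```python
from pathlib import Path
from typing import Iterable


def affinity_mask(cpus: Iterable[int]) -> str:
    """Return the hex bitmask used by smp_affinity (bit N set = CPU N allowed)."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError(f"invalid CPU index: {cpu}")
        mask |= 1 << cpu
    return format(mask, "x")


def pin_irq(irq: int, cpus: Iterable[int], proc_root: Path = Path("/proc")) -> Path:
    """Write the mask for `cpus` to <proc_root>/irq/<irq>/smp_affinity.

    Needs root on a real system; the kernel may reject the write for
    interrupts that cannot be moved.
    """
    target = proc_root / "irq" / str(irq) / "smp_affinity"
    target.write_text(affinity_mask(cpus) + "\n")
    return target


# Hypothetical plan: IRQ numbers taken from the listing in this thread,
# target cores chosen arbitrarily for illustration.
PLAN = {22: [1], 23: [2]}

if __name__ == "__main__":
    # Dry run: only show what would be written, do not touch /proc.
    for irq, cpus in PLAN.items():
        print(f"/proc/irq/{irq}/smp_affinity <- {affinity_mask(cpus)}")
```

IRQ numbers are not guaranteed to stay the same across kernels or boots, so a real script would look them up by name in /proc/interrupts first. The plain hex form used here is enough for a six-core board like the N2; systems with more than 32 CPUs use comma-separated 32-bit groups instead.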
tkaiser Posted June 30, 2020

https://github.com/armbian/build/commit/1a04b50674626cf0165c84ef463c2b9e3df07061#commitcomment-40258499
vzoltan Posted July 2, 2020

On 6/30/2020 at 8:05 AM, tkaiser said: "BTW: You have massive filesystem corruption on the /var/log partition. As for your storage issues a simple web search for 'odroid n2 usb issues' might help."

/var/log is on zram and it randomly throws errors; I really had more pressing issues to fix than that one... But thanks for the reminder! And I really thought that HK had fixed the USB issues. Honestly, purchasing this N2 and investing so much time to make it work was one of the worst decisions in my life, if not THE worst.