robrob Posted October 1, 2021

Hi guys,

I have been using my Helios64 for more than a year now, connected through the 1Gbps NIC. I have been reading about network bonding and getting both NICs up and running. The main goal is to run the bond in "Adaptive Transmit Load Balancing" mode (mode 5).

First I ran into the known issue of the 2.5Gbps NIC not keeping a consistent (and performant) speed, so I soldered the missing pin from the capacitor to the NIC (more on this later). I've checked the speed and it is now up to spec, very close to 1000Mbps (960-980Mbps).

But here comes the problem I'm facing: if I enable both NICs independently, the 2.5Gbps NIC does not even appear all the time. At some point I did get it to work, and that is when I ran the speed test.

I'm running current (stable) 5.10.63-rockchip64 #21.08.2 (and I have tried 5.13.x and 5.14.x with no more success).

In the logs I find these lines, which are a good indication that the 2.5Gbps NIC is having issues:

[50618.607406] usb 2-1.4: reset SuperSpeed Gen 1 USB device number 96 using xhci-hcd
[50618.735965] r8152 2-1.4:1.0 eth1: v2.15.0 (2021/04/15)
[50618.735975] r8152 2-1.4:1.0 eth1: This product is covered by one or more of the following patents: US6,570,884, US6,115,776, and US6,327,625.
[50675.867182] r8152 2-1.4:1.0 eth1: get_registers -19
[50675.867653] r8152 2-1.4:1.0 eth1: Get ether addr fail

So:

1.- Has anyone else seen these messages in their logs? If yes, has anyone been able to find a solution?
2.- The soldering went well, but the wire I used is a very thin one, "hook-up cable 32 AWG" (I will try to update with a pic). Could this be the source of the trouble? Can both NICs run at the same time (electrical constraints, power rail limits, etc.)?

Has anyone else attempted the soldering fix? Can anyone share a picture of their working setup?

Thanks!
Rob
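To see how often the NIC drops off the bus, the log lines quoted above can be tallied with a small script. This is only a sketch: the category names are made up for illustration, the patterns are taken from the excerpt in this post, and the sample text stands in for real `dmesg` output. The `-19` in `get_registers -19` is a negative errno value; 19 is ENODEV ("no such device"), which would be consistent with the USB device having disappeared at that moment.

```python
import errno
import re

# Patterns copied from the log excerpt in the post; the category names are
# hypothetical labels chosen for this sketch.
PATTERNS = {
    "usb_reset": re.compile(r"usb \S+: reset SuperSpeed"),
    "register_read_failed": re.compile(r"r8152 .*get_registers (-\d+)"),
    "mac_read_failed": re.compile(r"r8152 .*Get ether addr fail"),
}


def scan_r8152_events(log_text):
    """Return a list of (timestamp, category, detail) tuples."""
    events = []
    for line in log_text.splitlines():
        # Kernel log lines start with "[ seconds.micros]".
        m = re.match(r"\[\s*(\d+\.\d+)\]\s*(.*)", line.strip())
        if not m:
            continue
        ts, msg = float(m.group(1)), m.group(2)
        for category, pattern in PATTERNS.items():
            hit = pattern.search(msg)
            if not hit:
                continue
            detail = ""
            if hit.groups():
                # Turn "-19" into the symbolic errno name where known.
                code = -int(hit.group(1))
                detail = errno.errorcode.get(code, str(code))
            events.append((ts, category, detail))
    return events


SAMPLE = """\
[50618.607406] usb 2-1.4: reset SuperSpeed Gen 1 USB device number 96 using xhci-hcd
[50618.735965] r8152 2-1.4:1.0 eth1: v2.15.0 (2021/04/15)
[50675.867182] r8152 2-1.4:1.0 eth1: get_registers -19
[50675.867653] r8152 2-1.4:1.0 eth1: Get ether addr fail
"""

if __name__ == "__main__":
    for ts, category, detail in scan_r8152_events(SAMPLE):
        print(f"{ts:.6f} {category} {detail}".rstrip())
```

On a real system the text could come from `dmesg` or `journalctl -k` output saved to a file; counting `usb_reset` events per hour before and after the hardware mod would give a rough measure of whether the fix helped.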
flower Posted October 1, 2021

If you just buy a 2.5GbE switch, you can use a single interface and get more speed than with 2x 1GbE.
fri.K Posted October 9, 2021

Hi,

I'm also facing issues teaming and bonding both NICs in the Helios64. A similar configuration works on the Helios4 (second NIC added via a USB 3.0 port).

As soon as the 1Gb link negotiates its speed, it goes down again and the interface cannot be brought back up:

$ dmesg
[ 3097.357372] rk_gmac-dwmac fe300000.ethernet eth0: Link is Up - 1Gbps/Full - flow control off
[ 3097.357423] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 3097.372171] rk_gmac-dwmac fe300000.ethernet eth0: Link is Down
[ 3097.379293] bond0: (slave eth0): Error: Device is in use and cannot be enslaved
[ 3097.412528] rk_gmac-dwmac fe300000.ethernet eth0: PHY [stmmac-0:00] driver [RTL8211F Gigabit Ethernet] (irq=POLL)
[ 3097.625230] rk_gmac-dwmac fe300000.ethernet: Failed to reset the dma
[ 3097.625246] rk_gmac-dwmac fe300000.ethernet eth0: stmmac_hw_setup: DMA engine initialization failed
[ 3097.625257] rk_gmac-dwmac fe300000.ethernet eth0: stmmac_open: Hw setup failed
[ 3097.676541] rk_gmac-dwmac fe300000.ethernet eth0: PHY [stmmac-0:00] driver [RTL8211F Gigabit Ethernet] (irq=POLL)
[ 3097.885517] rk_gmac-dwmac fe300000.ethernet: Failed to reset the dma
[ 3097.885538] rk_gmac-dwmac fe300000.ethernet eth0: stmmac_hw_setup: DMA engine initialization failed
[ 3097.885554] rk_gmac-dwmac fe300000.ethernet eth0: stmmac_open: Hw setup failed
[ 3097.944560] rk_gmac-dwmac fe300000.ethernet eth0: PHY [stmmac-0:00] driver [RTL8211F Gigabit Ethernet] (irq=POLL)
[ 3098.156236] rk_gmac-dwmac fe300000.ethernet: Failed to reset the dma
[ 3098.156253] rk_gmac-dwmac fe300000.ethernet eth0: stmmac_hw_setup: DMA engine initialization failed
[ 3098.156264] rk_gmac-dwmac fe300000.ethernet eth0: stmmac_open: Hw setup failed
[ 3098.204543] rk_gmac-dwmac fe300000.ethernet eth0: PHY [stmmac-0:00] driver [RTL8211F Gigabit Ethernet] (irq=POLL)
[ 3098.410564] rk_gmac-dwmac fe300000.ethernet: Failed to reset the dma
[ 3098.410581] rk_gmac-dwmac fe300000.ethernet eth0: stmmac_hw_setup: DMA engine initialization failed
[ 3098.410592] rk_gmac-dwmac fe300000.ethernet eth0: stmmac_open: Hw setup failed
[ 3098.556619] rk_gmac-dwmac fe300000.ethernet eth0: PHY [stmmac-0:00] driver [RTL8211F Gigabit Ethernet] (irq=POLL)
[ 3098.768443] rk_gmac-dwmac fe300000.ethernet: Failed to reset the dma
[ 3098.768459] rk_gmac-dwmac fe300000.ethernet eth0: stmmac_hw_setup: DMA engine initialization failed
[ 3098.768465] rk_gmac-dwmac fe300000.ethernet eth0: stmmac_open: Hw setup failed

Looking at https://wiki.kobol.io/helios64/ethernet:

Quote: "LAN 1 is the native Gigabit Ethernet interface from SoC RK3399. The interfaces is exposed through the Ethernet transceiver RTL8211F connected to RK3399 via RGMII."

$ ethtool -i eth0
driver: st_gmac
version: Jan_2016
firmware-version:
expansion-rom-version:
bus-info:
supports-statistics: yes
supports-test: no
supports-eeprom-access: no
supports-register-dump: yes
supports-priv-flags: no

$ lshw
*-network:0
    description: Ethernet interface
    physical id: 7
    logical name: eth0
    serial: 64:62:FF:FF:FF:FF
    capacity: 1Gbit/s
    capabilities: ethernet physical tp mii 10bt 10bt-fd 100bt 100bt-fd 1000bt-fd autonegotiation
    configuration: autonegotiation=on broadcast=yes driver=st_gmac driverversion=Jan_2016 link=no multicast=yes port=twisted pair

$ ethtool eth0
Settings for eth0:
    Supported ports: [ TP MII ]
    Supported link modes: 10baseT/Half 10baseT/Full 100baseT/Half 100baseT/Full 1000baseT/Full
    Supported pause frame use: Symmetric Receive-only
    Supports auto-negotiation: Yes
    Supported FEC modes: Not reported
    Advertised link modes: 10baseT/Half 10baseT/Full 100baseT/Half 100baseT/Full 1000baseT/Full
    Advertised pause frame use: Symmetric Receive-only
    Advertised auto-negotiation: Yes
    Advertised FEC modes: Not reported
    Speed: Unknown!
    Duplex: Unknown! (255)
    Port: Twisted Pair
    PHYAD: 0
    Transceiver: internal
    Auto-negotiation: on
    MDI-X: Unknown
Cannot get wake-on-lan settings: Operation not permitted
    Current message level: 0x0000003f (63)
                           drv probe link timer ifdown ifup
    Link detected: no

$ uname -a
Linux nas 5.10.63-rockchip64 #21.08.2 SMP PREEMPT Wed Sep 8 10:57:23 UTC 2021 aarch64 GNU/Linux

Is this a kernel bug? If so, how can I debug it further?
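One thing that may be worth checking when the bonding driver reports "Device is in use and cannot be enslaved" is whether eth0 is already attached to another master (a bridge, team, or a previous bond) at that moment. The sketch below reads the standard sysfs attributes under `/sys/class/net/<iface>/`; the function name and the returned dictionary layout are made up for illustration, and it is not claimed that this is the cause of the failure above.

```python
import os


def link_state(iface, sysfs_root="/sys/class/net"):
    """Collect a few link attributes for one interface from sysfs."""
    base = os.path.join(sysfs_root, iface)

    def read(name):
        try:
            with open(os.path.join(base, name)) as fh:
                return fh.read().strip()
        except OSError:
            # Some attributes (for example "speed" or "carrier") cannot be
            # read while the interface is down, or may not exist at all.
            return None

    # When an interface is enslaved, "master" is a symlink to the master device.
    master_link = os.path.join(base, "master")
    master = None
    if os.path.islink(master_link):
        master = os.path.basename(os.readlink(master_link))

    return {
        "operstate": read("operstate"),
        "carrier": read("carrier"),
        "speed": read("speed"),
        "master": master,
    }


if __name__ == "__main__":
    for name in ("eth0", "eth1"):
        print(name, link_state(name))
```

If `master` is already set to something other than the intended bond, releasing the interface from that master first would be the obvious next step; if it is `None`, the repeated "Failed to reset the dma" messages point more towards the driver or hardware side.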
fri.K Posted October 12, 2021

The problem looks very similar to one seen on the Odroid N2 SBC, which uses the same stmmac (DWMAC) Ethernet driver family and RTL8211F PHY. Note that the N2 is built on an Amlogic SoC rather than the RK3399, as the meson8b-dwmac prefix in its log shows:

[ 3.946307] kernel: meson8b-dwmac ff3f0000.ethernet: IRQ eth_wake_irq not found
[ 3.947735] kernel: meson8b-dwmac ff3f0000.ethernet: IRQ eth_lpi not found
[ 3.959955] kernel: meson8b-dwmac ff3f0000.ethernet: PTP uses main clock
[ 3.960322] kernel: meson8b-dwmac ff3f0000.ethernet: no reset control found
[ 3.987067] kernel: meson8b-dwmac ff3f0000.ethernet: User ID: 0x11, Synopsys ID: 0x37
[ 3.988569] kernel: meson8b-dwmac ff3f0000.ethernet: DWMAC1000
[ 3.993813] kernel: meson8b-dwmac ff3f0000.ethernet: DMA HW capability register supported
[ 4.008458] kernel: meson8b-dwmac ff3f0000.ethernet: RX Checksum Offload Engine supported
[ 4.017824] kernel: meson8b-dwmac ff3f0000.ethernet: COE Type 2
[ 4.038408] kernel: meson8b-dwmac ff3f0000.ethernet: TX Checksum insertion supported
[ 4.045460] kernel: meson8b-dwmac ff3f0000.ethernet: Wake-Up On Lan supported
[ 4.056526] kernel: meson8b-dwmac ff3f0000.ethernet: Normal descriptors
[ 4.070716] kernel: meson8b-dwmac ff3f0000.ethernet: Ring mode enabled
[ 4.076492] kernel: meson8b-dwmac ff3f0000.ethernet: Enable RX Mitigation via HW Watchdog Timer
[ 4.076496] kernel: meson8b-dwmac ff3f0000.ethernet: device MAC address 00:11:22:33:44:FF
[ 97.207023] kernel: meson8b-dwmac ff3f0000.ethernet eth0: PHY [0.0:00] driver [RTL8211F Gigabit Ethernet] (irq=42)
[ 97.209362] kernel: meson8b-dwmac ff3f0000.ethernet eth0: No Safety Features support found
[ 97.209373] kernel: meson8b-dwmac ff3f0000.ethernet eth0: PTP not supported by HW
[ 97.209582] kernel: meson8b-dwmac ff3f0000.ethernet eth0: configuring for phy/rgmii link mode
[ 100.670345] kernel: meson8b-dwmac ff3f0000.ethernet eth0: Link is Up - 1Gbps/Full - flow control off
[ 100.677458] kernel: meson8b-dwmac ff3f0000.ethernet eth0: Link is Down
[ 100.819033] kernel: meson8b-dwmac ff3f0000.ethernet eth0: PHY [0.0:00] driver [RTL8211F Gigabit Ethernet] (irq=42)
[ 100.861699] kernel: meson8b-dwmac ff3f0000.ethernet eth0: No Safety Features support found
[ 100.861714] kernel: meson8b-dwmac ff3f0000.ethernet eth0: PTP not supported by HW
[ 100.861727] kernel: meson8b-dwmac ff3f0000.ethernet eth0: configuring for phy/rgmii link mode

If I manually try to activate a slightly different USB network card on the Helios64, I get this:

# nmcli conn up Team0
Connection successfully activated (master waiting for slaves) (D-Bus active path: /org/freedesktop/NetworkManager/ActiveConnection/8)
root@helios64:~#
[179508.201299] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[179508.201798] Modules linked in: macvlan veth nf_conntrack_netlink xfrm_user xfrm_algo br_netfilter bridge aufs team_mode_loadbalance team governor_performance rfkill cdc_ether usbnet xt_conntrack nft_counter nft_chain_nat xt_nat xt_tcpudp zram xt_MASQUERADE nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_addrtype nft_compat nf_tables nfnetlink r8152 snd_soc_hdmi_codec ftdi_sio usbserial snd_soc_rockchip_i2s snd_soc_core gpio_charger rockchip_vdec(C) snd_pcm_dmaengine hantro_vpu(C) rockchipdrm snd_pcm leds_pwm pwm_fan dw_mipi_dsi snd_timer panfrost rockchip_rga v4l2_h264 dw_hdmi snd videobuf2_dma_contig analogix_dp soundcore videobuf2_dma_sg gpu_sched v4l2_mem2mem fusb302 drm_kms_helper videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 tcpm videobuf2_common cec videodev rc_core typec mc drm sg drm_panel_orientation_quirks gpio_beeper cpufreq_dt nfsd auth_rpcgss nfs_acl lockd grace ledtrig_netdev lm75 sunrpc ip_tables x_tables autofs4 raid10 raid1 raid0 multipath linear raid456
[179508.201952] async_raid6_recov async_memcpy async_pq async_xor async_tx md_mod uas realtek dwmac_rk stmmac_platform stmmac pcs_xpcs adc_keys
[179508.210617] CPU: 4 PID: 1029 Comm: kworker/4:42 Tainted: G C 5.10.63-rockchip64 #21.08.2
[179508.211445] Hardware name: Helios64 (DT)
[179508.211809] Workqueue: usb_hub_wq hub_event
[179508.212187] pstate: 00000005 (nzcv daif -PAN -UAO -TCO BTYPE=--)
[179508.212732] pc : rtl8152_post_reset+0x1d4/0x1e0 [r8152]
[179508.213201] lr : rtl8152_post_reset+0x3c/0x1e0 [r8152]
[179508.213658] sp : ffff80001958bbd0
[179508.213959] x29: ffff80001958bbd0 x28: ffff00000754f800
[179508.214435] x27: 0000000000000010 x26: 0000000000000000
[179508.214911] x25: ffff000007610b20 x24: 00000000ffffffed
[179508.215387] x23: 0000000000000000 x22: ffff000007885000
[179508.215863] x21: ffff000007464800 x20: ffff8000118b9948
[179508.216339] x19: ffff000007885980 x18: ffff8000118dee10
[179508.216815] x17: 0000000000000000 x16: 0000000000000000
[179508.217291] x15: 0000000000000428 x14: ffff80001958b790
[179508.217765] x13: 00000000ffffffea x12: ffff80001194ee48
[179508.218241] x11: 0000000000000003 x10: ffff800011936e08
[179508.218716] x9 : ffff800011936e60 x8 : 0000000000017fe8
[179508.219192] x7 : c0000000ffffefff x6 : 0000000000000001
[179508.219668] x5 : 0000000000000001 x4 : 0000000000000000
[179508.220144] x3 : 0000000000000001 x2 : b9eafe8d9839b500
[179508.220619] x1 : 0000000000000000 x0 : 0000000000000010
[179508.221096] Call trace:
[179508.221324] rtl8152_post_reset+0x1d4/0x1e0 [r8152]
[179508.221761] usb_reset_device+0x128/0x258
[179508.222122] hub_event+0xe94/0x1618
[179508.222439] process_one_work+0x1ec/0x4d0
[179508.222801] worker_thread+0x48/0x478
[179508.223133] kthread+0x140/0x150
[179508.223428] ret_from_fork+0x10/0x34
[179508.223752] Code: d2800100 910042a3 14009623 17ffffdc (d4210000)
[179508.224297] ---[ end trace 087e94c7a84ef6dd ]---
[179508.224711] note: kworker/4:42[1029] exited with preempt_count 1
[179508.225383] ------------[ cut here ]------------
[179508.225812] WARNING: CPU: 4 PID: 0 at kernel/rcu/tree.c:624 rcu_eqs_enter.isra.63+0x138/0x140
[179508.226564] Modules linked in: macvlan veth nf_conntrack_netlink xfrm_user xfrm_algo br_netfilter bridge aufs team_mode_loadbalance team governor_performance rfkill cdc_ether usbnet xt_conntrack nft_counter nft_chain_nat xt_nat xt_tcpudp zram xt_MASQUERADE nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_addrtype nft_compat nf_tables nfnetlink r8152 snd_soc_hdmi_codec ftdi_sio usbserial snd_soc_rockchip_i2s snd_soc_core gpio_charger rockchip_vdec(C) snd_pcm_dmaengine hantro_vpu(C) rockchipdrm snd_pcm leds_pwm pwm_fan dw_mipi_dsi snd_timer panfrost rockchip_rga v4l2_h264 dw_hdmi snd videobuf2_dma_contig analogix_dp soundcore videobuf2_dma_sg gpu_sched v4l2_mem2mem fusb302 drm_kms_helper videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 tcpm videobuf2_common cec videodev rc_core typec mc drm sg drm_panel_orientation_quirks gpio_beeper cpufreq_dt nfsd auth_rpcgss nfs_acl lockd grace ledtrig_netdev lm75 sunrpc ip_tables x_tables autofs4 raid10 raid1 raid0 multipath linear raid456
[179508.226715] async_raid6_recov async_memcpy async_pq async_xor async_tx md_mod uas realtek dwmac_rk stmmac_platform stmmac pcs_xpcs adc_keys
[179508.235375] CPU: 4 PID: 0 Comm: swapper/4 Tainted: G D C 5.10.63-rockchip64 #21.08.2
[179508.236156] Hardware name: Helios64 (DT)
[179508.236511] pstate: 20000085 (nzCv daIf -PAN -UAO -TCO BTYPE=--)
[179508.237046] pc : rcu_eqs_enter.isra.63+0x138/0x140
[179508.237474] lr : rcu_eqs_enter.isra.63+0x1c/0x140
[179508.237894] sp : ffff800011cebf10
[179508.238194] x29: ffff800011cebf10 x28: 0000000000000000
[179508.238671] x27: 0000000000000000 x26: ffff000000710e80
[179508.239147] x25: 0000000000000000 x24: ffff800011307b40
[179508.239623] x23: ffff80001157e978 x22: ffff8000118b9948
[179508.240097] x21: ffff8000118ba2e8 x20: ffff8000118b99c8
[179508.240573] x19: ffff800011580a40 x18: 0000000000000004
[179508.241049] x17: 0000000000000001 x16: 0000000000000019
[179508.241524] x15: ffff8000118da498 x14: 000000000000015a
[179508.242000] x13: 0000000000000000 x12: 0000000000000001
[179508.242476] x11: 0000000000000040 x10: ffff8000118d9c98
[179508.242952] x9 : ffff8000118d9c90 x8 : ffff000000800028
[179508.243428] x7 : 0000000000000000 x6 : 000003eb30cab0ae
[179508.243904] x5 : 00ffffffffffffff x4 : ffff8000e6226000
[179508.244380] x3 : 0000000000000001 x2 : 4000000000000000
[179508.244854] x1 : 4000000000000002 x0 : ffff0000f77a6a40
[179508.245330] Call trace:
[179508.245556] rcu_eqs_enter.isra.63+0x138/0x140
[179508.245956] rcu_idle_enter+0x10/0x20
[179508.246286] default_idle_call+0x40/0x1bc
[179508.246646] do_idle+0x204/0x278
[179508.246939] cpu_startup_entry+0x24/0x60
[179508.247295] secondary_start_kernel+0x168/0x178
[179508.247700] ---[ end trace 087e94c7a84ef6de ]---
robrob Posted October 15, 2021 (Author)

Hi fri.K,

It definitely looks like a driver issue (something goes wrong in rtl8152_post_reset). I will try to debug further based on your information.

I'm also wondering whether there is an issue with power consumption when using both NICs (I attached the image of my fix for 1Gbps). Could the wire gauge (AWG) and length be an issue?

I will also try an external USB RTL8152 dongle to see whether the same driver works with it.
robrob Posted November 26, 2021 (Author)

I have been trying the edge kernels for a while now, and I have found that after the HW mod (above) network bonding has started to work. Not consistently, but once it is set up, the next reboot actually enables the full bond.

Currently I'm running 5.15.4; I had to rebuild ZFS, but that was it.

Packages I've upgraded:

- linux-dtb-edge-rockchip64/focal 21.11.0-trunk.65 arm64
- linux-headers-edge-rockchip64/focal 21.11.0-trunk.65 arm64
- linux-image-edge-rockchip64/focal 21.11.0-trunk.65 arm64
- Package: zfs-dkms
- Version: 2.1.1-0york0~20.04
- Priority: optional
- Section: kernel
- Source: zfs-linux

$ cat /proc/net/bonding/nm-bond0
Ethernet Channel Bonding Driver: v5.15.4-rockchip64

Bonding Mode: transmit load balancing
Primary Slave: None
Currently Active Slave: eth1
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0
Peer Notification Delay (ms): 0

Slave Interface: eth1
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 64:62:66:xx:xx:xx
Slave queue ID: 0

Slave Interface: eth0
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 64:62:66:xx:xx:xx
Slave queue ID: 0
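Since the bond only comes up inconsistently, it can be handy to check the state after each boot automatically. The following is a rough sketch that parses text in the `/proc/net/bonding/<bond>` format shown above; the function names are made up, and the sample is an abbreviated copy of the output in this post.

```python
def parse_bonding(text):
    """Split bonding status text into (bond_settings, list_of_slaves)."""
    bond = {}
    slaves = []
    current = bond
    for raw in text.splitlines():
        line = raw.strip()
        if ":" not in line:
            continue
        # Split on the first colon only, so MAC addresses stay intact.
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Slave Interface":
            # Everything after this line belongs to the new slave section.
            current = {"name": value}
            slaves.append(current)
        else:
            current[key] = value
    return bond, slaves


def all_slaves_up(slaves, expected=2):
    """True if the expected number of slaves is present and each reports MII up."""
    return len(slaves) == expected and all(
        s.get("MII Status") == "up" for s in slaves
    )


SAMPLE = """\
Ethernet Channel Bonding Driver: v5.15.4-rockchip64

Bonding Mode: transmit load balancing
Currently Active Slave: eth1
MII Status: up

Slave Interface: eth1
MII Status: up
Speed: 1000 Mbps
Permanent HW addr: 64:62:66:xx:xx:xx

Slave Interface: eth0
MII Status: up
Speed: 1000 Mbps
Permanent HW addr: 64:62:66:xx:xx:xx
"""

if __name__ == "__main__":
    bond, slaves = parse_bonding(SAMPLE)
    print(bond["Bonding Mode"], [s["name"] for s in slaves], all_slaves_up(slaves))
```

On the real system the text would be read from `/proc/net/bonding/nm-bond0` instead of `SAMPLE`; a missing eth1 section after a reboot would match the "NIC does not appear" symptom from the first post.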
ebin-dev Posted November 26, 2021

On 10/1/2021 at 7:57 PM, flower said: "if you just buy one 2.5GBe switch you can use only one interface and get more speed than with 2x1GBe"

@flower The 2.5G interface (eth1) indeed operates absolutely reliably when connected to a 2.5G switch, even without the hardware mod and using current Armbian Bullseye (kernel downgraded to 5.10.43).

My interfaces are managed by systemd-networkd; the config files for the bridged setup are shown below. With this configuration the interfaces receive IPv4 and link-local IPv6 addresses.

Net transfer rates are around 255 MBytes/s (a nice alternative to bonding two 1Gig Ethernet interfaces).

Armbian 21.08.6 Bullseye with Linux 5.10.43-rockchip64

# cd /etc/systemd/network/
# ls -la
total 20
drwxr-xr-x 2 root root 4096 Nov 16 14:42 .
drwxr-xr-x 6 root root 4096 Nov 26 07:00 ..
-rw-r--r-- 1 root root 30 Nov 26 2020 br0.netdev
-rw-r--r-- 1 root root 233 Oct 10 19:27 br0.network
-rw-r--r-- 1 root root 40 Nov 16 14:42 eth0.network

# cat br0.netdev
[NetDev]
Name=br0
Kind=bridge

# cat br0.network
[Match]
Name=br0

[Network]
IPForward=yes
DHCP=no
Address=192.168.xxx.xxx/24
Gateway=192.168.xxx.xxx
DNS=192.168.xxx.xxx

# cat eth0.network
[Match]
Name=eth1

[Network]
Bridge=br0

# networkctl
IDX LINK       TYPE     OPERATIONAL SETUP
  1 lo         loopback carrier     unmanaged
  2 eth0       ether    off         unmanaged
  3 br0        bridge   routable    configured
  4 eth1       ether    enslaved    configured
  5 lxcbr0     bridge   no-carrier  unmanaged
  6 vethpivccu ether    degraded    unmanaged

6 links listed.
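To put the 255 MBytes/s figure in context, here is the back-of-the-envelope arithmetic as a short sketch. It uses raw line rates with decimal units and ignores Ethernet and TCP overhead, so real-world ceilings are somewhat lower; the function name is made up.

```python
def line_rate_mbytes(gbps):
    """Raw line rate in MByte/s for a link speed given in Gbit/s (decimal units)."""
    return gbps * 1000 / 8


OBSERVED_MBYTES = 255  # transfer rate reported in the post above

if __name__ == "__main__":
    for label, gbps in (("1GbE", 1.0), ("2.5GbE", 2.5)):
        rate = line_rate_mbytes(gbps)
        share = OBSERVED_MBYTES / rate
        print(f"{label}: {rate:.1f} MByte/s raw, observed is {share:.0%} of that")
```

The observed rate is roughly twice the raw limit of a single 1GbE link (125 MByte/s) and about 82% of the 2.5GbE line rate. With transmit load balancing, a single connection is generally carried by one slave at a time, so one 2.5GbE link can be faster for a single large transfer than two bonded 1GbE links.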