Armbian_User Posted October 4, 2020 (edited)

Armbianmonitor: http://ix.io/2zH6

Upgrading cubox-i Armbian Buster from kernel 5.7.x to kernel 5.8.y breaks ethernet, and I'm unable to get it connected again. It works fine again after downgrading back to 5.7.y. This occurs on multiple cubox-i devices.

dmesg | grep eth0
[ 4.667327] fec 2188000.ethernet eth0: registered PHC device 0
[ 22.369472] fec 2188000.ethernet eth0: Unable to connect to phy

nmcli
wlan0: connected to sketch-wlan
    "Broadcom BCM4330"
    wifi (brcmfmac), 6C:AD:F8:1D:36:25, hw, mtu 1500
    ip4 default
    inet4 10.1.0.41/22
    route4 0.0.0.0/0
    route4 10.1.0.0/22
    route4 169.254.0.0/16
...
eth0: unavailable
    "eth0"
    ethernet (fec), D0:63:B4:00:87:DD, hw, mtu 1500
...

nmcli con
NAME              UUID                                  TYPE      DEVICE
br-4ba53c1beb78   580cd8d0-4b87-432d-b072-7b2191fc3dc8  bridge    br-4ba53c1beb78
br-667404d55215   21ae2dad-0462-4880-98f2-8b56ae09dafa  bridge    br-667404d55215
br-7d46b681eda0   3dc1b651-2795-45fe-9821-7b33111d038c  bridge    br-7d46b681eda0
my-wlan           5d87696c-418b-404b-a6be-a85fd10c89cf  wifi      wlan0
Armbian ethernet  0a5bb1f6-799d-476f-9fe4-2ecc7f4fe055  ethernet  --

nmcli con up 'Armbian ethernet'
Error: Connection activation failed: No suitable device found for this connection (device eth0 not available because device has no carrier).

sudo ip a
...
2: eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether d0:63:b4:00:87:dd brd ff:ff:ff:ff:ff:ff
3: wlan0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 6c:ad:f8:1d:36:25 brd ff:ff:ff:ff:ff:ff
    inet 10.1.0.41/22 brd 10.1.3.255 scope global dynamic noprefixroute wlan0
       valid_lft 85910sec preferred_lft 85910sec
...

sudo ip link set eth0 up
RTNETLINK answers: No such device

EDIT: this is the link to armbianmonitor after downgrading back to 5.7.15, with ethernet (eth0) working once again: http://ix.io/2zHa

Edited October 4, 2020 by Armbian_User: Add link to diagnostics after downgrade back to working version
Igor Posted October 4, 2020
I would suspect a bug in Network Manager, since our Cubox-i in the auto-testing facility (with Ubuntu Focal based Armbian) shows no trouble, see no. 14: https://beta.armbian.com/autotest.html Could you try upgrading Network Manager by hand or switching to some other network tooling? Did you try a clean build?
Armbian_User Posted October 12, 2020 (Author)
On 10/4/2020 at 7:23 PM, Igor said: I would suspect bug in Network Manager app since our Cubox-i in auto-testing facility - with Ubuntu Focal based Armbian - shows no troubles - no. 14: https://beta.armbian.com/autotest.html Upgrading Network Manager by hand or switching to some other network tooling? Did you try clean build?

Hello Igor, thanks for your response. I've had a chance to try your suggestions. None worked for me.
Here is a link to the armbianmonitor output from a new, clean install using a fresh download of Armbian_20.08.1_Cubox-i_buster_current_5.8.5.img.xz: http://ix.io/2AvJ
and another after apt-get update && apt-get upgrade -y: http://ix.io/2AvL
Just to clarify, all my cubox-i's are the cubox-i4x4 variant and all exhibit this issue. Do you have any further suggestions I could try? Thanks
armbian-cubox-i Posted October 20, 2020
Hello, I'm experiencing exactly the same problem. The upgrade to kernel 5.8.x broke the ethernet on my Cubox-i, and a completely fresh image with the same kernel had the same issue. My solution for now was a clean install with kernel 5.7.15 (downgrading wasn't successful for me) and frozen kernel updates in armbian-config.
Kaaf Posted October 21, 2020
Same here:
after an apt upgrade -> No ethernet
Clean install of the latest image -> No ethernet
Clean install of an old image, 5.6 or so -> ETHERNET!!
I hope there will be a solution soon.
Armbian_User Posted November 6, 2020 (Author)
Anyone got any ideas how this might be resolved? Thanks
Igor Posted November 6, 2020
5 hours ago, Armbian_User said: Anyone got any ideas how this might be resolved? Thanks

Can you try building an image from sources? This is today's build, after upgrading to 5.9.y. Ethernet works; the device is a Cubox i4 (first variant, the only one I have, not 4x4): http://ix.io/2Dgd
Kaaf Posted November 16, 2020
I've installed the nightly on an i4x4 and now the ethernet keeps working. I will stay on that version until a stable 5.9 is available. Thanks for the good work!
Kaaf Posted December 14, 2020 (edited)
Apparently the issue is still not solved with version 5.9.14 of the kernel. After applying this kernel and rebooting, the ethernet was broken again. I'm back on the old 5.6 kernel. I hope there will be a permanent solution soon, because I would like my buster to be up to date.
Edited December 14, 2020 by Kaaf: wrong version
Igor Posted December 14, 2020
14 minutes ago, Kaaf said: Hope there will be a permanent solution

Probably the only permanent solution is constant maintenance, which will not come just like that: it takes dedication, a lot of free time and cash. We can't cover everything to the degree users would want. Most of the project costs already come out of our private budgets, and it is always a trade-off between our own goals / problems / families and solving the never-ending problems that bug users.

On 11/16/2020 at 9:11 AM, Kaaf said: Thanks for the good work!

Feels good! A (rare) compliment usually helps to move things on, but we are too low on resources and too swamped with core things to go back and fix things after some random problem in upstream Linux kills support for a certain device. It happens all the time ... We are asking people / you for help all the time so we could build things up. Both are coming slowly, but significantly slower than, and totally out of sync with, what you/users would want to have. How about helping the people who could perhaps look at this and fix it someday?
chrismade Posted December 20, 2020
This break happens when you update from
linux-dtb-current-imx6_20.08_armhf.deb 29-Aug-2020 20:02
linux-image-current-imx6_20.08_armhf.deb 29-Aug-2020 20:03
(still working, linux-5.7.15-imx6) to
linux-dtb-current-imx6_20.08.1_armhf.deb 30-Aug-2020 20:34
linux-image-current-imx6_20.08.1_armhf.deb 30-Aug-2020 20:34
(breaking on *some* cubox-i, linux-5.8.5-imx6).

The obvious reason for the breakage is that the kernel driver ./kernel/drivers/net/phy/at803x.ko is no longer loading. The driver exists, but manual loading via modprobe now results in an error message.

Comparing the two packages shows a difference in the file "modules.alias".

/lib/modules/5.7.15-imx6 (working):
alias mdio:00000000010011011101000001000001 at803x
alias mdio:000000000100110111010000011?0010 at803x
alias mdio:000000000100110111010000011?0100 at803x
alias mdio:000000000100110111010000011?0110 at803x

/lib/modules/5.8.5-imx6 (not working on *some* cubox):
alias mdio:00000000010011011101000001000001 at803x
alias mdio:00000000010011011101000001110010 at803x
alias mdio:00000000010011011101000000100011 at803x
alias mdio:00000000010011011101000001110100 at803x
alias mdio:000000000100110111010000011?0110 at803x

Not only does the number of lines differ, also check the question marks in the working config! It seems that the driver checks these HW keys in order to ensure the HW is really there, and if these keys do not match, the driver does not load and hence eth0 does not work. Depending on which exact type of cubox you have, it will load or not, which might explain why it works on Igor's cubox while others have issues.

Unfortunately just editing modules.alias won't fix it; we also need to generate a "modules.alias.bin" after the change. Does anyone know how to do this (so I can resume testing)?
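For anyone who wants to check whether a given PHY ID is covered by one of the alias lines quoted above without touching modules.alias, here is a minimal sketch. It assumes each alias after "mdio:" is a 32-bit pattern, most significant bit first, in which "?" matches either bit value, which is how the quoted lines read. The function names are made up for illustration, and 0x004dd072 is only an example ID, picked because its bit pattern appears in both alias lists.

```shell
#!/bin/sh
# Hypothetical helpers, not part of any Armbian or kernel tool.

# Convert a 32-bit PHY ID (hex, e.g. 0x004dd072) to a 32-character bit string.
phy_id_to_bits() {
    id=$(( $1 ))
    bits=""
    i=31
    while [ "$i" -ge 0 ]; do
        bits="${bits}$(( (id >> i) & 1 ))"
        i=$(( i - 1 ))
    done
    printf '%s\n' "$bits"
}

# Return 0 if bit string $1 matches alias pattern $2.
# $2 is left unquoted on purpose so that "?" acts as a single-character wildcard.
alias_matches() {
    case "$1" in
        $2) return 0 ;;
        *)  return 1 ;;
    esac
}

# Example: test one ID against one pattern from the 5.7.15 list.
if alias_matches "$(phy_id_to_bits 0x004dd072)" "000000000100110111010000011?0010"; then
    echo "ID is covered by this alias"
fi
```

As for the question itself: modules.alias and modules.alias.bin are both generated by depmod, so hand edits to modules.alias are overwritten the next time depmod runs for that kernel version.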
chrismade Posted December 22, 2020
I'm currently in contact with the maintainers of the "at803x" driver. My first assumption, that just a single line for the device ID was missing in the driver source (see my post above about modules.alias, which by the way is created by depmod), was unfortunately wrong. The driver for this device has been widely reworked from 5.7.x (rather simple, only recognizing AT8030 and AT8035) to 5.8.x (now rather complex and supporting many more devices from that family beyond 8030/8035).

Another finding is that NOT ALL cuboxes are impacted; Igor already reported that his regression box is still OK. From my collection, 2 out of 3 also work fine with kernel 5.8.x and higher.

This one has issues (stopped working after the upgrade to 5.8.x):
SolidRun i4P TV-300-D
while these are still working:
SolidRun 4x4 300-D
SolidRun i2EXW 300-D

If the last 3 bytes of the MAC address are simply incremented during production, then the one box having issues sits between the two which are working. The two which are working have WiFi and the one which has issues does not, but that does not fit the post above from "Armbian User", who reported issues on a WiFi version.

There might be one more thing. On the systems which work (on 5.8.x and beyond),
find /sys -name phy_id
reports
/sys/devices/platform/soc/2100000.bus/2188000.ethernet/mdio_bus/2188000.ethernet-1/2188000.ethernet-1:00/phy_id
while my system which breaks reports
/sys/devices/soc0/soc/2100000.bus/2188000.ethernet/mdio_bus/2188000.ethernet-1/2188000.ethernet-1:04/phy_id
which has two differences: after /devices/ we see /soc0/ instead of /platform/, and further along the path /2188000.ethernet-1:04/ instead of /2188000.ethernet-1:00/.

The device tree seems to be the same for all. There are small differences, and these seem to matter. If someone still reads this post, wants to contribute, and has a system which broke when upgrading 5.7.x > 5.8.x, *kindly check on your system* and post the results here for
find /sys -name phy_id
Sorting apart the systems which still work from those which break, and finding reliable criteria for which system belongs to which group, is always an important part of troubleshooting.
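To make the requested check easier to compare across boxes, here is a small sketch that prints the PHY address next to each phy_id file. It assumes the address is the part of the directory name after the last colon (":00" or ":04" in the paths above). The function names are hypothetical, and errors from find, such as the Input/output error on /sys/module/at803x/drivers, are simply discarded.

```shell
#!/bin/sh
# Hypothetical helpers for comparing systems; names are made up for this sketch.

# Print the MDIO address encoded in a phy_id sysfs path,
# e.g. .../2188000.ethernet-1:04/phy_id gives 04.
phy_addr_from_path() {
    dir=${1%/phy_id}              # strip the trailing file name
    printf '%s\n' "${dir##*:}"    # keep what follows the last colon
}

# List every PHY below a sysfs root (default /sys) as "address id path".
list_phys() {
    root=${1:-/sys}
    find "$root" -name phy_id 2>/dev/null | while IFS= read -r path; do
        printf '%s %s %s\n' "$(phy_addr_from_path "$path")" "$(cat "$path")" "$path"
    done
}
```

Running list_phys without arguments on a cubox should print one line per PHY, so an affected box can be recognised by the address in the first column.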
Armbian_User Posted December 23, 2020 (Author)
18 hours ago, chrismade said: if someone still reads this post ... and wants to contribute ... and has a system which broke from upgrading 5.7.x > 5.8.x *kindly check on your system* and post results here - for find /sys -name phy_id

I have this issue on 2 out of 3 cubox-i4x4-300-D models. All 3 have WiFi working successfully.

Here is the output from the working (5.9.14) system:
find /sys -name phy_id
/sys/devices/platform/soc/2100000.bus/2188000.ethernet/mdio_bus/2188000.ethernet-1/2188000.ethernet-1:00/phy_id
find: ‘/sys/module/at803x/drivers’: Input/output error

Here is the output from one of the broken systems (working on 5.7.15):
find /sys -name phy_id
/sys/devices/soc0/soc/2100000.bus/2188000.ethernet/mdio_bus/2188000.ethernet-1/2188000.ethernet-1:04/phy_id
find: ‘/sys/module/at803x/drivers’: Input/output error
chrismade Posted December 23, 2020
Thanks @Armbian_User, it seems we are getting closer to finding reliable criteria for which systems are affected. I see the same pattern on my cuboxes. Anyone else?
btw: the message
find: ‘/sys/module/at803x/drivers’: Input/output error
as well as the MM error / trace in dmesg is expected, or at least seems not to be related to this issue.
mwalle Posted December 24, 2020 (edited)
Hi, I'm trying to investigate this issue, but TBH it's hard to follow the results.

Quote: I have this issue on 2 out of 3 cubox-i4x4-300-D models

So you have exactly the same model and one of them is not working? Could you please share the output of:
find /sys -name phy_id
cat /proc/device-tree/model
dmesg
for these models when booting the 5.9.14 image?
-michael
Edited December 24, 2020 by mwalle
chrismade Posted December 24, 2020
There is another observation: those who started early on Armbian Debian Buster for cubox don't get any kernel updates, which might be another reason why the number of Armbian users reporting this issue is low. Early Armbian Debian Buster images receive updates for other packages and will be updated to 10.7 after "apt update && apt upgrade", but the kernel remains at 5.3.1 (forever, or until it is updated manually). The HW effect reported in this thread, that the PHY is usually on addr 00 while on _some_ hardware it seems to be on addr 04, was either ignored or somehow handled by the old driver until 5.7.x, so these users usually won't experience this issue (however, remaining on older kernels has risks, too).

Regarding the issue, it looks like during one period of production, neither at the beginning nor at the end, the PHY got a different addr (04 instead of 00). The new driver from 5.8.x onwards only expects addr 00 and hence does not work if the PHY is on addr 04.

I wonder if there is a serial number or similar; I haven't found anything like this yet. If there really is none, maybe the MAC address printed on the bottom helps us to identify which cuboxes have the PHY on addr 04? My only cubox having this addr on 04 instead of 00 has MAC = D0 63 B4 - 00 77 BB

@Armbian_User, or anyone else, can you please check if your affected systems are near that range (the last 3 bytes matter)?
umiddelb Posted December 25, 2020
root@mgmt:~# cat /etc/armbian-release
# PLEASE DO NOT EDIT THIS FILE
BOARD=cubox-i
BOARD_NAME="Cubox i2eX/i4"
BOARDFAMILY=cubox
BUILD_REPOSITORY_URL=https://github.com/armbian/build
BUILD_REPOSITORY_COMMIT=b9adf0ea-dirty
VERSION=5.98
LINUXFAMILY=cubox
BRANCH=next
ARCH=arm
IMAGE_TYPE=stable
BOARD_TYPE=conf
INITRD_ARCH=arm
KERNEL_IMAGE_TYPE=zImage
root@mgmt:~# uname -a
Linux mgmt 5.3.1-cubox #5.98 SMP Fri Sep 27 23:11:49 CEST 2019 armv7l armv7l armv7l GNU/Linux
root@mgmt:~# find /sys -name phy_id
/sys/devices/soc0/soc/2100000.aips-bus/2188000.ethernet/mdio_bus/2188000.ethernet-1/2188000.ethernet-1:04/phy_id
root@mgmt:~# ifconfig eth0
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        inet 10.1.0.1 netmask 255.255.255.0 broadcast 10.1.0.255
        inet6 fe80::4eee:3b87:e07b:7419 prefixlen 64 scopeid 0x20<link>
        ether d0:63:b4:00:83:30 txqueuelen 1000 (Ethernet)
        RX packets 3342175 bytes 606622753 (606.6 MB)
        RX errors 0 dropped 1 overruns 0 frame 0
        TX packets 2607316 bytes 244851799 (244.8 MB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
usual user Posted December 25, 2020
Just switched to 5.10.0. I noticed that the DTB has changed multiple node names. As a result, my user space could no longer apply all configuration quirks due to the changed paths in sysfs. Using a DTB from before 5.7.x resolved all observed regressions for me. Perhaps it is also worth trying in your case to run the current kernel with a pre-5.7.x DTB as a test.
armbian-cubox-i Posted December 26, 2020
Here are the details about my cubox i4, which also had network issues with 5.8.x. It is currently running smoothly with an older version. I hope this information helps somebody, as such driver issues go beyond my knowledge.

cat /etc/armbian-release
# PLEASE DO NOT EDIT THIS FILE
BOARD=cubox-i
BOARD_NAME="Cubox i2eX/i4"
BOARDFAMILY=imx6
BUILD_REPOSITORY_URL=https://github.com/armbian/build
BUILD_REPOSITORY_COMMIT=b9814056
DISTRIBUTION_CODENAME=buster
DISTRIBUTION_STATUS=supported
VERSION=20.11.3
LINUXFAMILY=imx6
BRANCH=current
ARCH=arm
IMAGE_TYPE=stable
BOARD_TYPE=conf
INITRD_ARCH=arm
KERNEL_IMAGE_TYPE=Image

uname -a
Linux <hostname> 5.7.15-imx6 #20.08 SMP Mon Aug 17 07:36:36 CEST 2020 armv7l GNU/Linux

find /sys -name phy_id
/sys/devices/soc0/soc/2100000.bus/2188000.ethernet/mdio_bus/2188000.ethernet-1/$
find: ‘/sys/module/at803x/drivers’: Input/output error

ifconfig eth0
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        inet <ip router> netmask 255.255.255.0 broadcast <ip cubox>
        inet6 2001:9e8:208c:6800:84e5:8994:7642:4da2 prefixlen 64 scopeid 0x0<global>
        inet6 fe80::3721:95e0:891b:d0ef prefixlen 64 scopeid 0x20<link>
        ether d0:63:b4:00:32:d9 txqueuelen 1000 (Ethernet)
        RX packets 17508632 bytes 3075857589 (2.8 GiB)
        RX errors 0 dropped 4 overruns 0 frame 0
        TX packets 6162802 bytes 803661074 (766.4 MiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
chrismade Posted January 12, 2021
I just successfully tested a device tree patch from one of the kernel developers which fixes the problem on newer kernels for all cuboxes which have the PHY on addr 0x04 instead of 0x00. Stay tuned.
chrismade Posted January 18, 2021
I haven't heard back from the kernel developers if/when we could expect a fix; I assume they are currently busy fixing other issues. If you are impacted, or afraid to update your cubox because you might be impacted, here is how you can fix it yourself by modifying the required device-tree file. Luckily the tool "dtc" should be in our Armbian image already, and this tool works in both directions: it can make binaries out of source and vice versa!

Here is how to check that "dtc" (the device-tree compiler) is available and is the required version:
cubox-i:~$ dtc -v
Version: DTC 1.4.7
cubox-i:~$

Next, read the device-tree binary (as non-root) from /boot/dtb/ and write the source into the current directory:
cubox-i:~$ dtc -I dtb -O dts -o imx6q-cubox-i.dts /boot/dtb/imx6q-cubox-i.dtb
Ignore this warning:
imx6q-cubox-i.dts: Warning (unique_unit_address): /soc/bus@2000000/iomuxc-gpr@20e0000: duplicate unit-address (also used in node /soc/bus@2000000/pinctrl@20e0000)
cubox-i:~$

Next, open the file in the editor of your choice and find this sequence (it should appear at approx. line 1250):
ethernet@2188000 {
    compatible = "fsl,imx6q-fec";
    reg = < 0x2188000 0x4000 >;
    interrupt-names = "int0\0pps";
    interrupts = < 0x00 0x76 0x04 0x00 0x77 0x04 >;
    clocks = < 0x02 0x75 0x02 0x75 0x02 0xbe >;
    clock-names = "ipg\0ahb\0ptp";
    fsl,stop-mode = < 0x01 0x34 0x1b >;
    status = "okay";
    pinctrl-names = "default";
    pinctrl-0 = < 0x2e >;
    phy-handle = < 0x2f >;
Delete the last line, "phy-handle = < 0x2f >;".

A few lines below you will find this block:
ethernet-phy@0 {
    reg = < 0x00 >;
    qca,clk-out-frequency = < 0x7735940 >;
    phandle = < 0x2f >;
};
Again, delete the line "phandle = < 0x2f >;".

Next, duplicate this block
ethernet-phy@0 {
    reg = < 0x00 >;
    qca,clk-out-frequency = < 0x7735940 >;
};
and modify the first two lines in the 2nd block to
ethernet-phy@4 {
    reg = < 0x04 >;
    qca,clk-out-frequency = < 0x7735940 >;
};
Both blocks then look like this:
ethernet-phy@0 {
    reg = < 0x00 >;
    qca,clk-out-frequency = < 0x7735940 >;
};
ethernet-phy@4 {
    reg = < 0x04 >;
    qca,clk-out-frequency = < 0x7735940 >;
};

Save the file and compile it to a binary:
cubox-i:~$ dtc -I dts -O dtb -o imx6q-cubox-i.dtb imx6q-cubox-i.dts
Unfortunately you will (again) see lots of warnings (really ... a lot):
imx6q-cubox-i.dtb: Warning (unique_unit_address): /soc/bus@2000000/iomuxc-gpr@20e0000: duplicate unit-address (also used in node /soc/bus@2000000/pinctrl@20e0000)
imx6q-cubox-i.dtb: Warning (clocks_property): /ldb:clocks: cell 0 is not a phandle reference
imx6q-cubox-i.dtb: Warning (clocks_property): /ldb:clocks: cell 2 is not a phandle reference
imx6q-cubox-i.dtb: Warning (clocks_property): /ldb:clocks: cell 4 is not a phandle reference
Ignore the warnings. I usually take warnings very seriously, but not in this case.

Now the fixed device tree is in the file imx6q-cubox-i.dtb in your current directory. Before you copy it (now as root) into /boot/dtb, you may want to _rename_ the old file, so that you could mount the SD card of your cubox on another computer and restore the original file if required.

With this modified (self-fixed) device tree, newer kernels should have ethernet regardless of whether your cubox has the PHY on addr #0 or #4.

Note: device-tree files can be used on various kernels from the same generation; between generations there might be breaking changes.
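Since the edit above is done by hand, a quick sanity check on the edited source before compiling might help. This is only a sketch: the function name is made up, and it assumes the ethernet node is the only place in this tree where a phy-handle property appears.

```shell
#!/bin/sh
# Hypothetical check, not part of dtc: return 0 if the DTS file in $1
# contains both PHY nodes and no remaining phy-handle property.
dts_is_patched() {
    dts=$1
    grep -q 'ethernet-phy@0 {' "$dts" || return 1
    grep -q 'ethernet-phy@4 {' "$dts" || return 1
    if grep -q 'phy-handle' "$dts"; then
        return 1    # the phy-handle line was not deleted
    fi
    return 0
}
```

For example, "dts_is_patched imx6q-cubox-i.dts && echo looks patched" could be run on the edited file before calling dtc.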
chrismade Posted January 18, 2021
Alternatively, and more elegantly, you can build that device tree from kernel sources. Download a 5.x.y kernel from kernel.org (I tested it with linux-5.10.6), unpack it and enter the source directory (cd linux-5.10.6 in my case).

This is the diff which I got from the kernel developer:
diff --git a/arch/arm/boot/dts/imx6qdl-sr-som.dtsi b/arch/arm/boot/dts/imx6qdl-sr-som.dtsi
index b06577808ff4..3db08363d3fb 100644
--- a/arch/arm/boot/dts/imx6qdl-sr-som.dtsi
+++ b/arch/arm/boot/dts/imx6qdl-sr-som.dtsi
@@ -53,7 +53,6 @@
 &fec {
     pinctrl-names = "default";
     pinctrl-0 = <&pinctrl_microsom_enet_ar8035>;
-    phy-handle = <&phy>;
     phy-mode = "rgmii-id";
     phy-reset-duration = <2>;
     phy-reset-gpios = <&gpio4 15 GPIO_ACTIVE_LOW>;
@@ -63,10 +62,15 @@
         #address-cells = <1>;
         #size-cells = <0>;
-        phy: ethernet-phy@0 {
+        ethernet-phy@0 {
             reg = <0>;
             qca,clk-out-frequency = <125000000>;
         };
+
+        ethernet-phy@4 {
+            reg = <4>;
+            qca,clk-out-frequency = <125000000>;
+        };
     };
 };

As you can see, the changes are applied to the include file "imx6qdl-sr-som.dtsi" in the directory "arch/arm/boot/dts/".

Getting the device-tree binary requires a step in between, because the device-tree source files make use of "#include", a directive which the device tree compiler does NOT understand, so the gcc preprocessor must help here:
cpp -nostdinc -I include -I arch -undef -x assembler-with-cpp arch/arm/boot/dts/imx6q-cubox-i.dts imx6q-cubox-i.dts.preprocessed
Then you can run "dtc":
dtc -I dts -O dtb -p 0x1000 imx6q-cubox-i.dts.preprocessed -o imx6q-cubox-i-new.dtb

The fixed device-tree binary is now in the file "imx6q-cubox-i-new.dtb", which needs to replace "imx6q-cubox-i.dtb" in /boot/dtb (again, consider renaming the original file instead of deleting or overwriting it).
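The advice to rename the original file rather than overwrite it can be wrapped in a small helper. This is a sketch under assumptions: the function name and the ".orig" suffix are made up for illustration, and on a real system it would have to run as root because it writes to /boot/dtb.

```shell
#!/bin/sh
# Hypothetical helper: install a new DTB while keeping the original.
# $1: path to the new DTB, $2: path of the DTB to replace.
install_dtb() {
    new=$1
    target=$2
    [ -f "$new" ] || { echo "missing $new" >&2; return 1; }
    # Keep the very first original; never overwrite an existing backup.
    if [ -f "$target" ] && [ ! -e "$target.orig" ]; then
        mv "$target" "$target.orig" || return 1
    fi
    cp "$new" "$target"
}
```

With the file names from the post, the call would be install_dtb imx6q-cubox-i-new.dtb /boot/dtb/imx6q-cubox-i.dtb. Restoring is then a matter of copying the ".orig" file back, for instance from another computer with the SD card mounted.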
chrismade Posted January 27, 2021
** update - received this feedback today:
The fix has been submitted, Sh___ G__* applied it, it should eventually end up in mainline soon and backported to stable kernels.
*) name modified to protect privacy
That's good news - keep your cubox(es) updated.
Armbian_User Posted January 27, 2021 (Author)
@chrismade That's great news. Thanks for your efforts on this!
usual user Posted January 30, 2021
On 1/18/2021 at 11:41 PM, chrismade said: I haven't heard back from kernel developers if/when we could expect a fix

The fix has landed in 5.11, but as I don't see any stable tag, you have to wait until Armbian moves to 5.11 or compose an Armbian PR for the patch to have it backported to older kernels.
chrismade Posted January 30, 2021
Thanks @usual user for looking this up - now we even have a version number. IMHO we can wait; anyone impacted by this issue and needing an urgent fix can modify the DT file following the instructions posted above.
Kaaf Posted May 25, 2021
I've been running the stable 5.10 buster for quite a while now with no issues. I consider it solved...hahah. Thanks for the good work. No need to wait for stable 5.11.