Junkman Posted November 24, 2023 (edited)

Hi, all. After I upgraded the system and rebooted, the secondary Ethernet port (i.e. the LAN port) is gone. The system info: https://paste.armbian.com/ekotoqigos

    nanopi-r4s:~:% ip a
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
           valid_lft forever preferred_lft forever
        inet6 ::1/128 scope host
           valid_lft forever preferred_lft forever
    2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
        link/ether 68:27:19:ad:02:09 brd ff:ff:ff:ff:ff:ff
        inet 192.168.100.180/24 brd 192.168.100.255 scope global dynamic noprefixroute eth0
           valid_lft 85963sec preferred_lft 85963sec
        inet6 fe80::ca55:3549:7a23:57fa/64 scope link noprefixroute
           valid_lft forever preferred_lft forever

    nanopi-r4s:~:% sudo dmesg | grep -i pcie
    [ 2.211030] rockchip-pcie f8000000.pcie: host bridge /pcie@f8000000 ranges:
    [ 2.211102] rockchip-pcie f8000000.pcie: MEM 0x00fa000000..0x00fbdfffff -> 0x00fa000000
    [ 2.211138] rockchip-pcie f8000000.pcie: IO 0x00fbe00000..0x00fbefffff -> 0x00fbe00000
    [ 2.212106] rockchip-pcie f8000000.pcie: no bus scan delay, default to 0 ms
    [ 2.212182] rockchip-pcie f8000000.pcie: no vpcie12v regulator found
    [ 2.713110] rockchip-pcie f8000000.pcie: PCIe link training gen1 timeout!
    [ 2.713187] rockchip-pcie: probe of f8000000.pcie failed with error -110

    nanopi-r4s:~:% uname -a
    Linux nanopi-r4s 6.1.50-current-rockchip64 #3 SMP PREEMPT Wed Aug 30 14:11:13 UTC 2023 aarch64 aarch64 aarch64 GNU/Linux

    nanopi-r4s:~:% cat /etc/os-release
    PRETTY_NAME="Armbian 23.8.1 jammy"
    NAME="Ubuntu"
    VERSION_ID="22.04"
    VERSION="22.04.3 LTS (Jammy Jellyfish)"
    VERSION_CODENAME=jammy
    ID=ubuntu
    ID_LIKE=debian
    HOME_URL="https://www.armbian.com"
    SUPPORT_URL="https://forum.armbian.com"
    BUG_REPORT_URL="https://www.armbian.com/bugs"
    PRIVACY_POLICY_URL="https://www.armbian.com"
    UBUNTU_CODENAME=jammy
    ARMBIAN_PRETTY_NAME="Armbian 23.8.1 jammy"

I've checked the following post. This bug should already be fixed, and I don't know why it reappeared. The kernel log shows the old message: `PCIe link training gen1 timeout!`. I would expect the new message to be something like: `PCIe link training gen1 timeout with x%d!\n`. It seems that the `drivers/pci/controller/pcie-rockchip-host.c` patch is not included in the latest NanoPi R4S jammy 6.1 kernel?

https://github.com/armbian/build/pull/4308/files
NanoPi R4s, enp1s0 ethernet device not showed up after reboot

Edited November 24, 2023 by Junkman
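The two message formats quoted above can be told apart mechanically, which gives a quick hint about whether a running kernel carries the patched host driver. This is only a sketch: the helper name `pcie_msg_variant` is made up for illustration, and it assumes the patched driver prints `timeout with x<N>!` exactly as in the format string quoted in the post.

```shell
# Hypothetical helper: read kernel log text on stdin and report which
# variant of the link-training timeout message it contains.
pcie_msg_variant() {
    _log=$(cat)
    if printf '%s\n' "$_log" | grep -Eq 'PCIe link training gen1 timeout with x[0-9]+!'; then
        echo patched
    elif printf '%s\n' "$_log" | grep -q 'PCIe link training gen1 timeout!'; then
        echo unpatched
    else
        echo no-timeout
    fi
}
```

Typical use would be `sudo dmesg | pcie_msg_variant`. A result of `no-timeout` only means neither message was logged; it says nothing about which driver version is installed.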
Junkman (Author) Posted November 24, 2023 (edited)

FYI, I burned the `rk3399-sd-ubuntu-focal-desktop-4.19-arm64-20230915.img.gz` to my SD card. https://wiki.friendlyelec.com/wiki/index.php/NanoPi_R4S#Flash_to_TF

All the Ethernet ports can be found and are working (tested). So I think this might be some bugs in the Armbian build.

    $ cat /etc/os-release
    NAME="Ubuntu"
    VERSION="20.04.6 LTS (Focal Fossa)"
    ID=ubuntu
    ID_LIKE=debian
    PRETTY_NAME="Ubuntu 20.04.6 LTS"
    VERSION_ID="20.04"
    HOME_URL="https://www.ubuntu.com/"
    SUPPORT_URL="https://help.ubuntu.com/"
    BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
    PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
    VERSION_CODENAME=focal
    UBUNTU_CODENAME=focal

    $ uname -a
    Linux NanoPi-R4S 4.19.193 #4 SMP PREEMPT Tue Sep 5 13:32:27 CST 2023 aarch64 aarch64 aarch64 GNU/Linux

    pi@NanoPi-R4S:~$ ip a
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
           valid_lft forever preferred_lft forever
        inet6 ::1/128 scope host
           valid_lft forever preferred_lft forever
    2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
        link/ether 68:27:19:ad:02:09 brd ff:ff:ff:ff:ff:ff
        inet 192.168.100.180/24 brd 192.168.100.255 scope global dynamic noprefixroute eth0
           valid_lft 85478sec preferred_lft 85478sec
        inet6 fe80::ca10:91c8:345b:dcc4/64 scope link noprefixroute
           valid_lft forever preferred_lft forever
    3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
        link/ether 66:a8:c1:46:84:83 brd ff:ff:ff:ff:ff:ff
        inet 192.168.100.181/24 brd 192.168.100.255 scope global dynamic noprefixroute eth1
           valid_lft 85926sec preferred_lft 85926sec
        inet6 fe80::73e6:a4c0:efc2:da25/64 scope link noprefixroute
           valid_lft forever preferred_lft forever

    $ curl -IL --interface eth0 https://1.1.1.1/dns-query
    HTTP/2 415
    server: cloudflare
    date: Fri, 24 Nov 2023 13:18:11 GMT
    access-control-allow-origin: *
    cf-ray: 82b1efb8a8d47d7c-LAX

    $ curl -IL --interface eth1 https://1.1.1.1/dns-query
    HTTP/2 415
    server: cloudflare
    date: Fri, 24 Nov 2023 13:18:15 GMT
    access-control-allow-origin: *
    cf-ray: 82b1efd0c9a67eb7-LAX

Edited November 24, 2023 by Junkman
Igor Posted November 24, 2023

7 minutes ago, Junkman said:
"So I think this might be some bugs in the Armbian build."

Nope, build is passing. There is nothing wrong with Armbian or its build framework. The problem is a general lack of understanding of how expensive it is to keep this cheap hardware working with a constantly moving kernel. https://docs.armbian.com/User-Guide_FAQ/#why-does-hardware-feature-xy-work-in-old-kernel-but-not-in-more-recent-one

This is open source. Nobody prevents anyone from researching and fixing the problem.
Junkman (Author) Posted November 25, 2023 (edited)

Hi, @Igor. Thanks for your reply. I totally understand the tradeoffs between maintaining a mainline kernel and carrying custom patches for specific boards. I wonder how I could manually apply the PCIe driver patch to the current kernel. Do I have to rebuild the whole kernel?

Edited November 25, 2023 by Junkman
Werner Posted November 25, 2023

5 hours ago, Junkman said:
"I wonder how I could manually apply the PCIe driver patch to the current kernel. Do I have to rebuild the whole kernel?"

Both are possible using the build framework: https://docs.armbian.com/Developer-Guide_Build-Preparation/

Check for the "kernel-patch" switch: https://docs.armbian.com/Developer-Guide_Build-Options/
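As a rough sketch of what that looks like in practice, the commands below call the build framework's `kernel-patch` switch and then rebuild only the kernel packages. Several things here are assumptions, not statements from this thread: that the current directory is a checkout of https://github.com/armbian/build, that `BOARD=nanopi-r4s` and `BRANCH=current` are the right values (inferred from the `6.1.50-current-rockchip64` kernel string above), and that the `kernel` command is still named this way in your framework version. Check the Build-Options page linked above before relying on it.

```shell
# Sketch only. Assumes the current directory is a checkout of armbian/build;
# BOARD and BRANCH are guesses based on the kernel string in this thread.
if [ -x ./compile.sh ]; then
    # Pauses so the kernel source can be edited by hand,
    # then saves the changes as a patch
    ./compile.sh kernel-patch BOARD=nanopi-r4s BRANCH=current
    # Rebuilds only the kernel packages, not a full image
    ./compile.sh kernel BOARD=nanopi-r4s BRANCH=current
else
    echo "compile.sh not found: run this from the armbian/build checkout" >&2
fi
```

The resulting kernel packages can then be installed on the running system instead of reflashing a whole image.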
andrewz1 Posted December 4, 2023 (edited)

Hi! The root cause is not the controller itself but the PCIe PHY. It is not properly initialized on module init. So when the device is cold-booted (after power-on), the PHY stays in the "factory" state, but after a warm reboot the PHY stays in the "power_on" state and prevents any link training. The patch https://github.com/armbian/build/pull/4308/files works because, after the first failure, the PHY is reset by the PCIe driver.

https://github.com/andrewz1/rk3399-pcie-phy is an updated PHY module plus a dtb overlay to switch to it (I just changed the name). This is just a proof of concept.

Patch attached (tested on the edge kernel): phy-rockchip-pcie.patch

Edited December 4, 2023 by andrewz1
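One way to observe the cold-boot versus warm-reboot difference described in this post is to count PCI Ethernet devices after each kind of boot. A minimal sketch, assuming the missing port is a PCIe-attached NIC (consistent with the `enp1s0` name in the linked thread title) and that sysfs exposes devices under `/sys/bus/pci/devices`; the helper name `count_pci_ethernet` is hypothetical.

```shell
# Hypothetical helper: count PCI functions whose class code marks them as
# Ethernet controllers (0x0200xx). The directory argument is optional and
# exists so the function can be pointed at a test tree.
count_pci_ethernet() {
    _root="${1:-/sys/bus/pci/devices}"
    _n=0
    for _dev in "$_root"/*; do
        # Skip anything without a readable class file (including an unmatched glob)
        [ -r "$_dev/class" ] || continue
        case "$(cat "$_dev/class")" in
            0x0200*) _n=$((_n + 1)) ;;
        esac
    done
    echo "$_n"
}
```

If the explanation above holds, `count_pci_ethernet` should print `0` on a boot where the PCIe probe failed with error -110, because no PCI bus is enumerated at all, and a non-zero count on a boot where link training succeeded.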
Igor Posted December 8, 2023

On 12/4/2023 at 10:49 AM, andrewz1 said:
"The root cause is not the controller itself but the PCIe PHY. It is not properly initialized on module init. So when the device is cold-booted (after power-on), the PHY stays in the "factory" state, but after a warm reboot the PHY stays in the "power_on" state and prevents any link training."

Thanks for the tip!

On 12/4/2023 at 10:49 AM, andrewz1 said:
"This is just a proof of concept."

Our current dirty hack is not even that, and it doesn't work anyway. You are welcome to remove that patch and add this one. Once it is merged, I can enable the hardware in automated testing to see the long-term effects.
andrewz1 Posted December 16, 2023 (Solution)

@Igor I created a pull request for this issue: https://github.com/armbian/build/pull/6057