nanopi-r4s, enp1s0 ethernet device not found after reboot


Solved by andrewz1

Posted (edited)

Hi, all. After I upgraded the system and rebooted, the second Ethernet port (i.e., the LAN port) is gone.

 

The system info: https://paste.armbian.com/ekotoqigos

 

nanopi-r4s:~:% ip a 
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 68:27:19:ad:02:09 brd ff:ff:ff:ff:ff:ff
    inet 192.168.100.180/24 brd 192.168.100.255 scope global dynamic noprefixroute eth0
       valid_lft 85963sec preferred_lft 85963sec
    inet6 fe80::ca55:3549:7a23:57fa/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever

nanopi-r4s:~:% sudo dmesg | grep -i pcie
[    2.211030] rockchip-pcie f8000000.pcie: host bridge /pcie@f8000000 ranges:
[    2.211102] rockchip-pcie f8000000.pcie:      MEM 0x00fa000000..0x00fbdfffff -> 0x00fa000000
[    2.211138] rockchip-pcie f8000000.pcie:       IO 0x00fbe00000..0x00fbefffff -> 0x00fbe00000
[    2.212106] rockchip-pcie f8000000.pcie: no bus scan delay, default to 0 ms
[    2.212182] rockchip-pcie f8000000.pcie: no vpcie12v regulator found
[    2.713110] rockchip-pcie f8000000.pcie: PCIe link training gen1 timeout!
[    2.713187] rockchip-pcie: probe of f8000000.pcie failed with error -110

nanopi-r4s:~:% uname -a
Linux nanopi-r4s 6.1.50-current-rockchip64 #3 SMP PREEMPT Wed Aug 30 14:11:13 UTC 2023 aarch64 aarch64 aarch64 GNU/Linux

nanopi-r4s:~:% cat /etc/os-release 
PRETTY_NAME="Armbian 23.8.1 jammy"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.3 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.armbian.com"
SUPPORT_URL="https://forum.armbian.com"
BUG_REPORT_URL="https://www.armbian.com/bugs"
PRIVACY_POLICY_URL="https://www.armbian.com"
UBUNTU_CODENAME=jammy
ARMBIAN_PRETTY_NAME="Armbian 23.8.1 jammy"

 

 

I've checked the following post. This bug should already be fixed, and I don't know why it has reappeared.

 

The kernel log still shows the old message: `PCIe link training gen1 timeout!`.

With the patch applied, I would expect something like: `PCIe link training gen1 timeout with x%d!\n`.

So it seems that the `drivers/pci/controller/pcie-rockchip-host.c` patch is not included in the latest NanoPi R4S jammy 6.1 kernel?

https://github.com/armbian/build/pull/4308/files
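
One way to check which variant of the driver produced a given log line is to match it against the two format strings quoted above. This is only a minimal sketch: the function name `classify_timeout_line` is hypothetical, and it assumes the `%d` in the patched message is printed as a plain integer.

```python
import re

# The two message formats quoted in the post; "%d" is matched as any integer.
OLD_MSG = re.compile(r"PCIe link training gen1 timeout!")
NEW_MSG = re.compile(r"PCIe link training gen1 timeout with x\d+!")


def classify_timeout_line(line: str) -> str:
    """Return "patched", "unpatched", or "unrelated" for one dmesg line."""
    if NEW_MSG.search(line):
        return "patched"
    if OLD_MSG.search(line):
        return "unpatched"
    return "unrelated"


if __name__ == "__main__":
    # Line copied from the dmesg output in this post.
    sample = (
        "[    2.713110] rockchip-pcie f8000000.pcie: "
        "PCIe link training gen1 timeout!"
    )
    print(classify_timeout_line(sample))
```

Feeding it the output of `sudo dmesg | grep -i pcie` line by line would show whether any line carries the patched wording.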

 

NanoPi R4s, enp1s0 ethernet device not showed up after reboot

 

Edited by Junkman
Posted (edited)

FYI, I flashed `rk3399-sd-ubuntu-focal-desktop-4.19-arm64-20230915.img.gz` to my SD card.

https://wiki.friendlyelec.com/wiki/index.php/NanoPi_R4S#Flash_to_TF

 

With this image, all the Ethernet ports are detected and working (tested).

So I think this might be a bug in the Armbian build.

 

$ cat /etc/os-release
NAME="Ubuntu"
VERSION="20.04.6 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.6 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal

$ uname -a
Linux NanoPi-R4S 4.19.193 #4 SMP PREEMPT Tue Sep 5 13:32:27 CST 2023 aarch64 aarch64 aarch64 GNU/Linux

pi@NanoPi-R4S:~$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 68:27:19:ad:02:09 brd ff:ff:ff:ff:ff:ff
    inet 192.168.100.180/24 brd 192.168.100.255 scope global dynamic noprefixroute eth0
       valid_lft 85478sec preferred_lft 85478sec
    inet6 fe80::ca10:91c8:345b:dcc4/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 66:a8:c1:46:84:83 brd ff:ff:ff:ff:ff:ff
    inet 192.168.100.181/24 brd 192.168.100.255 scope global dynamic noprefixroute eth1
       valid_lft 85926sec preferred_lft 85926sec
    inet6 fe80::73e6:a4c0:efc2:da25/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever

$ curl -IL --interface eth0 https://1.1.1.1/dns-query
HTTP/2 415 
server: cloudflare
date: Fri, 24 Nov 2023 13:18:11 GMT
access-control-allow-origin: *
cf-ray: 82b1efb8a8d47d7c-LAX

$ curl -IL --interface eth1 https://1.1.1.1/dns-query
HTTP/2 415 
server: cloudflare
date: Fri, 24 Nov 2023 13:18:15 GMT
access-control-allow-origin: *
cf-ray: 82b1efd0c9a67eb7-LAX
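
The comparison of `ip a` listings can be automated with a short script. The following is a minimal sketch, assuming the listing has been captured as text; the helper names `interface_names` and `missing_interfaces` are hypothetical, and the expected interface names (for example `eth1` on this image, `enp1s0` on Armbian) have to be supplied by the caller.

```python
import re

# Matches interface header lines such as "2: eth0: <BROADCAST,...> mtu 1500 ...".
# Indented address lines ("    inet ...") do not start with a digit and are skipped.
IFACE_HEADER = re.compile(r"^\d+:\s+([^:@\s]+)[:@]", re.MULTILINE)


def interface_names(ip_a_output: str) -> list[str]:
    """Return the interface names found in the text output of `ip a`."""
    return IFACE_HEADER.findall(ip_a_output)


def missing_interfaces(ip_a_output: str, expected: list[str]) -> list[str]:
    """Return the expected interface names that the listing does not contain."""
    present = set(interface_names(ip_a_output))
    return [name for name in expected if name not in present]


if __name__ == "__main__":
    # Shortened listing in the shape of the Armbian output from the first post.
    listing = (
        "1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN\n"
        "    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00\n"
        "2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP\n"
        "    link/ether 68:27:19:ad:02:09 brd ff:ff:ff:ff:ff:ff\n"
    )
    print(missing_interfaces(listing, ["eth0", "enp1s0"]))
```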

 

Edited by Junkman
Posted
7 minutes ago, Junkman said:

So I think this might be a bug in the Armbian build.


Nope, build is passing:

Artifacts generation


There is nothing wrong with Armbian or its build framework. The problem is a general lack of understanding of how expensive it is to keep this cheap hardware working with a constantly moving kernel.
https://docs.armbian.com/User-Guide_FAQ/#why-does-hardware-feature-xy-work-in-old-kernel-but-not-in-more-recent-one

This is open source. Nobody is preventing anyone from researching and fixing the problem.

 

Posted (edited)

Hi, @Igor. Thanks for your reply.

 

I totally understand the tradeoffs between maintaining a mainline kernel and carrying custom patches for specific boards.

 

I wonder how I could manually apply the PCIe driver patch to the current kernel.

Do I have to rebuild the whole kernel?

Edited by Junkman
Posted (edited)

Hi!

 

The root cause is not the controller itself but the PCIe PHY: it is not properly initialized on module init. When the device is cold-booted (after power-on), the PHY stays in the "factory" state, but after a warm reboot it stays in the "power_on" state, which prevents any link training.

 

The patch https://github.com/armbian/build/pull/4308/files works because, after the first failure, the PHY is reset by the PCIe driver.
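
The reasoning above can be illustrated with a toy model. This is only a sketch of the explanation, not the driver code: `ToyPhy`, `train_link`, `reset_phy`, and `probe_with_retry` are hypothetical names, and the only assumptions taken from the description are the two state names and the rule that a PHY left in "power_on" blocks training until it is reset.

```python
from dataclasses import dataclass


@dataclass
class ToyPhy:
    """Toy stand-in for the PCIe PHY; state names follow the description above."""
    state: str = "factory"  # a cold boot leaves the PHY in the "factory" state


def train_link(phy: ToyPhy) -> bool:
    """Model one link-training attempt, which only works from "factory"."""
    if phy.state != "factory":
        return False  # a stale "power_on" state blocks training
    phy.state = "power_on"
    return True


def reset_phy(phy: ToyPhy) -> None:
    """Model the PHY reset that the PCIe driver performs after a failure."""
    phy.state = "factory"


def probe_with_retry(phy: ToyPhy, attempts: int = 2) -> bool:
    """Model the retry workaround: reset after each failure, then try again."""
    for _ in range(attempts):
        if train_link(phy):
            return True
        reset_phy(phy)
    return False


if __name__ == "__main__":
    warm = ToyPhy(state="power_on")  # PHY state left over from a warm reboot
    print(train_link(warm))          # single attempt, no reset beforehand
    print(probe_with_retry(ToyPhy(state="power_on")))
```

In this model a single attempt after a warm reboot fails, while the retry succeeds only as a side effect of the reset. Initializing the PHY properly up front would make the first attempt succeed.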

 

https://github.com/andrewz1/rk3399-pcie-phy contains an updated PHY module and a DTB overlay for switching to it (I just changed the name). This is just a proof of concept.

 

The patch is attached (tested on the edge kernel):

phy-rockchip-pcie.patch

Edited by andrewz1
Posted
On 12/4/2023 at 10:49 AM, andrewz1 said:

The root cause is not the controller itself but the PCIe PHY: it is not properly initialized on module init. When the device is cold-booted (after power-on), the PHY stays in the "factory" state, but after a warm reboot it stays in the "power_on" state, which prevents any link training.


Thanks for the tip!

 

On 12/4/2023 at 10:49 AM, andrewz1 said:

This is just a proof of concept.


Our current dirty hack is not even that ;) and it doesn't work anyway. You are welcome to remove that patch and add this one. Once it is merged, I can enable this hardware in automated testing to see the long-term effects.
