Rötti Posted January 24, 2021

Hello everyone,

I own two ESPRESSOBin V5 boards. To both I attached an XCSOURCE® MiniPCIe Sata3.0 AC696 extension card via MiniPCIe. This is the link to Amazon: https://www.amazon.de/dp/B06XRG2TGV

I tested several images from https://www.armbian.com/espressobin/#kernels-archive-all. Unfortunately, all the old images were deleted last week, so I could not continue testing.

Kernels tested 8 weeks ago, plus the latest two this week:
- 5.10.09-mvebu64 #21.02.0-hirsute (trunk) <-- does not work
- 5.08.18-mvebu64 #20.11.6-bionic <-- does not work
- 5.08.18-mvebu64 #20.11.3-focal <-- does not work
- 5.08.18-mvebu64 #20.11.3-bionic <-- does not work
- 5.08.06-mvebu64 #20.08.2-focal <-- does not work
- 4.14.135-mvebu64 #19.11.3-bionic <-- works

Here is the whole UART dump:

TIM-1.0
WTMI-devel-18.12.0-a0a1cb8
WTMI: system early-init
SVC REV: 3, CPU VDD voltage: 1.155V
NOTICE: Booting Trusted Firmware
NOTICE: BL1: v1.5(release):1f8ca7e (Marvell-devel-18.12.2)
NOTICE: BL1: Built : 09:48:09, Feb 20 2019
NOTICE: BL1: Booting BL2
NOTICE: BL2: v1.5(release):1f8ca7e (Marvell-devel-18.12.2)
NOTICE: BL2: Built : 09:48:10, Feb 20 2019
NOTICE: BL1: Booting BL31
NOTICE: BL31: v1.5(release):1f8ca7e (Marvell-devel-18.12.2)
NOTICE: BL31: Built : 09:4

U-Boot 2018.03-devel-18.12.3-gc9aa92c-armbian (Feb 20 2019 - 09:45:04 +0100)

Model: Marvell Armada 3720 Community Board ESPRESSOBin
CPU 1000 [MHz]
L2 800 [MHz]
TClock 200 [MHz]
DDR 800 [MHz]
DRAM: 2 GiB
Comphy chip #0:
Comphy-0: USB3 5 Gbps
Comphy-1: PEX0 2.5 Gbps
Comphy-2: SATA0 6 Gbps
Target spinup took 0 ms.
AHCI 0001.0300 32 slots 1 ports 6 Gbps 0x1 impl SATA mode
flags: ncq led only pmp fbss pio slum part sxs
PCIE-0: Link up
MMC: sdhci@d0000: 0, sdhci@d8000: 1
Loading Environment from SPI Flash... SF: Detected w25q32dw with page size 256 Bytes, erase size 4 KiB, total 4 MiB
OK
Model: Marvell Armada 3720 Community Board ESPRESSOBin
Net: eth0: neta@30000 [PRIME]
Hit any key to stop autoboot: 0
starting USB...
USB0: Register 2000104 NbrPorts 2
Starting the controller
USB XHCI 1.00
USB1: USB EHCI 1.00
scanning bus 0 for devices... 1 USB Device(s) found
scanning bus 1 for devices... 1 USB Device(s) found
scanning usb for storage devices... 0 Storage Device(s) found
## Loading init Ramdisk from Legacy Image at 01100000 ...
Image Name: uInitrd
Image Type: AArch64 Linux RAMDisk Image (gzip compressed)
Data Size: 10750023 Bytes = 10.3 MiB
Load Address: 00000000
Entry Point: 00000000
Verifying Checksum ... OK
## Flattened Device Tree blob at 06000000
Booting using the fdt blob at 0x6000000
Loading Ramdisk to 7ebea000, end 7f62a847 ... OK
Using Device Tree in place at 0000000006000000, end 00000000060059cd

Starting kernel ...

[ 0.000000] Booting Linux on physical CPU 0x0000000000 [0x410fd034]
[ 0.000000] Linux version 5.8.18-mvebu64 (root@beast) (aarch64-linux-gnu-gcc (GNU Toolchain for the A-profile Architecture 8.3-2019.03 (arm-rel-8.36)) 8.3.0, GNU ld (GNU Toolchain for the A-profile Architecture 8.3-2019.03 (arm-rel-8.36)) 2.32.0.20190321) #20.11.3 SMP PREEMPT Fri Dec 11 21:10:52 CET 2020
[ 0.000000] Machine model: Globalscale Marvell ESPRESSOBin Board
[ 0.000000] earlycon: ar3700_uart0 at MMIO 0x00000000d0012000 (options '')
[ 0.000000] printk: bootconsole [ar3700_uart0] enabled
Loading, please wait...
Starting version 245.4-4ubuntu3.3
Begin: Loading essential drivers ... done.
Begin: Running /scripts/init-premount ... done.
Begin: Mounting root file system ... Begin: Running /scripts/local-top ... done.
Begin: Running /scripts/local-premount ... Scanning for Btrfs filesystems
done.
Begin: Will now check root file system ... fsck from util-linux 2.34
[/usr/sbin/fsck.ext4 (1) -- /dev/mmcblk0p1] fsck.ext4 -a -C0 /dev/mmcblk0p1
/dev/mmcblk0p1: clean, 41739/1828336 files, 439779/7502824 blocks
done.
done.
Begin: Running /scripts/local-bottom ... done.
Begin: Running /scripts/init-bottom ... done.
[ 3.694604] Internal error: synchronous external abort: 96000210 [#1] PREEMPT SMP
[ 3.699465] Modules linked in: tag_edsa mv88e6xxx dsa_core bridge stp llc phy_mvebu_a3700_comphy
[ 3.708518] CPU: 0 PID: 5 Comm: kworker/0:0 Not tainted 5.8.18-mvebu64 #20.11.3
[ 3.716037] Hardware name: Globalscale Marvell ESPRESSOBin Board (DT)
[ 3.722685] Workqueue: events free_work
[ 3.726614] pstate: 00000085 (nzcv daIf -PAN -UAO BTYPE=--)
[ 3.732352] pc : ahci_single_level_irq_intr+0x1c/0x90
[ 3.737549] lr : __handle_irq_event_percpu+0x5c/0x168
[ 3.742737] sp : ffffffc0113bbd10
[ 3.746142] x29: ffffffc0113bbd10 x28: ffffff807d48b700
[ 3.751608] x27: 0000000000000060 x26: ffffffc010f085e8
[ 3.757073] x25: ffffffc0113075a5 x24: ffffff8079101800
[ 3.762539] x23: 000000000000002d x22: ffffffc0113bbdd4
[ 3.768004] x21: 0000000000000000 x20: ffffffc011465008
[ 3.773470] x19: ffffff8079381600 x18: 0000000000000000
[ 3.778936] x17: 0000000000000000 x16: 0000000000000000
[ 3.784401] x15: 000000d2c010fc50 x14: 0000000000000323
[ 3.789867] x13: 00000000000002d4 x12: 0000000000000000
[ 3.795332] x11: 0000000000000040 x10: ffffffc011282dd8
[ 3.800798] x9 : ffffffc011282dd0 x8 : ffffff807d000270
[ 3.806263] x7 : 0000000000000000 x6 : 0000000000000000
[ 3.811729] x5 : ffffffc06ea93000 x4 : ffffffc0113bbe10
[ 3.817196] x3 : ffffffc06ea93000 x2 : ffffff8079101a80
[ 3.822661] x1 : ffffff8078803e00 x0 : 000000000000002d
[ 3.828126] Call trace:
[ 3.830642] ahci_single_level_irq_intr+0x1c/0x90
[ 3.835478] __handle_irq_event_percpu+0x5c/0x168
[ 3.840315] handle_irq_event_percpu+0x38/0x90
[ 3.844885] handle_irq_event+0x48/0xe0
[ 3.848828] handle_simple_irq+0x94/0xd0
[ 3.852860] generic_handle_irq+0x30/0x48
[ 3.856985] advk_pcie_irq_handler+0x214/0x240
[ 3.861552] __handle_irq_event_percpu+0x5c/0x168
[ 3.866389] handle_irq_event_percpu+0x38/0x90
[ 3.870959] handle_irq_event+0x48/0xe0
[ 3.874900] handle_fasteoi_irq+0xb8/0x170
[ 3.879112] generic_handle_irq+0x30/0x48
[ 3.883234] __handle_domain_irq+0x64/0xc0
[ 3.887447] gic_handle_irq+0xc8/0x168
[ 3.891298] el1_irq+0xb8/0x180
[ 3.894524] unmap_kernel_range_noflush+0x128/0x188
[ 3.899540] remove_vm_area+0xac/0xd0
[ 3.903303] __vunmap+0x48/0x298
[ 3.906618] free_work+0x44/0x60
[ 3.909937] process_one_work+0x1e8/0x360
[ 3.914057] worker_thread+0x44/0x480
[ 3.917820] kthread+0x154/0x158
[ 3.921135] ret_from_fork+0x10/0x34
[ 3.924812] Code: a90153f3 f9401022 f9400854 91002294 (b9400293)
[ 3.931087] ---[ end trace 98b323414bb99c99 ]---
[ 3.935829] Kernel panic - not syncing: Fatal exception in interrupt
[ 3.942368] SMP: stopping secondary CPUs
[ 3.946403] Kernel Offset: disabled
[ 3.949985] CPU features: 0x240002,2000200c
[ 3.954283] Memory Limit: none
[ 3.957424] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---

The boards boot up if I don't plug any SATA HDDs into the extension card.

I hope this helps. If you need any other information just let me know, I'm absolutely willing to help. But please be aware that I'm a software developer coming from windows trying to get into linux. But I have no clue of kernel patching/compiling etc. Sorry!

Thank you very, very much in advance! You're doing an awesome job.

Sincerely
Rötti
Igor Posted January 25, 2021

10 hours ago, Rötti said:
But I have no clue of kernel patching/compiling etc. Sorry!

We have no time and absolutely no budget to cover bugs on this very buggy hardware. Officially, Armbian doesn't have any maintainer(s) for this board anymore, so it's basically community supported, as is. @Pali is trying to bring up mainline U-Boot support and that's about it. I don't know how far along that is or whether upgrading U-Boot helps in this case. It's the only alternative, so it's worth trying. This is just a bug like any other, so I moved it to the bug tracker forums.

10 hours ago, Rötti said:
But please be aware that I'm a software developer coming from windows trying to get into linux.

You couldn't choose worse hardware

10 hours ago, Rötti said:
I'm absolutely willing to help.

So help.
Pali Posted January 27, 2021

@Rötti This looks like an AHCI or PCIe issue. Please report the bug to the linux-ide@vger.kernel.org mailing list, where there are more SATA/AHCI developers who could help you debug the issue. It may also be an A3720-related PCIe issue, so send me an email and I can provide you with an A3720 PCIe patch which could fix some stability issues.
Rötti Posted January 27, 2021 (edited)

Hello @Igor, hello @Pali,

thank you for the information and thank you for moving the topic into the right forum.

On 1/25/2021 at 8:24 AM, Igor said:
You couldn't choose worse hardware

Again, thanks for that information, but I'm still here to get Armbian running even on this buggy hardware ;-)

On 1/25/2021 at 8:24 AM, Igor said:
So help.

If you expect me to do more than write to the Linux kernel mailing list and the EspressoBIN forum, please point me in the right direction and I will help. For example, do you have any backups of all previous images? If so, where can I find them? I'd like to try all images to narrow down when the bug was introduced.

Update: @Pali Thank you so much for pointing me to this mailing list. This was exactly what I was looking for. I wrote them a mail and will report back as soon as there is new information. Furthermore, I added a post to the EspressoBIN forum, which is currently under review. As soon as it's approved, I'll link the post here.

Thanks in advance.

Edited January 27, 2021 by Rötti
Saw the later post of Pali
lampra Posted January 27, 2021

2 hours ago, Rötti said:
For example, do you have any backups of all previous images?

If you scroll down the download section you will find the archives link. Some previous images (v5.59 (2018-08-18) and later) do exist there.
Rötti Posted January 29, 2021

Hello @lampra,

thanks a lot for this awesome hint! I had been using this link: https://redirect.armbian.com/region/EU/espressobin, where I used to find the archived images, but they were removed about 2 weeks ago. I feel so dumb: this archive button is so big that my brain-based ad exclusion just removed it ;-)

This is going to help a lot! I'll be able to find out in which exact version the bug was introduced.
lampra Posted January 31, 2021

@Rötti Are you able to boot from the onboard SATA? (not from the extension card via MiniPCIe). I couldn't boot from SATA a year ago so I gave up and used openwrt but I would prefer using Armbian (I can't test newer releases as it is installed far away from home).
Rötti Posted February 7, 2021

Hello guys,

On 1/27/2021 at 12:22 PM, Pali said:
@Rötti This looks like an AHCI or PCIe issue. Please report the bug to the linux-ide@vger.kernel.org mailing list, where there are more SATA/AHCI developers who could help you debug the issue. It may also be an A3720-related PCIe issue, so send me an email and I can provide you with an A3720 PCIe patch which could fix some stability issues.

@Pali I posted the problem to the linux-ide kernel mailing list as proposed, but unfortunately received no answer. Here is the link: https://www.spinics.net/lists/linux-ide/msg60178.html

Furthermore, I was able to narrow down the kernel versions and exact Armbian image versions where it broke:
Armbian 19.11.3 with Kernel 4.14.135 <- last version which was working
Armbian 5.65 with Kernel 4.18.16 <- first version which is not working

On 1/31/2021 at 3:28 AM, lampra said:
@Rötti Are you able to boot from the onboard SATA? (not from the extension card via MiniPCIe). I couldn't boot from SATA a year ago so I gave up and used openwrt but I would prefer using Armbian (I can't test newer releases as it is installed far away from home).

@lampra I'm not booting from SATA. I'm still booting from the SD card, but I get a kernel panic merely by plugging the SATA cable into the PCIe SATA controller. I'm sorry I don't have further information about your issue.
Pali Posted February 7, 2021

You need to wait, or you can try sending a reminder for your email on the mailing list... I have not had time to look at it.
Pali Posted February 7, 2021

On 1/31/2021 at 3:28 AM, lampra said:
I couldn't boot from SATA a year ago so I gave up and used openwrt but I would prefer using Armbian (I can't test newer releases as it is installed far away from home).

Do you mean loading firmware directly from SATA (without NOR)? Or loading firmware from NOR and only loading the kernel from SATA?

Last week I updated the documentation on how to build and store firmware on a SATA disk (it needs a special partition), see: https://trustedfirmware-a.readthedocs.io/en/latest/plat/marvell/armada/build.html (search for "SATA device boot")

And last year I tested that U-Boot loaded from NOR is able to access a SATA disk and load the kernel from it without needing a uSD card. If it does not work, post the U-Boot output from the serial console.
Pali Posted February 10, 2021

On 2/7/2021 at 11:19 PM, Rötti said:
@Pali I posted the problem to the linux-ide kernel mailing list as proposed, but unfortunately received no answer. Here is the link: https://www.spinics.net/lists/linux-ide/msg60178.html Furthermore, I was able to narrow down the kernel versions and exact Armbian image versions where it broke: Armbian 19.11.3 with Kernel 4.14.135 <- last version which was working Armbian 5.65 with Kernel 4.18.16 <- first version which is not working

I have looked at the email which you sent to the mailing list https://lore.kernel.org/linux-ide/cbbb2496501fed013ccbeba524e8d573@posteo.de/T/#u and you did not provide enough information. At least the output from lspci -nn -vv is needed to correctly identify the type of your PCIe SATA controller. The dmesg output between [ 0.000000] and [ 3.694604] is also missing. Please provide this information (to the mailing list).
Pali Posted March 5, 2021

@Rötti: Also, please boot the Linux kernel with the console=ttyMV0,115200 earlycon=ar3700_uart,0xd0012000 command line options so that the output on UART contains the full boot log.
Rötti Posted March 7, 2021

Hi Pali,

deeply sorry for the long delay, but we struggled with a Covid-19 related issue within the family.

On 2/10/2021 at 1:50 PM, Pali said:
I have looked at the email which you sent to the mailing list https://lore.kernel.org/linux-ide/cbbb2496501fed013ccbeba524e8d573@posteo.de/T/#u and you did not provide enough information. At least the output from lspci -nn -vv is needed to correctly identify the type of your PCIe SATA controller. The dmesg output between [ 0.000000] and [ 3.694604] is also missing. Please provide this information (to the mailing list).

I added the lspci -nn -vv output to the mailing list, but I could not find the corresponding dmesg output. After unplugging the SATA cable and rebooting I was able to log in and look at /var/dmesg, but I couldn't find any information from the time around the crash.

On 3/5/2021 at 10:41 PM, Pali said:
@Rötti: Also, please boot the Linux kernel with the console=ttyMV0,115200 earlycon=ar3700_uart,0xd0012000 command line options so that the output on UART contains the full boot log.

As you can see in the output below, I already have these parameters in the console variable. Is there a special way to boot with these parameters, or are they used automatically when I call 'boot' because of 'set_bootargs', which already contains 'console'?
Marvell>> printenv
arch=arm
baudrate=115200
board=mvebu_armada-37xx
board_name=mvebu_armada-37xx
boot_a_script=ext4load ${boot_interface} ${devnum}:1 ${scriptaddr} ${prefix}boot.scr;source ${scriptaddr};
boot_prefixes=/ /boot/
boot_targets=usb mmc1 mmc0
bootcmd=for target in ${boot_targets}; do run bootcmd_${target}; done
bootcmd_mmc0=setenv devnum 0; setenv boot_interface mmc; run scan_dev_for_boot;
bootcmd_mmc1=setenv devnum 1; setenv boot_interface mmc; run scan_dev_for_boot;
bootcmd_usb=setenv devnum 0; usb start;setenv boot_interface usb; run scan_dev_for_boot;
bootdelay=2
console=console=ttyMV0,115200 earlycon=ar3700_uart,0xd0012000
cpu=armv8
eth1addr=00:51:82:11:22:01
eth2addr=00:51:82:11:22:02
eth3addr=00:51:82:11:22:03
ethact=neta@30000
ethaddr=00:51:82:11:22:00
ethprime=eth0
extra_params=pci=pcie_bus_safe
fdt_addr=0x6000000
fdt_addr_r=0x6f00000
fdt_high=0xffffffffffffffff
fdt_name=fdt.dtb
fdtcontroladdr=7f62d490
gatewayip=10.4.50.254
get_images=tftpboot $kernel_addr_r $image_name; tftpboot $fdt_addr_r $fdt_name; run get_ramfs
get_ramfs=if test "${ramfs_name}" != "-"; then setenv ramdisk_addr_r 0x8000000; tftpboot $ramdisk_addr_r $ramfs_name; else setenv ramdisk_addr_r -;fi
hostname=marvell
image_name=Image
initrd_addr=0x1100000
initrd_image=uInitrd
initrd_size=0x2000000
ipaddr=0.0.0.0
kernel_addr=0x7000000
kernel_addr_r=0x7000000
loadaddr=0x8000000
netdev=eth0
netmask=255.255.255.0
ramdisk_addr_r=0x8000000
ramfs_name=-
root=root=/dev/nfs rw
rootpath=/srv/nfs/
scan_dev_for_boot=for prefix in ${boot_prefixes}; do echo ${prefix};run boot_a_script; done
scriptaddr=0x6d00000
serverip=0.0.0.0
set_bootargs=setenv bootargs $console $root ip=$ipaddr:$serverip:$gatewayip:$netmask:$hostname:$netdev:none nfsroot=$serverip:$rootpath,tcp,v3 $extra_params $cpuidle
soc=mvebu
stderr=serial@12000
stdin=serial@12000
stdout=serial@12000
vendor=Marvell

Environment size: 1962/65532 bytes

Thank you very much for your awesome support!
Pali Posted March 7, 2021

14 minutes ago, Rötti said:
As you can see in the output below, I already have these parameters in the console variable. Is there a special way to boot with these parameters, or are they used automatically when I call 'boot' because of 'set_bootargs', which already contains 'console'?

If you call the 'boot' command, it executes the 'bootcmd' variable. If you trace 'bootcmd' through your printenv output, it becomes clear that 'set_bootargs' is not called on this path. It seems that your 'bootcmd' ends in the 'boot_a_script' variable, which loads an external boot script (from the uSD card?), and that script boots the kernel. The script can do anything, including setting new variables, so it may or may not put 'console' into 'bootargs'. You need to investigate it.

You could try to unset 'console' (= booting without console=ttyMV0,115200), maybe it helps. For recent kernels this console option should not be needed.
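The tracing described in this post can be sketched in Python. This is only a rough illustration, not a U-Boot tool: it follows `run <name>` calls starting at `bootcmd` and treats any `${...}` inside a target name as a wildcard, which over-approximates what U-Boot would really execute. The helper names `run_targets` and `reachable` are made up for this sketch, and the `set_bootargs` value is shortened from the printenv output above.

```python
import re

# Boot-related variables from the printenv output in this thread.
# The set_bootargs value is shortened; the others are copied as posted.
ENV = {
    "boot_a_script": "ext4load ${boot_interface} ${devnum}:1 ${scriptaddr} "
                     "${prefix}boot.scr;source ${scriptaddr};",
    "boot_targets": "usb mmc1 mmc0",
    "bootcmd": "for target in ${boot_targets}; do run bootcmd_${target}; done",
    "bootcmd_mmc0": "setenv devnum 0; setenv boot_interface mmc; "
                    "run scan_dev_for_boot;",
    "bootcmd_mmc1": "setenv devnum 1; setenv boot_interface mmc; "
                    "run scan_dev_for_boot;",
    "bootcmd_usb": "setenv devnum 0; usb start;setenv boot_interface usb; "
                   "run scan_dev_for_boot;",
    "get_images": "tftpboot $kernel_addr_r $image_name; "
                  "tftpboot $fdt_addr_r $fdt_name; run get_ramfs",
    "scan_dev_for_boot": "for prefix in ${boot_prefixes}; do echo ${prefix};"
                         "run boot_a_script; done",
    "set_bootargs": "setenv bootargs $console $root $extra_params $cpuidle",
}

RUN_RE = re.compile(r"\brun\s+([^\s;]+)")   # the word following "run"
VAR_RE = re.compile(r"\$\{\w+\}|\$\w+")     # ${name} or $name


def run_targets(body, env):
    """Names of env variables that `body` may execute via `run`."""
    found = set()
    for token in RUN_RE.findall(body):
        # "bootcmd_${target}" becomes the pattern "bootcmd_.*"
        parts = VAR_RE.split(token)
        pattern = ".*".join(re.escape(part) for part in parts)
        found.update(name for name in env if re.fullmatch(pattern, name))
    return found


def reachable(start, env):
    """All variables that can be executed when `start` is run."""
    seen = set()
    todo = [start]
    while todo:
        name = todo.pop()
        if name in seen or name not in env:
            continue
        seen.add(name)
        todo.extend(run_targets(env[name], env))
    return seen


if __name__ == "__main__":
    path = reachable("bootcmd", ENV)
    print(sorted(path))
    print("set_bootargs" in path)  # False for this environment
```

For this environment the walk ends at `boot_a_script`, which only loads and sources `boot.scr`, so whatever ends up in `bootargs` is decided by that script and not by `set_bootargs`.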
Pali Posted March 17, 2021

This is an issue in the ASMedia SATA controller card, not in the Espressobin PCIe. The card announces support for 512-byte PCIe packets, but when the PCIe controller is configured for such a long payload size, the card causes a system crash. We have reproduced this issue on another platform too. Marek sent a kernel patch which adds a quirk for this ASMedia SATA controller to set the maximum payload size to 256 bytes https://lore.kernel.org/linux-pci/20210317115924.31885-1-kabel@kernel.org/T/#u and which should work around this issue.
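For readers unfamiliar with PCIe payload sizes, here is a small Python sketch of how the size is encoded and what the quirk effectively changes. PCIe stores the size as a 3-bit field meaning 128 << field, so valid sizes are powers of two (128, 256, 512, ...) and the quirk's limit is 256 bytes. The bit positions follow the PCIe Device Capabilities and Device Control registers; the function names and the sample values are illustrative assumptions, and `effective_payload` is a simplified model, not the kernel's actual logic.

```python
def decode_payload_field(field: int) -> int:
    """Convert a 3-bit Max_Payload_Size encoding to bytes (128 << field)."""
    if not 0 <= field <= 5:
        raise ValueError(f"reserved Max_Payload_Size encoding: {field}")
    return 128 << field


def supported_payload(devcap: int) -> int:
    """Max payload a device advertises: Device Capabilities bits 2:0."""
    return decode_payload_field(devcap & 0x7)


def configured_payload(devctl: int) -> int:
    """Max payload a device is set to use: Device Control bits 7:5."""
    return decode_payload_field((devctl >> 5) & 0x7)


def effective_payload(advertised_sizes, quirk_limit=None):
    """Simplified model: use the smallest advertised size on the path,
    further capped by an optional per-device quirk limit."""
    size = min(advertised_sizes)
    if quirk_limit is not None:
        size = min(size, quirk_limit)
    return size


if __name__ == "__main__":
    # Illustrative values: a card advertising 512 bytes (field value 2).
    card = supported_payload(0x2)
    print(card)                                         # 512
    print(effective_payload([512, card]))               # 512, the crashing case
    print(effective_payload([512, card], quirk_limit=256))  # 256 with the quirk
```

The point of the quirk is the last line: the card keeps claiming 512 bytes, but the kernel stops believing it and never configures more than 256.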
Rötti Posted March 26, 2021

Hello Pali,

as I said on the kernel mailing list: thanks to you and Marek.

What I'd like to ask is what the workflow or the next steps look like. I mean, how long does it take to get the patch into the kernel (weeks, months)? How likely is it to be rejected? As soon as there is a new kernel, will there be any nightly builds from the Armbian side? Do those changes flow into Armbian directly, or are there backports?

Thank you in advance.
Pali Posted March 26, 2021

2 minutes ago, Rötti said:
I mean, how long does it take to get the patch into the kernel (weeks, months)?

It will be merged either into the next -rc version or into the next mainline version (depending on what the maintainers decide). See https://www.kernel.org/ for currently released versions, and see https://www.kernel.org/category/faq.html for the question "When will the next kernel be released?" After it is merged into an rc or mainline version, this patch (because it is marked as a bugfix) will automatically be included in all longterm versions as well.

2 minutes ago, Rötti said:
How likely is it to be rejected?

Unlikely. If it is rejected, it means the patch needs to be updated (to fix issues), and Marek or I will do that.

For the rest of the Armbian-related questions, ask the Armbian people.
Rötti Posted July 25, 2021

Hello, I created a feature request to get @Pali's and Marek's patch included in Armbian. The feature request can be found here:
Rötti Posted August 26, 2021

The new patch got into the current branch. I compiled Armbian with:

sudo ./compile.sh BOARD=espressobin BRANCH=current RELEASE=focal BUILD_MINIMAL=no BUILD_DESKTOP=no KERNEL_ONLY=no KERNEL_CONFIGURE=no COMPRESS_OUTPUTIMAGE=sha,gpg,img

I can confirm that the bug is gone and that the ASMedia-based SATA controller chips are working again.

@Pali Thank you for organizing the patch.
@Werner Thank you for your support and help getting the patch into Armbian.
Pali Posted October 3, 2021

That fix has now been backported to stable versions 5.14.6, 5.13.19, 5.10.67, 5.4.148, 4.19.207, 4.14.247, 4.9.283 and 4.4.284.
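To check whether a given kernel release falls at or after one of these backports, a minimal Python sketch could look like the following. The version thresholds are the ones listed in this post; the function names are made up for illustration, and series not listed here (for example 5.8) return None, because this post says nothing about them.

```python
import re

# First stable release per series that carries the fix, as listed above.
FIXED_IN = {
    (5, 14): 6,
    (5, 13): 19,
    (5, 10): 67,
    (5, 4): 148,
    (4, 19): 207,
    (4, 14): 247,
    (4, 9): 283,
    (4, 4): 284,
}


def parse_release(release: str):
    """Split '5.10.67-mvebu64' into (5, 10, 67); a missing patch level is 0."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"not a kernel release string: {release!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def has_backport(release: str):
    """True/False for a listed stable series, None for an unlisted one."""
    major, minor, patch = parse_release(release)
    threshold = FIXED_IN.get((major, minor))
    if threshold is None:
        return None
    return patch >= threshold


if __name__ == "__main__":
    for release in ("5.10.09-mvebu64", "5.10.67-mvebu64", "5.8.18-mvebu64"):
        print(release, has_backport(release))
```

On a running system the release string could come from `platform.release()`. Note that the result only says whether the quirk is present in that stable series, not whether a particular board and card combination crashes without it.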