David Pottage Posted August 24, 2020 Posted August 24, 2020 As a warning: I have a RockPro64 that was initially set up using the Armbian 20.02.1 buster desktop image (with the 5.4.20 kernel). Over the past few months, I have accepted each minor release of Armbian up to the 20.05.7 release with a 5.4.49 kernel. Yesterday, I installed the 20.08 upgrade with the 5.7.15 kernel, and the board failed to boot. I have only just got it working again. (I had to restore the kernel and supporting files from a recent backup, and restore the symlinks in /boot.) I don't know the root cause. I have ordered a second MicroSD card so that I can try booting from a fresh 20.08 image, and I will report back on what happens. In the meantime, I have attached a copy of the serial port output during the failed boot. (The last line is where it got stuck. I left it for an hour before power-cycling.) Verbose Boot bad new 5.7.15 kernel.txt
Igor Posted August 24, 2020 Posted August 24, 2020 4 hours ago, David Pottage said: I don't know the root cause. I have ordered a second MicroSD card so that I can try booting from a fresh 20.08 image, and I will report back on what happens. In the meantime, I have attached a copy of the serial port output during the failed boot. Can you enable more verbosity? Edit /boot/armbianEnv.txt and change verbosity to 9
piter75 Posted August 24, 2020 Posted August 24, 2020 9 hours ago, David Pottage said: Yesterday, I installed the 20.08 upgrade with 5.7.15 kernel, and the board failed to boot. I have tested both a fresh v20.08 image (Armbian_20.08_Rockpro64_bionic_current_5.7.15.img.xz) and v20.05.2 (Armbian_20.05.2_Rockpro64_bionic_current_5.4.43.img.xz) upgraded to v20.08, and my unit boots fine in both scenarios. It is, however, not the first report about this, so there may be something to it... Please provide more verbose logs as Igor suggested.
Myy Posted August 25, 2020 Posted August 25, 2020 Note that the serial console clock changes to 1500000n8 when it boots, so be sure that the serial console receiver also reads at that speed, else you might not see any log after booting.
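For anyone capturing these logs, here is a minimal sketch of opening the console at that rate from a Linux host. The device path /dev/ttyUSB0 is an assumption (a typical USB-to-serial adapter), and picocom is only one of several terminal programs that accept this baud rate. Not every adapter can run at 1500000 baud, so it is worth checking yours if the output looks garbled.

```shell
#!/bin/sh
# Serial settings from the post above: 1500000 baud, no parity, 8 data bits.
BAUD=1500000
# Assumed adapter path; override with PORT=/dev/ttyXXX if yours differs.
PORT="${PORT:-/dev/ttyUSB0}"

if [ -c "$PORT" ]; then
    # picocom defaults to 8 data bits, no parity, 1 stop bit,
    # so only the baud rate has to be given.
    picocom -b "$BAUD" "$PORT"
else
    echo "No serial adapter found at $PORT" >&2
fi
```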
David Pottage Posted August 28, 2020 Author Posted August 28, 2020 On 8/24/2020 at 5:10 PM, Igor said: Can you enable more verbosity? Edit /boot/armbianEnv.txt and change verbosity to 9 I tried that on my working 5.4.49 kernel, and apart from timestamps and the like, the boot output was the same as booting at verbosity 7. Unless the new 5.7 kernel has extra features to log more output at verbosity 9, I am reluctant to make that change and boot the 5.7 kernel, as it will break my running system and cause me some difficulty to get it working again.
David Pottage Posted August 31, 2020 Author Posted August 31, 2020 On 8/24/2020 at 5:10 PM, Igor said: Can you enable more verbosity? Edit /boot/armbianEnv.txt and change verbosity to 9 I booted the 5.7.15 kernel with the verbosity increased to 9. The log output was the same. I have attached it anyway. Verbose 9 Boot bad new 5.7.15 kernel.txt
Igor Posted August 31, 2020 Posted August 31, 2020 5 minutes ago, David Pottage said: The log output was the same. Console still goes to screen, not to UART. Change console=serial in /boot/armbianEnv.txt
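The two armbianEnv.txt changes suggested so far (verbosity=9 and console=serial) can be sketched as follows. This is a minimal sketch, not an official Armbian tool: the helper name set_env_key is hypothetical, and the demonstration runs on a scratch file with assumed contents so that the real /boot/armbianEnv.txt is left untouched.

```shell
#!/bin/sh
# set_env_key FILE KEY VALUE  (hypothetical helper, not part of Armbian)
# Replace an existing KEY=... line, or append one if the key is absent.
set_env_key() {
    file=$1 key=$2 value=$3
    if grep -q "^${key}=" "$file"; then
        sed -i "s|^${key}=.*|${key}=${value}|" "$file"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# Demonstrate on a scratch file; the starting contents are an assumption.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
env_file="$workdir/armbianEnv.txt"
printf 'verbosity=1\nrootdev=UUID=0000\n' > "$env_file"

set_env_key "$env_file" verbosity 9
set_env_key "$env_file" console serial
cat "$env_file"
```

On the board itself the same helper would be pointed at /boot/armbianEnv.txt as root, followed by a reboot.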
David Pottage Posted August 31, 2020 Author Posted August 31, 2020 (edited) 5 hours ago, Igor said: Console still goes to screen, not to UART. Change console=serial in /boot/armbianEnv.txt Thanks, I will try that in a bit. In the meantime, I took the latest 20.08 image and flashed it to a new SanDisk SD card, and it booted just fine (boot log attached). I did a diff between this successful boot and my unsuccessful boots from an upgraded system, and noticed two differences that could be significant:
- Firstly, the unsuccessful boots were from an eMMC card rather than an SD card.
- Secondly, the build date of U-Boot differs (Feb 17 2020 vs Aug 17 2020), though both report the same version, 2017.09-armbian.
Could either of those differences be significant? Edit: Added missing attachments Verbose 9 shutdown to serial good new 5.7.15 kernel from SD card.txt Verbose 9 Boot to serial GOOD new 5.7.15 kernel from SD card.txt Edited August 31, 2020 by David Pottage Add missing attachments
David Pottage Posted August 31, 2020 Author Posted August 31, 2020 1 hour ago, Igor said: Console still goes to screen, not to UART. Change console=serial in /boot/armbianEnv.txt OK, I did that. Still no more output. Using the 5.4.49 kernel, I got lots more output, so the setting works, but it still looks like the 5.7 kernel is broken and fails very early in the startup process. Next I will try a diff between my 20.02.1-based system image (which won't boot a 5.7 kernel) and the new 20.08-based image that does. Any suggestions of where I should be looking?
Igor Posted August 31, 2020 Posted August 31, 2020 1 hour ago, David Pottage said: Any suggestions of where I should be looking ? My Rockpro64 just booted from SD card without any trouble, using a freshly built image with kernel 5.7.19 ... hardware defect or perhaps a quality issue? http://ix.io/2vMr
piter75 Posted August 31, 2020 Posted August 31, 2020 4 hours ago, David Pottage said: Boot log attached It wasn't. Can you attach it? Was it boot from SD (v20.08) with eMMC inserted? Can you try to boot it this way? I would like to see eMMC initialisation logs.
David Pottage Posted August 31, 2020 Author Posted August 31, 2020 44 minutes ago, piter75 said: It wasn't. Can you attach it? Was it boot from SD (v20.08) with eMMC inserted? Can you try to boot it this way? I would like to see eMMC initialisation logs. Sorry. I have edited the offending post to add the missing attachments.
David Pottage Posted September 1, 2020 Author Posted September 1, 2020 Could there be an issue with the /boot/uInitrd-5.7.15-rockchip64 file? The version in the 20.08 release is 13262336 bytes, but the one that got installed when I upgraded my old system to the 5.7.15 kernel is 15359382 bytes. I know that this is a compressed RAM disk image of some sort, so the contents could be the same, but the size difference is suspicious. I have not yet worked out how to list the contents of either file; my Google-fu is failing at present. Edit: I have worked out how to extract the uInitrd file. The contents are different, but it looks like the file gets generated by a script when the kernel is upgraded, as most of the differences are files from my system that have been incorporated into the archive. For example, my system has an NVMe SSD in the PCIe slot that is divided into a number of disks using LVM2, so my generated uInitrd file has the contents of /etc/lvm, presumably so the kernel knows how to mount those volumes during boot. The attached screenshot of my diff tool shows the other differing files, including some kernel-related stuff that could be a source of problems. (Vanilla 20.08 on the left, broken system on the right.) Does that help? Should I upload the uInitrd file?
David Pottage Posted September 1, 2020 Author Posted September 1, 2020 Forgot to add the commands to extract the uImage archive:
tail -c+65 < uInitrd-5.7.15-rockchip64 | gunzip > decomp_uInitrd-5.7.15
mkdir extracted_uInitrd-5.7.15
cd extracted_uInitrd-5.7.15/
cpio -idv < ../decomp_uInitrd-5.7.15
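To list the archive without unpacking it, the same pipeline can end in cpio's table-of-contents mode. The sketch below assumes, as the commands above do, a 64-byte header in front of a gzip-compressed cpio archive. The function names are hypothetical, and the demo runs on a synthetic file, not on a real uInitrd.

```shell
#!/bin/sh
# Hypothetical helpers; both assume a 64-byte header followed by gzip data.

# Write the decompressed payload of FILE to stdout.
uinitrd_payload() {
    tail -c +65 "$1" | gunzip
}

# List the files inside a real uInitrd (needs cpio; not run in the demo below).
uinitrd_list() {
    uinitrd_payload "$1" | cpio -it
}

# Demo on a synthetic file: 64 filler bytes followed by gzip-compressed text.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
fake="$workdir/fake-uInitrd"
head -c 64 /dev/zero > "$fake"
printf 'payload\n' | gzip -c >> "$fake"

recovered=$(uinitrd_payload "$fake")
echo "$recovered"
```

With two real images, saving the output of uinitrd_list for each and running diff on the two listings gives a quick view of which files were added by the local rebuild.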
Uwu Posted September 7, 2020 Posted September 7, 2020 I don't have a rockchip device, but 5.7.15 is the last kernel working on my S905X3 device, all others after it fail, so I am wondering if something in the kernel itself is broken. I noticed that starting with 5.7.16 it would not properly update the directories in /lib/modules with the new version number. Possibly other areas are affected as well.
Igor Posted September 7, 2020 Posted September 7, 2020 5 hours ago, Uwu said: I don't have a rockchip device, but 5.7.15 is the last kernel working on my S905X3 device, all others after it fail, so i am wondering if something in the kernel itself is broken. We only conduct boot/upgrade tests on some hardware that we own and support. https://www.armbian.com/download/?device_support=Supported Anything else - a TV box or a generic board with community support only - can boot. Or not.
David Pottage Posted September 7, 2020 Author Posted September 7, 2020 I tried booting my board using the uInitrd-5.7.15-rockchip64 file from the 20.08 image instead of the generated version that was created when my 5.7.15 kernel got installed. The boot still got stuck in the same way. This suggests that the problem is with the other stuff in my filesystem, rather than the kernel that got supplied in the 5.7.15 kernel package.
AlexVS Posted September 9, 2020 Posted September 9, 2020 Sata controller installed? I experienced a similar problem with the Marvell 88SE9230 controller, after replacing it with the Marvell 9235 everything works fine.
David Pottage Posted September 10, 2020 Author Posted September 10, 2020 18 hours ago, AlexVS said: Sata controller installed? I experienced a similar problem with the Marvell 88SE9230 controller, after replacing it with the Marvell 9235 everything works fine. I have an nvme SSD installed in the PCIe slot. The board boots from an eMMC card.
piter75 Posted September 10, 2020 Posted September 10, 2020 39 minutes ago, David Pottage said: The board boots from an eMMC card. Can you try to capture the verbose boot log once again while booting from SD but with eMMC still inserted? I checked your previous logs but they were probably captured without the eMMC module. You will most probably have to break the boot sequence at u-boot (just hit the space bar around the time that Model: RockPro64 is printed) and run the following commands at the u-boot prompt:
setenv devnum 1
run mmc_boot
David Pottage Posted September 10, 2020 Author Posted September 10, 2020 9 hours ago, piter75 said: Can you try to capture the verbose bootlog once again while booting from SD but with eMMC still inserted? I checked your previous logs but they were probably captured without eMMC module. You will most probably have to break the boot sequence at u-boot (just hit space bar around the time that Model: RockPro64 is printed) and run the following commands at u-boot prompt:
setenv devnum 1
run mmc_boot
OK, I can do that. To clarify, I have a system set up to boot from an eMMC. That system runs my home server, has a fair amount of stuff installed and uses an NVMe SSD for most of its storage. That system boots fine on a 5.4 series kernel, but fails using the upgraded 5.7 or 5.8 series kernels. I have downloaded the latest 20.08 release and written it to an SD card. That image has a 5.7 kernel and boots just fine, but it does not have all the software installed, or know anything about the NVMe drive.
piter75 Posted September 10, 2020 Posted September 10, 2020 3 minutes ago, David Pottage said: To clarify, I have a system setup to boot from an eMMC Yup. When you boot 5.7 or 5.8 from eMMC there is not much to see in the logs. I would like to see eMMC initialisation while you boot from SD and have eMMC and possibly NVMe connected. This may (possibly) shed some light on why you cannot boot with eMMC.
David Pottage Posted September 10, 2020 Author Posted September 10, 2020 OK, I have connected both the SD card and the eMMC, and have booted twice: first from eMMC with the 5.4 kernel, then interrupting the boot and using your commands to boot 20.08 from the SD card. The 20.08 image from the SD card had a kernel panic. I have attached both boot logs. Verbose 9 Boot to serial kernel panic 5.7.15 with uSD plugged in.txt Verbose 9 Boot to serial good 5.4.49 kernel with uSD plugged in.txt
David Pottage Posted September 11, 2020 Author Posted September 11, 2020 11 hours ago, piter75 said: Yup. When you boot 5.7 or 5.8 from eMMC there is not much to see in the logs. I would like to see eMMC initialisation while you boot from SD and have eMMC and possibly NVMe connected. This may (possibly) shed some light at why you cannot boot with eMMC. Do you think that the eMMC initialisation could be the source of the problem? If so, then would copying my existing setup to an SD card, and booting from that solve the problem?
piter75 Posted September 11, 2020 Posted September 11, 2020 12 hours ago, David Pottage said: Do you think that the eMMC initialisation could be the source of the problem? Rather not. This happens before eMMC is even probed. 12 hours ago, David Pottage said: If so, then would copying my existing setup to an SD card, and booting from that solve the problem? It could. At this point it is unfortunately a mystery to me what the real cause of this issue is.
David Pottage Posted September 11, 2020 Author Posted September 11, 2020 20 minutes ago, piter75 said: 12 hours ago, David Pottage said: If so, then would copying my existing setup to an SD card, and booting from that solve the problem? It could. At this point it is unfortunately a mystery to me what the real cause of this issue is. In that case I will give it a try on Sunday evening, and post the results. I will be away until then.
David Pottage Posted September 15, 2020 Author Posted September 15, 2020 On 9/11/2020 at 9:23 PM, David Pottage said: In that case I will give it a try on Sunday evening, and post the results. I will be away until then. I tried booting from my SD card and it did not work. I started with booting the old 5.4.49 kernel, but even that failed. The first time, I reformatted my SD card with an empty ext4 filesystem, and used rsync to copy everything from the eMMC card root file system. When I attempted to boot the 5.4.49 kernel from it, it failed and could not find the root filesystem. I then thought that there might be some magic in the way the files were laid out, so I used GParted to make a binary copy of the eMMC card root file system to the SD card. When that booted, it went into a boot loop with "Synchronous Abort" handler, esr 0x02000000 messages. I have since looked at the logs from GParted, and noticed that it used e2image instead of dd to copy the partition. Could that command have left out something important? Are there any boot blocks or clever bits of file system layout that are used to boot the RockPro64? It looks like a simple copy does not create something bootable. I have attached the serial port output from both boot attempts. Boot reset loop.txt Verbose 9 Boot from uSD failed no root FS 5.4.49 kernel.txt
piter75 Posted September 15, 2020 Posted September 15, 2020 1 hour ago, David Pottage said: When I attempted to boot the 5.4.49 kernel from it, it failed and could not find the root filesystem. The filesystem on your SD card has a different UUID than the one on eMMC, and the UUID is used to find the root fs. 1 hour ago, David Pottage said: Are there any boot blocks or clever bits of file system layout that are used to boot the RockPro64? It looks like a simple copy does not create something bootable Definitely. The root/boot partition should start at 16MiB and it needs u-boot images at the correct locations. Maybe (most probably) there is some easier way, but I believe you could be more successful this way:
- flash Armbian to SD and boot it;
- boot from eMMC and make a new fs on the SD partition (be careful not to erase your eMMC partition);
- mount it at /mnt/sd and run rsync -avx / /mnt/sd/
- run "lsblk -o PATH,UUID" to find the UUID of your new fs on SD;
- modify the UUID value in /mnt/sd/boot/armbianEnv.txt (rootdev=UUID=...) and /mnt/sd/etc/fstab (UUID=.... for / mountpoint) to the one found on SD;
- unmount and try to boot from SD.
No warranty it works for you but the chance is definitely higher ;-)
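The UUID rewrite in the second-to-last step can be sketched as a small script. This is only an illustration under stated assumptions: the helper name retarget_root_uuid is hypothetical, the UUIDs and file contents are made up, and the demo works on a scratch directory tree instead of a mounted SD card.

```shell
#!/bin/sh
# retarget_root_uuid ROOT_DIR NEW_UUID  (hypothetical helper)
# Point the copied system at NEW_UUID in both places the steps above mention.
retarget_root_uuid() {
    root=$1 uuid=$2
    # rootdev=UUID=... in boot/armbianEnv.txt
    sed -i -E "s|^rootdev=UUID=.*|rootdev=UUID=${uuid}|" "$root/boot/armbianEnv.txt"
    # The fstab entry whose mount point (second field) is exactly /
    sed -i -E "s|^UUID=[^[:space:]]+([[:space:]]+/[[:space:]])|UUID=${uuid}\1|" "$root/etc/fstab"
}

# Demo on a scratch tree; all UUIDs and file contents here are made up.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
mkdir -p "$workdir/boot" "$workdir/etc"
printf 'verbosity=1\nrootdev=UUID=aaaa-old\n' > "$workdir/boot/armbianEnv.txt"
printf 'UUID=aaaa-old / ext4 defaults 0 1\nUUID=cccc-data /srv ext4 defaults 0 2\n' > "$workdir/etc/fstab"

retarget_root_uuid "$workdir" bbbb-new
cat "$workdir/boot/armbianEnv.txt" "$workdir/etc/fstab"
```

On the board, the first argument would be /mnt/sd and the second the UUID reported by lsblk for the new SD filesystem. Entries for other mount points are left alone.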
David Pottage Posted September 17, 2020 Author Posted September 17, 2020 On 9/15/2020 at 9:58 PM, piter75 said: Maybe (most probably) there is some easier way but I believe you could be more successful this way:
- flash Armbian to SD and boot it;
- boot from eMMC and make new fs on the SD partition (be careful to not erase your eMMC partition);
- mount it at /mnt/sd and run rsync -avx / /mnt/sd/
- run "lsblk -o PATH,UUID" to find the UUID of your new fs on SD;
- modify UUID value in /mnt/sd/boot/armbianEnv.txt (rootdev=UUID=...) and /mnt/sd/etc/fstab (UUID=.... for / mountpoint) to the one found on SD;
- unmount and try to boot from SD
No warranty it works for you but the chance is definitely higher ;-)
Thanks for that tip. It worked to boot my old 5.4.49 kernel. I then updated the symlinks in /boot to boot the new 5.8.6 kernel, and the boot failed as before. I had saved a copy of the original /boot directory (from the 20.08 release image), so I did some more investigation and ran a diff between the failing /boot directory and the copy I took. I noticed a difference in /boot/boot.cmd:
# diff boot/boot.cmd OLD_boot/boot.cmd
6c6
< setenv load_addr "0x9000000"
---
> setenv load_addr "0x39000000"
10c10
< setenv verbosity "1"
---
> setenv verbosity "7"
12d11
< setenv bootlogo "false"
15c14
< setenv earlycon "off"
---
> setenv earlycon "on"
29c28,29
< if test "${bootlogo}" = "true"; then setenv consoleargs "bootsplash.bootfile=bootsplash.armbian ${consoleargs}"; fi
---
> # if test "${earlycon}" = "on"; then setenv consoleargs "earlycon=uart,mmio,0xFF1A0000,1500000 ${consoleargs}"; fi
> # 2: uart:16550A mmio:0xFF1A0000 irq:38 tx:51 rx:0 RTS|DTR
Could the difference in load_addr be the cause of the problems? I have attached the boot log from the successful boots of each kernel. Sucessful kernel 5.8 Jupiter system boot from microSD.txt Sucessful kernel 5.4.49 Jupiter system boot from microSD.txt
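To compare only the variable defaults of two boot.cmd files, leaving aside comments and conditional lines, something like the following could be used. It is a rough sketch: the helper name boot_cmd_vars is hypothetical, it handles only simple one-word values, and the demo files hold just two of the values quoted in the diff above, not complete boot.cmd files.

```shell
#!/bin/sh
# boot_cmd_vars FILE  (hypothetical helper)
# Print each simple `setenv NAME "VALUE"` line of a boot.cmd as NAME=VALUE.
# Lines whose value contains spaces have more than three fields and are skipped.
boot_cmd_vars() {
    awk '$1 == "setenv" && NF == 3 { gsub(/"/, "", $3); print $2 "=" $3 }' "$1"
}

# Demo on two scratch files holding only values quoted in the diff above.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
printf 'setenv load_addr "0x9000000"\nsetenv verbosity "1"\n' > "$workdir/a.cmd"
printf 'setenv load_addr "0x39000000"\nsetenv verbosity "7"\n' > "$workdir/b.cmd"

boot_cmd_vars "$workdir/a.cmd" > "$workdir/a.vars"
boot_cmd_vars "$workdir/b.cmd" > "$workdir/b.vars"
# diff exits non-zero when the files differ, so do not treat that as an error.
diff "$workdir/a.vars" "$workdir/b.vars" || true
```

This narrows the comparison to the settings themselves. It does not say which load_addr value is correct for a given U-Boot build.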
hhalibo Posted September 19, 2020 Posted September 19, 2020 I confirm the bug: After UPGRADE, REBOOT, NOTHING RUN!