20.08 upgrade with 5.7.15 will not boot on RockPro64



As a warning:

 

I have a RockPro64 that was initially set up using the Armbian 20.02.1 buster desktop image (using the 5.4.20 kernel). Over the past few months, I have accepted each minor release of Armbian up to the 20.05.7 release with a 5.4.49 kernel.

 

Yesterday, I installed the 20.08 upgrade with 5.7.15 kernel, and the board failed to boot. I have only just got it working again. (I had to restore the kernel and supporting files from a recent backup, and restore symlinks in /boot).

 

I don't know the root cause. I have ordered a second MicroSD card so that I can try booting from a fresh 20.08 image, and I will report back on what happens. In the meantime, I have attached a copy of the serial port output during the failed boot. (The last line is where it got stuck. I left it for an hour before power cycling.)

Verbose Boot bad new 5.7.15 kernel.txt


4 hours ago, David Pottage said:

I don't know the root cause. I have ordered a second MicroSD card so that I can try booting from a fresh 20.08 image, and I will report back on what happens. In the meantime, I have attached a copy of the serial port output during the failed boot.


Can you enable more verbosity? Edit /boot/armbianEnv.txt and change verbosity to 9

9 hours ago, David Pottage said:

Yesterday, I installed the 20.08 upgrade with 5.7.15 kernel, and the board failed to boot

I have tested both fresh v20.08 (Armbian_20.08_Rockpro64_bionic_current_5.7.15.img.xz) as well as v20.05.2 (Armbian_20.05.2_Rockpro64_bionic_current_5.4.43.img.xz) + upgrade to v20.08 and my unit boots fine in both scenarios

It is however not the first report about it so there may be something to it...

 

Please provide more verbose logs as Igor suggested.

On 8/24/2020 at 5:10 PM, Igor said:


Can you enable more verbosity? Edit /boot/armbianEnv.txt and change verbosity to 9

I tried that on my working 5.4.49 kernel, and apart from timestamps and the like, the boot output was the same as booting at verbosity 7.

 

Unless the new 5.7 kernel has extra features to log more output at verbosity 9, I am reluctant to make that change and boot the 5.7 kernel, as it will break my running system and cause me some difficulty to get it working again.

 

5 hours ago, Igor said:


Console still goes to screen, not to UART. Change console=serial in /boot/armbianEnv.txt

Thanks, I will try that in a bit.

 

In the meantime, I took the latest 20.08 image, and flashed it to a new SanDisk SD card, and it booted just fine (Boot log attached)

 

I did a diff between this successful boot, and my unsuccessful boots from an upgraded system, and noticed two differences that could be significant.

 

- Firstly, the unsuccessful boots were from an eMMC card rather than an SD card

- Secondly, the build date of U-Boot differs (Feb 17 2020 vs Aug 17 2020) though both report the same version of 2017.09-armbian

 

Could either of those issues be significant ?

 

Edit: Added missing attachments

 

Verbose 9 shutdown to serial good new 5.7.15 kernel from SD card.txt Verbose 9 Boot to serial GOOD new 5.7.15 kernel from SD card.txt


OK, I changed

1 hour ago, Igor said:


Console still goes to screen, not to UART. Change console=serial in /boot/armbianEnv.txt

I did that. Still no more output.

 

Using the 5.4.49 kernel, I got lots more output, so the setting works, but it still looks like the 5.7 kernel is broken and fails very early in the startup process.

 

Next I will try a diff between my 20.02.1 based system image (that won't boot a 5.7 kernel) and the new 20.08 based image that does. Any suggestions of where I should be looking ?

 

44 minutes ago, piter75 said:

It wasn't. Can you attach it?

Was it boot from SD (v20.08) with eMMC inserted? Can you try to boot it this way?

I would like to see eMMC initialisation logs.

Sorry. I have edited the offending post to add the missing attachments.


Could there be an issue with the /boot/uInitrd-5.7.15-rockchip64 file?

 

The version in the 20.08 release is 13262336 bytes, but the one that got installed when I upgraded my old system to the 5.7.15 kernel is 15359382 bytes.

 

I know that this is a compressed ram disk image of some sort, so the contents could be the same, but the size difference is suspicious. I have not yet worked out how to list the contents of either file, my google fu is failing at present.

 

Edit:

I have worked out how to extract the uInitrd file.

The contents are different, but it looks like the file gets generated by a script when the kernel is upgraded as most of the differences are files from my system that have been incorporated into the archive. For example my system has an nvme SSD in the PCIe slot, that is divided into a number of disks using LVM2, so my generated uInitrd file has the contents of /etc/lvm, presumably so the kernel knows how to mount those volumes during boot.
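
The post does not say how the file was unpacked. One possible approach is sketched below, under two assumptions that are not confirmed in this thread: that uInitrd is a legacy U-Boot image whose first 64 bytes are a header, and that the payload behind it is a gzip-compressed cpio archive. Both function names are hypothetical.

```shell
# strip_uimage_header IN OUT
# Copies IN to OUT without its first 64 bytes (the assumed legacy header).
strip_uimage_header() {
    tail -c +65 "$1" > "$2"
}

# list_initrd FILE
# Lists the file names stored in a gzip-compressed cpio archive.
# Assumption: the ramdisk is gzip-compressed; other compressors need another tool.
list_initrd() {
    gzip -dc "$1" | cpio -it 2>/dev/null
}

# Example (paths are illustrative):
#   strip_uimage_header /boot/uInitrd-5.7.15-rockchip64 /tmp/initrd.gz
#   list_initrd /tmp/initrd.gz | sort > /tmp/initrd-contents.txt
```

Producing a sorted listing for each of the two uInitrd files and diffing the listings is a quick way to see which files differ before comparing their contents.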

 

The attached screenshot of my diff tool shows the other different files, including some kernel related stuff that could be a source of problems. (Vanilla 20.08 on the left, broken system on the right) Does that help? Should I upload the uInitrd file ?

 

uImage_differences.png


I don't have a rockchip device, but 5.7.15 is the last kernel working on my S905X3 device, all others after it fail, so I am wondering if something in the kernel itself is broken.

 

I noticed that, starting with 5.7.16, it would not properly update the directories in /lib/modules with the new version number. Possibly other areas are affected as well.

5 hours ago, Uwu said:

I don't have a rockchip device, but 5.7.15 is the last kernel working on my S905X3 device, all others after it fail, so i am wondering if something in the kernel itself is broken.


We only conduct boot/upgrade tests on some hardware that we own and support (https://www.armbian.com/download/?device_support=Supported). Anything else - a TV box or a generic board with community support only - can boot. Or not.


I tried booting my board using the uInitrd-5.7.15-rockchip64 file from the 20.08 image instead of the generated version that was created when my 5.7.15 kernel got installed.

 

The boot still got stuck in the same way. This suggests that the problem is with the other stuff in my filesystem, rather than the kernel that got supplied in the 5.7.15 kernel package.

39 minutes ago, David Pottage said:

The board boots from an eMMC card.

Can you try to capture the verbose bootlog once again while booting from SD but with eMMC still inserted?

I checked your previous logs but they were probably captured without eMMC module.

 

You will most probably have to break the boot sequence at u-boot (just hit space bar around the time that Model: RockPro64 is printed) and run the following commands at u-boot prompt:

setenv devnum 1
run mmc_boot

 

9 hours ago, piter75 said:

Can you try to capture the verbose bootlog once again while booting from SD but with eMMC still inserted?

I checked your previous logs but they were probably captured without eMMC module.

 

You will most probably have to break the boot sequence at u-boot (just hit space bar around the time that Model: RockPro64 is printed) and run the following commands at u-boot prompt:


setenv devnum 1
run mmc_boot

 

OK, I can do that.

 

To clarify, I have a system set up to boot from an eMMC. That system runs my home server, has a fair amount of stuff installed and uses an nvme SSD for most of its storage. That system boots fine on a 5.4 series kernel, but fails using the upgraded 5.7 or 5.8 series kernels.

 

I have downloaded the latest 20.08 release and written it to an SD card. That image has a 5.7 kernel and boots just fine, but it does not have all the software installed, or know anything about the nvme drive.

 

3 minutes ago, David Pottage said:

To clarify, I have a system setup to boot from an eMMC

Yup. When you boot 5.7 or 5.8 from eMMC there is not much to see in the logs.

I would like to see eMMC initialisation while you boot from SD and have eMMC and possibly NVMe connected.

This may (possibly) shed some light on why you cannot boot with eMMC.


OK, I have connected both the SD card and the eMMC, and have booted twice. First from eMMC with the 5.4 kernel, then interrupting the boot and using your commands to boot 20.08 from the SD card.

 

The 20.08 image from the SD card had a kernel panic

 

I have attached both boot logs.

 

 

Verbose 9 Boot to serial kernel panic 5.7.15 with uSD plugged in.txt Verbose 9 Boot to serial good 5.4.49 kernel with uSD plugged in.txt

11 hours ago, piter75 said:

Yup. When you boot 5.7 or 5.8 from eMMC there is not much to see in the logs.

I would like to see eMMC initialisation while you boot from SD and have eMMC and possibly NVMe connected.

This may (possibly) shed some light on why you cannot boot with eMMC.

Do you think that the eMMC initialisation could be the source of the problem?

 

If so, then would copying my existing setup to an SD card, and booting from that solve the problem?

12 hours ago, David Pottage said:

Do you think that the eMMC initialisation could be the source of the problem?

Rather not. This happens before eMMC is even probed.

 

12 hours ago, David Pottage said:

If so, then would copying my existing setup to an SD card, and booting from that solve the problem?

It could. At this point it is unfortunately a mystery to me what the real cause of this issue is.

20 minutes ago, piter75 said:
12 hours ago, David Pottage said:

If so, then would copying my existing setup to an SD card, and booting from that solve the problem?

It could. At this point it is unfortunately a mystery to me what the real cause of this issue is.

In that case I will give it a try on Sunday evening, and post the results. I will be away until then.

On 9/11/2020 at 9:23 PM, David Pottage said:

In that case I will give it a try on Sunday evening, and post the results. I will be away until then.

I tried booting from my SD card and it did not work. I started with booting the old 5.4.49 kernel, but even that failed.

 

The first time, I reformatted my SD card with an empty ext4 filesystem, and used rsync to copy everything from the eMMC card root file system. When I attempted to boot the 5.4.49 kernel from it, it failed and could not find the root filesystem.

 

I then thought that there might be some magic in the way the files were laid out, so I used GParted to make a binary copy of the eMMC card root file system to the SD card. When that booted, it went into a boot loop with "Synchronous Abort" handler, esr 0x02000000 messages. I have since looked at the logs from GParted, and noticed that it used e2image instead of dd to copy the partition. Could that command have left out something important?

 

Are there any boot blocks or clever bits of file system layout that are used to boot the RockPro64? It looks like a simple copy does not create something bootable

 

I have attached the serial port output from both boot attempts.

Boot reset loop.txt Verbose 9 Boot from uSD failed no root FS 5.4.49 kernel.txt

1 hour ago, David Pottage said:

When I attempted to boot the 5.4.49 kernel from it, it failed and could not find the root filesystem.

The filesystem on your SD card has a different UUID than the one on the eMMC, and the UUID is used to find the root fs.

 

1 hour ago, David Pottage said:

 

Are there any boot blocks or clever bits of file system layout that are used to boot the RockPro64? It looks like a simple copy does not create something bootable

Definitely. The root/boot partition should start at 16MiB and it needs u-boot images at correct locations.
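
To check the 16MiB figure against an existing card, the offset can be converted to a sector number and compared with what the kernel reports. A small sketch, assuming 512-byte sectors (the unit used by the sysfs "start" attribute); the function names and the device name in the example are hypothetical.

```shell
# mib_to_sectors MIB [SECTOR_SIZE]
# Converts an offset in MiB to a sector count (512-byte sectors by default).
mib_to_sectors() {
    echo $(( $1 * 1024 * 1024 / ${2:-512} ))
}

# partition_start_ok START_FILE MIB
# Succeeds if the start sector stored in START_FILE equals MIB mebibytes.
partition_start_ok() {
    [ "$(cat "$1")" -eq "$(mib_to_sectors "$2")" ]
}

# Example (the device name is hypothetical):
#   partition_start_ok /sys/class/block/mmcblk1p1/start 16 && echo "starts at 16 MiB"
```

With 512-byte sectors, 16 MiB corresponds to sector 32768.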

 

Maybe (most probably) there is some easier way but I believe you could be more successful this way:

  • flash Armbian to SD and boot it;
  • boot from eMMC and make new fs on the SD partition (be careful to not erase your eMMC partition);
  • mount it at /mnt/sd and run rsync -avx / /mnt/sd/
  • run "lsblk -o PATH,UUID" to find the UUID of your new fs on SD;
  • modify UUID value in /mnt/sd/boot/armbianEnv.txt (rootdev=UUID=...) and /mnt/sd/etc/fstab (UUID=.... for / mountpoint) to the one found on SD;
  • unmount and try to boot from SD

No warranty it works for you but the chance is definitely higher ;-)
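
The UUID edits in the steps above could be scripted roughly as follows. This is only a sketch: retarget_root is a hypothetical helper, it assumes the copy is mounted at a directory such as /mnt/sd, and it assumes the root entry in fstab is written as UUID=... with "/" as the mount point. It saves .bak copies of both files first.

```shell
# retarget_root ROOT_DIR NEW_UUID
# Points ROOT_DIR/boot/armbianEnv.txt and ROOT_DIR/etc/fstab at NEW_UUID.
retarget_root() (
    root=$1 uuid=$2
    env="$root/boot/armbianEnv.txt"
    fstab="$root/etc/fstab"
    cp "$env" "$env.bak" && cp "$fstab" "$fstab.bak" || exit 1
    # rootdev=UUID=... tells the boot script which filesystem is the root.
    sed "s|^rootdev=UUID=.*|rootdev=UUID=${uuid}|" "$env.bak" > "$env"
    # Only change the fstab entry whose mount point (second field) is "/".
    # Note: awk rewrites that one line with single spaces between fields.
    awk -v uuid="$uuid" \
        '$1 ~ /^UUID=/ && $2 == "/" { $1 = "UUID=" uuid } { print }' \
        "$fstab.bak" > "$fstab"
)

# Example (mount point and device name are illustrative):
#   retarget_root /mnt/sd "$(lsblk -no UUID /dev/mmcblk1p1)"
```

Other fstab entries, such as extra data volumes, are left untouched.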

On 9/15/2020 at 9:58 PM, piter75 said:

Maybe (most probably) there is some easier way but I believe you could be more successful this way:

  • flash Armbian to SD and boot it;
  • boot from eMMC and make new fs on the SD partition (be careful to not erase your eMMC partition);
  • mount it at /mnt/sd and run rsync -avx / /mnt/sd/
  • run "lsblk -o PATH,UUID" to find the UUID of your new fs on SD;
  • modify UUID value in /mnt/sd/boot/armbianEnv.txt (rootdev=UUID=...) and /mnt/sd/etc/fstab (UUID=.... for / mountpoint) to the one found on SD;
  • unmount and try to boot from SD

No warranty it works for you but the chance is definitely higher ;-)

Thanks for that tip. It worked to boot my old 5.4.49 kernel.

I then updated the symlinks in /boot to boot the new 5.8.6 kernel, and the boot failed as before.
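
Before and after switching kernels this way, it can help to record where each symlink in /boot points. The thread does not name the links, so this sketch simply lists whatever symlinks exist in a directory; list_boot_links is a hypothetical helper name.

```shell
# list_boot_links DIR
# Prints "name -> target" for every symbolic link directly inside DIR.
list_boot_links() {
    for f in "$1"/*; do
        if [ -L "$f" ]; then
            printf '%s -> %s\n' "$(basename "$f")" "$(readlink "$f")"
        fi
    done
    return 0
}

# Example:
#   list_boot_links /boot > /root/boot-links-before.txt
```

Saving the output before an upgrade gives a reference for restoring the links if the new kernel does not boot.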

 

I had saved a copy of the original /boot directory (from the 20.08 release), so I did some more investigation and ran a diff between the failing /boot directory and the copy I took.

 

I noticed a difference in /boot/boot.cmd

 

# diff boot/boot.cmd OLD_boot/boot.cmd
6c6
< setenv load_addr "0x9000000"
---
> setenv load_addr "0x39000000"
10c10
< setenv verbosity "1"
---
> setenv verbosity "7"
12d11
< setenv bootlogo "false"
15c14
< setenv earlycon "off"
---
> setenv earlycon "on"
29c28,29
< if test "${bootlogo}" = "true"; then setenv consoleargs "bootsplash.bootfile=bootsplash.armbian ${consoleargs}"; fi
---
> # if test "${earlycon}" = "on"; then setenv consoleargs "earlycon=uart,mmio,0xFF1A0000,1500000 ${consoleargs}"; fi
> # 2: uart:16550A mmio:0xFF1A0000 irq:38 tx:51 rx:0 RTS|DTR

Could the difference in load_addr be the cause of the problems?
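
For comparing just the setenv defaults between two boot.cmd files, without the surrounding diff noise, something like the following could be used. It is a sketch that assumes the values are written as setenv NAME "VALUE", as in the diff above; get_setenv and compare_setenv are hypothetical helper names.

```shell
# get_setenv FILE NAME
# Prints the quoted value of the first 'setenv NAME "VALUE"' line in FILE.
get_setenv() {
    sed -n "s/^setenv $2 \"\(.*\)\"\$/\1/p" "$1" | head -n 1
}

# compare_setenv FILE_A FILE_B
# Prints the value from each file side by side for a few variables.
compare_setenv() {
    for name in load_addr verbosity earlycon; do
        printf '%s: %s vs %s\n' "$name" \
            "$(get_setenv "$1" "$name")" "$(get_setenv "$2" "$name")"
    done
}

# Example:
#   compare_setenv boot/boot.cmd OLD_boot/boot.cmd
```

A variable that is missing from one of the files shows up as an empty value on that side.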

 

I have attached the boot log from the successful boots of each kernel.

 

Sucessful kernel 5.8 Jupiter system boot from microSD.txt Sucessful kernel 5.4.49 Jupiter system boot from microSD.txt
