mikhailai Posted June 17

The Linux kernel contained in the latest "linux-image-legacy-sunxi" package (version 24.5.1) appears to be broken to the point of locking up right at the start of boot. It prints "Starting kernel ...", and no more messages appear even with "verbosity=7" set in "armbianEnv.txt". The "linux-image-legacy-sunxi" version 24.2.1 boots just fine.

Here are the steps to reproduce the problem. I did this on an "Orange Pi One" board, but exactly the same issue occurs on the (community maintained) Banana Pi M1.

1. Download and write the Armbian image to a MicroSD card.
2. Connect the serial console, boot the board, finish setup, do all the upgrades: everything works fine at this point.
3. Set "verbosity=7" in "armbianEnv.txt", reboot and observe the kernel messages. At this point, "linux-image-current-sunxi" version 24.5.1 (kernel 6.6.31) is installed.
4. Install "armbian-config" and use it to switch to "linux-image-legacy-sunxi=24.2.1 (6.1.77)". Observe that the board boots up fine.
5. Now switch to "linux-image-legacy-sunxi=24.5.1 (6.1.92)". The boot process now gets stuck at the "Starting kernel ..." message.

So, as a summary:
* "linux-image-current-sunxi" version 24.5.1 with the 6.6.31 kernel: boots fine.
* "linux-image-legacy-sunxi" version 24.2.1 with the 6.1.77 kernel: boots fine.
* "linux-image-legacy-sunxi" version 24.5.1 with the 6.1.92 kernel: broken, stuck at the "Starting kernel ..." message.

I wonder if anyone could check what could have happened with "linux-image-legacy-sunxi" in the latest Armbian build.
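Step 3 of the reproduction can be scripted. The following is only a sketch: the helper name `set_verbosity` is made up for illustration, and the path `/boot/armbianEnv.txt` in the usage comment is an assumption about where the image keeps the file, so check it on your board first.

```shell
#!/bin/sh
# Hypothetical helper: set (or add) the "verbosity" key in an
# armbianEnv.txt-style key=value file.
# Usage: set_verbosity FILE LEVEL
set_verbosity() {
    env_file=$1
    level=$2
    if grep -q '^verbosity=' "$env_file"; then
        # Rewrite the existing line; go through a temp file for portability.
        sed "s/^verbosity=.*/verbosity=$level/" "$env_file" > "$env_file.tmp" &&
            mv "$env_file.tmp" "$env_file"
    else
        # Key not present yet: append it.
        printf 'verbosity=%s\n' "$level" >> "$env_file"
    fi
}

# On the board, as root (the path is an assumption):
#   cp /boot/armbianEnv.txt /boot/armbianEnv.txt.bak
#   set_verbosity /boot/armbianEnv.txt 7
#   dpkg -l 'linux-image-*' | grep '^ii'   # which kernel package is installed
```

Keeping a backup copy of the file is cheap insurance, since a board that no longer boots is harder to fix than one with a slightly noisy console.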
going Posted June 17

31 minutes ago, mikhailai said: I wonder if anyone could check what could have happened with "linux-image-legacy-sunxi" in the latest Armbian build.

The last time these patches were changed: Date: Wed Mar 27 20:50:41 2024

Obviously, the patches need to be rebased onto the new kernel version and the conflicts need to be fixed. If you are ready to volunteer to support these patches, I can tell you how to do it.

Regards.
mikhailai (Author) Posted June 18

I can try doing a one-off fix for the current Armbian release, but I cannot commit to supporting these patches going forward: I'm very short on time right now. LMK if you're still interested in giving me the information. I guess I should start by reading the documentation on building Armbian (I have never built an image).
going Posted June 18

54 minutes ago, mikhailai said: I can try doing one-off fix for the current Armbian release

That's enough. It is not necessary to build an image. It is enough to build the kernel package, install it in the OS and check that it works. I'll write the instructions.
Stephen Graf Posted June 18

I just tried to build a legacy image for orangepione and it fails. I'll try again later.

https://paste.armbian.com/ijiyegidak

[🚸] Command failed, retrying in 15s [ apt_find_upstream_package_version_and_download_url base-files ]
curl: (28) Operation timed out after 10306 milliseconds with 0 bytes received
Stephen Graf Posted June 19

On 6/17/2024 at 10:52 AM, mikhailai said: stuck at "Starting kernel ..." message.

I did manage to build a minimal legacy image (24.8.0-trunk, sunxi-legacy:6.1.94) from the current Armbian build system and it gets stuck at the "Starting kernel" message.

putty.txt
going Posted June 19

54 minutes ago, Stephen Graf said: I did manage to build a minimal legacy image (24.8.0-trunk, sunxi-legacy:6.1.94)

Will you be able to publish part of the kernel build log? The part that reports on the application of patches.

9 hours ago, Stephen Graf said: I just tried to build a legacy image for orangepione and it fails.

We don't need that part of the build. Force the build system to always build the kernel package:

./compile.sh test ARTIFACT_IGNORE_CACHE="yes" kernel

Configuration file:

~/build$ cat userpatches/config-test.conf
display_alert "Common settings for Armbian OS images" "setting default values" "info"
#declare -g USE_MAINLINE_GOOGLE_MIRROR="yes"
declare -g SYNC_CLOCK="no"
declare -g INSTALL_HEADERS="no"
declare -g WIREGUARD="no"
declare -g VENDOR="Armbian_community"
declare -g VENDORURL="https://github.com/armbian/build"
declare -g VENDORDOCS="https://docs.armbian.com"
declare -g VENDORSUPPORT="https://community.armbian.com/"
declare -g VENDORPRIVACY="https://duckduckgo.com/"
declare -g VENDORBUGS="https://github.com/armbian/community/issues"
declare -g VENDORLOGO="armbian-logo"
declare -g MAINTAINERMAIL=info@armbian.com
declare -g MAINTAINER="The-going"
declare -g COMPRESS_OUTPUTIMAGE="sha,img,xz"
declare -g IMAGE_XZ_COMPRESSION_RATIO=5
declare -g EXPERT="yes"
#declare -g KERNEL_CONFIGURE=yes
#declare -g DONT_BUILD_ARTIFACTS="firmware,full_firmware,fake_ubuntu_advantage_tools,armbian-config,armbian-zsh,armbian-plymouth-theme"
#Upload the log file to the armbian website.
#SHARE_LOG=yes
#ARTIFACT_IGNORE_CACHE="yes"
KERNEL_GIT=shallow
RELEASE=bookworm
BOARD=bananapim64
BRANCH=current
BUILD_DESKTOP=no
BUILD_MINIMAL=yes

P.S. Edit these for your case: BOARD=XXXX BRANCH=YYYYY
Stephen Graf Posted June 19

14 hours ago, going said: ./compile.sh test ARTIFACT_IGNORE_CACHE="yes" kernel

@going I compiled with your test script for legacy orangepione. There was no image produced. The curl command to upload the log file did not work and uploading the log file to this message also failed. I did pull the attached patches section from the log file. Can I email the files to you directly?

build_log_patches.txt
ColorfulRhino Posted June 19

You can paste logs here: https://paste.armbian.com/ (It's basically hastebin)
Stephen Graf Posted June 19

3 minutes ago, ColorfulRhino said: You can paste logs here:

@ColorfulRhino No, it says "something went wrong" when I try to save. The files are over 2MB long.
Stephen Graf Posted June 19

@going I cut the log file down by taking out all the kernel build log entries. https://paste.armbian.com/ibamekatak
ColorfulRhino Posted Thursday at 07:51 PM

Strange, I don't see any error. The build log at the end says your file should be saved as

output/debs/linux-image-legacy-sunxi_24.8.0-trunk_armhf__6.1.94-Seb44-D54a0-Pee76-C2446H5c21-HK01ba-V014b-Bf15a-R448a.deb

in your build folder. Is this output/debs/ folder empty?
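A quick way to answer that question from the build host is to look for the newest kernel package in the output folder. This is a minimal sketch: the function name `newest_kernel_deb` is made up here, and the `output/debs` default simply follows the path quoted from the build log above.

```shell
#!/bin/bash
# Hypothetical helper: print the most recently modified linux-image .deb
# in a directory, or nothing if there is none.
# Usage: newest_kernel_deb [DIR]
newest_kernel_deb() {
    debs_dir=${1:-output/debs}
    newest=
    for f in "$debs_dir"/linux-image-*.deb; do
        # An unmatched glob stays literal, so skip names that do not exist.
        [ -e "$f" ] || continue
        if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
            newest=$f
        fi
    done
    if [ -n "$newest" ]; then
        printf '%s\n' "$newest"
    fi
}

# From the build folder:
#   deb=$(newest_kernel_deb output/debs)
#   if [ -n "$deb" ]; then echo "built: $deb"; else echo "no kernel package found"; fi
# After copying the file to the board, as root:
#   dpkg -i ./linux-image-legacy-sunxi_*.deb
```

Since only the kernel artifact is being built, a missing image is expected; the `.deb` is the file to copy to the board and install.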
mikhailai (Author) Posted Friday at 06:54 AM

Ok, returning to the original question. I did some bisecting, and the problem appears to be a 6.1.x kernel bug as opposed to something being broken on the Armbian side.

Disclaimer: I did not use a proper Armbian build; I just took the kernel code from the "linux-6.1.y" branch and used "config-6.1.77-legacy-sunxi".

So here are my results:
* v6.1.87 boots fine, the same way as "linux-image-legacy-sunxi" version 24.2.1.
* v6.1.88 is broken, with the same symptoms as "linux-image-legacy-sunxi" version 24.5.1.

The culprit is the following commit:

07b37f227c8daa27e68f57b1c691fab34a06731e (HEAD) random: handle creditable entropy from atomic process context

commit 07b37f227c8daa27e68f57b1c691fab34a06731e
Author: Jason A. Donenfeld <Jason@zx2c4.com>
Date: Wed Apr 17 13:38:29 2024 +0200

    random: handle creditable entropy from atomic process context

    commit e871abcda3b67d0820b4182ebe93435624e9c6a4 upstream.

    The entropy accounting changes a static key when the RNG has initialized, since it only ever initializes once. Static key changes, however, cannot be made from atomic context, so depending on where the last creditable entropy comes from, the static key change might need to be deferred to a worker.

    Previously the code used the execute_in_process_context() helper function, which accounts for whether or not the caller is in_interrupt(). However, that doesn't account for the case where the caller is actually in process context but is holding a spinlock.

    This turned out to be the case with input_handle_event() in drivers/input/input.c contributing entropy:

      [<ffffffd613025ba0>] die+0xa8/0x2fc
      [<ffffffd613027428>] bug_handler+0x44/0xec
      [<ffffffd613016964>] brk_handler+0x90/0x144
      [<ffffffd613041e58>] do_debug_exception+0xa0/0x148
      [<ffffffd61400c208>] el1_dbg+0x60/0x7c
      [<ffffffd61400c000>] el1h_64_sync_handler+0x38/0x90
      [<ffffffd613011294>] el1h_64_sync+0x64/0x6c
      [<ffffffd613102d88>] __might_resched+0x1fc/0x2e8
      [<ffffffd613102b54>] __might_sleep+0x44/0x7c
      [<ffffffd6130b6eac>] cpus_read_lock+0x1c/0xec
      [<ffffffd6132c2820>] static_key_enable+0x14/0x38
      [<ffffffd61400ac08>] crng_set_ready+0x14/0x28
      [<ffffffd6130df4dc>] execute_in_process_context+0xb8/0xf8
      [<ffffffd61400ab30>] _credit_init_bits+0x118/0x1dc
      [<ffffffd6138580c8>] add_timer_randomness+0x264/0x270
      [<ffffffd613857e54>] add_input_randomness+0x38/0x48
      [<ffffffd613a80f94>] input_handle_event+0x2b8/0x490
      [<ffffffd613a81310>] input_event+0x6c/0x98

    According to Guoyong, it's not really possible to refactor the various drivers to never hold a spinlock there. And in_atomic() isn't reliable.

    So, rather than trying to be too fancy, just punt the change in the static key to a workqueue always. There's basically no drawback of doing this, as the code already needed to account for the static key not changing immediately, and given that it's just an optimization, there's not exactly a hurry to change the static key right away, so deferal is fine.

    Reported-by: Guoyong Wang <guoyong.wang@mediatek.com>
    Cc: stable@vger.kernel.org
    Fixes: f5bda35fba61 ("random: use static branch for crng_ready()")
    Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

diff --git a/drivers/char/random.c b/drivers/char/random.c
index 5d1c8e1c99b5..fd57eb372d49 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -683,7 +683,7 @@ static void extract_entropy(void *buf, size_t len)
 static void __cold _credit_init_bits(size_t bits)
 {
-	static struct execute_work set_ready;
+	static DECLARE_WORK(set_ready, crng_set_ready);
 	unsigned int new, orig, add;
 	unsigned long flags;
@@ -699,8 +699,8 @@ static void __cold _credit_init_bits(size_t bits)
 	if (orig < POOL_READY_BITS && new >= POOL_READY_BITS) {
 		crng_reseed(); /* Sets crng_init to CRNG_READY under base_crng.lock. */
-		if (static_key_initialized)
-			execute_in_process_context(crng_set_ready, &set_ready);
+		if (static_key_initialized && system_unbound_wq)
+			queue_work(system_unbound_wq, &set_ready);
 		wake_up_interruptible(&crng_init_wait);
 		kill_fasync(&fasync, SIGIO, POLL_IN);
 		pr_notice("crng init done\n");
@@ -870,8 +870,8 @@ void __init random_init(void)
 	/*
 	 * If we were initialized by the cpu or bootloader before jump labels
-	 * are initialized, then we should enable the static branch here, where
-	 * it's guaranteed that jump labels have been initialized.
+	 * or workqueues are initialized, then we should enable the static
+	 * branch here, where it's guaranteed that these have been initialized.
 	 */
 	if (!static_branch_likely(&crng_is_ready) && crng_init >= CRNG_READY)
 		crng_set_ready(NULL);

The code change is rather simple: it switches from using "execute_in_process_context" to "queue_work", but that switch is causing the lock-up. I don't have enough knowledge to debug why it is happening; I suspect some sort of deadlock. I've tried taking the "random.c" from the 6.6.34 kernel and doing hacky modifications to get it to compile on 6.1.y: that fixed the problem, so I'm guessing the "random.c" on the 6.1.y branch is not in a good state.

Does anyone have suggestions on how to proceed from here?
going Posted Friday at 10:01 AM

2 hours ago, mikhailai said: Does anyone have suggestions on how to proceed from here?

Analysis:

linux-stable> git log --pretty=oneline v6.1.87..07b37f227c8daa27e68f57b1c691fab34a06731e | wc -l
8

Maybe we will do the following:

1) Freeze the legacy kernel at version 6.1.87.

diff --git a/config/sources/families/include/sunxi64_common.inc b/config/sources/families/include/sunxi64_common.inc
index 18775666..e37fe516 100644
--- a/config/sources/families/include/sunxi64_common.inc
+++ b/config/sources/families/include/sunxi64_common.inc
@@ -25,6 +25,7 @@ case $BRANCH in
 	legacy)
 		declare -g KERNEL_MAJOR_MINOR="6.1" # Major and minor versions of this kernel.
+		declare -g KERNELBRANCH="tag:v6.1.78"
 		;;
 	current)
diff --git a/config/sources/families/include/sunxi_common.inc b/config/sources/families/include/sunxi_common.inc
index 93b14ab8..f6261767 100644
--- a/config/sources/families/include/sunxi_common.inc
+++ b/config/sources/families/include/sunxi_common.inc
@@ -26,6 +26,7 @@ case $BRANCH in
 	legacy)
 		declare -g KERNEL_MAJOR_MINOR="6.1" # Major and minor versions of this kernel.
+		declare -g KERNELBRANCH="tag:v6.1.78"
 		;;
 	current)

2) Rework the patches (extract them again) for this kernel version.

3) Leave this kernel in this state, and eliminate the cause in the current 6.6 kernel, if it is present there.
going Posted Friday at 10:29 AM

On 17.06.2024 at 20:52, mikhailai said: "linux-image-current-sunxi" version 24.5.1 with 6.6.31 kernel: boots fine.

3 hours ago, mikhailai said: The culprit is the following commit: 07b37f227c8daa27e68f57b1c691fab34a06731e (HEAD) random: handle creditable entropy from atomic process context

In the 6.6 kernel this patch is present after the v6.6.28 tag, as 998f52a860555a9f02242bc0a4b3e9b47d47dc11.

I think the problem lies elsewhere.
going Posted Friday at 11:17 AM

On 20.06.2024 at 02:21, Stephen Graf said: Cut the log file by taking out all the kernel build log entries. https://paste.armbian.com/ibamekatak

Summary: kernel patching: 498 total patches; 498 applied; 81 with problems; 80 needs_rebase; 4 not_mbox

This line indicates that problems exist, but is silent about what kind of problems they are. A line offset? Fuzz? In such cases an individual hunk can end up applied to a different node in the DTS or to a different function in the C code. Only a person who reads both the source file and the patch file can detect the problem.
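To compare these counters between builds, the summary line can be split into fields. This is only a sketch: `patch_count` is a made-up helper name, and it assumes the summary keeps the `NUMBER LABEL; NUMBER LABEL` layout of the line quoted above.

```shell
#!/bin/sh
# Hypothetical helper: pull one counter out of the "kernel patching" summary.
# Usage: patch_count "SUMMARY LINE" LABEL
# LABEL is the first word after the number, e.g. applied, with, needs_rebase.
patch_count() {
    # Put every "NUMBER LABEL ..." item on its own line, then pick the one
    # whose second word equals the requested label.
    printf '%s\n' "$1" | tr ';:' '\n\n' |
        awk -v label="$2" '$2 == label { print $1 }'
}

# Example:
#   line='Summary: kernel patching: 498 total patches; 498 applied; 81 with problems; 80 needs_rebase; 4 not_mbox'
#   patch_count "$line" needs_rebase
```

The counters only say how many patches need attention; finding out whether a hunk landed in the wrong place still means reading the patch next to the patched source.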
mikhailai (Author) Posted Friday at 06:34 PM

6 hours ago, going said: This patch in the 6.6 kernel is present after the v6.6.28 tag 998f52a860555a9f02242bc0a4b3e9b47d47dc11 I think the problem lies elsewhere.

True, but the "random.c" on the 6.6 branch contains a bunch of other changes not present in 6.1 (15 commits, to be precise). I suppose the change "random: handle creditable entropy from atomic process context" works well with these commits, but is broken without some of them. In fact, I more or less confirmed that, per my earlier comment:

10 hours ago, mikhailai said: I've tried taking the "random.c" from the 6.6.34 kernel and doing hacky modifications to get to to compile on 6.1.y: that fixed the problem, so I'm guessing the "random.c" on the 6.1.y branch is not in a good state.

Overall, this looks plausible. The change was originally made and tested on mainline, with all the other changes present. Then it was cherry-picked into the 6.6 and 6.1 branches, where it received more limited testing that did not catch the problem. I'm guessing the problem does not show up on x86 but does show up on armhf. It could be timing dependent, so it only shows up under specific circumstances.

I'm hoping there are just a few commits (ideally just one) that could be cherry-picked into the 6.1 branch to make it work.
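One way to shortlist candidate commits is to compare commit subjects for "random.c" on the two branches. The sketch below is an assumption-laden illustration rather than the method used in this thread: `missing_subjects` is a made-up name, the tags in the usage comment are just versions mentioned above, and matching by subject relies on stable backports keeping the upstream subject line.

```shell
#!/bin/sh
# Hypothetical helper: print the lines of NEW_LIST that do not occur in OLD_LIST.
# Usage: missing_subjects OLD_LIST NEW_LIST
missing_subjects() {
    old_list=$1
    new_list=$2
    # -F: fixed strings, -x: whole-line match, -v: invert, -f: patterns from file
    grep -Fxv -f "$old_list" "$new_list"
}

# In a linux-stable checkout (tags are examples taken from the thread):
#   git log --format=%s v6.1..v6.1.94 -- drivers/char/random.c > random-6.1.txt
#   git log --format=%s v6.1..v6.6.34 -- drivers/char/random.c > random-6.6.txt
#   missing_subjects random-6.1.txt random-6.6.txt
```

Each subject that comes out is a change present on the 6.6 side but absent from 6.1.y, which gives a list to try cherry-picking one at a time.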
going Posted Friday at 07:33 PM

50 minutes ago, mikhailai said: I'm hoping there would be just a few commits (ideally just one) that could be cherry-picked into 6.1 branch to make it work.

Okay, I get it. Can we just take these few patches from the 6.6 kernel and add them to the 6.1 kernel? It is better if they are in the form in which they already exist in 6.6. I mean the ones you have already tested.