going

Members
  • Posts

    581
Reputation Activity

  1. Like
    going reacted to mikhailai in linux-image-legacy-sunxi=24.5.1 (kernel 6.1.92) is broken: stuck at "Starting kernel ..."   
    Hi y'all. Sorry for disappearing for a while: it really took some time to investigate, but now I'm pretty sure I've found the root cause.
     
    TLDR:
    My analysis above is wrong. The bug is already present in v6.1.87, and the change to "drivers/char/random.c" has nothing to do with it: it just accidentally happens to trigger the bug.
    The problem only manifests itself when the ftrace "mcount" call instruction for the _raw_spin_unlock_irqrestore function in the kernel code straddles instruction cache lines. This happens when the "_raw_spin_unlock_irqrestore" address ends in (hex): 1c, 3c, 5c, ... fc (see System.map for that).
    Because of this, arbitrary changes to the kernel code may cause the problem to appear or disappear. In other words, the hang may look fixed, but then show up again later.
    It is present throughout the whole 6.1.y kernel branch, as well as the 6.6.y branch. I did not check the mainline or earlier branches.
    The problem does not appear when the kernel is compiled with GCC 9, which is the default cross-compiler on Ubuntu 20.04 (Focal).
    See the bottom of this post for the correct fix. I'll try to get it into the upstream Linux kernel.
     
    LONG VERSION:
     
    Problem: Linux kernel hangs in early boot on 32-bit ARM platform, when ftrace 4-byte "mcount" function call location for "_raw_spin_unlock_irqrestore" function straddles icache lines.
     
    The problem persists through the whole 6.1.y kernel branch and likely beyond. I could also reproduce it in the 6.6.y branch with a bit more "nop" placement (see below).
     
    ROOT CAUSE ANALYSIS:

    The hang is inside:
    start_kernel -> ftrace_init -> ftrace_process_locs -> ftrace_update_code.
     
    It hangs when it updates the ftrace location (by calling "ftrace_nop_initialize") for the entry for:
    _raw_spin_unlock_irqrestore
     
    The reason is the following:
    "ftrace_nop_initialize" calls "ftrace_init_nop", which on 32-bit ARM goes to "ftrace_make_nop". "ftrace_make_nop" calls "ftrace_modify_code", which calls "__patch_text", which in turn calls "__patch_text_real" (defined in "arch/arm/kernel/patch.c") with remap=true. After writing the actual instruction, "__patch_text_real" does the following:
     
        if (waddr != addr) {
            flush_kernel_vmap_range(waddr, twopage ? size / 2 : size);
            patch_unmap(FIX_TEXT_POKE0, &flags);
        }

        flush_icache_range((uintptr_t)(addr),
                   (uintptr_t)(addr) + size);
     
    The "patch_unmap" function calls the above-mentioned "_raw_spin_unlock_irqrestore". Herein lies the problem. If the function being patched is "_raw_spin_unlock_irqrestore" itself, it is invoked BEFORE the icache is flushed, so that function may contain invalid code created by the combination of the updated and non-updated pieces of the instruction residing in different cache lines. Whether the error occurs strongly depends on other factors: that's why it worked for earlier 6.1.y kernels.
     
    Necessary factors:
    1. The ftrace location for "_raw_spin_unlock_irqrestore" is NOT 4-byte aligned and the 4 bytes at this location straddle an instruction cache line (0x20) boundary. I.e. the pg->records[i]->ip (hex) value ends in: 0x1e, 0x3e, 0x5e, ... 0xfe. For that function, this value is offset from the function address by 2 bytes.
    2. The previous ftrace entry needs to be updated as well. That is probably needed to get the icache into an inconsistent state. For the reproduced hangs, the previous entry is inside "_raw_write_unlock_irqrestore" (unlike _raw_spin_unlock_irqrestore, it is NOT invoked while "ftrace_update_code" is executing).
    3. The problem is present for (cross-compiler) GCC 10, 11, 12. It does not happen when the kernel is compiled with GCC 9, even when condition (1) is satisfied. I'm not sure what the reason is: it could be different generated code, or condition (2) being different, so that the cache does NOT get into an inconsistent state. Note, the default cross-compiler on Ubuntu 22.04 (Jammy) is GCC 11, while the default compiler on Ubuntu 20.04 (Focal) is GCC 9.
     
    Note, condition (1) can be achieved by increasing/decreasing the code size of certain functions. The following algorithm can be used:
    1. Add 4 "nop" instructions at a time to the "try_to_generate_entropy" function in "drivers/char/random.c", until the "_raw_spin_unlock_irqrestore" address ends on -x8, or -xC, where "x" is odd. E.g. ...1c, ...3c, ...5c, etc. E.g. asm("nop;nop;nop;nop; ");
    2. If it ends on 8, add 2 more "nop" instructions to one of the lock functions inside the "__lock_text_start" section: see the System.map for which one comes first/earlier.
     
    PROPOSED FIX:
    The fix is really simple: just swap the order of "patch_unmap" and "flush_icache_range" in the above code snippet (from  "arch/arm/kernel/patch.c", "__patch_text_real" function). I.e. replace the above code snippet with:
        if (waddr != addr)
            flush_kernel_vmap_range(waddr, twopage ? size / 2 : size);

        flush_icache_range((uintptr_t)(addr),
                   (uintptr_t)(addr) + size);

        /* Can only call 'patch_unmap' after flushing dcache and icache,
         * because it calls 'raw_spin_unlock_irqrestore', but that may
         * happen to be the very function we're currently patching
         * (as it happens during the ftrace init).
         */
        if (waddr != addr)
            patch_unmap(FIX_TEXT_POKE0, &flags);
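    The alignment condition from the analysis above can be checked against a System.map before booting a freshly built kernel. What follows is only a minimal sketch and not part of the original post: the helper names "symbol_addr" and "straddles_icache_line" are hypothetical, and the 2-byte offset, the 4-byte patch size and the 0x20-byte cache line are assumptions taken from the post rather than from kernel headers.

```shell
#!/bin/sh
# Hypothetical helpers; all constants are taken from the post's analysis.

# Print the hex address of a symbol from a System.map file
# (one "address type name" triple per line).
symbol_addr() {
    awk -v sym="$2" '$3 == sym { print $1; exit }' "$1"
}

# Succeed (exit status 0) if the 4-byte ftrace call site of a function at the
# given hex address (no 0x prefix) crosses a 0x20-byte icache line boundary.
straddles_icache_line() {
    call_site=$(( 0x$1 + 2 ))          # call site is 2 bytes into the function (per the post)
    line_offset=$(( call_site & 0x1f ))  # position inside the 32-byte cache line
    [ $(( line_offset + 4 )) -gt 32 ]
}
```

    For example, straddles_icache_line "$(symbol_addr System.map _raw_spin_unlock_irqrestore)" succeeds for addresses ending in 1c, 3c, ... fc and fails otherwise. A passing check only covers condition (1); it says nothing about the compiler version or the state of the neighbouring ftrace entry.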
  2. Like
    going reacted to rockmusic64 in Pinebook Pro Kernel on RockPro64   
    I did that on my home Debian installation and the resulting image seems to boot, but there is no display output on HDMI or DP. That might be because I did not do it the recommended way, using Ubuntu in a VM. I will try that and report here.
     
    I have been using Debian exclusively for around eight years and do some Python and Arduino programming. I have never worked on a big project.
     
  3. Like
    going reacted to Aleksey Vasenev in Stuck on "Starting kernel..."   
    armbianEnv.txt: verbosity=7
     
    1.log
  4. Like
    going reacted to Werner in Add new Spacemit family and support Banana Pi (BPI F3) Open Source Smart Router   
    https://github.com/armbian/build/pull/6771
  5. Like
    going reacted to mikhailai in linux-image-legacy-sunxi=24.5.1 (kernel 6.1.92) is broken: stuck at "Starting kernel ..."   
    Ok, returning to the original question. I did some dissection, and the problem appears to be a 6.1.x kernel bug as opposed to something being broken on the Armbian side.
    Disclaimer: I did not use a proper Armbian build; rather just took the kernel code from "linux-6.1.y" branch and used "config-6.1.77-legacy-sunxi".
     
    So here are my results:
    The v6.1.87 is booting fine: the same way as "linux-image-legacy-sunxi" version 24.2.1.
    The v6.1.88 is broken with the same symptoms as "linux-image-legacy-sunxi" version 24.5.1.
    The culprit is the following commit:
    07b37f227c8daa27e68f57b1c691fab34a06731e (HEAD) random: handle creditable entropy from atomic process context
     
    commit 07b37f227c8daa27e68f57b1c691fab34a06731e
    Author: Jason A. Donenfeld <Jason@zx2c4.com>
    Date:   Wed Apr 17 13:38:29 2024 +0200

        random: handle creditable entropy from atomic process context

        commit e871abcda3b67d0820b4182ebe93435624e9c6a4 upstream.

        The entropy accounting changes a static key when the RNG has
        initialized, since it only ever initializes once. Static key changes,
        however, cannot be made from atomic context, so depending on where the
        last creditable entropy comes from, the static key change might need to
        be deferred to a worker.

        Previously the code used the execute_in_process_context() helper
        function, which accounts for whether or not the caller is
        in_interrupt(). However, that doesn't account for the case where the
        caller is actually in process context but is holding a spinlock.

        This turned out to be the case with input_handle_event() in
        drivers/input/input.c contributing entropy:

          [<ffffffd613025ba0>] die+0xa8/0x2fc
          [<ffffffd613027428>] bug_handler+0x44/0xec
          [<ffffffd613016964>] brk_handler+0x90/0x144
          [<ffffffd613041e58>] do_debug_exception+0xa0/0x148
          [<ffffffd61400c208>] el1_dbg+0x60/0x7c
          [<ffffffd61400c000>] el1h_64_sync_handler+0x38/0x90
          [<ffffffd613011294>] el1h_64_sync+0x64/0x6c
          [<ffffffd613102d88>] __might_resched+0x1fc/0x2e8
          [<ffffffd613102b54>] __might_sleep+0x44/0x7c
          [<ffffffd6130b6eac>] cpus_read_lock+0x1c/0xec
          [<ffffffd6132c2820>] static_key_enable+0x14/0x38
          [<ffffffd61400ac08>] crng_set_ready+0x14/0x28
          [<ffffffd6130df4dc>] execute_in_process_context+0xb8/0xf8
          [<ffffffd61400ab30>] _credit_init_bits+0x118/0x1dc
          [<ffffffd6138580c8>] add_timer_randomness+0x264/0x270
          [<ffffffd613857e54>] add_input_randomness+0x38/0x48
          [<ffffffd613a80f94>] input_handle_event+0x2b8/0x490
          [<ffffffd613a81310>] input_event+0x6c/0x98

        According to Guoyong, it's not really possible to refactor the various
        drivers to never hold a spinlock there. And in_atomic() isn't reliable.

        So, rather than trying to be too fancy, just punt the change in the
        static key to a workqueue always. There's basically no drawback of doing
        this, as the code already needed to account for the static key not
        changing immediately, and given that it's just an optimization, there's
        not exactly a hurry to change the static key right away, so deferal is
        fine.

        Reported-by: Guoyong Wang <guoyong.wang@mediatek.com>
        Cc: stable@vger.kernel.org
        Fixes: f5bda35fba61 ("random: use static branch for crng_ready()")
        Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
        Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

    diff --git a/drivers/char/random.c b/drivers/char/random.c
    index 5d1c8e1c99b5..fd57eb372d49 100644
    --- a/drivers/char/random.c
    +++ b/drivers/char/random.c
    @@ -683,7 +683,7 @@ static void extract_entropy(void *buf, size_t len)
     
     static void __cold _credit_init_bits(size_t bits)
     {
    -    static struct execute_work set_ready;
    +    static DECLARE_WORK(set_ready, crng_set_ready);
         unsigned int new, orig, add;
         unsigned long flags;
     
    @@ -699,8 +699,8 @@ static void __cold _credit_init_bits(size_t bits)
     
         if (orig < POOL_READY_BITS && new >= POOL_READY_BITS) {
             crng_reseed(); /* Sets crng_init to CRNG_READY under base_crng.lock. */
    -        if (static_key_initialized)
    -            execute_in_process_context(crng_set_ready, &set_ready);
    +        if (static_key_initialized && system_unbound_wq)
    +            queue_work(system_unbound_wq, &set_ready);
             wake_up_interruptible(&crng_init_wait);
             kill_fasync(&fasync, SIGIO, POLL_IN);
             pr_notice("crng init done\n");
    @@ -870,8 +870,8 @@ void __init random_init(void)
     
         /*
          * If we were initialized by the cpu or bootloader before jump labels
    -     * are initialized, then we should enable the static branch here, where
    -     * it's guaranteed that jump labels have been initialized.
    +     * or workqueues are initialized, then we should enable the static
    +     * branch here, where it's guaranteed that these have been initialized.
          */
         if (!static_branch_likely(&crng_is_ready) && crng_init >= CRNG_READY)
             crng_set_ready(NULL);
     
    The code change is rather simple: it switches from using "execute_in_process_context" to "queue_work", but that switch is causing the lock-up. I don't have enough knowledge to debug why it is happening: suspect some sort of a deadlock.
     
    I've tried taking the "random.c" from the 6.6.34 kernel and doing hacky modifications to get it to compile on 6.1.y: that fixed the problem, so I'm guessing the "random.c" on the 6.1.y branch is not in a good state.
     
    Does anyone have suggestions on how to proceed from here?
  6. Like
    going reacted to Tony3 in Building Kernel does not build modules (*.ko)   
    First of all, thanks for helping me with this.
    I found out what the issue was. When switching some options to M, some code had to be patched, as it apparently was not expected to become a module or to call into one.
    The general root cause is that some C code is missing statements such as "MODULE_IMPORT_NS(DMA_BUF);", needed to use symbols from another module, or "EXPORT_SYMBOL_GPL(dma_contiguous_default_area);", needed to make symbols available to other modules.
    After making the whole "Media support" branch a module, I had to create the attached patches and apply them to my linux kernel 6.1 V24.05 by copying them into the directory build/userpatches/kernel/rk35xx-vendor-6.1
     
    After doing that all went fine: I got the new Armbian kernel built, and was able to compile the tbs drivers against it by following the instructions at https://github.com/tbsdtv/linux_media/wiki
     
    All drivers working fine with tvheadend and using Kodi as a viewer (with hardware acceleration)
     
    In the end, I am impressed that it worked that well. Thanks to Armbian for making it so easy to build a kernel.
    cma.patch contiguous.patch f_uvc.patch
  7. Like
    going got a reaction from O to the o in Banana Pi M3 Boot from EMMC   
    Yes, I confirm. U-boot does not support (not configured during compilation) booting from EMMC.
     
    Today I will fix it for Armbian and for my image.
    Wait a bit.
  8. Like
    going got a reaction from Werner in Banana Pi M3 Boot from EMMC   
    Yes, I confirm. U-boot does not support (not configured during compilation) booting from EMMC.
     
    Today I will fix it for Armbian and for my image.
    Wait a bit.
  9. Like
    going got a reaction from Igor in [Armbian newsletter] - Release is coming   
    I tested the 6.6.30 kernel without changing anything.
    The build was done locally on the main branch.
     
    I noticed that the temperature for A64 and A83T is calculated incorrectly. dmesg does not report any errors.
    Booting the device is a bit weird, with freezes.
    I have not checked U-Boot yet.
    The first step is to re-release the patches. I've already done that for the kernel.
    I'll fix the temperature and make a pull request.
     
  10. Like
    going reacted to Werner in [CNX-Software] - Banana Pi BPI-M6 SBC features SenaryTech SN3680 quad-core Cortex-A73 AI processor   
    If the image had been made by us, its download would be either from armbian.com or https://github.com/armbian/community/ or https://github.com/armbian/os and not from some random Baidu/Google Drive.
  11. Like
    going got a reaction from Alessandro Lannocca in Kali Linux as supported distro   
    thanks. I understand your point of view.

    For a clean system, you may need this key:  KEEP_ORIGINAL_OS_RELEASE=yes
    The script that @Igor advises will add the Kali repository to the /etc/apt/sources.list.d/kali.list file.
    But one inaccuracy remains.
    You'll probably want to change that: https://github.com/armbian/build/blob/27a07d918e3e010f74dc24fcc17f510a8eb35252/lib/functions/rootfs/distro-specific.sh#L150
     
    @Igor sid and unstable are synonyms for the same repository.
     
    sid|unstable)
        cat <<- EOF > "${basedir}"/etc/apt/sources.list
        deb ${DEBIAN_MIRROR} $release main contrib non-free non-free-firmware
        #deb-src ${DEBIAN_MIRROR} $release main contrib non-free non-free-firmware
        EOF
     
    Otherwise, apt will read the same thing twice.
  12. Like
    going got a reaction from dhlii in nanopi R1S-H3   
    @dhlii I wish good health to the developer of embedded Linux.
    It will be very interesting for me to talk to you.
    I have a question. Do you use specialized build systems such as buildroot in your work?
     
    The first thing to do is add the target DTS to the u-boot. You can take this as a basis:
    u-boot> find ./arch/arm/dts/ -name '*nanopi-r1*'
    ./arch/arm/dts/sun8i-h3-nanopi-r1.dts
    ./arch/arm/dts/sun50i-h5-nanopi-r1s-h5.dts
     
    linux-stable> find ./arch/arm/boot/dts/ -name '*nanopi*'
    ./arch/arm/boot/dts/allwinner/sun8i-h3-nanopi-duo2.dts
    ./arch/arm/boot/dts/allwinner/sun8i-h3-nanopi-m1-plus.dts
    ./arch/arm/boot/dts/allwinner/sun8i-h3-nanopi-m1.dts
    ./arch/arm/boot/dts/allwinner/sun8i-h3-nanopi-neo-air.dts
    ./arch/arm/boot/dts/allwinner/sun8i-h3-nanopi-neo.dts
    ./arch/arm/boot/dts/allwinner/sun8i-h3-nanopi-r1.dts
    ./arch/arm/boot/dts/allwinner/sun8i-h3-nanopi.dtsi
     
    This DTS must match the wiring of the pins of the printed circuit board and match the brands of the soldered chips.
    The second good step is to add the default u-boot configuration file.
    This will allow you to rebuild the bootloader by changing only the DTS.
    You can take this as a basis:
    u-boot> find ./configs/ -name '*h3*'
    u-boot> find ./configs/ -name '*nanopi*'
     
    Special attention should be paid to the CONFIG_DRAM_CLK parameter.
    Even on identical boards but from different series, different memory chips can be soldered.
     
    After u-boot has done its job and it loads the dtb of the kernel and the kernel itself, we will be able to dynamically change the dtb using overlays.
    I.e., the DTB in u-boot is hard-coded, the DTB for the kernel we can change dynamically.
     
    P.S. Here I have described my own development process.
  13. Like
    going reacted to royk in Tutorial on how to use your own IR remote controller   
    I've found information on how to use your own IR remote controller from this site: https://forum.odroid.com/viewtopic.php?f=215&t=44671
    In short:
    1. Enable logging from the IR kernel module, enter in a terminal: 
    sudo -i
    echo 1 > /sys/module/rockchip_pwm_remotectl/parameters/code_print
    dmesg -w
     
    2. Check if your remote is supported by pressing the keys on your remote. It should give you info like:
    [ 3485.342354] USERCODE=0xfb04
    [ 3485.369309] RMC_GETDATA=fd
     
    3. Download the overlay file below and edit the usercode and the code for each key. For the key above, for example, the code is 0xfd.
     
    4. Place the header file "rk-input.h" in the same directory as the overlay file. In my case the location is "/usr/src/linux-headers-6.1.43-vendor-rk35xx/include/dt-bindings/input/rk-input.h"
     
    5. Compile and install with:
    cpp -nostdinc remote.dts remote-precompiled.dts
    sudo armbian-add-overlay remote-precompiled.dts
    remote.dts
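    Collecting the codes in step 2 by hand gets tedious when there are many keys. The following is a minimal sketch of a filter that extracts them from the kernel log; the helper name "parse_ir_codes" is hypothetical, and it assumes the log lines look exactly like the two sample lines in step 2.

```shell
#!/bin/sh
# Hypothetical helper: reads kernel log lines on stdin and prints every
# remote user code and key code it finds, both in 0x... form.
parse_ir_codes() {
    sed -n \
        -e 's/.*USERCODE=\(0x[0-9a-fA-F]*\).*/usercode \1/p' \
        -e 's/.*RMC_GETDATA=\([0-9a-fA-F]*\).*/keycode 0x\1/p'
}
```

    It can be fed from the log, e.g. "dmesg | parse_ir_codes", while pressing the keys one by one; the printed values are the ones to enter in the overlay file.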
  14. Like
    going reacted to dd5xl in worked im for Banana Pi M3   
    My DDRAM is "H9CCNNNBJTMLARNUM", where according to the datasheet the "B" should be read as an "8" (8GB density).
     
    And yes, I've read the specs and voltages as well.
    In my experience, any platform has to be tuned in terms of clocks and voltages to become stable. You can't rely, e.g., on a PMIC delivering exactly the voltages programmed in the registers, due to chip variations and PCB layout constraints.
     
    It's not the programmed value but the effective voltage at the consuming chip that makes the difference.
     
  15. Like
    going reacted to dd5xl in worked im for Banana Pi M3   
    @going I'm on Armbian 24.2:
     
    bert@bananapim3:~$ lsb_release -a
    No LSB modules are available.
    Distributor ID: Debian
    Description:    Armbian 24.2.0-trunk.550 bookworm
    Release:        12
    Codename:       bookworm
    bert@bananapim3:~$ uname -a
    Linux bananapim3 6.1.63-current-sunxi #1 SMP Mon Nov 20 10:52:19 UTC 2023 armv7l GNU/Linux
  16. Like
    going reacted to dd5xl in worked im for Banana Pi M3   
    @going I'm still on CONFIG_DRAM_CLK=480 as preset by defconfig.
     
    Please see my .config against U-Boot V2024.01 below!
     

     
     
    BananaPiM3_u-boot_2024.zip
  17. Like
    going got a reaction from highlander0681 in None of the images work.. OrangePi Zero2   
    You won't believe it, but sometimes it seems to me that artificial intelligence begins to mock my messages.
    I apologize.
     
  18. Like
    going reacted to Gunjan Gupta in Orange Pi Zero 3   
    I have the same experience based on my last 7 months of being here. I totally agree with you.
     
    I also don't disagree with the idea of making it simpler for the non-techie user, as long as it's not something that is going to increase the development and maintenance burden for Armbian. I know that no matter how simple we make it for them, a lot of them will just try it once, say it's easier to do things on a Raspberry Pi, and move on. Simply because it's easier for them to find guides for it and they are more geared towards the end result than the learning they gather in the process.
  19. Like
    going got a reaction from Gunjan Gupta in Orange Pi Zero 3   
    A short comment on the discussion.
     
    I never use armbian-config. It doesn't help me figure out the essence of the problem.
    I am considering the overlays that are provided by Armbian as a template for possible use.
    A script that switches something can be harmful. I'm just turning it off.
    Next, I pull out the DTB from the working system, which was actually applied.
     
    I open the schematic diagram of the printed circuit board and the schematic diagram of the device that I want to attach to the board.
    I write out the available pin numbers in the table that I could use.
    I take an overlay file found on the Internet or an existing one as a basis and rewrite it to suit my needs.
     
    Next, I compile this file, add it to the boot configuration and check that it works.
    This algorithm ensures that nothing superfluous appears in the applied device tree.
     
    We can add a lot of automation. We can even connect neural networks to recognize circuit diagrams.😁
    But an experienced user will still try to get around (ignore) all this.
    And an inexperienced user will not understand why something is not working for him.
    He just did not extract the DTB from the running OS and he does not see the real state of things.
     
    It seems to me that helping the user to start thinking with his own head is the best solution.
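     
    The two mechanical steps of this workflow (extracting the applied device tree from the running system and compiling an overlay) can be done with the device tree compiler. The following is only a minimal sketch, not the author's exact commands: the function names are hypothetical, and it assumes that "dtc" is installed, that it is recent enough to support "-@", and that the live tree is exposed under /proc/device-tree.

```shell
#!/bin/sh
# Hypothetical wrappers around the device tree compiler (dtc).

# Decompile the device tree the running kernel is actually using.
# $1: output .dts file, $2: optional tree root (defaults to /proc/device-tree).
dump_live_dt() {
    dtc -I fs -O dts -o "$1" "${2:-/proc/device-tree}"
}

# Compile an overlay source into a binary overlay.
# -@ makes dtc emit symbol information so that label references can be resolved.
# $1: overlay source (.dts), $2: output file (.dtbo).
build_overlay() {
    dtc -@ -I dts -O dtb -o "$2" "$1"
}
```

    Running dump_live_dt running.dts (as root if needed) gives a readable snapshot of what was really applied, which can then be compared with the overlay source before and after a change.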
  20. Like
    going got a reaction from bahtiyar57 in worked im for Banana Pi M3   
    @Tu Hu Until we do the verification, you can use my image. It was created in my version of the build system.
    Armbian_23.10_Bananapim3_bookworm_edge_6.4.10_minimal.zip
  21. Like
    going reacted to bahtiyar57 in Banana Pi M3 crashes if ETH plugged   
    Disabling EEE on my router settings solved the problem.
     
     
    EDIT:
    Is there an option to disable EEE directly from Armbian?
  22. Like
    going got a reaction from bahtiyar57 in Banana Pi M3 crashes if ETH plugged   
    @bahtiyar57
     
     
    _________________                    FTDI232
           BPI       |               _________________
                     |               |           3.3v |_____   ____
     UART      GND  o|---------------|o GND          o| |    ||    |                | Linux
               TX   o|---------------|o RX           o| | USB||    |====USB cable===| console
               RX   o|---------------|o TX           o  |____||____|                | "minicom"
    _________________|               |________5v______|
                                                     o| 3.3v
                                                     o|
    This is the jumper on the device.
    I use this scheme. Everyone uses this scheme. It's safe.
  23. Like
    going got a reaction from Werner in Banana Pi M3 crashes if ETH plugged   
    @bahtiyar57
     
     
    _________________                    FTDI232
           BPI       |               _________________
                     |               |           3.3v |_____   ____
     UART      GND  o|---------------|o GND          o| |    ||    |                | Linux
               TX   o|---------------|o RX           o| | USB||    |====USB cable===| console
               RX   o|---------------|o TX           o  |____||____|                | "minicom"
    _________________|               |________5v______|
                                                     o| 3.3v
                                                     o|
    This is the jumper on the device.
    I use this scheme. Everyone uses this scheme. It's safe.
  24. Like
    going got a reaction from AaronNGray in worked im for Banana Pi M3   
    @Gunjan Gupta I apologize. The machine translator changed the meaning of what I wrote.
    It should be tested by someone who has these boards. First of all, it's me.
    Yes.
    I'm not suggesting anything. I'm just thinking. I will not make the decisions to do this or that.
     
    My thoughts on the code:
    function post_config_uboot_target__extra_configs_for_bananapipro() {
        display_alert "$BOARD" "set dram clock" "info"
        run_host_command_logged scripts/config --set-val CONFIG_DRAM_CLK "384"
    }
     
    I didn't see the difference in the names bananapipro and bananapim3.
    I'm sorry, the glasses are on my nose.
    But that doesn't change the point. The user will only see the message:
    bananapipro set dram clock
    What if he added his own custom patch? Does the build system allow the user to add patches today?
    Maybe give the user the opportunity to set the value of this variable? For example:
    UBOOT_CONFIG_DRAM_CLK="${UBOOT_CONFIG_DRAM_CLK:-384}"
    function post_config_uboot_target__extra_configs_for_bananapipro() {
        display_alert "$BOARD" "set dram clock to $UBOOT_CONFIG_DRAM_CLK" "info"
        run_host_command_logged scripts/config --set-val CONFIG_DRAM_CLK "$UBOOT_CONFIG_DRAM_CLK"
    }
     
    Please don't listen to my grumbling, but do as you see fit.
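     
    The "${VAR:-default}" expansion used in this suggestion can be tried in isolation, outside the build system. The following is a minimal sketch; the helper name "effective_dram_clk" is hypothetical, and UBOOT_CONFIG_DRAM_CLK is the variable proposed in the post, not an existing Armbian build option.

```shell
#!/bin/sh
# Hypothetical helper: prints the DRAM clock that would be written to
# CONFIG_DRAM_CLK, falling back to 384 when the variable is unset or empty.
effective_dram_clk() {
    printf '%s\n' "${UBOOT_CONFIG_DRAM_CLK:-384}"
}
```

    With the variable unset or empty the helper prints 384; after UBOOT_CONFIG_DRAM_CLK=480 it prints 480. The ":-" form (rather than "-") is what makes an empty value fall back to the default as well.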
  25. Like
    going reacted to AaronNGray in worked im for Banana Pi M3   
    @going Armbian_23.10_Bananapim3_bookworm_edge_6.4.10_minimal - boots and expands fine, the caps lock light works as well now ! reboots fine. sudo apt-get update works,  sudo apt-get upgrade works, and reboots.
     
    Armbian_community_24.2.0-trunk.449_Bananapim3_bookworm_current_6.6.13 - boots and expands fine, caps lock light works. sudo reboot now, shuts down, and fails on reboot, power down/up reboot fails too. Attached logs
    system.journalsystem.journalsystem.journal
    kern.log syslog system@11efa2ed424246fabfc13bc624702e97-0000000000000001-00060fba61d3372c.journal