
going

Members
  • Posts

    675

Reputation Activity

  1. Like
    going got a reaction from Werner in Banana Pi - Armbian Buildsystem | Development Team   
    Will this be a fork that returns its code to the parent project https://github.com/armbian/build ?
    Or is it planned to be developed as an independent project based on this branch?
  2. Like
    going got a reaction from Sesse in SV08 can't find thermal zone on 6.11.2   
    PR: https://github.com/armbian/build/pull/7442
  3. Like
    going reacted to Sesse in SV08 can't find thermal zone on 6.11.2   
    I'm fine with this subject edit.
  4. Like
    going reacted to Sesse in SV08 can't find thermal zone on 6.11.2   
    Thanks for the pointers. Here is my patch (I tested it by dropping it into the right userpatches directory, compiling the kernel and then booting):
     
    From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
    From: Steinar H. Gunderson <steinar+kernel@gunderson.no>
    Date: Mon, 4 Nov 2024 15:35:38 +0000
    Subject: Fix broken allwinner,sram dependency on CB1

    On BigTreeTech CB1, the thermal sensor has an allwinner,sram property
    pointing to <&syscon>. However, Armbian has an out-of-tree kernel patch
    that creates dependencies based on allwinner,sram properties, which
    assumes that they point to sram nodes exactly two levels below the
    syscon node (instead of the syscon itself). This manifests itself as
    the thermal sensor refusing to load with a nonsensical error message:

      [ 23.775976] platform 5070400.thermal-sensor: deferred probe pending: platform: wait for supplier

    Note that it does not say _which_ supplier it is waiting for
    (the message ends in a space and then no supplier).

    The patch was unproblematic in the 5.6 megous patch set, where it was
    introduced, and in 6.6, which is current for Armbian, but in 6.8, the
    sun8i-thermal driver got mainlined, with this extra property compared
    to the out-of-tree version we used before (since it wants to clear a
    special bit at 0x300000 instead of relying on the firmware to do so
    before kernel boot).

    Fix by being a bit more flexible when we walk up the tree, so that
    we always stop at the syscon node.

    Tested on a Sovol SV08, which is CB1-based.

    Signed-off-by: Steinar H. Gunderson <steinar+kernel@gunderson.no>
    ---
     drivers/of/property.c | 5 +++--
     1 file changed, 3 insertions(+), 2 deletions(-)

    diff --git a/drivers/of/property.c b/drivers/of/property.c
    index 6fcfcda9d..f00f2b129 100644
    --- a/drivers/of/property.c
    +++ b/drivers/of/property.c
    @@ -1368,12 +1368,13 @@ static struct device_node *parse_allwinner_sram(struct device_node *np,
         if (index > 0)
             return NULL;
         sram_node = of_parse_phandle(np, prop_name, 0);
    -    sram_node = of_get_parent(sram_node);
    -    sram_node = of_get_parent(sram_node);
    +    while (sram_node && !of_node_is_type(sram_node, "syscon")) {
    +        sram_node = of_get_parent(sram_node);
    +    }
         return sram_node;
     }
     static const struct supplier_bindings of_supplier_bindings[] = {
    --
    Created with Armbian build tools https://github.com/armbian/build

    I'm going to mark this as solved; I hope it can go upstream at some point.
     
    Note that I think the most sensible thing to do is still just to delete the patch in question, not to patch it further. But I don't understand 100% why it was added in the first place; maybe there is some reason that is still valid for some platform.
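
    The difference between the old and the patched lookup can be sketched outside the kernel. The following is only an illustrative model, not kernel code: the Node class and the node names (soc, syscon@3000000, sram@28000, sram-section@0) are hypothetical stand-ins for struct device_node and a device tree, chosen to show why taking the parent twice fails when the property already points at the syscon node.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical stand-in for the kernel's struct device_node."""
    name: str
    node_type: str = ""
    parent: Optional["Node"] = None


def two_levels_up(node):
    """Old behaviour: take the parent twice without checking anything."""
    for _ in range(2):
        node = node.parent if node is not None else None
    return node


def walk_to_syscon(node):
    """Patched behaviour: climb until a syscon node (or the top) is reached."""
    while node is not None and node.node_type != "syscon":
        node = node.parent
    return node


# Hypothetical tree: soc -> syscon -> sram -> sram section
soc = Node("soc")
syscon = Node("syscon@3000000", node_type="syscon", parent=soc)
sram = Node("sram@28000", parent=syscon)
section = Node("sram-section@0", parent=sram)

# Property pointing at an sram section: both variants find the syscon.
print(two_levels_up(section).name, walk_to_syscon(section).name)
# Property pointing at the syscon itself: only the patched walk finds it.
print(two_levels_up(syscon), walk_to_syscon(syscon).name)
```

    In this model the old lookup walks past the syscon and ends up with no usable supplier, which is consistent with the empty supplier name in the error message quoted in the patch description.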
  5. Like
    going reacted to Sesse in SV08 can't find thermal zone on 6.11.2   
    I removed the patch, and the resulting kernel booted and found its thermal zones. So I think the basic issue is that the patch is just doing of_get_parent() twice without checking that this makes sense.
    sesse@amalie:~$ uname -a
    Linux amalie 6.11.2-edge-sunxi64 #2 SMP Fri Oct  4 14:38:57 UTC 2024 aarch64 GNU/Linux
    sesse@amalie:~$ ls -l /sys/class/thermal/
    total 0
    lrwxrwxrwx 1 root root 0 Nov  3 20:28 cooling_device0 -> ../../devices/virtual/thermal/cooling_device0
    lrwxrwxrwx 1 root root 0 Nov  3 20:25 thermal_zone0 -> ../../devices/virtual/thermal/thermal_zone0
    lrwxrwxrwx 1 root root 0 Nov  3 20:25 thermal_zone1 -> ../../devices/virtual/thermal/thermal_zone1
    lrwxrwxrwx 1 root root 0 Nov  3 20:25 thermal_zone2 -> ../../devices/virtual/thermal/thermal_zone2
    lrwxrwxrwx 1 root root 0 Nov  3 20:25 thermal_zone3 -> ../../devices/virtual/thermal/thermal_zone3
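
    To confirm that the zones listed above also report values, a small reader can help. This is a minimal sketch, assuming the usual sysfs layout where each thermal_zoneN directory has a "type" file and a "temp" file in millidegrees Celsius; the function name read_thermal_zones is made up for this example.

```python
from pathlib import Path


def read_thermal_zones(base="/sys/class/thermal"):
    """Return {zone_name: (type, temperature_in_celsius)} for each thermal zone."""
    zones = {}
    for zone in sorted(Path(base).glob("thermal_zone*")):
        try:
            zone_type = (zone / "type").read_text().strip()
            millideg = int((zone / "temp").read_text().strip())
        except (OSError, ValueError):
            continue  # zone exists but cannot be read; skip it
        zones[zone.name] = (zone_type, millideg / 1000.0)
    return zones


if __name__ == "__main__":
    for name, (zone_type, temp) in read_thermal_zones().items():
        print(f"{name}: {zone_type} {temp:.1f} C")
```

    On a kernel where the thermal sensor failed to probe, the result would simply be empty.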
  6. Like
    going got a reaction from Werner in Boot fails after NVMe/SPI install   
    Thank you very much.
    I am collecting all the bugs found when using this utility in a notebook.
    I want to rework it.
  7. Like
    going reacted to royk in No network connection after update   
    @going https://github.com/orangepi-xunlong/u-boot-orangepi/tree/v2017.09-rk3588
  8. Like
    going reacted to ozacas in 24.11.0-trunk armbian-install failure induced by old partition devices   
    the original system that started the thread, after (failed) armbian-install run, looks like this with the USB stick used to boot the system:
     
    acas@uefi-arm64:~$ df
    Filesystem     1K-blocks    Used Available Use% Mounted on
    tmpfs            1605708   12816   1592892   1% /run
    /dev/sda2       59094688 5773640  52631248  10% /
    tmpfs            8028524       0   8028524   0% /dev/shm
    tmpfs               5120       0      5120   0% /run/lock
    efivarfs              64      42        23  65% /sys/firmware/efi/efivars
    tmpfs            8028524       0   8028524   0% /tmp
    /dev/sda1         258094     150    257945   1% /boot/efi
    /dev/zram1         47960    2152     42224   5% /var/log
    tmpfs            1605704      60   1605644   1% /run/user/1000
    acas@uefi-arm64:~$ fdisk -l
    Disk /dev/nvme0n1: 953.87 GiB, 1024209543168 bytes, 2000409264 sectors
    Disk model: Fanxiang S501Q 1TB
    Units: sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disklabel type: gpt
    Disk identifier: A3C2AE18-4086-48D2-B1F9-90B315BAB1FA

    Device         Start        End    Sectors   Size Type
    /dev/nvme0n1p1  2048 2000408575 2000406528 953.9G Linux filesystem

    Disk /dev/sda: 57.62 GiB, 61865984000 bytes, 120832000 sectors
    Disk model: USB Flash Drive
    Units: sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disklabel type: gpt
    Disk identifier: 611F458A-F35E-B04A-9F22-8E04707F1E17

    Device      Start       End   Sectors  Size Type
    /dev/sda1    8192    532479    524288  256M EFI System
    /dev/sda2  532480 120831966 120299487 57.4G Linux root (ARM-64)

    Disk /dev/zram0: 7.66 GiB, 8221212672 bytes, 2007132 sectors
    Units: sectors of 1 * 4096 = 4096 bytes
    Sector size (logical/physical): 4096 bytes / 4096 bytes
    I/O size (minimum/optimal): 4096 bytes / 4096 bytes

    Disk /dev/zram1: 50 MiB, 52428800 bytes, 12800 sectors
    Units: sectors of 1 * 4096 = 4096 bytes
    Sector size (logical/physical): 4096 bytes / 4096 bytes
    I/O size (minimum/optimal): 4096 bytes / 4096 bytes
    Note how there is no EFI partition on the NVMe drive. But the image from the Armbian build framework has an EFI system partition, and that is what I'm using to boot the system right now via the USB stick.
  9. Like
    going reacted to ozacas in 24.11.0-trunk armbian-install failure induced by old partition devices   
    the box i was referring to has the following -
    acas@opi2:~/build$ fdisk -l
    Disk /dev/nvme0n1: 953.87 GiB, 1024209543168 bytes, 2000409264 sectors
    Disk model: Fanxiang S500Pro 1TB
    Units: sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disklabel type: dos
    Disk identifier: 0x4f11c8df

    Device         Boot  Start        End    Sectors   Size Id Type
    /dev/nvme0n1p1        2048     411647     409600   200M ef EFI (FAT-12/16/32)
    /dev/nvme0n1p2      411648 2000409263 1999997616 953.7G 83 Linux

    Disk /dev/sda: 1.82 TiB, 2000398934016 bytes, 3907029168 sectors
    Disk model: ASM1153USB3.0TOS
    Units: sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 33553920 bytes
    Disklabel type: dos
    Disk identifier: 0x2eeae39f

    Device     Boot    Start        End    Sectors  Size Id Type
    /dev/sda1           2048    8390655    8388608    4G  b W95 FAT32
    /dev/sda2        8390656   50333695   41943040   20G 83 Linux
    /dev/sda3       50333696 3907029167 3856695472  1.8T 83 Linux

    Disk /dev/zram0: 3.74 GiB, 4014276608 bytes, 980048 sectors
    Units: sectors of 1 * 4096 = 4096 bytes
    Sector size (logical/physical): 4096 bytes / 4096 bytes
    I/O size (minimum/optimal): 4096 bytes / 4096 bytes

    Disk /dev/zram1: 50 MiB, 52428800 bytes, 12800 sectors
    Units: sectors of 1 * 4096 = 4096 bytes
    Sector size (logical/physical): 4096 bytes / 4096 bytes
    I/O size (minimum/optimal): 4096 bytes / 4096 bytes
    ah... i have a USB connected sata adapter ssd as /dev/sda - forgot about that. Maybe that is accidentally providing boot (used to be connected to an rpi4)
  10. Like
    going got a reaction from MaxT in x96q h313   
    @Nick A I understand you.
    Working with the source code and making patches within the framework of the build system is a sad business.
     
    Let's try to sort it out.
    I'm writing this for everyone. Don't read it if it's familiar to you.
    First, basic knowledge about git:
    A git repository is a chain of related git objects.
    A commit object records a snapshot of the project; git shows the difference between a commit and its parent, and stores objects compressed.
    A git object has a name in the form of a 40-digit hexadecimal number (its hash).
    Each commit is tightly linked to the previous (parent) commit; subsequent (child) commits in turn point back to it.
    Concepts such as HEAD, tag and branch are references to a specific git object.
    The working tree is the repository state extracted to the working directory.
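
    As a small illustration of where the 40-digit names come from: in a default (SHA-1) repository, the name of a file object ("blob") is the SHA-1 hash of a short header plus the file content. The sketch below reproduces that calculation; the function name git_blob_id is made up for this example, and it only covers blobs, not commits or trees.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Compute the name git gives to a file's content in a SHA-1 repository."""
    # Git hashes "blob <size>\0" followed by the raw content.
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


# The empty file always gets the same well-known name:
print(git_blob_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

    The same content always produces the same name, which is why the tag v2024.01 keeps pointing at exactly the same object no matter how many commits are added on top.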
     
    In previous posts, we talked about u-boot. Let's check what we have in the build system.
    My build system is located in the VM in the folder: /home/leo/armbian
     
    leo@armbuild:~/armbian/cache/sources/u-boot-worktree/u-boot/v2024.01$ git branch
    + master
    * u-boot-edge-bananapim3
    leo@armbuild:~/armbian/cache/sources/u-boot-worktree/u-boot/v2024.01$ git log --pretty=oneline -3
    866ca972d6c3cabeaf6dbac431e8e08bb30b3c8e (HEAD -> u-boot-edge-bananapim3, tag: v2024.01) Prepare v2024.01
    82750ce44226e5f2b3bbcd79cf7b3ba3dfd3de4d arm: dts: iot2050: Fix by syncing from Linux
    dbb124cf6888da9581834a3c17b02f958a8afacf configs: j7200: Remove HBMC_AM654 config
    Enter your data from the github here.
    It is important.
    git config --global user.email "you@example.com"
    git config --global user.name "Your Name"
    Let's add all the changed files to the monitored state and make a commit:
    leo@armbuild:~/armbian/cache/sources/u-boot-worktree/u-boot/v2024.01$ sudo git add --all
    leo@armbuild:~/armbian/cache/sources/u-boot-worktree/u-boot/v2024.01$ sudo git commit -m "The Armbian changes"
    [u-boot-edge-bananapim3 dcf395c104] The Armbian changes
     52 files changed, 682 insertions(+), 16 deletions(-)
     create mode 100644 arch/arm/dts/sun50i-a64-recore.dts
     create mode 100644 arch/arm/dts/sun50i-h313-x96q-lpddr3.dts
     create mode 100644 configs/recore_defconfig
     create mode 100644 configs/x96q_lpddr3_defconfig
    Now we will see:
    leo@armbuild:~/armbian/cache/sources/u-boot-worktree/u-boot/v2024.01$ git log --pretty=oneline -3
    dcf395c1044533320913373b3b8da980ac49ac73 (HEAD -> u-boot-edge-bananapim3) The Armbian changes
    866ca972d6c3cabeaf6dbac431e8e08bb30b3c8e (tag: v2024.01) Prepare v2024.01
    82750ce44226e5f2b3bbcd79cf7b3ba3dfd3de4d arm: dts: iot2050: Fix by syncing from Linux
    The HEAD and branch links point to the new git object.
    But the v2024.01 tag continues to point to 866ca972d6c3cabeaf6dbac431e8e08bb30b3c8e
     
    Let's add the debugging code and make a commit:
    leo@armbuild:~/armbian/cache/sources/u-boot-worktree/u-boot/v2024.01$ sudo nano drivers/mmc/sunxi_mmc.c
    leo@armbuild:~/armbian/cache/sources/u-boot-worktree/u-boot/v2024.01$ sudo git add drivers/mmc/sunxi_mmc.c
    leo@armbuild:~/armbian/cache/sources/u-boot-worktree/u-boot/v2024.01$ sudo git commit -m "define DEBUG macros for sunxi mmc"
    Look at the last commit. This is what will be extracted using the command "git format-patch -1":
    leo@armbuild:~/armbian/cache/sources/u-boot-worktree/u-boot/v2024.01$ git log -p -1
    commit c5a54a85e206c32c4a17aa573c8f409899c2a77a (HEAD -> u-boot-edge-bananapim3)
    Author: The-going <48602507+The-going@users.noreply.github.com>
    Date:   Wed Sep 25 09:09:47 2024 +0000

        define DEBUG macros for sunxi mmc

    diff --git a/drivers/mmc/sunxi_mmc.c b/drivers/mmc/sunxi_mmc.c
    index 8b684929e0..c32c9bda28 100644
    --- a/drivers/mmc/sunxi_mmc.c
    +++ b/drivers/mmc/sunxi_mmc.c
    @@ -33,6 +33,8 @@
     #include "sunxi_mmc.h"
    +#define DEBUG
    +
     #ifndef CCM_MMC_CTRL_MODE_SEL_NEW
     #define CCM_MMC_CTRL_MODE_SEL_NEW 0
     #endif
    We will extract these changes to the target directory:
    leo@armbuild:~/armbian/cache/sources/u-boot-worktree/u-boot/v2024.01$ git format-patch -1 -o /home/leo/armbian/patch/u-boot/u-boot-sunxi/
    /home/leo/armbian/patch/u-boot/u-boot-sunxi/0001-define-DEBUG-macros-for-sunxi-mmc.patch
    leo@armbuild:~/armbian/cache/sources/u-boot-worktree/u-boot/v2024.01$ mv /home/leo/armbian/patch/u-boot/u-boot-sunxi/0001-define-DEBUG-macros-for-sunxi-mmc.patch /home/leo/armbian/patch/u-boot/u-boot-sunxi/define-DEBUG-macros-for-sunxi-mmc.patch
    leo@armbuild:~/armbian/cache/sources/u-boot-worktree/u-boot/v2024.01$ cat /home/leo/armbian/patch/u-boot/u-boot-sunxi/define-DEBUG-macros-for-sunxi-mmc.patch
    From c5a54a85e206c32c4a17aa573c8f409899c2a77a Mon Sep 17 00:00:00 2001
    From: The-going <48602507+The-going@users.noreply.github.com>
    Date: Wed, 25 Sep 2024 09:09:47 +0000
    Subject: [PATCH] define DEBUG macros for sunxi mmc

    ---
     drivers/mmc/sunxi_mmc.c | 2 ++
     1 file changed, 2 insertions(+)

    diff --git a/drivers/mmc/sunxi_mmc.c b/drivers/mmc/sunxi_mmc.c
    index 8b684929e0..c32c9bda28 100644
    --- a/drivers/mmc/sunxi_mmc.c
    +++ b/drivers/mmc/sunxi_mmc.c
    @@ -33,6 +33,8 @@
     #include "sunxi_mmc.h"
    +#define DEBUG
    +
     #ifndef CCM_MMC_CTRL_MODE_SEL_NEW
     #define CCM_MMC_CTRL_MODE_SEL_NEW 0
     #endif
    --
    2.34.1

    I have removed the prefix number of the patch file name so that the patch is applied last.
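
    Why renaming the file changes the order can be shown with a tiny sketch. It assumes that patches in a directory are applied in plain sorted file-name order, which the remark about the prefix number implies but which is not confirmed here; the name 0002-some-other-change.patch is a made-up example.

```python
def application_order(patch_names):
    """Order patch file names by a plain byte-wise sort (assumed apply order)."""
    return sorted(patch_names)


patches = [
    "define-DEBUG-macros-for-sunxi-mmc.patch",
    "0001-define-DEBUG-macros-for-sunxi-mmc.patch",
    "0002-some-other-change.patch",  # hypothetical neighbour in the same folder
]

for name in application_order(patches):
    print(name)
```

    Digits sort before letters, so a name without the numeric prefix lands after every numbered patch.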
     
    If I do git format-patch -1 v2024.01 I will get the changes that are stored in the git object referenced by this v2024.01 tag.
     
    Do I need to write here how to correctly add multiple changes to the kernel?
     
  11. Like
    going reacted to kris777 in Wifi setup   
    install in system:
    sudo apt-get install network-manager
    try this in terminal
    nmcli device wifi list
    nmcli device wifi connect "$SSID" password "$PASSWORD"
    I know it's not recommended by developers / @Igor connecting to wifi on Armbian FW ... but I did it like on standard Linux and it also works on FW from OrangePi suppliers 🙂
  12. Like
    going got a reaction from milanp in How to boot from squashfs file ?   
    Quite interesting!
    If you try to explain your ultimate goal and the reasons that led you to this decision, perhaps I can give you some advice.
  13. Like
    going got a reaction from milanp in How to boot from squashfs file ?   
    It's hard to argue with that.
    But the squashfs file system is basically an opportunity to use a compressed root file system to save space.
     
    If we set the task as ensuring fault tolerance of the root file system, then we need to perform a whole set of measures.
    And mounting the root file system with the read-only flag is one of the points.
     
    By the way, u-boot is able to work with the squashfs file system.
    I thought your question was about that.
     
    With respect.
  14. Like
    going reacted to robertoj in How to boot from squashfs file ?   
    There is the option of an "overlay file system". It makes your file system "forget" anything that has changed since the snapshot, upon each poweroff.
     
    look for it in armbian-config
  15. Like
    going got a reaction from sami in nanopi-r5s SD card and EMMC   
    If your device is in the same condition, I will ask you to provide some additional information about the downloaded operating system.
    Your case is unique.
    Your device and mine have the same OS and the same set of packages installed.

    But the armbian-install script behaves completely differently and crashes in your case.
    I need to understand why this is happening and correct the erroneous behavior.
    Please post the output of the following commands:
    cat /proc/cmdline | tr " " "\n"
    lsblk -Py
    df -h
    sudo blkid
    1) You can use a utility from the manufacturer of the chip or device. If this software exists.
    2) You can become a hacker, listen to the following harmful tips, or find them on the Internet:
    Boot the device in any way and get access to the command line with superuser rights.
    The bootloader, u-boot or other, can be written in whole or in parts to three locations on the eMMC:
    /dev/mmcblkXboot0
    /dev/mmcblkXboot1
    /dev/mmcblkX
    X - This is the number for your eMMC
    In order for the device not to boot from eMMC, it is necessary to clear those areas in which parts of the bootloader can be placed.
    sudo su
    echo 0 > /sys/block/mmcblkXboot0/force_ro
    dd if=/dev/zero of=/dev/mmcblkXboot0 bs=1M count=4
    echo 0 > /sys/block/mmcblkXboot1/force_ro
    dd if=/dev/zero of=/dev/mmcblkXboot1 bs=1M count=4
    dd if=/dev/zero of=/dev/mmcblkX bs=1M count=10
    After these steps, your device will be able to boot from the SD card correctly.
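
    Because these commands are destructive, it can help to print them for a concrete eMMC number and double-check the device names before typing anything. The following is a dry-run sketch only: it builds the same command strings as above and executes nothing. The function name emmc_wipe_commands and the index 1 in the example are assumptions for illustration.

```python
def emmc_wipe_commands(index):
    """Build (but do not run) the wipe commands for /dev/mmcblk<index>."""
    dev = f"mmcblk{index}"
    commands = []
    for boot in ("boot0", "boot1"):
        # The boot areas are read-only by default; unlock, then zero 4 MiB.
        commands.append(f"echo 0 > /sys/block/{dev}{boot}/force_ro")
        commands.append(f"dd if=/dev/zero of=/dev/{dev}{boot} bs=1M count=4")
    # Finally zero the first 10 MiB of the main eMMC area.
    commands.append(f"dd if=/dev/zero of=/dev/{dev} bs=1M count=10")
    return commands


# Example with a hypothetical eMMC number 1; compare with `lsblk` first.
for line in emmc_wipe_commands(1):
    print(line)
```

    Check with lsblk that the number really belongs to the eMMC and not to the SD card you booted from.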
     
    You can also use the helpful tips described in great detail in the wiki documentation for NanoPi R5S.
     
    With respect
     
  16. Like
    going got a reaction from sami in nanopi-r5s SD card and EMMC   
    No
     
    Create a check-command.sh file:
    #!/bin/bash
    cm="ls grep awk blkid tr lsblk xargs sync mount df head cat sed mktemp nl chroot lsof parted partprobe mkfs fdisk"
    for c in $cm
    do
        echo "# $c =: $(command -v $c)"
    done
    echo "====="
    root_uuid=$(sed -e 's/^.*root=//' -e 's/ .*$//' < /proc/cmdline)
    root_partition=$(blkid | tr -d '":' | grep "${root_uuid}" | awk '{print $1}')
    root_partition_name=$(echo $root_partition | sed 's/\/dev\///g')
    root_partition_device_name=$(lsblk -ndo pkname $root_partition)
    root_partition_device=/dev/$root_partition_device_name
    emmccheck=$(ls -d -1 /dev/mmcblk* 2>/dev/null | grep -w 'mmcblk[0-9]' | grep -v "$root_partition_device")
    diskcheck=$(lsblk -l | awk -F" " '/ disk / {print $1}' | grep -E '^sd|^nvme|^mmc' | grep -v "$root_partition_device_name" | grep -v boot)
    echo "root_partition_device=$root_partition_device"
    echo "emmccheck=$emmccheck"
    echo "diskcheck=$diskcheck"
    and make it executable.
    chmod +x check-command.sh  
    Run it on the board and publish the output:
     
    sudo ./check-command.sh  
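
    The first step of the script, extracting the root= value from the kernel command line, is the part everything else depends on. Here is a minimal sketch of the same step in Python, for anyone who wants to check it separately; the function name root_from_cmdline and the sample command line are made up for this example.

```python
def root_from_cmdline(cmdline):
    """Return the value of the root= kernel parameter, or None if absent."""
    for token in cmdline.split():
        if token.startswith("root="):
            return token[len("root="):]
    return None


if __name__ == "__main__":
    # Hypothetical command line; on a real board read /proc/cmdline instead.
    sample = "console=ttyS2,1500000 root=UUID=1234-abcd rootwait rw"
    print(root_from_cmdline(sample))
```

    Unlike the sed expression, this returns None when there is no root= parameter at all, which makes that case easy to spot.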
  17. Like
    going got a reaction from sami in nanopi-r5s SD card and EMMC   
    @sami Please publish your OS and BASH version.
     
    leo@bananapim3:~$ lsblk --version
    lsblk from util-linux 2.38.1
  18. Like
    going got a reaction from sami in nanopi-r5s SD card and EMMC   
    Please try to replace the /usr/sbin/armbian-install file with this new version:
    packages/bsp/common/usr/sbin/armbian-install
     
    Please show the output of the `lsblk` command.
    Please post screenshots of the armbian-install dialog.
  19. Like
    going reacted to Igor in Orangepizero does not restart any more   
    I think this part is safe to ignore. Wireless works, BT scan finds nothing (this perhaps needs fixing). But this hardware is too old to bother with. It's more like a reference to see if Allwinner A20 generally works.
     

    That is important part, yes.
     
     
    Always do it the same way, even if it is not the most optimal?
     

    We don't have this info in db at the moment.
     

    It's binary: the board failed completely, i.e. it is not responding but it should be = error at CI level. A few people said that they would help to develop this test framework, but so far it is only what I wrote some time ago.
  20. Like
    going got a reaction from vick_lo in Build the linux-tools   
    This feature is not available at the stage of image assembly using the Armbian assembly system.
    Why? I don't know.
     
    P.S. The Armbian build system does not build and distribute source packages.
    Only binary packages are collected and provided.
    Why? I don't know.
  21. Like
    going reacted to mikhailai in linux-image-legacy-sunxi=24.5.1 (kernel 6.1.92) is broken: stuck at "Starting kernel ..."   
    Hi y'all. Sorry for disappearing for a while: it really took some time to investigate, but now I'm pretty sure I've found the root cause.
     
    TLDR:
    • My analysis above is wrong. The bug is present in v6.1.87, and the change to "drivers/char/random.c" has nothing to do with it: it just accidentally happens to trigger the bug.
    • The problem only manifests itself when the ftrace "mcount" call instruction for the _raw_spin_unlock_irqrestore function in the kernel code straddles the instruction cache lines. This happens when the "_raw_spin_unlock_irqrestore" address ends on (hex): 1c, 3c, 5c, ... fc (see System.map for that).
    • Due to the above, arbitrary changes to the kernel code may trigger this problem to appear or disappear. In other words, the hang may look fixed, but then show up later.
    • It is present through the whole 6.1.y kernel branch, as well as the 6.6.y branch. I did not check the mainline or earlier branches.
    • The problem does not appear when the kernel is compiled with GCC 9, which is the default cross-compiler on Ubuntu 20.04 (Focal).
    • See the bottom of this post for the correct fix. I'll try to get it into the upstream Linux kernel.
    LONG VERSION:
     
    Problem: Linux kernel hangs in early boot on 32-bit ARM platform, when ftrace 4-byte "mcount" function call location for "_raw_spin_unlock_irqrestore" function straddles icache lines.
     
    The problem persists through the whole 6.1.y kernel branch and likely beyond. I could also reproduce it in the 6.6.y branch with a bit more "nop" placement (see below).
     
    ROOT CAUSE ANALYSIS:

    The hang is inside:
    start_kernel -> ftrace_init -> ftrace_process_locs -> ftrace_update_code.
     
    It hangs when it updates the ftrace location (by calling "ftrace_nop_initialize") for the entry for:
    _raw_spin_unlock_irqrestore
    The reason is the following:
    • "ftrace_nop_initialize" calls "ftrace_init_nop", which on 32-bit ARM goes to "ftrace_make_nop".
    • "ftrace_make_nop" calls "ftrace_modify_code" that calls "__patch_text", that in turn calls "__patch_text_real" (defined in "arch/arm/kernel/patch.c") with remap=true.
    • After writing the actual instruction, "__patch_text_real" does the following:
        if (waddr != addr) {
            flush_kernel_vmap_range(waddr, twopage ? size / 2 : size);
            patch_unmap(FIX_TEXT_POKE0, &flags);
        }
        flush_icache_range((uintptr_t)(addr),
                   (uintptr_t)(addr) + size);
    The "patch_unmap" calls the above-mentioned "_raw_spin_unlock_irqrestore". Hereby lies the problem. If it's patching the "_raw_spin_unlock_irqrestore", it invokes the function BEFORE flushing the icache, so there is a possibility of that function having an invalid code created by the combination of the updated and non-updated pieces of the instruction residing in different cache lines. The occurrence of the error strongly depends on other factors: that's why it worked for earlier 6.1.y kernels.
    Necessary factors:
      1. The ftrace location for "_raw_spin_unlock_irqrestore" is NOT 4-byte aligned and 4 bytes at this location straddle the instruction cache line (0x20) boundaries. I.e. the pg->records[i]->ip (hex) value ends on: 0x1e, 0x3e, 0x5e, ... 0xfe. For that function, this value is offset from the function address by 2 bytes.
      2. The previous Ftrace entry needs to be updated as well. That is probably needed to get the icache into inconsistent state. For the reproduced hangs, the previous entry is inside the "_raw_write_unlock_irqrestore" (unlike _raw_spin_unlock_irqrestore, it is NOT being invoked when "ftrace_update_code" is executing).
      3. The problem is present for (cross-compiler) GCC 10, 11, 12. It does not happen when the kernel is compiled with GCC 9, even when condition (1) is satisfied. Not sure what is the reason: could be different code or condition (2) being different, leading to cache NOT get into an inconsistent state. Note, the default cross-compiler on Ubuntu 22.04 (Jammy) is GCC 11, while the default compiler on Ubuntu 20.04 (Focal) is GCC 9.
    Note, the condition (1) can be achieved by increasing/decreasing code size of certain functions. The following algorithm can be used.
      1. Add 4 "nop" instructions at a time to "drivers/char/random.c", "try_to_generate_entropy" function, until "_raw_spin_unlock_irqrestore" address ends on -x8, or -xC, where "x" is odd. E.g. ...1c, ...3c, ...5c, etc. E.g.
         asm("nop;nop;nop;nop; ");
      2. If it ends on 8, add 2 more "nop" instructions to one of the lock functions inside the "__lock_text_start" section: see the System.map on which one comes first/earlier.
    PROPOSED FIX:
    The fix is really simple: just swap the order of "patch_unmap" and "flush_icache_range" in the above code snippet (from  "arch/arm/kernel/patch.c", "__patch_text_real" function). I.e. replace the above code snippet with:
    if (waddr != addr)
        flush_kernel_vmap_range(waddr, twopage ? size / 2 : size);
    flush_icache_range((uintptr_t)(addr),
               (uintptr_t)(addr) + size);
    /* Can only call 'patch_unmap' after flushing dcache and icache,
     * because it calls 'raw_spin_unlock_irqrestore', but that may
     * happen to be the very function we're currently patching
     * (as it happens during the ftrace init).
     */
    if (waddr != addr)
        patch_unmap(FIX_TEXT_POKE0, &flags);
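
    The address condition described in this post can be checked against System.map with a few lines of arithmetic. This is only a sketch of that condition, using the values stated above (0x20-byte cache lines, a 4-byte call instruction, ftrace location 2 bytes after the function address); the function names and the sample addresses are made up for illustration.

```python
CACHE_LINE = 0x20   # instruction cache line size given in the post
INSN_SIZE = 4       # size of the patched "mcount" call instruction
MCOUNT_OFFSET = 2   # ftrace location offset from the function address


def straddles_cache_line(addr, size=INSN_SIZE, line=CACHE_LINE):
    """True if the bytes [addr, addr + size) cross a cache line boundary."""
    return addr // line != (addr + size - 1) // line


def function_is_at_risk(func_addr):
    """True if the ftrace location of a function at func_addr straddles lines."""
    return straddles_cache_line(func_addr + MCOUNT_OFFSET)


# Hypothetical addresses for _raw_spin_unlock_irqrestore from System.map:
for addr in (0xC0A0001C, 0xC0A00018, 0xC0A000FC):
    print(hex(addr), function_is_at_risk(addr))
```

    Addresses ending in 1c, 3c, ... fc come out as at risk, matching the list in the TLDR, while an address ending in 18 does not.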
  22. Like
    going reacted to rockmusic64 in Pinebook Pro Kernel on RockPro64   
    I did that on my home Debian installation and the resulting image seems to boot, but there is no display output on HDMI or DP. It might be, because i did not do it in the recommended way of using ubuntu in a vm. I will try that and report here.
     
    I am using Debian for around eight years exclusively and do some python and arduino programming. I never worked on a big project.
     
  23. Like
    going reacted to Aleksey Vasenev in Stuck on "Starting kernel..."   
    armbianEnv.txt: verbosity=7
     
    1.log
  24. Like
    going reacted to Werner in Add new Spacemit family and support Banana Pi (BPI F3) Open Source Smart Router   
    https://github.com/armbian/build/pull/6771
  25. Like
    going reacted to mikhailai in linux-image-legacy-sunxi=24.5.1 (kernel 6.1.92) is broken: stuck at "Starting kernel ..."   
    Ok, returning to the original question. I did some dissection, and the problem appears to be a 6.1.x kernel bug as opposed to something being broken on the Armbian side.
    Disclaimer: I did not use a proper Armbian build; rather just took the kernel code from "linux-6.1.y" branch and used "config-6.1.77-legacy-sunxi".
     
    So here are my results:
    • The v6.1.87 is booting fine: the same way as "linux-image-legacy-sunxi" version 24.2.1.
    • The v6.1.88 is broken with the same symptoms as "linux-image-legacy-sunxi" version 24.5.1.
    The culprit is the following commit:
    07b37f227c8daa27e68f57b1c691fab34a06731e (HEAD) random: handle creditable entropy from atomic process context

    commit 07b37f227c8daa27e68f57b1c691fab34a06731e
    Author: Jason A. Donenfeld <Jason@zx2c4.com>
    Date:   Wed Apr 17 13:38:29 2024 +0200

        random: handle creditable entropy from atomic process context

        commit e871abcda3b67d0820b4182ebe93435624e9c6a4 upstream.

        The entropy accounting changes a static key when the RNG has
        initialized, since it only ever initializes once. Static key changes,
        however, cannot be made from atomic context, so depending on where the
        last creditable entropy comes from, the static key change might need to
        be deferred to a worker.

        Previously the code used the execute_in_process_context() helper
        function, which accounts for whether or not the caller is
        in_interrupt(). However, that doesn't account for the case where the
        caller is actually in process context but is holding a spinlock.

        This turned out to be the case with input_handle_event() in
        drivers/input/input.c contributing entropy:

          [<ffffffd613025ba0>] die+0xa8/0x2fc
          [<ffffffd613027428>] bug_handler+0x44/0xec
          [<ffffffd613016964>] brk_handler+0x90/0x144
          [<ffffffd613041e58>] do_debug_exception+0xa0/0x148
          [<ffffffd61400c208>] el1_dbg+0x60/0x7c
          [<ffffffd61400c000>] el1h_64_sync_handler+0x38/0x90
          [<ffffffd613011294>] el1h_64_sync+0x64/0x6c
          [<ffffffd613102d88>] __might_resched+0x1fc/0x2e8
          [<ffffffd613102b54>] __might_sleep+0x44/0x7c
          [<ffffffd6130b6eac>] cpus_read_lock+0x1c/0xec
          [<ffffffd6132c2820>] static_key_enable+0x14/0x38
          [<ffffffd61400ac08>] crng_set_ready+0x14/0x28
          [<ffffffd6130df4dc>] execute_in_process_context+0xb8/0xf8
          [<ffffffd61400ab30>] _credit_init_bits+0x118/0x1dc
          [<ffffffd6138580c8>] add_timer_randomness+0x264/0x270
          [<ffffffd613857e54>] add_input_randomness+0x38/0x48
          [<ffffffd613a80f94>] input_handle_event+0x2b8/0x490
          [<ffffffd613a81310>] input_event+0x6c/0x98

        According to Guoyong, it's not really possible to refactor the various
        drivers to never hold a spinlock there. And in_atomic() isn't reliable.

        So, rather than trying to be too fancy, just punt the change in the
        static key to a workqueue always. There's basically no drawback of doing
        this, as the code already needed to account for the static key not
        changing immediately, and given that it's just an optimization, there's
        not exactly a hurry to change the static key right away, so deferal is
        fine.

        Reported-by: Guoyong Wang <guoyong.wang@mediatek.com>
        Cc: stable@vger.kernel.org
        Fixes: f5bda35fba61 ("random: use static branch for crng_ready()")
        Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
        Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

    diff --git a/drivers/char/random.c b/drivers/char/random.c
    index 5d1c8e1c99b5..fd57eb372d49 100644
    --- a/drivers/char/random.c
    +++ b/drivers/char/random.c
    @@ -683,7 +683,7 @@ static void extract_entropy(void *buf, size_t len)
     static void __cold _credit_init_bits(size_t bits)
     {
    -    static struct execute_work set_ready;
    +    static DECLARE_WORK(set_ready, crng_set_ready);
         unsigned int new, orig, add;
         unsigned long flags;
    @@ -699,8 +699,8 @@ static void __cold _credit_init_bits(size_t bits)
         if (orig < POOL_READY_BITS && new >= POOL_READY_BITS) {
             crng_reseed(); /* Sets crng_init to CRNG_READY under base_crng.lock. */
    -        if (static_key_initialized)
    -            execute_in_process_context(crng_set_ready, &set_ready);
    +        if (static_key_initialized && system_unbound_wq)
    +            queue_work(system_unbound_wq, &set_ready);
             wake_up_interruptible(&crng_init_wait);
             kill_fasync(&fasync, SIGIO, POLL_IN);
             pr_notice("crng init done\n");
    @@ -870,8 +870,8 @@ void __init random_init(void)
         /*
          * If we were initialized by the cpu or bootloader before jump labels
    -     * are initialized, then we should enable the static branch here, where
    -     * it's guaranteed that jump labels have been initialized.
    +     * or workqueues are initialized, then we should enable the static
    +     * branch here, where it's guaranteed that these have been initialized.
          */
         if (!static_branch_likely(&crng_is_ready) && crng_init >= CRNG_READY)
             crng_set_ready(NULL);
    The code change is rather simple: it switches from using "execute_in_process_context" to "queue_work", but that switch is causing the lock-up. I don't have enough knowledge to debug why it is happening: suspect some sort of a deadlock.
     
    I've tried taking the "random.c" from the 6.6.34 kernel and doing hacky modifications to get it to compile on 6.1.y: that fixed the problem, so I'm guessing the "random.c" on the 6.1.y branch is not in a good state.
     
    Does anyone have suggestions on how to proceed from here?