Sesse Posted November 3 Posted November 3 Following up from another thread: I compiled an edge kernel (6.11.2-edge-sunxi64) for my Sovol SV08, in part because I'd like ftrace support (for perf sched). It comes up, except there is no /class/thermal/thermal_zone0/ (which Klipper demands to work, perhaps sensibly enough). dmesg says: [ 23.775976] platform 5070400.thermal-sensor: deferred probe pending: platform: wait for supplier It's a bit strange, since my understanding is that the full error message should be “wait for supplier XYZ”, i.e., some specific supplier. This was built with ./compile.sh kernel BOARD=bigtreetech-cb1 BRANCH=edge KERNEL_GIT=shallow and then a hack to use only -j2 (or the device will run out of memory during build). I installed both linux-image-edge-sunxi64 and linux-dtb-edge-sunxi64. The version is 4.11.0-trunk_arm64__6.11.2-S7aa2-D29ea-P5d78-C6b40H5c21-HK01ba-Vc222-Ba3b7-R448a (that's a mouthful). I modified the kernel config, but only to enable ftrace; I didn't mess with thermal or any other driver. armbianmonitor -u: https://paste.armbian.com/gokinaxefa 0 Quote
Sesse Posted November 3 Author Posted November 3 For fun, I hacked the DTB by removing this line (which is the only relevant thing that seems to be different between the 6.6 and 6.11 DTs): ths: thermal-sensor@5070400 { …more lines here… - allwinner,sram = <&syscon>; …more lines here… } After that, the driver loaded just fine (no kernel message, /sys/bus/platform/devices/5070400.thermal-sensor/waiting_for_supplier contains 0), but there are still no thermal zones in /sys/class/thermal. It seems that what really changed between 6.6 and 6.11 is that the h616 thermal drivers got upstreamed, so most of the Armbian patches have been obsoleted, but I guess in the process, something broke for the CB1. 0 Quote
JohnTheCoolingFan Posted November 3 Posted November 3 Tip for compiling: you should be able to use docker for easy cross-compilation on your pc instead of the board itself. As for thermal zones, it's probably a matter of device tree tweaking, I'll take a look. 0 Quote
Sesse Posted November 3 Author Posted November 3 Yes, I tried (albeit without Docker, just regular cross-compiling), and it failed with https://paste.armbian.com/niyisiciqa . It seems there's some confusion somewhere as of whether the image is supposed to be compressed or not (I think maybe Arm does decompression somehow in U-Boot?), so vmlinux versus vmlinuz. I tried looking briefly into it, but figured it was easier just to wait it out on the board. 🙂 0 Quote
going Posted November 3 Posted November 3 2 часа назад, Sesse сказал: Yes, I tried (albeit without Docker, just regular cross-compiling), and it failed with https://paste.armbian.com/niyisiciqa . Try rolling back the build system to this commit: 90637986eef70e5478105b64039a8ba1bfd85062 All subsequent commits are irrelevant to sunxi. 2 часа назад, Sesse сказал: the h616 thermal Check the correctness of the temperature triplet in the DTS, fix it and the sun8i_thermal temperature driver will work correctly. 0 Quote
Sesse Posted November 3 Author Posted November 3 10 minutes ago, going said: Check the correctness of the temperature triplet in the DTS, fix it and the sun8i_thermal temperature driver will work correctly. Could you be a bit more specific? Which part of the DTS is this? I'm looking through the difference in drivers, and seemingly the upstream driver has this snippet in the commit message: Quote Also the registers readings are wrong, unless a bit in the first SYS_CFG register cleared, so set exercise the SRAM regmap to take care of that. which explains why it has that “allwinner,sram” dependency (and why it silently refuses to load if I take it out). Specifically, the driver has a “.needs_sram” in the kernel. https://patches.linaro.org/project/linux-pm/patch/20240219153639.179814-6-andre.przywara@arm.com/ So why that handle isn't satisfied… I have no idea. (I'm definitely not well-versed in device trees.) 0 Quote
Sesse Posted November 3 Author Posted November 3 Maybe simply patches.megous/of-property-fw_devlink-Support-allwinner-sram-links.patch needs to go? There's nothing like it in the upstream 6.11 kernel, and it seems broken for the case where the DT links to <&syscon> directly instead of to a child of a child of it. 0 Quote
going Posted November 3 Posted November 3 17 минут назад, Sesse сказал: Maybe simply patches.megous/of-property-fw_devlink-Support-allwinner-sram-links.patch needs to go? There's nothing like it in the upstream 6.11 kernel, and it seems broken for the case where the DT links to <&syscon> directly instead of to a child of a child of it. In order to make it easier to view and analyze the source code, I support the kernel with patches already applied. Try to compare it with others sun50i-h616 trips It definitely works. sun50i-a64 trips 0 Quote
Sesse Posted November 3 Author Posted November 3 I don't get it; isn't the problem the SRAM provider, not the configured trip limits? What am I supposed to look for here? 0 Quote
going Posted November 3 Posted November 3 26 минут назад, Sesse сказал: I don't get it; isn't the problem the SRAM provider, not the configured trip limits? What am I supposed to look for here? Please understand me correctly. I only have an A64 processor. And I can only check the real work on it. For the rest, only compilation. When moving patches to a new version, something always breaks and the work of fixing it falls on someone else who has boards with one processor or another. I can only give you hints. 0 Quote
Sesse Posted November 3 Author Posted November 3 I removed the patch, and the resulting kernel booted and found its thermal zones. So I think the basic issue is that the patch just is doing of_get_parent() twice without checking that this makes sense. sesse@amalie:~$ uname -a Linux amalie 6.11.2-edge-sunxi64 #2 SMP Fri Oct 4 14:38:57 UTC 2024 aarch64 GNU/Linux sesse@amalie:~$ ls -l /sys/class/thermal/ total 0 lrwxrwxrwx 1 root root 0 Nov 3 20:28 cooling_device0 -> ../../devices/virtual/thermal/cooling_device0 lrwxrwxrwx 1 root root 0 Nov 3 20:25 thermal_zone0 -> ../../devices/virtual/thermal/thermal_zone0 lrwxrwxrwx 1 root root 0 Nov 3 20:25 thermal_zone1 -> ../../devices/virtual/thermal/thermal_zone1 lrwxrwxrwx 1 root root 0 Nov 3 20:25 thermal_zone2 -> ../../devices/virtual/thermal/thermal_zone2 lrwxrwxrwx 1 root root 0 Nov 3 20:25 thermal_zone3 -> ../../devices/virtual/thermal/thermal_zone3 1 Quote
Sesse Posted November 3 Author Posted November 3 I think that if you don't want to just delete the patch (I don't think it makes much sense anymore), then maybe something like this, but I have no idea how to build a kernel from a worktree (i.e., without making a .patch by hand and running through the patch procedure): sram_node = of_parse_phandle(np, prop_name, 0); if (!of_node_is_type(sram_node, "syscon")) sram_node = of_get_parent(sram_node); if (!of_node_is_type(sram_node, "syscon")) sram_node = of_get_parent(sram_node); I.e., don't go walk the tree looking for the syscon node if we're already at it. 0 Quote
going Posted November 4 Posted November 4 Did I understand correctly? Did you mean this patches.megous/of-property-fw_devlink-Support-allwinner-sram-links.patch? 11 часов назад, Sesse сказал: I removed the patch, and the resulting kernel booted and found its thermal zones. So I think the basic issue is that the patch just is doing of_get_parent() twice without checking that this makes sense. You can disable any patch by simply putting a minus sign at the beginning of a line in the series file: diff --git a/patch/kernel/archive/sunxi-6.11/series.conf b/patch/kernel/archive/sunxi-6.11/series.conf index 3eaf6617..3a2067e0 100644 --- a/patch/kernel/archive/sunxi-6.11/series.conf +++ b/patch/kernel/archive/sunxi-6.11/series.conf @@ -171,7 +171,7 @@ patches.megous/usb-gadget-Fix-dangling-pointer-in-netdev-private-data.patch patches.megous/mmc-dw-mmc-rockchip-fix-sdmmc-after-soft-reboot.patch patches.megous/Revert-drm-sun4i-lvds-Invert-the-LVDS-polarity.patch - patches.megous/of-property-fw_devlink-Support-allwinner-sram-links.patch +- patches.megous/of-property-fw_devlink-Support-allwinner-sram-links.patch patches.megous/arm64-dts-rockchip-rk356x-Fix-PCIe-register-map-and-ranges.patch patches.megous/Fix-intptr_t-typedef.patch patches.megous/mmc-sunxi-mmc-Remove-runtime-PM.patch without making any other changes to the build system. The best solution is to make changes on top of all applied patches. Make a git commit and extract it as a patch to the root directory with patches. This can be done directly in the build system. sesion (1) cat <<- EOF > "userpatches/config-test.conf" display_alert "Settings from" "config-test.conf file" "info" BOARD=bigtreetech-cb1 BRANCH=edge RELEASE=bookworm KERNEL_GIT=shallow # Enter your github credentials here so that we know the author and how to contact you. declare -g MAINTAINERMAIL="48602507+The-going@users.noreply.github.com" declare -g MAINTAINER="The-going" declare -g SYNC_CLOCK="no" EOF ./compile.sh test kernel-patch # The build system will apply all patches, commit and stop. # Don't do anything in this terminal session (1). # The script will print the path to the directory where you want to make changes. # You will need another session (2) in the terminal to access and act on the kernel repository. session (2): cd /path/to/kernel-worktree sudo su # Make changes to the file in any convenient editor. For example: nano README # write and exit the editor session (1) Press <ENTER> And you'll see something like this: ───────┬─────────────────────────────────────────────────────────────────────────────────────────────────────────────── │ File: /home/leo/armbian/output/patch/kernel-rockchip-rk3588-edge.patch ───────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────── 1 │ diff --git a/README b/README 2 │ index fd903645e..fa0d6844d 100644 3 │ --- a/README 4 │ +++ b/README 5 │ @@ -1,5 +1,5 @@ 6 │ -Linux kernel 7 │ -============ 8 │ +Linux kernel Armbian framework patching 9 │ +======================================= 10 │ 11 │ There are several guides for kernel developers and users. These guides can 12 │ be rendered in a number of formats, like HTML and PDF. Please read ───────┴─────────────────────────────────────────────────────────────────────────────────────────────────────────────── [🚸] Are you happy with this patch? [ Type 'yes' to accept, 'stop' to stop patching, or anything else to keep patching ] Enter yes The script will make the correct patch assign it a strange name that does not match the "Subject:" field and put it in the "output/patch/" folder. leo@noble-vm:~/armbian$ batcat /home/leo/armbian/output/patch/kernel-rockchip-rk3588-edge.patch ───────┬───────────────────────────────────────────────────────────────────────── │ File: /home/leo/armbian/output/patch/kernel-rockchip-rk3588-edge.patch ───────┼───────────────────────────────────────────────────────────────────────── 1 │ From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001 2 │ From: The-going <48602507+The-going@users.noreply.github.com> 3 │ Date: Mon, 4 Nov 2024 12:21:22 +0000 4 │ Subject: Patching kernel rockchip-rk3588 files README 5 │ 6 │ Signed-off-by: The-going <48602507+The-going@users.noreply.github.com> 7 │ --- 8 │ README | 4 ++-- 9 │ 1 file changed, 2 insertions(+), 2 deletions(-) 10 │ 11 │ diff --git a/README b/README 12 │ index fd903645e..fa0d6844d 100644 13 │ --- a/README 14 │ +++ b/README 15 │ @@ -1,7 +1,7 @@ 16 │ -Linux kernel 17 │ -============ 18 │ +Linux kernel Armbian framework patching 19 │ +======================================= 20 │ 21 │ There are several guides for kernel developers and users. These guides can 22 │ be rendered in a number of formats, like HTML and PDF. Please read 23 │ Documentation/admin-guide/README.rst first. 24 │ 25 │ -- 26 │ Created with Armbian build tools https://github.com/armbian/build 27 │ ───────┴────────────────────────────────────────────────────────────────────────────── This strange behavior can be circumvented. session (2) cd /path/to/kernel-worktree sudo su # Tell who you are directly to git. git config --global user.email "48602507+The-going@users.noreply.github.com" git config --global user.name "The-going" # Check the git settings. cat ~/.gitconfig # Make the necessary changes to the file and commit. # The main thing is to write a good explanation in the message to the commit about the changes made. # End the superuser session exit # If it's a single commit then just send the patch to the destination (regular user session): git format-patch --keep-subject --filename-max-length=75 --abbrev=12 -1 -o /home/user/path/to/armbian-build/patch/kernel/archive/sunxi-6.11/ # If the changes are significant and you need several commits, # the best solution is to make a separate series. # Create an empty patches.addons folder in the target directory TDIR="/home/user/path/to/armbian-build/patch/kernel/archive/sunxi-6.11/patches.addons" mkdir $TDIR # If 5 commits revision_runge="HEAD~5..HEAD" # run script mk_format_patch.sh /home/user/path/to/armbian-build/tools/mk_format_patch . $revision_runge $TDIR # Add a list of new patches to the series.conf file cat ${TDIR}/../series.addons >>${TDIR}/../series.conf I wrote it in such detail because other users will read it. @Sesse You have very successfully found this rotten apple. I express my deep gratitude for the work done. Please post a properly designed patch here and I will send it to the megous upstream project so that it can be reviewed and the names included in v6.12. With respect 0 Quote
Solution Sesse Posted November 4 Author Solution Posted November 4 Thanks for the pointers. Here is my patch (I tested it by dropping it into the right userpatches directory, compiling the kernel and then booting): From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001 From: Steinar H. Gunderson <steinar+kernel@gunderson.no> Date: Mon, 4 Nov 2024 15:35:38 +0000 Subject: Fix broken allwinner,sram dependency on CB1 On BigTreeTech CB1, the thermal sensor has an allwinner,sram property pointing to <&syscon>. However, Armbian has an out-of-tree kernel patch that creates dependencies based on allwinner,sram properties, which assumes that they point to sram nodes exactly two levels below the syscon node (instead of the syscon itself). This manifests itself as the thermal sensor refusing to load with a nonsensical error message: [ 23.775976] platform 5070400.thermal-sensor: deferred probe pending: platform: wait for supplier Note that it does not say _which_ supplier it is waiting for (the message ends in a space and then no supplier). The patch was unproblematic in the 5.6 megous patch set, where it was introduced, and in 6.6, which is current for Armbian, but in 6.8, the sun8i-thermal driver got mainlined, with this extra property compared to the out-of-tree version we used before (since it wants to clear a special bit at 0x300000 instead of relying on the firmware to do so before kernel boot). Fix by being a bit more flexible when we walk up the tree, so that we always stop at the syscon node. Tested on a Sovol SV08, which is CB1-based. Signed-off-by: Steinar H. Gunderson <steinar+kernel@gunderson.no> --- drivers/of/property.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/of/property.c b/drivers/of/property.c index 6fcfcda9d..f00f2b129 100644 --- a/drivers/of/property.c +++ b/drivers/of/property.c @@ -1368,12 +1368,13 @@ static struct device_node *parse_allwinner_sram(struct device_node *np, if (index > 0) return NULL; sram_node = of_parse_phandle(np, prop_name, 0); - sram_node = of_get_parent(sram_node); - sram_node = of_get_parent(sram_node); + while (sram_node && !of_node_is_type(sram_node, "syscon")) { + sram_node = of_get_parent(sram_node); + } return sram_node; } static const struct supplier_bindings of_supplier_bindings[] = { -- Created with Armbian build tools https://github.com/armbian/build I'm going to mark this as solved; I hope it can go upstream at some point. Note that I think the most sensible thing to do is still just to delete the patch in question, not to patch it further. But I don't understand 100% why it was added in the first place; maybe there is some reason that is still valid for some platform. 1 Quote
going Posted November 4 Posted November 4 (edited) 51 минуту назад, Sesse сказал: I'm going to mark this as solved; I hope it can go upstream at some point. Note that I think the most sensible thing to do is still just to delete the patch in question, not to patch it further. But I don't understand 100% why it was added in the first place; maybe there is some reason that is still valid for some platform. I will send this fix to the author of the original patch. We will wait for his opinion. P.S. @Sesse I remembered that this is relevant for h616 and h618 processors. Other users have informed me about the lack of a temperature interface on boards with these processors. With your permission, I will edit the message in the patch a little. Edited November 4 by going Add P.S. 0 Quote
Sesse Posted November 4 Author Posted November 4 The CB1 is H616-based, but I'm not entirely sure what you're referring to apart from that. I would love to see your proposed edits before words are being put in my mouth 🙂 There was a typo in my patch that essentially caused the dependency to be ignored, I'm adding an updated version. From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001 From: Steinar H. Gunderson <steinar+kernel@gunderson.no> Date: Mon, 4 Nov 2024 15:35:38 +0000 Subject: Fix broken allwinner,sram dependency on CB1 On BigTreeTech CB1, the thermal sensor has an allwinner,sram property pointing to <&syscon>. However, Armbian has an out-of-tree kernel patch that creates dependencies based on allwinner,sram properties, which assumes that they point to sram nodes exactly two levels below the syscon node (instead of the syscon itself). This manifests itself as the thermal sensor refusing to load with a nonsensical error message: [ 23.775976] platform 5070400.thermal-sensor: deferred probe pending: platform: wait for supplier Note that it does not say _which_ supplier it is waiting for (the message ends in a space and then no supplier). The patch was unproblematic in the 5.6 megous patch set, where it was introduced, and in 6.6, which is current for Armbian, but in 6.8, the sun8i-thermal driver got mainlined, with this extra property compared to the out-of-tree version we used before (since it wants to clear a special bit at 0x300000 instead of relying on the firmware to do so before kernel boot). Fix by being a bit more flexible when we walk up the tree, so that we always stop at the syscon node. Tested on a Sovol SV08, which is CB1-based. Signed-off-by: Steinar H. Gunderson <steinar+kernel@gunderson.no> --- drivers/of/property.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/of/property.c b/drivers/of/property.c index 6fcfcda9d..f00f2b129 100644 --- a/drivers/of/property.c +++ b/drivers/of/property.c @@ -1368,12 +1368,13 @@ static struct device_node *parse_allwinner_sram(struct device_node *np, if (index > 0) return NULL; sram_node = of_parse_phandle(np, prop_name, 0); - sram_node = of_get_parent(sram_node); - sram_node = of_get_parent(sram_node); + while (sram_node && !of_node_name_eq(sram_node, "syscon")) { + sram_node = of_get_parent(sram_node); + } return sram_node; } static const struct supplier_bindings of_supplier_bindings[] = { -- Created with Armbian build tools https://github.com/armbian/build 0 Quote
going Posted November 4 Posted November 4 2 минуты назад, Sesse сказал: The CB1 is H616-based, but I'm not entirely sure what you're referring to apart from that. I would love to see your proposed edits before words are being put in my mouth Subject: Fix broken allwinner,sram dependency on h616, h618 Or, at your discretion, indicate in another way that this is relevant for these processors 0 Quote
JohnTheCoolingFan Posted November 5 Posted November 5 Oh crap, forgot to subscribe to the topic. Thanks @going for your work on this! 0 Quote
Sesse Posted November 6 Author Posted November 6 @JohnTheCoolingFan So, now that this has all been merged (thanks!), there's the thing I wanted to do in the first place before starting to shave this yak; is there any reason why the Armbian kernel images ship with ftrace off? Specifically, having access to `perf sched record` can be really nice for figuring out what's going on when something misses its timeslice. (I just ticked CONFIG_FTRACE=y and what seemed to be a sensible set of options to add kernel tracepoints, and that was seemingly enough to make perf happy 🙂 ) AFAIK ftrace has essentially zero runtime overhead when inactive (there's a nop prologue at the start of every tracable kernel function, and that's it), as long as CONFIG_HAVE_DYNAMIC_FTRACE=y is also set, so it shouldn't be a very risky option to enable in the default configs. I believe basically all desktop/server distribution kernels enable it since a long time, although I haven't checked. 0 Quote
JohnTheCoolingFan Posted November 7 Posted November 7 Sorry @Sesse but I don't know an answer to that question. You'll have to ask someone else for the reasoning behind that. I'll ask in our discord channel. 0 Quote
going Posted November 7 Posted November 7 (edited) 18 часов назад, Sesse сказал: So, now that this has all been merged (thanks!), there's the thing I wanted to do in the first place before starting to shave this yak; is there any reason why the Armbian kernel images ship with ftrace off? This feature is necessary for developers. I mean those people who really understand the kernel code and will be able to draw conclusions and steps when they see the trace message. These people will be able to add any functionality to the kernel and debug it on their own. P.S. Thank you for not asking why the kernel is not being built with debugging symbols. Edited November 7 by going Add P.S. 0 Quote
going Posted November 10 Posted November 10 04.11.2024 в 19:43, Sesse сказал: Here is my patch (I tested it by dropping it into the right userpatches directory, compiling the kernel and then booting): The patch is accepted in v6.12 drivers/of/property.c Many thanks from the author of megi 0 Quote
Sesse Posted November 10 Author Posted November 10 Quote I mean those people who really understand the kernel code and will be able to draw conclusions and steps when they see the trace message. To be clear, ftrace does show any tracing messages. It is a framework where an administrator can hook certain functions, for instance for performance debugging (using eBPF or perf). You do need Linux administrator skills to understand such systems, but you do not need any understanding of the underlying kernel code. The primary reason to want this in a standard kernel is that you would often want to do measurements on a production system, without having to go through the onerous process of compiling your own kernel and then having to reboot. For instance, when someone is complaining about a video system losing frames, or a 3D printer losing too many timeslices and aborting a print. 0 Quote
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.
Note: Your post will require moderator approval before it will be visible.