Jump to content

SV08 can't find thermal zone on 6.11.2


Go to solution Solved by Sesse,

Recommended Posts

Posted

Following up from another thread:
 

I compiled an edge kernel (6.11.2-edge-sunxi64) for my Sovol SV08, in part because I'd like ftrace support (for perf sched). It comes up, except there is no /class/thermal/thermal_zone0/ (which Klipper demands to work, perhaps sensibly enough). dmesg says:

[   23.775976] platform 5070400.thermal-sensor: deferred probe pending: platform: wait for supplier

It's a bit strange, since my understanding is that the full error message should be “wait for supplier XYZ”, i.e., some specific supplier.

 

This was built with

./compile.sh  kernel BOARD=bigtreetech-cb1 BRANCH=edge KERNEL_GIT=shallow

and then a hack to use only -j2 (or the device will run out of memory during build). I installed both linux-image-edge-sunxi64 and linux-dtb-edge-sunxi64. The version is 4.11.0-trunk_arm64__6.11.2-S7aa2-D29ea-P5d78-C6b40H5c21-HK01ba-Vc222-Ba3b7-R448a (that's a mouthful). I modified the kernel config, but only to enable ftrace; I didn't mess with thermal or any other driver.

 

armbianmonitor -u: https://paste.armbian.com/gokinaxefa

Posted

For fun, I hacked the DTB by removing this line (which is the only relevant thing that seems to be different between the 6.6 and 6.11 DTs):

               ths: thermal-sensor@5070400 {
…more lines here…
-                       allwinner,sram = <&syscon>;
…more lines here…
               }

After that, the driver loaded just fine (no kernel message, /sys/bus/platform/devices/5070400.thermal-sensor/waiting_for_supplier contains 0), but there are still no thermal zones in /sys/class/thermal.

 

It seems that what really changed between 6.6 and 6.11 is that the h616 thermal drivers got upstreamed, so most of the Armbian patches have been obsoleted, but I guess in the process, something broke for the CB1.

Posted

Tip for compiling: you should be able to use docker for easy cross-compilation on your pc instead of the board itself.

As for thermal zones, it's probably a matter of device tree tweaking, I'll take a look.

Posted

Yes, I tried (albeit without Docker, just regular cross-compiling), and it failed with https://paste.armbian.com/niyisiciqa . It seems there's some confusion somewhere as of whether the image is supposed to be compressed or not (I think maybe Arm does decompression somehow in U-Boot?), so vmlinux versus vmlinuz. I tried looking briefly into it, but figured it was easier just to wait it out on the board. 🙂

Posted
2 часа назад, Sesse сказал:

Yes, I tried (albeit without Docker, just regular cross-compiling), and it failed with https://paste.armbian.com/niyisiciqa .

Try rolling back the build system to this commit: 90637986eef70e5478105b64039a8ba1bfd85062
All subsequent commits are irrelevant to sunxi.

 

2 часа назад, Sesse сказал:

the h616 thermal

Check the correctness of the temperature triplet in the DTS, fix it and the sun8i_thermal temperature driver will work correctly.

Posted
10 minutes ago, going said:

Check the correctness of the temperature triplet in the DTS, fix it and the sun8i_thermal temperature driver will work correctly.

Could you be a bit more specific? Which part of the DTS is this?

 

I'm looking through the difference in drivers, and seemingly the upstream driver has this snippet in the commit message:
 

Quote

    Also the registers readings are wrong, unless a bit in the first SYS_CFG
    register cleared, so set exercise the SRAM regmap to take care of that.

which explains why it has that “allwinner,sram” dependency (and why it silently refuses to load if I take it out). Specifically, the driver has a “.needs_sram” in the kernel.

 

https://patches.linaro.org/project/linux-pm/patch/20240219153639.179814-6-andre.przywara@arm.com/

 

So why that handle isn't satisfied… I have no idea. (I'm definitely not well-versed in device trees.)

Posted

Maybe simply patches.megous/of-property-fw_devlink-Support-allwinner-sram-links.patch needs to go? There's nothing like it in the upstream 6.11 kernel, and it seems broken for the case where the DT links to <&syscon> directly instead of to a child of a child of it.

Posted
17 минут назад, Sesse сказал:

Maybe simply patches.megous/of-property-fw_devlink-Support-allwinner-sram-links.patch needs to go? There's nothing like it in the upstream 6.11 kernel, and it seems broken for the case where the DT links to <&syscon> directly instead of to a child of a child of it.

In order to make it easier to view and analyze the source code, I support the kernel with patches already applied.

 

Try to compare it with others

sun50i-h616 trips

 

It definitely works.

sun50i-a64 trips

 

Posted
26 минут назад, Sesse сказал:

I don't get it; isn't the problem the SRAM provider, not the configured trip limits? What am I supposed to look for here?

Please understand me correctly.
I only have an A64 processor.
And I can only check the real work on it. For the rest, only compilation.
When moving patches to a new version, something always breaks and

the work of fixing it falls on someone else who has boards with one processor or another.

 

I can only give you hints.

Posted

I removed the patch, and the resulting kernel booted and found its thermal zones. So I think the basic issue is that the patch just is doing of_get_parent() twice without checking that this makes sense.

sesse@amalie:~$ uname -a
Linux amalie 6.11.2-edge-sunxi64 #2 SMP Fri Oct  4 14:38:57 UTC 2024 aarch64 GNU/Linux
sesse@amalie:~$ ls -l /sys/class/thermal/
total 0
lrwxrwxrwx 1 root root 0 Nov  3 20:28 cooling_device0 -> ../../devices/virtual/thermal/cooling_device0
lrwxrwxrwx 1 root root 0 Nov  3 20:25 thermal_zone0 -> ../../devices/virtual/thermal/thermal_zone0
lrwxrwxrwx 1 root root 0 Nov  3 20:25 thermal_zone1 -> ../../devices/virtual/thermal/thermal_zone1
lrwxrwxrwx 1 root root 0 Nov  3 20:25 thermal_zone2 -> ../../devices/virtual/thermal/thermal_zone2
lrwxrwxrwx 1 root root 0 Nov  3 20:25 thermal_zone3 -> ../../devices/virtual/thermal/thermal_zone3

 

Posted

I think that if you don't want to just delete the patch (I don't think it makes much sense anymore), then maybe something like this, but I have no idea how to build a kernel from a worktree (i.e., without making a .patch by hand and running through the patch procedure):

 

        sram_node = of_parse_phandle(np, prop_name, 0);
        if (!of_node_is_type(sram_node, "syscon"))
                sram_node = of_get_parent(sram_node);
        if (!of_node_is_type(sram_node, "syscon"))
                sram_node = of_get_parent(sram_node);

I.e., don't go walk the tree looking for the syscon node if we're already at it.

Posted

Did I understand correctly?
Did you mean this patches.megous/of-property-fw_devlink-Support-allwinner-sram-links.patch?

11 часов назад, Sesse сказал:

I removed the patch, and the resulting kernel booted and found its thermal zones. So I think the basic issue is that the patch just is doing of_get_parent() twice without checking that this makes sense.

You can disable any patch by simply putting a minus sign at the beginning of a line in the series file:

diff --git a/patch/kernel/archive/sunxi-6.11/series.conf b/patch/kernel/archive/sunxi-6.11/series.conf
index 3eaf6617..3a2067e0 100644
--- a/patch/kernel/archive/sunxi-6.11/series.conf
+++ b/patch/kernel/archive/sunxi-6.11/series.conf
@@ -171,7 +171,7 @@
        patches.megous/usb-gadget-Fix-dangling-pointer-in-netdev-private-data.patch
        patches.megous/mmc-dw-mmc-rockchip-fix-sdmmc-after-soft-reboot.patch
        patches.megous/Revert-drm-sun4i-lvds-Invert-the-LVDS-polarity.patch
-       patches.megous/of-property-fw_devlink-Support-allwinner-sram-links.patch
+-      patches.megous/of-property-fw_devlink-Support-allwinner-sram-links.patch
        patches.megous/arm64-dts-rockchip-rk356x-Fix-PCIe-register-map-and-ranges.patch
        patches.megous/Fix-intptr_t-typedef.patch
        patches.megous/mmc-sunxi-mmc-Remove-runtime-PM.patch

without making any other changes to the build system.

 

The best solution is to make changes on top of all applied patches.

Make a git commit and extract it as a patch to the root directory with patches.

This can be done directly in the build system.

sesion (1)

cat <<- EOF > "userpatches/config-test.conf"
display_alert "Settings from" "config-test.conf file" "info"

BOARD=bigtreetech-cb1
BRANCH=edge
RELEASE=bookworm
KERNEL_GIT=shallow

# Enter your github credentials here so that we know the author and how to contact you.
declare -g MAINTAINERMAIL="48602507+The-going@users.noreply.github.com"
declare -g MAINTAINER="The-going"

declare -g SYNC_CLOCK="no"
EOF


./compile.sh test  kernel-patch

# The build system will apply all patches, commit and stop.
# Don't do anything in this terminal session (1).
# The script will print the path to the directory where you want to make changes.

# You will need another session (2) in the terminal to access and act on the kernel repository.

session (2):

cd /path/to/kernel-worktree
sudo su
# Make changes to the file in any convenient editor.
For example:
nano README
# write and exit the editor

session (1)

Press <ENTER> And you'll see something like this:

───────┬───────────────────────────────────────────────────────────────────────────────────────────────────────────────
       │ File: /home/leo/armbian/output/patch/kernel-rockchip-rk3588-edge.patch
───────┼───────────────────────────────────────────────────────────────────────────────────────────────────────────────
   1   │ diff --git a/README b/README
   2   │ index fd903645e..fa0d6844d 100644
   3   │ --- a/README
   4   │ +++ b/README
   5   │ @@ -1,5 +1,5 @@
   6   │ -Linux kernel
   7   │ -============
   8   │ +Linux kernel Armbian framework patching
   9   │ +=======================================
  10   │  
  11   │  There are several guides for kernel developers and users. These guides can
  12   │  be rendered in a number of formats, like HTML and PDF. Please read
───────┴───────────────────────────────────────────────────────────────────────────────────────────────────────────────
[🚸] Are you happy with this patch? [ Type 'yes' to accept, 'stop' to stop patching, or anything else to keep patching ]

Enter yes

The script will make the correct patch assign it a strange name that does not match the "Subject:" field

and put it in the "output/patch/" folder.

leo@noble-vm:~/armbian$ batcat /home/leo/armbian/output/patch/kernel-rockchip-rk3588-edge.patch
───────┬─────────────────────────────────────────────────────────────────────────
       │ File: /home/leo/armbian/output/patch/kernel-rockchip-rk3588-edge.patch
───────┼─────────────────────────────────────────────────────────────────────────
   1   │ From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
   2   │ From: The-going <48602507+The-going@users.noreply.github.com>
   3   │ Date: Mon, 4 Nov 2024 12:21:22 +0000
   4   │ Subject: Patching kernel rockchip-rk3588 files README
   5   │ 
   6   │ Signed-off-by: The-going <48602507+The-going@users.noreply.github.com>
   7   │ ---
   8   │  README | 4 ++--
   9   │  1 file changed, 2 insertions(+), 2 deletions(-)
  10   │ 
  11   │ diff --git a/README b/README
  12   │ index fd903645e..fa0d6844d 100644
  13   │ --- a/README
  14   │ +++ b/README
  15   │ @@ -1,7 +1,7 @@
  16   │ -Linux kernel
  17   │ -============
  18   │ +Linux kernel Armbian framework patching
  19   │ +=======================================
  20   │  
  21   │  There are several guides for kernel developers and users. These guides can
  22   │  be rendered in a number of formats, like HTML and PDF. Please read
  23   │  Documentation/admin-guide/README.rst first.
  24   │  
  25   │ -- 
  26   │ Created with Armbian build tools https://github.com/armbian/build
  27   │ 
───────┴──────────────────────────────────────────────────────────────────────────────

This strange behavior can be circumvented.

session (2)

cd /path/to/kernel-worktree
sudo su

# Tell who you are directly to git.
git config --global user.email "48602507+The-going@users.noreply.github.com"
git config --global user.name "The-going"

# Check the git settings.
cat ~/.gitconfig

# Make the necessary changes to the file and commit.
# The main thing is to write a good explanation in the message to the commit about the changes made.
# End the superuser session
exit

# If it's a single commit then just send the patch to the destination (regular user session):
git format-patch --keep-subject --filename-max-length=75 --abbrev=12 -1 -o /home/user/path/to/armbian-build/patch/kernel/archive/sunxi-6.11/

# If the changes are significant and you need several commits,
# the best solution is to make a separate series.
# Create an empty patches.addons folder in the target directory
TDIR="/home/user/path/to/armbian-build/patch/kernel/archive/sunxi-6.11/patches.addons"
mkdir $TDIR

# If 5 commits
revision_runge="HEAD~5..HEAD"  

# run script mk_format_patch.sh
/home/user/path/to/armbian-build/tools/mk_format_patch . $revision_runge $TDIR

# Add a list of new patches to the series.conf file
cat ${TDIR}/../series.addons >>${TDIR}/../series.conf 

I wrote it in such detail because other users will read it.

 

@Sesse You have very successfully found this rotten apple.

I express my deep gratitude for the work done.
Please post a properly designed patch here and I will send it to the megous upstream project

so that it can be reviewed and the names included in v6.12.

 

With respect

  • Solution
Posted

Thanks for the pointers. Here is my patch (I tested it by dropping it into the right userpatches directory, compiling the kernel and then booting):
 

From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Steinar H. Gunderson <steinar+kernel@gunderson.no>
Date: Mon, 4 Nov 2024 15:35:38 +0000
Subject: Fix broken allwinner,sram dependency on CB1

On BigTreeTech CB1, the thermal sensor has an allwinner,sram
property pointing to <&syscon>. However, Armbian has an out-of-tree
kernel patch that creates dependencies based on allwinner,sram
properties, which assumes that they point to sram nodes exactly
two levels below the syscon node (instead of the syscon itself).
This manifests itself as the thermal sensor refusing to load with
a nonsensical error message:

  [   23.775976] platform 5070400.thermal-sensor: deferred probe pending: platform: wait for supplier

Note that it does not say _which_ supplier it is waiting for
(the message ends in a space and then no supplier). The patch
was unproblematic in the 5.6 megous patch set, where it was
introduced, and in 6.6, which is current for Armbian, but in
6.8, the sun8i-thermal driver got mainlined, with this extra
property compared to the out-of-tree version we used before
(since it wants to clear a special bit at 0x300000 instead of
relying on the firmware to do so before kernel boot).

Fix by being a bit more flexible when we walk up the tree,
so that we always stop at the syscon node.

Tested on a Sovol SV08, which is CB1-based.

Signed-off-by: Steinar H. Gunderson <steinar+kernel@gunderson.no>
---
 drivers/of/property.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/of/property.c b/drivers/of/property.c
index 6fcfcda9d..f00f2b129 100644
--- a/drivers/of/property.c
+++ b/drivers/of/property.c
@@ -1368,12 +1368,13 @@ static struct device_node *parse_allwinner_sram(struct device_node *np,
 
 	if (index > 0)
 		return NULL;
 
 	sram_node = of_parse_phandle(np, prop_name, 0);
-	sram_node = of_get_parent(sram_node);
-	sram_node = of_get_parent(sram_node);
+	while (sram_node && !of_node_is_type(sram_node, "syscon")) {
+		sram_node = of_get_parent(sram_node);
+	}
 
 	return sram_node;
 }
 
 static const struct supplier_bindings of_supplier_bindings[] = {
-- 
Created with Armbian build tools https://github.com/armbian/build

I'm going to mark this as solved; I hope it can go upstream at some point.

 

Note that I think the most sensible thing to do is still just to delete the patch in question, not to patch it further. But I don't understand 100% why it was added in the first place; maybe there is some reason that is still valid for some platform.

Posted (edited)
51 минуту назад, Sesse сказал:

I'm going to mark this as solved; I hope it can go upstream at some point.

 

Note that I think the most sensible thing to do is still just to delete the patch in question, not to patch it further. But I don't understand 100% why it was added in the first place; maybe there is some reason that is still valid for some platform.

I will send this fix to the author of the original patch.
We will wait for his opinion.

 

P.S.

@Sesse I remembered that this is relevant for h616 and h618 processors.
Other users have informed me about the lack of a temperature interface on boards with these processors.
With your permission, I will edit the message in the patch a little.

Edited by going
Add P.S.
Posted

The CB1 is H616-based, but I'm not entirely sure what you're referring to apart from that. I would love to see your proposed edits before words are being put in my mouth 🙂

 

There was a typo in my patch that essentially caused the dependency to be ignored, I'm adding an updated version.

From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Steinar H. Gunderson <steinar+kernel@gunderson.no>
Date: Mon, 4 Nov 2024 15:35:38 +0000
Subject: Fix broken allwinner,sram dependency on CB1

On BigTreeTech CB1, the thermal sensor has an allwinner,sram
property pointing to <&syscon>. However, Armbian has an out-of-tree
kernel patch that creates dependencies based on allwinner,sram
properties, which assumes that they point to sram nodes exactly
two levels below the syscon node (instead of the syscon itself).
This manifests itself as the thermal sensor refusing to load with
a nonsensical error message:

  [   23.775976] platform 5070400.thermal-sensor: deferred probe pending: platform: wait for supplier

Note that it does not say _which_ supplier it is waiting for
(the message ends in a space and then no supplier). The patch
was unproblematic in the 5.6 megous patch set, where it was
introduced, and in 6.6, which is current for Armbian, but in
6.8, the sun8i-thermal driver got mainlined, with this extra
property compared to the out-of-tree version we used before
(since it wants to clear a special bit at 0x300000 instead of
relying on the firmware to do so before kernel boot).

Fix by being a bit more flexible when we walk up the tree,
so that we always stop at the syscon node.

Tested on a Sovol SV08, which is CB1-based.

Signed-off-by: Steinar H. Gunderson <steinar+kernel@gunderson.no>
---
 drivers/of/property.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/of/property.c b/drivers/of/property.c
index 6fcfcda9d..f00f2b129 100644
--- a/drivers/of/property.c
+++ b/drivers/of/property.c
@@ -1368,12 +1368,13 @@ static struct device_node *parse_allwinner_sram(struct device_node *np,
 
 	if (index > 0)
 		return NULL;
 
 	sram_node = of_parse_phandle(np, prop_name, 0);
-	sram_node = of_get_parent(sram_node);
-	sram_node = of_get_parent(sram_node);
+	while (sram_node && !of_node_name_eq(sram_node, "syscon")) {
+		sram_node = of_get_parent(sram_node);
+	}
 
 	return sram_node;
 }
 
 static const struct supplier_bindings of_supplier_bindings[] = {
-- 
Created with Armbian build tools https://github.com/armbian/build

 

Posted
2 минуты назад, Sesse сказал:

The CB1 is H616-based, but I'm not entirely sure what you're referring to apart from that. I would love to see your proposed edits before words are being put in my mouth

 

Subject: Fix broken allwinner,sram dependency on h616, h618

Or, at your discretion, indicate in another way that this is relevant for these processors

Posted

@JohnTheCoolingFan So, now that this has all been merged (thanks!), there's the thing I wanted to do in the first place before starting to shave this yak; is there any reason why the Armbian kernel images ship with ftrace off? Specifically, having access to `perf sched record` can be really nice for figuring out what's going on when something misses its timeslice. (I just ticked CONFIG_FTRACE=y and what seemed to be a sensible set of options to add kernel tracepoints, and that was seemingly enough to make perf happy 🙂 )

 

AFAIK ftrace has essentially zero runtime overhead when inactive (there's a nop prologue at the start of every tracable kernel function, and that's it), as long as CONFIG_HAVE_DYNAMIC_FTRACE=y is also set, so it shouldn't be a very risky option to enable in the default configs. I believe basically all desktop/server distribution kernels enable it since a long time, although I haven't checked.

Posted (edited)
18 часов назад, Sesse сказал:

So, now that this has all been merged (thanks!), there's the thing I wanted to do in the first place before starting to shave this yak; is there any reason why the Armbian kernel images ship with ftrace off?

This feature is necessary for developers.

I mean those people who really understand the kernel code and will be able to draw conclusions and steps when they see the trace message.

These people will be able to add any functionality to the kernel and debug it on their own.

 

P.S. Thank you for not asking why the kernel is not being built with debugging symbols.

Edited by going
Add P.S.
Posted
04.11.2024 в 19:43, Sesse сказал:

Here is my patch (I tested it by dropping it into the right userpatches directory, compiling the kernel and then booting):

The patch is accepted in v6.12

drivers/of/property.c

Many thanks from the author of megi

Posted
Quote

I mean those people who really understand the kernel code and will be able to draw conclusions and steps when they see the trace message.

To be clear, ftrace does show any tracing messages. It is a framework where an administrator can hook certain functions, for instance for performance debugging (using eBPF or perf). You do need Linux administrator skills to understand such systems, but you do not need any understanding of the underlying kernel code.

 

The primary reason to want this in a standard kernel is that you would often want to do measurements on a production system, without having to go through the onerous process of compiling your own kernel and then having to reboot. For instance, when someone is complaining about a video system losing frames, or a 3D printer losing too many timeslices and aborting a print.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.
Note: Your post will require moderator approval before it will be visible.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

Loading...
×
×
  • Create New...

Important Information

Terms of Use - Privacy Policy - Guidelines