Posting here as recommended on Twitter.
After updating my helios64 earlier this week and rebooting to get the new kernel, I realized it was suspiciously silent.
A quick look at the temperature sensor readings and a physical check confirmed that the fans were not spinning.
After a quick read of the wiki, I checked fancontrol, which was indeed failing:
root@helios64:~ # systemctl status fancontrol.service
● fancontrol.service - fan speed regulator
Loaded: loaded (/lib/systemd/system/fancontrol.service; enabled; vendor preset: enabled)
Drop-In: /etc/systemd/system/fancontrol.service.d
└─pid.conf
Active: failed (Result: exit-code) since Fri 2021-05-28 00:08:13 CEST; 1min 42s ago
Docs: man:fancontrol(8)
man:pwmconfig(8)
Process: 2495 ExecStartPre=/usr/sbin/fancontrol --check (code=exited, status=0/SUCCESS)
Process: 2876 ExecStart=/usr/sbin/fancontrol (code=exited, status=1/FAILURE)
Main PID: 2876 (code=exited, status=1/FAILURE)
May 28 00:08:13 helios64 fancontrol[2876]: MINPWM=0
May 28 00:08:13 helios64 fancontrol[2876]: MAXPWM=255
May 28 00:08:13 helios64 fancontrol[2876]: AVERAGE=1
May 28 00:08:13 helios64 fancontrol[2876]: Error: file /dev/thermal-cpu/temp1_input doesn't exist
May 28 00:08:13 helios64 fancontrol[2876]: Error: file /dev/thermal-cpu/temp1_input doesn't exist
May 28 00:08:13 helios64 fancontrol[2876]: At least one referenced file is missing. Either some required kernel
May 28 00:08:13 helios64 fancontrol[2876]: modules haven't been loaded, or your configuration file is outdated.
May 28 00:08:13 helios64 fancontrol[2876]: In the latter case, you should run pwmconfig again.
May 28 00:08:13 helios64 systemd[1]: fancontrol.service: Main process exited, code=exited, status=1/FAILURE
May 28 00:08:13 helios64 systemd[1]: fancontrol.service: Failed with result 'exit-code'.
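The log above names the file fancontrol could not find. As a rough sketch, the check can be repeated without starting the service by walking the sensor paths listed in the config. The function name `missing_fctemps` is made up for this example, and it assumes the usual `FCTEMPS=<pwm>=<temp> <pwm>=<temp>` layout with absolute paths, which may not match every setup.

```shell
#!/bin/sh
# Hypothetical helper: print every sensor file referenced by FCTEMPS in a
# fancontrol config that does not exist. Returns 1 if any is missing.
# Assumption: FCTEMPS holds space-separated "<pwm path>=<temp path>" pairs
# with absolute paths.
missing_fctemps() {
    conf="${1:-/etc/fancontrol}"
    status=0
    # Keep only the FCTEMPS line and strip the key.
    pairs=$(sed -n 's/^FCTEMPS=//p' "$conf")
    for pair in $pairs; do
        temp_path="${pair#*=}"   # everything after the first '=' is the sensor
        if [ ! -e "$temp_path" ]; then   # a dangling symlink also counts as missing
            echo "missing: $temp_path"
            status=1
        fi
    done
    return "$status"
}
```

Calling `missing_fctemps` with no argument reads /etc/fancontrol; on the box from this post it would presumably report /dev/thermal-cpu/temp1_input.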
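Basically, fancontrol expects a device in /dev to read the sensor values from, and that device was missing. After a bit of poking around and learning about udev, I worked around the issue by recreating the device symlink manually:
/usr/bin/mkdir /dev/thermal-cpu/
ln -s /sys/devices/virtual/thermal/thermal_zone0/temp /dev/thermal-cpu/temp1_input
systemctl restart fancontrol.service
systemctl status fancontrol.service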
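The manual workaround amounts to creating the directory and the symlink that fancontrol expects. Here is a sketch of a version that can be run more than once without errors. The function name `link_cpu_temp` is hypothetical, and the default paths are simply the ones that appear in this post. Since /dev is recreated at boot, this is only a stopgap until the udev rule works again.

```shell
#!/bin/sh
# Hypothetical helper: (re)create the sensor symlink fancontrol reads from.
# Defaults are the paths from this post; both can be overridden.
link_cpu_temp() {
    src="${1:-/sys/devices/virtual/thermal/thermal_zone0/temp}"
    dst="${2:-/dev/thermal-cpu/temp1_input}"
    if [ ! -e "$src" ]; then
        echo "source $src does not exist" >&2
        return 1
    fi
    mkdir -p "$(dirname "$dst")"   # -p: no error if the directory already exists
    ln -sf "$src" "$dst"           # -f: replace a stale link instead of failing
}
```

After `link_cpu_temp` succeeds, `systemctl restart fancontrol.service` should find the file again.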
Digging further, this happens because udev is not creating the symlink as it should. After reading the rule in /etc/udev/rules.d/90-helios64-hwmon-legacy.rules and a bit of udev documentation, I found a way to test it:
root@helios64:~ # udevadm test /sys/devices/virtual/thermal/thermal_zone0
[...]
Reading rules file: /etc/udev/rules.d/90-helios64-hwmon-legacy.rules
Reading rules file: /etc/udev/rules.d/90-helios64-ups.rules
[...]
DEVPATH=/devices/virtual/thermal/thermal_zone0
ACTION=add
SUBSYSTEM=thermal
IS_HELIOS64_HWMON=1
HWMON_PATH=/sys/devices/virtual/thermal/thermal_zone0
USEC_INITIALIZED=7544717
run: '/bin/ln -sf /sys/devices/virtual/thermal/thermal_zone0 ' <-- something is wrong here, there is no target
Unload module index
Unloaded link configuration context.
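The broken `run:` line is easy to miss in the full `udevadm test` output. As a small sketch, a filter can pick out `ln` calls that have a source but no target. The function name `find_incomplete_ln` is made up, and the pattern assumes the `run: '...'` line format shown in the output above.

```shell
#!/bin/sh
# Hypothetical filter: read `udevadm test` output on stdin and print the
# "run:" lines where ln gets a single path followed by nothing but spaces
# before the closing quote. Exit status follows grep (0 if a line matched).
find_incomplete_ln() {
    grep -E "^run: '[^']*/ln( -[a-z]+)* [^ ']+ +'"
}
```

Usage would look like `udevadm test /sys/devices/virtual/thermal/thermal_zone0 2>&1 | find_incomplete_ln`, with stderr merged in because the tool may log to either stream.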
After spending a bit more time reading the udev rule, I realized that the second argument was empty because the ATTR{type}=="soc-thermal" condition no longer matches, so the symlink target is never set. The types can be looked up like this:
root@helios64:~ # find /sys/ -name type | grep thermal
/sys/devices/virtual/thermal/cooling_device1/type
/sys/devices/virtual/thermal/thermal_zone0/type
/sys/devices/virtual/thermal/cooling_device4/type
/sys/devices/virtual/thermal/cooling_device2/type
/sys/devices/virtual/thermal/thermal_zone1/type
/sys/devices/virtual/thermal/cooling_device0/type
/sys/devices/virtual/thermal/cooling_device3/type
/sys/firmware/devicetree/base/thermal-zones/gpu/trips/gpu_alert0/type
/sys/firmware/devicetree/base/thermal-zones/gpu/trips/gpu_crit/type
/sys/firmware/devicetree/base/thermal-zones/cpu/trips/cpu_crit/type
/sys/firmware/devicetree/base/thermal-zones/cpu/trips/cpu_alert0/type
/sys/firmware/devicetree/base/thermal-zones/cpu/trips/cpu_alert1/type
root@helios64:~ # cat /sys/devices/virtual/thermal/thermal_zone0/type
cpu <-- we were expecting soc-thermal!
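Instead of running `find` and then `cat` on each file, the zone types can be listed in one pass. A minimal sketch follows; the function name `list_zone_types` is hypothetical, and it reads from /sys/class/thermal, whose entries link to the same /sys/devices/virtual/thermal directories shown above.

```shell
#!/bin/sh
# Hypothetical helper: print "<zone>: <type>" for every thermal zone found
# below a base directory (default: /sys/class/thermal).
list_zone_types() {
    base="${1:-/sys/class/thermal}"
    for zone in "$base"/thermal_zone*; do
        [ -r "$zone/type" ] || continue   # also skips an unexpanded glob
        printf '%s: %s\n' "${zone##*/}" "$(cat "$zone/type")"
    done
}
```

The value printed for thermal_zone0 is the string the `ATTR{type}==` match in the udev rule has to equal.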
After adding a line for the new type, udev is happy again:
# Edit in /etc/udev/rules.d/90-helios64-hwmon-legacy.rules and add the following line after the original one
ATTR{type}=="cpu", ENV{HWMON_PATH}="/sys%p/temp", ENV{HELIOS64_SYMLINK}="/dev/thermal-cpu/temp1_input", RUN+="/usr/bin/mkdir /dev/thermal-cpu/"
root@helios64:~ # udevadm control --reload
root@helios64:~ # udevadm test /sys/devices/virtual/thermal/thermal_zone0
[...]
DEVPATH=/devices/virtual/thermal/thermal_zone0
ACTION=add
SUBSYSTEM=thermal
IS_HELIOS64_HWMON=1
HWMON_PATH=/sys/devices/virtual/thermal/thermal_zone0/temp
HELIOS64_SYMLINK=/dev/thermal-cpu/temp1_input
USEC_INITIALIZED=7544717
run: '/usr/bin/mkdir /dev/thermal-cpu/'
run: '/bin/ln -sf /sys/devices/virtual/thermal/thermal_zone0/temp /dev/thermal-cpu/temp1_input'
Unload module index
Unloaded link configuration context.
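As far as I know, `udevadm test` only prints the `run:` entries and does not execute them, so the symlink still has to be created by a real event, for example with `udevadm trigger --action=add --subsystem-match=thermal` or a reboot. Once it exists, a quick sanity check is to read it back. The sketch below uses a hypothetical function name, `read_temp_c`, and relies on thermal zones reporting millidegrees Celsius; it deliberately rejects anything that is not a non-negative integer.

```shell
#!/bin/sh
# Hypothetical helper: read a thermal sensor file (millidegrees Celsius)
# and print the temperature in degrees with one decimal, truncated.
read_temp_c() {
    path="${1:-/dev/thermal-cpu/temp1_input}"
    if [ ! -r "$path" ]; then
        echo "cannot read $path" >&2
        return 1
    fi
    raw=$(cat "$path")
    case "$raw" in
        ''|*[!0-9]*)   # empty, negative, or non-numeric content
            echo "unexpected value: $raw" >&2
            return 1 ;;
    esac
    printf '%d.%d\n' "$((raw / 1000))" "$(( (raw % 1000) / 100 ))"
}
```

A failure here points back at the symlink or the rule; a plausible number means fancontrol has something to read again.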
Apparently the device tree changed upstream and the thermal zone type went from soc-thermal to cpu?
Question
halfa