
Orange Pi 4 LTS default thermal trip point issues


slimcomp


After booting Armbian 23.02.2 on my Orange Pi 4 LTS, the CPU temperature climbs to 95 degrees Celsius under stress testing (no heatsink or fan installed at all) and the device reboots shortly afterwards. I ran `armbianmonitor -m` while stress testing and found that some of the cores are not throttling.

I examined the default trip point settings coming with the image in the file /boot/dtb/rockchip/rk3399-orangepi-4-lts.dtb:

                         trips {

                                cpu_alert0 {
                                        temperature = <0x14c08>;
                                        hysteresis = <0x7d0>;
                                        type = "passive";
                                        phandle = <0x5a>;
                                };

                                cpu_alert1 {
                                        temperature = <0x17318>;
                                        hysteresis = <0x7d0>;
                                        type = "passive";
                                        phandle = <0x5b>;
                                };

                                cpu_crit {
                                        temperature = <0x186a0>;
                                        hysteresis = <0x7d0>;
                                        type = "critical";
                                        phandle = <0xf3>;
                                };
                        };
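The `temperature` and `hysteresis` cells in a devicetree thermal zone are expressed in millidegrees Celsius, so the hex values above are easier to read once converted. A minimal sketch of that arithmetic, using only the values quoted in this table (the helper name is made up for illustration):

```python
# Trip point cells copied from the default table quoted above.
DEFAULT_TRIPS = {
    "cpu_alert0": 0x14c08,
    "cpu_alert1": 0x17318,
    "cpu_crit": 0x186a0,
}
HYSTERESIS = 0x7d0


def millicelsius_to_celsius(raw: int) -> float:
    """Convert a thermal cell value (millidegrees Celsius) to degrees Celsius."""
    return raw / 1000


for name, raw in DEFAULT_TRIPS.items():
    print(f"{name}: {raw:#x} = {raw} = {millicelsius_to_celsius(raw):g} °C")
print(f"hysteresis: {HYSTERESIS:#x} = {millicelsius_to_celsius(HYSTERESIS):g} °C")
```

This gives 85, 95 and 100 °C for the three trip points, with a 2 °C hysteresis.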


These temperatures do not match the vendor images' settings, so I changed them to match, as follows:
 

                        trips {

                                cpu_alert0 {
                                        temperature = <0x11170>;
                                        hysteresis = <0x7d0>;
                                        type = "passive";
                                        phandle = <0x5a>;
                                };

                                cpu_alert1 {
                                        temperature = <0x14c08>;
                                        hysteresis = <0x7d0>;
                                        type = "passive";
                                        phandle = <0x5b>;
                                };

                                cpu_crit {
                                        temperature = <0x1c138>;
                                        hysteresis = <0x7d0>;
                                        type = "critical";
                                        phandle = <0xf3>;
                                };
                        };

After making this change, I no longer get any reboots caused by overheating, and `armbianmonitor -m` shows that the CPU is correctly throttled at 85 degrees Celsius when stress testing without a heatsink or fan.
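One way to double-check that the running kernel picked up the edited table is to read the trip points back from the thermal sysfs interface, which exposes `trip_point_N_temp` and `trip_point_N_type` files per zone. A minimal sketch, assuming `thermal_zone0` is the CPU zone on this board (that zone number is an assumption, not something confirmed in this thread):

```python
from pathlib import Path


def read_trip_points(zone_dir):
    """Return [(type, degrees Celsius)] for every trip point in a thermal zone directory."""
    zone = Path(zone_dir)
    trips = []
    index = 0
    # Trip points are numbered from 0; stop at the first missing index.
    while (zone / f"trip_point_{index}_temp").exists():
        temp = int((zone / f"trip_point_{index}_temp").read_text().strip())
        kind = (zone / f"trip_point_{index}_type").read_text().strip()
        trips.append((kind, temp / 1000))  # sysfs reports millidegrees Celsius
        index += 1
    return trips


if __name__ == "__main__":
    # Assumption: thermal_zone0 is the CPU zone; check the zone's "type" file to be sure.
    for kind, celsius in read_trip_points("/sys/class/thermal/thermal_zone0"):
        print(f"{kind}: {celsius:g} °C")
```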

 

Is this the right way to solve this issue? If so, can someone make a PR for this? Thanks.

Edited by slimcomp

Thanks for noticing this, and sorry for the very late answer. I will check and fix the trip points soon. There may also be room for improvement over the stock/vendor trip points, using the granularity provided by the devfreq framework for the GPU and DMC (read: the GPU and DDR can be throttled as well to lower the temperature of the board).


I took a look and the trip points table is provided by the mainline kernel.

The trip points on mainline are:

  1. 85 °C - big cores are thermal throttled (even down to 600 MHz from 1.8 GHz)
  2. 95 °C - all cores are thermal throttled
  3. 100 °C - device reboots

On vendor kernel instead:

  1. 70 °C - big cores are thermal throttled
  2. 85 °C - all cores are thermal throttled
  3. 115 °C - device reboots

 

As far as I can see, the mainline trip points look reasonable to me, and the device reboots at 100 °C, which seems a reasonable critical temperature to prevent long-term physical damage.

If you reach 100 °C and beyond, you definitely need a heatsink and proper heat dissipation.

 

In my tests, running `openssl speed -multi 6` and `mbw -n 1000 256` concurrently to stress the CPU with crypto tests and the DRAM with benchmarks, the board never exceeds 86 °C: thermal throttling of the big cores kicks in and appears sufficient to keep the SoC at a reasonable temperature even after several minutes of sustained load. Note that my board is not in any kind of enclosure.

 

I don't think there is a real need to change the trip points.

On your setup you perhaps don't get reboots because you're allowed to stress the SoC up to 115 °C, but you should find a way to remove the excess heat rather than raise the limits, or limit the core frequency to reduce power dissipation if you're in a constrained environment.
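For the "limit the core frequency" option, the cpufreq sysfs interface accepts an upper limit in kHz through each policy's `scaling_max_freq` file. A rough sketch of writing such a cap; the policy path and the 1.2 GHz value in the usage note are illustrative assumptions, not settings recommended in this thread:

```python
from pathlib import Path


def cap_max_freq(policy_dir, max_khz):
    """Write an upper frequency limit (in kHz) to a cpufreq policy directory.

    Returns the value read back afterwards, since the kernel may adjust it.
    """
    path = Path(policy_dir) / "scaling_max_freq"
    path.write_text(f"{max_khz}\n")
    return int(path.read_text().strip())
```

On the board this would be called as root with something like `cap_max_freq("/sys/devices/system/cpu/cpufreq/policy4", 1200000)`, assuming `policy4` is the policy covering the big cores. The value should be one of those listed in that policy's `scaling_available_frequencies` file, and the setting does not persist across reboots.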

 

edit:

I should partially retract that: it seems that the 95 °C trip point for all cores is way too high. My board hung at 94 °C during the RSA test with openssl, so it is quite probably better to change the trip points this way:

  1. 82 °C - big cores are thermal throttled
  2. 85 °C - all cores and DMC are thermal throttled
  3. 90 °C - device reboots

 

This should also give the board enough room to recover after a reboot.
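To turn the proposed temperatures into devicetree cells, each value is multiplied by 1000 (millidegrees Celsius) and written in hex. A small sketch of that conversion, assuming the three values map onto the existing `cpu_alert0`, `cpu_alert1` and `cpu_crit` nodes in the same order as the tables earlier in the thread:

```python
# Proposed trip points in degrees Celsius, keyed by the node names used earlier in the thread.
# Assumption: the list order corresponds to cpu_alert0, cpu_alert1, cpu_crit.
PROPOSED_TRIPS = {"cpu_alert0": 82, "cpu_alert1": 85, "cpu_crit": 90}


def celsius_to_cell(celsius):
    """Format a temperature as a devicetree cell in millidegrees Celsius."""
    return f"<{round(celsius * 1000):#x}>"


for name, celsius in PROPOSED_TRIPS.items():
    print(f"{name}: temperature = {celsius_to_cell(celsius)};")
```

This yields `<0x14050>`, `<0x14c08>` and `<0x15f90>`. It only covers the trip temperatures; having the DMC throttled at the second trip point would additionally need a matching cooling map, which is not shown here.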
