tkaiser
Posted September 24, 2017

TL;DR: Unlike their bigger siblings, the small H2+/H3 boards are all prone to overheating due to their smaller PCB size (on the larger boards the PCB's ground plane acts somewhat as a large heatsink, dissipating heat away from the SoC). Since mainline kernel settings are currently not optimized, all these boards are slower under constant load than with the legacy kernel. This should change, but it won't unless someone looks into it and spends some time on it.

Two areas deal with this overheating tendency or are somewhat related:

- Thermal protection / throttling: use the thermal sensor(s) inside the SoC to downclock various engines if specific thresholds are exceeded.
- DVFS (dynamic voltage and frequency scaling): all the small boards have either no voltage regulation (NanoPi NEO2) or a primitive one that only switches between 1.1V and 1.3V.

With the sun8i legacy kernel, Armbian and linux-sunxi community members spent a lot of time and effort on improving thermal/throttling performance. Please read through the following as a reference: https://github.com/armbian/build/issues/298

The result of our optimizations was much better performance compared to Allwinner's defaults (which targeted only Android and preferred higher single-thread performance over better overall performance; with Allwinner settings on an overheating system you could pretty easily end up with just one or two active CPU cores).

Now with the mainline kernel, the situation for the larger H3 boards is ok-ish (those boards have an I2C-adjustable voltage regulator, voltage switching is fine-grained, overheating isn't much of an issue anyway, and performance is almost as good as with the legacy kernel). But the situation with the smaller boards needs some attention.
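To put the two voltage levels of the primitive regulator into perspective, here is a rough sketch. It assumes the textbook first-order approximation that dynamic CMOS power scales with f * V^2 and ignores leakage, so treat it as an estimate and not as a measurement on H2+/H3. The function name is made up for illustration.

```python
def relative_dynamic_power(freq_mhz, volts, ref_freq_mhz, ref_volts):
    """Dynamic power relative to a reference operating point.

    Assumption: P is proportional to f * V^2 (first-order model, no leakage).
    """
    return (freq_mhz / ref_freq_mhz) * (volts / ref_volts) ** 2


# Same clock, regulator switched from 1.3V down to 1.1V.
# 816 MHz is only used as an example of a clock that runs at the lower voltage.
ratio = relative_dynamic_power(816, 1.1, 816, 1.3)
print(f"{ratio:.2f}")  # 0.72
```

Under that assumption, dropping from 1.3V to 1.1V at an unchanged clock would save a bit more than a quarter of the dynamic power, which is why the voltage switch matters so much on boards that tend to overheat.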
If we run the cheapest boards with the mainline kernel today, we're talking about these settings:

- max cpufreq 1008 MHz (at 1.3V), next lower cpufreq 816 MHz at 1.1V, then 624/480/312/240/120 MHz defined
- 4 thermal trip points defined: throttling starts at 65°C, further trip points follow at 75°C and 90°C, and the board is shut down when 105°C is reached

With Armbian and the legacy kernel it's the following instead:

- max cpufreq is 1200 MHz, then 1008 MHz still at 1.3V; at 912 MHz we switch to 1.1V, and below that a few other cpufreqs are available
- between 816 MHz and 1344 MHz Armbian's legacy kernel provides cpufreq steps every 48 MHz (allowing for fine-grained throttling)
- on the small boards we use twice as many thermal trip points as the mainline settings, and our strategy is to switch to 912MHz@1.1V pretty early once throttling occurs

These differences result in lower 'normal' performance (since the mainline kernel also limits single-threaded tasks to 1 GHz instead of 1.2 GHz) and also in lower 'full load' performance, since the DVFS/THS/throttling settings are not optimal and, once the board reaches the first thermal trip point, throttling is not as efficient as with legacy.

It's easy to test: grab an OPi Zero, NanoPi Duo or any of the other H2+/H3 boards with primitive voltage regulation, then grab one Armbian OS image with legacy kernel (3.4.113 using fex settings) and one with mainline kernel. Execute on both:

- sudo armbianmonitor -r (installs RPi-Monitor so you can enjoy nice graphs when connecting with a web browser to port 8888 of your machine)
- sudo armbianmonitor -p (installs cpuminer, which is a great tool to heat up your board and also to measure 'thermal performance', since it prints khash/s values in benchmark mode)
- minerd --benchmark (this is the actual benchmark run)

With mainline kernel performance is lower. Expected result: same performance. What to do? Improve mainline settings.

BTW: Mainline settings currently are as they are since these were the values megi started with last year.
Once numbers exist, they only get passed on by copy&paste.
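To illustrate why the step size matters for throttling, here is a minimal sketch comparing how much clock speed is lost by a single throttling step down from 1008 MHz. The mainline list is the one quoted above. The legacy list is an assumption: it simply takes 48 MHz steps from 816 MHz up to the 1200 MHz maximum, and the real table on a given board may differ. The function names are made up for illustration.

```python
# Mainline cpufreq steps as listed in the post (MHz)
MAINLINE_MHZ = [1008, 816, 624, 480, 312, 240, 120]

# Assumption: legacy steps every 48 MHz between 816 MHz and the 1200 MHz maximum
LEGACY_MHZ = list(range(1200, 816 - 1, -48))


def next_lower(freqs, current):
    """Return the highest available frequency below `current`, or None."""
    lower = [f for f in freqs if f < current]
    return max(lower) if lower else None


def drop_percent(freqs, current):
    """Clock speed lost (in percent) by throttling one step down from `current`."""
    nxt = next_lower(freqs, current)
    return None if nxt is None else 100.0 * (current - nxt) / current


print(f"{drop_percent(MAINLINE_MHZ, 1008):.1f}")  # 19.0
print(f"{drop_percent(LEGACY_MHZ, 1008):.1f}")    # 4.8
```

With the coarse table the first throttling step costs roughly a fifth of the clock speed, while 48 MHz steps let the governor settle much closer to what the board can actually sustain.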
grg
Posted October 23, 2017

This is strange. While the mainline kernel might not be as finely tuned as the legacy kernel, and the smaller boards are known to run hot, my experience doesn't seem to be typical. These are the temperature readings from my OrangePi Zero v1.4 running the "next" build:

Time        CPU    load %cpu %sys %usr %nice %io %irq   CPU  C.St.
07:03:07:  240MHz  0.03   2%   1%   0%   0%   0%   0% 103.5°C  5/8
07:03:12:  240MHz  0.02   2%   1%   0%   0%   0%   0% 103.9°C  6/8
07:03:17:  240MHz  0.10   2%   1%   0%   0%   0%   0% 103.7°C  5/8
07:03:23:  240MHz  0.09   2%   1%   0%   0%   0%   0% 103.8°C  5/8
07:03:28:  240MHz  0.08   2%   1%   0%   0%   0%   0% 104.3°C  6/8

Yet it passes the "thumb test". The highest reading I get with my cheap infrared thermometer is 42℃, but the average is around 37℃ (reading from both sides). The board shuts down doing a simple "apt-get update". Anyway, I'm just sharing for some extra context.
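As a rough illustration of how little headroom these reported values leave, the sketch below parses the monitor lines quoted above and compares them with the 105°C shutdown trip point mentioned earlier in the thread for mainline. Whether that trip point is really what triggers the shutdown on this particular board is an assumption, and the helper name is made up for illustration.

```python
import re

CRITICAL_C = 105.0  # mainline shutdown trip point quoted earlier in the thread

# Monitor output as posted above
SAMPLE = """\
07:03:07: 240MHz 0.03 2% 1% 0% 0% 0% 0% 103.5°C 5/8
07:03:12: 240MHz 0.02 2% 1% 0% 0% 0% 0% 103.9°C 6/8
07:03:17: 240MHz 0.10 2% 1% 0% 0% 0% 0% 103.7°C 5/8
07:03:23: 240MHz 0.09 2% 1% 0% 0% 0% 0% 103.8°C 5/8
07:03:28: 240MHz 0.08 2% 1% 0% 0% 0% 0% 104.3°C 6/8
"""


def temps(text):
    """Extract all temperature values (in °C) from monitor output."""
    return [float(m) for m in re.findall(r"(\d+(?:\.\d+)?)°C", text)]


readings = temps(SAMPLE)
mean_c = sum(readings) / len(readings)
headroom = CRITICAL_C - max(readings)
print(f"{mean_c:.1f} {headroom:.1f}")  # 103.8 0.7
```

If the reported values are what the kernel acts on, an idle board sitting less than 1°C below the shutdown threshold would need only a tiny bit of load to cross it, regardless of what an external thermometer shows.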