Mainline kernel and dvfs / throttling / thermal settings



This week we provided experimental Armbian images with mainline kernel for a few H3 boards. On the official download pages there are a few variants with kernel 4.6.7 lurking around, and here you find some with 4.7.2: http://kaiser-edv.de/tmp/w8JAAY/

 
Those for the NEO are also suitable for NanoPi M1 and OPi One/Lite; the one for OPi PC Plus can be used on any of the larger Oranges (no Ethernet on +/+2/+2E -- don't ask please, this is stuff for later) since they all use the same more sophisticated voltage regulator that can adjust VDD_CPUX in 20 mV steps (VDD_CPUX is the voltage the CPU cores are fed with). The procedure is called dynamic voltage frequency scaling (dvfs) and the idea behind it is to lower the voltage on the CPU cores when they're running at lower clockspeeds and vice versa.
 
In the meantime this works pretty well with the legacy kernel, but it required a lot of work to come up with optimal settings that are still reliable (undervolting the CPU causes stability problems and data corruption!) while also providing the best performance (lower VDD_CPUX voltage, less heat, later throttling). For details please read through the whole issue here: https://github.com/igorpecovnik/lib/issues/298
 
So what has changed with mainline kernel now? In Armbian we use megi's kernel branch containing Ethernet and dvfs/THS patches and a few others. What's still missing and what do you get now when trying out these mainline images? No HDMI, no audio, no sophisticated SBC stuff like I2C, SPI and so on (unless you know how to deal with device tree overlays).
 
But you do get USB, Ethernet on all Fast Ethernet equipped devices (no network on GbE models currently!), cpufreq scaling / dvfs, working WiFi and, with 4.7, also the opportunity to test out USB OTG and the new schedutil cpufreq governor.
 
My 4.7.2 Armbian releases also contain a new armbianmonitor variant that can deal with mainline kernel on H3 boards (different templates are needed and a different method to get VDD_CPUX values -- fex file vs. device tree) and that can install cpuminer. Why cpuminer? To test the efficiency of throttling settings -- see below. As usual RPi-Monitor can be installed using 'sudo armbianmonitor -r', and now cpuminer can be installed with 'sudo armbianmonitor -p' (p for performance measurements).
 
To let cpuminer run in fully automated mode do a 'touch /root/.cpuminer'; minerd in benchmark mode will then start immediately after booting and results will be collected by RPi-Monitor (not on the status page but only on the statistics page -- actual values aren't interesting, only behaviour over time!).
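Put together, the whole monitoring/benchmarking setup described above is just three commands (Armbian-specific, so only useful on these images):

```shell
sudo armbianmonitor -r       # install RPi-Monitor
sudo armbianmonitor -p       # install cpuminer ('p' for performance measurements)
sudo touch /root/.cpuminer   # let minerd start in benchmark mode on next boot
```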
 
Dvfs settings for Orange Pi PC and the other SY8106A equipped H3 devices already look good and work quite well (although I would tweak them here and there) while those settings for the H3 devices with the more primitive voltage regulation do not.
 
I used a NanoPi NEO for the test but the results apply to all H3 boards that use only 2 different VDD_CPUX voltages: NanoPi M1/NEO/NEO-Air and OPi One/Lite. Unlike the Oranges, NanoPi M1 and especially the NEO overheat more easily (maybe due to smaller PCB size, single bank DRAM configuration and LDO regulators near the SoC?). And I tested on the only remaining NEO that does not wear a heatsink.
 
In the beginning I allowed 1200 MHz max cpufreq, but since I used neither heatsink nor fan, throttling had to jump in to prevent overheating. In this mode H3 started running cpuminer at 1200 MHz, clocked down to 1008 MHz pretty fast and from then on constantly switched between 624 MHz (1.1V VDD_CPUX) and 1008 MHz (1.3V VDD_CPUX). The possible 816 MHz (1.1V) step in between was never used. Average consumption in this mode was 2550 mW and the average cpuminer score 1200 khash/s:
 
[screenshot: RPi-Monitor consumption graph]
 
I then limited max cpufreq to 816 MHz through sysfs and let the test continue. In the beginning H3 switched between 624 and 816 MHz, but since SoC temperature decreased further H3 then stayed at 816 MHz all the time and below 75°C (the highest allowed cpufreq at the lower VDD_CPUX core voltage with megi's settings!). Average consumption in this mode was 2420 mW and the average cpuminer score 1350 khash/s.
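For reference, this is the kind of sysfs write I used to cap the clockspeed (value in kHz; on H3 all four cores share one cpufreq policy, so writing to cpu0 is sufficient):

```shell
# limit maximum cpufreq to 816 MHz, takes effect immediately (needs root)
echo 816000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq
```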
 
This is how cpufreq and temperatures correlated over time:

Constantly switching between 624 and 1008 MHz.

08:06:49: 1008MHz  4.17 100%   1%   0%  98%   0%   0% 74.9°C
08:06:54:  624MHz  4.15 100%   0%   0%  99%   0%   0% 77.2°C
08:06:59:  624MHz  4.14 100%   0%   0%  99%   0%   0% 82.1°C
08:07:05:  624MHz  4.21 100%   0%   0%  99%   0%   0% 75.6°C
08:07:10: 1008MHz  4.19 100%   1%   0%  98%   0%   0% 74.4°C
08:07:15: 1008MHz  4.26 100%   0%   0%  99%   0%   0% 74.9°C
08:07:20:  624MHz  4.22 100%   0%   0%  99%   0%   0% 76.8°C
08:07:26:  624MHz  4.20 100%   2%   0%  97%   0%   0% 82.6°C
08:07:31:  624MHz  4.18 100%   2%   0%  97%   0%   0% 75.5°C
08:07:36:  624MHz  4.17 100%   2%   0%  97%   0%   0% 82.6°C
08:07:41: 1008MHz  4.16 100%   0%   0%  99%   0%   0% 75.0°C
08:07:47:  624MHz  4.14 100%   1%   0%  98%   0%   0% 76.7°C
08:07:52: 1008MHz  4.13 100%   1%   0%  98%   0%   0% 82.3°C

Now upper clockspeed set to 816 MHz and SoC cooling slightly down:

08:11:32:  624MHz  4.16 100%   0%   0%  99%   0%   0% 75.2°C
08:11:37:  624MHz  4.15 100%   1%   0%  98%   0%   0% 75.7°C
08:11:43:  816MHz  4.14 100%   1%   0%  98%   0%   0% 74.6°C
08:11:48:  816MHz  4.13 100%   0%   0%  99%   0%   0% 74.1°C
08:11:53:  816MHz  4.12 100%   0%   0%  99%   0%   0% 74.1°C
08:11:58:  624MHz  4.11 100%   0%   0%  99%   0%   0% 75.7°C
08:12:04:  624MHz  4.10 100%   0%   0%  99%   0%   0% 75.1°C
08:12:09:  624MHz  4.17 100%   1%   0%  98%   0%   0% 73.5°C
08:12:14:  624MHz  4.16 100%   0%   0%  99%   0%   0% 75.5°C
08:12:19:  816MHz  4.14 100%   0%   0%  99%   0%   0% 74.7°C
08:12:24:  816MHz  4.13 100%   0%   0%  99%   0%   0% 74.7°C
08:12:30:  816MHz  4.12 100%   0%   0%  99%   0%   0% 73.8°C
08:12:35:  624MHz  4.11 100%   0%   0%  99%   0%   0% 75.2°C
08:12:40:  624MHz  4.10 100%   1%   0%  98%   0%   0% 75.2°C
08:12:45:  816MHz  4.09 100%   0%   0%  99%   0%   0% 74.7°C

And now finally remaining at 816 MHz and below 75°C all the time:

08:18:22:  624MHz  4.14 100%   0%   0%  99%   0%   0% 75.5°C
08:18:28:  816MHz  4.13 100%   0%   0%  99%   0%   0% 74.9°C
08:18:33:  624MHz  4.12 100%   0%   0%  99%   0%   0% 73.8°C
08:18:38:  624MHz  4.11 100%   1%   0%  98%   0%   0% 75.1°C
08:18:43:  816MHz  4.10 100%   1%   0%  98%   0%   0% 75.0°C
08:18:49:  816MHz  4.09 100%   0%   0%  99%   0%   0% 73.8°C
08:18:54:  816MHz  4.09 100%   0%   0%  99%   0%   0% 75.0°C
08:18:59:  816MHz  4.08 100%   0%   0%  99%   0%   0% 73.7°C
08:19:04:  816MHz  4.07 100%   2%   0%  97%   0%   0% 74.7°C
08:19:10:  624MHz  4.07 100%   2%   0%  97%   0%   0% 75.5°C
08:19:15:  816MHz  4.06 100%   2%   0%  97%   0%   0% 75.0°C
08:19:20:  816MHz  4.06 100%   0%   0%  99%   0%   0% 73.7°C
08:19:25:  816MHz  4.05 100%   0%   0%  99%   0%   0% 74.9°C
08:19:31:  816MHz  4.05 100%   0%   0%  99%   0%   0% 73.9°C
08:19:36:  816MHz  4.04 100%   2%   0%  97%   0%   0% 74.6°C
08:19:41:  816MHz  4.04 100%   2%   0%  97%   0%   0% 74.4°C
08:19:46:  816MHz  4.04 100%   0%   0%  99%   0%   0% 74.7°C
08:19:51:  816MHz  4.03 100%   0%   0%  99%   0%   0% 74.5°C

 
So we got an increase in performance from 1200 to 1350 khash/s (+12.5%) while also lowering consumption by 130 mW (2550 mW vs. 2420 mW). So not only did performance increase but also the performance per watt ratio -- simply by manually adjusting maximum cpufreq and forbidding everything above 816 MHz. Quite the opposite of what one would expect ;)
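A quick sanity check of the performance per watt claim, using only the numbers from above:

```shell
# performance per watt before and after capping cpufreq at 816 MHz
# (1200 khash/s at 2550 mW vs. 1350 khash/s at 2420 mW, see above)
awk 'BEGIN {
    before = 1200 / 2.550
    after  = 1350 / 2.420
    printf "%.0f vs %.0f khash/s per watt (+%.1f%%)\n", before, after, (after / before - 1) * 100
}'
```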
 
At least it should be obvious that dvfs settings for the small H3 devices need some attention. I monitored consumption through the AXP209 on a Banana Pro feeding the H3 device through its USB port. The high voltage fluctuations caused by the NEO's voltage regulator constantly switching between 1.1V and 1.3V can be seen until 8:12; the '30 minutes average value' then stabilized at 8:42 and the real consumption difference could be read: 130 mW:
 
[screenshot: consumption graph showing the 130 mW difference]
 
 
 
With legacy kernel we defined a lot more possible cpufreq/dvfs operating points so let's give it a try: Same hardware setup (same NEO, same USB-to-Micro-USB cable from Banana Pro to NEO, same upright position for nearly identical thermal behaviour), same DRAM clockspeed (408 MHz) but different DVFS/THS settings of course:
 
[screenshot: legacy kernel test graphs]
 
If we compare mainline kernel with max cpufreq limited to 816 MHz and legacy kernel we get
  • 4.7.2: 1350 khash/s, ~74°C, constant 816 MHz cpufreq, 2420 mW reported consumption
  • 3.4.112: 1150 khash/s, ~80°C, 648-720 MHz cpufreq, 2610 mW reported consumption
Looking at these numbers it simply feels weird: lower clockspeed, lower performance but higher temperatures. That would be an indication of different thermal readouts between mainline and legacy kernel (we had the same issue half a year ago when switching from BSP u-boot to mainline u-boot: temperature readouts 10-15°C lower).
 
But fortunately I also monitored consumption, and there it's almost 200 mW more -- on the same hardware with the same setup. So there is really something happening on the NEO that wastes more energy when running minerd with the legacy kernel, and that might be responsible for higher temperatures and more aggressive throttling leading to lower performance (at least that's the only possible reason I can imagine).
 
Since we already know that on the NEO adjusting DRAM clockspeed with legacy kernel makes a huge difference regarding consumption and temperatures (see post #13 here), maybe the whole problem is related to the different DRAM config on the NEO (single bank vs. dual bank on all other H3 devices) and something's wrong with mainline kernel here? Don't know, but I already decided to repeat the test with NanoPi M1 (dual bank DRAM config but also the primitive 1.1/1.3V voltage regulation).

Test done with NanoPi M1 (with the image made for the NEO; only real difference: in both cases DRAM has been clocked at 408 MHz instead of the 624 MHz we usually use in Armbian):

 

I used the following script, called from /etc/rc.local, to iterate through 816, 624, 480 and 240 MHz cpufreq while running cpuminer, and then killed the task to get idle consumption. The board remains in each state for 40 minutes so my 'monitoring PSU' is able to provide '30 min average consumption' numbers.

#!/bin/bash
# set each max cpufreq (in kHz), let cpuminer run for 40 minutes in that
# state, then kill minerd so idle consumption can be measured afterwards
for i in 816000 624000 480000 240000 ; do
	echo $i >/sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq
	sleep 2400
done
pkill minerd

Cpuminer scores at the 4 different CPU clockspeeds were identical with both kernels: 1420, 1110, 875 and 450 khash/s.
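As a side note, these scores scale almost linearly with clockspeed, as you would expect from a purely CPU-bound workload like cpuminer:

```shell
# khash/s per MHz at each tested clockspeed
awk 'BEGIN {
    split("816 624 480 240", f); split("1420 1110 875 450", k)
    for (i = 1; i <= 4; i++)
        printf "%d MHz: %.3f khash/s per MHz\n", f[i], k[i] / f[i]
}'
```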

 

But both temperatures and consumption differed (on the left is kernel 4.7.2, on the right 3.4.112 -- same board, same cables, same PSU, same SD card, only OS image exchanged):

816 MHz: 67°C / 1820 mW    73°C / 2060 mW
624 MHz: 62°C / 1600 mW    66°C / 1820 mW
480 MHz: 57°C / 1430 mW    62°C / 1660 mW
240 MHz: 49°C / 1100 mW    54°C / 1340 mW
idle:    41°C /  750 mW    47°C /  980 mW

(I had to slightly adjust temperature readouts since the tests take a long time to get sane average consumption values and ambient temperature in the lab differs by 2°C between day and night).

 

On average we get a difference of +5°C / +230 mW for the legacy kernel compared to mainline. This means running mainline kernel is another good measure to reduce consumption, since with legacy kernel I already used NEO settings with GPU/HDMI disabled. You should also keep in mind that I powered the NanoPi M1 through FriendlyARM's PSU-ONECOM module which adds a few mW to consumption, so NanoPi M1 with mainline kernel and NEO settings (DRAM clockspeed set to 408 MHz) might already consume less than 700 mW.
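The '+5°C / +230 mW on average' can be checked directly against the five rows of the table above:

```shell
# average temperature and consumption delta between 3.4.112 and 4.7.2,
# using the values from the table above (4.7.2 first, 3.4.112 second)
awk 'BEGIN {
    n = split("67 62 57 49 41", t1); split("73 66 62 54 47", t2)
    split("1820 1600 1430 1100 750", p1); split("2060 1820 1660 1340 980", p2)
    for (i = 1; i <= n; i++) { dt += t2[i] - t1[i]; dp += p2[i] - p1[i] }
    printf "+%.1f°C / +%.0f mW on average\n", dt / n, dp / n
}'
```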

 

Detailed graphs below:

 

 

Mainline kernel / in the beginning allowing 1200 MHz cpufreq max, which shows the bad throttling behaviour of switching between 624 and 1008 MHz without using the 816 MHz step in between:

 

[two screenshots: RPi-Monitor graphs, mainline kernel]

 

And now legacy kernel 3.4.112:

 

[two screenshots: RPi-Monitor graphs, legacy kernel 3.4.112]

 

 

 


So I know you said not to ask about gigabit Ethernet on, say, the 2E buuuuuut.... :P

What's blocking that? Looks like EMAC support isn't in 4.7 -- how about 4.8?

Amusingly I just got around to using your tools to build my own Armbian image with 4.7.4, hit the same GbE roadblock and figured 4.8 would probably help, then upon checking here I see you're already working on it :P


What's blocking that?

 

Those lazy Armbian devs ;)

 

To be honest I got really too lazy to keep up with all the device tree stuff you'd have to adjust with every new kernel version. Currently we use megi's github repo as source (at 4.7, containing cpufreq/dvfs settings that would need some love/attention, but only outdated Ethernet stuff and lacking support for GbE enabled boards).

 

I'm currently testing montjoie's Ethernet v4 driver; you can have a look here at what to change in the build system (click on the spoiler thingie): http://forum.armbian.com/index.php/topic/2044-some-discovery-while-trying-520-builds/?p=15717 (choose Orange Pi Plus then; it will run on the OPi Plus 2E but not all USB ports will work).

 

You should have at least a heatsink on the H3 (1296 MHz, no throttling implemented) and then you most probably want to try out this patch: https://irclog.whitequark.org/linux-sunxi/2016-09-19#17604648;


> Those lazy Armbian devs ;) [...]

Image building now. I don't like the inbound retries but eh, it's probably fine; it's not gonna be under too much load when disk/CPU is the bottleneck :)

 


My bad (I'm doing too much stuff in parallel today): I disabled a few of the patches that might break 4.8 compilation (not tested, just disabled everything I currently don't need). From the same location where compile.sh is, do:

for i in add_missing_UARTs_I2Cs_SPI_for-H3.patch patch-for-rtl8189fs-sun8i-h3-orangepi-pc-dts.patch scripts-dtc-Update-to-version-with-overlays.patch ; do
    touch userpatches/kernel/sun8i-dev/${i}
done

And of course you need the latest fix for the Ethernet driver in userpatches/kernel/sun8i-dev/montjoe-ethernet-fix.patch

 

And please keep in mind that the patches are from now on disabled unless you remove the empty files in the userpatches directory. Simply check the changelog -- maybe Martin is already busy updating them for 4.8?

root@orangepiplus:~# uname -a
Linux orangepiplus 4.8.0-sun8i #1 SMP Mon Sep 19 18:07:30 BST 2016 armv7l armv7l armv7l GNU/Linux

Wahey! So to recap, switching to montjoie's repo and running that script was enough. Next up: installing to eMMC, then setting up Ceph. One thing I've noticed is that /sys/class/thermal has no devices, so you definitely need a heatsink on these as they won't know they're on fire.

*edit* aaaand installing to eMMC failed, or rather booting from it failed.


> switching to montjoie's repo and running that script was enough [...] installing to eMMC failed, or rather booting from it failed.

 

That's also known. I think Martin found one solution, since with mainline kernel for whatever reason the eMMC is not mmcblk1 but mmcblk2 instead (so changing this in nand-sata-install was his work-around). Don't know the status now since the whole thing should work based on UUIDs by now.

 

BTW: In case you're somewhat familiar with network performance testing you owe me some numbers ;)

 

These two modifications (remember?) led to restored performance and no retransmits in one direction, but affected performance in the other direction somewhat negatively:

 

  1. echo 32768 > /proc/sys/net/core/rps_sock_flow_entries
  2. echo 32768 > /sys/class/net/eth0/queues/rx-0/rps_flow_cnt
 
It would be great if you could play around a bit with those values, and in case you come up with values that lead to more balanced numbers please get back to us :)
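A quick way to sweep a few candidate values could look like this (just a sketch; $SERVER stands for the address of your iperf3 server and is not something from this thread):

```shell
#!/bin/bash
# try several RPS table sizes and measure throughput with iperf3 each time
for v in 4096 8192 16384 32768 65536 ; do
    echo $v > /proc/sys/net/core/rps_sock_flow_entries
    echo $v > /sys/class/net/eth0/queues/rx-0/rps_flow_cnt
    echo "=== rps value $v ==="
    iperf3 -c "$SERVER" -t 60 | tail -4
done
```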
 
BTW: We have code for the dvfs / thermal / cpufreq stuff in one repo and the most recent Ethernet driver in the other. Combining both is possible but also requires updating all the .dts files for all devices (which is something I pretty reliably fail at :) ). I would wait for code reviews of montjoie's v5 and then maybe create a set of patches suitable for all H3 devices around.
 
But I fear getting back to the upstream kernel guys and fighting there for megi's patches and each H3 board being treated individually is the better way to waste my time...

Well, I can give you numbers but take them with a large pinch of salt; for the baseline tests pfSense is running on ESXi, though it does have high priority and a NIC dedicated to it via PCIe passthrough.

Baseline Thinkpad T60 to pfsense: http://pastebin.com/zGsyZ4xN
Baseline pfsense to Thinkpad T60: http://pastebin.com/raLKzdRw
So the network itself is good for 900-940 Mbit/s.

OPi2E running kernel 4.8 to Thinkpad: http://pastebin.com/k7sEZC48
OPi2E running kernel 3.4 to Thinkpad: http://pastebin.com/nmbkMqf1
Both devices get 830-840 Mbit/s but the one running 4.8 takes longer to get there, spending a while at 730 first.

Thinkpad to OPi2E running kernel 4.8 caused the OPi to drop off the network after 50 seconds; throughput was all over the place though, 700-900 Mbit/s.
Thinkpad to OPi2E running kernel 3.4: http://pastebin.com/n5ENN9v7
The board running 3.4 is all over the place too, guess that's just normal behaviour.

I went looking and found the one that dropped off the network had rebooted and changed MAC/IP in the process. I ran iperf against it again, this time it got to 21 seconds before rebooting. http://pastebin.com/SqQ74ZaX

That rebooting is a problem meant to be solved by the v4 branch right?
 


Did you apply montjoie's latest patch? Without it, kernel panics / reboots are what's to be expected (BTW: a consistent MAC address can be assigned both in the .dts and in /etc/network/interfaces -- and if you're already there you could assign static IP addresses as well).

 

Regarding the 'strange' variation in throughput: that's most likely caused by iperf3 running on the same CPU core as the Ethernet IRQ handler. Using taskset to keep iperf3 away from cpu3 is key to more constant iperf performance: https://irclog.whitequark.org/linux-sunxi/2016-09-19#17605514;
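For example (assuming, as described above, that the Ethernet IRQ is handled on cpu3):

```shell
# pin the iperf3 server to cores 0-2 so it doesn't compete
# with the Ethernet IRQ handler running on cpu3
taskset -c 0-2 iperf3 -s
```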


> Did you apply montjoie's latest patch? [...]

Nope, didn't apply that patch as I couldn't find it! Figured it'd be at http://sunxi.montjoie.ovh/ethernet/ but I get a 403 trying to load that. 


Tried just the kernel but for whatever reason the NIC didn't start. Anyway, I made a new image and am testing it now, results in a few minutes. It's stayed up for 150 seconds so far, which bodes well.

Check out that stability, 600 seconds! http://pastebin.com/GLzMQFs2 

Now I'll move iperf off core 3: http://pastebin.com/fmj8gL4F

 

Still not perfect but good enough for my purposes. Let's try the other way: http://pastebin.com/95mFBs0V

 

Hmm, 718 Mbit/s vs 835 without the patch. Next up, let's mess with the RPS values; the default is 0 for both:
 

root@orangepiplus:~# cat /proc/sys/net/core/rps_sock_flow_entries
0
root@orangepiplus:~# cat /sys/class/net/eth0/queues/rx-0/rps_flow_cnt
0

First up, 32768 for both. Results? OPi to Thinkpad sees 720 for a while, rising to 870 before dropping back down. That makes sense as only the inbound queue is changing. I'll only mention it if it gets interesting from here: http://pastebin.com/k7sEZC48

Inbound is where the interesting numbers should be at and indeed they are, 937mbit average! That's with iperf on cores 0-2 btw: http://pastebin.com/6HZGDYqz

So the question is how do we improve outbound?  *edit* I reran the outbound test and got 870 continuously. 

On a separate note, I'm still trying to get booting from eMMC working. nand-sata-install is the latest version and boot.cmd uses UUIDs, so the advice here doesn't work: http://forum.armbian.com/index.php/topic/2082-banana-pi-m2-with-system-on-emmc-data-on-sd-card/

 

The observed behaviour is that after just running nand-sata-install and then trying to boot from eMMC I get no green light on the board. If I then follow the u-boot copying advice found in the following post I get a green light but... that's it: the NIC never fires up and I can't tell whether anything else happens: http://forum.armbian.com/index.php/topic/2046-vanilla-kernel-on-opi-pc-install-to-emmc/?p=15685

 

*edit* Found it! The nand-sata-install script writes to /mnt/bootfs/etc/fstab and adds a line saying mount this UUID at /, and that works. However, the fstab in /mnt/rootfs/etc/fstab still uses device names (mmcblk2). Change it to use UUIDs too and it works -- basically just copy /mnt/bootfs/etc/fstab over /mnt/rootfs/etc/fstab.
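A sketch of that fix (device names and paths as observed above -- double-check them on your board before overwriting anything):

```shell
# keep a backup, then replace the name-based rootfs fstab with the
# UUID-based one that nand-sata-install wrote to the boot filesystem
cp /mnt/rootfs/etc/fstab /mnt/rootfs/etc/fstab.bak
cp /mnt/bootfs/etc/fstab /mnt/rootfs/etc/fstab
```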

 
