Everything posted by tkaiser

  1. Hi, in an attempt to add new information and correct conflicting information at the bottom of https://linux-sunxi.org/SID_Register_Guide#Currently_known_SID.27s it would be great if as many owners of Allwinner boards as possible could execute the most recent sbc-bench version and post/share the resulting link (ix.io/sprunge.us) here: https://github.com/ThomasKaiser/sbc-bench#execution The new sbc-bench version always lists the SID identifier next to the SoC, so that it reads for example as follows:

     SoC guess: Allwinner H6 (SID: 82c00007)

     Especially boards with 64-bit SoCs like H618, H616, H6, H5, A64 but also R40/V40 and exotic variants like S3, V3s and V853 are of interest. Hopes are low though since the forum structure has changed a lot, posting directly in the supported sunxi subforum seems impossible and most probably nobody will ever find this post.
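     For the impatient, a minimal sketch of the process (the authoritative steps are in the README linked above; treat the exact invocation as an assumption):

         # fetch the latest sbc-bench and run it (needs root)
         wget -q https://raw.githubusercontent.com/ThomasKaiser/sbc-bench/master/sbc-bench.sh
         sudo /bin/bash ./sbc-bench.sh
         # the output then contains a line like:
         #   SoC guess: Allwinner H6 (SID: 82c00007)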
  2. Why? What are the benefits over the Radxa-provided OS images?
  3. Today I had the chance to retest again. The good news: there's no real difference whether ASPM is set to default or performance. The bad news: the idle consumption difference is quite a bit higher than previously reported. With the dmc governor enabled (and therefore idling at 528 MHz DRAM clock) the difference between powersave and default/performance is almost 250 mW; at the highest DRAM clock it's around 150 mW (back when I measured the consumption difference there was no dmc governor, and most probably the 100 mW I was talking about were the result of rounding down). More details here: https://forum.radxa.com/t/rock-5b-debug-party-invitation/10483/508?u=tkaiser From an energy efficiency point of view a script line in armbian-hardware-optimization checking for PCIe devices being present at boot and then setting either powersave or performance would be perfect, but as I recently learned Armbian isn't about such optimisations any more.
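     Such a check could look like this (a hypothetical fragment for armbian-hardware-optimization, assuming sysfs paths as on the BSP kernel discussed here; whether 'performance' or 'default' is the right PCIe-present policy is a judgement call given the numbers above):

         # set the ASPM policy depending on whether any PCIe device enumerated at boot
         if ls /sys/bus/pci/devices/* >/dev/null 2>&1 ; then
             echo performance >/sys/module/pcie_aspm/parameters/policy
         else
             echo powersave >/sys/module/pcie_aspm/parameters/policy
         fi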
  4. https://github.com/ThomasKaiser/Knowledge/blob/master/articles/Quick_Preview_of_ROCK_5B.md#important-insights-and-suggested-optimisations BTW: The device is called Rock 5B, as such congratulations on the subforum name...
  5. All of this (maybe except USB PD negotiations) applies to M3 as well: https://github.com/ThomasKaiser/Knowledge/blob/master/articles/Quick_Preview_of_ROCK_5B.md#important-insights-and-suggested-optimisations
  6. https://forum.radxa.com/t/rock-5b-debug-party-invitation/10483/472?u=tkaiser The latest sbc-bench version 0.9.9 also has an additional service for ignorant people: reporting probably performance-relevant governors...
  7. Well, I was talking about 3 different issues:

     - switching ASPM from powersave to default (you chose performance now, no idea what this does wrt consumption; switching from powersave to default is 100 mW)
     - io_is_busy (that's addressed now, though in a redundant way, but it doesn't matter since it only delays the armbian-hardware-optimization service by a few ms). Both these changes should help with NVMe (and PCIe device) performance in general.
     - setting the dmc governor to performance instead of dmc_ondemand (results in 600 mW higher consumption though). This is responsible for better overall performance since the dmc_ondemand governor often does not ramp up the memory clock fast enough. A comparison between clocking the LPDDR4 at only 528 MHz vs. the current maximum of 2112 MHz is here: https://browser.geekbench.com/v5/cpu/compare/17009700?baseline=17009078

     The latter isn't addressed at all but I posted multiple times how this can be changed from userspace:

         echo performance >/sys/class/devfreq/dmc/governor

     @balbes150 did (only) two changes and the io_is_busy tweak will help with every storage device (USB and SATA too). No need for a new image since you can simply edit /usr/lib/armbian/armbian-hardware-optimization. Or overwrite it with the contents of http://ix.io/4aFe
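     All three tweaks from userspace in one place (a sketch based on the sysfs paths quoted in this thread; pick default vs. performance for ASPM according to your consumption/performance trade-off):

         # 1) relax ASPM from the BSP kernel's powersave default
         echo default >/sys/module/pcie_aspm/parameters/policy
         # 2) make the ondemand governor treat I/O wait as CPU load (per policy)
         for p in /sys/devices/system/cpu/cpufreq/policy? ; do
             echo 1 >${p}/ondemand/io_is_busy
         done
         # 3) keep DRAM at the maximum clock instead of dmc_ondemand
         echo performance >/sys/class/devfreq/dmc/governor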
  8. It's not about mainline kernel vs. BSP kernel but the latter having settings that might fit Android use cases (watching video, playing games) but not Linux. You seem to be using these defaults without questioning them.

     Rockchip's BSP kernel defaults to powersave for ASPM (Active State Power Management) which of course negatively affects NVMe performance. As such you need to either eliminate CONFIG_PCIEASPM_POWERSAVE=y from the kernel config or execute somewhere after booting:

         echo default >/sys/module/pcie_aspm/parameters/policy

     Also, when the dmc/dfi device-tree nodes are enabled, the BSP kernel defaults to the dmc_ondemand governor (seems to be the case with your RK3588 kernel fork since the 7-zip scores you and @blondu are sharing are below 14800 while they could be around 16500), which can be changed by doing this:

         echo performance >/sys/class/devfreq/dmc/governor

     (similar to what I've done for RK3399 years ago: https://github.com/armbian/build/blob/fdf73a025ba56124523baefaf705792b74170fb8/packages/bsp/common/usr/lib/armbian/armbian-hardware-optimization#L241-L244 )

     And this here:

         prefix="/sys/devices/system/cpu"
         CPUFreqPolicies=($(ls -d ${prefix}/cpufreq/policy? | sed 's/freq\/policy//'))
         if [ ${#CPUFreqPolicies[@]} -eq 1 -a -d "${prefix}/cpufreq" ]; then
             # if there's just a single cpufreq policy ondemand sysfs entries differ
             CPUFreqPolicies=${prefix}
         fi
         for i in ${CPUFreqPolicies[@]}; do
             affected_cpu=$(tr -d -c '[:digit:]' <<< ${i})
             echo ondemand >${prefix}/cpu${affected_cpu:-0}/cpufreq/scaling_governor
             echo 1 >${i}/cpufreq/ondemand/io_is_busy
             echo 25 >${i}/cpufreq/ondemand/up_threshold
             echo 10 >${i}/cpufreq/ondemand/sampling_down_factor
             echo 200000 >${i}/cpufreq/ondemand/sampling_rate
         done

     is the exact replacement for lines 81-89 in armbian-hardware-optimization: https://github.com/armbian/build/blob/fdf73a025ba56124523baefaf705792b74170fb8/packages/bsp/common/usr/lib/armbian/armbian-hardware-optimization#L81-L89

     Without this, I/O performance with Armbian sucks on a variety of boards, for example ODROID N2/N2+, VIM3 or now the RK3588/RK3588S based boards. Unfortunately @lanefu seems to be way too biased or limited in his thinking to understand this when he creates bizarre tickets that are rotting around somewhere: https://armbian.atlassian.net/browse/AR-1262

     I really don't care whether these fixes will be incorporated into Armbian. But if you benchmark stuff the settings should be adjusted accordingly. And we (we as a broader community – not this place here) already know how ASPM settings negatively affect performance of PCIe devices (like for example NVMe SSDs): https://forum.radxa.com/t/rock-5b-debug-party-invitation/10483/86?u=tkaiser, we know which role io_is_busy plays and what the benefits and drawbacks of the chosen dmc governor are.

     A quality NVMe SSD, even when connected with just a single Gen2 lane, should always outperform any SATA SSD if it's about what really matters: random I/O. If the SSD is cheap garbage or the settings are garbage it might look different.
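     To check what a given installation is currently running with (same sysfs paths as used above):

         cat /sys/module/pcie_aspm/parameters/policy
         cat /sys/class/devfreq/dmc/governor
         cat /sys/devices/system/cpu/cpufreq/policy?/ondemand/io_is_busy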
  9. But your image, or let's better say the kernel you're using, is made for Android and lacks optimisations. As for NVMe, what about

         echo default >/sys/module/pcie_aspm/parameters/policy

     or removing CONFIG_PCIEASPM_POWERSAVE=y from the kernel config? And for I/O performance in general this would be needed, since with my code fragments from half a decade ago that are still part of Armbian the 3rd CPU cluster isn't adjusted properly:

         prefix="/sys/devices/system/cpu"
         CPUFreqPolicies=($(ls -d ${prefix}/cpufreq/policy? | sed 's/freq\/policy//'))
         if [ ${#CPUFreqPolicies[@]} -eq 1 -a -d "${prefix}/cpufreq" ]; then
             # if there's just a single cpufreq policy ondemand sysfs entries differ
             CPUFreqPolicies=${prefix}
         fi
         for i in ${CPUFreqPolicies[@]}; do
             affected_cpu=$(tr -d -c '[:digit:]' <<< ${i})
             echo ondemand >${prefix}/cpu${affected_cpu:-0}/cpufreq/scaling_governor
             echo 1 >${i}/cpufreq/ondemand/io_is_busy
             echo 25 >${i}/cpufreq/ondemand/up_threshold
             echo 10 >${i}/cpufreq/ondemand/sampling_down_factor
             echo 200000 >${i}/cpufreq/ondemand/sampling_rate
         done

     And based on sbc-bench results shared here the dmc/dfi nodes are enabled in the device tree (defaulting to dmc_ondemand) and as such this would restore 'full performance':

         echo performance >/sys/class/devfreq/dmc/governor
  10. How did you perform this comparison? Looking only at sequential transfer speeds? Or checking random I/O (which matters way more on an OS drive or when building software)? What does /sys/module/pcie_aspm/parameters/policy look like? powersave, right? And for a quick comparison of schedutil, performance, ondemand and ondemand with io_is_busy see the comments below https://github.com/radxa/kernel/commit/55f540ce97a3d19330abea8a0afc0052ab2644ef
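      A quick way to look at random I/O instead of just sequential transfers (a sketch using fio; the file path, 4K block size and queue depth are illustrative choices, not values from this thread):

          # 4K random reads with direct I/O against a test file on the drive in question
          fio --name=randread --filename=/mnt/nvme/fio.test --size=1G \
              --rw=randread --bs=4k --iodepth=32 --ioengine=libaio \
              --direct=1 --runtime=30 --time_based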
  11. Really low numbers.

      1) hdparm uses a 128K block size for its test, which was a lot last century when whoever hardcoded it into hdparm, but today it's just a joke. Use iozone or fio with larger block sizes (see the sketch after this list).
      2) Armbian hasn't cared about low-level optimizations for years. Better search for 'io_is_busy' and 'dmc/governor' to get full speed (both storage and CPU performance, or at least a 2000 points higher 7-zip score): https://github.com/ThomasKaiser/Knowledge/blob/master/articles/Quick_Preview_of_ROCK_5B.md
      3) Check 'cat /sys/devices/system/cpu/cpufreq/policy?/ondemand/io_is_busy' – if I/O is processed by cpu6 or cpu7 it might be a lot slower compared even to the little cores.
      4) A PCIe Gen2 lane allows for 5 GT/s, SATA 6Gbps has 120% of that data rate and both use 8b10b coding. As such SATA should be faster in sequential transfer speeds.
      5) Sequential transfer speeds are BS if it's about an OS drive since here random I/O matters. And due to SATA relying on AHCI, made for spinning rust last century, NVMe should outperform SATA even in a single Gen2 lane config and even if silly 'benchmarks' like hdparm draw a different picture.
      6) What is 'cat /sys/module/pcie_aspm/parameters/policy' telling?
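      As for item 1), a larger-block-size alternative to hdparm (a sketch; this iozone invocation mirrors the one commonly used in sbc-bench/Armbian storage tests, run from a directory on the drive under test):

          # 4K, 1M and 16M block sizes, sequential write/read plus random I/O, with O_DIRECT
          iozone -e -I -a -s 100M -r 4k -r 1024k -r 16384k -i 0 -i 1 -i 2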