tkaiser

sbc-bench

Recommended Posts

That RockPro64 heatsink is big enough. If anyone has a CNC and the will, they could remove a millimeter or so from the area around the actual contact patch and give that a try with thermal compound instead of a shim.


Hi @tkaiser

 

I tried your sbc-bench on a freshly installed Debian 9.5 netinstall on a Z8350 platform:

http://ix.io/1m0V

 

To make it more reliable, perhaps your script could verify whether gcc and make are installed; I had to install them manually, otherwise mhz and tinymembench were not built and installed.

 

Apart from that minor problem, it's really a neat tool you made.
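For illustration, a minimal sketch of the kind of pre-flight check suggested here. The helper name, tool list and messages are made up, not sbc-bench's actual code:

```shell
# Sketch of the suggested dependency pre-check -- not sbc-bench's actual code.
# check_tools prints the names of any tools from its argument list not in $PATH.
check_tools() {
    local tool missing=""
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
    done
    echo "$missing"
}

missing=$(check_tools gcc make)
if [ -n "$missing" ]; then
    echo "Please install first:$missing (e.g. apt install$missing)"
fi
```

Collecting everything missing up front avoids failing halfway through the run the way the post describes.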

1 hour ago, t-minik said:

To make it more reliable, perhaps your script could verify whether gcc and make are installed; I had to install them manually, otherwise mhz and tinymembench were not built and installed.

 

[x] done (most probably not exactly in the way you expected ;) )


Yup, it isn't the way I thought of, but it works perfectly as is.

 

Well done, and thanks for your hard work on all those SBCs.

 

EDIT // Now I tried to run the test with a 4.17 kernel and it reports:

/sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_cur_freq is unavailable:

 

root@z83v:~# uname -a
Linux z83v 4.17.0-0.bpo.3-amd64 #1 SMP Debian 4.17.17-1~bpo9+1 (2018-08-27) x86_64 GNU/Linux

root@z83v:~# /usr/local/src/mhz/mhz 3 1000000
count=807053 us50=21083 us250=105281 diff=84198 cpu_MHz=1917.036 tsc50=30359268 tsc250=151603398 diff=150 rdtsc_MHz=1439.988
count=807053 us50=21134 us250=105291 diff=84157 cpu_MHz=1917.970 tsc50=30431952 tsc250=151618572 diff=150 rdtsc_MHz=1440.006
count=807053 us50=21087 us250=105324 diff=84237 cpu_MHz=1916.148 tsc50=30364506 tsc250=151665012 diff=150 rdtsc_MHz=1439.991

root@z83v:~# find /sys -name cpufreq
/sys/devices/system/cpu/cpu3/cpufreq
/sys/devices/system/cpu/cpu1/cpufreq
/sys/devices/system/cpu/cpufreq
/sys/devices/system/cpu/cpu2/cpufreq
/sys/devices/system/cpu/cpu0/cpufreq
/sys/module/cpufreq

root@z83v:~# ls /sys/devices/system/cpu/cpu0/cpufreq
affected_cpus     cpuinfo_min_freq       
related_cpus      scaling_cur_freq
scaling_governor  scaling_min_freq
cpuinfo_max_freq  cpuinfo_transition_latency
scaling_available_governors  scaling_driver
scaling_max_freq  scaling_setspeed

There is only scaling_cur_freq, but if I understand correctly, it seems to be the frequency the kernel thinks the CPU is running at instead of the real value (inaccurate in the throttling case).

1 hour ago, t-minik said:

There is only scaling_cur_freq, but if I understand correctly, it seems to be the frequency the kernel thinks the CPU is running at instead of the real value (inaccurate in the throttling case)

 

Yes, scaling_cur_freq is just some number compared to cpuinfo_cur_freq: https://www.kernel.org/doc/Documentation/cpu-freq/user-guide.txt

 

Querying the correct sysfs node is also more 'expensive' and therefore only allowed for root. Please see also @jeanrhum's adventure with the very same Atom and an obviously similar kernel:
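A sketch of the fallback logic this implies: prefer the driver-measured cpuinfo_cur_freq and fall back to scaling_cur_freq when the driver (as on this Z8350) doesn't expose it. The helper name is made up; the cpufreq directory is a parameter purely so the logic can be tried against any path, and the node names are the standard cpufreq sysfs entries discussed above:

```shell
# Prefer the driver-measured frequency, fall back to the requested one.
# Node names are the standard cpufreq sysfs entries; helper name is made up.
cur_freq() {
    local base="${1:-/sys/devices/system/cpu/cpu0/cpufreq}"
    if [ -r "$base/cpuinfo_cur_freq" ]; then
        cat "$base/cpuinfo_cur_freq"     # real frequency measured by the driver
    elif [ -r "$base/scaling_cur_freq" ]; then
        cat "$base/scaling_cur_freq"     # frequency the cpufreq layer requested
    else
        echo "n/a"
    fi
}
```

The `-r` test also covers the root-only case: for a non-root user cpuinfo_cur_freq is unreadable, so the function silently falls back to scaling_cur_freq.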

 


Just another result for an RK3288 board (Ugoos UT3S): http://ix.io/1mLr

I used jock's Armbian Xenial desktop image with a 4.18 kernel. This TV box has a fan, but it does not seem to work with the current settings: throttling occurred and I didn't see the fan rotating, though I went away before the end of the run. The current ambient temperature is about 24-25°C.


Just for fun - http://ix.io/1niO -- Pi Zero W with stock Raspbian and current (as of 3/21/2018) firmware...

 

@tkaiser -- You might reconsider the reported gimping of the Pi3 B+; the current VC4 firmware resolves many of the concerns raised on GitHub. There's still an open item with "vcgencmd measure_clock arm", where it returns values that are consistent with the Pi3 but not the Plus - so the "fake" results are probably what's returned from the VC4

 

"vcgencmd measure_temp" is a bit more accurate than the arm temps there on the chip, and it measures the chip package, not the actual cores, which is expected, as we're asking the firmware which reports what VC4 sees

 

I trust the kernel to return more than the VC4 info for clocks - and the data between the Pi3 and Pi3 Plus confirms this...

 

Here are current Pi3 Plus results... http://ix.io/1niD

 

Similar with the Pi Zero W: VC4 reports a constant 1GHz all the time, while the kernel shows the ARM core at 700MHz when idle.

 

VC4 is a mess - that much is true... and there's little insight into what Broadcom provides there. Maybe the folks at RPf know more, but that's likely all NDA and closed...

Edited by sfx2000
updating info

23 hours ago, sfx2000 said:

You might reconsider the reported gimping of the Pi3 B+; the current VC4 firmware resolves many of the concerns raised on GitHub. There's still an open item with "vcgencmd measure_clock arm", where it returns values that are consistent with the Pi3 but not the Plus - so the "fake" results are probably what's returned from the VC4

 

"vcgencmd measure_temp" is a bit more accurate than the arm temps there on the chip, and it measures the chip package, not the actual cores, which is expected, as we're asking the firmware which reports what VC4 sees

 

I trust the kernel to return more than the VC4 info for clocks - and the data between the Pi3 and Pi3 Plus confirms this...

 

Pi's and VC4 VCOS - aka ThreadX as per what @tkaiser refers to...

 

It seems that the Pi folks have done a bit - however, I agree that VCOS does stretch things: if you ask VCOS, it says one thing; if you ask the kernel, it says another - and always trust the kernel...

 

Example below - 4 threads on UnixBench. The Pi3 B+ is running at 1.4GHz, and the results suggest that everything is good there when compared to its little brother, the Pi3. Current rpi-firmware throttles back at 80°C in my experience, and under load it dances close to it... for folks who work on DVFS curves, there's a bit to appreciate there.

 

Notice there's no throttle at 60°C now, and there hasn't been for a while - the gimpage was early on with the 1.4GHz product...

 

That being said - don't trust VCOS reports from userland; trust the kernel. With 4.14, the numbers are honest - for all Pi's...

 

pinfo.sh...

 

#!/bin/bash
# pinfo.sh looks at firmware/kernel - clocks, temps for cpu
# create this file in the user dir, and make it executable
celsius=$(cat /sys/class/thermal/thermal_zone0/temp | sed 's/.\{3\}$/.&/')
clock0=$(cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq | sed 's/.\{3\}$/.&/')
# can perhaps comment below out, some SoC's gang the clocks - but with a quad, all are reported
clock1=$(cat /sys/devices/system/cpu/cpu1/cpufreq/scaling_cur_freq | sed 's/.\{3\}$/.&/')
clock2=$(cat /sys/devices/system/cpu/cpu2/cpufreq/scaling_cur_freq | sed 's/.\{3\}$/.&/')
clock3=$(cat /sys/devices/system/cpu/cpu3/cpufreq/scaling_cur_freq | sed 's/.\{3\}$/.&/')
echo "Host   => $(date) @ $(hostname)"
echo "Uptime =>$(uptime)"
echo "SW Rev => $(uname -vr)"
# VC4 stuff for RPi variants - comment out as needed for non-VC targets
echo "FW Rev => $(vcgencmd version)"
echo "==============="
echo "ARM Mem => $(vcgencmd get_mem arm)"
echo "GPU Mem => $(vcgencmd get_mem gpu)"
echo "==============="
echo "Pi Temp  => $(vcgencmd measure_temp)"
echo "Pi Volts => $(vcgencmd measure_volts core)"
echo "Pi Clock => $(vcgencmd measure_clock arm)"
# end VC4 stuff
# rest is from linux, should apply to any
echo "==============="
echo "ARM Temp => ${celsius} °C"
echo "Core0Clock=> ${clock0} MHz"
# see above - no harm keeping this, but one must be consistent
echo "Core1Clock=> ${clock1} MHz"
echo "Core2Clock=> ${clock2} MHz"
echo "Core3Clock=> ${clock3} MHz"
echo "==============="

 

 

[Screenshot: Pi3 B+ 4-thread UnixBench results]

7 hours ago, sfx2000 said:

trust the kernel, with 4.14, the numbers are honest

 

I don't get your conclusion. The kernel has no idea what's going on. Look at your own output: http://ix.io/1niD

 

You suffer from the max real cpufreq being limited to 1200 MHz once the SoC temperature exceeds 60°C (you can 'fix' this by adding 'temp_soft_limit=70' to /boot/config.txt and rebooting to let ThreadX switch back to the old behavior). As soon as 80°C is exceeded, fine-grained throttling starts, further decreasing the real ARM clockspeeds while the kernel still reports 1400 MHz, since the mailbox interface between kernel and ThreadX returns requested and not real values.
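The 'fix' described here is a one-line config change; a sketch of applying it idempotently (the helper name and message are made up, and on anything but a Raspberry Pi boot partition the file simply won't exist):

```shell
# Append temp_soft_limit=70 to config.txt once, as described above.
# Helper name and message are illustrative, not from sbc-bench.
set_soft_limit() {
    local config="${1:-/boot/config.txt}"
    if [ ! -f "$config" ]; then
        echo "$config not found" >&2
        return 1
    fi
    # only append if no temp_soft_limit line is present yet
    grep -q '^temp_soft_limit=' "$config" || echo 'temp_soft_limit=70' >> "$config"
}
```

On the Pi itself that would be `set_soft_limit` followed by a reboot so ThreadX picks the setting up.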

On 9/23/2018 at 2:05 AM, sfx2000 said:

 

5 hours ago, tkaiser said:

(you can 'fix' this by adding 'temp_soft_limit=70' to /boot/config.txt

Also use a heatsink. With temp_soft_limit it throttles to 1200 MHz at 70°C. I see yours reaches 70°C very quickly.
Here are my results with a heatsink and no fan:
http://ix.io/1iGM
My max temp is 72°C at full load, while yours reaches 80°C.


Here are some results for the OrangePi One Plus (Allwinner H6), running 4.18.0-rc7 kernel.

 

http://ix.io/1nr7

 

I have to admit that the temperature values are not accurate, as the board has been passively cooled... by pressing it on a frozen gel pack ^^. I am a little afraid of running it without any cooling at all.

Anyway, no throttling this way.

1 hour ago, Werner said:

Here are some results for the OrangePi One Plus (Allwinner H6), running 4.18.0-rc7 kernel. http://ix.io/1nr7

 

Exactly the same numbers as the PineH64, which is not much of a surprise given that the same type of memory is used with the same settings. Your 7-zip numbers seem to be lower, but that's just some background activity trashing the benchmark, as the monitoring reveals. If you see %sys and especially %iowait percentages in the monitoring output, you know you need to repeat the benchmark and stop as many active processes as possible prior to benchmark execution:

System health while running 7-zip multi core benchmark:

Time        CPU    load %cpu %sys %usr %nice %io %irq   Temp
16:15:10: 1800MHz  5.63  23%   1%  18%   0%   3%   0%  25.0°C
16:15:31: 1800MHz  5.05  84%   1%  83%   0%   0%   0%  48.2°C
16:15:51: 1800MHz  4.77  86%   1%  84%   0%   0%   0%  43.1°C
16:16:31: 1800MHz  5.15  88%  15%  53%   0%  19%   0%  39.6°C
16:16:51: 1800MHz  4.94  80%   1%  78%   0%   0%   0%  42.7°C
16:17:11: 1800MHz  4.82  92%   1%  89%   0%   0%   0%  45.0°C
16:17:31: 1800MHz  4.64  87%   1%  85%   0%   0%   0%  41.9°C
16:17:52: 1800MHz  4.74  94%  16%  72%   0%   5%   0%  43.8°C
16:18:13: 1800MHz  4.69  81%   1%  80%   0%   0%   0%  48.6°C
16:18:33: 1800MHz  4.56  86%   1%  84%   0%   0%   0%  39.5°C
16:19:28: 1800MHz  6.93  84%  12%  38%   0%  34%   0%  31.2°C

I bet unattended-upgrades was running in the background (you could check /var/log/dpkg.log)
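The rule of thumb above (high %sys or %iowait means the run is tainted and should be repeated) can be automated over the monitoring output. A sketch, using an arbitrary 5% threshold (my choice, not an sbc-bench default) and the column layout of the table quoted above:

```shell
# Flag monitoring samples where %sys (field 5) or %io (field 8) exceed 5%.
# Threshold is an arbitrary choice; awk's string-to-number conversion
# ignores the trailing '%' in each field.
flag_tainted() {
    awk '$5 + 0 > 5 || $8 + 0 > 5 { print "tainted:", $0 }'
}
```

Feeding the table above through `flag_tainted` marks exactly the 16:16:31, 16:17:52 and 16:19:28 samples that the reply calls out.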


Should not have lots of background activity at all. The image I used was fresh, without any packages added.

I could not find any activity from unattended-upgrades. I may redo the benchmark, though, with some other processes like NetworkManager, rsyslog or cron disabled.

19 minutes ago, Werner said:

Should not have lots of background activity at all

 

Well, but that's what the extensive monitoring is for: to be able to throw results away quickly (no fire-and-forget benchmarking, since that only produces numbers without meaning).

 

The kernel reported high %sys and %io activity starting at the end of the single-threaded 7-zip benchmark, and that's the only reason your 7-zip numbers are lower than mine made on the PineH64. Same H6, same type of memory, same u-boot, same kernel, same settings --> same performance.

 

IMO no more tests with improved cooling are needed. The only remaining question is how good the heat dissipation of both boards would be under otherwise identical environmental conditions (the PineH64's PCB is rather huge, and on both boards the copper ground plane is used as a 'bottom heatsink' dissipating the heat away; most probably the PineH64 performs way better here with an appropriate heatsink due to the larger PCB). But such a test is also pretty useless, since the result is somewhat predictable (larger PCB wins), and the type of heatsink and whether there's some airflow around will be the more important factors. If the heat dissipation problems are solved, both boards will perform absolutely identically.


I did the benchmark two more times with some non-essential services disabled to minimize the impact, but got no better results. No idea what is causing the I/O wait. No cooling this time.

http://ix.io/1nwh

 

The second run was with swap disabled, as I noticed swap was used after the first run: http://ix.io/1nwo ... not a good idea.

 

I give up for now, last but not least because with the PineH64's results we know the results for the OPi One Plus as well... if it ran flawlessly ¯\_(ツ)_/¯

 

 

8 minutes ago, Werner said:

No idea what is causing the I/O wait

 

Haha, but now I know. I was an idiot before: it's simply zram/swap, as can easily be seen by comparing the iostat output from before and after:

before:
zram1             0.93         3.69         0.01       1176          4
zram2             0.93         3.69         0.01       1176          4
zram3             0.93         3.69         0.01       1176          4
zram4             0.93         3.69         0.01       1176          4

after:
zram1           588.13      1101.08      1251.45    1408792    1601184
zram2           586.62      1094.84      1251.62    1400808    1601404
zram3           582.01      1087.59      1240.44    1391524    1587092
zram4           587.14      1098.00      1250.54    1404848    1600016

That's 5.3 GB read and 6.1 GB written on the zram devices. I still have no idea why this benchmark runs within 1 GB without swapping on some boards (most probably: kernels) but not on others like here:

RAM size:     994 MB,  # CPU hardware threads:   4
RAM usage:    882 MB,  # Benchmark threads:      4

The NanoPi Fire3, also with just 1 GB RAM, finishes with only minimal swapping:

RAM size:     990 MB,  # CPU hardware threads:   8
RAM usage:    901 MB,  # Benchmark threads:      8

Maybe vm.swappiness is the culprit. Can you repeat the bench another three times doing the following prior to each run:

sysctl -w vm.swappiness=0
sysctl -w vm.swappiness=1
sysctl -w vm.swappiness=60
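A dry-run sketch of the requested experiment as a loop: one benchmark pass per swappiness value. The sbc-bench invocation path is an assumption, so the commands are echoed rather than executed:

```shell
# Dry run of the swappiness sweep described above: print, do not execute.
# The sbc-bench path is hypothetical; adjust before actually running as root.
for sw in 0 1 60; do
    echo "sysctl -w vm.swappiness=$sw"
    echo "/path/to/sbc-bench.sh   # hypothetical invocation"
done
```

Dropping the `echo`s (and running as root) turns the dry run into the real sweep.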

 

3 minutes ago, tkaiser said:

 

 

Maybe vm.swappiness is the culprit. Can you repeat the bench another three times doing the following prior to each run:


sysctl -w vm.swappiness=0
sysctl -w vm.swappiness=1
sysctl -w vm.swappiness=60

 

Sure thing.  I'll catch up later.

26 minutes ago, Werner said:

swappiness=0 http://ix.io/1nDr
swappiness=1 http://ix.io/1nDw
swappiness=60 http://ix.io/1nDC

 

Quick summary:

  1. Unfortunately throttling happened every time, so the results are not comparable
  2. Contrary to expectations, with vm.swappiness=1 less swap activity happened compared to vm.swappiness=0 (while 'everyone on the Internet' will tell you the opposite):
vm.swappiness=0

Compression: 2302,2234,2283
Decompression: 5486,5483,5490
Total: 3894,3858,3887

Write: 1168700
Read:  1337688

vm.swappiness=1

Compression: 2338,2260,2261
Decompression: 5506,5480,5512
Total: 3922,3870,3887

Write: 1138240
Read:   941548

vm.swappiness=60

Compression: 2266,2199,2204
Decompression: 5512,5495,5461
Total: 3889,3847,3832

Write: 1385560
Read:  1584220

vm.swappiness=100

Compression: 2261,2190,2200
Decompression: 5421,5485,5436
Total: 3841,3837,3818

Write: 1400808
Read:  1601404

 

Still no idea why massive swapping occurs with the Orange Pi One Plus and 1 GB RAM, while the same benchmark on the NanoPi Fire3, also with 1 GB, results in only minimal swapping attempts (different kernels, maybe different contents of /proc/sys/vm/*)
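To chase the /proc/sys/vm suspicion, the tunables on both boards can be dumped and diffed; a one-liner sketch:

```shell
# Dump all readable VM tunables, one "file:value" line each, sorted so the
# output from two boards can be compared with diff. Errors from write-only
# entries such as compact_memory are suppressed.
grep -H . /proc/sys/vm/* 2>/dev/null | sort
```

Run it on each board, redirect to a file per board, then `diff` the two files to see which settings actually differ between the kernels.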



I bet unattended-upgrades was running in the background (you could check /var/log/dpkg.log)


Ha man, I tried so hard to stop every service I could find or think of, but unattended-upgrades was definitely missed.

Sadly that board has been falling offline since running with the 1.8GHz patch. I'll take another pass.


12 hours ago, lanefu said:

I bet unattended-upgrades was running in the background (you could check /var/log/dpkg.log)

 

But I've been wrong. The background activity was swap/zram in reality.


Okay... I did a fresh kernel build, tried to be a little more thorough with killing services... and I ran a big-ass fan over my whole ARM cluster :P

 

http://ix.io/1nJl

 

My quick list of services I shut down (many won't be relevant):

services="docker nomad consul rpcbind rpcbind.socket NetworkManager cron dbus.service haveged.service networkd-dispatcher.service ntp.service rsyslog.service snmpd.service systemd-resolved.service syslog.socket dbus.socket wpa_supplicant.service serial-getty@ttyS0.service"
systemctl stop $services

 

 

Re Crashing:

 

I doubt it was running at full load... I use it as my main Linux jump host, so mostly screen and ssh, vim, ansible, the occasional Docker image build and a few agents. (I'm amazed how much mileage I get with 1 gig of RAM!) I had been using an OPi PC2 to do the same stuff, but migrated to the One Plus for excitement.

3 hours ago, lanefu said:

 

Relevant swapping occurred. Still no idea why, so I added monitoring of virtual memory settings a few hours ago. I'll look into this within the next months, but it's not high priority since, thanks to the detailed monitoring in sbc-bench, we know exactly why the numbers currently differ between Orange Pi One Plus (1 GB) and PineH64 (more than 1 GB).


My M4 sbc-bench results with Bionic. With a 5V fan. SD card.
http://ix.io/1nLh

A big difference from the one you have with Stretch: tinymembench does a lot worse on Bionic, while cpuminer does better (cooling?).
http://ix.io/1lzP

Weird numbers. @tkaiser Any idea what's up with tinymembench? Some scores are only 50% of the Stretch ones.
I'll test everything on Stretch too when done with Bionic.

PS: All the RockPro64 and T4 results are with Stretch. Why no Bionic there?

13 minutes ago, NicoD said:

Weird numbers

 

Nope, everything is as expected. The M4 number in the list https://github.com/ThomasKaiser/sbc-bench/blob/master/Results.md has been made with a mainline kernel, which shows way higher memory bandwidth on RK3399 (check the other RK3399 devices there). The cpuminer numbers differ due to the GCC version (Stretch ships with 6.3, Bionic with 7.3 -- see the first three Rock64 numbers at 1400 MHz in the list: always Stretch, but two of them with manually built newer GCC versions, which significantly improve cpuminer performance)

 

If you love performance use recent software...

On 9/28/2018 at 8:37 AM, tkaiser said:

Nope, everything is as expected. The M4 number in the list https://github.com/ThomasKaiser/sbc-bench/blob/master/Results.md has been made with a mainline kernel, which shows way higher memory bandwidth on RK3399 (check the other RK3399 devices there). The cpuminer numbers differ due to the GCC version (Stretch ships with 6.3, Bionic with 7.3 -- see the first three Rock64 numbers at 1400 MHz in the list: always Stretch, but two of them with manually built newer GCC versions, which significantly improve cpuminer performance)

 

If you love performance use recent software...

 

You made a good point earlier in another thread about compilers - sysbench is a good example of something that is very sensitive to options and versions, similar to UnixBench...

 

Brendan Gregg has a great rant here -- http://www.brendangregg.com/blog/2014-05-02/compilers-love-messing-with-benchmarks.html

 

key takeaway...

 

Quote

 

The results will depend not only on your hardware, but on your operating system, libraries, and even compiler.

 

As does the USAGE file under the "Interpreting the Results" section, which even suggests:

 

So you may want to make sure that all your test systems are running the same version of the OS; or at least publish the OS and compiler versions with your results.

 

 

Which is a thing to consider - Gregg is an active supporter of benchmarking, but in a different manner; worth checking out not only the linked article but also the other posts on his site and the presentation videos.
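Following that advice, a small sketch that prints the environment details worth publishing next to any benchmark numbers; /etc/os-release is assumed to exist (standard on modern distros) and guarded in case it doesn't:

```shell
# Print kernel, distribution and compiler version alongside benchmark results.
uname -sr
if [ -r /etc/os-release ]; then
    # os-release defines PRETTY_NAME, e.g. "Ubuntu 18.04 LTS"
    . /etc/os-release
    echo "$PRETTY_NAME"
fi
if command -v gcc >/dev/null 2>&1; then
    gcc --version | head -n1
fi
```

Pasting these three lines next to results makes them comparable in exactly the sense the USAGE quote asks for.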

On 9/28/2018 at 8:37 AM, tkaiser said:

Nope, everything is as expected. The M4 number in the list https://github.com/ThomasKaiser/sbc-bench/blob/master/Results.md has been made with a mainline kernel, which shows way higher memory bandwidth on RK3399 (check the other RK3399 devices there). The cpuminer numbers differ due to the GCC version (Stretch ships with 6.3, Bionic with 7.3 -- see the first three Rock64 numbers at 1400 MHz in the list: always Stretch, but two of them with manually built newer GCC versions, which significantly improve cpuminer performance)

 

The 64-bit Rockchips are showing very good memory performance - I've got a friend working on bringing up a Renegade board on Arch, and the memory performance he's observed is consistent with the findings here...


NanoPi NEO on current Bionic as of 9/30/18

 

http://ix.io/1nXL

 

This is the v1.31 board, so some of the temps might be lower than on the 1.0/1.1 boards - it does nicely for an H3 ;)

 

RK3288-Tinker is still a hot mess - pardon the term...

 

http://ix.io/1nXQ

 

But this is expected - passive cooling with the Asus-provided heatsink, and we're powering it over micro-USB with the official 2.5A Raspberry Pi power supply... It might be interesting to see what happens when letting the Tinker clock down to 126MHz. Right now I think it is getting heat-soaked, so it spends a huge amount of time at 600MHz. The current settings suggest the upper temperature limit is 70°C, so when it gets there it pulls back - and it doesn't have far to pull back: my Tinker idles at 60°C with the current Armbian Bionic build. That last little stint at 70°C is the Tinker running sbc-bench on the current git...

 



I am configuring another BPI-R2 machine and was checking the benchmarks. For this I am using a 4.14.71 kernel with Ubuntu. The numbers are no better than in my previous attempts, but that is because the standard openssl doesn't have the afalg engine. So I downloaded the latest openssl source, recompiled it (zero modifications) and repeated a single test without and with the afalg engine, to verify how the R2 is behaving.

 

Without AFALG

apps/openssl speed -elapsed -evp aes-256-cbc
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-256-cbc for 3s on 16 size blocks: 3708058 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 64 size blocks: 1104719 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 256 size blocks: 292752 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 1024 size blocks: 74300 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 8192 size blocks: 9329 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 16384 size blocks: 4662 aes-256-cbc's in 3.00s
OpenSSL 1.1.2-dev  xx XXX xxxx
built on: Fri Oct  5 05:54:31 2018 UTC
options:bn(64,32) rc4(char) des(long) aes(partial) idea(int) blowfish(ptr) 
compiler: gcc -fPIC -pthread  -march=armv7-a -Wa,--noexecstack -Wall -O3 -DOPENSSL_USE_NODELETE -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DKECCAK1600_ASM -DAES_ASM -DBSAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DPOLY1305_ASM -DNDEBUG
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-256-cbc      19776.31k    23567.34k    24981.50k    25361.07k    25474.39k    25460.74k

With AFALG

 

apps/openssl speed -elapsed -evp aes-256-cbc -engine afalg
engine "afalg" set.
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-256-cbc for 3s on 16 size blocks: 129499 aes-256-cbc's in 2.95s
Doing aes-256-cbc for 3s on 64 size blocks: 115145 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 256 size blocks: 78540 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 1024 size blocks: 34189 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 8192 size blocks: 5404 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 16384 size blocks: 2756 aes-256-cbc's in 3.00s
OpenSSL 1.1.2-dev  xx XXX xxxx
built on: Fri Oct  5 05:54:31 2018 UTC
options:bn(64,32) rc4(char) des(long) aes(partial) idea(int) blowfish(ptr) 
compiler: gcc -fPIC -pthread  -march=armv7-a -Wa,--noexecstack -Wall -O3 -DOPENSSL_USE_NODELETE -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DKECCAK1600_ASM -DAES_ASM -DBSAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DPOLY1305_ASM -DNDEBUG
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-256-cbc        702.37k     2456.43k     6702.08k    11669.85k    14756.52k    15051.43k

These are the numbers without the "elapsed" parameter.
 

Without AFALG

type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-256-cbc      19776.55k    23565.55k    24981.25k    25360.04k    25556.85k    25471.66k

With AFALG

type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-256-cbc       8008.18k    27638.99k   251372.80k  1167974.40k         infk  4513792.00k

 

I am not sure... but it seems that afalg is not available in stock Armbian (I checked this with a supported M2+). With SBCs it seems important to have it available to use the machines' full potential.
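A quick way to check that on any box is to ask openssl itself whether it knows the engine (`openssl engine` is the stock subcommand for listing/testing engines; it is deprecated in OpenSSL 3.0 but still present):

```shell
# Report whether this openssl build exposes the AF_ALG engine at all.
# Works with any openssl; a missing binary also ends up in the "NOT" branch.
if openssl engine afalg >/dev/null 2>&1; then
    echo "afalg engine available"
else
    echo "afalg engine NOT available in this openssl build"
fi
```

That distinguishes "stock Armbian openssl lacks afalg" from "afalg present but slow" before rebuilding anything from source.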

 
