Benchmarking CPUs


pbies

@NicoD I have known computer hardware and software for 30 years now, so there is no need to explain anything to me. I was also a developer during that time. I lost track of the apps just 2 years ago, as they multiplied by the thousands. That's why I am asking for a CPU benchmark.

 

I know that RAM will be used to start/stop the benchmark, since an OS is needed and so on, but all the operations can still be done on the CPU alone, with a little help from the clock, which can simply be checked for elapsed time during the test. So a CPU benchmark does not need to use RAM that much. And that's the bench I was looking for.

I understand that you didn't find such a bench, but I don't believe that none exists.

 

And yes, I know that the OS will disturb the benchmark through interrupts and IO, so it will not be exact, but that is fine for me. There is still no need to use RAM for a CPU benchmark.


I imagine such a benchmark as:

 - operating only on registers = add, subtract, multiply, divide and other math instructions (ALU)

 - using many ARM instructions, but not limited to ARM, so it is also reliable on x86/x64 CPUs

 - doing a number of operations in a specific (selectable) time

 - only touching RAM if that is needed to check the clock for the above


 

@tkaiser

What am I doing wrong? I don't see it.

 

pi@raspberrypi:~ $ sudo wget -O /usr/local/bin/sbc-bench.sh https://pastebin.com/raw/Ww84KMmw
--2018-07-26 15:10:57-- https://pastebin.com/raw/Ww84KMmw
Resolving pastebin.com (pastebin.com)... 104.20.209.21, 104.20.208.21
Connecting to pastebin.com (pastebin.com)|104.20.209.21|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/plain]
Saving to: ‘/usr/local/bin/sbc-bench.sh’

/usr/local/bin/sbc-bench.sh [ <=> ] 12.18K --.-KB/s in 0.004s

2018-07-26 15:10:57 (2.69 MB/s) - ‘/usr/local/bin/sbc-bench.sh’ saved [12475]

pi@raspberrypi:~ $ sudo chmod 755 /usr/local/bin/sbc-bench.sh
pi@raspberrypi:~ $ sudo /usr/local/bin/sbc-bench.sh
sudo: unable to execute /usr/local/bin/sbc-bench.sh: No such file or directory

I've checked. Indeed no such file.


59 minutes ago, pbies said:

And yes, I know that the OS will disturb the benchmark through interrupts and IO, so it will not be exact, but that is fine for me. There is still no need to use RAM for a CPU benchmark.

A question out of curiosity: what is a CPU-only benchmark good for, other than misinforming the people reading it? Doesn't it lead to the situation we still have, with people buying CPUs by:

  • the more cores it has the better
  • the more GHz is written on the box the faster it must be!

For me there are two sides to benchmarking that should be kept in mind. First, it should be reliable, meaning others should get 'the same numbers out' when they repeat your benchmark (e.g. I could benchmark Wi-Fi sticks somewhere in the Alps, and throughput would surely be much higher than in my hometown, where around 20 other Wi-Fi networks are visible. A 'good' benchmark would cover both situations, with a comment on why and how performance differs..). Second, you should think about who reads your benchmark. For the '30 years in computer science guy' it might be obvious that RAM speed, general IO speed etc. matter too; for the average smartass, probably not.

You get things like "I bought *random sbc* because *random guy* claimed that the CPU there is 10x faster than *random other board's CPU*". Others then have to explain to the average smartass why this doesn't matter for his case, because someone smart enough to know better decided to publish a benchmark that doesn't tell the full truth. Publish it with something like materials and methods, a discussion of those methods, and a conclusion (somewhat like a scientific paper). Otherwise your benchmark just leads to wrong assumptions that others have to correct.

Best example, with the average smartass: "I bought a 2A PSU because the RPi guys said buy a 2A PSU and your SBC will run fine, and microUSB is better than the outdated barrel plug because my tiny tiny connector is gold plated and my outdated big barrel plug with a huge contact area is only nickel plated.. :lol:" Reality showed that barrel plug SBCs 'in general' work more reliably, especially in high power usage situations, but the RPi guys seem to disagree on this, otherwise they would never have released the 3B+. Or they think it's funny to troll their own community and make some extra bucks by selling 5.15V RPi branded PSUs. A microUSB powered board which needs a 'special PSU' is IMO just an error by design (5V on the microUSB output should be sufficient, otherwise your PCB designer failed).. but that's a different story...

 

I once tried to 'benchmark' the CPU, mostly with a tool called cachebench, after reading through a paper written by Ulrich Drepper (I'm quite sure he knows what he's doing; the paper is old, but I think it's worth reading for anyone interested). You could easily see whether the benchmark ran in CPU cache or used RAM (as far as I remember, you could even distinguish between L1 and L2 cache, mostly when it did memcpy). But I got results out of it that I could not explain; in particular there were 'performance peaks' which IMO shouldn't be there and which I couldn't explain in a rational manner. I decided to throw away the whole dataset and never published it, because it would only lead to wrong assumptions (I think the bash script which sets up the system reliably and converts the data for analysis afterwards is still around somewhere, but I don't think it makes sense to work on it as long as I can't explain the results).


@tkaiser
Tinymembench wasn't installed. I've installed it and redone the bench. I'm amazed by what I see. I didn't know my Raspberries were lying to me. I'll do the same in Ubuntu, and also with a fan, to see if it's still underclocked to 1.2 GHz. I had much better results in Ubuntu with the Rasp3b+, while the rasp3b is slower in Ubuntu than in Raspbian. I found that strange. I think this is the reason.

Here is the full bench with tinymembench:

 

pi@raspberrypi:~/tinymembench $ sudo /usr/local/bin/sbc-bench.sh
Installing needed tools. This may take some time... Done.
Executing tinymembench. This will take a long time... Done.
Executing 7-zip benchmark. This will take a long time... Done.
Executing OpenSSL benchmark. This will take a long time... Done.


Below benchmark results:

Memory performance:
memcpy: 1132.0 MB/s (0.1%)
memset: 1532.7 MB/s (0.2%)

7-zip total scores (three runs): 3229,3234,3252

OpenSSL results:
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-128-cbc      30443.07k    41855.19k    46518.27k    47647.74k    48207.19k    48229.03k
aes-128-cbc      30656.82k    42144.41k    46898.26k    48249.17k    48482.99k    48611.33k
aes-192-cbc      27670.34k    36476.54k    40087.98k    41066.84k    41227.61k    41298.60k
aes-192-cbc      26882.09k    35798.91k    39068.93k    40194.05k    40580.44k    40452.10k
aes-256-cbc      25480.00k    32825.15k    35353.34k    36172.46k    36413.44k    36574.55k
aes-256-cbc      25390.35k    32830.42k    35614.63k    36268.03k    36620.97k    36634.62k

Full results uploaded to http://ix.io/1ism. Please check the log for anomalies (e.g. swapping
or throttling happenend) and otherwise share this URL.

 

Edited by NicoD
Tinymembench wasn't installed

@chwe I always try to simplify things when possible, so I'll be brief about what you wrote: this is a specific use case, and the CPU core is the most important part. That's why I need a reliable benchmark of the CPU, NOT RAM, NOT IO, NOT disk.

And yes, cache is part of the CPU, which is good for my application.

 

But as you can see, the rumours differ for each app. I haven't settled on one final bench yet. I'm gathering knowledge and am not happy enough with the current proposals and results.

 

Also, in multi-threaded and multi-core benchmarks I think you will not get the same results even on the same machine under the same circumstances. What I am trying to do is compare x86/x64 with ARM using some kind of bench that is general across those two architectures.

And I also don't believe that I am the only one on the planet searching for such an app. It should exist. Even if it doesn't exist, one should be able to write it. Linpack seems to be the right choice, but I need to dig into the topic more.


I did the same in Ubuntu, no fan or heatsink. Same result as in Raspbian: everything over 60°C runs at 1200 MHz.
The 7-zip problem showed up again too; it stopped.
 

Executing OpenSSL benchmark. This will take a long time... Done.


Below benchmark results:

Memory performance:
memcpy: 1108.8 MB/s 
memset: 1516.6 MB/s 

7-zip total scores (three runs): 

OpenSSL results:
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-cbc      37186.77k    45709.08k    48782.93k    49392.30k    49815.55k
aes-128-cbc      37996.50k    46337.66k    49573.89k    50489.34k    50686.63k
aes-192-cbc      33266.59k    39735.40k    41970.09k    42459.82k    42748.59k
aes-192-cbc      33093.75k    38628.12k    41025.45k    41634.47k    41806.51k
aes-256-cbc      29922.34k    35095.13k    36887.04k    37195.78k    37486.59k
aes-256-cbc      29542.89k    34427.33k    36255.91k    36720.30k    36828.50k

Full results uploaded to http://ix.io/1isM. Please check the log for anomalies (e.g. swapping
or throttling happenend) and otherwise share this URL.

And here are the results for Raspbian with cooling. All normal here.
http://ix.io/1isD
 

I thought it would throttle to 1200 MHz at 70°C. Or am I wrong here?
It shows that even the Rasp 3b+ needs a fan for good performance.
Maybe I'll make a video about it.
Thanks for all the info. Cheers

 


3 hours ago, NicoD said:

Full results uploaded to http://ix.io/1ism

 

Thank you! So the new RPi 3 B+ with latest updates applied silently downclocks even when there's just boring tinymembench running:

System health while running tinymembench:

Time        fake/real   load %cpu %sys %usr %nice %io %irq    CPU   VCore
16:56:39: 1400/1400MHz  0.28  14%   0%  12%   0%   0%   0%  60.1°C  1.3250V
16:57:39: 1400/1200MHz  0.67  21%   0%  21%   0%   0%   0%  62.8°C  1.2313V

It would be funny to repeat the test, this time with the fan active, since I believe the RPi clowns do not only downclock the CPU cores but most probably also GPU, VPU and DRAM. A second test with a fan should clarify.
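A small sketch of how such monitoring output could be scanned for silent downclocking. It assumes the log lines look exactly like the two shown above (time stamp first, then a 'fake/real' clock field ending in MHz); the function name flag_clock_mismatch is made up for this example and is not part of sbc-bench.

```shell
# Print every monitoring line whose reported ("fake") and measured ("real")
# clock differ. Assumed input format, as in the log above:
#   16:57:39: 1400/1200MHz  0.67  21% ...
flag_clock_mismatch() {
    awk '
        $2 ~ /^[0-9]+\/[0-9]+MHz$/ {
            split($2, clk, "/")
            fake = clk[1] + 0
            real = clk[2] + 0   # "1200MHz" + 0 evaluates to 1200 in awk
            if (fake != real)
                printf "%s reported %d MHz, measured %d MHz\n", $1, fake, real
        }
    '
}
```

Usage would be something like piping the saved monitoring section of a result log into flag_clock_mismatch; no output means the two values agreed on every sampled line.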

 

Edit: already provided by @NicoD in the meantime :) 

 

3 hours ago, NicoD said:

I had a lot better results in Ubuntu with the Rasp3b+, while the rasp3b is slower in Ubuntu vs raspbian

 

  1. Upstream Ubuntu armhf packages are built with another GCC version and different compiler switches (for ARMv7), while Raspbian builds everything for ARMv6 to support their single core boards too. But Raspbian uses more aggressive compiler switches, so that some code (e.g. the funny sysbench joke) performs better with the Raspbian ARMv6 binary compared to an upstream Debian or Ubuntu armhf package: see sysbench pseudo benchmark numbers made with my OMV images for RPi (using an Armbian armhf userland combined with the proprietary RPi stuff): https://forum.armbian.com/topic/1748-sbc-consumptionperformance-comparisons/?page=2
  2. Way more important, and ignored by everyone: the Raspberry Pi is NOT an ARM SBC like all the other boards we're using. It's a VideoCore IV (VC4) SBC with some crappily integrated ARM cores. The VC4 is the primary CPU and runs a closed source RTOS called ThreadX that fully controls the hardware. The ARM cores are just guest processors (called 'third class citizens' by the lady who tried to develop an open source replacement for the proprietary ThreadX stuff) and are only able to run a secondary OS like e.g. Linux that doesn't even have a clue at which clockspeeds it's running.

4 weeks ago the RPi clowns decided to release a new ThreadX release which contains a significant change: as soon as the SoC temperature exceeds 60°C on the RPi 3 B+, some subsystems will be silently downclocked. Since they're cheating, you can't notice that by querying the usual sysfs node. In the past it was possible to spot this cheating with 'vcgencmd get_throttled', which reported throttling (and also frequency capping and undervoltage) since last reboot. Now they cheat even more, and with this first clock reduction from 1.4 GHz to 1.2 GHz the relevant throttling bit is no longer set.
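For readers who want to look at those flags themselves, here is a minimal sketch of decoding the value printed by 'vcgencmd get_throttled'. The bit positions (0 to 2 for the current state, 16 to 18 for 'has occurred since boot') are an assumption based on the Raspberry Pi documentation as I understand it, so verify them against your firmware; decode_throttled is a hypothetical helper name. As described above, an empty result does not prove that no downclocking happened.

```shell
# Decode the undervoltage / frequency capping / throttling bits of a value
# such as "throttled=0x50005" (the "throttled=" prefix is optional).
decode_throttled() {
    raw=${1#throttled=}
    flags=$(( $raw ))            # hex constant -> decimal integer
    for entry in \
        "0:under-voltage detected" \
        "1:ARM frequency capped" \
        "2:currently throttled" \
        "16:under-voltage has occurred since boot" \
        "17:ARM frequency capping has occurred since boot" \
        "18:throttling has occurred since boot"
    do
        bit=${entry%%:*}         # text before the first colon
        text=${entry#*:}         # text after the first colon
        if [ $(( (flags >> bit) & 1 )) -eq 1 ]; then
            echo "$text"
        fi
    done
    return 0
}
```

On a Raspberry Pi this could be called as: decode_throttled "$(vcgencmd get_throttled)"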

 

In other words: 4 weeks ago the vast majority of RPi 3 B+ out there were a bit faster than they are after applying the latest updates. The closed source main OS ThreadX is available to us only as BLOBs living on the FAT partition below /boot (on RPi OS images it's the 'raspberrypi-bootloader' package pulled in from archive.raspberrypi.org). This is a typical 'commit' (exchanged BLOBs no one outside RPi Trading and Broadcom can look into): https://github.com/raspberrypi/firmware/commit/0bef3cb16d600292d4185796cc042fd564bc694d

 

The whole hardware initialization as well as everything that's performance relevant happens in ThreadX, and the ways to monitor what's really happening when looking from the secondary OS (Linux) are crippled (since the mailbox driver is cheating and reporting fantasy clockspeeds). So on this VC4 platform it's even more important to permanently monitor as well as possible what's happening, since benchmarking without checking what's really happening only generates numbers without meaning.

 

TL;DR: RPi clowns decided a few weeks ago to trash performance of all RPi 3 B+ out there to address the instability problems some board owners suffer from. Problem as well as workaround to get back old behaviour described here: https://www.raspberrypi.org/forums/viewtopic.php?f=63&t=217056#p1335342

 


10 minutes ago, pbies said:

@chwe I always try to simplify things when possible, so I'll be brief about what you wrote: this is a specific use case, and the CPU core is the most important part. That's why I need a reliable benchmark of the CPU, NOT RAM, NOT IO, NOT disk.

And yes, cache is part of the CPU, which is good for my application.

 

You should read through Ulrich Drepper's work.. To ensure that this stuff happens in L1 or L2 cache you must put a bunch of work into your code. And then benchmark your code. But then your first post is IMO worthless (I use tool X, benchmark processors A, B and C, and X runs 10 times faster on A than on B, and on C at half the speed of B).. If only the CPU matters for your application, then you should maybe read up on what your SoC is capable of (e.g. stuff like NEON may matter). Compiler and compiler flags too... Probably cache and cache usage too. Does the benchmark represent all these factors? 'Test A and look at B to conclude C' is often how such stuff ends (I'm a heretic, but I did/do it too often myself, even in fields where I think I understand what I'm doing, e.g. optimizing reaction parameters in chemistry with model molecules because the target molecule wasn't available in the amounts needed for proper optimization; it happens quite often that something doesn't work as expected as soon as you use the target molecule).. In this case you should IMO understand exactly what your benchmark benchmarks before you can even assume that the benchmark will somehow correlate with your 'program'. For a CPU benchmark:

  • Can you be sure that stuff happens on the CPU and RAM is not involved?
  • Can you monitor CPU clockspeed without affecting the benchmark (or at least, does the monitoring affect the benchmark in a reproducible way)?
  • How can you make sure that your results are reliable? Are you sure that nothing happens in the background which affects your results?

From there, it looks to me as if it only makes sense to benchmark your piece of code on every CPU you might consider useful for it. Because then you're at least sure that B is out of the equation and you only measure A and probably conclude a wrong C :lol: (at least one source of error less).

 

39 minutes ago, pbies said:

Also, in multi-threaded and multi-core benchmarks I think you will not get the same results even on the same machine under the same circumstances.

As soon as this happens, this should be a big red flag. Because I'm not interested in *single thing* benchmarking, I 'can' happily ignore such stuff. I don't have to care whether the kernel scheduler works properly or whether this CPU might perform better with some out-of-tree kernel scheduler (I think Facebook once developed its own scheduler, or was it Google? I don't remember). All this stuff will be somewhere in the S/N of the overall benchmarking of the system. But when you start to benchmark single points of a system, IMO you need to take such stuff into account..

  • how much do my individual results differ from each other
  • how many results do I need for a proper statement
  • which statistics are needed, and do they represent my results correctly (e.g. geometric vs. arithmetic mean etc. - I would guess the geometric is appropriate, but don't hang me on this)
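To make the questions above concrete, here is a minimal sketch that summarizes a comma-separated list of scores (the format sbc-bench prints for its three 7-zip runs). It reports the number of runs, the arithmetic and geometric mean, and the spread, defined here as (max - min) divided by the arithmetic mean. The function name summarize_scores and that particular spread measure are choices made for this example only; which statistic is actually appropriate remains the open question from the list.

```shell
# Usage: summarize_scores 3907,3629,3549
summarize_scores() {
    printf '%s\n' "$1" | tr ',' '\n' | awk '
        NF {
            n++
            sum += $1
            logsum += log($1)          # geometric mean = exp(mean of logs)
            if (n == 1 || $1 < min) min = $1
            if (n == 1 || $1 > max) max = $1
        }
        END {
            if (n == 0) exit 1
            mean = sum / n
            printf "runs=%d arithmetic=%.1f geometric=%.1f spread=%.1f%%\n",
                   n, mean, exp(logsum / n), (max - min) / mean * 100
        }
    '
}
```

A large spread between consecutive runs is a hint that something other than the code under test changed between them (for example temperature), so the individual values deserve a look before any mean is published.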

Going down this whole rabbit hole just to publish some results which weren't of much interest to me wasn't something I wanted to do (it was interesting to read up on the topic, but I don't feel competent enough to publish the results).

On the other hand, benchmarks of parts I can replace are interesting.. Which SD card should I use because it performs better than others? Comparing two ARM SBCs with different SoCs might also be interesting (I can replace my SBC if the one I own isn't sufficient for my needs.. :P).

 

If you then compare across different CPU architectures, you open the next rabbit hole. Was the benchmark developed for this purpose? Or was it just made to compare different CPUs from the same family? It might be a perfect fit for comparing different x86 systems but completely suck at comparing ARM and x86.

 

1 hour ago, pbies said:

And I also don't believe that I am the only one on the planet searching for such an app. It should exist. Even if it doesn't exist, one should be able to write it.

If it were that easy, someone would have written it already. It's not as if there were no market for reliable benchmark tools.


8 minutes ago, tkaiser said:

It would be funny to repeat the test, this time with the fan active, since I believe the RPi clowns do not only downclock the CPU cores but most probably also GPU, VPU and DRAM. A second test with a fan should clarify.

Here with cooling.

http://ix.io/1isD

Then what is 'better' about the +? 1/3 gigabit Ethernet, Wi-Fi and a metal shield over the SoC. That's all the innovation they did in a year. I can't understand why they came out with the 3b+. They have the biggest part of the market and come out with sh*t products.
Is it OK if I make a video about this and mention your name?
"Your Raspberry 3b+ is lying to you"
I think it would get views. Thanks.


31 minutes ago, NicoD said:

I thought it would throttle to 1200 MHz at 70°C. Or am I wrong here?

 

They silently changed this a few weeks ago with a new ThreadX release (the primary OS they call 'firmware'). BTW: I added monitoring of ThreadX settings (search for config.txt in the http://ix.io/1isD output) and saw that you didn't use any overclock settings. It might be interesting to see whether tuning of DRAM settings also gets reverted when the SoC temperature exceeds 60°C now.


4 minutes ago, tkaiser said:

Might be interesting whether tuning of DRAM settings also gets reverted when SoC temperature exceeds 60°C now.

I'll do that tomorrow. I've been awake too long. Can't focus anymore. Cheers


2 hours ago, NicoD said:

I can't understand why they came with the 3b+

 

To sell more of these devices, making nice profits? The average RPi user is pretty clueless, so all that's needed to sell a new 'incremental update' is mentioning that it's faster. In fact the 3 B+ was a little bit faster for some months (1.4 GHz vs. 1.2 GHz, and a way better PCB design plus heatspreader resulted in higher sustained performance). Now that all the benchmarks are published, they silently reverted the higher performance, since everything demanding enough to need the 1.4 GHz will now easily trigger the 60°C throttling threshold.

 

But hey, RPi users won't realize since /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_cur_freq shows only bogus numbers. Same when undervoltage occurs. In such a situation (input voltage dropping below 4.65V which happens very very very often with Raspberries not using their 'special PSU' but standard Micro USB gear) the ARM cores are immediately downclocked to 600 MHz while /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_cur_freq happily lies about 1400 MHz on the 3 B+ or 1200 MHz on the 3 B.
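A minimal sketch of how the two views could be put side by side on a Raspberry Pi. It assumes that 'vcgencmd measure_clock arm' is available and prints something of the form 'frequency(45)=1200126000' (value in Hz), and that the sysfs node reports kHz; both function names are invented for this example, and sbc-bench's own monitoring may work differently.

```shell
# Convert e.g. "frequency(45)=1200126000" to whole MHz.
parse_measure_clock_mhz() {
    hz=${1#*=}                   # keep everything after the "="
    echo $(( hz / 1000000 ))
}

# Print what Linux believes next to what the firmware measures.
# Only meaningful on a Raspberry Pi; reading cpuinfo_cur_freq may need root.
compare_clocks() {
    sysfs_khz=$(cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_cur_freq) || return 1
    measured=$(vcgencmd measure_clock arm) || return 1
    echo "sysfs: $(( sysfs_khz / 1000 )) MHz, measured: $(parse_measure_clock_mhz "$measured") MHz"
}
```

Running compare_clocks in a loop during a benchmark (at a low sampling rate, so the monitoring itself stays cheap) would show whether the two numbers drift apart once the SoC heats up.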

 

BTW: The VC4 is a 2010 design and nothing except exchanged ARM cores has changed. They have nothing else. If they switched to a new SoC, backwards compatibility wouldn't exist any more. I wouldn't be surprised if we see a last incremental update in 2019 (then using an eMMC socket, implementing SDR104 mode for faster SD card access and Wi-Fi with 2x2 MIMO and real antennas) and in 2020 they simply say 'game over'.


11 minutes ago, tkaiser said:

I wouldn't be surprised if we see a last incremental update in 2019 (then using an eMMC socket, implementing SDR104 mode for faster SD card access and Wi-Fi with 2x2 MIMO and real antennas) and in 2020 they simply say 'game over'.

Why so pessimistic/optimistic? (depends highly on point of view :lol:) Just buy and add a cheap AI block, stick it to the SoC. RPi goes AI with Blockchain!!!1! :lol: They will survive 2-3 years more with it.. :P

 

If you accept that the RPi isn't a high performance SBC but one made to 'learn' programming (not really low level stuff, more different sorts of 'hello world' with GPIOs and/or camera), the RPi is still an affordable board. For this use case their last iteration is worthless (you can still do this with an RPi3), but that's part of the deal to keep the people happy who pay their rent (seems that they aren't that happy at the moment, but hey, they sold over 19M devices, so they can't be wrong, right? :P At least it seems to be their main excuse for everything they didn't do right).

 

Probably we should split the RPi rant part off into the RPi thread, and @NicoD's 7zip results into this one?

 

 

 


50 minutes ago, chwe said:

Probably we should split the RPi rant part in the RPi thread

 

No, please. The 'rant' highlights what's problematic when benchmarking boards: the existence of an operating system the average user doesn't know about (since being told ThreadX would just be some 'firmware') and how updates of this primary OS can affect performance behaviour and how hard it is to monitor this stuff to get a clue why performance differs when running this OS image vs. that OS image (especially keeping in mind that those secondary Linux operating systems pull in updates for the primary OS that change whole system behaviour)

 

Let's keep this stuff collected here. When I soon start some sort of tutorial on how to benchmark correctly, I will reference some posts here.


16 hours ago, tkaiser said:

and saw that you didn't use any overclock settings. Might be interesting whether tuning of DRAM settings also gets reverted when SoC temperature exceeds 60°C now.

The RPi 3b+ takes all the overclocks. It still downclocks to 1200 MHz, but the voltage isn't lowered: always 1.3312V.
So it reaches a much higher temperature of over 80°C. In tinymembench it does perform better.
Here's the result overclocked in Raspbian.
 

pi@raspberrypi:~ $ sudo /usr/local/bin/sbc-bench.sh
Installing needed tools. This may take some time... Done.
Executing tinymembench. This will take a long time... Done.
Executing 7-zip benchmark. This will take a long time... Done.
Executing OpenSSL benchmark. This will take a long time... Done.


Below benchmark results:

Memory performance:
memcpy: 1312.4 MB/s (0.1%)
memset: 1821.1 MB/s (0.3%)

7-zip total scores (three runs): 3313,3285,3050

OpenSSL results:
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-128-cbc      30650.45k    42188.33k    46899.54k    48080.90k    48657.75k    48655.02k
aes-128-cbc      30022.42k    41722.30k    46365.70k    47691.09k    47920.47k    48168.96k
aes-192-cbc      27145.16k    35704.23k    39041.02k    39630.85k    40457.56k    40594.09k
aes-192-cbc      27553.39k    36428.63k    39753.47k    40870.91k    41170.26k    41041.92k
aes-256-cbc      25483.15k    32836.35k    35496.96k    36387.16k    36626.43k    36640.09k
aes-256-cbc      24849.98k    32170.92k    34895.19k    35523.24k    35700.74k    35471.36k

Full results uploaded to http://ix.io/1ivA. Please check the log for anomalies (e.g. swapping
or throttling happenend) and otherwise share this URL.

Here's the same overclock with cooling as comparison.
 

pi@raspberrypi:~ $ sudo /usr/local/bin/sbc-bench.sh
Installing needed tools. This may take some time... Done.
Executing tinymembench. This will take a long time... Done.
Executing 7-zip benchmark. This will take a long time... Done.
Executing OpenSSL benchmark. This will take a long time... Done.


Below benchmark results:

Memory performance:
memcpy: 1315.1 MB/s 
memset: 1945.6 MB/s 

7-zip total scores (three runs): 3907,3629,3549

OpenSSL results:
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-128-cbc      40217.22k    55205.89k    61371.56k    62970.54k    63670.95k    63706.45k
aes-128-cbc      40171.25k    55206.31k    61369.51k    63136.77k    63501.65k    63668.22k
aes-192-cbc      36252.93k    47764.89k    52463.45k    53744.64k    53998.93k    54160.04k
aes-192-cbc      36250.63k    47887.83k    52324.01k    53736.45k    54138.20k    54012.59k
aes-256-cbc      33070.41k    42646.83k    46240.09k    47617.37k    47928.66k    47950.51k
aes-256-cbc      33311.06k    42963.97k    46611.37k    47477.42k    47925.93k    47945.05k

Full results uploaded to http://ix.io/1ivO. Please check the log for anomalies (e.g. swapping
or throttling happenend) and otherwise share this URL.

 

I'll do more tests....


3 hours ago, NicoD said:

I'll do more tests....

 

In the meantime I improved the script in some areas (especially throttling/undervoltage warnings). To get rid of CRLF it's as easy as

wget -O - https://pastebin.com/raw/CXtt28y1 | tr -d "\015" | sudo tee /usr/local/bin/sbc-bench.sh >/dev/null
sudo chmod 755 /usr/local/bin/sbc-bench.sh
sudo /usr/local/bin/sbc-bench.sh
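Why CRLF matters here: if the first line of a script ends in a carriage return, the kernel looks for an interpreter whose name includes that CR (e.g. '/bin/bash' followed by CR), which does not exist. That is a likely explanation for the 'No such file or directory' error earlier in the thread even though the script file itself was present. A minimal sketch of a check, with has_crlf_shebang being a made-up helper name:

```shell
# Succeeds (exit status 0) if the first line of file $1 ends in a
# carriage return, i.e. the file most likely has DOS (CRLF) line endings.
has_crlf_shebang() {
    cr=$(printf '\r')
    case $(head -n 1 "$1") in
        *"$cr") return 0 ;;
        *)      return 1 ;;
    esac
}

# Example:
#   if has_crlf_shebang /usr/local/bin/sbc-bench.sh; then
#       echo "CRLF line endings found, strip them with: tr -d '\015'"
#   fi
```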

Tested already with a bunch of boards (all the time in situations with active cooling to test for stuff like kernel differences or architecture):

The interesting stuff is as follows:

  • When comparing the RK3399 boards (NanoPC T4 and RockPro), the kernel version makes a huge difference wrt memory bandwidth/latency, which also results in different 7-zip scores
  • arm64 vs. armhf (Rock64) is not that much of an issue. The armhf binary is slightly slower, but on the other hand an armhf userland can cope with less available physical memory
  • NanoPi Fire has 8 CPU cores but just 1 GB DRAM, which results in a big problem with almost all workloads that would benefit from 'as many CPU cores as possible'. As a result swapping happens. With recent Armbian that's not much of a problem, since we switched from SD card based emergency swap to zram, which works pretty well. But when running sbc-bench with a different distro relying on swap, numbers might be much lower since storage becomes the bottleneck (TBC).

I'll push the script plus explanations to GitHub over the weekend and create a separate thread for the tool.


2 hours ago, NicoD said:

Here's the same overclock with cooling as comparison

 

Interesting, thanks. Throttling still happened since you hit the 60°C threshold multiple times, see the 7-zip results from 3 consecutive runs: 3907,3629,3549 (declining) and

11:34:15: 1570/1200MHz  3.57  83%   1%  82%   0%   0%   0%  60.1°C  1.3312V
11:35:00: 1570/1200MHz  3.90  80%   1%  79%   0%   0%   0%  59.6°C  1.3312V

You would need to add 'temp_soft_limit=70' to /boot/config.txt and reboot to get back the 'old behaviour', so that silent throttling starts at 70°C as it did prior to 'Jul 3 2018 14:15:46' (that's the timestamp of the latest ThreadX update that destroys performance on every RPi 3 B+ around).
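The config change can be sketched as a small helper that sets the value without leaving duplicate entries behind. The function name set_temp_soft_limit is invented for this example; the default path is the /boot/config.txt mentioned above, and writing to it needs root.

```shell
# Set temp_soft_limit in a config.txt style file, replacing any existing
# entry so the setting appears exactly once.
# Usage: set_temp_soft_limit 70 [/path/to/config.txt]
set_temp_soft_limit() {
    limit=$1
    cfg=${2:-/boot/config.txt}
    tmp=$(mktemp) || return 1
    # Copy everything except old temp_soft_limit lines (grep exits
    # non-zero when nothing is copied, which is not an error here).
    grep -v '^temp_soft_limit=' "$cfg" > "$tmp" || true
    printf 'temp_soft_limit=%s\n' "$limit" >> "$tmp"
    cat "$tmp" > "$cfg" && rm -f "$tmp"
}
```

The setting only takes effect after a reboot, and comparing benchmark runs before and after is the only way to confirm it did what was intended.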


 

I also made a walk-through with the information I had from you.

sudo wget -O /usr/local/bin/sbc-bench.sh https://pastebin.com/raw/Ww84KMmw 

sudo apt-get install dos2unix
sudo dos2unix /usr/local/bin/sbc-bench.sh

sudo chmod 755 /usr/local/bin/sbc-bench.sh
sudo /usr/local/bin/sbc-bench.sh

I'm now doing the Khadas Vim2 Max. I'll start using the new script after I'm done with the Khadas.
I'll do all of them I've got that aren't in the list.
I am wondering how the NanoPC-T4 does against my XU4 and the NanoPC-T3+
I'll keep you informed of all the results.


4 minutes ago, tkaiser said:

Interesting, thanks. Throttling still happened since you hit the 60°C threshold multiple times, see the 7-zip results from 3 consecutive runs: 3907,3629,3549 (declining) and

I didn't add my heatsink. I'll do that again. I think it stays under 60 then. And it's about the hottest day ever, how to cool then...


53 minutes ago, NicoD said:

sudo wget -O /usr/local/bin/sbc-bench.sh https://pastebin.com/raw/Ww84KMmw

 

Please use https://pastebin.com/raw/CXtt28y1 instead. There is no need for dos2unix, since the same can be achieved by stripping out the CR characters ('\015') using tr, as shown above.

 

Vim2 will be very interesting since strange things happen there. Really curious about the results. NanoPC T4 results are above (but with conservative settings limiting CPU cores to 1.8/1.4GHz instead of 2.0/1.5GHz we'll use later) and NanoPC-T3+ will be more or less the same as NanoPi Fire3 since same SoC but different amount (and maybe type) of DRAM. And yeah, XU4 is also interesting.


@tkaiser

 

I ain't getting an output url with the khadas. Any idea why?
Internet works.

Below benchmark results:

Memory performance (on big.LITTLE systems measured individually):
memcpy: 1846.3 MB/s 
memset: 5735.6 MB/s (1.1%)
memcpy: 1615.8 MB/s (0.1%)
memset: 4998.7 MB/s 

7-zip total scores (three runs): 4803,4848,4790

OpenSSL results (on big.LITTLE systems measured individually):
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-cbc     177427.08k   474573.65k   793271.13k   986034.86k  1061098.84k
aes-128-cbc     125519.67k   335571.22k   561038.51k   697529.00k   750712.15k
aes-192-cbc     165722.43k   409484.03k   633686.95k   754190.34k   798203.90k
aes-192-cbc     117216.16k   288164.71k   448321.79k   533306.71k   564486.14k
aes-256-cbc     158712.70k   365550.08k   542333.27k   627987.11k   658262.70k
aes-256-cbc     112265.98k   259799.94k   383578.03k   444206.42k   465603.24k

Full results uploaded to . Please check the log for anomalies (e.g. swapping
or throttling happenend) and otherwise share this URL.

 


12 minutes ago, NicoD said:

I ain't getting an output url with the khadas. Any idea why?

 

I've seen temporary failures as well when accessing http://ix.io -- guess I have to rework the upload routine to try it even more times (currently upload will be tried twice then given up on)

 

A simple check would be

echo Hello world. | curl -F 'f:1=<-' ix.io
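The 'try it even more times' idea could be sketched as a generic retry wrapper around the same curl call. This is only an illustration, not the script's actual upload routine: retry and upload_log are hypothetical names, the log path in the example is a placeholder, and '--fail' is added so that curl reports HTTP errors through its exit status.

```shell
# Run a command up to $1 times, sleeping $2 seconds between attempts.
# Returns 0 as soon as one attempt succeeds, 1 if all of them fail.
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while :; do
        if "$@"; then
            return 0
        fi
        if [ "$n" -ge "$attempts" ]; then
            return 1
        fi
        n=$(( n + 1 ))
        sleep "$delay"
    done
}

# Upload the file given as $1 and print the resulting URL.
upload_log() {
    curl -sS --fail -F 'f:1=<-' ix.io < "$1"
}

# Example (placeholder path): five attempts, ten seconds apart
#   retry 5 10 upload_log /tmp/sbc-bench.log
```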

 

