pbies Posted July 26, 2018 Author

@NicoD I have worked with computer hardware and software for 30 years now, so there is no need to explain anything to me. For that whole time I was also a developer. I lost track of the apps only 2 years ago, as they multiplied by the thousands. That's why I am asking for a CPU benchmark.

I know that RAM will be used to start/stop the benchmark, since an OS is needed and so on, but all the operations themselves can still be done on the CPU alone, with a little help from the clock, which can simply be checked for elapsed time during the test. So there is no need for a CPU benchmark to use RAM that much. That's the bench I was looking for. I understand that you didn't find such a bench, but I don't believe that none exists.

And yes, I know that the OS will disturb the benchmark through interrupts and IO, so it will not be exact, but that is fine for me. There is still no need to use RAM for a CPU benchmark.
pbies Posted July 26, 2018 Author

I imagine such a benchmark as:
- operating only on registers: add, subtract, multiply, divide and other math instructions (ALU)
- using many of the ARM instructions, but limited to ARM, while also being reliable on x86/x64 CPUs
- doing a number of operations in a specific (selectable) time
- operating on RAM only when the clock needs to be checked for the above
tkaiser Posted July 26, 2018

8 minutes ago, pbies said: "such benchmark as"

Do a web search for 'ejolson openblas linpack' or go directly to https://www.raspberrypi.org/forums/viewtopic.php?t=208167

That is a number crunching benchmark done correctly (read as: NOT using distro packages). It is close to irrelevant for anything normal, though, since memory bandwidth/latency always matters.
NicoD Posted July 26, 2018

@tkaiser What am I doing wrong? I don't see it.

    pi@raspberrypi:~ $ sudo wget -O /usr/local/bin/sbc-bench.sh https://pastebin.com/raw/Ww84KMmw
    --2018-07-26 15:10:57-- https://pastebin.com/raw/Ww84KMmw
    Resolving pastebin.com (pastebin.com)... 104.20.209.21, 104.20.208.21
    Connecting to pastebin.com (pastebin.com)|104.20.209.21|:443... connected.
    HTTP request sent, awaiting response... 200 OK
    Length: unspecified [text/plain]
    Saving to: ‘/usr/local/bin/sbc-bench.sh’
    /usr/local/bin/sbc-bench.sh [ <=> ] 12.18K --.-KB/s in 0.004s
    2018-07-26 15:10:57 (2.69 MB/s) - ‘/usr/local/bin/sbc-bench.sh’ saved [12475]
    pi@raspberrypi:~ $ sudo chmod 755 /usr/local/bin/sbc-bench.sh
    pi@raspberrypi:~ $ sudo /usr/local/bin/sbc-bench.sh
    sudo: unable to execute /usr/local/bin/sbc-bench.sh: No such file or directory

I've checked. Indeed no such file.
NicoD Posted July 26, 2018

The file is there, but it doesn't work. I'll download the full Raspbian with Pixel. This was the Lite image with Pixel installed afterwards. Maybe something is missing there.
zador.blood.stained Posted July 26, 2018

39 minutes ago, NicoD said: "https://pastebin.com/raw/Ww84KMmw"

This script has CR+LF line endings; it needs to be converted to Unix format first.
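The confusing "No such file or directory" message fits this diagnosis: with CR+LF endings the interpreter line of the script ends in a carriage return, so the kernel looks for an interpreter whose name includes that invisible character and cannot find it, even though the script itself exists. A minimal sketch for checking and repairing a file follows; the helper names `has_crlf` and `strip_cr` are made up for illustration and are not part of sbc-bench.

```shell
#!/bin/sh
# Hypothetical helpers (illustrative names, not part of sbc-bench).

# has_crlf FILE: succeeds if FILE contains at least one carriage return.
has_crlf() {
    cr=$(printf '\r')
    grep -q "$cr" "$1"
}

# strip_cr FILE: rewrite FILE in place without carriage returns.
# Writing back with cat keeps the original file mode and owner.
strip_cr() {
    tmp=$(mktemp) || return 1
    tr -d '\r' < "$1" > "$tmp" && cat "$tmp" > "$1"
    rm -f "$tmp"
}
```

On a root-owned file such as /usr/local/bin/sbc-bench.sh the repair step would need to run with root privileges.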
NicoD Posted July 26, 2018

4 minutes ago, zador.blood.stained said: "CR+LF line endings"

Ok, it's coming now.
chwe Posted July 26, 2018

59 minutes ago, pbies said: "And yes, I know, that OS will interrupt the benchmark just by interrupts and IO and will be not exact, but it is fine for me - still no need to use RAM for CPU benchmark."

A question out of curiosity: what is a CPU-only benchmark good for, other than misinforming the people who read it? Doesn't it lead to the situation we already have, with people buying CPUs by:
- the more cores it has, the better
- the more GHz is written on the box, the faster it must be!

For me there are two sides of benchmarking that should be kept in mind. First, it should be reliable, meaning others should get 'the same numbers out' when they repeat your benchmark. (E.g. I could benchmark wifi sticks somewhere in the Alps, and throughput would for sure be way higher than in my hometown, where I have around 20 other wifi networks visible. A 'good' benchmark would cover both situations, with a comment on why and how performance differs.)

Second, you should think about who reads your benchmark. For the '30 years in computer science guy' it might be obvious that RAM speed, general IO speed etc. matter too; for the average smartass probably not. You get things like "I bought *random sbc* cause *random guy* claimed that the CPU there is 10x faster than *random other board's CPU*". Others then have to explain to the average smartass why this doesn't matter for his case, all because someone smart enough to know better decided to publish a benchmark which doesn't tell the full truth. Publish it with something like materials and methods, a discussion of these methods and a conclusion (somewhat like a scientific paper). Otherwise your benchmark just leads to wrong assumptions which others had/have to correct.

Best example, with the average smartass: "I bought a 2A PSU cause the RPi guys said buy a 2A PSU and your SBC will run fine, and microUSB is better than the outdated barrel plug cause my tiny tiny connector is gold plated and my outdated big barrel plug with a huge contact area is only nickel plated.." Reality showed that barrel plug SBCs work 'in general' more reliably, especially in situations with high power usage, but the RPi guys seem to disagree on this, otherwise they would never have released the 3B+. Or they think it's funny to troll their own community and make some extra bucks by selling 5.15V RPi branded PSUs. A microUSB powered board which needs a 'special PSU' is IMO just an error by design (5V on the microUSB output should be sufficient, otherwise your PCB designer failed).. but that's a different story...

I once tried to 'benchmark' the CPU, mostly with a tool called cachebench, after reading through a paper written by Ulrich Drepper (I'm quite sure he knows what he's doing; for people interested, the paper is old but I think it's worth reading). You could easily see whether the benchmark happened in CPU cache or whether it used RAM (as far as I remember, you could even distinguish between L1 and L2 cache, mostly when it did memcpy). But I got results out of it which I could not explain; in particular there were 'performance peaks' which IMO shouldn't be there and which I couldn't explain in a rational manner. I decided to throw away the whole dataset and never published it, cause it would only end in wrong assumptions. (I think the bash script which sets up the system reliably and converts the data for analysis afterwards is still around somewhere, but I don't think it makes sense to work on it as long as I can't explain the results.)
NicoD Posted July 26, 2018 (edited)

@tkaiser Tinymembench wasn't installed. I've installed it and redone the bench. I'm amazed by what I see. I didn't know my Raspberries were lying to me. I'll do the same in Ubuntu, and also with a fan to see if it's still underclocked to 1.2 GHz. I had a lot better results in Ubuntu with the Rasp3b+, while the rasp3b is slower in Ubuntu vs Raspbian. I found that strange. I think this is the reason.

Here is the full bench with tinymembench:

    pi@raspberrypi:~/tinymembench $ sudo /usr/local/bin/sbc-bench.sh
    Installing needed tools. This may take some time... Done.
    Executing tinymembench. This will take a long time... Done.
    Executing 7-zip benchmark. This will take a long time... Done.
    Executing OpenSSL benchmark. This will take a long time... Done.

    Below benchmark results:
    Memory performance:
    memcpy: 1132.0 MB/s (0.1%)
    memset: 1532.7 MB/s (0.2%)
    7-zip total scores (three runs): 3229,3234,3252
    OpenSSL results:
    type              16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
    aes-128-cbc      30443.07k    41855.19k    46518.27k    47647.74k    48207.19k    48229.03k
    aes-128-cbc      30656.82k    42144.41k    46898.26k    48249.17k    48482.99k    48611.33k
    aes-192-cbc      27670.34k    36476.54k    40087.98k    41066.84k    41227.61k    41298.60k
    aes-192-cbc      26882.09k    35798.91k    39068.93k    40194.05k    40580.44k    40452.10k
    aes-256-cbc      25480.00k    32825.15k    35353.34k    36172.46k    36413.44k    36574.55k
    aes-256-cbc      25390.35k    32830.42k    35614.63k    36268.03k    36620.97k    36634.62k
    Full results uploaded to http://ix.io/1ism. Please check the log for anomalies (e.g. swapping or throttling happenend) and otherwise share this URL.

Edited July 26, 2018 by NicoD: Tinymembench wasn't installed
pbies Posted July 26, 2018 Author

@chwe I always try to simplify things when that is possible, so I'll be short on what you have written: this is a specific use case, and the CPU core is the most important part. That's why I need a reliable benchmark of the CPU, NOT RAM, NOT IO, NOT disk. And yes, cache is part of the CPU, which is good for my application.

But as you can see, the rumours are different for each app. I haven't settled on a final, single bench at this moment. I'm gathering knowledge and am not happy enough with the current proposals and results.

Also, in multi-threaded and multi-core benchmarks I think you will not get the same results even on the same machine in the same circumstances.

What I am trying to do is compare x86/x64 with ARM in some kind of bench that is general for those two architectures. And I don't believe that I am the only one on the planet searching for such an app. It should exist. Even if it doesn't exist, one should be able to write it.

Linpack seems to be the right choice, but I need to drill into the topic more.
NicoD Posted July 26, 2018

I did the same in Ubuntu, no fan or heatsink. The same result as Raspbian: everything over 60°C runs at 1200 MHz. Also the 7-zip problem again, it stopped.

    Executing OpenSSL benchmark. This will take a long time... Done.

    Below benchmark results:
    Memory performance:
    memcpy: 1108.8 MB/s
    memset: 1516.6 MB/s
    7-zip total scores (three runs):
    OpenSSL results:
    type              16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
    aes-128-cbc      37186.77k    45709.08k    48782.93k    49392.30k    49815.55k
    aes-128-cbc      37996.50k    46337.66k    49573.89k    50489.34k    50686.63k
    aes-192-cbc      33266.59k    39735.40k    41970.09k    42459.82k    42748.59k
    aes-192-cbc      33093.75k    38628.12k    41025.45k    41634.47k    41806.51k
    aes-256-cbc      29922.34k    35095.13k    36887.04k    37195.78k    37486.59k
    aes-256-cbc      29542.89k    34427.33k    36255.91k    36720.30k    36828.50k
    Full results uploaded to http://ix.io/1isM. Please check the log for anomalies (e.g. swapping or throttling happenend) and otherwise share this URL.

And here are the results of Raspbian with cooling. All normal here: http://ix.io/1isD

I thought it would throttle to 1200 MHz at 70°C. Or am I wrong here? It shows that even the Rasp 3b+ needs a fan for good performance. Maybe I'll make a video about it. Thanks for all the info. Cheers
tkaiser Posted July 26, 2018

3 hours ago, NicoD said: "Full results uploaded to http://ix.io/1ism"

Thank you! So the new RPi 3 B+ with latest updates applied silently downclocks even when there's just boring tinymembench running:

    System health while running tinymembench:
    Time       fake/real     load  %cpu  %sys  %usr  %nice  %io  %irq  CPU     VCore
    16:56:39:  1400/1400MHz  0.28  14%   0%    12%   0%     0%   0%    60.1°C  1.3250V
    16:57:39:  1400/1200MHz  0.67  21%   0%    21%   0%     0%   0%    62.8°C  1.2313V

It would be funny to repeat the test, this time with the fan active, since I would believe the RPi clowns do not only downclock the CPU cores but most probably also GPU, VPU and DRAM. A second test with fan should clarify. Edit: already provided by @NicoD in the meantime.

3 hours ago, NicoD said: "I had a lot better results in Ubuntu with the Rasp3b+, while the rasp3b is slower in Ubuntu vs raspbian"

Upstream Ubuntu armhf packages are built with another GCC version and different compiler switches (for ARMv7), while Raspbian builds everything for ARMv6 to support their single core boards too. But Raspbian uses more aggressive compiler switches, so some code (e.g. the funny sysbench joke) performs better with the Raspbian ARMv6 binary than with an upstream Debian or Ubuntu armhf package: see the sysbench pseudo benchmark numbers made with my OMV images for RPi (using an Armbian armhf userland combined with the proprietary RPi stuff): https://forum.armbian.com/topic/1748-sbc-consumptionperformance-comparisons/?page=2

Way more important is what everyone ignores: the Raspberry Pi is NOT an ARM SBC like all the other boards we're using. It's a VideoCore IV (VC4) SBC with some crappily integrated ARM cores. The VC4 is the primary CPU and runs a closed source RTOS called ThreadX that fully controls the hardware. The ARM cores are just guest processors (called 'third class citizens' by the lady who tried to develop an open source replacement for the proprietary ThreadX stuff) and are only able to run a secondary OS like e.g. Linux, which does not even have a clue at which clockspeeds it's running.

4 weeks ago the RPi clowns decided to release a new ThreadX release which contains a significant change: as soon as the SoC temperature exceeds 60°C on the RPi 3 B+, some subsystems are silently downclocked. Since they're cheating, you can't notice that by querying the usual sysfs node. In the past it was possible to spot this cheating with 'vcgencmd get_throttled', which reported throttling (and also frequency capping and undervoltage) since last reboot. Now they cheat even more, and with this first clock reduction from 1.4 GHz to 1.2 GHz the relevant throttling bit will not be set any more.

In other words: 4 weeks ago the vast majority of RPi 3 B+ out there was a bit faster than after applying the latest updates. The closed source main OS ThreadX is available to us only as BLOBs living on the FAT partition below /boot (on RPi OS images it's the 'raspberrypi-bootloader' package pulled in from archive.raspberrypi.org). This is a typical 'commit' (exchanged BLOBs no one outside RPi Trading and Broadcom can look into): https://github.com/raspberrypi/firmware/commit/0bef3cb16d600292d4185796cc042fd564bc694d

The whole hardware initialization, as well as everything that's performance relevant, happens in ThreadX. The ways to monitor what's really happening when looking from the secondary OS (Linux) are crippled (since the mailbox driver is cheating and reporting fantasy clockspeeds), so on this VC4 platform it's even more important to permanently monitor as well as possible what's happening, since benchmarking without checking what's really happening only generates numbers without meaning.

TL;DR: The RPi clowns decided a few weeks ago to trash the performance of all RPi 3 B+ out there to address the instability problems some board owners suffer from. The problem, as well as a workaround to get back the old behaviour, is described here: https://www.raspberrypi.org/forums/viewtopic.php?f=63&t=217056#p1335342
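For readers who want to look at the 'vcgencmd get_throttled' value mentioned above, here is a minimal sketch that decodes the bitmask. The function name `decode_throttled` is made up for illustration, and only the long-documented bits (0 to 2 and 16 to 18) are handled. As described in the post, a clean result does not prove the board ran at full speed, since the first clock reduction may leave these bits unset.

```shell
#!/bin/sh
# decode_throttled VALUE: print the meaning of a `vcgencmd get_throttled`
# bitmask such as 0x50005. Hypothetical helper; decodes only bits 0-2 (current
# state) and 16-18 (state seen since boot).
decode_throttled() {
    v=$(( $1 ))
    [ $(( v & 0x1 )) -ne 0 ]     && echo "under-voltage now"
    [ $(( v & 0x2 )) -ne 0 ]     && echo "ARM frequency capped now"
    [ $(( v & 0x4 )) -ne 0 ]     && echo "throttled now"
    [ $(( v & 0x10000 )) -ne 0 ] && echo "under-voltage has occurred since boot"
    [ $(( v & 0x20000 )) -ne 0 ] && echo "ARM frequency capping has occurred since boot"
    [ $(( v & 0x40000 )) -ne 0 ] && echo "throttling has occurred since boot"
    return 0
}

# Usage on a Raspberry Pi, where the command prints e.g. "throttled=0x50005":
#   decode_throttled "$(vcgencmd get_throttled | cut -d= -f2)"
```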
chwe Posted July 26, 2018

10 minutes ago, pbies said: "@chwe I always try to simplify things when there is such possibility, so I'll be short on what you written: it is a specific usage and the CPU core is the most important part. That's why I need reliable benchmark of CPU, NOT RAM, NOT IO, NOT disk. And yes, cache is part of CPU - good for my application."

You should read through Ulrich Drepper's work.. To ensure that this stuff happens in L1 or L2 cache you must put a bunch of work into your code. And then benchmark your code. But then your first post is IMO worthless (I use tool X, benchmark processors A, B and C, and X runs on A 10 times faster than on B and on C it's half the speed of B).. If only the CPU matters for your application, then you should maybe read up on what your SoC is capable of (e.g. stuff like NEON may matter). Compiler and compiler flags too... Probably cache and cache usage too. Does the benchmark represent all these factors?

Test A and look at B to conclude C is often how such stuff ends (I'm a heretic, but I did/do it too often even in fields where I think I understand what I do - e.g. optimizing reaction parameters in chemistry with model molecules cause the target molecule wasn't available in the amounts needed to do proper optimization; it happens quite often that something doesn't work as expected as soon as you use the target molecule).. In this case you should IMO understand exactly what your benchmark benchmarks before you can even assume that the benchmark will somehow correlate with your 'program'. For a CPU benchmark:
- Can you be sure that stuff happens on the CPU and RAM is not involved?
- Can you monitor CPU clockspeed without affecting the benchmark (or at least, is the monitoring affecting the benchmark in a reliable way)?
- How can you make sure that your results are reliable?
- Are you sure that nothing happens in the background which affects your results?

From there, it looks to me like it only makes sense to benchmark your piece of code on every CPU you might consider useful for it. Cause then you're at least sure that B is out of the equation, and you only measure A and probably conclude a wrong C (at least one possibility for errors less).

39 minutes ago, pbies said: "Also in multi-threaded and multi-core benchmarks, I think, you will not find the same results even on the same machine in the same circumstances."

As soon as this happens, it should be a big red flag. Cause I'm not interested in *single thing* benchmarking, I 'can' happily ignore such stuff. I don't have to care whether the kernel scheduler works properly or whether this CPU may perform better with some out-of-tree kernel scheduler (I think Facebook once developed its own scheduler, or was it Google? don't know anymore). All this stuff will be somewhere in the S/N of the overall benchmarking of the system. But when you start to benchmark single points of a system, IMO you need to take such stuff into account:
- how much do my single results differ from each other
- how many results do I need for a proper statement
- which statistics are needed and are they correct to represent my results (e.g. geometric vs. arithmetic mean etc. - I would guess the geometric is appropriate, but don't hang me on this)

Going down this whole rabbit hole just to publish some results which weren't of much interest to me wasn't something I wanted to do (it was interesting to read into the topic, but I don't feel competent enough to publish the results). On the other hand, benchmarks of parts I can replace are interesting.. Which SD card should I use because it performs better than others? Comparing two ARM SBCs with different SoCs might also be interesting (I can replace my SBC if the one I own isn't sufficient for my needs..). If you then compare across different CPU architectures, you open the next rabbit hole. Was the benchmark developed for this purpose? Or was it just made to compare different CPUs from the same family? It might fit perfectly for comparing different x86 systems but completely suck for comparing ARM and x86.

1 hour ago, pbies said: "And also I don't belive that I am only one on the planet that is searching for such app. It should exist. Even if it doesn't exist - one should be able to write it."

If it were that easy, someone would already have written it. It's not as if there were no market for reliable benchmark tools.
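The questions about how much single results differ and which mean to report can be made concrete with a few lines of awk. This is only a sketch under the assumption that the scores are positive numbers; the function name `score_stats` and the definition of spread as (max - min) / arithmetic mean are choices made here for illustration, not something prescribed in the thread.

```shell
#!/bin/sh
# score_stats SCORE...: print the arithmetic mean, the geometric mean and the
# relative spread ((max - min) / arithmetic mean, in percent) of the scores.
# Hypothetical helper; assumes every score is a positive number.
score_stats() {
    [ "$#" -gt 0 ] || return 1
    printf '%s\n' "$@" | awk '
        {
            x = $1 + 0
            if (NR == 1) { min = x; max = x }
            sum += x
            logsum += log(x)
            if (x < min) min = x
            if (x > max) max = x
        }
        END {
            mean = sum / NR
            printf "arithmetic=%.2f geometric=%.2f spread=%.2f%%\n",
                   mean, exp(logsum / NR), (max - min) / mean * 100
        }'
}

# Usage with three repeated runs of the same benchmark:
#   score_stats 3229 3234 3252
```

A spread of a few tenths of a percent between repeated runs suggests a stable setup; a spread of several percent is a reason to look at throttling or background activity before comparing boards.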
NicoD Posted July 26, 2018

8 minutes ago, tkaiser said: "Would be funny to repeat the test this time with fan active since I would believe the RPi clowns do not ony downclock CPU cores but most probably also GPU, VPU and DRAM. A second test with fan should clarify."

Here with cooling: http://ix.io/1isD

Then what is 'better' about the +? 1/3 gigabit ethernet, wifi and a metal surround over the SoC. That's all the innovation they did in a year. I can't understand why they came out with the 3b+. They have the biggest part of the market and come up with sh*t products.

Is it ok if I make a video about this and mention your name? "Your Raspberry 3b+ is lying to you". I think it would get views. Thanks.
tkaiser Posted July 26, 2018

31 minutes ago, NicoD said: "I thought it was at 70°C it would throttle to 1200Mhz. Or am I wrong here?"

They silently changed this a few weeks ago with a new ThreadX release (the primary OS they call 'firmware').

BTW: I added monitoring of ThreadX settings (search for config.txt in the http://ix.io/1isD output) and saw that you didn't use any overclock settings. It might be interesting to see whether tuning of DRAM settings also gets reverted when the SoC temperature exceeds 60°C now.
NicoD Posted July 26, 2018

4 minutes ago, tkaiser said: "Might be interesting whether tuning of DRAM settings also gets reverted when SoC temperature exceeds 60°C now."

I'll do that tomorrow. I've been awake too long. Can't focus anymore. Cheers
tkaiser Posted July 26, 2018

2 hours ago, NicoD said: "I can`t understand why they came with the 3b+"

To sell more of these devices, making nice profits? The average RPi user is pretty clueless, so all that's needed to sell a new 'incremental update' is mentioning that it's faster. In fact the 3 B+ was a little bit faster for some months (1.4 GHz vs. 1.2 GHz, and a way better PCB design plus heatspreader resulted in higher sustained performance). Now that all the benchmarks are published, they silently reverted the higher performance, since everything demanding enough to need the 1.4 GHz will now easily trigger the 60°C throttling threshold.

But hey, RPi users won't notice, since /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_cur_freq shows only bogus numbers. It's the same when undervoltage occurs. In such a situation (input voltage dropping below 4.65V, which happens very very very often with Raspberries not using their 'special PSU' but standard Micro USB gear) the ARM cores are immediately downclocked to 600 MHz, while /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_cur_freq happily lies about 1400 MHz on the 3 B+ or 1200 MHz on the 3 B.

BTW: The VC4 is a 2010 design and nothing except the exchanged ARM cores has changed. They have nothing else. If they switched to a new SoC, backwards compatibility wouldn't exist any more. I wouldn't be surprised if we see a last incremental update in 2019 (then using an eMMC socket, implementing SDR104 mode for faster SD card access, and Wi-Fi with 2x2 MIMO and real antennas) and in 2020 they simply tell us 'game over'.
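One way to see the mismatch described here is to put the cpufreq value next to the clock the VideoCore side reports via 'vcgencmd measure_clock arm'. The sketch below only does the arithmetic; the helper names `measured_hz` and `clock_gap_mhz` are made up for illustration, and the assumption is that the measure_clock output has the form "frequency(45)=1200000000".

```shell
#!/bin/sh
# Hypothetical helpers (illustrative names, not part of sbc-bench).

# measured_hz LINE: extract the Hz value from a line such as
# "frequency(45)=1200000000" (assumed `vcgencmd measure_clock arm` format).
measured_hz() {
    echo "${1#*=}"
}

# clock_gap_mhz SYSFS_KHZ MEASURED_HZ: print by how many MHz the cpufreq value
# (in kHz) overstates the measured clock (in Hz, rounded to the nearest MHz).
# 0 means both agree.
clock_gap_mhz() {
    echo $(( $1 / 1000 - ($2 + 500000) / 1000000 ))
}

# Usage on a Raspberry Pi (reading cpuinfo_cur_freq usually needs root):
#   reported=$(sudo cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_cur_freq)
#   real=$(measured_hz "$(vcgencmd measure_clock arm)")
#   clock_gap_mhz "$reported" "$real"
```

With the figures from the post, a board reporting 1400 MHz while actually running at 1200 MHz would print 200.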
chwe Posted July 26, 2018

11 minutes ago, tkaiser said: "I wouldn't be surprised if we see 2019 a last incremental update (then using an eMMC socket, implementing SDR104 mode for faster SD card access and Wi-Fi with 2x2 MIMO and real antennas) and in 2020 they're simply telling game over."

Why so pessimistic/optimistic? (depends highly on the point of view) Just buy a cheap AI block and stick it to the SoC. RPi goes AI with Blockchain!!!1! They will survive 2-3 more years with it..

If you accept that the RPi isn't a high performance SBC but one made to 'learn' programming (not really low level stuff, more different sorts of 'hello world' with GPIOs and/or camera), the RPi is still an affordable board. For this use case their last iteration is worthless (you can still do this with a RPi3), but that's part of the deal to keep the people happy who pay their rent (it seems that they aren't that happy at the moment, but hey, they sold over 19M devices, so they can't be wrong, right? At least that seems to be their main excuse for everything they didn't do right).

Probably we should split the RPi rant part off into the RPi thread, and @NicoD's 7zip results into this one?
tkaiser Posted July 26, 2018

50 minutes ago, chwe said: "Probably we should split the RPi rant part in the RPi thread"

No, please. The 'rant' highlights what's problematic when benchmarking boards: the existence of an operating system the average user doesn't know about (since they are told ThreadX is just some 'firmware'), how updates of this primary OS can affect performance behaviour, and how hard it is to monitor this stuff to get a clue why performance differs when running this OS image vs. that OS image (especially keeping in mind that those secondary Linux operating systems pull in updates for the primary OS that change whole system behaviour).

Let's keep this stuff collected here. When I soon start some sort of tutorial on how to benchmark correctly, I will reference some posts here.
NicoD Posted July 27, 2018

16 hours ago, tkaiser said: "and saw that you didn't use any overclock settings. Might be interesting whether tuning of DRAM settings also gets reverted when SoC temperature exceeds 60°C now."

The RPi 3b+ with everything overclocked: it does downclock to 1200 MHz, but the voltage isn't lowered. It is always 1.3312V, so it reaches a much higher temperature of over 80°C. In tinymembench it does perform better. Here's the result overclocked in Raspbian:

    pi@raspberrypi:~ $ sudo /usr/local/bin/sbc-bench.sh
    Installing needed tools. This may take some time... Done.
    Executing tinymembench. This will take a long time... Done.
    Executing 7-zip benchmark. This will take a long time... Done.
    Executing OpenSSL benchmark. This will take a long time... Done.

    Below benchmark results:
    Memory performance:
    memcpy: 1312.4 MB/s (0.1%)
    memset: 1821.1 MB/s (0.3%)
    7-zip total scores (three runs): 3313,3285,3050
    OpenSSL results:
    type              16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
    aes-128-cbc      30650.45k    42188.33k    46899.54k    48080.90k    48657.75k    48655.02k
    aes-128-cbc      30022.42k    41722.30k    46365.70k    47691.09k    47920.47k    48168.96k
    aes-192-cbc      27145.16k    35704.23k    39041.02k    39630.85k    40457.56k    40594.09k
    aes-192-cbc      27553.39k    36428.63k    39753.47k    40870.91k    41170.26k    41041.92k
    aes-256-cbc      25483.15k    32836.35k    35496.96k    36387.16k    36626.43k    36640.09k
    aes-256-cbc      24849.98k    32170.92k    34895.19k    35523.24k    35700.74k    35471.36k
    Full results uploaded to http://ix.io/1ivA. Please check the log for anomalies (e.g. swapping or throttling happenend) and otherwise share this URL.

Here's the same overclock with cooling as a comparison:

    pi@raspberrypi:~ $ sudo /usr/local/bin/sbc-bench.sh
    Installing needed tools. This may take some time... Done.
    Executing tinymembench. This will take a long time... Done.
    Executing 7-zip benchmark. This will take a long time... Done.
    Executing OpenSSL benchmark. This will take a long time... Done.

    Below benchmark results:
    Memory performance:
    memcpy: 1315.1 MB/s
    memset: 1945.6 MB/s
    7-zip total scores (three runs): 3907,3629,3549
    OpenSSL results:
    type              16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
    aes-128-cbc      40217.22k    55205.89k    61371.56k    62970.54k    63670.95k    63706.45k
    aes-128-cbc      40171.25k    55206.31k    61369.51k    63136.77k    63501.65k    63668.22k
    aes-192-cbc      36252.93k    47764.89k    52463.45k    53744.64k    53998.93k    54160.04k
    aes-192-cbc      36250.63k    47887.83k    52324.01k    53736.45k    54138.20k    54012.59k
    aes-256-cbc      33070.41k    42646.83k    46240.09k    47617.37k    47928.66k    47950.51k
    aes-256-cbc      33311.06k    42963.97k    46611.37k    47477.42k    47925.93k    47945.05k
    Full results uploaded to http://ix.io/1ivO. Please check the log for anomalies (e.g. swapping or throttling happenend) and otherwise share this URL.

I'll do more tests....
tkaiser Posted July 27, 2018

3 hours ago, NicoD said: "I'll do more tests...."

In the meantime I improved the script in some areas (especially throttling/undervoltage warnings). Getting rid of CRLF is as easy as:

    wget -O - https://pastebin.com/raw/CXtt28y1 | tr -d "\015" >/usr/local/bin/sbc-bench.sh
    chmod 755 /usr/local/bin/sbc-bench.sh
    sudo /usr/local/bin/sbc-bench.sh

Tested already with a bunch of boards (always with active cooling, to test for stuff like kernel differences or architecture):
- Raspberry Pi 2, kernel 4.14, default RPi settings, not throttled: http://ix.io/1ivw
- NanoPC T4, kernel 4.17, preliminary settings, not throttled: http://ix.io/1ivB
- RockPro64, kernel 4.18, ayufan/arm64 settings, not throttled: http://ix.io/1iw5
- RockPro64, kernel 4.4, ayufan/arm64 settings, not throttled: http://ix.io/1ivR
- NanoPi Fire3, kernel 4.14, Armbian settings, not throttled, zram swapping: http://ix.io/1ivC
- Clearfog Pro, kernel 4.14, Armbian settings, not throttled: http://ix.io/1ivE
- Rock64, kernel 4.4, Armbian settings, not throttled: http://ix.io/1ivG
- Rock64, kernel 4.4, ayufan/armhf settings, also 1392 MHz, not throttled: http://ix.io/1iwz

The interesting stuff is as follows:
- When comparing the RK3399 boards (NanoPC T4 and RockPro64), the kernel version makes a huge difference wrt memory bandwidth/latency, which also results in different 7-zip scores.
- arm64 vs. armhf (Rock64) is not that much of an issue. The armhf binary is slightly slower, but on the other hand an armhf userland can cope with less available physical memory.
- NanoPi Fire3 has 8 CPU cores but just 1 GB DRAM, which is a big problem with almost all workloads that would benefit from 'as many CPU cores as possible'. As a result swapping happens. With recent Armbian that is not much of a problem, since we switched from SD card based emergency swap to zram, which works pretty well. But when running sbc-bench with a different distro relying on swap, numbers might be much lower since storage becomes the bottleneck (TBC).

I'll push the script plus explanations to Github over the weekend and create a separate thread for the tool.
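Since the swap backend can change the numbers this much, it may help to record which kind of swap was active next to a result. The following is a minimal sketch that classifies the entries of a /proc/swaps style table; the function name `swap_kinds` and the two labels are made up for illustration, and anything that is not a /dev/zram device is simply labelled "storage".

```shell
#!/bin/sh
# swap_kinds [FILE]: read a /proc/swaps style table (default: /proc/swaps) and
# print one line per active swap area, tagged "zram" or "storage".
# Hypothetical helper; the first line of the table is the column header.
swap_kinds() {
    awk 'NR > 1 {
        kind = ($1 ~ /^\/dev\/zram/) ? "zram" : "storage"
        print kind, $1
    }' "${1:-/proc/swaps}"
}

# Usage on the board being benchmarked:
#   swap_kinds
```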
tkaiser Posted July 27, 2018

2 hours ago, NicoD said: "Here's the same overclock with cooling as comparison"

Interesting, thanks. Throttling still happened, since you hit the 60°C threshold multiple times. See the 7-zip results from 3 consecutive runs, 3907,3629,3549 (declining), and:

    11:34:15: 1570/1200MHz 3.57 83% 1% 82% 0% 0% 0% 60.1°C 1.3312V
    11:35:00: 1570/1200MHz 3.90 80% 1% 79% 0% 0% 0% 59.6°C 1.3312V

You would need to add 'temp_soft_limit=70' to /boot/config.txt and reboot to get back the 'old behaviour', so that silent throttling starts at 70°C as it did prior to 'Jul 3 2018 14:15:46' (that's the timestamp of the latest ThreadX update, which destroys performance on every RPi 3 B+ around).
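The 'declining scores' reading above can be turned into a quick check on the comma-separated 7-zip line. This is only a heuristic sketch with made-up function names (`scores_declining`, `score_drop_pct`): strictly falling scores hint at throttling but do not prove it, so the monitoring output still has to be checked.

```shell
#!/bin/sh
# Hypothetical helpers (illustrative names, not part of sbc-bench).

# scores_declining "A,B,C": succeeds when every run scores lower than the run
# before it; fails for a single score or any non-declining pair.
scores_declining() {
    echo "$1" | awk -F, '{
        if (NF < 2) exit 1
        for (i = 2; i <= NF; i++)
            if ($i + 0 >= $(i - 1) + 0) exit 1
        exit 0
    }'
}

# score_drop_pct "A,B,C": print the drop from the first to the last run in
# percent of the first run, with one decimal.
score_drop_pct() {
    echo "$1" | awk -F, '{ printf "%.1f\n", ($1 - $NF) / $1 * 100 }'
}

# Usage with the scores quoted above:
#   scores_declining "3907,3629,3549" && score_drop_pct "3907,3629,3549"
```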
NicoD Posted July 27, 2018

I also made a walk-through with the information I had from you:

    sudo wget -O /usr/local/bin/sbc-bench.sh https://pastebin.com/raw/Ww84KMmw
    sudo apt-get install dos2unix
    sudo dos2unix /usr/local/bin/sbc-bench.sh
    sudo chmod 755 /usr/local/bin/sbc-bench.sh
    sudo /usr/local/bin/sbc-bench.sh

I'm now doing the Khadas Vim2 Max. I'll start using the new script after I'm done with the Khadas. I'll do all the boards I've got that aren't in the list. I am wondering how the NanoPC-T4 does against my XU4 and the NanoPC-T3+. I'll keep you informed of all the results.
NicoD Posted July 27, 2018

4 minutes ago, tkaiser said: "Interesting, thanks. Still throttling happened since you hit the 60°C treshold multiple times, see the 7-zip results from 3 consecutive runs: 3907,3629,3549 (declining) and"

I didn't add my heatsink. I'll do that run again. I think it stays under 60°C then. And it's about the hottest day ever, so how to cool then...
tkaiser Posted July 27, 2018

53 minutes ago, NicoD said: "sudo wget -O /usr/local/bin/sbc-bench.sh https://pastebin.com/raw/Ww84KMmw"

Please use https://pastebin.com/raw/CXtt28y1 instead. There is no need for dos2unix, since the same can be achieved by stripping out the CR characters ('\015') with tr, as shown above.

Vim2 will be very interesting since strange things happen there. Really curious about the results. NanoPC T4 results are above (but with conservative settings limiting CPU cores to 1.8/1.4 GHz instead of the 2.0/1.5 GHz we'll use later), and NanoPC-T3+ will be more or less the same as NanoPi Fire3, since it has the same SoC but a different amount (and maybe type) of DRAM. And yeah, XU4 is also interesting.
NicoD Posted July 27, 2018

@tkaiser I'm not getting an output URL with the Khadas. Any idea why? Internet works.

    Below benchmark results:
    Memory performance (on big.LITTLE systems measured individually):
    memcpy: 1846.3 MB/s
    memset: 5735.6 MB/s (1.1%)
    memcpy: 1615.8 MB/s (0.1%)
    memset: 4998.7 MB/s
    7-zip total scores (three runs): 4803,4848,4790
    OpenSSL results (on big.LITTLE systems measured individually):
    type              16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
    aes-128-cbc     177427.08k   474573.65k   793271.13k   986034.86k  1061098.84k
    aes-128-cbc     125519.67k   335571.22k   561038.51k   697529.00k   750712.15k
    aes-192-cbc     165722.43k   409484.03k   633686.95k   754190.34k   798203.90k
    aes-192-cbc     117216.16k   288164.71k   448321.79k   533306.71k   564486.14k
    aes-256-cbc     158712.70k   365550.08k   542333.27k   627987.11k   658262.70k
    aes-256-cbc     112265.98k   259799.94k   383578.03k   444206.42k   465603.24k
    Full results uploaded to . Please check the log for anomalies (e.g. swapping or throttling happenend) and otherwise share this URL.
tkaiser Posted July 27, 2018

12 minutes ago, NicoD said: "I ain't getting an output url with the khadas. Any idea why?"

I've seen temporary failures as well when accessing http://ix.io -- guess I have to rework the upload routine to try more often (currently the upload is tried twice and then given up on).

A simple check would be:

    echo Hello world. | curl -F 'f:1=<-' ix.io
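The 'try more often' idea can be sketched as a small generic retry wrapper. This is not the upload routine of sbc-bench, which is not shown in the thread; the function name `retry`, the attempt count and the delay are assumptions made for illustration.

```shell
#!/bin/sh
# retry ATTEMPTS DELAY_SECONDS COMMAND...: run COMMAND until it succeeds or
# ATTEMPTS runs have failed; returns the exit status of the last run.
# Hypothetical helper, not taken from sbc-bench.
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while :; do
        "$@" && return 0
        status=$?
        [ "$n" -ge "$attempts" ] && return "$status"
        n=$((n + 1))
        sleep "$delay"
    done
}

# Usage: upload a saved log up to 5 times, waiting 10 seconds between tries.
# results.log is a placeholder file name; the data comes from a file rather
# than a pipe because stdin would already be consumed by the first attempt.
#   retry 5 10 curl -sf -F 'f:1=@results.log' ix.io
```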
NicoD Posted July 27, 2018

1 minute ago, tkaiser said: "(currently upload will be tried twice then given up on)"

I did the bench twice, same result. Now the XU4 and the extra cooled and overclocked RPi 3b+ are on the way.
tkaiser Posted July 27, 2018

11 minutes ago, NicoD said: "Now the xu4 and extra cooled and overclocked rpi 3b+ on the way"

I pushed the script to Github. From now on the (persistent) URL is: https://raw.githubusercontent.com/ThomasKaiser/sbc-bench/master/sbc-bench.sh (no need to convert CRLF to LF any more)
NicoD Posted July 27, 2018

26 minutes ago, tkaiser said: "echo Hello world. | curl -F 'f:1=<-' ix.io"

That works on the Khadas.

Rasp 3b+ overclocked and cooled, Raspbian 4.14: http://ix.io/1iwQ