wildcat_paris Posted June 14, 2016 (edited)

I am a little surprised with the result of Phenom2 965 vs. Pine64 vs. XU4

    time sysbench --test=cpu --cpu-max-prime=20000 --num-threads=8 run

Phenom2 965 AMD64 (4 cores) (Ubuntu Trusty kernel 4.2 x86_64 SSE2/3) => 6.8390s

    gr@gr ~ $ time sysbench --test=cpu --cpu-max-prime=20000 --num-threads=8 run
    sysbench 0.4.12: multi-threaded system evaluation benchmark
    Running the test with following options:
    Number of threads: 8
    Doing CPU performance benchmark
    Threads started!
    Done.
    Maximum prime number checked in CPU test: 20000
    Test execution summary:
        total time:                          6.8390s
        total number of events:              10000
        total time taken by event execution: 54.6398
        per-request statistics:
            min:                    2.69ms
            avg:                    5.46ms
            max:                    57.81ms
            approx. 95 percentile:  14.69ms
    Threads fairness:
        events (avg/stddev):         1250.0000/40.40
        execution time (avg/stddev): 6.8300/0.01
    real 0m6.845s
    user 0m26.932s
    sys  0m0.008s
    gr@gr ~ $ uname -a
    Linux gr 4.2.0-37-generic #43~14.04.1-Ubuntu SMP Wed May 18 17:25:51 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
    gr@gr ~ $ cat /proc/version
    Linux version 4.2.0-37-generic (buildd@lgw01-20) (gcc version 4.8.4 (Ubuntu 4.8.4-2ubuntu1~14.04.3) ) #43~14.04.1-Ubuntu SMP Wed May 18 17:25:51 UTC 2016

Pine64 (arm64) 4 cores (Pine64 based - aarch64 NEON) => 7.9824s

    ubuntu@pine64:~$ time sysbench --test=cpu --cpu-max-prime=20000 --num-threads=8 run
    sysbench 0.4.12: multi-threaded system evaluation benchmark
    Running the test with following options:
    Number of threads: 8
    Doing CPU performance benchmark
    Threads started!
    Done.
    Maximum prime number checked in CPU test: 20000
    Test execution summary:
        total time:                          7.9824s
        total number of events:              10000
        total time taken by event execution: 63.7509
        per-request statistics:
            min:                    3.17ms
            avg:                    6.38ms
            max:                    33.21ms
            approx. 95 percentile:  15.68ms
    Threads fairness:
        events (avg/stddev):         1250.0000/6.67
        execution time (avg/stddev): 7.9689/0.01
    real 0m8.002s
    user 0m31.320s
    sys  0m0.000s
    ubuntu@pine64:~$ uname -a
    Linux pine64 3.10.101-0-pine64-longsleep #39 SMP PREEMPT Sat May 7 12:39:25 CEST 2016 aarch64 aarch64 aarch64 GNU/Linux
    ubuntu@pine64:~$ zcat /proc/config.gz | grep NEON
    CONFIG_KERNEL_MODE_NEON=y
    CONFIG_CRYPTO_AES_ARM64_NEON_BLK=y

XU4 (4+4 cores) (Armbian armv7l NEON) => 45.4731s

    gr@odroidxu4:~$ time sysbench --test=cpu --cpu-max-prime=20000 --num-threads=8 run
    sysbench 0.4.12: multi-threaded system evaluation benchmark
    Running the test with following options:
    Number of threads: 8
    Doing CPU performance benchmark
    Threads started!
    Done.
    Maximum prime number checked in CPU test: 20000
    Test execution summary:
        total time:                          45.4731s
        total number of events:              10000
        total time taken by event execution: 363.5303
        per-request statistics:
            min:                    22.21ms
            avg:                    36.35ms
            max:                    246.75ms
            approx. 95 percentile:  55.20ms
    Threads fairness:
        events (avg/stddev):         1250.0000/426.03
        execution time (avg/stddev): 45.4413/0.01
    real 0m45.530s
    user 5m58.405s
    sys  0m0.135s
    gr@odroidxu4:~$ zcat /proc/config.gz | grep NEON
    CONFIG_NEON=y
    gr@odroidxu4:~$ uname -a
    Linux odroidxu4 3.10.101-odroidxu4 #3 SMP PREEMPT Mon May 23 22:45:55 CEST 2016 armv7l armv7l armv7l GNU/Linux

Lamobo-R1 A20 2 cores (Armbian custom - armv7l NEON) => 442.9593s

    [gr@bpi:~] $ time sysbench --test=cpu --cpu-max-prime=20000 --num-threads=8 run
    sysbench 0.4.12: multi-threaded system evaluation benchmark
    Running the test with following options:
    Number of threads: 8
    Doing CPU performance benchmark
    Threads started!
    Done.
    Maximum prime number checked in CPU test: 20000
    Test execution summary:
        total time:                          442.9593s
        total number of events:              10000
        total time taken by event execution: 3542.8610
        per-request statistics:
            min:                    137.58ms
            avg:                    354.29ms
            max:                    1404.13ms
            approx. 95 percentile:  427.10ms
    Threads fairness:
        events (avg/stddev):         1250.0000/17.83
        execution time (avg/stddev): 442.8576/0.04
    real 7m22.989s
    user 13m8.570s
    sys  0m49.780s
    [gr@bpi:~] $ zcat /proc/config.gz | grep NEON
    CONFIG_NEON=y
    CONFIG_KERNEL_MODE_NEON=y
    CONFIG_CRYPTO_SHA1_ARM_NEON=m
    [gr@bpi:~] $ uname -a
    Linux bpi 4.6.2-sunxi #1 SMP Sun Jun 12 21:59:49 CEST 2016 armv7l armv7l armv7l GNU/Linux

any idea where I am wrong, please?

Edited June 14, 2016 by wildcat_paris: updated data with ARCH & NEON
tkaiser Posted June 14, 2016

wildcat_paris wrote: "any idea where I am wrong, please?"

You're using sysbench, which calculates prime numbers: on some CPUs with optimized engines (e.g. NEON/ARMv8), on others without. Sysbench cannot be used to compare different architectures (unless your job is to calculate prime numbers; then this might matter for you, since you get your prime numbers in less time).

Do a simple Google search for: sysbench kitchen sink site:armbian.com
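To put a number on the gap that needs explaining, the "total time" line can be pulled out of each sysbench 0.4.12 report and the runs divided. This is only a minimal sketch: the helper names `total_time` and `ratio` are made up for illustration, and the sample lines are copied from the first post. In line with the point above, the ratio says how fast each build runs sysbench's prime loop, not how the boards compare in general.

```shell
#!/bin/sh
# Hypothetical helpers, not part of sysbench.

# Read sysbench 0.4.12 output on stdin and print the "total time" value
# in seconds, without the trailing "s".
total_time() {
    awk '/total time:/ { sub(/s$/, "", $3); print $3; exit }'
}

# Print a / b with two decimal places.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# Sample lines taken from the XU4 and Pine64 reports in the first post.
xu4=$(printf 'Test execution summary:\n total time: 45.4731s\n total number of events: 10000\n' | total_time)
pine=$(printf ' total time: 7.9824s\n' | total_time)

echo "XU4 / Pine64 run time ratio: $(ratio "$xu4" "$pine")"
```

In practice the function would be fed real output, for example `sysbench --test=cpu --cpu-max-prime=20000 --num-threads=8 run | total_time`.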
Igor Posted June 14, 2016

i7-4790S CPU (4 cores / 8 threads) => 3.2s

This must be at least comparable to AMD

    time sysbench --test=cpu --cpu-max-prime=20000 --num-threads=8 run
    sysbench 0.4.12: multi-threaded system evaluation benchmark
    Running the test with following options:
    Number of threads: 8
    Doing CPU performance benchmark
    Threads started!
    Done.
    Maximum prime number checked in CPU test: 20000
    Test execution summary:
        total time:                          3.1960s
        total number of events:              10000
        total time taken by event execution: 25.5333
        per-request statistics:
            min:                    2.22ms
            avg:                    2.55ms
            max:                    27.11ms
            approx. 95 percentile:  2.52ms
    Threads fairness:
        events (avg/stddev):         1250.0000/51.15
        execution time (avg/stddev): 3.1917/0.00
    real 0m3.198s
    user 0m25.048s
    sys  0m0.000s
kometchtech Posted June 14, 2016

FYI. NanoPC-T3 result.

    time sysbench --test=cpu --cpu-max-prime=20000 --num-threads=8 run
    sysbench 0.4.12: multi-threaded system evaluation benchmark
    Running the test with following options:
    Number of threads: 8
    Doing CPU performance benchmark
    Threads started!
    Done.
    Maximum prime number checked in CPU test: 20000
    Test execution summary:
        total time:                          57.1916s
        total number of events:              10000
        total time taken by event execution: 457.3030
        per-request statistics:
            min:                    45.43ms
            avg:                    45.73ms
            max:                    166.89ms
            approx. 95 percentile:  45.73ms
    Threads fairness:
        events (avg/stddev):         1250.0000/1.00
        execution time (avg/stddev): 57.1629/0.01
    real 0m57.223s
    user 7m36.520s
    sys  0m0.068s
tkaiser Posted June 14, 2016

kometchtech wrote: "FYI. NanoPC-T3 result. execution time (avg/stddev): 57.1629/0.01"

Yes, this one is nice! The result of having an octa-core 64-bit Cortex-A53 design running an old boring 32-bit kernel that makes no use of the ARMv8 instruction set! Just like RPi 3. Or NanoPi M3.

What does the output of the following commands look like?

    uname -a
    cat /proc/version
kometchtech Posted June 14, 2016 (edited)

tkaiser wrote: "Yes, this one is nice! The result of having an octa-core 64-bit Cortex-A53 design running an old boring 32-bit kernel that makes no use of the ARMv8 instruction set! Just like RPi 3. Or NanoPi M3. What does the output of the following commands look like? uname -a / cat /proc/version"

FYI tkaiser

    uname -a
    Linux dns02.kometch.local 3.4.39-s5p6818 #2 SMP PREEMPT Fri May 20 15:51:46 HKT 2016 armv7l GNU/Linux
    cat /proc/version
    Linux version 3.4.39-s5p6818 (root@jensen) (gcc version 4.9.3 (ctng-1.21.0-229g-FA) ) #2 SMP PREEMPT Fri May 20 15:51:46 HKT 2016

Edited June 14, 2016 by wildcat_paris: typo, it is tkaiser, not tkai-tsar :)
wildcat_paris (Author) Posted June 14, 2016 (edited)

Hello Tk,

sysbench kitchen sink site:armbian.com

Neither Google nor DuckDuckGo gives anything interesting.

edit: updated the first post with ARCH & NEON (+Distro)

Edited June 14, 2016 by wildcat_paris: edit ARCH + NEON
wildcat_paris (Author) Posted June 14, 2016

@kometchtech

    uname -a
    Linux dns02.kometch.local 3.4.39-s5p6818 #2 SMP PREEMPT Fri May 20 15:51:46 HKT 2016 armv7l GNU/Linux

    ubuntu@pine64:~$ uname -a
    Linux pine64 3.10.101-0-pine64-longsleep #39 SMP PREEMPT Sat May 7 12:39:25 CEST 2016 aarch64 aarch64 aarch64 GNU/Linux

=> aarch64 vs. armv7l
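The aarch64 vs. armv7l difference is visible in the machine field that `uname -m` prints, so it can be checked before reading anything into a benchmark number. The following is a minimal sketch; the function name `arch_class` and its labels are made up for illustration, and the list of machine strings is deliberately not exhaustive.

```shell
#!/bin/sh
# Hypothetical helper: map a machine string, as printed by `uname -m`,
# to a coarse description of the running kernel.
arch_class() {
    case "$1" in
        aarch64)       echo "64-bit ARM kernel (AArch64)" ;;
        armv7l|armv6l) echo "32-bit ARM kernel" ;;
        x86_64)        echo "64-bit x86 kernel" ;;
        *)             echo "other: $1" ;;
    esac
}

# The two machine strings compared in this thread.
arch_class aarch64
arch_class armv7l
```

On a live system the call would be `arch_class "$(uname -m)"`. A 64-bit capable SoC that reports `armv7l`, like the NanoPC-T3 above, is running a 32-bit kernel.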
kometchtech Posted June 14, 2016

@wildcat_paris

Thanks for the Info! Or better still you recompiled in aarch64... Only official, there is that way.
wildcat_paris (Author) Posted June 14, 2016 (edited)

@kometchtech

kometchtech wrote: "@wildcat_paris Thanks for the Info! Or better still you recompiled in aarch64... Only official, there is that way."

Linux dns02.kometch.local 3.4.39-s5p6818 => armv7l

As you know how to compile the kernel, you may add the aarch64 compiler to see whether the kernel works (or does the recipe only work on aarch32 because of a bootloader limitation???)

well it is Samsung based... they recently moved forward on ARM7 XU4 on kernel 4.6-4.7 beta, Samsung OpenSource can work for their customers (that is good)

edit 1

    ubuntu@pine64:~$ dpkg -l "*gcc*" | grep arm64
    ii gcc                4:5.3.1-1ubuntu1   arm64 GNU C compiler
    ii gcc-5              5.3.1-14ubuntu2.1  arm64 GNU C compiler
    ii gcc-5-base:arm64   5.3.1-14ubuntu2.1  arm64 GCC, the GNU Compiler Collection (base package)
    ii gcc-6-base:arm64   6.0.1-0ubuntu1     arm64 GCC, the GNU Compiler Collection (base package)
    ii libgcc-5-dev:arm64 5.3.1-14ubuntu2.1  arm64 GCC support library (development files)
    ii libgcc1:arm64      1:6.0.1-0ubuntu1   arm64 GCC support library

my VM to build Armbian

    gr@server1404:~$ dpkg -l "*gcc*" | grep arm64
    ii gcc-aarch64-linux-gnu      4:4.8.2-1                            amd64 The GNU C compiler for arm64 architecture
    ii libgcc-4.8-dev-arm64-cross 4.8.4-2ubuntu1~14.04.1cross0.11.2    all   GCC support library (development files)
    ii libgcc1-arm64-cross        1:4.8.4-2ubuntu1~14.04.1cross0.11.2  all   GCC support library
    gr@server1404:~$ dpkg -l "*cpp*" | grep arm64
    ii cpp-aarch64-linux-gnu      4:4.8.2-1                            amd64 The GNU C preprocessor (cpp) for arm64 architecture

Edited June 14, 2016 by wildcat_paris: edit #1
kometchtech Posted June 14, 2016

@wildcat_paris

I do not know how to build for aarch64, because the official Git repository isn't set up for that. The forum is not very active.

edit: https://github.com/friendlyarm/
wildcat_paris (Author) Posted June 14, 2016

@kometchtech

If you manage to install the cross compiler for aarch64: as you are using kernel 3.4.x, gcc 4.8 or 4.9 would probably be appropriate (a newer gcc may give you compilation errors).

    CROSS_COMPILE=aarch64-linux-gnu- ARCH=arm64 make clean defconfig
    CROSS_COMPILE=aarch64-linux-gnu- ARCH=arm64 make -j4 Image

If you cannot find a PPA or a cross compiler toolchain provided by your distro, you can still download the toolchain from Linaro: https://www.linaro.org/downloads/historic/ (older version)

But you may still face an issue if the bootloader cannot load the aarch64 kernel.
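Before running the two `make` lines above, it may help to confirm that the cross compiler named by `CROSS_COMPILE` is actually on the PATH. This is a minimal sketch, assuming the toolchain prefix `aarch64-linux-gnu-` from the post; the helper name `have_tool` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the named program can be found on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Toolchain prefix as used in the make commands above.
CROSS_COMPILE=aarch64-linux-gnu-

if have_tool "${CROSS_COMPILE}gcc"; then
    echo "found ${CROSS_COMPILE}gcc"
else
    echo "${CROSS_COMPILE}gcc is not on PATH; install a distro package or a Linaro toolchain first"
fi
```

The check only shows that a compiler is present. It says nothing about whether the kernel tree or the bootloader can handle an aarch64 build.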
kometchtech Posted June 15, 2016

wildcat_paris wrote: "@kometchtech If you manage to install the cross compiler for aarch64: as you are using kernel 3.4.x, gcc 4.8 or 4.9 would probably be appropriate (a newer gcc may give you compilation errors).
    CROSS_COMPILE=aarch64-linux-gnu- ARCH=arm64 make clean defconfig
    CROSS_COMPILE=aarch64-linux-gnu- ARCH=arm64 make -j4 Image
If you cannot find a PPA or a cross compiler toolchain provided by your distro, you can still download the toolchain from Linaro: https://www.linaro.org/downloads/historic/ (older version) But you may still face an issue if the bootloader cannot load the aarch64 kernel."

Thanks, @wildcat_paris

As shown below, it does not seem to compile.

    # CROSS_COMPILE=aarch64-linux-gnu- ARCH=arm64 make clean defconfig
    Makefile:568: /home/linux-3.4.y/arch/arm64/Makefile: No such file or directory
    make[1]: *** No rule to make target `/home/linux-3.4.y/arch/arm64/Makefile'. Stop.
    make: *** [clean] Error 2
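The error above comes from the kernel's top-level Makefile trying to include `arch/arm64/Makefile`, which this tree does not have. Mainline only gained `arch/arm64` with Linux 3.7, so a 3.4 vendor tree would need a backport to build with `ARCH=arm64`. The check can be done before invoking `make`. This is a minimal sketch; the function name `tree_supports_arch` and the `KERNEL_SRC` variable are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the kernel source tree in $1 ships
# an arch/<name>/Makefile for the architecture named in $2.
tree_supports_arch() {
    [ -f "$1/arch/$2/Makefile" ]
}

# KERNEL_SRC is an assumed variable pointing at the kernel checkout;
# it defaults to the current directory.
KERNEL_SRC=${KERNEL_SRC:-.}

if tree_supports_arch "$KERNEL_SRC" arm64; then
    echo "arch/arm64 present: ARCH=arm64 builds can at least start"
else
    echo "no arch/arm64 in $KERNEL_SRC: ARCH=arm64 will fail as shown above"
fi
```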
kometchtech Posted June 15, 2016

So apparently arm64 cannot be compiled from the following source. Do you have a solution?

https://github.com/friendlyarm/linux-3.4.y
tkaiser Posted June 15, 2016

wildcat_paris wrote: "well it is Samsung based... they recently moved forward on ARM7 XU4 on kernel 4.6-4.7 beta, Samsung OpenSource can work for their customers (that is good)"

Nope, it is NOT Samsung (even if everybody does copy&paste and writes just that). This SoC is from Nexell instead: http://www.nexell.co.kr/chi/pro/pro04.html

A SoC for the Chinese tablet/smartphone market: many cores, low performance, crappy software. So everything Samsung does for their SoCs is absolutely irrelevant for Nexell SoCs, which will remain at a 32-bit 3.4.39 kernel forever (and there is still the claim that this is in reality a 2.6.x kernel: "Specifically, this is a Linux-3.4 kernel that looks more like a Linux-2.6.28 platform port that was forward-ported.")
wildcat_paris (Author) Posted June 15, 2016

@Tk,

tkaiser wrote: "Nope, it is NOT Samsung (even if everybody does copy&paste and writes just that). This SoC is from Nexell instead: http://www.nexell.co.kr/chi/pro/pro04.html A SoC for the Chinese tablet/smartphone market: many cores, low performance, crappy software. So everything Samsung does for their SoCs is absolutely irrelevant for Nexell SoCs, which will remain at a 32-bit 3.4.39 kernel forever (and there is still the claim that this is in reality a 2.6.x kernel: 'Specifically, this is a Linux-3.4 kernel that looks more like a Linux-2.6.28 platform port that was forward-ported.')"

ok, thank you so much. I had already read the comment about the "linux2.6-like porting".

It seems this company (Nexell) just misleads users/sellers by using Samsung-like & NXP-semiconductor-like naming, see http://www.nexell.co.kr/eng/pro/pro03.html

I am wondering why NXP & Samsung are not starting a lawsuit to disband their "CEO" http://www.nexell.co.kr/eng/com/com02.html (unless Nexell is a low-cost venture of NXP & Samsung)
wildcat_paris (Author) Posted June 15, 2016

@kometchtech

kometchtech wrote: "So apparently arm64 cannot be compiled from the following source. Do you have a solution? https://github.com/friendlyarm/linux-3.4.y"

Sorry, I guess there is nothing we can do, as there is no aarch64 architecture in this kernel.

Fortunately I ordered a FriendlyARM NanoPi M1 with the Allwinner SoC, as a cheaper RPi2 (note: I had to ask FriendlyARM to send me the board 2 weeks after purchase; the 10$ shipment is fortunately "registered mail" that I can easily track). Let's see how good the board is.
tkaiser Posted June 15, 2016

wildcat_paris wrote: "the 10$ shipment"

...has been reduced to $5, maybe due to some public poking
wildcat_paris (Author) Posted June 15, 2016 (edited)

tkaiser wrote: "...has been reduced to $5, maybe due to some public poking"

Thanks TK, that may explain the delay.

But I hope that for $5 it is still "registered mail" you can track from 17track/afterpost and also with French Post in real time.

French Post (France-to-France) is so expensive that $10 for international tracking is not that much.

edit: yes, it is now $5 or €4.44

Edited June 15, 2016 by wildcat_paris: $5
tkaiser Posted June 15, 2016

wildcat_paris wrote: "Neither Google nor DuckDuckGo gives anything interesting."

Post #6.

wildcat_paris wrote: "But I hope that for $5 it is still "registered mail" you can track from 17track/afterpost and also with French Post in real time."

What do you plan to do with the M1? Use it with a 5MP camera module? I ask since this is still the only use case I see for buying a NanoPi.

To be honest: apart from the camera module, NanoPi M1 with 1GB DRAM is almost identical to Orange Pi PC, but the latter is faster (due to a better voltage regulator and better overall heat dissipation, which limits throttling) and costs less even when shipping is considered. I still feel I'm overlooking something.

BTW: Regarding French Post, I have had some (pretty bad) experiences. My daughter moved to Montpellier last year and it was always 'adventure time' sending her parcels
tkaiser Posted June 15, 2016

tkaiser wrote: "To be honest: apart from the camera module, NanoPi M1 with 1GB DRAM is almost identical to Orange Pi PC, but the latter is faster (due to a better voltage regulator and better overall heat dissipation, which limits throttling) and costs less even when shipping is considered. I still feel I'm overlooking something."

In the meantime I found a 2nd use case: using NanoPi M1 as an Ethernet dongle doing fancy network stuff (for example, handing it out to others and forcing them to connect to their local network only through the H3 Ethernet dongle, to ensure an uncompromised VPN endpoint).

NanoPi M1 could be powered reliably through a USB computer port, given that a short USB cable with low resistance is used and some settings are adjusted to ensure consumption never exceeds 2.5W, which is possible by limiting CPU clockspeed to 600 MHz, for example (500mA should be provided at 5V --> 2.5W).

Why NanoPi M1 instead of small Oranges? Since the latter would require a little hardware tweak to power on when connected to a USB computer port through their OTG port: http://blog.atx.name/orange-pi-pc-first-impressions/
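The 2.5W figure is just 5V times 500mA. A minimal sketch of that arithmetic in integer units follows; the helper name `power_mw` is made up for illustration. The commented-out line shows one way the 600 MHz cap could be applied through the cpufreq `scaling_max_freq` file, which takes a value in kHz. Whether that path exists and whether 600 MHz keeps a given board under budget depends on the kernel and hardware, so treat it as an assumption to verify.

```shell
#!/bin/sh
# Hypothetical helper: power in milliwatts from millivolts ($1) and
# milliamps ($2), using integer arithmetic only.
power_mw() {
    echo $(( $1 * $2 / 1000 ))
}

# 5 V at 500 mA, the USB port figures quoted in the post.
budget=$(power_mw 5000 500)
echo "USB port budget: ${budget} mW"

# Assumption: cap the CPU clock at 600 MHz via cpufreq (value in kHz).
# Needs root; the sysfs path may differ per kernel, so it is left commented out.
# echo 600000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq
```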
wildcat_paris (Author) Posted June 15, 2016

@Tk

Believe me, you will buy a NanoPi M1 soon

The Ethernet dongle is a nice idea. Btw, I can use the NanoPi M1 to move the Pi-Hole DNS https://pi-hole.net/ off my Lamobo-R1 router (I am waiting for my Turris Omnia)

***

As I wrote, I guess I will be using the M1 as an RPi2 "replacement", maybe with a realtime kernel patch, as the Armbian "lib" tool makes that possible.

Because I have recently learned how to burn a fake Arduino Uno copy. I have replaced my burnt board with 2x Sainsmart UNO and learnt how to avoid burning another "ONE" (projects include a battery charger with LCD and other DIY projects).

But with my avalanche-based multi-module RNG generator + RNG data testing (like ent + dieharder + NIST), I need a cheaper board I can "burn" in case I make a mistake on the GPIO pins. OK, now I have voltage converters, modern Schmitt triggers (low power, 3.3V) and the know-how to avoid burning GPIO pins, but just in case, a spare is always useful.

So NanoPi M1 + Armbian looks like a good choice. RPi2 and RPi3 are *very* expensive boards. Now that Farnell has been bought by a Swiss company, the Raspi prices will probably rise even more.