
tkaiser

Members
  • Posts

    5462
  • Joined

Everything posted by tkaiser

  1. Ok, then to get more reasonable readouts on your system the following will 'fix' this:

     mkdir -p /etc/armbianmonitor/datasources
     ln -s /sys/devices/virtual/thermal/thermal_zone0/temp /etc/armbianmonitor/datasources/soctemp

     The real clockspeeds reported look ok (though I've no idea why) since this time you get 7680 7-zip MIPS while it was 5290 before. On the other hand now your tinymembench numbers were lower in the beginning than last time. So this time your CPU cores were bottlenecked to ~1565 MHz in the beginning and later clockspeeds jumped up to ~2415 MHz. Benchmarking is never easy and that's what the various monitoring stuff in sbc-bench wants to take care of. At least we identified that the cpufreq driver has no control over cpufreq on your Jetson (just like on Raspberries or various Amlogic SoCs). No idea what's responsible for this but most probably it's an MCU running inside the SoC controlled by a firmware BLOB? Maybe another run with creation of /etc/armbianmonitor/datasources/soctemp first and latest sbc-bench gives more clues. Or you might search sysfs for "*volts" entries since maybe cpufreq/DVFS scaling depends on undervoltage too (just like on Raspberry Pi) or something like that?

     BTW: I prepared sbc-bench to run on x86 SBC too in the meantime but due to being too lazy to search for my UP Board I did the changes on a Xeon box I had to check cpufreq governor behaviour anyway. With ondemand and performance governors TurboBoost can be spotted nicely: http://ix.io/1j79
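The two steps above (creating the soctemp data source and searching sysfs for "*volts" entries) can be sketched as small shell helpers. This is only an illustration: the function names link_soctemp and find_volts_nodes are made up here, and both take their paths as parameters so they can be tried against a scratch directory before touching /etc or /sys.

```shell
#!/bin/bash
# Hypothetical helper: point the 'soctemp' data source at a thermal node.
# $1 = temperature node to use, $2 = datasource directory
#      (defaults to the directory named in the post).
link_soctemp() {
    local temp_node="$1"
    local datasource_dir="${2:-/etc/armbianmonitor/datasources}"
    [ -r "${temp_node}" ] || { echo "cannot read ${temp_node}" >&2; return 1; }
    mkdir -p "${datasource_dir}" || return 1
    # -f replaces a stale link, -n keeps ln from following an existing link
    ln -sfn "${temp_node}" "${datasource_dir}/soctemp"
}

# Hypothetical helper: list entries below a root (default /sys) whose
# name ends in "volts", as suggested in the post.
find_volts_nodes() {
    find "${1:-/sys}" -name "*volts" 2>/dev/null
}
```

On the Jetson in question this would be called as `link_soctemp /sys/devices/virtual/thermal/thermal_zone0/temp` (as root), followed by `find_volts_nodes` to see whether any voltage readouts exist at all.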
  2. Aliexpress and FriendlyELEC? Usually not a good idea unless you want to pay more for no reason: https://aliexpress.com/item/NanoPi-K2-Development-Board-Quad-core-Cortex-A53-1-5GHz-WiFi-Bluetooth-USB-Cable-RC100-Remote/32813030887.html vs. https://www.friendlyarm.com/index.php?route=product/product&path=69&product_id=186 Those Amlogic SoCs support ABFC so 'raw' memory bandwidth is not that much of an issue. Another comparison between S905 and S905X: http://www.stane1983.com/index.php/2017/08/18/some-thoughts-on-amlogic-part-2/ In case it's not known already: https://fosdem.org/2018/schedule/event/kodi/attachments/slides/2166/export/events/attachments/kodi/slides/2166/FOSDEM_Presentation_2018___Lukas_Rusak.pdf
  3. Great! Then re-running sbc-bench in latest version or at least this little loop https://github.com/ThomasKaiser/sbc-bench/blob/1ff14ce6b30e3131c22bd7d3461b7ccba843a72f/sbc-bench.sh#L655-L661 would be great and might provide insights why S905 is faster than S905X in some areas.
  4. That's great since the purpose of this test is an overall estimate how performant the relevant board would be when doing 'server stuff'. If you click directly on the 7-zip link here https://github.com/ThomasKaiser/sbc-bench#7-zip then you get an explanation what's going on at the bottom of that page: In other words: as expected. Your use case with Blender is something entirely different and 7-zip scores are useless for this. Primarily for the reason that Blender involves floating point stuff (while 7-zip focuses on integer and low memory latency). It's as always about the use case. If we look closely at the other results we see that S905 for example has an advantage with cpuminer compared to the rather similar A53 SoCs S905X and RK3328 (that perform rather identically with 7-zip for example). Maybe the root cause for cpuminer's better scores will also be responsible for better Blender results on S905 compared to other A53 SoCs? It needs a different benchmark and a lot of cross-testing with the real application to get an idea how to reliably test for your use case.
  5. If this increases speed I would strongly suggest looking into @talraash's hint wrt round-trip times (do a ping test to the apt server in question and so on)
  6. Huh? Why posting links to smelly pages found somewhere on the Internet? The RK 'open source SoCs' have a home: http://opensource.rock-chips.com/wiki_Main_Page
  7. The numbers above (mhz, 7-zip) already suggested the Jetson does 'hidden' throttling at around ~1565 MHz. The openssl benchmark might also be useful to compare results and to determine clockspeeds. First row is from an ODROID-XU4 (A15 cores at 2000 MHz), 2nd from your Jetson:

     type          16 bytes   64 bytes   256 bytes  1024 bytes  8192 bytes  16384 bytes
     ODROID-XU4    59359.98k  66782.42k  70469.97k  71398.06k   71740.07k   71527.08k
     Jetson TK1    40004.73k  50301.27k  54554.88k  55691.61k   56008.70k   55973.21k

     If we use XU4 numbers as base (1998 MHz) and do some simple math we may determine real clockspeed of the NVIDIA:

     1998 / (59359.98 / 40004.73) = 1347
     1998 / (66782.42 / 50301.27) = 1505
     1998 / (70469.97 / 54554.88) = 1547
     1998 / (71398.06 / 55691.61) = 1558
     1998 / (71740.07 / 56008.70) = 1560
     1998 / (71527.08 / 55973.21) = 1563

     With less initialization overhead == larger data chunks (1K or above) this calculation approach seems to work fairly well (due to openssl benchmark not relying on memory bandwidth so 2 different A15 cores could be compared directly). Your A15 cores being limited to ~1565 MHz under load seems to be very plausible.
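The 'simple math' in the post above can be repeated for any pair of boards with the same core type. A minimal sketch, assuming awk is available; estimate_mhz is a name invented for this example, and the scores are passed without the trailing 'k':

```shell
#!/bin/bash
# estimate_mhz REF_MHZ REF_SCORE DUT_SCORE
# Scales the reference clockspeed by the ratio of the two openssl scores,
# i.e. REF_MHZ / (REF_SCORE / DUT_SCORE), rounded to whole MHz.
estimate_mhz() {
    awk -v ref_mhz="$1" -v ref="$2" -v dut="$3" \
        'BEGIN { printf "%.0f\n", ref_mhz / (ref / dut) }'
}

# XU4 score (reference, 1998 MHz) and Jetson score, one block size per line
while read -r ref_score dut_score; do
    estimate_mhz 1998 "${ref_score}" "${dut_score}"
done <<'EOF'
59359.98 40004.73
66782.42 50301.27
70469.97 54554.88
71398.06 55691.61
71740.07 56008.70
71527.08 55973.21
EOF
```

Depending on how the last digit is rounded, individual results may differ by 1 MHz from the figures quoted in the post. The approach only makes sense when both boards use the same core type and the benchmark does not depend on memory bandwidth, as noted there.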
  8. Thanks for the suggestion. Just did a test on the slowest device I've around (single Cortex-A8 at 912 MHz). Before:

     Cpufreq OPP: 240 Measured: 113.205/238.015/271.572
     Cpufreq OPP: 624 Measured: 622.264/620.388/622.821
     Cpufreq OPP: 864 Measured: 859.927/863.043/861.208
     Cpufreq OPP: 912 Measured: 910.931/870.387/432.613

     And after (now with 100000):

     Cpufreq OPP: 240 Measured: 216.979/237.445/237.738
     Cpufreq OPP: 624 Measured: 584.162/285.411/622.403
     Cpufreq OPP: 864 Measured: 862.854/825.796/862.574
     Cpufreq OPP: 912 Measured: 908.568/910.098/869.719

     Ok, benchmarking such a slow system where background activity is pretty high all the time due to just a single CPU core is rather pointless. But I keep the 100000 for now and change the OPP checking routine after the most demanding benchmark has been running to decline from highest to lowest OPP to hopefully spot hidden throttling as much as possible (like on the Jetson, Amlogic S905X/S912 and of course the Raspberry Pi).

     Providing a separate sbc-diag is a good idea. This tool could also focus on testing for anomalies wrt cpufreq scaling (sbc-bench immediately switches to performance governor to prevent results being harmed by strange cpufreq scaling behaviour so not suited to give the answer 'why behaves my system so slow?')
  9. The discussion how to proceed with the various upstream kernel branches for RK devices started and unfortunately also ended here: https://forum.armbian.com/topic/7498-nanopc-t4/ (no idea why there are zero responses to @ayufan's ideas) Same here: https://forum.armbian.com/topic/7661-one-bsp-kernel-f-rk3399rk3328-and-rk3288/?do=findComment&comment=58352
  10. Impossible to answer if you don't tell your use case. https://github.com/ThomasKaiser/sbc-bench/blob/master/Results.md
     • If the RK3288 claims to run at 1.8 GHz it's less than 1730 MHz in reality and until recently all Linux OS images for RK3328 were limited to 1.3 GHz. We changed this just recently in Armbian (nightlies) and enabled the '1.4 GHz OPP' (1380 MHz in reality) by default while with ayufan images you need to enable higher cpufreq OPP yourself. So usually it's 1726 vs. 1286 MHz which doesn't matter that much since as you pointed out the RK3288 uses high-end ARM cores while the RK3328 relies on slow A53
     • single-threaded peak performance of the RK3288 at '1.8 GHz' is ~1550 7-zip MIPS while RK3328 scores ~1000 at '1.4 GHz'
     • sustained CPU performance with the RK3288 without huge heatsink (or fan) will pretty fast drop down to RK3328 levels or below. The RK3288 generates way more heat
     • it's always about 'use case first' -- for a 'Desktop Linux' totally different performance metrics are important compared to the 'NAS use case' or when the board should serve as a VPN endpoint.
  11. In this case distro version doesn't matter since cpuminer test fails on 32-bit platforms anyway (and here libs and GCC version would've been important). Buster still uses 7-zip v16.02 so numbers are comparable somewhat.

     Well, sbc-bench is using @wtarreau's nice mhz tool to calculate real clockspeeds and I hope I use it correctly. It seems CPU clockspeeds are controlled by some firmware in reality and the cpufreq driver reports nonsense since with an idle system we see measured clockspeeds being much higher just to be limited to ~1565 MHz while running the 7-zip benchmark:

     Cpufreq OPP: 204 Measured: 224.991/224.846/224.977
     Cpufreq OPP: 306 Measured: 334.677/334.810/334.586
     Cpufreq OPP: 408 Measured: 444.497/444.497/443.872
     Cpufreq OPP: 510 Measured: 554.177/554.061/554.038
     Cpufreq OPP: 612 Measured: 671.761/670.167/670.167
     Cpufreq OPP: 714 Measured: 779.803/780.207/779.591
     Cpufreq OPP: 816 Measured: 889.699/889.881/889.642
     Cpufreq OPP: 918 Measured: 1003.636/1006.018/1005.933
     Cpufreq OPP: 1020 Measured: 1116.065/1115.752/1115.655
     Cpufreq OPP: 1122 Measured: 1225.857/1225.945/1225.916
     Cpufreq OPP: 1224 Measured: 1335.698/1333.532/1335.864
     Cpufreq OPP: 1326 Measured: 1451.911/1452.270/1451.944
     Cpufreq OPP: 1428 Measured: 1562.109/1561.977/1562.147
     Cpufreq OPP: 1530 Measured: 1562.373/1562.241/1558.959
     Cpufreq OPP: 1632 Measured: 1561.882/1561.769/1561.769
     Cpufreq OPP: 1734 Measured: 1561.731/1561.580/1561.901
     Cpufreq OPP: 1836 Measured: 1561.939/1561.901/1561.920
     Cpufreq OPP: 1938 Measured: 1562.241/1562.090/1561.693
     Cpufreq OPP: 2014 Measured: 1561.750/1562.165/1561.693
     Cpufreq OPP: 2116 Measured: 1561.807/1561.977/1561.825
     Cpufreq OPP: 2218 Measured: 1561.825/1559.053/1561.637
     Cpufreq OPP: 2320 Measured: 1561.825/1561.958/1561.618

     7-zip contains its own measuring routine and seems to agree:

     CPU Freq: 1425 1524 1557 1558 1557 1558 1558 1557 1557
     CPU Freq: 1524 1540 1558 1558 1557 1558 1557 1558 1558
     CPU Freq: 1557 1560 1560 1560 1560 1559 1560 1559 1559
     CPU Freq: 1562 1562 1563 1563 1563 1563 1563 1563 1563

     As a reference Tinkerboard (quad-core A17 in RK3288) scores 5350 7-zip MIPS at 1730 MHz while your Jetson scores less: 5290. Memory bandwidth is much higher on the Jetson so the 1565 MHz starts to look plausible.

     I've sysbench numbers from an ODROID-XU4 here made only on the A15-cluster. 'sysbench --test=cpu --cpu-max-prime=20000 run --num-threads=4' took 62 seconds but that was on Debian Jessie (GCC 4.7) so numbers on Buster (GCC 8.1) will be completely different.

     Are the reported temperatures real? ~33°C seems way too low and most probably the sysfs node for CPU temperature is a different one than the one my script used. Can you provide the output of the following please?

     find /sys -name "thermal_zone*" | while read ; do
         echo "${REPLY}: $(cat ${REPLY}/type) $(cat ${REPLY}/temp)"
     done
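Spotting the point where measured clockspeeds stop following the cpufreq OPP is easy by eye in the list above but can also be automated. The following is a rough sketch, not part of sbc-bench: flag_opp_mismatch is a hypothetical name, the 5% default tolerance is an arbitrary assumption, and the input format is the 'Cpufreq OPP: ... Measured: a/b/c' layout shown in the post.

```shell
#!/bin/bash
# flag_opp_mismatch [TOLERANCE_PERCENT] < report
# For every 'Cpufreq OPP: N Measured: a/b/c' line, average the measured
# values and print: OPP, average, deviation in percent, and a verdict.
flag_opp_mismatch() {
    awk -v tol="${1:-5}" '
        $1 == "Cpufreq" && $2 == "OPP:" && $4 == "Measured:" {
            opp = $3
            n = split($5, m, "/")
            sum = 0
            for (i = 1; i <= n; i++) sum += m[i]
            avg = sum / n
            dev = (avg - opp) / opp * 100
            verdict = (dev > tol || dev < -tol) ? "MISMATCH" : "ok"
            printf "%d %.0f %+.1f%% %s\n", opp, avg, dev, verdict
        }'
}

# Two lines taken from the list above
flag_opp_mismatch <<'EOF'
Cpufreq OPP: 1428 Measured: 1562.109/1561.977/1562.147
Cpufreq OPP: 2320 Measured: 1561.825/1561.958/1561.618
EOF
```

A single tolerance is a crude criterion: on this Jetson even the low OPPs measure roughly 10% above the nominal value, so the interesting signal is the sign change of the deviation rather than any single flagged line.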
  12. One less and it's even MIPS instead of ARM
  13. Same PCB to provide a HC1 and a HC2. Even with support for NBASE-T (2.5GbE): The remaining PCIe lane exposed as mPCIe allows users to add 2 or 4 more SATA ports and if such a thing would be priced competitively it would be very interesting. @frank-w: Neither USB3 nor SATA are PCIe attached here.
  14. Armbian provides desktop images to allow users to directly download a larger OS image instead of installing this manually since the reality out there is users not being aware what's important with SD cards. They look at totally irrelevant sequential transfer speeds and buy crappy cards by looking at 'reputable brands' that are none. Again: https://forum.armbian.com/topic/954-sd-card-performance/ User reality out there is that crappy flash storage lowers performance of every apt operation (eMMC users usually aren't affected until their eMMC dies and they need to switch to SD card or replace the board). Installing tons of packages on those crappy cards might result in them dying (same with dist-upgrades and such stuff). We can't do that much against this problem other than still trying to educate users about what's important with their rootfs (RANDOM IO performance and choosing good cards). Unless this problem is resolved it's pretty pointless to think about improving download speeds for large package installation orgies since the average user suffers from a different problem. Armbian lacks developers and the TODO list is longer than you might imagine. So I bet the only chance that something will change here is you starting to develop an easy way to let apt-fast do the work in a transparent way and then submit a PR after testing all distro variants Armbian currently supports (no idea whether it can be used as alias, won't look into it since irrelevant).
  15. Important information somewhere hidden in random forum post: http://forum.banana-pi.org/t/banana-pi-bpi-r64-open-source-router-with-mtk-mt7622-64-bit-chip-design/6347/9 In other words: There's only USB2 on the mPCIe connector but no PCIe (with the R2 they did it the other way around).
  16. When doing apt on SBC usually the limiting factor is random IO performance of the SD card in question. Apt uses tons of sync calls to ensure stuff on 'disk' is committed correctly (so package handling doesn't break) and with 'average SD cards' this is what's responsible for poor install and update performance. I've been testing this more or less by accident for a few days since I got some crappy 4 GB cards back from a customer (replaced with good and amazingly fast SanDisk A1 Ultra) and want to destroy them by using them (to see how they fail). An 'apt upgrade' on an image that is a bit outdated takes ages compared to doing the same stuff on a fast A1 card or eMMC (my download connection is 100 Mbits/sec so this is not the bottleneck). Adding to this: Armbian's philosophy is a bit different to other distros. We try to educate our users about such stuff like SD card characteristics so they can either buy better products or learn that sequential writes are something entirely different than random writes. In other words: if you run off an average SD card and then want to 'install the desktop environment' this will take ages due to your SD card and might even wear the SD card out multiple times faster compared to sequentially burning an Armbian desktop image to the same card in the first place (we provide desktop images for a reason). TL;DR: I don't think the problem is limited download speeds when talking about slow apt processing.
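To get a feeling for how much the sync calls mentioned above cost on a given card, small synchronous writes can be timed. This is only a rough proxy and an assumption of this example, not what apt does internally and not a real random IO benchmark (a tool such as iozone or fio is better suited for that). sync_write_test is a hypothetical name, and oflag=dsync requires GNU dd.

```shell
#!/bin/bash
# sync_write_test [DIR] [COUNT]
# Writes COUNT blocks of 4 KiB into a scratch file below DIR, forcing
# each block to be committed before the next one is written, then prints
# the summary line from dd (bytes copied, elapsed time, throughput).
sync_write_test() {
    local dir="${1:-.}" count="${2:-256}"
    local file="${dir}/.syncwrite.$$"
    # oflag=dsync: synchronized IO for data, so every block waits for the medium
    dd if=/dev/zero of="${file}" bs=4k count="${count}" oflag=dsync 2>&1 | tail -n 1
    rm -f "${file}"
}
```

Running `sync_write_test /root` once on the SD card and once on a tmpfs or eMMC gives a quick comparison. Keep COUNT small on cards that are already suspect since every run adds write wear.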
  17. For now I added just a warning: https://github.com/ThomasKaiser/sbc-bench/blob/master/Results.md -- I'll add a TODO section soon where I'll try to explain how to deal with 'numbers generated' vs. 'insights generated': coming up with other benchmarks that more properly describe real-world use cases. Wrt the crypto stuff: Most probably using cryptsetup and then doing also some real-world tasks that can be measured (involving a ton of other dependencies like filesystem performance and so on)
  18. Thank you. Unfortunately our DT clocks little cores just with 1400 MHz and some minor throttling happened:

     2000 MHz: 2083.27 sec
     1900 MHz: 25.23 sec
     1800 MHz: 7.58 sec

     Anyway, numbers are usable. Will add them with next Results.md update.
  19. In the meantime I built GCC 8.2 on Stretch on the NanoPC T4. Cpuminer now scores 10.27 kH/s on Debian Stretch when built with GCC 8.2 vs. 8.24 kH/s when built with Stretch's GCC 6.3. This is the stuff I want to outline: how cheap it can be to get better performance by simply caring about software. To build a new GCC version on the machine I followed this recipe (last line important since cpuminer needs to be rebuilt afterwards):

     GCCVer="7.3.0" # replace with "8.2.0" when wanting to use this version
     cd /usr/local/src
     wget https://ftp.gnu.org/gnu/gcc/gcc-${GCCVer}/gcc-${GCCVer}.tar.xz
     tar xf gcc-${GCCVer}.tar.xz && rm gcc-${GCCVer}.tar.xz
     mkdir build
     cd gcc-${GCCVer}
     ./contrib/download_prerequisites
     cd ../build
     ../gcc-${GCCVer}/configure
     make -j $(grep -c '^processor' /proc/cpuinfo)
     make install
     echo "/usr/local/lib64" >/etc/ld.so.conf.d/usrLocalLib64.conf
     ldconfig
     gcc --version
     [ -d /usr/local/src/cpuminer-multi ] && rm -rf /usr/local/src/cpuminer-multi

     (I did the first steps always on a RK3399 board, and then transferred the build directory to a RK3328 board and executed the final steps starting with 'make install' there -- twice as fast)
  20. Well, identifying such stuff is IMO part of the journey. One of my goals is to collect several numbers for the same hardware to be able to educate users that they never 'benchmark the hardware' but all the time also 'software and settings'. One side effect of such projects is that we could walk through all the kernels we maintain to consolidate crypto options then being able to demonstrate how important such stuff is (by re-running sbc-bench and listing one board multiple times). I just checked the output for OpenSSL versions:

     Ubuntu Xenial: OpenSSL 1.0.2g 1 Mar 2016
     Debian Stretch: OpenSSL (version 1.1.0f, built on 25 May 2017)
     Ubuntu Bionic: OpenSSL (version 1.1.0g, built on 2 Nov 2017)

     If ARMv8 Crypto Extensions are available funnily openssl numbers with smaller data chunks are higher with 1.0.2g than 1.1.0g (see NanoPC T3+ numbers or those for Vim2). Did you already check whether all our kernels have the respective AF_ALG and NEON switches enabled?

     Yes, can be observed with armhf userland. I didn't care that much since at least with Cortex-A7 cpuminer/NEON performance is that low that it's close to misleading to list numbers.

     Wrt precompiled binaries... I'm currently building GCC 7.3 on Stretch just to let sbc-bench run again this time with GCC 7.3 building cpuminer binary. Since currently we've for whatever reasons way better cpuminer scores when running with Bionic and I want to see whether it's maybe just GCC version difference (6.3 vs. 7.3)
  21. Hmm... not really needed since @zador.blood.stained already tested with exactly same image (I was curious whether openssl distro package can make use of the crypto engine with Hardkernel's image -- but to no avail). If XU4 again then our https://dl.armbian.com/odroidxu4/Debian_stretch_next.7z would be interesting (but most probably exactly same numbers as already collected). Vim2 would be interesting with Debian Stretch and 4.9 kernel (no idea whether that's available somewhere). Do you also have the S905X Vim? Testing 32-bit boards with A7 cores IMO isn't needed. Maybe the OPi Plus 2 with mainline kernel...
  22. Only way is via USB and FEL mode: https://github.com/zador-blood-stained/fel-mass-storage
  23. That's filesystem corruption. Just restoring one file might not be sufficient (and your UUID used anywhere else will prevent booting anyway). I would strongly recommend doing the following ASAP:
     • a filesystem check (fsck)
     • verify package integrity (sudo armbianmonitor -v)
  24. Well, given you have no network right now it will be quite hard trying to compile a driver for the RTL8188. Otherwise next step would be https://docs.armbian.com/User-Guide_Advanced-Features/#how-to-build-a-wireless-driver My personal choice in such situations is an RTL8153 based GbE dongle since IMO one can make use of such a thing many times and it performs on USB3 ports not that much different compared to 'native' Gigabit Ethernet. On USB2 ports performance is still good.