tkaiser

Members
  • Posts

    5462
  • Joined

Everything posted by tkaiser

  1. Well, with flash storage we have an additional layer where things can go wrong. Filesystems use allocation tables to map 'entities' (blocks) to file contents (or vice versa). If such a pointer is written wrongly, blocks from somewhere else are mysteriously mapped into a file. 'Mysteriously' can be related to bit flips happening in memory, one of the many reasons to want devices with ECC RAM support and also filesystems providing data integrity like ZFS, btrfs, ReFS or even Apple's APFS (the latter at least providing filesystem metadata integrity, so stuff like this can't happen). On flash storage the underlying 'entities' (pages) are fully abstracted from the view from above. There's the FTL (flash translation layer) in between, dynamically mapping between the representation to the host as a block device and the internal representation as pages. And if something goes wrong here the same happens: some data (pages this time) from somewhere is now mapped to somewhere else. Personally I think 'bit flip in RAM' is most probably the culprit (so no SD card to blame this time). @chwe: IMO this subthread is worth being split off to the common issues subforum.
  2. So now a quick overview of the relevant SoC families with Armbian. Results in 7-zip MIPS:
       • A20 @ 912 MHz: ~900
       • A64 @ 1152 MHz: ~2550
       • H3 @ 1100 MHz: ~2100
       • H5 @ 816 MHz: ~2200
       • H6 @ 912 MHz: ~2550
       • i.MX6 @ 1000 MHz: ~2100
       • RK3288 @ 1800 MHz: ~5400
       • RK3328 @ 1392 MHz: ~3600
       • S5P6818 @ 1400 MHz: ~7200
       • S905 @ 1500 MHz: ~3700
     (for A20, H3, H5, H6, RK3328 and S5P6818 results see above, for A64 see here and there, for i.MX6 see here (search for my Wandboard), for RK3288 see MiQi results here and for S905 see ODROID-C2 numbers there)
     All the SoCs above are quad-core except A20 (dual) and S5P6818 (octa). And it's all about the type of CPU cores: A20 and H3 are Cortex-A7, A64/H5/H6/RK3328/S5P6818/S905 are Cortex-A53, i.MX6 is Cortex-A9 and RK3288 is Cortex-A17. So let's look at all the results and take core count and clockspeed into account. The following is a table of 7-zip MIPS per single core at 1 GHz clockspeed:
       • A7: 475
       • A9: 525
       • A53: 625
       • A15: 700
       • A17: 750
       • A72: 850
     (yeah, A15 and A72 also exist -- see below). So that's roughly what you can expect from each individual Cortex core running at 1 GHz. As expected, if a SoC contains more cores, specific workloads that benefit from parallel code execution get faster (once again: most workloads are single-threaded!). Also as expected clockspeeds matter: if you buy an H3 or H5 board without voltage regulation, which limits the maximum clockspeed, then obviously this board will perform slower compared to another H3/H5 board with sophisticated voltage regulation allowing the CPU cores to clock much higher. What also matters with this benchmark and most if not all real world workloads: memory bandwidth. Boards with just a 16-bit memory interface are slower than those with 32- or even 64-bit memory interfaces (something that the incapable sysbench pseudo cpu test is not able to report since the whole execution happens inside the CPU cores). 
Boards that use 'better' DRAM (DDR4 vs DDR3) can be faster as long as appropriate software/settings are available (and that's often not the case -- for example we're still waiting for Rockchip releasing new BLOBs with faster DRAM initialization for (L)PDDR4 equipped Rockchip boards). Speaking about software/settings it should also be obvious that in the meantime we always also have to take care of heat dissipation of the ARM SoCs used today. Heat has to be dissipated to prevent the SoCs from being damaged by overheating under load. But without fully functioning cpufreq/dvfs/thermal drivers we can not allow the CPU cores to clock at their upper limits since we need working throttling to protect the chips. And that's why the results for Allwinner H5 and H6 boards look that bad: the linux-sunxi community is still working on upstreaming driver support and/or we at Armbian have not incorporated the latest patches flying around into the build system. Once cpufreq/dvfs/thermal is ready for those newer Allwinner SoCs, H5 boards will get 1.5 times as fast and H6 boards almost twice as fast as today. Software/settings matter. Always. That's why it's so disappointing to see all those benchmark numbers flying around not taking this into account. What about Cortex-A15 and A72? When we look at boards Armbian supports we find those cores in SoCs implementing big.LITTLE: ODROID XU4/HC1/HC2 use Exynos 5422 which consists of 4 fast A15 and 4 slow A7 cores (32-bit ARMv7). Boards based on RK3399 have 2 fast A72 cores and 4 slow A53 cores (64-bit ARMv8). Does it make sense to run the very same 7-zip benchmark on those big.LITTLE designs? Not that much since we can not easily draw any conclusion for normal workloads from such a benchmark number. For example when executing '7z b' on all 6 cores of an ODROID-N1 at the same time we get an overall score of ~6550 7-zip MIPS. 
When limiting benchmark execution to only the 2 fast A72 cores at 2 GHz we get ~3350 (that's ~1700 7-zip MIPS per core), when we execute the benchmark on the 4 little cores only (1.5GHz) then it's ~3900 (~975 7-zip MIPS per core). A single threaded task running on one of the two big cores will perform almost twice as fast compared to running on a little core. That's important to keep in mind since based on the workload running in reality some of the benchmark numbers are simply misleading or just... numbers without meaning. Same with ODROID-XU4/HC1/HC2: when running on the A15 big cores ~4950 7-zip MIPS are reported at around 1.8GHz (~1250 7-zip MIPS per big core), when running only on the little A7 cores at 1.4 GHz it's ~2725 (~675 per core). Same situation as with RK3399: single threaded stuff moved to the big cores performs almost twice as fast as on the little cores. I never measured 7-zip running on all cores together since 'numbers without meaning' but I would assume we get something similar as with RK3399: not the addition of big+little numbers (3350+3900=7250 vs. 6550 in reality) but something lower since all cores have to fight for memory bandwidth. Other things to keep in mind: When looking at the above benchmark numbers we see A53 cores performing with this specific benchmark 30% better compared to A7 cores at the same clockspeed (so there's a slight advantage ARMv8/64-bit has over ARMv7/32-bit). But as soon as we use other software/benchmarks that make heavy use of NEON optimizations we usually see a performance increase much higher (A53 usually performing twice as fast as A7 -- can be easily checked with cpuminer). So as always it depends on the use case. Speaking of 'use case' we should also keep in mind that all those ARM SoCs have special engines for this and that. 
Almost all ARMv8/64-bit SoCs for example contain a cryptographic acceleration engine called 'ARMv8 Crypto Extensions' that makes a massive difference with AES for example compared to 32-bit/ARMv7 SoCs that have to do crypto stuff on the CPU cores (see here for numbers). So again: it's about the use case. If you're interested in VPN stuff or disk encryption, looking at generic CPU benchmarks is BS since you want an ARMv8 SoC with crypto support (almost all have it, the only exceptions are RPi 3/3+, ODROID-C2 and NanoPi K2). CPU performance isn't that important for many use cases. With Marvell based boards (EspressoBin, Clearfogs, Helios4) CPU benchmarks look rather low but these SoCs are designed for highest I/O and networking throughput, and even if the SoC in question scores low in CPU benchmarks those boards outperform everything else if it's about fast storage and network.
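The 'per single core at 1 GHz' figures above can be roughly re-derived from the raw scores: divide the overall score by the core count and by the clockspeed in GHz. A minimal sketch, assuming plain linear scaling with cores and clock (which ignores memory bandwidth and throttling); the helper name mips_per_core_ghz is made up for illustration and the input numbers are the ones quoted in the post:

```shell
#!/bin/sh
# Hypothetical helper: normalize a multi-threaded 7-zip score to
# "MIPS per core at 1 GHz", assuming linear scaling with cores and clock.
# Usage: mips_per_core_ghz TOTAL_MIPS CORES MHZ
mips_per_core_ghz() {
    awk -v mips="$1" -v cores="$2" -v mhz="$3" \
        'BEGIN { printf "%.0f\n", mips / cores / (mhz / 1000) }'
}

# SoC, overall 7-zip MIPS, core count, clockspeed in MHz (numbers from the post)
while read -r soc mips cores mhz; do
    printf '%-8s %s\n' "$soc" "$(mips_per_core_ghz "$mips" "$cores" "$mhz")"
done <<EOF
A20 900 2 912
H3 2100 4 1100
i.MX6 2100 4 1000
RK3288 5400 4 1800
S905 3700 4 1500
EOF
```

With these inputs the H3 lands at 477 and the RK3288 at 750, close to the A7 and A17 rows of the table. Boards running at capped clockspeeds will not fit as neatly, which is one more reason to treat the per-core table as a rough estimate.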
  3. Olimex Lime2 (dual core Allwinner A20) capped to 912 MHz with slight background activity: NanoPi K1 Plus (quad core Allwinner H5) capped to 816 MHz due to missing cpufreq/dvfs/thermal support in mainline kernel: NanoPi NEO (quad core Allwinner H3) tested again this time with 1.1 GHz:
  4. NanoPi Fire3 with Samsung/Nexell S5P6818 which is an octa-core A53 clocked by default at 1.4 GHz (results irrelevant): Why are results irrelevant? For two reasons:
       • Throttling occurred. The board with the vendor's standard heatsink starts to overheat badly when running demanding loads. Active cooling is needed and that's why monitoring when running benchmarks is that important!
       • This is an octa-core Cortex-A53 SoC showing a score of well above 7000 with this benchmark when no throttling happens. Once again: such multithreaded results are BS wrt most real world workloads. An RK3399 board like an ODROID-N1 scoring 'just' 6500 will be the faster performing board with almost all usual workloads since it's equipped with 2 fast A72 cores while the Fire3 only has 8 slow A53 cores. Most workloads do not scale linearly with the count of CPU cores. This has to be taken into account.
  5. PineH64 with Allwinner H6 using mainline kernel which has no working cpufreq/dvfs/thermal drivers, therefore clocking H6 at 912 MHz (results irrelevant): PineH64 scores close to 2600 which, as said already, is irrelevant for the SoC's performance since with the current mainline kernel the SoC runs at a fixed clockspeed (reported by 7-zip as ~910 MHz -- it's 912 MHz). We know that the chip can clock up to 1.8 GHz (confirmed by @wtarreau's cool mhz tool) so once cpufreq/dvfs is working we should see almost twice these 2600. Why only 'almost twice'? Since the 7-zip benchmark also depends on memory bandwidth (like any real workload -- that's just another reason why sysbench sucks as a CPU benchmark since sysbench does not depend on memory bandwidth at all). Those numbers are made with 4 tasks stressing all 4 CPU cores in parallel. Most real world workloads look different and are single threaded. So even if an H6 clocked at 1.8 GHz produces 7-zip scores above 5000, any big.LITTLE ARM SoC with a similar score will be way faster in reality since single threaded loads running on the big cores perform much faster.
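The 'almost twice' estimate can be made explicit. A minimal sketch, assuming the score scales at best linearly with clockspeed (real results stay below that since memory bandwidth does not scale with the CPU clock); scaled_score is a hypothetical helper name, the numbers are the ones from the post:

```shell
#!/bin/sh
# Hypothetical helper: upper bound for a 7-zip score after a clockspeed
# change, assuming perfectly linear scaling (integer arithmetic, rounds down).
# Usage: scaled_score SCORE FROM_MHZ TO_MHZ
scaled_score() {
    echo $(( $1 * $3 / $2 ))
}

# PineH64: ~2600 at 912 MHz, projected to 1800 MHz
scaled_score 2600 912 1800   # prints 5131
```

1800/912 is a factor of roughly 1.97, so about 5100 is the ceiling under this assumption and anything measured later should be read against that.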
  6. With this commit I added 7-zip benchmark reporting to Armbian now. It will be available after the next updates and with the next batch of new images. Why not just recommend an 'apt install p7zip ; 7zr b'? Since 'fire and forget' benchmarking is always BS. You need some monitoring in parallel to know whether your system was really idle and at which clockspeeds the CPU cores were operating (throttling occurring or not?). Most recent 7-zip contains its own routine to 'pre-heat' the system prior to starting the benchmark (to let cpufreq scaling switch from low clockspeeds to highest ones and e.g. on Intel systems let the system enter TurboBoost modes). This 7-zip code runs single threaded, so depending on the kernel's scheduler it sometimes ends up on the 'wrong' CPU core (e.g. a little core on big.LITTLE SoCs). On a NanoPC T4 with conservative settings (limiting big CPU cores to 1.8 GHz and little cores to 1.4 GHz) this looks like this:

     root@nanopct4:/home/tk# armbianmonitor -z
     Preparing benchmark. Be patient please...

     7-Zip (a) [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
     p7zip Version 16.02 (locale=en_US.UTF-8,Utf16=on,HugeFiles=on,64 bits,6 CPUs LE)

     LE
     CPU Freq:  1413  1414  1414  1411  1413  1414  1414  1414  1415

     RAM size:    3878 MB,  # CPU hardware threads:   6
     RAM usage:   1323 MB,  # Benchmark threads:      6

                            Compressing  |                  Decompressing
     Dict     Speed Usage    R/U Rating  |      Speed Usage    R/U Rating
              KiB/s     %   MIPS   MIPS  |      KiB/s     %   MIPS   MIPS

     22:       3642   363    976   3543  |      98020   543   1540   8359
     23:       3691   365   1030   3761  |      95217   541   1522   8239
     24:       3606   354   1094   3878  |      92662   535   1520   8133
     25:       4597   451   1164   5249  |      89079   529   1498   7928
     ----------------------------------  | ------------------------------
     Avr:             383   1066   4108  |              537   1520   8165
     Tot:             460   1293   6136

     Monitoring output recorded while running the benchmark:

     Time       big.LITTLE   load %cpu %sys %usr %nice %io %irq   CPU  C.St.
     10:16:19: 1800/1416MHz  0.12  12%   1%   7%   2%   1%   0% 44.4°C  0/5
     10:16:25:  408/ 600MHz  0.11   0%   0%   0%   0%   0%   0% 43.9°C  0/5
     10:16:30:  600/1416MHz  0.10   1%   0%   0%   0%   0%   0% 45.0°C  0/5
     10:16:35: 1800/1416MHz  0.17  40%   0%  39%   0%   0%   0% 49.4°C  0/5
     10:16:40: 1800/1416MHz  0.32  77%   0%  77%   0%   0%   0% 55.0°C  0/5
     10:16:45: 1800/1416MHz  0.94  73%   0%  72%   0%   0%   0% 51.1°C  0/5
     10:16:50: 1800/1416MHz  0.94  65%   0%  65%   0%   0%   0% 53.3°C  0/5
     10:16:55: 1800/1416MHz  1.19  68%   0%  67%   0%   0%   0% 56.1°C  0/5
     10:17:00: 1800/1416MHz  1.49  79%   1%  78%   0%   0%   0% 53.9°C  0/5
     10:17:06: 1800/1416MHz  1.45  31%   0%  31%   0%   0%   0% 57.8°C  0/5
     10:17:11: 1800/1416MHz  2.07  68%   0%  67%   0%   0%   0% 57.2°C  0/5
     10:17:17: 1800/1416MHz  2.30  78%   0%  77%   0%   0%   0% 58.9°C  0/5
     10:17:22: 1800/1416MHz  2.52  90%   1%  89%   0%   0%   0% 57.8°C  0/5
     10:17:27: 1800/1416MHz  2.72  81%   0%  80%   0%   0%   0% 57.2°C  0/5
     Time       big.LITTLE   load %cpu %sys %usr %nice %io %irq   CPU  C.St.
     10:17:32: 1800/1416MHz  2.66  61%   0%  60%   0%   0%   0% 60.6°C  0/5

     We get an overall score of above 6100 and 7-zip's 'CPU Freq' line reports CPU0 (a little core) being clocked at 1.4 GHz. But since this is a big.LITTLE design we need the monitoring output that gets displayed below the 7-zip benchmark numbers. By looking at the 2nd line we see that the system was totally idle prior to starting the benchmark (I implemented a 10 second sleep between starting monitoring and firing up the benchmark for this reason -- to check whether the system was already busy or not). As a comparison 7-zip numbers of another RK3399 board that allowed the CPU cores to clock slightly higher (2.0/1.5 GHz): ODROID-N1 scored 6500. As a reference some other boards. Rock64 with new 1.4 GHz settings: NanoPi NEO with cpufreq scaling limited to 816 MHz to keep the board always at lowest DVFS voltage (results irrelevant) Please keep in mind that benchmarks that run fully multi threaded are NOT representative for most workloads running on computers (they're single threaded). 
Also please keep in mind that while 7-zip is not affected by different compiler settings as much as the infamous sysbench, of course it is somewhat. So when you see 7-zip benchmark numbers generated a few years ago with a 7z binary built by GCC 4.x, then with today's software and a binary built by GCC 7.x you will most probably see higher scores. So take these comparison numbers with a grain of salt: https://s1.hoffart.de/7zip-bench/ To get the new armbianmonitor with -z functionality today it's as easy as
     wget -O /usr/bin/armbianmonitor https://raw.githubusercontent.com/armbian/build/master/packages/bsp/common/usr/bin/armbianmonitor
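What 'monitoring in parallel' means can be sketched in a few lines of shell. This is not how armbianmonitor implements it, just an illustration assuming the standard Linux cpufreq sysfs interface (scaling_cur_freq reports kHz); read_cpu_mhz and monitor_while are made-up names:

```shell
#!/bin/sh
# Print the current clockspeed of every CPU core in MHz, one "cpuN MHZ" per line.
# $1: sysfs CPU directory (assumption: defaults to the usual Linux location).
read_cpu_mhz() {
    root="${1:-/sys/devices/system/cpu}"
    for f in "$root"/cpu[0-9]*/cpufreq/scaling_cur_freq; do
        [ -r "$f" ] || continue      # skip cores without cpufreq support
        cpu=${f#"$root"/}            # strip the leading directory
        cpu=${cpu%%/*}               # keep only "cpuN"
        khz=$(cat "$f")
        echo "$cpu $((khz / 1000))"
    done
}

# Run a command in the background and sample clockspeeds until it exits.
# $1: sampling interval in seconds, remaining arguments: the command to run.
monitor_while() {
    interval="$1"; shift
    "$@" &
    pid=$!
    while kill -0 "$pid" 2>/dev/null; do
        echo "$(date +%H:%M:%S) $(read_cpu_mhz | tr '\n' ' ')"
        sleep "$interval"
    done
    wait "$pid"
}
```

Something like 'monitor_while 5 7zr b' would then interleave clockspeed samples with the benchmark output. Temperature and load are left out here; without them throttling can only be guessed from dropping clockspeeds.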
  7. For people not rejecting reality... again why sysbench is unreliable (not able to indicate CPU performance AT ALL):
       • Sysbench is not able to compare different CPU architectures since it's only a compiler benchmark (you get 15 times higher 'CPU performance' reported with a 64-bit distro than a 32-bit distro on 64-bit ARM boards)
       • Switch the distro and your sysbench numbers differ (in fact it's just different distros building their packages with different GCC versions)
       • Update your distro and your sysbench numbers improve (since it's just a compiler benchmark)
       • Different sysbench version, different 'benchmark' results (start at 'Performance with legacy Armbian image')
       • Why sysbench's method to 'calculate CPU performance' is that weird and does not apply to anything performance relevant in the real world
     For people loving to 'shoot the messenger'... it's not only me who describes sysbench as the totally wrong tool, e.g. https://www.raspberrypi.org/forums/viewtopic.php?t=208314&start=25 Again: 7-zip's benchmark mode is not just using an insanely limited synthetic benchmark routine like sysbench (calculating prime numbers only involving CPU caches but no memory access), 7-zip is hardly dependent on compiler versions or platform switches and 7-zip allows for cross-platform comparisons. You'll find a lot of numbers here in the forum and some comparisons on the net e.g. https://s1.hoffart.de/7zip-bench/ (again: it's just a rough estimate but at least something somewhat reliable related to CPU performance) The most important thing with benchmarking is 'benchmarking the benchmark', since most tools (especially the popular ones) do not do what they pretend to do. Active benchmarking is needed and not just throwing out numbers without meaning. BTW: sysbench originates from the MySQL world and when used correctly is a great tool to provide insights. It's just the 'cpu test' that is not a CPU test at all. And it's about people firing up sysbench in 'passive benchmarking' mode, generating numbers without meaning and praising insufficient tools.
  8. ...is crap. It's not a CPU but only a compiler benchmark. You chose the worst 'benchmark' possible. Sysbench numbers change with compiler version and settings or even with sysbench version (higher version number --> lower scores). There's no 'benchmark' known producing more unreliable results wrt hardware/CPU. Use 'Google site search' here to get the details. If it's about a quick and rough CPU performance estimate I would recommend 7-zip's benchmark mode (7z b).
  9. Instead of following stupid networkmanager bashing you might better focus on reality. What you obviously experience is filesystem corruption, which is quite normal with SBCs (a result of broken/counterfeit SD cards). In such a situation a filesystem check would be a good idea as well as some integrity checking (of packages and the entire SD card surface):
       sudo armbianmonitor -v
       armbianmonitor -c $HOME
  10. Boots even from SATA disks behind a crappy SATA PM.
  11. apt.armbian.com is not up to date:
       root@rock64:~# apt-cache search headers | grep rk3328
       linux-headers-dev-rk3328 - Linux kernel headers for 4.14.0-rc2-rk3328 on arm64
       linux-headers-rk3328 - Linux kernel headers for 4.4.77-rk3328 on arm64
      By switching to 'nightly' with armbian-config (and accepting that you're fine with a bricked system), which replaces apt.armbian.com with beta.armbian.com, things change:
       root@rock64:~# uname -a
       Linux rock64 4.4.138-rk3328 #10 SMP Mon Jul 9 13:05:42 PDT 2018 aarch64 GNU/Linux
       root@rock64:~# apt-cache search headers | grep rk3328
       linux-headers-dev-rk3328 - Linux kernel headers for 4.17.0-rc6-rk3328 on arm64
       linux-headers-rk3328 - Linux kernel headers for 4.4.138-rk3328 on arm64
      Of course this is not a solution.
  12. All the affected people in the referenced issue are using USB hubs. I never use USB hubs in between host and disk and just get the usual xhci error once I connect a SuperSpeed disk to Rock64 and besides that everything is fine. If I would want to connect two fast disks to an SBC I choose an appropriate SBC or maybe would test such a JMS561 thing (but in my personal opinion USB attached storage is a bit too unreliable so I try to avoid it where possible)
  13. No one tested so far since no one ported the driver (this and firmware for Ampak AP6359SA are contained in Rockchip's 4.4 BSP but with Allwinner A64 we still deal with a smelly 3.10 kernel... that's most probably the reason why there's mentioned 'only available for RockPro64')
  14. The two RK3328 boards I mentioned have both an eMMC slot. Rock64 numbers, Renegade numbers. I think with both boards settings still aren't optimal (so access could be faster) but as you might know the more important number when looking at 'OS performance' is random IO (IOPS) and not sequential transfer speeds (MB/s). So if you choose a great performing A1 rated SD card like a SanDisk Extreme A1 even when the board is bottlenecked by DDR50 mode it will be way faster compared to using an average (AKA crappy) SD card since they all suffer from horribly low random write performance. See here for some details and especially follow the links there. Random IO (write) performance is more important than anything else. Sequential transfer speeds are close to irrelevant.
  15. Combining all of this I would end up buying an RK3328 device like Rock64 or Renegade. No idea why you need fast SD card modes since in combination with a JMS578 USB-to-SATA adapter and an SSD you have pretty fast storage: https://forum.armbian.com/topic/1925-some-storage-benchmarks-on-sbcs/?do=findComment&comment=51350 (even USB based storage access is a lot faster there compared to Allwinner A20 SATA as on Banana Pi). More CPU horsepower and a bit more pricey? RK3399. But if it's only about serving some static or dynamic web pages I would also evaluate whether cheap Allwinner H5 based boards like NanoPi NEO2 or Orange Pi Zero Plus (with massive zram overcommitment) aren't worth a look.
  16. tkaiser

    NanoPi NEO4

    Well, then I would prefer to use ODROID design since the NAS case for NEO/NEO2 unfortunately does not attach the SoC to the enclosure to efficiently dissipate heat away. (still waiting for someone designing a RK3399 'NAS board' with a JMS561 attached to each USB3 port to provide 4 SATA ports for spinning rust and a M.2 key M slot for a fast NVMe SSD or alternatively something like this to provide another 4 SATA ports)
  17. IIRC the newer and much better Ampak AP6359S based module should also work with Pine64 and PineH64.
  18. tkaiser

    NanoPi NEO4

    Funny! Same PCB dimensions and mounting hole positions as NanoPi NEO Plus 2, USB3 exposed twice, USB2 on pin header (and most probably also via USB-C), Gigabit Ethernet, HDMI. Amount of DRAM unknown without looking at the other PCB side.
  19. Everything I ordered from Xunlong's Aliexpress store so far arrived within less than 2 weeks. 'Powerful' is relative as already explained. It's about the use case and then the SoC in question (and the specific board design -- an Orange Pi Lite 2 for example has a pretty fast CPU/GPU combination but of course networking sucks somehow since only Wi-Fi on the board)
  20. https://pl.aliexpress.com/store/product/Orange-Pi-PC-Plus-ubuntu-linux-and-android-mini-PC-Beyond-Raspberry-Pi-2/1553371_32668618847.html
  21. With Rock64 and especially in such a scenario with an external USB hub only by measuring voltages at the drive's side. And please keep in mind what I've written above about USB3 controller limitations with ARM SoCs. I consider USB storage 'unreliable storage' by definition and as soon as a USB hub is in between host and drives as 'utterly unreliable storage'. The only thing I would put in between a Rock64's USB3 port and 2 disks is this JMS561 thing mentioned here: https://forum.openmediavault.org/index.php/Thread/19871-Which-energy-efficient-ARM-platform-to-choose/?postID=169303#post169303 (but exactly that. Hardkernel sells something called 'Cloudshell 2' for their ODROID-XU4 which is also based on JMS561 but their device suffers/suffered from serious firmware issues)
  22. None of these boards is supported by Armbian (for a reason -- they're all listed as 'CSC' --> community supported configurations without any support). Just use the Google Site Search link above and search for each, you'll find reviews and a lot of information here.
  23. You should really read the above link: The fraudsters can use whatever metadata they want. And the 'better' ones now use metadata you can't distinguish from genuine products.
  24. Still don't get the use case but anyway... it's rather easy. There are slow ARM cores (A5, A7 and A53), fast ones (A15, A17 and A72) and something in between: A9 (i.MX6 boards and Clearfogs). The majority of boards Armbian supports have slow cores, there are just a few exceptions:
       • ODROID-XU4 (big.LITTLE with 4xA7 and 4xA15)
       • Tinkerboard and MiQi (4xA17)
       • some RK3399 boards soon (2xA72 and 4xA53)
      The RK3399 boards are all rather expensive, the Tinkerboard is a bizarre fail, no idea whether you can buy a MiQi any more but maybe 8 slow cores for 35 bucks are the right thing for you: NanoPi Fire3
  25. BTW: the above mentioned checks for fake capacity (and data integrity) and SD card performance are accessible in an easy way: armbianmonitor -c $HOME will run both f3 and iozone checks but on slow/crappy/faked cards this can take ages. Also f3 is only able to check the free remaining space on the SD card so it's strongly recommended to burn the image only with Etcher since shitty burning tools that do not do a mandatory verify (a comprehensive list of tools to avoid got collected here) would otherwise allow for areas of the card which cause data corruption to be undetected. Example output from 'armbianmonitor -c': So in general rule 2) above applies: check the SD card with either F3 or h2testw prior to burning any image with Etcher. It's IMPORTANT!