tkaiser

Reputation Activity

  1. Like
    tkaiser reacted to mindee in NanoPI M4   
    NanoPi M4 is RK3399 based too, just without eMMC on board and with fewer interfaces than the NanoPC-T4, so the cost is a little lower.
     
     

  2. Like
    tkaiser got a reaction from Tido in BSP scripts RFC   
    My 2 cents on this at the bottom of the list: https://github.com/armbian/build/pull/1015
     
    Can't really test/review at the moment since I'm running out of spare time. My only other 'Armbian contribution' this week will be a little bit of NanoPC T4 testing (relying on @hjc's work).
  3. Like
    tkaiser got a reaction from NicoD in ODROID N1 -- not a review (yet)   
    Preliminary 'performance' summary
     
    Based on the tests done above and elsewhere let's try to collect some performance data. GPU data is missing below for the simple reason that I'm not interested in anything GPU related (or in attaching a display at all). Besides being used for display stuff and 'retro gaming', RK3399's Mali T860 MP4 GPU is also OpenCL capable. If you search for results (ODROID N1's SoC has been available for some years now so you will find a lot by searching for 'RK3399' -- for example here are some OpenCL/OpenCV numbers) please keep in mind that Hardkernel might use different clockspeeds for the GPU as well (it's already like that with the CPU cores: almost everywhere else the big/little cores are clocked at 1.8/1.4 GHz while the N1 settings use 2.0/1.5 GHz instead).
     
    CPU horsepower
     
    The situation with RK3399 is somewhat special since it's an HMP design combining two fast Cortex-A72 cores with four 'slow' A53 cores. So depending on which CPU core a job lands on, execution time can vary by a factor of 2. With Android or 'Desktop Linux' workloads this shouldn't be an issue since there things are mostly single-threaded and the scheduler will move these tasks to the big cores automagically if performance is needed.
     
    With other workloads it differs:
    • People wanting to use RK3399 as part of a compile farm might be disappointed and still prefer ARM designs that feature four instead of two fast cores (eg. RK3288 or Exynos 5422 -- for the reasons why see again the comments section on CNX)
    • For 'general purpose' server use cases the 7-zip scores are interesting since they give a rough estimate of how fast an RK3399 device will perform as a server (or how many tasks you can run in parallel). The overall score is 6,500 (see this comparison list) but due to the big.LITTLE design we're talking about the big cluster scoring 3350 and the little cluster 3900. So tasks that execute on the big cores finish almost twice as fast. Keep this in mind when setting up your environment. Experimenting with cgroups and friends to assign certain tasks to specific CPU clusters will be worth the effort (see the sketch right after this list)!
    • 'Number crunchers' who can make use of NEON instructions should look at 'cpuminer --benchmark' results: we get a total 8.80 kH/s rate when running on all 6 cores (big cores only: 4.10 kH/s, little cores only: 4.90 kH/s -- so again 'per core' performance is almost twice as good on the big cores) which is at the same performance level as an RK3288 (4 x A17) but gets outperformed by an ODROID XU4 for example at +10kH/s since there the little cores add a little bit to the result. But this needs improved cooling, otherwise an XU4 will immediately throttle down. The RK3399 provides this performance with way lower consumption and heat generation!
    • Crypto performance: just awesome due to the ARMv8 Crypto Extensions being available and useable on all cores in parallel. Simply check the cryptsetup results above and our 'openssl speed' numbers and keep in mind that if your crypto stuff can run in parallel (eg. terminating a few different VPN sessions) you can almost add the individual throughput numbers (and even with 6 threads in parallel at full clockspeed the RK3399 just draws 10W more compared to idle)
    • Talking about 'two fast and four slow CPU cores': the A53 cores are clocked at 1.5GHz, so when comparing with RK3399's little sibling RK3328 with only 4xA53 (ROCK64, Libre Computer Renegade or Swiftboard/Transformer) the RK3399 running on the 'slow' cores alone will compete with or already outperform the RK3328 boards while still having 2 big cores available for heavy stuff.
    • But since a lot of workloads are bottlenecked by memory bandwidth you should have a look at the tinymembench results collected above (and use some google-fu to compare with other devices)
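    As a rough illustration of the cluster pinning mentioned above, this is how jobs can be assigned to the big or little cores from a shell. A sketch only: it assumes RK3399's usual numbering (cpu0-3 are the A53, cpu4/cpu5 the A72 cores) and the cgroup commands need the cgroup-tools package:
    lscpu -e                      # verify which CPU numbers belong to which cluster
    taskset -c 4,5 7z b           # run a job on the big cluster only
    taskset -c 0-3 7z b           # run the same job on the little cluster only
    # same idea with cgroups (cpuset controller, cgroup v1 syntax):
    cgcreate -g cpuset:/bigcluster
    cgset -r cpuset.cpus=4-5 -r cpuset.mems=0 bigcluster
    cgexec -g cpuset:/bigcluster cpuminer --benchmark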
     
    Storage performance
     
    N1 has 2 SATA ports provided by a PCIe attached ASM1061 controller and 2 USB3 ports directly routed to the SoC. The per-port bandwidth limitation, which also seems to apply to both port groups, is around 390 MB/s (applies to all ports regardless of whether SATA or USB3 -- also random IO performance with default settings is pretty much the same). But this is not an overall internal SoC bottleneck since when testing with fast SSDs on both USB3 and SATA ports at the same time we got numbers of around ~750MB/s. I retested with an EVO840 on the N1's SATA and USB3 ports with a good UAS capable enclosure and as a comparison repeated the same test with a 'true NAS SoC': the Marvell Armada 385 on the Clearfog Pro which provides 'native SATA' from the SoC itself:
     
    If we look carefully at the numbers we see that USB3 slightly outperforms the ASM1061 when it's about top sequential performance. The two ASM1061 result sets are due to different settings of /sys/module/pcie_aspm/parameters/policy (defaults to powersave but can be changed to performance, which not only results in ~250mW higher idle consumption but also a lot better performance with small block sizes). So while USB3 seems to perform slightly better when looking only at the rather irrelevant sequential transfer speeds, you'd better attach disks to the SATA ports for a number of reasons:
    • With USB you need disk enclosures with good USB-to-SATA bridges that are capable of UAS --> 'USB Attached SCSI' (we can only recommend the following ones: ASMedia ASM1153/ASM1351, JMicron JMS567/JMS578 or VIA VL711/VL715/VL716 -- unfortunately even if those chipsets are used, sometimes crappy firmwares need USB quirks or require UAS blacklisting and then performance sucks. A good example are Seagate USB3 disks). See the commands right after this list for how to check this.
    • When you use SSDs you want to be able to use TRIM (helps with retaining drive performance and increases longevity). With SATA attached SSDs this is not a problem but on USB ports it depends on a lot of stuff and usually does NOT work. If you understand just half of what's written here then think about SSDs on USB ports, otherwise better choose the SATA ports here.
    • And PCIe is also less 'expensive' since it needs fewer resources (lower CPU utilization with disks on the SATA ports and fewer interrupts to process, see the 800k IRQs for SATA/PCIe vs. 2 million for USB3 with exactly the same workload below):
    226:        180     809128          0          0          0          0   ITS-MSI  524288 Edge      0000:01:00.0
    226:          0          0          0          0          0          0   ITS-MSI  524288 Edge      0000:01:00.0
    227:        277          0    2066085          0          0          0   GICv3       137 Level     xhci-hcd:usb5
    228:          0          0          0          0          0          0   GICv3       142 Level     xhci-hcd:usb7
    • There's also eMMC and SD cards useable as storage. Wrt SD cards it's too early to talk about performance since at least the N1 developer samples only implement the slowest SD card speed mode and a necessary kernel patch is still missing to remove the current SD card performance bottleneck (I really hope this will change with the final N1 version later).
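    For reference, checking the UAS and TRIM situation of a USB enclosure from userspace can look like this (a sketch; the quirk is only shown as a pattern, the VID:PID has to come from lsusb on your own system):
    lsusb -t | grep 'Driver='     # 'Driver=uas' is what you want, 'Driver=usb-storage' means no UAS
    lsblk -D /dev/sda             # non-zero DISC-GRAN/DISC-MAX means TRIM/discard can work at all
    # blacklisting UAS for a misbehaving bridge is usually done with a usb-storage
    # quirk on the kernel command line, e.g. usb-storage.quirks=<VID>:<PID>:u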
     
    The eMMC performance is awesome! If we look only at random IO performance with smaller block sizes (that's the 'eMMC as OS drive' use case) then the Hardkernel eMMC modules starting at 32GB size perform as fast as an SSD connected to the USB3 or SATA ports. With the SATA ports we get a nice speed boost by changing the ASPM (Active State Power Management) setting from the 'powersave' default to performance (+250mW idle consumption). Only then can an SSD behind a SATA port on the N1 outperform a Hardkernel eMMC module wrt random IO or 'OS drive' performance. But of course this has a price: when SATA or USB drives are used, consumption is a lot higher.
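    The ASPM switch mentioned above boils down to writing to the sysfs path already quoted (the bracketed entry in the output marks the active policy):
    cat /sys/module/pcie_aspm/parameters/policy                # e.g. default performance [powersave]
    echo performance > /sys/module/pcie_aspm/parameters/policy
    echo powersave > /sys/module/pcie_aspm/parameters/policy   # back to the power saving default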
     
    Network performance
     
    Too early to report 'success' but I'm pretty confident we'll get Gigabit Ethernet fully saturated after applying some tweaks. With RK3328 it was the same situation in the beginning and maybe the same fixes that helped there will help with RK3399 on the N1 too. I would assume progress can be monitored here: https://forum.odroid.com/viewtopic.php?f=150&t=30126
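    One class of tweak that typically plays a role here (not necessarily the exact fix referenced above) is IRQ affinity, i.e. moving the Ethernet interrupt away from cpu0; a hedged sketch, with the IRQ number to be looked up on the actual board:
    awk -F: '/eth0/ {print $1}' /proc/interrupts       # find the eth0 IRQ number
    echo 4 > /proc/irq/<eth0-irq>/smp_affinity_list    # pin it to cpu4 (one of the A72 cores)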
  4. Like
    tkaiser got a reaction from pfeerick in RockPro64   
    (placeholder for a Board Bring Up thread -- now just collecting random nonsense)
     

     

     
    Board arrived today. Running ayufan's Debian 9 image (using 1.8/1.4GHz settings for the big.LITTLE cores for whatever reason instead of 2.0/1.5GHz) I checked heat dissipation with Pine Inc's huge heatsink first. The huge heatsink combined with a thin thermal pad shows really nice thermal values:
     
    • below 50°C when running cpuburn-a53 on two A53 and two A72 cores in parallel with a fan blowing over the heatsink's surface
    • with the board in an upright position, still running the same cpuburn-a53 task but without a fan, it still reports below 85°C after 20 minutes even if the heatsink orientation is slightly wrong
    Summary: passive cooling is all you need and you should really spend the additional few bucks on Pine Inc's well designed heatsink since it's inexpensive and a great performer.
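    For reference, this kind of test can be watched with armbianmonitor -m on Armbian or a trivial loop like the one below (sysfs paths are the usual ones; thermal zone and CPU numbering may differ per kernel/board):
    while true; do
        echo "$(date +%H:%M:%S)  $(cat /sys/class/thermal/thermal_zone0/temp) m°C  $(cat /sys/devices/system/cpu/cpu4/cpufreq/scaling_cur_freq) kHz"
        sleep 10
    done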
     
  5. Like
    tkaiser got a reaction from hjc in RockPro64   
    You need to adjust DT on your NanoPC since the device claims to be only capable of PCIe Gen1. Please see also https://patchwork.kernel.org/patch/9345861/
     
    @FrankM: Would be interesting to test on your RockPro64 with link speed set to 1 and see whether now a x4 link will be established: https://github.com/ayufan-rock64/linux-kernel/blob/65dce4f6760180105cda50d6f8d603e25eaf26fc/arch/arm64/boot/dts/rockchip/rk3399-rockpro64.dts#L260
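    Whether the change has an effect can then be checked from userspace; 'Speed 2.5GT/s' corresponds to Gen1 and 'Width x4' to a 4-lane link:
    lspci -vv | grep -E 'LnkCap|LnkSta'   # compare what the device offers vs. what was negotiated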
  6. Like
    tkaiser got a reaction from NicoD in RockPro64   
    (placeholder for a Board Bring Up thread -- now just collecting random nonsense)
     

     

     
    Board arrived today. Running ayufan's Debian 9 image (using 1.8/1.4GHz settings for the big.LITTLE cores for whatever reason instead of 2.0/1.5GHz) I checked heat dissipation with Pine Inc's huge heatsink first. The huge heatsink combined with a thin thermal pad shows really nice thermal values:
     
    • below 50°C when running cpuburn-a53 on two A53 and two A72 cores in parallel with a fan blowing over the heatsink's surface
    • with the board in an upright position, still running the same cpuburn-a53 task but without a fan, it still reports below 85°C after 20 minutes even if the heatsink orientation is slightly wrong
    Summary: passive cooling is all you need and you should really spend the additional few bucks on Pine Inc's well designed heatsink since it's inexpensive and a great performer.
     
  7. Like
    tkaiser got a reaction from Tido in a Google docs spreadsheet that tabulates the boards’ key features   
    Like all of those lists it's not worth visiting them since they're full of mistakes. That's only for people who are impressed by vast amounts of data and don't give a sh*t about information (data != information).
     
    3 examples after looking for ONE MINUTE into this BS collection:
    • Banana Pi M2 Berry is listed with 4 USB ports while this board in reality just uses one USB host port to provide 4 USB receptacles (shared bandwidth due to an internal USB hub).
    • Banana Pi M2 Ultra is also listed with 4 USB ports which is BS since this board has 2 USB receptacles (+ 1 x Micro USB OTG port) and at least exposes all of the USB host ports.
    • CubieBoard6/7 is listed as one entry which is OK from a hardware point of view (since the pin compatible S500/S700 SoCs are the only change between both board variants). But under 'OSes' for both boards 'Linux, Android' is listed which is simply BS since the Cubieboard 7 with its Actions Semi S700 SoC is simply an Android toy only. No Linux image has been made available, there's only one smelly 3.10 kernel that hasn't received updates in a long time and the vendor is known for providing horrible software support these days anyway.
    After spotting that many errors within a minute I stopped and closed the browser tab. It's just a waste of time to look through a huge collection of data providing zero (correct) information. Also the most important factor if you also want to USE an SBC and not just buy one is missing: quality of software support. Everybody should know that 'software support' makes the difference between an SBC and a paperweight or door stop. This collection of (partially wrong) data does not differentiate at all so it's completely useless.
     
  8. Like
    tkaiser got a reaction from Patrick in Odroid C2 - update to v5.44 / v5.45 not working?   
    Well, every other week the same question is asked here, so the mechanism of displaying a version number taken from /etc/armbian-release -- a file that is part of one specific package which doesn't get updated each time the Armbian version number is increased -- is obviously misleading.
  9. Like
    tkaiser got a reaction from matt407 in Odroid C2 - update to v5.44 / v5.45 not working?   
    Your installation is on 5.45
     
     
    What Armbian displays with Igor's motd stuff is broken. That's all. The motd routine displays a meaningless version number based on one package's version that does not get updated every time Armbian's version number increases. It's only there to confuse people and generate unnecessary support efforts. Don't expect this to be fixed :-)
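    A way to see what is actually installed (package name patterns follow the usual Armbian scheme and may need adjusting for your board family):
    dpkg -l | grep -E 'armbian|linux-(image|dtb|u-boot)' | awk '{print $2, $3}'
    cat /etc/armbian-release     # the file the motd banner takes its version number from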
  10. Like
    tkaiser got a reaction from Tido in Librecomputer Tritium H3   
    Allwinner H3 was exciting back in 2015 and early 2016. Allwinner H5 was interesting since it's still inexpensive but provides the full ARMv8 feature set (especially the ARMv8 Crypto Extensions). Basically all H2+, H3 and H5 boards are the same: https://forum.armbian.com/topic/1351-h3-board-buyers-guide/?do=findComment&comment=44979
     
    For an Allwinner H device to draw some attention it needs either to be dirt cheap or to have interesting features like great CPU performance (needs DVFS to go beyond 1.3 GHz), a great amount of DRAM (impossible since 2 GB DRAM is the max) or great graphics capabilities (impossible, only a boring Mali 4x0 and a limited video engine).
     
    The Tritium board (it's just one board with 3 different SoCs soldered to it) was designed as an Android TV box without enclosure (closely following Allwinner's current reference design to be able to run their BSP stuff and therefore the vendor's Android). No Gigabit Ethernet, no voltage regulation and more expensive than $10. No idea why I would choose this board or for which use case...
     
    It's 2018 now...
     
  11. Like
    tkaiser got a reaction from Tido in Librecomputer Tritium H3   
    The first and also all the 'better' H3/H5 boards from Xunlong and FriendlyELEC use an I2C accessible SY8106A voltage regulator allowing the voltage to be adjusted in 20mV steps. No real difference to a real PMIC wrt DVFS. The Orange Pi One was the first board with a more primitive voltage regulation scheme only supporting 2 voltages using SY8113B/AX3833 voltage regulators. Then came SinoVoip and forgot voltage regulation on their overheating BPI M2+; later FriendlyELEC decided to drop voltage regulation on their NEO2 since Allwinner's H5 BSP didn't support it any more, and that was most probably also Libre Computer's 'motivation' to drop voltage regulation at the same time.
     
    But while Allwinner themselves do not support voltage regulation any more with H2+/H3 in their BSP (it has also been removed in their 4.4 kernel code drop) and never supported it with H5, the mainline kernel code that properly implements voltage regulation with both SY8106A and SY8113B/AX3833 without using the ARISC core has existed for over 2 years now.
     
    In other words: the Libre Computer Allwinner boards are designed to run with Allwinner BSP based OS images (Android / crappy Linux) and not the mainline kernel. This design decision allows for lower peak performance and higher idle consumption at the same time.
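    Whether DVFS/voltage regulation is actually in place on such a board can be checked from userspace on a mainline kernel (standard cpufreq/regulator sysfs paths; regulator names are board specific, 'vdd-cpux' is just the usual sunxi example):
    cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies
    grep . /sys/class/regulator/regulator.*/name 2>/dev/null | grep -i cpu
    # without a usable regulator the list of available frequencies typically stops
    # well below what the SoC could do with proper DVFS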
  12. Like
    tkaiser got a reaction from gounthar in Sharing info for others on the Orange Pi Zero (Mac Prefix and PinMap)   
    Nope, that's just a random number and there's absolutely no need to obfuscate random numbers. Same with the HW address of Ethernet (derived from the SoC's SID, but linux-sunxi devs recently discovered that they have to change the algorithm, so MAC addresses will change sometime in the future anyway).
     
    Thanks for the nice picture, but I prefer the original version of this information and using one single source for all sunxi devices instead of vendor forums: http://linux-sunxi.org/Xunlong_Orange_Pi_Zero#Expansion_Port
  13. Like
    tkaiser got a reaction from lanefu in H3 board buyer's guide   
    H2+/H3/H5 boards overview (early 2018 update)
     
    For the methodology/categorization please see above. This is just a brief overview adding the boards that appeared within the last few months (Banana Pi Zero, NanoPi Duo, Orange Pi R1, Orange Pi Zero Plus, Sunvell R69) and those that will appear soon or have just been released (ACT Power X-A1, Libre Computer's H2+/H3/H5 Tritium, NanoPi NEO Core and Core2):
     
    NAS category (only due to Gigabit Ethernet available):
    • Banana Pi M2+: H3, 1GB DRAM, 8GB slow eMMC, 1+2 USB ports useable, Wi-Fi/BT
    • Banana Pi M2+ EDU: H3, 512MB DRAM, no eMMC, 1+2 USB ports useable
    • NanoPi M1 Plus: H3, 1GB DRAM, 8GB slow eMMC, 1+3 USB ports useable, Wi-Fi/BT
    • NanoPi M1 Plus 2: H5, 1GB DRAM, 8GB slow eMMC, 1+3 USB ports useable, Wi-Fi/BT
    • NanoPi NEO 2: H5, 512MB DRAM, no eMMC, 1+1+2 USB ports useable
    • NanoPi NEO Core 2: H5, 512MB/1GB DRAM, eMMC, 1+3 USB ports useable, GbE on pin header
    • NanoPi NEO Plus 2: H5, 512MB DRAM, no eMMC, 1+2+2 USB ports useable, Wi-Fi
    • OrangePi PC 2: H5, 1GB DRAM, no eMMC, 1+3 USB ports useable
    • OrangePi Prime: H5, 2GB DRAM, 1+3 USB ports useable, Wi-Fi/BT
    • OrangePi Plus: H3, 1GB DRAM, 8GB eMMC, 1+4 USB ports useable (hub), Wi-Fi
    • OrangePi Plus 2: H3, 2GB DRAM, 16GB slow eMMC, 1+4 USB ports useable (hub), Wi-Fi
    • OrangePi Plus 2E: H3, 2GB DRAM, 16GB fast eMMC, 1+3 USB ports useable, Wi-Fi
    • Orange Pi Zero Plus: H5, 512MB, 1+1+2 USB ports useable, Wi-Fi
    • X-A1: H3, 1GB DRAM, 8GB eMMC, 1+2 USB ports useable
    IoT category (cheap, small, energy efficient, most of them headless):
    • Banana Pi Zero: H2+, 512MB DRAM, no eMMC, 1 USB port useable, Wi-Fi/BT, Fast Ethernet (pin header)
    • NanoPi Air: H3, 512MB DRAM, 8GB slow eMMC, 1+1+2 USB ports useable, Wi-Fi/BT, no Ethernet
    • NanoPi Duo: H2+, 256/512MB DRAM, 1+1+2 USB ports useable, Wi-Fi, Fast Ethernet (pin header)
    • NanoPi NEO: H3, 256/512MB DRAM, no eMMC, 1+1+2 USB ports useable, Fast Ethernet
    • NanoPi NEO 2: H5, 512MB DRAM, no eMMC, 1+1+2 USB ports useable, Gigabit Ethernet
    • NanoPi NEO Core: H3, 256/512MB DRAM, optional eMMC, 1+3 USB ports useable, Fast Ethernet (pin header)
    • NanoPi NEO Plus 2: H5, 512MB DRAM, no eMMC, 1+1+2 USB ports useable, Wi-Fi, Gigabit Ethernet
    • Orange Pi R1: H2+, 256MB DRAM, 1+2 USB ports useable, Wi-Fi, 2 x Fast Ethernet (1 x USB RTL8152)
    • OrangePi Zero: H2+, 256/512MB DRAM, no eMMC, 1+1+2 USB ports useable, Wi-Fi, Fast Ethernet
    • OrangePi Zero Plus 2: H3, 512MB DRAM, 8GB fast eMMC, 1+0+2 USB ports useable, Wi-Fi/BT, no Ethernet but HDMI
    • OrangePi Zero Plus 2: H5, 512MB DRAM, 8GB fast eMMC, 1+0+2 USB ports useable, Wi-Fi/BT, no Ethernet but HDMI
    • Tritium IoT: H2+, 512MB DRAM, 1+3 USB ports useable, Fast Ethernet
    General purpose (HDMI and full legacy kernel support: video/3D HW accelerated):
    • Beelink X2: H3, 1GB DRAM, 8GB slow eMMC, 1+1 USB ports useable, Wi-Fi, Fast Ethernet
    • NanoPi M1: H3, 1GB DRAM, no eMMC, 1+3 USB ports useable, Fast Ethernet
    • OrangePi Lite: H3, 512MB DRAM, no eMMC, 1+2 USB ports useable, Wi-Fi, no Ethernet
    • OrangePi One: H3, 512MB DRAM, no eMMC, 1+1 USB ports useable, Fast Ethernet
    • OrangePi PC: H3, 1GB DRAM, no eMMC, 1+3 USB ports useable, Fast Ethernet
    • OrangePi PC Plus: H3, 1GB DRAM, 8GB fast eMMC, 1+3 USB ports useable, Wi-Fi, Fast Ethernet
    • OrangePi Zero Plus 2: H3, 512MB DRAM, 8GB fast eMMC, 1+1+2 USB ports useable, Wi-Fi/BT, no Ethernet
    • pcDuino Nano 4: see above, it's just an OEM version of the NanoPi M1 done for Linksprite
    • Sunvell R69: H2+, 1GB DRAM, 8GB eMMC, 1+1 USB ports useable, Wi-Fi, Fast Ethernet
    • Tritium: H3, 1GB DRAM, 1+3 USB ports useable, Fast Ethernet
    • Tritium: H5, 2GB DRAM, 1+3 USB ports useable, Fast Ethernet
  14. Like
    tkaiser got a reaction from chwe in Quick review of NanoPi Fire3   
    No idea. The only 'good' SD card that died here so far is a SanDisk Extreme Pro 8 GB I used intensively for over 3 years (both great random IO performance and great sequential performance -- burning cards at up to 80 MB/s in my MacBook was always fun). All the other cards that died were cheap Noname/Intenso/Kingston crap.
  15. Like
    tkaiser got a reaction from Tido in BSP scripts RFC   
    If I understood @zador.blood.stained correctly he would prefer splitting this up in a more granular way. Currently we have the following functions in the script (the ones that can remain in armhwinfo are marked with an X):
    • collect_information X
    • set_io_scheduler
    • prepare_temp_monitoring X
    • prepare_board
    • log_hardware_info X
    • get_flash_information X
    • show_motd_warning X
    • check_sd_card_speed X
    • add_usb_storage_quirks
    • activate_zram
    So prepare_board, set_io_scheduler and add_usb_storage_quirks could go into an armbiansetuphw service and zram could be handled by armbianzram (I really would prefer to prefix all our services with armbian so users get a clue what's plain Ubuntu/Debian and what's Armbian).
     
    Yes, but please see @zador.blood.stained's objections. I'm fine with any way as long as it's configurable and can be disabled. The log2ram functionality would then have to implement a fallback mechanism to what we use today (check whether /dev/zram0 contains a freshly created btrfs filesystem; if yes use it, otherwise fall back to tmpfs).
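    A minimal sketch of what such a zram backed /var/log could look like (device name, size and compression algorithm are assumptions, and the order matters: the algorithm has to be set before the disksize):
    modprobe zram num_devices=1
    echo lz4 > /sys/block/zram0/comp_algorithm
    echo 50M > /sys/block/zram0/disksize
    mkfs.btrfs /dev/zram0
    mount /dev/zram0 /var/log
    # fallback when no freshly created btrfs on /dev/zram0 is found:
    # mount -t tmpfs -o size=50M tmpfs /var/log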
     
    It's also worse for another reason: recompressing already compressed stuff usually wastes more storage space. An uncompressed 10 MB log might end up as a 1 MB .gz file or as an uncompressed file using 1.5 MB DRAM on an lzo zram device. But the 1 MB gzip version might need +2 MB on the same lzo compressed zram device. So it's not just a waste of CPU cycles but also inefficient DRAM use.
  16. Like
    tkaiser got a reaction from Tido in BSP scripts RFC   
    Well, switching from tmpfs to a zram block device that has been prepared beforehand is most probably the easy part.
    • A prerequisite to use such a zram device would be refactoring at least armhwinfo (so that armhwinfo eventually again only does what the name tells: collecting and logging information). All the stuff that configures/optimizes should end up in its own services that can be configured in a sane way so that user changes do not get lost with the next update. No idea how to do that though.
    • Then the whole zram idea needs a solution since in case we reinvent the wheel and create zram devices on our own (which seems like the only way to get zram for logs to me) we need to deal with the existence of other zram related services (e.g. the zram-config package in Ubuntu).
    • And I fear the part that needs most attention is logrotate and how to interact with it. Moving old/compressed logs into RAM is no good idea at all; moving logs compressed by logrotate from RAM to 'disk' and freeing the RAM afterwards seems like the only sane way to me. But I've no idea how this could evolve without reinventing the wheel again and implementing logrotate functionality on our own, now being aware of two filesystems to use instead of one. Maybe the use of symlinks helps? The initial sync from storage to RAM does not copy old/compressed contents but creates them as symlinks to the /var/log.hdd/ variant. And then a cronjob monitors /var/log and as soon as rotated logs appear there they're moved over to /var/log.hdd and replaced with symlinks, while active logfiles are treated differently and simply rsynced on a per file basis to /var/log.hdd (see the sketch below).
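    A rough sketch of that cronjob idea, purely hypothetical and untested (paths as in the text above, the rotated-log patterns are assumptions):
    # move rotated logs out of RAM and leave symlinks behind
    for f in /var/log/*.gz /var/log/*.[0-9]; do
        [ -f "$f" ] && [ ! -L "$f" ] || continue
        mv "$f" /var/log.hdd/ && ln -s "/var/log.hdd/$(basename "$f")" "$f"
    done
    # active logfiles just get rsynced to 'disk' on a per-file basis
    rsync -a --exclude='*.gz' --exclude='*.[0-9]' /var/log/ /var/log.hdd/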
  17. Like
    tkaiser got a reaction from Tido in BananaPi R2 (.csc mt7623 as new boardfamily)   
    A 'new' wiki? The product pictures show early board revisions from over one year ago that never got sold (showing one SATA port as optional, different SATA power ports, a different layout of the WAN/LAN ports, a battery connector that is of no use, DC-IN labeled '12V/2A' while it reads directly below '5 volt @2A via DC Power and/or Micro USB (OTG)'). Is this technical documentation or a joke?
     
    I started reading the section http://wiki.banana-pi.org/Banana_Pi_BPI-R2#Hardware_spec but already stopped after 7 lines of which 4 are wrong (more than 50%):
    • 'Quad-code ARM Cortex-A7' -- copy&paste as usual from their forum
    • 'Two SATA 2.0 Port (USB-to-SATA bridge)' -- as far as I know that's not USB but PCIe based and an ASM1061 (providing SATA 3.0)?
    • 'MTK6625L chip' -- does not exist
    • 'resolutions up 1920x1200' -- according to the SoC manual it's 1920x1080 instead
    So unfortunately still no technical documentation but just the careless 'copy&paste gone wrong' weirdness this SinoVoip employee is 'famous' for. Still zero progress :-(
     
  18. Like
    tkaiser got a reaction from datsuns in Quick review of NanoPi Fire3   
    Just spotted another important difference between NanoPi M3 and Fire3: 'AXP288 PMIC is gone, and replaced by an STM32 Cortex M0 MCU'. So DVFS implementation is different which also explains why with Armbian there's no difference in idle consumption/temperature when clocking with 400 vs. 1400 MHz.
  19. Like
    tkaiser got a reaction from guidol in orange-pi zero plus (H5) crashes when disabling CPUs   
    I experienced the same when playing with 4.14 on a NanoPi Fire 3. So maybe it's a problem of this kernel release? Anyway, I ended up adding
    extraargs="maxcpus=2"
    to /boot/armbianEnv.txt and rebooted to get a dual-core system.
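    For reference, the runtime way of disabling cores (presumably what triggered the crashes here) versus verifying the boot-time workaround; the CPU number is just an example:
    echo 0 > /sys/devices/system/cpu/cpu2/online   # disable a core at runtime
    echo 1 > /sys/devices/system/cpu/cpu2/online   # bring it back
    grep -c '^processor' /proc/cpuinfo             # with maxcpus=2 this reports 2 after reboot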
  20. Like
    tkaiser got a reaction from gounthar in banana pi single boards   
    It's stupid to combine routing/firewalling with NAS and multimedia stuff on the same device. On a router/firewall you do NOT want to increase attack surface by unnecessarily running any unneeded services (especially no graphics stuff). Also you usually do NOT want to expose your NAS to the whole world (security best practices).
     
    Have you ever seen a RealTek SDK? Yeah, me neither. The W2 is a 'closed source' thingie (running Android and a chrooted OpenWRT/Linux inside) and the R2 might be supported sometime in the future if mainline kernel support evolves and someone wants to spend his spare time on a questionable device (again: you do NOT want to combine routing/firewalling with NAS and multimedia stuff on the same device)
  21. Like
    tkaiser reacted to Igor in Improve 'Support over Forum' situation   
    Fixed, but a bit differently since it doesn't work the way we had it before.
  22. Like
    tkaiser reacted to wtarreau in Quick review of NanoPi Fire3   
    Thomas, I think it's problematic that you only have a watt-meter that includes the PSU, because the PSU's efficiency depends on the consumption (usually it's optimal around 50% load). Using a USB power meter would tell you the volts, amps and watts, and would even allow you to detect under-powering when it happens.
     
    I managed to get my board to shut down only once; it was powered by my laptop's USB3 port (that's what I do all the time but I'm probably close to the limit). It never happened on a 5V/2A PSU however. Since it was not yet very hot, I suspect that it was the power controller that shut it down rather than the temperature.
     
    I also had to play with the critical points to avoid needless throttling. I seem to remember having set them to 105, 110 and 112 degrees though I may be wrong since I ran many tests. Now I've packed the board inside a cardboard-made "enclosure" from which the heat hardly dissipates, and it can still throttle when reaching the first critical point, but that doesn't last long.
    When it happens, usually it's at 832/1024 of 1600 MHz = 1300 MHz, and more rarely 640/1024 * 1600 = 1000 MHz. I haven't run cpuburn yet though, I can if you're interested.
     
    Regarding the use cases, I think they are limited but the board is awesome when they can be met. For me, it's fantastic as a developer to test threaded code scalability on up to 8 cores, given that the L2 cache is shared between all cores. Usually you need a huge power-hungry CPU to get the same; here I have this in my laptop's bag. I also want to see what performance level I can reach on HTTP compression using libslz. I'm pretty sure that making some content recompression farms using such boards could be very affordable. Also the CPU supports native CRC32 instructions which are missing on x86 and affect gzip's performance, so I'll have to improve my lib to benefit from this. Miners may like to exploit some algorithms which perform well on ARMv8 and exploit the native AES and SHA2 implementations (I'm a bit clueless in this area). Last, for computer assisted vision, you have 8 cores with NEON ;-)
     
  23. Like
    tkaiser reacted to zador.blood.stained in zram vs swap   
    Because zram-config package exists only in Ubuntu by default, and for reasons I don't remember (probably version numbering / potential repository priority issues?) we decided to not copy it to our repository to be available for all releases.
     
    It's not set up for multi-user scenario yet (VM guests for different tasks and users instead of one shared space otherwise manual build tasks may conflict with automated ones)
     
    I meant this specific thread: https://forum.armbian.com/topic/6281-board-support-general-discussion-project-aims/
    Recent examples to add to that: there is no purpose in the recently added "development" branch if "master" is completely abandoned as a result, and suggesting to "switch to beta" to fix any issues defeats the purpose of "beta" -- fixes should be immediately pushed to the stable branch and beta should be left for non-production-ready stuff.
     
  24. Like
    tkaiser got a reaction from Tido in zram vs swap   
    Follow-Up on https://www.cnx-software.com/2018/05/13/how-to-get-started-with-opencl-on-odroid-xu4-board-with-arm-mali-t628mp6-gpu/ 
     
    It's about a compile job requiring up to 3 GB memory. Jean-Luc tested on an ODROID-XU4 with 2 GB RAM so he had to add 1 GB swap since otherwise compilation failed.
     
    It's 2018 and we all should know that swap on HDD is anachronistic or even stupid. But Linux users still do it since they already did it one or even two decades ago.
     
    Time for a 'quick' benchmark: I test on a board that is even more limited with regard to physical RAM: it's a freshly arrived NanoPi Fire3 with eight A53 cores but just 1 GB RAM, so it's very likely to run into out-of-memory situations with compile jobs running on all 8 cores in parallel.
     
    I used an Armbian Stretch image running with kernel 4.14.40 and configured 2.5 GB swap (either as /var/swap or zram). Also vm.swappiness is set to the default (60). 

    While compiling I've seen over 2.3 GB of swap being used when building with zram, with an average 3.5:1 compression ratio:
     
    root@nanopim3:/home/tk# zramctl
    NAME       ALGORITHM DISKSIZE   DATA COMPR  TOTAL STREAMS MOUNTPOINT
    /dev/zram0 lz4         495.3M 370.1M 99.3M 104.5M       8 [SWAP]
    /dev/zram1 lz4         495.3M 370.2M 99.4M 104.5M       8 [SWAP]
    /dev/zram2 lz4         495.3M 371.3M 99.8M   105M       8 [SWAP]
    /dev/zram3 lz4         495.3M 368.8M 99.1M 104.2M       8 [SWAP]
    /dev/zram4 lz4         495.3M 370.2M 99.4M 104.5M       8 [SWAP]

    First tests were with zram (trying to compare lzo with lz4), then I used a swap file on an ext4 formatted SD card (a fast A1 rated one and not 'the average SD card' showing magnitudes lower random IO performance). Afterwards I tested with an older 7200rpm notebook HDD and then a Samsung EVO 840, both attached via USB2 without UAS support (so both random and sequential IO are bottlenecked by USB2 -- 'typical SBC conditions'). The filesystem used was a freshly created ext4 so no performance difference between swap file and partition.
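    Swap devices like the five lz4 ones shown above can be created roughly like this (a sketch only, not necessarily how the Armbian image sets them up; five devices sized to end up at roughly 2.5 GB in total):
    modprobe zram num_devices=5
    for i in 0 1 2 3 4; do
        echo lz4 > /sys/block/zram$i/comp_algorithm
        echo 512M > /sys/block/zram$i/disksize
        mkswap /dev/zram$i
        swapon -p 5 /dev/zram$i
    done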
     
    Benchmark execution as follows: I deleted the working directory, then untarred the archive (see above) and then let the following run:
    time scons Werror=1 -j8 debug=0 neon=1 opencl=1 embed_kernels=1 os=linux arch=arm64-v8a build=native
    Results:
    zram lzo                  46m57.427s
    zram lz4                  47m18.022s
    SSD                      144m40.780s
    SanDisk Ultra A1 16 GB   247m56.744s
    HDD                      570m14.406s
    Using zram is a no-brainer. The job finished within 47 minutes with both the lzo and lz4 compression schemes. When testing with swap on real storage, for obvious reasons the SSD won (best random IO performance), then the A1 rated SD card could finish the job in 4 hours, and of course swap on HDD failed miserably since spinning rust simply sucks when it's about random IO performance. Swap on HDD is BS.
     
    A device with 1 GB physical DRAM is able to perform a build job requiring 3 GB memory within 47 minutes when using modern approaches (zram). When doing it like we did it last century (swapping to storage) then it's all about random IO performance: a great performing SSD, even if bottlenecked by USB2, is able to finish the job in less than two and a half hours, an el cheapo SanDisk Ultra A1 SD card with 16 GB scores 4 hours, while as expected a 7.2k notebook HDD takes almost 10 hours to do the same job. HDDs suck at random IO. Period.
     
    What can we learn from this? Nothing, since Armbian is only focused on adding more and more irrelevant boards to the build system instead of improving the situation for all boards.
     
    As a reference iozone numbers for the 3 storage scenarios:
    SanDisk Ultra A1
                                                         random   random
        kB  reclen    write  rewrite     read    reread    read    write
    102400       4     3677     3856     9344      9320    8070     2787
    102400      16     3890     7780    16398     16216   14832     6716
    102400     512     9925    13902    22494     22501   22411    13178
    102400    1024    13826    13750    22651     22649   22640    13792
    102400   16384    14031    13936    23141     23137   23163    15395

    Hitachi 2.5" 7200 rpm
                                                         random   random
        kB  reclen    write  rewrite     read    reread    read    write
    102400       4     7537     9837     8016      7763     671     1119
    102400      16    17452    20618    20852     20997    2522     4250
    102400     512    33123    33403    34165     34756   25589    32262
    102400    1024    33580    33310    34331     35056   29620    33148
    102400   16384    32592    33804    34921     35591   35277    33838

    Samsung EVO 840 SSD
                                                         random   random
        kB  reclen    write  rewrite     read    reread    read    write
    102400       4     7322     9481     9636      9682    7433     9496
    102400      16    16654    19453    18576     18639   18301    19370
    102400     512    31296    31489    32778     32876   32977    31337
    102400    1024    31016    31986    33334     34283   33991    31273
    102400   16384    30600    31336    33318     33678   33236    31913
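    The tables follow iozone's usual output layout (values in KB/s; the last two columns are random read and random write). The exact call used isn't quoted above, but Armbian's usual invocation producing this layout looks like:
    iozone -e -I -a -s 100M -r 4k -r 16k -r 512k -r 1024k -r 16384k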
  25. Like
    tkaiser got a reaction from chwe in Some storage benchmarks on SBCs   
    More or less irrelevant since btrfs is a modern filesystem utilizing modern concepts, e.g. 'checksumming' to ensure data integrity. This does not only 'waste' CPU cycles but also results in higher storage activity for the same tasks (since checksums have to be calculated, written, read and verified).
     
    E.g. copy-on-write (CoW), which directly affects the performance of write patterns that are of the same size or less than the filesystem's block size (usually 4K or larger), since now every write is in reality a 'read, modify, write' cycle: already existing data needs to be read from disk, then the new stuff is added, then the modified block is written to a new location and only afterwards is the old reference deleted. That's why btrfs and other CoW filesystems show horribly low performance in all benchmarks writing small chunks of data. Same when running database benchmarks on btrfs with defaults --> horrible performance as expected, since a CoW filesystem is nothing you want to put database storage on (and if you disable CoW in btrfs then checksumming is gone as well and then you're better off using ext4 or XFS anyway).
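    For completeness, the usual ways to disable CoW on btrfs for such workloads (with the caveat just mentioned: checksumming is lost for that data too):
    chattr +C /var/lib/mysql            # per directory (example path); only affects files created afterwards
    mount -o nodatacow /dev/sdX1 /mnt   # per mount; implies nodatasum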
     
    The (in)famous clickbait site dedicated to providing only numbers without meaning (Phoronix) periodically 'benchmarks' various filesystems inappropriately, so if you're after numbers visit https://www.phoronix.com/scan.php?page=article&item=linux414-fs-compare and similar posts.
     
     
    Unfortunately not, since passive benchmarking ('fire and forget' mode) never works. It only provides numbers without meaning and nice looking graphs but zero insights. With storage benchmarks, as with every other benchmark, only 'active benchmarking' works. And that requires some understanding, a lot of time and the will to throw away most of your work (+95% of all benchmark results, since usually something goes wrong, you have to find out what and then repeat). Almost nobody does this.
     
    On every benchmarked host, but especially on SBCs with their weak ARM CPU cores and limited resources, you always need to monitor various resources in parallel (htop, 'iostat 5', 'vmstat 5' and so on), switch at least to the performance governor, and take care of process and IRQ affinity (watching htop), especially on boards with big.LITTLE implementations, since otherwise you end up with the usual result of casual benchmarking: you benchmark A, but actually measure B, and conclude you've measured C.
     
    Next problem: once you have generated numbers, it's about generating insights from these numbers. Recently someone showed me this link as proof that bcache in Linux would be a great way to accelerate HDD access by using SSDs: http://www.accelcloud.com/2012/04/18/linux-flashcache-and-bcache-performance-testing/
     
    True or not? What do the numbers tell us? While almost everyone looking at those numbers and graphs will agree that bcache is great, in reality these benchmark numbers show exactly the opposite. But people ignore this since they prefer data over information and ignore what the numbers really tell them.
     
    Back on topic (SBC storage and not filesystem performance): the only reasonable way to compare different boards is to use the same filesystem (ext4 since it's the most robust and not that prone to showing different performance depending on kernel version) and the performance governor, eliminating all background tasks that could negatively impact performance and having at least an eye on htop. If you see one CPU core there being utilized at 100% you know that you've run into a CPU bottleneck and have to take this into account (either by accepting/believing that a certain SBC is simply too weak to deliver good storage performance since it's CPU bottlenecked, or by starting to improve settings as in Armbian's or my 'active benchmarking' approach, with the benchmarks then ending up as optimized settings --> there's a reason 'our' images sometimes perform even twice as fast on identical hardware as other SBC distros that don't care about what's relevant).
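    The minimal setup described above, spelled out as commands (iostat and vmstat come from the sysstat and procps packages; run the monitors in separate terminals while the benchmark is running):
    for c in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
        echo performance > "$c"
    done
    htop        # watch whether a single core sits at 100% during the benchmark
    iostat 5    # storage utilization
    vmstat 5    # memory pressure / swapping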