sfx2000

Members
  • Posts: 631

Everything posted by sfx2000

  1. You made a good point earlier on another thread about compilers - sysbench is a good example of something that is very sensitive to compiler options and versions, similar to UnixBench... Brendan Gregg has a great rant here -- http://www.brendangregg.com/blog/2014-05-02/compilers-love-messing-with-benchmarks.html key takeaway... Which is a thing to consider - Gregg is an active supporter of benchmarking, but in a different manner. It's worth checking out not only the linked article, but also the other posts on his site and the presentation videos.
  2. We're getting off topic, but yes, I agree - and we'll likely see any optimization effort on lzo/lz4 land on AArch64 first, mainly due to the gazillions of 64-bit Android phones and the recent push for big CPUs in the data center (Ampere and ThunderX2, for example). If it trickles down to ARMv7-A, that would be nice... I'd expect ARM to take the lead on that, just like Intel did with OpenSSL, where they leveraged SSE4 along with AES-NI to improve certain things (AES-128-GCM, for example, performs very well even without AES-NI).
  3. Noticed today that this happens with Armbian 5.6... Looking at something else, downloaded the image for NanoPi NEO, did the update/upgrade thing, and this popped up:

         update-initramfs: Generating /boot/initrd.img-4.14.70-sunxi
         I: The initramfs will attempt to resume from /dev/zram4
         I: (UUID=b387caf1-0414-4831-98f6-e8e10b3c3daa)
         I: Set the RESUME variable to override this.
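The last line of that message points at the knob. A minimal sketch, assuming Debian/Ubuntu initramfs-tools, where a RESUME=none line in a file under /etc/initramfs-tools/conf.d/ turns off resume-from-swap (see initramfs.conf(5)). The function name and the directory argument are made up for illustration, so it can be tried in a scratch directory before touching /etc:

```shell
#!/bin/sh
# Hypothetical helper (not from the post): write a RESUME override into a
# conf.d directory. On a real system that directory would be
# /etc/initramfs-tools/conf.d and the call would need root.
write_resume_override() {
    conf_dir="$1"
    mkdir -p "$conf_dir" || return 1
    # RESUME=none asks initramfs-tools not to pick any swap device for resume.
    printf 'RESUME=none\n' > "$conf_dir/resume"
}

# After writing the file for real, regenerate the initramfs:
#   sudo update-initramfs -u
```

Resuming from a zram device makes no sense anyway, since its contents do not survive a power cycle, so turning the feature off should be harmless on these boards.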
  4. So I bumped the clocks to 1.2GHz, and it still runs fairly cool - putting a bit of stress on it (unixbench -c 4 - it's not CPU burn, I know, but it's still a sustained load over a lot of different mixes of instructions), temps hit a high of 64C, which is pretty good all things considered - credit to FA for including a good heatsink there... I'll give @tkaiser's excellent bench script a try in a bit... but I'm expecting nothing exceptional, as he's tested the board extensively...
  5. Kind of reminds me of the Intel Management Engine - the AR100 block has direct access to memory outside of the ARM cores, Mali, and Cedrus - could be a security concern... and the firmware blob from AW is a closed binary (some folks have decompiled it). Anyways - did some spelunking around the webs, and found this... https://github.com/allwinner-zh/linux-3.4-sunxi/tree/master/drivers/arisc I don't think mainline includes this, as there are other ways to get that functionality - some of the Orange Pi images do include it (remember the Orange Pi Zero overheating issue, where ARISC would spam the logs if the right image wasn't used).
  6. Confirmed that AR100 is not involved with the FA Ubuntu 16.04 image, so that's solved... The FA image borrows a lot from the community at large - they do include all their add-ons to support the boards and hats. GPIO is incorrect as it tries to ID the board as an M1, not a NEO, but there's a guy over in Japan who has a fix to sort that out. It does run better than the old Allwinner BSP, but they know that the GPU/VPU isn't fully baked yet, so if that's important, either the AW BSP or something else might be the right path until the Bootlin stuff gets fully mainlined... Going to take another stab at 5.6, and will try the Bionic release rather than Stretch, which I tried a bit earlier.
  7. Checking in... NanoPi-NEO v1.31 with the OLED hat... need to be a bit specific as there are multiple HW revs of the NEO - v1.2 and later moved from an LDO to a DC/DC regulator, so info from 1.0/1.1 might not be accurate these days. Still sorting out the regulator in use, as there are a couple of options with the Allwinner H3 there. I think it's the SY8113B, but hard to tell...

     Testing Armbian 5.61 vs. FriendlyARM's 4.14.52 build for their OLED hat (I'd give a direct link, but they have two sources - Baidu and Google Drive).

     The FriendlyARM image is pretty good with temps on the 4.14.52 kernel, typically around 31-33C at idle, and I've not seen temps above 46C under load. The clocks on the FriendlyARM image are pretty conservative - max CPU is 1.0GHz... I'm assuming that mem is at 432, the default from the Allwinner BSP.

     tinymembench...

         standard memcpy : 448.1 MB/s (2.9%)
         standard memset : 1448.4 MB/s (2.5%)

     (note - the NEO is well heat-sinked with the provided HS, and it's in a metal enclosure - it's the cute little item from FriendlyARM... so one has the plate HS and the housing)

     My concern - Armbian 5.61 Ubuntu idles at 42C in the same scenario... it's the 10 degree difference at idle that's interesting. One can feel that the housing is warmer with the Armbian 5.61 image running - but subjective results are never good...

     Looking into this, not sure which numbers to trust there - did FriendlyARM pull in the DVFS changes? Are they leveraging the AR100 system manager block? Comment - as soon as one gets into AR100, it's a black-box blob - so some concerns there obviously, as it's a system manager with privileges perhaps beyond superuser.

     I'm assuming that current Allwinner Armbian mainline pulls in Linux upstream here, so some of the DVFS items haven't been pushed and accepted there. Thoughts?
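One way to take the subjectivity out of the idle comparison is to log the kernel's thermal zone on both images for a few minutes and compare averages. A rough sketch only: the function name is made up, and it assumes the usual millidegree values from /sys/class/thermal/thermal_zone0/temp:

```shell
#!/bin/sh
# Average a stream of millidegree readings (one per line) into degrees C.
avg_celsius() {
    awk '{ sum += $1; n++ } END { if (n) printf "%.1f\n", sum / n / 1000 }'
}

# Example sampling loop, 60 samples at 5 second intervals (run on the board):
#   i=0
#   while [ "$i" -lt 60 ]; do
#       cat /sys/class/thermal/thermal_zone0/temp
#       sleep 5
#       i=$((i + 1))
#   done | avg_celsius
```

Same board, same enclosure, same ambient temperature, and an idle system on both images, otherwise the averages are not comparable.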
  8. I suppose one could go into Network Manager and set IPv6 to link-local only. This will result in your device getting only an fe80::/64 address - which is handy for mDNS/Avahi on the local subnet...
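For a headless box the same setting can be made with nmcli. A sketch under assumptions: the helper below only prints the commands rather than running them, and the connection name is a placeholder (use whatever "nmcli connection show" lists on your system):

```shell
#!/bin/sh
# Print (not run) the nmcli commands that switch a connection profile to
# link-local-only IPv6. "Wired connection 1" is only a placeholder name.
ipv6_link_local_cmds() {
    conn="$1"
    printf 'nmcli connection modify "%s" ipv6.method link-local\n' "$conn"
    printf 'nmcli connection up "%s"\n' "$conn"
}

ipv6_link_local_cmds "Wired connection 1"
```

Printing first and pasting second is deliberate: changing the profile of the connection you are logged in over is an easy way to lock yourself out.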
  9. Looks like you and the team already did the investigation there with swappiness set to zero - and I agree with the results... Which is pretty much what I said...
  10. Agreed - it doesn't make sense on paper, but in the real world, lzo can do better than lz4 in the zram use case... There was a ticket open over on ChromiumOS - and performance for them was a wash with the code they have... They already had lzo in place, and lz4 didn't offer enough improvement, so it got moved to wontfix - I only bring this up as there was a fair amount of discussion about it, and it supports the current Armbian position.
  11. Swappiness=0 doesn't really disable swap, it just makes the kernel very reluctant to swap - even to the point of it getting out the OOM killer with extreme prejudice to kill off tasks, depending on other factors... The easiest way to ensure that swap is disabled is never to create a swap file/partition in the first place, and if one is present, swapoff -a fixes that... The flow diagram that @tkaiser points out is a good representation of the complexities involved... ZRAM, by and large, is a good thing, and I was very happy to see it enabled as part of the distro in the first place (just like mapping out /tmp to tmpfs, which is also good for flash based devices). Anyways, I never took it personally, and initiating the discussion wasn't intended to be a personal attack on things - that it encourages a documentation change is a good thing and everybody wins! That's the best possible outcome of constructive discourse.
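To check whether any swap is actually active, before or after swapoff -a, /proc/swaps is the place to look. A small sketch; the function name is made up, and the optional file argument exists only so it can be tried against a saved copy:

```shell
#!/bin/sh
# List active swap devices from a /proc/swaps-style file (default: /proc/swaps).
active_swap() {
    # Line 1 is the column header; column 1 is the device or file name.
    awk 'NR > 1 { print $1 }' "${1:-/proc/swaps}"
}

# On the board:
#   active_swap            # empty output means no swap is active
#   sudo swapoff -a        # turn off everything that was listed
```

On an Armbian image with ZRAM enabled one would expect /dev/zram* entries to show up here rather than a swap file or partition.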
  12. Are you referring to the bootlin effort on the Allwinner cedrus stuff? Great work there from a high-quality team... not sure if all they've done has made it into mainline yet... but it's promising. A word on transcoding in general - one is always going to get a quality hit when moving from one format to another - I'd always consider doing it at the last stage for playback only, not for archival purposes...
  13. I'll be the first to apologize, for opening a can of worms - whatever the defaults in armbian are, that's ok, people have spent time evaluating things, and have made decisions that are generally good in their experience. The VM params - most of them are tunable, can be done on the fly without a reboot - different use cases - desktop general vs. database server (yes, people can and do run databases on SBC's), each is going to have params that are more suitable for that particular application. On a personal level - yes, I'm sometimes very forward and frank on stating my opinions, and it can be offensive - it's not intended to be, and I'm sorry.
  14. Confused by your statement here... or maybe it reflects your understanding of things... Simply put - benchmarks are one thing, but application responsiveness is another. One can choose to swap aggressively or not - the current Armbian default is, IMHO, excessively high, and that opinion concurs with the Android and ChromiumOS devs, along with the rest of the world... including those of us who have scaled across different architectures, use cases, and platforms.

      For those who want to test and evaluate on their own... read the values first, so one knows the current state before changing anything:

          cat /proc/sys/vm/swappiness
          cat /proc/sys/vm/vfs_cache_pressure

      Now we can change this - it's a hot change, no need to reboot:

          sudo sysctl -w vm.swappiness=60    # range is 0 to 100, the kernel default is 60

      Since @tkaiser brings this up, one can also tinker with the file system cache pressure:

          sudo sysctl -w vm.vfs_cache_pressure=100    # the default here is 100, but one can gently turn it down

      Try 10 for swappiness and 50 for cache pressure - works well with small machines.

      Definitive source -- https://www.kernel.org/doc/Documentation/sysctl/vm.txt
  15. That I do agree with - but Apple does have the benefit of integrating their own HW (even chips these days with iOS) with their SW...
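The sysctl -w changes are gone after a reboot. If the values turn out to work for a given workload, one way to keep them is a sysctl.d fragment. A sketch only: the function name and the file name 99-vm-tuning.conf are made up, and the defaults are just the 10/50 suggestion from the post:

```shell
#!/bin/sh
# Emit a sysctl.d-style fragment for the two knobs discussed in the post.
# Arguments: swappiness (default 10), vfs_cache_pressure (default 50).
vm_tuning_fragment() {
    swappiness="${1:-10}"
    cache_pressure="${2:-50}"
    printf 'vm.swappiness = %s\n' "$swappiness"
    printf 'vm.vfs_cache_pressure = %s\n' "$cache_pressure"
}

vm_tuning_fragment 10 50

# To persist (hypothetical file name), then reload all sysctl config:
#   vm_tuning_fragment 10 50 | sudo tee /etc/sysctl.d/99-vm-tuning.conf
#   sudo sysctl --system
```

Worth checking that nothing else the distro ships sets the same keys later in the load order, or the fragment will be silently overridden.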
  16. Did you test with vm.swappiness set to 10, which was part of the discussion in the first place? You mention 60 and 100... And benchmarking makes for nice numbers, but look at the real-world experience across different applications.
  17. Apologies up front - after digging thru the forums, I can see you have a fair investment in your methods and means... fair enough, and much appreciated. I just ask that you keep an open mind on this item - I've got other things to worry about... current tasks are rk3288 clocks and temps, and an ask to look at rk_crypto performance overall... Keep it simple there... many use cases to consider - one can always find a benchmark to prove a case... I've been there, and this isn't the first ARM platform I've worked with - I've done BSP's for imx6, mvebu, Broadcom, and QCA... not my first rodeo here. Just trying to help.
  18. Actually it does and doesn't - with big.LITTLE, we have ARM GTS on our side which makes things a bit transparent, so one can always do a single zram pool and let the cores sort it out with the appropriate kernel patches from ARM... my little script assumes all cores are the same, so we do take some liberty there with allocations...
  19. Pi's and VC4 VCOS - aka ThreadX as per what @tkaiser refers to... Seems that the Pi folks have done a bit - however, I agree that VCOS does stretch things a bit - if you ask VCOS, it says one thing, you ask the kernel, it says another, and always trust the kernel... Example below - 4 threads on UnixBench - the Pi3BPlus is running at 1.4GHz, and the results suggest that everything is good there when compared to its little brother, the Pi3. Current rpi-firmware throttles back at 80C in my experience, and under load, it dances close to it... for folks that work on DVFS curves, there's a bit to appreciate there. Notice there's no throttle at 60C now, and hasn't been for a while - the gimpage was early on with the 1.4GHz product... That being said - don't trust VCOS reports from userland -- trust the kernel. With 4.14, the numbers are honest - for all Pi's...

      pinfo.sh...

          #!/bin/bash
          # pinfo.sh looks at firmware/kernel - clocks, temps for cpu
          # create this file in the user dir, and make it executable

          # sysfs reports millidegrees and kHz - the sed inserts a "." before the last three digits
          celsius=$(cat /sys/class/thermal/thermal_zone0/temp | sed 's/.\{3\}$/.&/')
          clock0=$(cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq | sed 's/.\{3\}$/.&/')
          # can perhaps comment below out, some SoC's gang the clocks - but with a quad, all are reported
          clock1=$(cat /sys/devices/system/cpu/cpu1/cpufreq/scaling_cur_freq | sed 's/.\{3\}$/.&/')
          clock2=$(cat /sys/devices/system/cpu/cpu2/cpufreq/scaling_cur_freq | sed 's/.\{3\}$/.&/')
          clock3=$(cat /sys/devices/system/cpu/cpu3/cpufreq/scaling_cur_freq | sed 's/.\{3\}$/.&/')

          echo "Host => $(date) @ $(hostname)"
          echo "Uptime =>$(uptime)"
          echo "SW Rev => $(uname -vr)"

          # VC4 stuff for RPi variants - comment out as needed for non-VC targets
          echo "FW Rev => $(vcgencmd version)"
          echo "==============="
          echo "ARM Mem => $(vcgencmd get_mem arm)"
          echo "GPU Mem => $(vcgencmd get_mem gpu)"
          echo "==============="
          echo "Pi Temp => $(vcgencmd measure_temp)"
          echo "Pi Volts => $(vcgencmd measure_volts core)"
          echo "Pi Clock => $(vcgencmd measure_clock arm)"
          # end VC4 stuff

          # rest is from linux, should apply to any
          echo "==============="
          echo "ARM Temp => ${celsius} °C"
          echo "Core0Clock=> ${clock0} MHz"
          # see above - no harm keeping this, but one must be consistent
          echo "Core1Clock=> ${clock1} MHz"
          echo "Core2Clock=> ${clock2} MHz"
          echo "Core3Clock=> ${clock3} MHz"
          echo "==============="
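The only non-obvious bit in pinfo.sh is the sed expression, so here it is in isolation with sample values. The wrapper function name is made up for illustration; the expression itself is the one from the script:

```shell
#!/bin/sh
# Same sed expression as in pinfo.sh: put a "." in front of the last three
# characters, turning millidegrees into degrees and kHz into MHz.
insert_point() {
    printf '%s\n' "$1" | sed 's/.\{3\}$/.&/'
}

insert_point 48312     # thermal_zone0/temp in millidegrees -> 48.312
insert_point 1400000   # scaling_cur_freq in kHz           -> 1400.000
```

It is pure text manipulation, no arithmetic, so it would mangle a reading with fewer than four digits. That should not come up with real temperatures or clocks.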
  20. Hint - putting a task to observe changes the behavior, as the task itself takes up time and resources... Even JTAG does this, and I've had more than a few junior engineers learn this the hard way... Back in the days when I was doing Qualcomm MSM work - running the DIAG task on REX changed timing, or running additional debug/tracing in userland - so things that would crash the MSM standalone, wouldn't crash when actually trying to chase the problem and fix it. This was especially true with the first MSM's that did DVFS - the MSM6100 was the first one I ran into... It's a lightweight version of Schrödinger's Cat -- https://en.wikipedia.org/wiki/Schrödinger's_cat I always asked my guys - "did you kill the cat?" on their test results....
  21. It would be awesome to have a decent NIC onboard, 2*2 MIMO and/or 5GHz... I think part of the challenge here is that the sub-$50 SBC boards are largely driven by cost considerations, so the on-board wireless NIC's are either the lowest cost available, including regulatory testing, which drives the System in Package* parts like the Ampak's etc, or heavily subsidized, e.g. the Broadcom NIC's soldered directly on the Pi boards - fair to say that Broadcom has a major interest in the Pi boards...

      * SIP's have an advantage, as the SIP vendor does the heavy wireless testing/certification, so when integrated onto a board, it's less work for the OEM/ODM there, and work == non-recurring engineering costs

      That, and the available interfaces on the SoC itself - most vendors go to SDIO because it's generally there, and it's easy enough for a combo part that has BT to run both interfaces, WiFi on SDIO and BT on UART. That's why when one looks at the BW on the Pi3B+ in 5GHz, even though it's a single stream 11ac connection, the constraint is not the OTA link, but the bus that supports the NIC in the first place - SPI/SDIO can only go so fast, even on a dedicated bus that is not shared with the SD card or eMMC.

      The external NIC support, esp with USB WiFi, on Armbian is outstanding - something to be appreciated...
  22. I think we're going to have to agree to disagree here - and frank discussion is always good... What you have to look at is the tendency to swap, and what that cost actually is - one can end up unmapping pages if not careful, and have a less responsive system - spinning rust, compcache, NVMe, etc... swap is still swap.

      swap_tendency = mapped_ratio/2 + distress + vm_swappiness

      (for the lay folks - the 0-100 value in vm.swappiness is a weighted value, not a hard threshold on free memory: the higher it is, the sooner the kernel starts swapping out mapped pages instead of only trimming the caches)

      So if you want to spend time thrashing memory, keep it high - higher does keep the caches free, which may or may not be desired depending on the particular workload in play... worst case, if set too high, app responsiveness may suffer... One other consideration is that some apps try to manage their own memory - mysql/mariadb is a good example, where it can really send the memory manager off the deep end if heavily loaded... So it's ok to have different opinions here, and easy enough to test/modify/test again... for those that want to play - it's easy enough to change on the fly....

          sudo sysctl -w vm.swappiness=<value>    # the range here is 0-100 - 0 makes the kernel avoid swapping for as long as it can, it does not switch swap off
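To put numbers on that formula: a quick worked example, assuming the behaviour of the older 2.6-era kernels the formula comes from, where mapped pages became reclaim candidates once swap_tendency reached 100. Newer kernels balance things differently, so treat this purely as an illustration of the weighting; the function name is made up:

```shell
#!/bin/sh
# swap_tendency = mapped_ratio/2 + distress + vm_swappiness (integer math)
#   mapped_ratio : percent of memory that is mapped (0-100)
#   distress     : how hard reclaim is struggling (0 = no trouble, up to 100)
#   swappiness   : the vm.swappiness setting
swap_tendency() {
    mapped_ratio="$1"
    distress="$2"
    swappiness="$3"
    echo $(( mapped_ratio / 2 + distress + swappiness ))
}

swap_tendency 60 0 60   # 90  -> under 100, mapped pages are left alone
swap_tendency 80 0 60   # 100 -> threshold reached with no memory pressure at all
swap_tendency 80 0 10   # 50  -> needs distress of 50 before it reaches 100
```

With the default of 60, a box with most of its memory mapped starts swapping before reclaim is in any trouble; at 10 it takes real pressure to get there, which is the argument for the lower value on small machines.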
  23. Tossing this over the fence... Really need to get some 5GHz action going - the Cypress (ex-Broadcom) devices do well here...

      Rpi3B Plus... some concern about the retransmits (Retr) in one direction - it's likely that the 802.11ac link is faster than the SDIO link on the Pi, so there's an issue there with flow control perhaps - not much different than the Pi3BPlus on ethernet, which is a similar issue with GbE on the PHY there...

          iperf3 -c 192.168.1.20 -t 120 && iperf3 -R -c 192.168.1.20 -t 120

      Source to sink...

          [ 4] 0.00-120.00 sec 1.58 GBytes 113 Mbits/sec 0 sender
          [ 4] 0.00-120.00 sec 1.58 GBytes 113 Mbits/sec receiver

      Sink to source...

          [ 4] 0.00-120.00 sec 1.50 GBytes 108 Mbits/sec 3558 sender
          [ 4] 0.00-120.00 sec 1.50 GBytes 107 Mbits/sec receiver

          sudo iwconfig wlan0

          wlan0     IEEE 802.11  ESSID:"homeernet"
                    Mode:Managed  Frequency:5.18 GHz  Access Point: 90:72:40:AA:BB:CC
                    Bit Rate=433.3 Mb/s   Tx-Power=31 dBm
                    Retry short limit:7   RTS thr:off   Fragment thr:off
                    Encryption key:off
                    Power Management:on
                    Link Quality=70/70  Signal level=-38 dBm
                    Rx invalid nwid:0  Rx invalid crypt:0  Rx invalid frag:0
                    Tx excessive retries:0  Invalid misc:0   Missed beacon:0
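When repeating runs like this, it helps to pull just the rate and the retransmit count out of the iperf3 summary. A small sketch that assumes the summary-line layout shown in the post (a Retr column right after Mbits/sec on the sender line); the function name is made up:

```shell
#!/bin/sh
# Read iperf3 output on stdin and print "<Mbits/sec> <retransmits>" for each
# sender summary line. Receiver lines carry no Retr column and are skipped.
iperf_sender_summary() {
    awk '$NF == "sender" {
        for (i = 2; i < NF; i++)
            if ($i == "Mbits/sec") print $(i - 1), $(i + 1)
    }'
}

# Usage on the client:
#   iperf3 -R -c 192.168.1.20 -t 120 | iperf_sender_summary
```

For the two runs in the post this gives "113 0" and "108 3558", which makes the asymmetry in retransmits easy to track across firmware or driver changes.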
  24. Just for fun - http://ix.io/1niO -- Pi Zero W with stock Raspbian and current (as of 3/21/2018) firmware... @tkaiser -- you might reconsider the reported gimping of the Pi3 B+: the current VC4 firmware resolves many of the concerns raised on the github. There's still an item with "vcgencmd measure_clock arm", where it returns values that are consistent with the Pi3, but not the Plus - so the "fake" results are probably what's returned from the VC4. "vcgencmd measure_temp" is a bit more accurate than the ARM temps there on the chip, and it measures the chip package, not the actual cores, which is expected, as we're asking the firmware, which reports what VC4 sees. I trust what the kernel returns more than the VC4 info for clocks - and data between Pi3 and Pi3 Plus confirms this... here are current Pi3 Plus results... http://ix.io/1niD Similar with the Pi Zero W - VC4 reports a consistent 1GHz, but the kernel shows that the ARM is at 700MHz when idle. VC4 is a mess - that much is true... and there's little insight into what Broadcom provides there. Maybe the folks at RPf know more, but that's likely all NDA and closed...
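To compare the two views of the clock side by side, both need to be in the same unit. A sketch under assumptions: it takes vcgencmd's reply to be of the form "frequency(NN)=<Hz>" and the kernel's scaling_cur_freq to be in kHz; both helper names are made up:

```shell
#!/bin/sh
# Convert the firmware's answer, e.g. "frequency(45)=1000000000", to MHz.
vc_hz_to_mhz() {
    hz="${1#*=}"              # keep only what follows the "="
    echo $(( hz / 1000000 ))
}

# Convert the kernel's cpufreq value (kHz) to MHz.
khz_to_mhz() {
    echo $(( $1 / 1000 ))
}

# On a Pi:
#   vc_hz_to_mhz "$(vcgencmd measure_clock arm)"
#   khz_to_mhz "$(cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq)"
```

On the idle Pi Zero W described above, one would expect the first to print 1000 and the second 700, which is exactly the disagreement being pointed out.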