ag123 (Author) Posted July 4

This is somewhat off-topic, but still relevant to the Orange Pi Zero 3.

If the Orange Pi Zero 3 is operated in a warm climate (e.g. a room temperature of 30 deg C), it can at times run up to about 60 deg C in open, still air. Adding a fan blowing at it reduces that by some 20 deg C, to about 40 deg C!

And this is my ghetto fan setup: no fancy case, no heatsink, nothing, just a single long machine screw that lifts it up.

Checking temperatures is easy:

> armbianmonitor -m
Stop monitoring using [ctrl]-[c]
Time        CPU    load %cpu %sys %usr %nice %io %irq   Tcpu  C.St.
18:03:39   480 MHz  0.00   0%   0%   0%   0%   0%   0%  40.8 °C  0/7
^C

Strictly speaking, 60 deg C is nothing to scream about. I have an RPi 4 that hits 80 deg C and throttles. There too, a fan blowing at it plus a heatsink over the CPU drastically reduces running temperatures.

For occasional use, I don't think it is necessary to have a fan blowing at the Orange Pi Zero 3. I think it would be feasible to run at lower temperatures if I disable and unclock the GPU and HDMI, but for now I'm not sure how to go about doing that. Initially I thought the WiFi might be causing it, but now I don't think so; it is moderately likely that the GPU is heating it up a bit. And still air doesn't seem to dissipate heat very well.
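For logging the temperature without armbianmonitor, here is a minimal sketch that reads the kernel's thermal sysfs node directly. It assumes the CPU sensor is exposed as thermal_zone0 (the zone index may differ between kernels) and that the value is in millidegrees Celsius, which is the usual sysfs convention. millideg_to_c is a helper name made up for this example.

```shell
#!/bin/sh
# Convert a millidegree-Celsius reading to degrees C with one decimal place.
# millideg_to_c is a hypothetical helper name used only in this sketch.
millideg_to_c() {
    awk -v m="$1" 'BEGIN { printf "%.1f\n", m / 1000 }'
}

# Assumption: thermal_zone0 is the CPU zone; the "type" file next to it names the sensor.
ZONE=/sys/class/thermal/thermal_zone0/temp
if [ -r "$ZONE" ]; then
    millideg_to_c "$(cat "$ZONE")"
fi
```

Run in a loop (e.g. with watch), this gives roughly the Tcpu column shown above without the other statistics.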
ag123 (Author) Posted July 7

Just like to say that the recent images work just fine:

[Armbian ASCII-art banner]

v25.8 rolling for Orange Pi Zero3 running Armbian Linux 6.12.35-current-sunxi64

Packages:     Debian stable (bookworm)
Support:      for advanced users (rolling release)
IPv4:         (LAN) xxx.xxx.xxx.xxx (WAN) yyy.yyy.yyy.yyy
IPv6:         fd00:xxxx:xxxx::xxxx:xxxx (WAN) xxxx:xxxx::yyyy:yyyy
WiFi AP:      SSID: (ssid),

Performance:

Load:         2%              Uptime:       3:50
Memory usage: 4% of 3.83G
CPU temp:     41°C            Usage of /:   3% of 58G
RX today:     7 MiB

Commands:

Configuration : armbian-config
Monitoring    : htop
ag123 (Author) Posted Saturday at 10:27 AM

A thread about video, in case anyone is looking for it, and a recent 'success story'.
TRay Posted Saturday at 01:52 PM

On 6/29/2025 at 7:59 AM, TRay said:
"Will armbian-config be fixed for OZPI v3 etc. problem using overlay-prefix in device tree? https://github.com/armbian/configng/issues/592"

@Igor thank you for fixing the overlay-prefix problem in the current armbian-config. Now, on OZPI V1 and V3, the overlays in armbianEnv.txt are written without the prefix.
robertoj Posted Monday at 05:39 PM

On 7/12/2025 at 3:27 AM, ag123 said:
"and a recent 'success story'"

My story is definitely NOT a success story. I see the media.patches in the cache folder and I compiled Armbian edge, but the image didn't contain the cedrus + v4l2 kernel modules I need for decoding acceleration.
ag123 (Author) Posted Tuesday at 05:01 AM

Quote:
"* mpv plays most mp4s VERY SMOOTHLY BUT WITH 100% CPU"

Oops, I missed reading that 100% CPU, but it is OK, it is an A53 after all 😅

Video, I'd guess, is still 'difficult' on the Z3. Apparently there is some support for GPU vector graphics, but I'd guess that is mostly just triangles. Video decoding can be done with just Neon (vector computation), but I'd guess there is still limited access to the video decoding hardware. Using Neon is likely to give that 100% CPU reading, as the CPU is literally busy. Using the real video hardware would be 'invisible' in a sense: CPU usage may look low, but one won't see that the video hardware itself may be running at 100%.
robertoj Posted Tuesday at 05:29 PM

12 hours ago, ag123 said:
"Oops, I missed reading that 100% CPU, but it is OK, it is an A53 after all 😅 Video, I'd guess, is still 'difficult' on the Z3. Apparently there is some support for GPU vector graphics, but I'd guess that is mostly just triangles. Video decoding can be done with just Neon (vector computation), but I'd guess there is still limited access to the video decoding hardware. Using Neon is likely to give that 100% CPU reading, as the CPU is literally busy. Using the real video hardware would be 'invisible' in a sense: CPU usage may look low, but one won't see that the video hardware itself may be running at 100%."

Thank you for replying.

I run xscreensavers-gl in a window and it always gets 30 FPS with <10% CPU on my opiz3. I even show 3D models in F3D, so I am getting 3D Mesa acceleration on both HDMI and SPI-LCD displays.

If the video decoding on my opiz3 is using ARM Neon instructions, then I am fortunate to have that, at least (note: this is possible without needing ffmpeg-v4l2request).

I will have to re-check how I was successful with 1080p H264 acceleration last year (I was even getting temporary glitches and pink hues sometimes).

Edited Tuesday at 05:30 PM by robertoj
ag123 (Author) Posted Tuesday at 10:12 PM

Arm Neon is quite a thing: SIMD.

https://developer.arm.com/documentation/102159/latest/
https://github.com/thenifty/neon-guide

Apparently aarch64 cores (e.g. Cortex-A53, A55, A72, A75, A76 etc., i.e. Armv8-A onwards) have it:

https://developer.arm.com/documentation/102474/0100/Fundamentals-of-Armv8-Neon-technology

The H618 uses Cortex-A53 cores and hence should have it.

It is a good 'replacement' for proprietary hardware, because Neon, like Intel's SSE and AVX SIMD extensions, is defined and standardized by Arm. Hence it works as long as programs are coded and compiled to use it.

The proprietary video hardware, on the other hand, is still undocumented (at least not publicly), and most of the work on it is reverse engineered and incomplete. Apps written to use Neon SIMD, however, would 'just work' and are accelerated simply by virtue of it being SIMD.
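Whether the kernel reports the SIMD unit can be checked from /proc/cpuinfo. This is a minimal sketch, assuming a 64-bit (aarch64) kernel, where Neon shows up as the asimd feature flag (32-bit Arm kernels list it as neon instead). has_asimd is a helper name invented for this example.

```shell
#!/bin/sh
# Succeeds if the cpuinfo text on stdin lists "asimd" as a whole word
# on a "Features" line (the aarch64 name for Advanced SIMD / Neon).
# has_asimd is a hypothetical helper name used only in this sketch.
has_asimd() {
    grep -i '^Features' | grep -qw 'asimd'
}

if [ -r /proc/cpuinfo ]; then
    if has_asimd < /proc/cpuinfo; then
        echo "Advanced SIMD (Neon) reported by the kernel"
    else
        echo "no asimd flag found (not aarch64, or SIMD not reported)"
    fi
fi
```

This only shows that the CPU offers the instructions; whether a given program uses them depends on how it was written and compiled.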
ag123 (Author) Posted Wednesday at 09:14 AM

OK, we have a cheap SBC, the Z3 with an H618, but we'd still want to run it like a supercomputer.

https://linux-sunxi.org/Benchmarks#Linpack

Download https://www.netlib.org/benchmark/linpackc.new and save it as linpack.c

Makefile:

all: linpack-noopt linpack-o3

# Baseline: no optimization flags
linpack-noopt: linpack.c
	gcc -o $@ $^

# Optimized: -O3 with Cortex-A53 tuning and auto-vectorization
linpack-o3: linpack.c
	gcc -O3 -o $@ $^ -lm -mcpu=cortex-a53 -march=armv8-a -ftree-vectorize -funsafe-math-optimizations

# No prerequisites here, otherwise "make clean" would build the binaries before deleting them
clean:
	rm -f linpack-noopt linpack-o3

.PHONY: all clean

For your convenience it is in the attached zip file. To unzip it you may need (as root):

apt install zip unzip

For the compilers you may need:

apt install build-essential

$ make
gcc -o linpack-noopt linpack.c
gcc -O3 -o linpack-o3 linpack.c -lm -mcpu=cortex-a53 -march=armv8-a -ftree-vectorize -funsafe-math-optimizations

$ ./linpack-noopt
Enter array size (q to quit) [200]:
Memory required: 315K.

LINPACK benchmark, Double precision.
Machine precision: 15 digits.
Array size 200 X 200.
Average rolled and unrolled performance:

    Reps Time(s) DGEFA   DGESL  OVERHEAD    KFLOPS
----------------------------------------------------
      32   0.68  88.14%   2.66%   9.20%  71117.671
      64   1.36  88.13%   2.66%   9.21%  71103.230
     128   2.72  88.14%   2.66%   9.20%  71118.447
     256   5.44  88.14%   2.66%   9.20%  71117.368
     512  10.89  88.14%   2.66%   9.20%  71118.505

Enter array size (q to quit) [200]: q

$ ./linpack-o3
Enter array size (q to quit) [200]:
Memory required: 315K.

LINPACK benchmark, Double precision.
Machine precision: 15 digits.
Array size 200 X 200.
Average rolled and unrolled performance:

    Reps Time(s) DGEFA   DGESL  OVERHEAD    KFLOPS
----------------------------------------------------
     128   0.53  86.33%   2.89%  10.78%  374433.231
     256   1.05  86.33%   2.88%  10.79%  374573.654
     512   2.10  86.34%   2.88%  10.79%  374443.201
    1024   4.21  86.32%   2.88%  10.80%  374574.751
    2048   8.42  86.32%   2.88%  10.80%  374612.768
    4096  16.83  86.33%   2.88%  10.79%  374574.926

Enter array size (q to quit) [200]: q

This is a single-core benchmark. Apparently gcc -O3 does Neon SIMD.

Attachment: linpack.zip
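One way to check the 'apparently' is to look for vector instructions in the disassembly of linpack-o3. This is a rough heuristic, assuming objdump from binutils is installed: AArch64 Advanced SIMD operands carry an arrangement suffix such as v0.2d or v1.4s, while scalar floating point uses names like d0 or s0. count_simd_lines is a helper name made up for this sketch.

```shell
#!/bin/sh
# Count disassembly lines on stdin that use an AArch64 vector register
# with an arrangement specifier (e.g. v0.2d, v3.4s, v1.16b).
# count_simd_lines is a hypothetical helper name used only in this sketch.
count_simd_lines() {
    grep -cE '(^|[^[:alnum:]_])v[0-9]+\.(16b|8b|8h|4h|4s|2s|2d|1d)'
}

# Only inspect the benchmark binary if it exists in the current directory.
if [ -x ./linpack-o3 ] && command -v objdump >/dev/null 2>&1; then
    objdump -d ./linpack-o3 | count_simd_lines
fi
```

A clearly non-zero count for linpack-o3 and a count at or near zero for linpack-noopt would support the auto-vectorization guess, although it does not show how much of the speedup comes from SIMD rather than from the other -O3 optimizations.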
robertoj Posted Wednesday at 04:54 PM

I don't even have a reference point: what should I start comparing?

I have read claims that python3-numpy and python3-opencv are highly optimized, but I never researched HOW optimized.

I have also heard that DRM can help accelerate machine learning:
https://www.youtube.com/watch?v=NQz6VqvtehI&t=5m7s

Edited 23 hours ago by robertoj
Gabriel Negrisiolo Righi Posted 21 hours ago

For those who are seeking to enable video decode: I've been able to get it working with these libs:

https://www.elektroda.pl/rtvforum/topic4018092.html#20840047

Download h618_hwdec.tar.gz, replace the libs, and add "extraargs=cma=256M" to your /boot/armbianEnv.txt. mpv works flawlessly with --hwdec=drm --profile=fast, and scrcpy is also fast with minimal CPU usage. I'm also using the rolling edge kernel and the latest Mesa 25.2 built from source.

Attachment: libva-v4l2-request-HACK_HEVC.zip

Edited 20 hours ago by Gabriel Negrisiolo Righi (reason: include info)
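The armbianEnv.txt change can be scripted. This is a minimal sketch, assuming the file has no extraargs= line yet; if one already exists, the sketch leaves the file untouched so that cma=256M can be merged into that line by hand. add_cma_arg is a name invented for this example, and it takes the file path as an argument so it can be tried on a copy first.

```shell
#!/bin/sh
# Append extraargs=cma=256M to the given env file unless an extraargs= line exists.
# add_cma_arg is a hypothetical helper name used only in this sketch.
add_cma_arg() {
    env_file=$1
    if grep -q '^extraargs=' "$env_file"; then
        echo "extraargs already set in $env_file; add cma=256M to that line manually" >&2
        return 0
    fi
    # If the file does not end with a newline, add one so the new entry gets its own line.
    if [ -n "$(tail -c1 "$env_file")" ]; then
        echo >> "$env_file"
    fi
    printf 'extraargs=cma=256M\n' >> "$env_file"
}

# Usage (as root, followed by a reboot): add_cma_arg /boot/armbianEnv.txt
```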
ag123 (Author) Posted 16 hours ago

@robertoj

Quote:
"I don't even have a reference viewpoint what should I start comparing? I read claims that Python3-numpy, python3-opencv are highly optimized, but I never researched HOW OPTIMIZED. I have also heard that DRM can help accelerate machine learning https://www.youtube.com/watch?v=NQz6VqvtehI&t=5m7s"

Well, Neon SIMD isn't just useful for matrix math. It is also useful, for example, as a video decoder/encoder in place of specialized on-chip video hardware. It could partially explain the 'better performance' of mpv (https://mpv.io/): if mpv is built with -O3, or uses a library that is optimised with Neon SIMD, it could plausibly reach performance similar to the on-chip proprietary video hardware (which is not publicly documented), at the cost of an apparent 100% CPU usage if all 4 CPU cores are busy with Neon SIMD.

I think I once chanced upon an RPi forum comment about shifting the codecs to Neon SIMD instead of using proprietary video hardware, partly because these 'small' chips have limited capabilities for on-chip video processing. It isn't really a bad thing if we'd use, say, an Opi Z3 as a dedicated video streamer. One catch is that at 100% CPU, non-compute threads may struggle to get a slot to run at times; it may take setting 'nice' levels so that some threads get a higher priority.

I've been thinking about running a (crypto coin) miner on it, and will probably do that some time. These chips certainly don't get close to, say, a Haswell or a Ryzen or even a low-end GPU, but they are faster than the older, smaller chips.

For a comparison, here are the 'old' figures quoted at https://linux-sunxi.org/Benchmarks#Linpack

-mcpu=cortex-a8 -march=armv7-a -mfpu=neon -mfloat-abi=hard -funsafe-math-optimizations -fno-fast-math

Memory required: 315K.

LINPACK benchmark, Double precision.
Machine precision: 15 digits.
Array size 200 X 200.
Average rolled and unrolled performance:

    Reps Time(s) DGEFA   DGESL  OVERHEAD    KFLOPS
----------------------------------------------------
      16   0.61  88.52%   6.56%   4.92%  37885.057
      32   1.21  85.12%   2.48%  12.40%  41459.119
      64   2.43  93.83%   2.47%   3.70%  37561.254
     128   4.86  91.77%   2.47%   5.76%  38381.368
     256   9.70  92.06%   2.89%   5.05%  38173.000
     512  19.41  91.29%   2.47%   6.23%  38634.432

-mcpu=cortex-a8 -mtune=cortex-a8 -march=armv7-a -mfpu=neon -mfloat-abi=hard -funsafe-math-optimizations -fomit-frame-pointer -ffast-math -funroll-loops -funsafe-loop-optimizations

Memory required: 315K.

LINPACK benchmark, Double precision.
Machine precision: 15 digits.
Array size 200 X 200.
Average rolled and unrolled performance:

    Reps Time(s) DGEFA   DGESL  OVERHEAD    KFLOPS
----------------------------------------------------
      16   0.53  90.57%   1.89%   7.55%  44843.537
      32   1.05  90.48%   3.81%   5.71%  44390.572
      64   2.13  90.14%   2.35%   7.51%  44615.905
     128   4.23  90.54%   3.07%   6.38%  44390.572
     256   8.46  90.19%   2.84%   6.97%  44672.596
     512  17.03  90.55%   2.76%   6.69%  44250.892

Compared with those, the -O3 result on the Z3 (about 374,500 KFLOPS) is roughly an 8x to 10x improvement on a single core.
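The '8x to 10x' figure can be checked against the numbers quoted in this thread. The arithmetic below takes the 256-rep row from the -O3 run on the Zero 3 and the 256-rep rows from the two Cortex-A8 tables; ratio is a throwaway helper name for this sketch.

```shell
#!/bin/sh
# Print how many times faster result $1 is than result $2 (both in KFLOPS).
# ratio is a hypothetical helper name used only in this sketch.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

# Zero 3 (-O3, 256 reps) vs. Cortex-A8, first table (256 reps)
ratio 374573.654 38173.000    # prints 9.8
# Zero 3 (-O3, 256 reps) vs. Cortex-A8, second table (256 reps)
ratio 374573.654 44672.596    # prints 8.4
```

The compiler versions and flags differ between the two sets of results, so this compares whole toolchain-plus-chip combinations rather than the cores alone.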
ag123 (Author) Posted 13 hours ago

Tried mining Feathercoin:

git clone https://github.com/ghostlander/cpuminer-neoscrypt

There are lots of missing dependencies to build that:

apt install automake autoconf-archive pkg-config libtool libcurl4-openssl-dev

But once that is done, it is just autogen.sh, configure, make.

Next, register on https://www.mining-dutch.nl/ and then run:

./minerd -D --algo=neoscrypt --url=stratum+tcp://mining-dutch.nl:9993 -u username.worker1 -p d=10

Spoiler:

Hash: 020E9F4B68201E05469CC87039286A9EEFAFB6525E9CDDAD40E4DCFC6D950000x0
Target: 0000000000000000000000000000000000000000000000000000000098990100x0
[2025-07-17 20:07:06] thread 2: 537 hashes, 1.097 KH/s
[2025-07-17 20:07:06] accepted: 14/14 (100.000%), 4.398 KH/s (yay!!!)
[2025-07-17 20:07:14] DEBUG (little endian): hash <= target
Hash: FD8193D5659404573894CC22F8A37859A2347842E382CCF151747A0E34220100x0
Target: 0000000000000000000000000000000000000000000000000000000098990100x0
[2025-07-17 20:07:14] thread 1: 9481 hashes, 1.100 KH/s
[2025-07-17 20:07:14] accepted: 15/15 (100.000%), 4.397 KH/s (yay!!!)
[2025-07-17 20:07:18] DEBUG (little endian): hash <= target
Hash: 9B92D3883EA6CE4BC66BE97CF9CDC6E8C51B0D28B8C3D7F2BA28B8C58D310100x0
Target: 0000000000000000000000000000000000000000000000000000000098990100x0
[2025-07-17 20:07:18] thread 2: 13983 hashes, 1.100 KH/s
[2025-07-17 20:07:18] accepted: 16/16 (100.000%), 4.400 KH/s (yay!!!)
[2025-07-17 20:07:32] DEBUG (little endian): hash <= target
Hash: 3FC00FD320605941571915FC723A4CBB2C9E798552E309C980F333CA76B40000x0
Target: 0000000000000000000000000000000000000000000000000000000098990100x0
[2025-07-17 20:07:32] thread 1: 20346 hashes, 1.101 KH/s
[2025-07-17 20:07:32] accepted: 17/17 (100.000%), 4.401 KH/s (yay!!!)
[2025-07-17 20:07:34] thread 3: 66030 hashes, 1.100 KH/s
[2025-07-17 20:07:36] DEBUG (little endian): hash <= target
Hash: 787E9A8E6CF0D6293659419BA9963ED809F7D606A2AD8E879968A5B075620000x0
Target: 0000000000000000000000000000000000000000000000000000000098990100x0
[2025-07-17 20:07:36] thread 1: 4421 hashes, 1.100 KH/s
[2025-07-17 20:07:36] accepted: 18/18 (100.000%), 4.400 KH/s (yay!!!)
[2025-07-17 20:07:42] DEBUG (little endian): hash <= target
Hash: 27163C9021EAE5D4D5D458F384751990AE336F19BD3AC8D980AA97C55A940000x0
Target: 0000000000000000000000000000000000000000000000000000000098990100x0
[2025-07-17 20:07:42] thread 3: 8654 hashes, 1.101 KH/s
[2025-07-17 20:07:42] accepted: 19/19 (100.000%), 4.400 KH/s (yay!!!)
[2025-07-17 20:08:00] thread 0: 65995 hashes, 1.100 KH/s
[2025-07-17 20:08:18] thread 2: 65977 hashes, 1.100 KH/s
[2025-07-17 20:08:36] thread 1: 65992 hashes, 1.100 KH/s
[2025-07-17 20:08:37] DEBUG (little endian): hash <= target
Hash: 037AD1DB8D7C7FAB0E022F8541BA34D67A7E9B3695B396FAF84C6D47AD950000x0
Target: 0000000000000000000000000000000000000000000000000000000098990100x0
[2025-07-17 20:08:37] thread 0: 40427 hashes, 1.100 KH/s
[2025-07-17 20:08:37] accepted: 20/20 (100.000%), 4.401 KH/s (yay!!!)
[2025-07-17 20:08:39] DEBUG (little endian): hash <= target
Hash: 5FBB0159477A05E1324B6A3D240B89DAEDF34A83A8AA78FE155FA86DE3140100x0
Target: 0000000000000000000000000000000000000000000000000000000098990100x0
[2025-07-17 20:08:39] thread 3: 62568 hashes, 1.101 KH/s
[2025-07-17 20:08:39] accepted: 21/21 (100.000%), 4.401 KH/s (yay!!!)

A whopping 1.1 kH/s on each core (about 4.4 kH/s in total). Well, not very impressive, but it mines. I think this is without Neon SIMD.

Stop monitoring using [ctrl]-[c]
Time        CPU    load %cpu %sys %usr %nice %io %irq   Tcpu  C.St.
20:22:02  1416 MHz  3.90 100%   0%   0%  99%   0%   0%  53.2 °C  0/7
20:22:07  1416 MHz  3.91 100%   0%   0%  99%   0%   0%  53.2 °C  0/7
20:22:12  1416 MHz  3.92 100%   0%   0%  99%   0%   0%  53.4 °C  0/7

^ This is with the fan on.

Optimise it a little in the Makefile:

#CFLAGS = -g -O2
CFLAGS =
minerd_CPPFLAGS = -O3 -mcpu=cortex-a53 -march=armv8-a -ftree-vectorize -funsafe-math-optimizations

Spoiler:

[2025-07-17 20:41:42] DEBUG (little endian): hash <= target
Hash: 02D285CA9C499E30195BD3EC4F4DF544D03EDDC7A00DFC0D255EB1E7BA160000x0
Target: 0000000000000000000000000000000000000000000000000000008099190000x0
[2025-07-17 20:41:42] thread 1: 19364 hashes, 1.127 KH/s
[2025-07-17 20:41:42] accepted: 1/1 (100.000%), 1.127 KH/s (yay!!!)
[2025-07-17 20:41:54] thread 3: 32766 hashes, 1.128 KH/s
[2025-07-17 20:41:54] thread 0: 32766 hashes, 1.127 KH/s
[2025-07-17 20:41:54] thread 2: 32766 hashes, 1.124 KH/s
[2025-07-17 20:42:42] thread 1: 67603 hashes, 1.127 KH/s
[2025-07-17 20:42:53] thread 3: 67650 hashes, 1.128 KH/s
[2025-07-17 20:42:54] thread 0: 67639 hashes, 1.127 KH/s
[2025-07-17 20:42:54] thread 2: 67439 hashes, 1.124 KH/s
[2025-07-17 20:43:42] thread 1: 67627 hashes, 1.128 KH/s

Well, that is just a very minor improvement of about 0.027 kH/s per core (from about 1.100 to about 1.127 kH/s). Perhaps it already uses Neon SIMD, or perhaps it needs hand optimization, and that is hard.
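To put the change in perspective, the per-core rates from the two logs can be turned into a percentage. This is a small sketch using 1.100 KH/s (before) and 1.127 KH/s (after) as representative per-thread values read off the logs above; pct_gain is a helper name made up for this example.

```shell
#!/bin/sh
# Percentage change from an old rate ($1) to a new rate ($2), one decimal place.
# pct_gain is a hypothetical helper name used only in this sketch.
pct_gain() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", (b - a) / a * 100 }'
}

pct_gain 1.100 1.127    # prints 2.5
```

A gain of about 2.5% is small compared with the roughly 5x difference between the two LINPACK builds, which suggests the compiler flags alone change little for this workload.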
Nick A Posted 2 hours ago

If you are having trouble with hardware decoding, you need to disable the compositor. Follow jock's setup instructions in the link below.

Use this command to restart xfwm4 with the compositor disabled:

killall xfwm4 && xfwm4 --compositor=off &

Edited 40 minutes ago by Nick A