schunckt Posted Sunday at 11:31 AM

Hi all! On a NanoPi Duo2 I'm trying to use the built-in hardware video processor. ffmpeg already accepts -hwaccel v4l2request but throws errors:

Press [q] to stop, [?] for help
[h264 @ 0x11645f0] Using V4L2 media driver cedrus (6.12.35) for S264
[V4L2RequestContext @ 0xae5f0db0] Failed to create buffer of type 1: Cannot allocate memory (12)
[h264 @ 0x11645f0] Failed setup for format drm_prime: hwaccel initialisation returned error.

dmesg:

[ 8906.864389] cma: __cma_alloc: reserved: alloc failed, req-size: 3038 pages, ret: -12
[ 8906.872255] cma: number of available pages: 42@86+128@384+34@3550+34@6622+34@9694+34@12766+34@15838+34@18910+34@21982+1570@25054=> 1978 free of 26624 total pages
[ 8906.886783] cedrus 1c0e000.video-codec: dma alloc of size 12443648 failed
root@nanopiduo2:~#

I already played around with the cma value in extraargs in armbianEnv.txt, but without success. I found a link describing DMA limitations of the VPU in the device tree bindings: https://git.sec.in.tum.de/croemheld/linux/-/blob/v5.1-rc5/Documentation/devicetree/bindings/media/cedrus.txt

Quote:
Device-tree bindings for the VPU found in Allwinner SoCs, referred to as the Video Engine (VE) in Allwinner literature. The VPU can only access the first 256 MiB of DRAM, that are DMA-mapped starting from the DRAM base. This requires specific memory allocation and handling.

I already decompiled the DT and verified that there is no such "reserved-memory" section. Is this the root cause? Maybe someone can provide some hints or ideas confirming that I'm on the right track? If so, I'd give adjusting the DT a try.

T.
laibsch Posted Sunday at 12:27 PM

Are you sure this is not a genuine out-of-memory situation?
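The numbers in the dmesg output of the first post can be checked with a little arithmetic. The following is a minimal sketch, assuming 4 KiB pages and that each `count@offset` entry in the cma line is one contiguous run of free pages; the function names are made up for illustration.

```python
import re

PAGE_SIZE = 4096  # assumption: 4 KiB pages

# The cma line copied from the dmesg output in the first post.
CMA_LINE = (
    "cma: number of available pages: "
    "42@86+128@384+34@3550+34@6622+34@9694+34@12766+34@15838"
    "+34@18910+34@21982+1570@25054=> 1978 free of 26624 total pages"
)


def free_runs(line):
    """Return the length in pages of every free run listed as count@offset."""
    return [int(count) for count, _offset in re.findall(r"(\d+)@(\d+)", line)]


def summarize(line, requested_pages):
    """Compare a CMA request against the free runs reported by the kernel."""
    runs = free_runs(line)
    total_pages = int(re.search(r"of (\d+) total pages", line).group(1))
    return {
        "requested_bytes": requested_pages * PAGE_SIZE,
        "total_mib": total_pages * PAGE_SIZE / 2**20,
        "free_pages": sum(runs),
        "largest_run": max(runs),
        # A contiguous allocation needs a single run that is large enough.
        "fits": max(runs) >= requested_pages,
    }


if __name__ == "__main__":
    for key, value in summarize(CMA_LINE, requested_pages=3038).items():
        print(f"{key}: {value}")
```

Under these assumptions the 3038 requested pages correspond exactly to the 12443648 bytes that cedrus reports, while the pool holds only 1978 free pages and its largest contiguous run is 1570 pages, so the request cannot be satisfied from the CMA pool as it stood at that moment.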
going Posted Sunday at 02:34 PM (edited)

3 hours ago, schunckt said:
I found a link talking about VPU device tree dma limitations https://git.sec.in.tum.de/croemheld/linux/-/blob/v5.1-rc5/Documentation/devicetree/bindings/media/cedrus.txt

v5.1-rc5

3 hours ago, schunckt said:
[h264 @ 0x11645f0] Using V4L2 media driver cedrus (6.12.35)

v6.12.35

Please use the current documentation for the CURRENT kernel:
sun4i-a10-video-engine.yaml
sun8i-h3-deinterlace.yaml
Documentation/arch/arm/sunxi.rst
arch/arm/boot/dts/allwinner/sun8i-h3-nanopi-duo2.dts
arch/arm/boot/dts/allwinner/sun8i-h3.dtsi

P.S. Please read this: repository-for-v4l2request-hardware-video-decoding-rockchip-allwinner

Edited Sunday at 02:43 PM by going (added P.S.)
schunckt (author) Posted yesterday at 06:45 AM

Great, thanks for the links! Meanwhile it works partially. The DT tweaking was not needed. My mistake was in how I specified the Armbian extraargs: I had added them as a second line in armbianEnv.txt, but all arguments must be on a single line. I had to increase the value to cma=256M (yes, really, I tested all lower values). Then it works, BUT ...

Fun fact, which I will investigate further: decoding with "-hwaccel drm" results in a lower frame rate (about 6 to 8 fps), whereas the software decoder reaches up to 10 fps 😀

I verified with htop that hardware decoding is actually in use. CPU-only decoding loads all 4 cores at 100%. With hwaccel, one core sits at about 20%, which is likely the YUV-to-RGB conversion and scaling.

Maybe this is still an issue caused by the DT, at least judging from https://gregdavill.com/posts/allwinner-s3-videoencoders/
It specifies memory-region = <&cma_pool>; which is missing in my decompiled DT, as is the reserved-memory section it references. Maybe that's not needed if it's coded inside the driver or specified elsewhere.

T.
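The point about keeping all arguments on one line can be illustrated with a small helper. This is only a sketch, assuming the mistake was a second `extraargs=` line in armbianEnv.txt; the function name and the sample file content (including `verbosity=1` and `net.ifnames=0`) are illustrative and not taken from the thread.

```python
def merge_extraargs(env_text):
    """Collapse repeated extraargs= lines into one line, preserving order."""
    prefix = "extraargs="
    args = []
    other_lines = []
    for line in env_text.splitlines():
        if line.startswith(prefix):
            # Collect the individual kernel arguments from this line.
            args.extend(line[len(prefix):].split())
        else:
            other_lines.append(line)
    if args:
        other_lines.append(prefix + " ".join(args))
    return "\n".join(other_lines) + "\n"


# Hypothetical file content with extraargs split across two lines.
SAMPLE = "verbosity=1\nextraargs=net.ifnames=0\nextraargs=cma=256M\n"

if __name__ == "__main__":
    print(merge_extraargs(SAMPLE), end="")
```

For the sample above, the two lines end up as a single `extraargs=net.ifnames=0 cma=256M` entry. Back up the original file before writing a merged version to /boot.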
robertoj Posted yesterday at 05:27 PM

On 8/10/2025 at 7:34 AM, going said:
v6.12.35 Please use the current documentation for the CURRENT kernel.

If it doesn't work, compile your own Armbian with EDGE Linux (that is what worked for me).

Stay away from Trixie at this time (its mpv doesn't work as well as Bookworm's).
going Posted yesterday at 05:33 PM

4 minutes ago, robertoj said:
compile your own Armbian with EDGE linux

Which OS works well for you?
robertoj Posted yesterday at 05:36 PM

1 minute ago, going said:
Which OS works well for you?

Self-compiled Armbian Bookworm + XFCE with Linux Edge 6.15.x.
Then follow all the instructions in https://forum.armbian.com/topic/32449-repository-for-v4l2request-hardware-video-decoding-rockchip-allwinner/#findComment-176981
Then add extraargs=cma=256M to armbianEnv.txt
Ryzer Posted yesterday at 07:41 PM

Some fixes are kernel specific. If I understand correctly, the "memory-region" is only necessary when using the legacy cedar driver with a more recent kernel. It is supported up to kernel 6.1.

You can confirm the CMA allocation by running "sudo dmesg | grep -i cma" (the kernel prints these messages with a lowercase "cma:" prefix, as in the log above) or by running "cat /proc/meminfo | grep Cma".

That's interesting, although cedrus only acts as the video decoding engine, while the display engine is responsible for the actual rendering.
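The /proc/meminfo check can also be scripted. A minimal sketch, assuming the usual `CmaTotal` and `CmaFree` fields reported in kB; the function name and the sample numbers are made up for illustration.

```python
def cma_from_meminfo(text):
    """Return (CmaTotal, CmaFree) in kB; a missing field is returned as None."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("CmaTotal", "CmaFree"):
            # Lines look like "CmaTotal:  262144 kB"; keep the number only.
            values[key] = int(rest.split()[0])
    return values.get("CmaTotal"), values.get("CmaFree")


# Illustrative sample, not real output from the board in this thread.
SAMPLE = """MemTotal:         505000 kB
CmaTotal:         262144 kB
CmaFree:          250000 kB
"""

if __name__ == "__main__":
    total_kb, free_kb = cma_from_meminfo(SAMPLE)
    print(f"CMA total: {total_kb / 1024:.0f} MiB, free: {free_kb / 1024:.0f} MiB")
```

On the board itself, pass the content of /proc/meminfo instead of `SAMPLE`, for example `cma_from_meminfo(open("/proc/meminfo").read())`. A `CmaTotal` of 262144 kB would correspond to cma=256M.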
schunckt (author) Posted 15 hours ago

@robertoj right, that worked for me as well:

Quote:
Then add extraargs=cma=256M to armbianEnv.txt

But the remaining issue is the slowness. Meanwhile I also tested ffmpeg without scaling and without RGB conversion, writing the output to /dev/null, but it is still slow. Maybe there is some pre- or postprocessing going on which could be tuned further. If I remember right, there are some v4l2* features which may affect the processing, but I'm not sure whether those were related to camera capture.

Another idea: it seems the VPU clock source is configurable (inside the DT). Maybe that's not quite right?

T.
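A decode-only comparison like the one described above could be scripted roughly as follows. This is a sketch, not the exact command used in the thread: it assumes an ffmpeg build that accepts the given `-hwaccel` name ("drm" and "v4l2request" are the values mentioned in this topic), and `sample.mp4` and the function names are placeholders.

```python
import subprocess
import time


def decode_cmd(input_path, hwaccel=None):
    """Build an ffmpeg command that decodes video and discards the output."""
    cmd = ["ffmpeg", "-hide_banner", "-benchmark"]
    if hwaccel:
        cmd += ["-hwaccel", hwaccel]
    # -an drops audio; "-f null -" discards the decoded frames.
    cmd += ["-i", input_path, "-an", "-f", "null", "-"]
    return cmd


def time_decode(input_path, hwaccel=None):
    """Run the decode once and return the wall-clock time in seconds."""
    start = time.monotonic()
    subprocess.run(
        decode_cmd(input_path, hwaccel),
        check=True,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return time.monotonic() - start


# Example use on the board (requires ffmpeg and a real input file):
#   sw = time_decode("sample.mp4")
#   hw = time_decode("sample.mp4", hwaccel="drm")
#   print(f"software: {sw:.1f} s, hwaccel: {hw:.1f} s")
```

Running both variants on the same file gives a wall-clock comparison that is independent of display, scaling and RGB conversion.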
robertoj Posted 13 hours ago (edited)

Did you compile your Armbian OS with Linux Edge 6.15.x, Bookworm, XFCE?

https://forum.armbian.com/topic/32449-repository-for-v4l2request-hardware-video-decoding-rockchip-allwinner/#findComment-216587

Edited 13 hours ago by robertoj
schunckt (author) Posted 8 hours ago

No, I did not compile this time. I used the downloaded image (I need to double-check exactly which one). Before going down that path (I have to update my build environment first 😀) I'd like to get a better understanding of the root cause of the slowness.

I think I'll play around with mpv instead of ffmpeg next. I'd prefer ffmpeg for other reasons, but testing mpv is worth spending some time on.

T.
robertoj Posted 40 minutes ago

My main theory is that Linux 6.12 doesn't have the v4l2 improvements needed for hardware acceleration, which you only get starting with Linux 6.13. The link I posted explains that.