schunckt Posted August 10

Hi all!

On a NanoPi Duo2 I'm trying to use the built-in video hardware processor. ffmpeg already runs with -hwaccel v4l2request but throws errors:

    Press [q] to stop, [?] for help
    [h264 @ 0x11645f0] Using V4L2 media driver cedrus (6.12.35) for S264
    [V4L2RequestContext @ 0xae5f0db0] Failed to create buffer of type 1: Cannot allocate memory (12)
    [h264 @ 0x11645f0] Failed setup for format drm_prime: hwaccel initialisation returned error.

dmesg:

    [ 8906.864389] cma: __cma_alloc: reserved: alloc failed, req-size: 3038 pages, ret: -12
    [ 8906.872255] cma: number of available pages: 42@86+128@384+34@3550+34@6622+34@9694+34@12766+34@15838+34@18910+34@21982+1570@25054=> 1978 free of 26624 total pages
    [ 8906.886783] cedrus 1c0e000.video-codec: dma alloc of size 12443648 failed
    root@nanopiduo2:~#

I already played around with armbianEnv.txt, tweaking the cma value in extraargs, but without success.

I found a link talking about VPU device tree DMA limitations:
https://git.sec.in.tum.de/croemheld/linux/-/blob/v5.1-rc5/Documentation/devicetree/bindings/media/cedrus.txt

Quote:
    Device-tree bindings for the VPU found in Allwinner SoCs, referred to as the Video Engine (VE) in Allwinner literature. The VPU can only access the first 256 MiB of DRAM, that are DMA-mapped starting from the DRAM base. This requires specific memory allocation and handling.

I already decompiled the DT and verified that there is no such "reserved-memory" section. Is this the root cause? Maybe someone can provide some hints or ideas confirming that I'm on the right track? If yes, I'd give it a try and adjust the DT.

T.
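The dmesg lines above already say a lot about why the allocation failed. The following is only a sketch for reading them, assuming 4 KiB pages and assuming each "N@offset" entry is a run of N contiguous free pages (the entries do add up to the reported 1978 free pages). The function name summarize_cma_failure is made up for this illustration.

```python
import re

PAGE_SIZE = 4096  # assumption: 4 KiB pages

# The two cma lines from the dmesg output in the first post (timestamps removed).
REQ_LINE = "cma: __cma_alloc: reserved: alloc failed, req-size: 3038 pages, ret: -12"
AVAIL_LINE = (
    "cma: number of available pages: 42@86+128@384+34@3550+34@6622+34@9694+"
    "34@12766+34@15838+34@18910+34@21982+1570@25054=> 1978 free of 26624 total pages"
)


def summarize_cma_failure(req_line, avail_line, page_size=PAGE_SIZE):
    """Compare the requested CMA size with the free runs reported by the kernel."""
    requested = int(re.search(r"req-size:\s*(\d+)\s*pages", req_line).group(1))
    # Each "N@offset" entry is read as N contiguous free pages (assumption).
    runs = [int(count) for count in re.findall(r"(\d+)@\d+", avail_line)]
    free, total = (
        int(v) for v in re.search(r"(\d+) free of (\d+) total", avail_line).groups()
    )
    return {
        "requested_pages": requested,
        "requested_bytes": requested * page_size,
        "largest_free_run": max(runs),
        "free_pages": free,
        "total_mib": total * page_size / 2**20,
        "fits": max(runs) >= requested,
    }


summary = summarize_cma_failure(REQ_LINE, AVAIL_LINE)
print(summary)
```

Under these assumptions the request of 3038 pages is exactly the 12443648 bytes cedrus reports, the pool is 104 MiB in total, and the largest free run (1570 pages) is too small, so the pool itself is too small or too fragmented for this buffer.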
laibsch Posted August 10

Are you sure this is not a genuine out-of-memory situation?
going Posted August 10 (edited)

3 hours ago, schunckt said:
    I found a link talking about VPU device tree dma limitations https://git.sec.in.tum.de/croemheld/linux/-/blob/v5.1-rc5/Documentation/devicetree/bindings/media/cedrus.txt

v5.1-rc5

3 hours ago, schunckt said:
    [h264 @ 0x11645f0] Using V4L2 media driver cedrus (6.12.35)

v6.12.35

Please use the current documentation for the CURRENT kernel:

sun4i-a10-video-engine.yaml
sun8i-h3-deinterlace.yaml
Documentation/arch/arm/sunxi.rst
arch/arm/boot/dts/allwinner/sun8i-h3-nanopi-duo2.dts
arch/arm/boot/dts/allwinner/sun8i-h3.dtsi

P.S. Please read this: repository-for-v4l2request-hardware-video-decoding-rockchip-allwinner

Edited August 10 by going: Add P.S.
schunckt (Author) Posted August 11

Great, thanks for the links!

Meanwhile it works partially. The DT tweaking was not needed. I made a mistake when specifying the Armbian extraargs: I had added a second line to armbianEnv.txt, but realized that all args must be on one line. I had to increase the value to cma=256M (yes, really, I tested all lower values). Then it works, BUT ...

Fun fact, which I will investigate further: decoding with "-hwaccel drm" results in lower fps (about 6..8), whereas the software decoder goes up to 10 😀

Decoding has been verified with htop. CPU only => 4 cores at 100%. With hwaccel, one core is at about 20%, which is likely the YUV to RGB conversion and scaling.

Maybe this is still an issue caused by the DT, at least when reading https://gregdavill.com/posts/allwinner-s3-videoencoders/
It specifies memory-region = <&cma_pool>; which is missing in my decompiled DT, as is the referenced reserved-memory section. Maybe that's not needed if it's coded inside the driver or specified elsewhere.

T.
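The one-line rule for extraargs can be checked mechanically. This is a minimal sketch, assuming the mistake was a second extraargs= line; the helper name merge_extraargs and the sample argument "quiet" are hypothetical and not taken from the thread.

```python
def merge_extraargs(env_text):
    """Collapse several extraargs= lines into a single one, keeping other lines as-is."""
    prefix = "extraargs="
    args = []
    other_lines = []
    for line in env_text.splitlines():
        if line.startswith(prefix):
            # Collect the arguments of every extraargs= line in file order.
            args.extend(line[len(prefix):].split())
        else:
            other_lines.append(line)
    if args:
        other_lines.append(prefix + " ".join(args))
    return "\n".join(other_lines) + "\n"


# Hypothetical file content with the arguments split over two lines.
sample = "verbosity=1\nextraargs=quiet\nextraargs=cma=256M\n"
merged = merge_extraargs(sample)
print(merged)
```

The sketch only rewrites text; writing the result back to armbianEnv.txt and rebooting is left out on purpose.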
robertoj Posted August 11

On 8/10/2025 at 7:34 AM, going said:
    v6.12.35 Please use the current documentation for the CURRENT kernel.

If it doesn't work, compile your own Armbian with EDGE Linux (which is what worked for me). Stay away from Trixie at this time (its mpv doesn't work as well as Bookworm's).
going Posted August 11

4 minutes ago, robertoj said:
    compile your own Armbian with EDGE linux

Which OS works well for you?
robertoj Posted August 11

1 minute ago, going said:
    Which OS works well for you?

Self-compiled Armbian Bookworm + XFCE with Linux Edge 6.15.x

Then follow all the instructions in https://forum.armbian.com/topic/32449-repository-for-v4l2request-hardware-video-decoding-rockchip-allwinner/#findComment-176981

Then add extraargs=cma=256M to armbianEnv.txt
Ryzer Posted August 11

Some fixes are kernel specific. If I understand correctly, the "memory-region" is only necessary when using the legacy cedar driver with a more recent kernel. It is supported up to kernel 6.1.

You can confirm the CMA allocation by running "sudo dmesg | grep -i cma" or by running "cat /proc/meminfo | grep Cma".

That's interesting, although cedrus only acts as the video decoding engine, while the display engine is responsible for the actual rendering.
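The /proc/meminfo check can also be scripted. A small sketch, assuming the usual CmaTotal and CmaFree fields reported in kB; the sample text below is illustrative and not output from the board in this thread, and the function name cma_from_meminfo is made up.

```python
def cma_from_meminfo(text):
    """Return the CmaTotal and CmaFree values (in kB) found in /proc/meminfo text."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("CmaTotal", "CmaFree"):
            values[key] = int(rest.split()[0])  # first token is the size in kB
    return values


# Illustrative sample only; on the board, use open("/proc/meminfo").read() instead.
SAMPLE = "MemTotal:         503480 kB\nCmaTotal:         262144 kB\nCmaFree:          250000 kB\n"
cma = cma_from_meminfo(SAMPLE)
print(cma["CmaTotal"] // 1024, "MiB reserved for CMA")
```

With cma=256M on the kernel command line, CmaTotal would be expected to read 262144 kB; a smaller value suggests the argument was not picked up.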
schunckt (Author) Posted August 12

@robertoj right, that worked for me as well:

Quote:
    Then add extraargs=cma=256M to armbianEnv.txt

But the remaining issue is the slowness. Meanwhile I also tested ffmpeg unscaled and without RGB conversion, with output to /dev/null, but it is still slow. Maybe there is some pre/postprocessing going on which could be tuned further. If I remember right, there are some v4l2* features which may impact the processing, but I'm not sure whether this was camera capture related.

Another idea: it seems the VPU clock source is configurable (inside the DT). Maybe that's not quite right?

T.
robertoj Posted August 12 (edited)

Did you compile your Armbian OS with Linux Edge 6.15.x, Bookworm, XFCE?

https://forum.armbian.com/topic/32449-repository-for-v4l2request-hardware-video-decoding-rockchip-allwinner/#findComment-216587

Edited August 12 by robertoj
schunckt (Author) Posted August 12

No, I did not compile this time. I used the downloaded image (I need to double check exactly which one). Before trying this path (I have to update my build env first 😀) I'd like to get a better understanding of the root cause of the slowness.

I think next I'll play around with mpv instead of ffmpeg. I'd prefer ffmpeg for other reasons, but testing mpv is worth spending some time on.

T.
robertoj Posted August 12

My main theory is that Linux 6.12 doesn't have the v4l2 improvements needed for hw acceleration, which you can only get with Linux 6.13... The link I posted explains that.
schunckt (Author) Posted August 13

Thanks for your feedback. But as I wrote, it looks like hw accel works in general, judging by the much lower CPU load relative to the fps:

CPU 4x100% => ~10fps
HW 1x20% => ~6fps

That's why I think the VPU really gets used, but not optimally.

By the way, I have not yet got mpv to work with the framebuffer.

T.
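The two measurements above can be put on a common scale by dividing CPU usage by frame rate. This is only back-of-the-envelope arithmetic on the approximate figures from the post; the function name core_seconds_per_frame is made up for this illustration.

```python
def core_seconds_per_frame(busy_cores, load_fraction, fps):
    """CPU time spent per decoded frame, in core-seconds."""
    return busy_cores * load_fraction / fps


# Approximate figures from the post: 4 cores at 100% for ~10 fps (software),
# 1 core at ~20% for ~6 fps (hardware decoding).
software = core_seconds_per_frame(4, 1.00, 10)
hardware = core_seconds_per_frame(1, 0.20, 6)
print(f"software: {software:.3f} core-s/frame")
print(f"hardware: {hardware:.3f} core-s/frame")
print(f"ratio: about {software / hardware:.0f}x")
```

By this rough measure the CPU does about a twelfth of the work per frame in the hardware path, which fits the view that the VPU is in use and the limit is somewhere other than the CPU.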
Ryzer Posted August 15

Those specific patches only apply to the H61x SoCs.

Very strange that hardware decoding is apparently slower. Out of interest, could you run something like glxgears to see what the reported screen refresh rate is?

It is not impossible that it is the VPU, but I suspect the dma-buf transfers are the more likely bottleneck.

Could you provide a more detailed log by adding --log-file=test1.txt?

When working with the framebuffer, try drm-copy instead of drm.
robertoj Posted August 15

Can you check whether any of the H3 images from LibreELEC would get you video acceleration?

https://libreelec.tv/downloads/allwinner/

I once tried the Orange Pi PC image on my Orange Pi Zero LTS (H3) and it worked.
schunckt (Author) Posted Monday at 04:10 PM

@Ryzer There is no "ffmpeg -hwaccel drm-copy" option. It looks like this is mpv only, which, as I said, doesn't work with the framebuffer. glxgears can't work either because I have no OpenGL.

But in general that's a good hint. I'll create some small videos with different resolutions and compressions and measure the fps. Maybe that helps to nail down the issue.

By the way, I've not yet tested VLC. I need to figure out whether this is an ffmpeg issue or something else inside the kernel, i.e. v4l2request related ...

@robertoj Trying another image doesn't make much sense for me, for several other reasons.

T.
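The planned fps comparison across clips could be scripted roughly as follows. This is a sketch under assumptions: ffmpeg is on PATH, the hwaccel name is whatever the installed build accepts (the thread mentions both v4l2request and drm), and decoding a fixed number of frames to the null muxer is a fair proxy for decode speed. The helper names decode_cmd and measure_fps are made up.

```python
import subprocess
import time


def decode_cmd(path, hwaccel=None, frames=300):
    """Build an ffmpeg command that decodes `frames` video frames and discards them."""
    cmd = ["ffmpeg", "-hide_banner", "-loglevel", "error"]
    if hwaccel:
        # Name as used in the thread; it depends on the ffmpeg build.
        cmd += ["-hwaccel", hwaccel]
    cmd += ["-i", path, "-an", "-frames:v", str(frames), "-f", "null", "-"]
    return cmd


def measure_fps(path, hwaccel=None, frames=300):
    """Wall-clock frames per second, including ffmpeg start-up time."""
    start = time.monotonic()
    subprocess.run(decode_cmd(path, hwaccel, frames), check=True)
    return frames / (time.monotonic() - start)


# Example use on the board (the file name is a placeholder):
#   print(measure_fps("clip_720p.mp4"))                 # software decode
#   print(measure_fps("clip_720p.mp4", "v4l2request"))  # hardware decode
```

The result is only meaningful if each clip has at least `frames` frames; shorter clips end early and the computed rate is then too high.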
Ryzer Posted Monday at 06:01 PM

@schunckt Yes, I should have clarified that drm-copy is an argument for mpv which, according to the guide mentioned above, should allow the framebuffer to be accessed directly. It is worth noting that mpv makes use of ffmpeg under the hood. I tried ffplay once but did not have much luck with it.

Last I checked, VLC is not supported other than through the legacy VAAPI path. Please see: https://linux-sunxi.org/Sunxi-Cedrus

You can use the sample media from Linaro: https://samplemedia.linaro.org/

Just to check, are you using ffmpeg-v4l2-request?