divis1969 Posted April 18, 2017
It looks like there is currently no hardware-accelerated video decoding/encoding on the Banana Pi M1 (on the server build). Is there any chance to enable it?
Igor Posted April 19, 2017
It's enabled only on desktop builds with the legacy kernel. Starting with a fresh image is recommended, since the upgrade scenario is not completely tested and polished.
divis1969 Posted April 19, 2017 Author
Actually I do not need a desktop. It looks like this acceleration somehow depends on X11. Does that mean it can only be used for HW-accelerated graphics (i.e. rendering on screen)? I need this functionality for a surveillance system (e.g. kerberos.io) and do not need to draw anything on screen. Is it possible to install just some X11 packages to use the acceleration?
rellla Posted April 19, 2017
Hardware acceleration works with the legacy kernel via libvdpau-sunxi, and this depends on X11. You do not need a complete desktop/window manager environment, just X. There are other ways to use the cedrus code directly, without the VDPAU API, to decode/encode something. Code snippets should be lying around somewhere on the net, but I don't really know more about them.
Regards rellla
divis1969 Posted April 19, 2017 Author
Is the exact list of X packages needed for vdpau already known?
rellla Posted April 19, 2017
The dependencies from https://packages.debian.org/de/jessie/libvdpau-dev and https://packages.debian.org/de/jessie/libvdpau1, plus the xorg package itself, should help.
Regards rellla
divis1969 Posted April 19, 2017 Author
I've tried to install some of the packages (xorg, libvdpau-sunxi1, etc.) but was unable to make vdpauinfo work. Here is the log:
$ vdpauinfo
debug1: client_input_channel_open: ctype x11 rchan 3 win 65536 max 16384
debug1: client_request_x11: request from ::1 42042
debug1: channel 1: new [x11]
debug1: confirm x11
display: localhost:10.0 screen: 0
debug1: client_input_channel_open: ctype x11 rchan 4 win 65536 max 16384
debug1: client_request_x11: request from ::1 42043
debug1: channel 2: new [x11]
debug1: confirm x11
debug1: channel 1: FORCE input drain
debug1: channel 2: FORCE input drain
Error creating VDPAU device: 25
debug1: channel 1: free: x11, nchannels 3
debug1: channel 2: free: x11, nchannels 2
Note that I was able to run xterm. I'm running Ubuntu server, kernel 3.4.113. Did I miss something?
divis1969 Posted April 19, 2017 Author
It seems I need to enable cedar_dev at least. I've found the following in the kernel log:
[ 1.915496] [cedar dev]: not installed! ve_mem_reserve=0
The questions are:
1. What should I do to enable it?
2. How do I rebuild the kernel and replace it on the running machine or SD card?
I'm using an image I built myself. It was configured as Ubuntu server, kernel 3.4.
rellla Posted April 20, 2017
The kernel module is not loaded because no one assigns memory to the device: the boot parameter sunxi_ve_mem_reserve is set to 0 and the kernel does not support CMA.
1. Either build the kernel with CMA enabled and set to e.g. 128 MB, or enable sunxi_ve_mem_reserve in boot.cmd: https://github.com/igorpecovnik/lib/blob/master/config/bootscripts/boot-sunxi.cmd#L40 I don't know how you can easily set this with the build tools.
2. If you build your image yourself, you can add userpatches and/or do a kernel config during the build process, where you can enable CMA.
Regards rellla
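To check which value the running kernel actually received, one can look for the parameter on the kernel command line. The following is only a minimal sketch: the helper name `ve_mem_reserve` is hypothetical, and it assumes the value is passed as a plain integer (`sunxi_ve_mem_reserve=N`), as the `ve_mem_reserve=0` kernel log line in this thread suggests.

```python
from typing import Optional


def ve_mem_reserve(cmdline: str) -> Optional[int]:
    """Return the sunxi_ve_mem_reserve value found in a kernel command line.

    Returns None if the parameter is absent or not a plain integer
    (assumption: the value is given as an integer).
    """
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == "sunxi_ve_mem_reserve" and sep:
            try:
                return int(value)
            except ValueError:
                return None
    return None


if __name__ == "__main__":
    # Sample string for illustration; on the board one would pass the
    # contents of /proc/cmdline instead.
    sample = "console=ttyS0,115200 sunxi_ve_mem_reserve=0 root=/dev/mmcblk0p1"
    print(ve_mem_reserve(sample))
```

A result of 0 or None together with a kernel built without CMA would match the "[cedar dev]: not installed!" message quoted earlier.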
divis1969 Posted April 20, 2017 Author
Yeah, I'll try. BTW, I've tried to build the desktop image to figure out how this HW acceleration is enabled, but did not find it. At least the kernel config seems to have CMA disabled. Is there any recipe for replacing the kernel on the system? I do not want to do a fresh install...
zador.blood.stained Posted April 20, 2017
On fresh enough desktop images, disp_mem_reserves=on is added to /boot/armbianEnv.txt
rellla Posted April 20, 2017
1 hour ago, divis1969 said:
Is there any recipe for replacing the kernel on the system? I do not want to do a fresh install...
Set https://github.com/igorpecovnik/lib/blob/master/compile.sh#L16 and the next line to yes, and your build system will give you deb packages that you can easily install on your system via dpkg -i *.deb. You will also be able to change the kernel config then.
Regards rellla
divis1969 Posted April 21, 2017 Author
Thanks everyone! I modified 'disp_mem_reserves' and now I can run vdpauinfo. Unfortunately, there are some issues with vdpau usage.
1. vdpauinfo only works if I use 'ssh -X' to connect to the Banana Pi. Otherwise it fails because it cannot connect to the X server. I suspect I will not be able to use vdpau from a process running as a system service. What can I do to fix this?
2. A test with ffmpeg that adds a timestamp string over the RTP stream from the camera and writes it to a file is failing. ffmpeg starts writing to the file but gets stuck at some point. After terminating it with Ctrl-C (which is not so easy, actually), the file does not look like a valid mp4 (and is too short, only a few KB).
Is there any test case (maybe with an mp4 file as input) which I can use to check vdpau functionality and performance? The RTP stream is 25 FPS, so perhaps I need to reduce the frame rate along with adding the timestamp to make it work... Is there any way to enable/collect some logging to debug the issue with ffmpeg?
RagnerBG Posted April 21, 2017
12 hours ago, divis1969 said:
1. vdpauinfo only works if I use 'ssh -X' to connect bananapi. Otherwise it fails because it cannot connect X server. I'm suspecting I will not be able to use vdpau from the process running as a system service. What can I do to fix this?
For ssh I am using export DISPLAY=:0. For ffmpeg, I think you have to compile it with vdpau support. Or check these threads:
FFmpeg with Cedrus H264 HW Encoder (H3 - CMOS Camera)
ffmpeg H264 encoding with cedrus
divis1969 Posted April 23, 2017 Author
Thanks for the links. I've built ffmpeg from https://github.com/stulluk/FFmpeg-Cedrus and performed a few tests, re-encoding the file http://samplemedia.linaro.org/H264/big_buck_bunny_720p_H264_AAC_25fps_3400K.MP4 to reduce the frame rate to 5 FPS.
1. Encoding with cedrus works pretty well. ffmpeg consumes around 120% CPU (my estimate of the average; I just ran top at the same time). The test took: real 1m10.127s, user 1m8.180s, sys 0m5.530s. ffmpeg showed ~5.7-6.2 FPS while encoding. Command line:
./ffmpeg -i big_buck_bunny_720p_H264_AAC_25fps_3400K.MP4 -pix_fmt nv12 -r 5 -an -b:v 64k -c:v cedrus264 stream.mp4
2. Decoding with vdpau is horrible. CPU (average) 90%, time real 9m44.446s, user 9m12.470s, sys 0m14.860s. ffmpeg shows 0.8 FPS while processing.
./ffmpeg -hwaccel vdpau -i big_buck_bunny_720p_H264_AAC_25fps_3400K.MP4 -r 5 -an stream.mp4
In both of these cases the video is viewable.
3. I've tried to use both vdpau and the cedrus codec. CPU is only 10%! BUT: time real 7m6.911s, user 0m28.900s, sys 0m13.050s, processing FPS is 0.9. And the video is completely unviewable... ffmpeg logged at the end: Output file is empty, nothing was encoded (check -ss / -t / -frames parameters if used). The file length is 419 bytes.
./ffmpeg -hwaccel vdpau -i big_buck_bunny_720p_H264_AAC_25fps_3400K.MP4 -pix_fmt nv12 -r 5 -an -b:v 64k -c:v cedrus264 stream.mp4
What might be the issue with vdpau? Is it possible to use both vdpau (to decode) and the cedrus encoder simultaneously?
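The `time` figures from the three tests above can be compared a little more systematically. This is only a rough sketch: the helper names are hypothetical, the `XmY.YYYs` format is assumed to be what the shell's `time` prints, and (user + sys) / real is just an approximate average CPU load, not a replacement for watching top.

```python
import re


def parse_duration(text: str) -> float:
    """Convert a shell `time` value such as '1m10.127s' to seconds."""
    match = re.fullmatch(r"(?:(\d+)m)?(\d+(?:\.\d+)?)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    minutes = int(match.group(1) or 0)
    return minutes * 60 + float(match.group(2))


def cpu_load(real: str, user: str, sys: str) -> float:
    """Approximate average CPU load as (user + sys) / real."""
    return (parse_duration(user) + parse_duration(sys)) / parse_duration(real)


if __name__ == "__main__":
    # Timings copied from the three tests in the post above.
    tests = {
        "cedrus encode only": ("1m10.127s", "1m8.180s", "0m5.530s"),
        "vdpau decode only": ("9m44.446s", "9m12.470s", "0m14.860s"),
        "vdpau + cedrus": ("7m6.911s", "0m28.900s", "0m13.050s"),
    }
    for name, (real, user, sys) in tests.items():
        print(f"{name}: {parse_duration(real):.1f} s wall, "
              f"load ~{cpu_load(real, user, sys):.0%}")
```

A low load combined with a long wall-clock time, as in test 3, suggests the process spends most of its time waiting rather than computing.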
RagnerBG Posted April 24, 2017
21 hours ago, divis1969 said:
Thanks for the links. I've built the ffmpeg from https://github.com/stulluk/FFmpeg-Cedrus and performed few tests to re-encode the mpeg file http://samplemedia.linaro.org/H264/big_buck_bunny_720p_H264_AAC_25fps_3400K.MP4 to reduce the frame rate to 5 FPS
Is it possible to use both vdpau (to decode) and cedrus encoder simultaneously?
You were able to build this source; I remember I failed when I tried back then. Maybe it has been fixed in the meantime and I have to try again. I am not that qualified to answer your questions, but as for vdpau, I don't think it's about encoding, but rather about decoding. Cedrus is for encoding. I am not sure about the reason, but I don't think it will be possible to use sunxi-vdpau for decoding directly through ffmpeg. I write this because I remember I tried and had 100% CPU usage, meaning no h/w acceleration. But you can use vdpau for decoding through some players like mpv. As for simultaneous use of encoding and decoding, this is something I would like to know too :).
rellla Posted April 25, 2017
@RagnerBG: Totally right. VDPAU is an API for decoding and presentation. It's used by several players. Encoding is not included in this API. "Cedrus" is the project name of the reverse engineering effort in general. libvdpau-sunxi is based on cedrus code. ffmpeg-cedrus is based on cedrus code. As far as I can see in the readme, ffmpeg-cedrus only does hardware-accelerated encoding. I have no clue how the libvdpau-sunxi backend works together with ffmpeg when decoding with the -hwaccel vdpau option, but IMHO it should work with some adaptations. You may check https://github.com/linux-sunxi/libvdpau-sunxi/issues/55 and provide some more logs to find your issue...
rellla
divis1969 Posted April 25, 2017 Author
It looks like libvdpau-sunxi and ffmpeg-cedrus actually compete for the VE. The first one uses libcedrus to access the VE, the second uses code compiled in directly. Most likely each one affects the other. So some rework is needed to allow these pieces of code to co-exist. I'm not sure whether the VE (cedrus) kernel driver allows two clients or not. BTW, I'm still not sure whether vdpau actually improves the decoding (see test #2). I suppose encoding back to mpeg hides the effect. Is there a test I can use to verify it (e.g. to just drop the decoded frames)?
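One way to drop the decoded frames, so that the encoder does not hide the decoder's cost, is ffmpeg's null muxer (`-f null -`). The sketch below only builds the two command lines to compare; the function name `decode_only_cmd` and the default `./ffmpeg` path are assumptions for illustration, and whether `-hwaccel vdpau` really engages the hardware with libvdpau-sunxi is exactly what such a test would have to show.

```python
from typing import List, Optional


def decode_only_cmd(input_path: str,
                    hwaccel: Optional[str] = None,
                    ffmpeg: str = "./ffmpeg") -> List[str]:
    """Build an ffmpeg command that decodes the video and discards the frames."""
    cmd = [ffmpeg]
    if hwaccel is not None:
        # -hwaccel is an input option, so it must precede -i.
        cmd += ["-hwaccel", hwaccel]
    # -an drops audio; "-f null -" sends the decoded output to the null muxer.
    cmd += ["-i", input_path, "-an", "-f", "null", "-"]
    return cmd


if __name__ == "__main__":
    clip = "big_buck_bunny_720p_H264_AAC_25fps_3400K.MP4"
    print(" ".join(decode_only_cmd(clip)))            # software decode
    print(" ".join(decode_only_cmd(clip, "vdpau")))   # vdpau decode
```

Running each printed command under `time` and comparing the results would give a decode-only baseline for software versus vdpau.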
rellla Posted April 25, 2017
1 hour ago, rellla said:
You may check https://github.com/linux-sunxi/libvdpau-sunxi/issues/55 and give some more log to find your issue...
divis1969 Posted April 29, 2017 Author
I have reworked ffmpeg's cedrus264 encoder to use libcedrus and modified libcedrus to allow several clients to use the VE (in the same process). Now it is possible to use both the vdpau-sunxi decoder and the cedrus264 encoder to transcode the video. The results for
./ffmpeg -hwaccel vdpau -i big_buck_bunny_720p_H264_AAC_25fps_3400K.MP4 -pix_fmt nv12 -r 5 -an -b:v 64k -c:v cedrus264 stream.mp4
are the following: the video is viewable, CPU usage is ~80-90%, FPS while encoding ~6.7, time real 0m56.523s, user 0m28.210s, sys 0m15.720s.
It is not as good as I was expecting, though. Perhaps copying data in memory is a bottleneck. I'm not sure whether it can be improved.
The code is located at https://github.com/divis1969/libcedrus (branch master) and https://github.com/divis1969/FFmpeg (branch 2.8-cedrus)
divis1969 Posted June 1, 2017 Author
I've tried to do some profiling and debugging and found that ffmpeg performs conversions from the yuv420p pixel format to the nv12 format. These conversions consume a lot of time because they are performed in software. Note that the cedrus_vdpau decoder (vdpau-sunxi) does one more conversion, from the internal decoder format to yuv420p (on my A20, which uses VE engine version 1623; on newer chipsets with version 1680 it seems it can decode directly into yuv420p).
I've found that ffmpeg_vdpau.c always selects yuv420p as the decoder output format (see https://github.com/divis1969/FFmpeg/blob/2.8-cedrus/ffmpeg_vdpau.c#L191, https://github.com/divis1969/FFmpeg/blob/2.8-cedrus/ffmpeg_vdpau.c#L265 and the code of sunxi-vdpau). I have not yet found a way to determine, at this point in the code, which pixel format the user specified (-pix_fmt), which would make this configurable. So I just swapped lines 192 and 193 (to select nv12 first) and recompiled ffmpeg.
This increased the encoding FPS (~2 times, up to 14 FPS): CPU usage is ~75-85%, FPS while encoding ~14, time real 0m28.137s, user 0m15.670s, sys 0m8.830s
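To illustrate why this conversion costs CPU time: yuv420p stores the U and V chroma planes one after the other, while nv12 stores them as a single plane of interleaved U/V byte pairs, so every chroma byte has to be moved. The sketch below is not ffmpeg's implementation; the function name is hypothetical, and it assumes even width and height and tightly packed planes with no row padding.

```python
def yuv420p_to_nv12(frame: bytes, width: int, height: int) -> bytes:
    """Repack one tightly packed yuv420p frame as nv12.

    Assumptions: width and height are even, and there is no row padding.
    """
    y_size = width * height
    c_size = (width // 2) * (height // 2)
    if len(frame) != y_size + 2 * c_size:
        raise ValueError("frame size does not match width x height")

    y_plane = frame[:y_size]
    u_plane = frame[y_size:y_size + c_size]
    v_plane = frame[y_size + c_size:]

    # nv12 keeps the Y plane unchanged and interleaves chroma as U0 V0 U1 V1 ...
    uv_plane = bytearray(2 * c_size)
    uv_plane[0::2] = u_plane
    uv_plane[1::2] = v_plane
    return y_plane + bytes(uv_plane)


if __name__ == "__main__":
    # Tiny 4x2 frame: 8 luma bytes, then 2 U bytes, then 2 V bytes.
    frame = bytes(range(8)) + bytes([100, 101]) + bytes([200, 201])
    print(list(yuv420p_to_nv12(frame, 4, 2)))
```

At 1280x720 this touches 460,800 chroma bytes per frame in software, which fits the observation that letting the decoder output nv12 directly avoids a costly step.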