
The VPU driver



Alright, the biggest issue with the ASUS Tinkerboard being "resolved", I'm starting to take care of the VPU driver, in order to use it with mainline kernels.


The hardware video encoding/decoding facilities are the real meat of the RK3288, and recent Rockchip boards.

The VPU driver aims to use these facilities in order to provide the smooth video playback experience that every Netflix/Hulu addict expects when buying such boards.


Which means that, once this part is dealt with, you might be able to enjoy your multimedia-oriented distributions and software on your Rockchip boards (MiQi, Tinkerboard and maybe Pine64 too !)


The driver is already available in Rockchip 4.4 kernels. phhusson started porting it out-of-tree on Github, and it then got the attention of wzyy2, one of the Rockchip guys, who updated it and fixed multiple bugs that were present in that version.


I'm now reassembling the code, include files and all in my rockchip-vcodec repository, patching the code to use it with the latest 4.13 kernels.


Now, the problems are : I don't get the big picture from the user side. So it's kinda hard to test this quickly.


Rockchip seems to rely on their library, MPP, and patched gstreamer packages if I understand correctly. Last time I tried the "test-suite" of the MPP library, things were "clunky". Since it's mostly maintained by one person, ayaka, I'm pretty sure that it's the kind of test-suite that only the owner clearly understands. I know that problem, as I did the same thing with some of my libraries.



So, basically, the main goals are :

  • ! Compile the VPU driver without issues. I took care of that for now. The driver compiles, loads AND unloads correctly, without issues.
  • ? Understand clearly what packages, patches and software are needed to use the VPU services provided by the VPU driver.
  • ? Enjoy 1080p movies without a hitch on RK3288 boards (and other Rockchip boards if possible) !


I'll add these goals on the Github pages so that everyone gets how things are going, from the GH page.


If you have any issue with the driver and/or its use, file it on my repository and I'll see what I can do.


Cool, I will check how this VPU driver works with a 4.13 kernel on LibreELEC later this weekend.


I would suggest testing the VPU driver and mpp using Rockchip's gstreamer plugins or possibly LongChair's mpv/ffmpeg.

I don't know much about gstreamer but it seems to be what the Rockchip developers use to test video decoding.

Both mpv/plexmediaplayer and kodi use DRM/KMS, the gbm version of libmali and a few kernel hacks on LibreELEC for smooth 1080p/4k video playback. They most likely need modifications to run on x11/wayland.


For a version of ffmpeg that uses mpp for decoding, see https://github.com/LongChair/FFmpeg/commits/rockchip or the rockchip-new branch.

LongChair also has an mpv version for both those branches of ffmpeg at https://github.com/LongChair/mpv/commits/rockchip or the rockchip-new branch.
I have an ffmpeg 3.1 backport at https://github.com/Kwiboo/FFmpeg/commits/rockchip-krypton and Kodi Krypton patches at https://github.com/Kwiboo/plex-home-theater/commits/rockchip-krypton.

The Kodi Leia patches in the rockchip-leia branch will be updated once we have achieved more stable playback on rk3328 devices using mpv and Kodi Krypton.


Glad to hear that most of the work is done through GLES via DRM/KMS, since I mainly test the Mali drivers on my kernels using this interface.


Thanks for the link to LongChair's ffmpeg, and your version. I'll try to compile and install the patched libraries, compile mpv and kodi against them and test all of this, this week-end.


Tonight I'll try to :

The last step will make it easier for anyone that already has all the right software installed, to test the VPU driver on mainline kernels.


7 minutes ago, Myy said:

I'll try to compile and install the patched libraries, compile mpv and kodi against them and test all of this, this week-end.


Unfortunately it is not possible to link kodi krypton and mpv to the same ffmpeg version. mpv requires a rather new ffmpeg version and uses the newer send_packet/receive_frame api and automatic bitstream filtering, while kodi krypton uses the older avcodec_decode_video2 api.

There were also some audio delays and missed packets when kodi krypton was linked to the newer ffmpeg version. kodi leia will require the new ffmpeg version, and the patches will be rebased and updated at a later date.


For optimal media playback there have been a few kernel changes required, see my rockchip-4.4 kernel tree at https://github.com/Kwiboo/linux-rockchip/commits/rockchip-4.4

  • Made the performance gpu governor the default; the simple_ondemand governor is too slow and limits the gpu rate to ~30fps for gui animations
  • Switched primary and overlay drm planes so we can have gui on top of video
  • Raised the vpu clock to 600 MHz for hevc decoding on rk3288
  • Allow framebuffer and videomodes not to have same size for scaling of gui from 1080p to 4k (implemented in plexmediaplayer and on my todo list for kodi)
  • Skip waiting on vblank for set plane drm calls, this was recently merged in Rockchip's release-4.4 tree and allows for rendering video and gui planes using legacy drm api without double vsync

Alright. I've provided a first build of the 4.12.0-rc2 with the VPU included.

Now, if you don't want to clone an entire repo, I made a tarball release of this build, so just grab it and install it on your machine. You still need to get the user-space software installed correctly though.


The kernel loads up and enables the VPU services, and the VPU module can be unloaded without issues.


Still, no optimization has been added for the moment, as I'm trying to get things working, before trying to get things working nicely.


So this week-end will be untangling the user-space.


I'm "re-injecting" the @Myy - modified driver into the dev tree so it builds without extra steps, I'll patch/update kernel configs/etc shortly.  My test build shows it loaded successfully, ideally that means it's ready for gstreamer/etc.  I believe the ASUS folks are using a gstreamer plugin on Chromium, might be a good place to start.


If there's a more proper way to do this tell me, I don't know all the features of the build scripts as yet.


I'll try this today. wzyy2 also linked me two patches that might be required to get the VPU working under x11. The thing is, while I think I've already implemented a good part of the first patch, the second one is... It basically shunts all the security features of DRM drivers... Yeah...


Anyway, I'll deal with these patches with the next kernel release. Today, I'm dealing with the user-space mess, focusing on the DRM/KMS part first, since that interface works well on my MiQi board.


@Myy in case it's any help, kernel 4.13 with the vcodec driver included with your adjustments.


[    6.974433] rk-vcodec ff9a0000.vpu-service: probe device
[    6.974449] rk-vcodec ff9a0000.vpu-service: vpu mmu dec eeaa4810
[    6.974694] rk-vcodec ff9a0000.vpu-service: allocator is drm
[    6.974753] rk-vcodec ff9a0000.vpu-service: checking hw id 4831
[    6.991040] rk-vcodec ff9a0000.vpu-service: init success
[    6.996089] rk-vcodec ff9c0000.hevc-service: probe device
[    6.996110] rk-vcodec ff9c0000.hevc-service: vpu mmu dec eeaa5010
[    6.996387] rk-vcodec ff9c0000.hevc-service: allocator is drm
[    6.996438] rk-vcodec ff9c0000.hevc-service: checking hw id 6867
[    7.085518] rk-vcodec ff9c0000.hevc-service: init success
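
A log like the one above can also be checked mechanically. Here is a minimal Python sketch that reports which rk-vcodec services reached "init success". It assumes the message format shown in this post stays the same; the `services_ready` helper and the `sample` text are made up for illustration and are not part of the driver or of MPP.

```python
import re

# Matches lines such as:
# [    6.991040] rk-vcodec ff9a0000.vpu-service: init success
LINE_RE = re.compile(r"rk-vcodec\s+(\S+):\s+(.*)$")


def services_ready(dmesg_text):
    """Map each rk-vcodec device name to True once 'init success' was logged."""
    status = {}
    for line in dmesg_text.splitlines():
        match = LINE_RE.search(line)
        if not match:
            continue
        device, message = match.groups()
        # A device that only logged "probe device" stays False.
        status.setdefault(device, False)
        if message.strip() == "init success":
            status[device] = True
    return status


# Shortened, hypothetical log where the hevc service never finishes probing.
sample = """\
[    6.974433] rk-vcodec ff9a0000.vpu-service: probe device
[    6.991040] rk-vcodec ff9a0000.vpu-service: init success
[    6.996089] rk-vcodec ff9c0000.hevc-service: probe device
"""
print(services_ready(sample))
```

On a real board the input would be the output of `dmesg` instead of the sample string.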



Alright, I've reassembled and rearranged, when necessary, all the patches used by wzyy2 on the mainline kernel. They compile fine but they're still untested. I'll publish them, with the rc3 release.


So far, the MPP test suite is still... "clunky". Some tests require you to provide raw video content, meaning H26x or VP8 content without the file header... A bit like when you strip the BMP header and keep the data to feed it directly to the GPU.


Testing the FFMPEG plugin using the h264_rkmpp or vp8_rkmpp decoders throws "Operation not permitted" errors at me. I don't know why though, strace didn't help pinpoint this issue.


So, once all the new patches are added and tested, I'll test with MPV and see if I can get some useful error messages at least.


In the new patches, there's another implementation of the rockchip_gem_prime_import_sg_table function. I was using the one provided by the ARM guys before.

I also found the mainline kernel patch that adds RGA, the Rockchip 2D hardware rasterizer, which is always a nice little addition.

There's a patch that adds the real rockchip_pmu_idle_request used by the VPU driver. I thought the VPU needed rockchip_pmu_set_idle_request , but it looks like it needed an unimplemented function instead. The new patch will add it.

Then there are some 10-bit format additions, like NV10 and a Rockchip-specific version of this format, if I understand correctly.


In a special branch, I'll also add the patch that shunts the DRM security, FOR TESTING PURPOSES ONLY. I really think that this patch is uncalled for, but I have to use the same environment as the testers in order to report that some things don't work, when they don't.


So yeah, all of that with the rc3 for tomorrow.


Alright, the new patches are available from the VPU branch of RockMyy.

A new build, including the horrendous DRM security shunting patch, is available through the VPU branch of RockMyy-Build. A new tarball release has been generated too.

The rockchip-vcodec repository has been updated, and the kernel patches required to compile and run the VPU code correctly have been copied in it.


Still untested though, besides running the kernel, checking that the module was loaded correctly, unloading the module, loading it again, checking that /dev/vpu-service and /dev/hevc-service were created and rebooting.
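
The device-node part of that smoke test is easy to script. The sketch below is only an illustration: the two paths come from this post, the `missing_char_devices` helper is a made-up name, and it assumes that the nodes show up as character devices.

```python
import os
import stat

# Device nodes named in the post above.
VPU_NODES = ("/dev/vpu-service", "/dev/hevc-service")


def missing_char_devices(paths=VPU_NODES):
    """Return the paths that do not exist or are not character devices."""
    missing = []
    for path in paths:
        try:
            mode = os.stat(path).st_mode
        except OSError:
            # Node absent, or not reachable by the current user.
            missing.append(path)
            continue
        if not stat.S_ISCHR(mode):
            missing.append(path)
    return missing


if __name__ == "__main__":
    absent = missing_char_devices()
    if absent:
        print("missing: " + ", ".join(absent))
    else:
        print("all VPU nodes present")
```

Running it once after loading the module and once after unloading it gives a quick before/after comparison.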


Alright, trying to compile MPV led me to EGL extensions definitions issues...

That's not common...

../video/out/opengl/hwdec_drmprime_egl.c:39:9: error: ‘EGL_DMA_BUF_PLANE3_FD_EXT’ undeclared here (not in a function)
../video/out/opengl/hwdec_drmprime_egl.c:45:9: error: ‘EGL_DMA_BUF_PLANE3_OFFSET_EXT’ undeclared here (not in a function)
../video/out/opengl/hwdec_drmprime_egl.c:51:9: error: ‘EGL_DMA_BUF_PLANE3_PITCH_EXT’ undeclared here (not in a function)


I searched for the first extension: it's defined in the eglext.h provided by Mesa, thanks to a patch from November 2016. The Mali OpenGL development headers provided by Rockchip do not define that extension. I could add it manually though.


So, basically, trying to compile MPV on Ubuntu LTS seems to be a no-no. So my question is @Kwiboo Did you compile MPV ? If yes, on which distribution did you compile it ? The latest Ubuntu ? Debian testing ?


I build mpv for LibreELEC using https://github.com/Kwiboo/LibreELEC.tv/blob/rockchip/projects/Rockchip/packages/mpv-rockchip/package.mk#L63 along with the configure options defined earlier in the file.


We are using a patch to update some of the EGL/GLES include files from RK's libmali repo using https://github.com/Kwiboo/LibreELEC.tv/blob/rockchip/projects/Rockchip/packages/mali-rockchip/patches/mali-rockchip-0001-update-include-files.patch

There are also some GL include files included because mpv does not play nice with only GLES headers, see https://github.com/Kwiboo/LibreELEC.tv/tree/rockchip/projects/Rockchip/packages/mpv-rockchip/GL


So far, I'm stuck with :


Assertion p_hal->vcodec_type & ((0x00000200) | (0x00000001) | (0x00000002)) failed at hal_h264d_init:187


Which in MPP code is written like this :


mpp_assert(p_hal->vcodec_type & (HAVE_RKVDEC | HAVE_VPU1 | HAVE_VPU2));


So I'm trying to check what could cause this issue.


This issue is triggered by MPP when I use the MPP test-suite and when I use MPV.
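
To make the failing condition explicit: the assertion only passes when at least one of the three hardware bits is set in `vcodec_type`. The Python sketch below mirrors that check. The flag values are read off the expanded assertion message quoted above, in the same order as the macro names; the helper names are made up for illustration and do not exist in MPP.

```python
# Flag values taken from the expanded assertion message in this post.
HAVE_VPU1 = 0x00000001
HAVE_VPU2 = 0x00000002
HAVE_RKVDEC = 0x00000200

DECODER_MASK = HAVE_RKVDEC | HAVE_VPU1 | HAVE_VPU2


def detected_decoders(vcodec_type):
    """List the decoder blocks whose bit is set in vcodec_type."""
    names = {HAVE_VPU1: "VPU1", HAVE_VPU2: "VPU2", HAVE_RKVDEC: "RKVDEC"}
    return [name for flag, name in names.items() if vcodec_type & flag]


def assertion_would_pass(vcodec_type):
    """Same test as mpp_assert(vcodec_type & (RKVDEC | VPU1 | VPU2))."""
    return bool(vcodec_type & DECODER_MASK)


# A value with none of the three bits set reproduces the failure.
print(assertion_would_pass(0x0), detected_decoders(0x0))
print(assertion_would_pass(0x201), detected_decoders(0x201))
```

In other words, hitting this assertion suggests that MPP did not detect any hardware decoder at all, which points at detection or build configuration rather than at the H.264 code path itself.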


Alright, issue resolved, turns out that you have to compile MPP with the options -DHAVE_DRM=ON and -DRKPLATFORM=ON .


The second option appears when you do cmake -L /path/to/mpp but not the first one, of course. If you miss the HAVE_DRM option, MPP will be compiled with ION support instead ! Unusable on Linux systems with DRM ! Isn't that nice ?
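
Since HAVE_DRM is so easy to miss, here is a small Python sketch that builds the configure command with both options and then checks a CMakeCache.txt for them. It relies only on the `NAME:TYPE=VALUE` layout of CMake cache entries; the helper names and the /path/to/mpp placeholder are illustrative, and the check deliberately accepts only the literal value ON used in this post.

```python
import re

# Options named in the post above, with the values used there.
REQUIRED = {"HAVE_DRM": "ON", "RKPLATFORM": "ON"}

# CMakeCache.txt stores entries as NAME:TYPE=VALUE.
CACHE_RE = re.compile(r"^([A-Za-z_][A-Za-z0-9_]*):[A-Z]+=(.*)$")


def configure_command(mpp_source_dir):
    """Build the cmake invocation with both options set explicitly."""
    flags = ["-D{}={}".format(name, value) for name, value in sorted(REQUIRED.items())]
    return ["cmake"] + flags + [mpp_source_dir]


def unset_options(cache_text):
    """Return the required options that are not ON in the given cache text."""
    found = {}
    for line in cache_text.splitlines():
        match = CACHE_RE.match(line.strip())
        if match:
            found[match.group(1)] = match.group(2)
    return [name for name, want in sorted(REQUIRED.items()) if found.get(name) != want]


print(" ".join(configure_command("/path/to/mpp")))
# A cache where only RKPLATFORM was set reports HAVE_DRM as missing.
print(unset_options("RKPLATFORM:BOOL=ON\n"))
```

Feeding it the contents of the CMakeCache.txt from the build directory before compiling would catch an ION-only build early.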


Anyway, got it compiled correctly, launched the MPP decoder tests and BOOM !



Now, I'll have to figure out why it crashed like this...




On the linux-rockchip IRC channel, stdint advised me that the DRM IOMMU code of this driver is unmaintainable and should be removed and rewritten.

At the same time wzyy2 told me to try making this driver a bit more generic, since too many raw APIs are used, which is asking for bugs and crashes.


Therefore, here I am, creating a branch named DRM_rewrite in my rockchip-vcodec repo, in order to start rewriting the DRM IOMMU code, which is not my specialization, but I'll give it a try.


If anyone with knowledge about IOMMU workings wants to play with the code, it's here. I'm also documenting a few things in the Wiki associated with the repository.


Sometimes I'm really wondering if these two guys are working for Rockchip, or are just big Rockchip fans. ( ̄〜 ̄;)


Anyway, I'll see if that can be done using the DRM PRIME technology, instead of playing with "Yo Moos Moos" everywhere.

I thought that the whole FD thing would bring issues, but looking at the ION part and how the old ION API worked, it seems that the FD just points to DMA buffers anyway. So getting the *dmabuf from the fd roughly equates to dma_buf_get(fd). DMA buffers can then be used by DRM PRIME calls.


I still need to understand what the vcodec_service expects from these "IOMMU" parts, though.


That said if anyone knows any other ARM oriented VPU driver that integrates with the DRM drivers, let me know. I'll take a look at how they do things and if they use some standard API I neglected.


Nice catch ! This might prove useful, yeah.


Reading this document, I still don't understand why they're dead focused on having technology-specific buffers when all those techs use DMA buffers in the end. At least internally.


I'm currently looking at DRM PRIME and trying to understand if you can, as a user-space application, allocate a DMA buffer, pass it to something else (user-space or kernel-space) so that it writes some bitmap in it, and then display the bitmap on the screen using DRM calls. If that can be done, I'm throwing out the DRM/ION part of the VPU driver and letting the user-space app/MPP do the buffer allocation.
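
For the allocation half of that idea, here is an untested Python sketch of what a user-space application could do: create a "dumb" buffer on the DRM device and export it as a PRIME file descriptor, which is a dma-buf fd. The ioctl numbers and struct layouts follow the DRM uapi headers (drm.h and drm_mode.h) as I understand them. The generic Linux ioctl encoding used on x86 and ARM is assumed, `export_dumb_buffer` is a made-up helper name, and nothing here has been verified on an RK3288.

```python
import fcntl
import os
import struct

# Generic Linux _IOC direction bits (x86, ARM); a few architectures differ.
_IOC_WRITE, _IOC_READ = 1, 2


def _iowr(type_char, nr, size):
    """Equivalent of the kernel's _IOWR(type, nr, struct) macro."""
    return ((_IOC_READ | _IOC_WRITE) << 30) | (size << 16) | (ord(type_char) << 8) | nr


# struct drm_mode_create_dumb: height, width, bpp, flags, handle, pitch (u32), size (u64)
CREATE_DUMB_FMT = "=IIIIIIQ"
# struct drm_prime_handle: handle, flags (u32), fd (s32)
PRIME_HANDLE_FMT = "=IIi"

DRM_IOCTL_MODE_CREATE_DUMB = _iowr("d", 0xB2, struct.calcsize(CREATE_DUMB_FMT))
DRM_IOCTL_PRIME_HANDLE_TO_FD = _iowr("d", 0x2D, struct.calcsize(PRIME_HANDLE_FMT))


def export_dumb_buffer(card_path, width, height, bpp=32):
    """Allocate a dumb buffer and return (prime_fd, pitch, size).

    Raises OSError if the device is missing or rejects a request.
    """
    card = os.open(card_path, os.O_RDWR | os.O_CLOEXEC)
    try:
        req = bytearray(struct.pack(CREATE_DUMB_FMT, height, width, bpp, 0, 0, 0, 0))
        fcntl.ioctl(card, DRM_IOCTL_MODE_CREATE_DUMB, req)  # fills handle, pitch, size
        handle, pitch, size = struct.unpack(CREATE_DUMB_FMT, req)[4:]
        # DRM_CLOEXEC has the same value as O_CLOEXEC.
        req = bytearray(struct.pack(PRIME_HANDLE_FMT, handle, os.O_CLOEXEC, -1))
        fcntl.ioctl(card, DRM_IOCTL_PRIME_HANDLE_TO_FD, req)  # fills fd
        prime_fd = struct.unpack(PRIME_HANDLE_FMT, req)[2]
        return prime_fd, pitch, size
    finally:
        os.close(card)
```

A call such as `export_dumb_buffer("/dev/dri/card0", 1920, 1080)` would return the fd to hand over to the other side (the device path is an assumption too). How the VPU driver should then map that fd through its IOMMU is exactly the open question in this thread, so the sketch stops there.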


So, after understanding that half of the identifiers and function names in this VPU driver were very badly chosen, I found that the IOMMU DRM driver actually has nothing to do with DRM at all. There isn't a single DRM API call in this part of the driver.

This part should have been called IOMMU DMA OPS because that's the only thing it's dealing with.


Now, I found that some parts of the failing code were actual copy-pastes of IOMMU code from the kernel. However, trying to use the right kernel API proved to be quite difficult on Rockchip systems. To say the least, most of the new IOMMU DMA API calls don't work or even crash the machine, with a NULL pointer dereference most of the time. I don't know if that's because the VPU IOMMU isn't initialized correctly or something. Playing with IOMMU groups, which seem to be "The Right Way" according to different LWN documents I found, proved to be much worse. By default, the IOMMU groups assigned to the VPU IOMMU do not seem to have any IOMMU domain preset, which makes iommu_get_domain_for_dev, and most of the new API calls with it, fail. Trying to force a domain onto these groups freezes the system...


My current plan is now to write a toy version of the driver that concentrates on the biggest issues with the real driver :

  • Write a little driver client that tries to pass a DMA Buffer allocated on the GPU through DRM, using PRIME File Descriptors, to the VPU;
  • Try to map the content of the DMA Buffer, represented by that File Descriptor, in the VPU memory using the IOMMU and the DMA API;
  • Then, whether that toy version fails or not, contact Mark Yao or "djkurtz", the Google guy who also worked on the Rockchip IOMMU, and see if they can provide at least some hints on how to perform these operations correctly using the Rockchip IOMMU API of mainline kernels.


Once done correctly, I'll try to reimport the code in the driver and see if that resolves the situation.


At the same time, I'm trying to find the real VPU documentation, that defines all the registers used by the VPU, what they're for, ...

Currently, trying to figure out these things from the non-working VPU driver is way too cumbersome. As stated before, most of the identifiers (function names, variables) are badly chosen and barely reflect their use or real utility, if at all.

The Wiki only provides the MPP documentation, which gives no information about how to code a driver for the VPU itself, like which addresses should be written, which addresses should be read, ...


There's some information in the reference documents I got a while ago, but that's it. These documents refer to other documents named VDPU_SWReg_Map.pdf and VEPU_SWReg_Map.pdf, which should provide the VPU register list and their uses. However, I cannot find these documents.




I started adding some current Rockchip patches to the 4.13 dtsi for rk3288, there are a couple iommu entries added that are related to the vcodec.  I haven't pushed them yet because either adding them in "disabled" state or the 64-bit register values in these entries shut down my HDMI output, I had to status ok them in the dts for the board and fix the 64-bit entries.


For the curious, Rockchip is pushing to use 64-bit values to support lpae: https://patchwork.kernel.org/patch/9878101/


They are making other patches assuming this one will go through, although it hasn't been accepted yet.







LPAE... If this "thing" is as well-done as it was on X86, that's clearly not a smart move. Managing tiny virtual special windows with a double dose of address translation just to map more than 4GB on systems where you should not need *that* much sounds pretty much useless.


Anyway, the DTS files of Rockchip kernels actually set the IOMMU to "okay" directly in the DTSI file. It seems that the VPU being absent is extremely exceptional on RK3288 systems. So I did the same in my kernel.


Not until I can run Crysis on it. ^_^


Also, the thing is, will my software (or the OS) be able to map more than 4GB without some additional TLB sorcery ? ( ̄〜 ̄ )


Anyway, I guess I'll just rework my patch to set them to "okay" again, then, until I find a RK3288 board without any VPU chip integrated.


Hmm, it may be about power consumption as well, if the driver doesn't bring it up (like a headless application) you might save some mW.


As it stands I'm going to patch to enable them all in the board DTS's.  I'll take a look at that bluetooth driver tonight as well.


If it could do PlanetSide 2 I'd be happy.  Of course that game makes my i7-6800k cry, and it's a 4-year old game...


Alright, I think I got something working !


MPV using MPP seems to work and is able to read a 1080p WebM file and output a very fluid image !
However, I only tested this with 1m30s sample files so I'll have to test this more seriously.


Still, having the VPU driver working on 4.13 kernels is nice ! Now, I'll need testers !


So here's a patched kernel build, including the VPU driver in it :


This build includes patches that make logging VERY NOISY when playing files. If everything works nicely, I'll remove the noise.


Here's the repository containing the patches applied :


And here's the patch itself :


Here's the repository containing the working VPU code :




Testing the VPU


Now, in order to test the VPU driver, you'll need something that works with it !

I tested it with MPP and MPV. So if you want to test it like I did, you'll have to recompile MPP and MPV, and also know how to download and use ARM Mali user-space binary drivers and make them work through the DRM interface.



Have fun !


Whether everything works or you got a crash, don't hesitate to reply on this thread.
If something went wrong, please provide any crash message that might appear in dmesg !
