Kwiboo

Members
  • Content Count

    19

About Kwiboo

  • Rank
    Member

  1. Kwiboo

    The VPU driver

    I was familiar with the vpu2 hw regs that needed to be set from creating an experimental mpeg-2 hwaccel for the rk vcodec kernel driver some time ago. After collaborating with @jernej to create a v4l2 request api hwaccel and getting it working with the Allwinner cedrus driver, I learned enough v4l2 to get the rockchip vpu MPEG-2 decoder to work on my Rock64. Ezequiel Garcia was then very helpful and got my initial work (decoder boilerplate was copied from encoder and chromium os) ready for upstream and submitted the MPEG-2 decoder for rk3399 on top of his decoder boilerplate work. RK3288 and RK3328 were left out as they require clk and drm changes to work properly; patches are being prepared for submission. The rockchip-vpu-regtool was created to help set correct hw regs for both vpu1 and vpu2. mpeg2.txt was created based on mpp hal code; some imx-vpu-hantro code along with some docs was also useful for getting more insight into the rockchip vpu.
  2. Kwiboo

    The VPU driver

    https://github.com/Kwiboo/rockchip-vpu-regtool may also be interesting, it generated the template vpu code used for the rk mpeg-2 decoder. I will push an update later that includes txt-files with the hw regs that need to be set for other codecs; buffer and reference frame handling code will need to be created by hand.
  3. Kwiboo

    The VPU driver

    For mpeg-2 on rk3288/rk3328 there are some other patches needed, see my rockchip-5.x-vpu branch for working rk3288/rk3328 mpeg-2 decoding on v5.0-rc6. clk, drm and dts patches will be sent upstream any day now. Also check out https://github.com/mpv-player/mpv/pull/6461 and the linked ffmpeg hwaccel if you want to use mpv or kodi-gbm for testing. I recently pushed dynamic selection of the media/video device to the hwaccel, so it should work without forcing the decoder to /dev/video0. I pushed two libreelec test images to http://kwiboo.libreelec.tv/test/ for tinker board and rock64 if you want to test the mpeg-2 decoder; they include patches from my rockchip-5.x-rebase and rockchip-5.x-vpu linux branches.
  4. Kwiboo

    Asus Tinkerboard

    For LibreELEC we use the latest RK BSP kernel on TinkerBoard, MiQi, RK3328 and RK3399 and it has been pretty "stable" the last few months. I do not have a single branch with all changes and instead generate patches based on the RK BSP release-4.4 branch. Each branch mainly contains patches picked from mainline to improve multimedia. Linux tree: https://github.com/Kwiboo/linux-rockchip Patches: https://github.com/LibreELEC/LibreELEC.tv/tree/master/projects/Rockchip/patches/linux/rockchip-4.4 We are currently using a release-4.4 tag as base that includes the removed rk1808/rk3399pro commits from the BSP tree. I usually just merge the latest LSK 4.4 android to include the latest 4.4 patches, but anything newer than the current RK BSP seems to cause issues for BT/WiFi on my TinkerBoard.
  5. My build of tinymembench is for armhf and includes github PR9 and PR10. Will post my aarch64 numbers once I have run all possible combos of arm/aarch64 + 786/933/1066 MHz + 4.4/4.16 linux + rock64/roc-rk3328-cc with hdmi not connected.
  6. @Da Xue did you run tinymembench with or without hdmi connected? Having hdmi output active makes a rather big impact on memory performance. I have added my ROCK64 arm tinymembench runs at 786MHz vs 933MHz without hdmi/framebuffer connected below, will add my ROC-RK3328-CC 933MHz/1066MHz arm/aarch64 numbers and ROCK64 aarch64 numbers later.

    ROCK64 linux v4.16-rc5 rk3328_ddr_786MHz_v1.12.bin NO-HDMI

        LibreELEC (community): devel-20180315130549-r28356-g63abb08 (RK3328.arm)
        LibreELEC:~ # cat /sys/kernel/debug/clk/clk_summary |grep clk_ddr
        clk_ddrmon      0  0  0    24000000  0  0
        pclk_ddr        3  3  0    98304000  0  0
        pclk_ddr_grf    1  1  0    98304000  0  0
        pclk_ddrstdby   0  0  0    98304000  0  0
        pclk_ddr_mon    1  1  0    98304000  0  0
        pclk_ddr_msch   1  1  0    98304000  0  0
        pclk_ddrupctl   0  0  0    98304000  0  0
        pclk_ddrphy     1  1  0    75000000  0  0
        clk_ddr         2  2  0  1572000000  0  0
        aclk_ddrupctl   0  0  0  1572000000  0  0
        clk_ddrupctl    1  1  0  1572000000  0  0
        clk_ddrmsch     1  1  0  1572000000  0  0
        LibreELEC:~ # ./tinymembench
        tinymembench v0.4.9 (simple benchmark for memory throughput and latency)

        ==========================================================================
        == Memory bandwidth tests                                               ==
        ==                                                                      ==
        == Note 1: 1MB = 1000000 bytes                                          ==
        == Note 2: Results for 'copy' tests show how many bytes can be          ==
        ==         copied per second (adding together read and writen           ==
        ==         bytes would have provided twice higher numbers)              ==
        == Note 3: 2-pass copy means that we are using a small temporary buffer ==
        ==         to first fetch data into it, and only then write it to the   ==
        ==         destination (source -> L1 cache, L1 cache -> destination)    ==
        == Note 4: If sample standard deviation exceeds 0.1%, it is shown in    ==
        ==         brackets                                                     ==
        ==========================================================================

        C copy backwards                                     : 1642.7 MB/s (1.4%)
        C copy backwards (32 byte blocks)                    : 1527.2 MB/s (1.5%)
        C copy backwards (64 byte blocks)                    : 1568.9 MB/s (1.0%)
        C copy                                               : 1582.5 MB/s (0.8%)
        C copy prefetched (32 bytes step)                    : 1793.4 MB/s
        C copy prefetched (64 bytes step)                    : 1776.3 MB/s
        C 2-pass copy                                        : 1617.7 MB/s (0.2%)
        C 2-pass copy prefetched (32 bytes step)             : 1619.7 MB/s
        C 2-pass copy prefetched (64 bytes step)             : 1668.2 MB/s
        C fill                                               : 5843.3 MB/s
        C fill (shuffle within 16 byte blocks)               : 5843.2 MB/s
        C fill (shuffle within 32 byte blocks)               : 5842.9 MB/s
        C fill (shuffle within 64 byte blocks)               : 5842.8 MB/s
        ---
        standard memcpy                                      : 1765.6 MB/s (0.9%)
        standard memset                                      : 3444.7 MB/s
        ---
        NEON read                                            : 2797.5 MB/s (0.7%)
        NEON read prefetched (32 bytes step)                 : 4143.6 MB/s
        NEON read prefetched (64 bytes step)                 : 4430.5 MB/s
        NEON read 2 data streams                             : 2511.4 MB/s
        NEON read 2 data streams prefetched (32 bytes step)  : 4163.7 MB/s
        NEON read 2 data streams prefetched (64 bytes step)  : 4399.6 MB/s
        NEON copy                                            : 1644.8 MB/s (0.3%)
        NEON copy prefetched (32 bytes step)                 : 1782.9 MB/s
        NEON copy prefetched (64 bytes step)                 : 1785.2 MB/s (0.2%)
        NEON unrolled copy                                   : 1816.6 MB/s (0.7%)
        NEON unrolled copy prefetched (32 bytes step)        : 2118.3 MB/s
        NEON unrolled copy prefetched (64 bytes step)        : 2067.9 MB/s
        NEON copy backwards                                  : 1805.6 MB/s (0.3%)
        NEON copy backwards prefetched (32 bytes step)       : 1902.1 MB/s
        NEON copy backwards prefetched (64 bytes step)       : 1893.7 MB/s
        NEON 2-pass copy                                     : 1770.3 MB/s
        NEON 2-pass copy prefetched (32 bytes step)          : 1874.8 MB/s
        NEON 2-pass copy prefetched (64 bytes step)          : 1898.7 MB/s
        NEON unrolled 2-pass copy                            : 1638.9 MB/s
        NEON unrolled 2-pass copy prefetched (32 bytes step) : 1567.9 MB/s
        NEON unrolled 2-pass copy prefetched (64 bytes step) : 1664.5 MB/s
        NEON fill                                            : 5849.6 MB/s
        NEON fill backwards                                  : 5849.4 MB/s
        VFP copy                                             : 1762.0 MB/s (1.4%)
        VFP 2-pass copy                                      : 1767.1 MB/s
        ARM fill (STRD)                                      : 3444.6 MB/s
        ARM fill (STM with 8 registers)                      : 5838.8 MB/s
        ARM fill (STM with 4 registers)                      : 5102.5 MB/s
        ARM copy prefetched (incr pld)                       : 1773.0 MB/s
        ARM copy prefetched (wrap pld)                       : 1762.1 MB/s
        ARM 2-pass copy prefetched (incr pld)                : 1594.7 MB/s
        ARM 2-pass copy prefetched (wrap pld)                : 1592.5 MB/s

        ==========================================================================
        == Memory latency test                                                  ==
        ==                                                                      ==
        == Average time is measured for random memory accesses in the buffers   ==
        == of different sizes. The larger is the buffer, the more significant   ==
        == are relative contributions of TLB, L1/L2 cache misses and SDRAM      ==
        == accesses. For extremely large buffer sizes we are expecting to see   ==
        == page table walk with several requests to SDRAM for almost every      ==
        == memory access (though 64MiB is not nearly large enough to experience ==
        == this effect to its fullest).                                         ==
        ==                                                                      ==
        == Note 1: All the numbers are representing extra time, which needs to  ==
        ==         be added to L1 cache latency. The cycle timings for L1 cache ==
        ==         latency can be usually found in the processor documentation. ==
        == Note 2: Dual random read means that we are simultaneously performing ==
        ==         two independent memory accesses at a time. In the case if    ==
        ==         the memory subsystem can't handle multiple outstanding       ==
        ==         requests, dual random read has the same timings as two       ==
        ==         single reads performed one after another.                    ==
        ==========================================================================

        block size : single random read / dual random read, [MADV_NOHUGEPAGE]
              1024 :    0.0 ns /    0.0 ns
              2048 :    0.0 ns /    0.0 ns
              4096 :    0.0 ns /    0.0 ns
              8192 :    0.0 ns /    0.0 ns
             16384 :    0.0 ns /    0.0 ns
             32768 :    0.0 ns /    0.0 ns
             65536 :    5.3 ns /    9.0 ns
            131072 :    8.1 ns /   12.6 ns
            262144 :   10.2 ns /   15.1 ns
            524288 :   67.7 ns /  106.7 ns
           1048576 :  103.0 ns /  142.6 ns
           2097152 :  121.4 ns /  155.5 ns
           4194304 :  137.1 ns /  168.1 ns
           8388608 :  145.5 ns /  175.4 ns
          16777216 :  151.6 ns /  181.5 ns
          33554432 :  154.0 ns /  185.8 ns
          67108864 :  166.0 ns /  207.5 ns

        block size : single random read / dual random read, [MADV_HUGEPAGE]
              1024 :    0.0 ns /    0.0 ns
              2048 :    0.0 ns /    0.0 ns
              4096 :    0.0 ns /    0.0 ns
              8192 :    0.0 ns /    0.0 ns
             16384 :    0.0 ns /    0.0 ns
             32768 :    0.0 ns /    0.0 ns
             65536 :    5.3 ns /    8.9 ns
            131072 :    8.1 ns /   12.4 ns
            262144 :   10.2 ns /   14.7 ns
            524288 :   67.6 ns /  106.5 ns
           1048576 :  103.0 ns /  142.5 ns
           2097152 :  120.9 ns /  154.9 ns
           4194304 :  130.4 ns /  159.6 ns
           8388608 :  135.6 ns /  161.5 ns
          16777216 :  138.3 ns /  162.4 ns
          33554432 :  139.7 ns /  162.8 ns
          67108864 :  140.3 ns /  163.0 ns

    ROCK64 linux v4.16-rc5 rk3328_ddr_933MHz_v1.12.bin NO-HDMI

        LibreELEC (community): devel-20180314065621-r28356-g63abb08 (RK3328.arm)
        LibreELEC:~ # cat /sys/kernel/debug/clk/clk_summary |grep clk_ddr
        clk_ddrmon      0  0  0    24000000  0  0
        pclk_ddr        3  3  0    98304000  0  0
        pclk_ddr_grf    1  1  0    98304000  0  0
        pclk_ddrstdby   0  0  0    98304000  0  0
        pclk_ddr_mon    1  1  0    98304000  0  0
        pclk_ddr_msch   1  1  0    98304000  0  0
        pclk_ddrupctl   0  0  0    98304000  0  0
        pclk_ddrphy     1  1  0    75000000  0  0
        clk_ddr         2  2  0  1848000000  0  0
        aclk_ddrupctl   0  0  0  1848000000  0  0
        clk_ddrupctl    1  1  0  1848000000  0  0
        clk_ddrmsch     1  1  0  1848000000  0  0
        LibreELEC:~ # ./tinymembench
        tinymembench v0.4.9 (simple benchmark for memory throughput and latency)

        ==========================================================================
        == Memory bandwidth tests                                               ==
        ==                                                                      ==
        == Note 1: 1MB = 1000000 bytes                                          ==
        == Note 2: Results for 'copy' tests show how many bytes can be          ==
        ==         copied per second (adding together read and writen           ==
        ==         bytes would have provided twice higher numbers)              ==
        == Note 3: 2-pass copy means that we are using a small temporary buffer ==
        ==         to first fetch data into it, and only then write it to the   ==
        ==         destination (source -> L1 cache, L1 cache -> destination)    ==
        == Note 4: If sample standard deviation exceeds 0.1%, it is shown in    ==
        ==         brackets                                                     ==
        ==========================================================================

        C copy backwards                                     : 1847.4 MB/s (1.4%)
        C copy backwards (32 byte blocks)                    : 1602.5 MB/s (0.9%)
        C copy backwards (64 byte blocks)                    : 1693.4 MB/s (1.4%)
        C copy                                               : 1728.3 MB/s (1.7%)
        C copy prefetched (32 bytes step)                    : 1864.7 MB/s
        C copy prefetched (64 bytes step)                    : 1881.8 MB/s
        C 2-pass copy                                        : 1741.5 MB/s
        C 2-pass copy prefetched (32 bytes step)             : 1738.7 MB/s
        C 2-pass copy prefetched (64 bytes step)             : 1782.4 MB/s
        C fill                                               : 6862.1 MB/s
        C fill (shuffle within 16 byte blocks)               : 6861.9 MB/s
        C fill (shuffle within 32 byte blocks)               : 6862.2 MB/s
        C fill (shuffle within 64 byte blocks)               : 6862.1 MB/s
        ---
        standard memcpy                                      : 1780.1 MB/s (1.3%)
        standard memset                                      : 3444.8 MB/s
        ---
        NEON read                                            : 2944.6 MB/s (0.7%)
        NEON read prefetched (32 bytes step)                 : 4191.8 MB/s
        NEON read prefetched (64 bytes step)                 : 4554.5 MB/s
        NEON read 2 data streams                             : 2841.0 MB/s
        NEON read 2 data streams prefetched (32 bytes step)  : 4230.7 MB/s
        NEON read 2 data streams prefetched (64 bytes step)  : 4572.2 MB/s
        NEON copy                                            : 1836.4 MB/s (0.5%)
        NEON copy prefetched (32 bytes step)                 : 1948.9 MB/s (0.2%)
        NEON copy prefetched (64 bytes step)                 : 1970.8 MB/s (0.2%)
        NEON unrolled copy                                   : 2000.7 MB/s (0.5%)
        NEON unrolled copy prefetched (32 bytes step)        : 2345.4 MB/s
        NEON unrolled copy prefetched (64 bytes step)        : 2362.7 MB/s
        NEON copy backwards                                  : 1997.4 MB/s (0.3%)
        NEON copy backwards prefetched (32 bytes step)       : 2089.9 MB/s
        NEON copy backwards prefetched (64 bytes step)       : 2086.2 MB/s (0.2%)
        NEON 2-pass copy                                     : 1910.8 MB/s
        NEON 2-pass copy prefetched (32 bytes step)          : 2001.1 MB/s
        NEON 2-pass copy prefetched (64 bytes step)          : 2093.7 MB/s
        NEON unrolled 2-pass copy                            : 1744.3 MB/s
        NEON unrolled 2-pass copy prefetched (32 bytes step) : 1574.2 MB/s
        NEON unrolled 2-pass copy prefetched (64 bytes step) : 1703.4 MB/s
        NEON fill                                            : 6876.7 MB/s
        NEON fill backwards                                  : 6876.4 MB/s
        VFP copy                                             : 1950.3 MB/s (1.4%)
        VFP 2-pass copy                                      : 1886.4 MB/s
        ARM fill (STRD)                                      : 3444.9 MB/s
        ARM fill (STM with 8 registers)                      : 6648.4 MB/s
        ARM fill (STM with 4 registers)                      : 5115.9 MB/s
        ARM copy prefetched (incr pld)                       : 1712.9 MB/s (0.2%)
        ARM copy prefetched (wrap pld)                       : 1758.2 MB/s (0.3%)
        ARM 2-pass copy prefetched (incr pld)                : 1575.0 MB/s
        ARM 2-pass copy prefetched (wrap pld)                : 1574.2 MB/s

        ==========================================================================
        == Memory latency test                                                  ==
        ==                                                                      ==
        == Average time is measured for random memory accesses in the buffers   ==
        == of different sizes. The larger is the buffer, the more significant   ==
        == are relative contributions of TLB, L1/L2 cache misses and SDRAM      ==
        == accesses. For extremely large buffer sizes we are expecting to see   ==
        == page table walk with several requests to SDRAM for almost every      ==
        == memory access (though 64MiB is not nearly large enough to experience ==
        == this effect to its fullest).                                         ==
        ==                                                                      ==
        == Note 1: All the numbers are representing extra time, which needs to  ==
        ==         be added to L1 cache latency. The cycle timings for L1 cache ==
        ==         latency can be usually found in the processor documentation. ==
        == Note 2: Dual random read means that we are simultaneously performing ==
        ==         two independent memory accesses at a time. In the case if    ==
        ==         the memory subsystem can't handle multiple outstanding       ==
        ==         requests, dual random read has the same timings as two       ==
        ==         single reads performed one after another.                    ==
        ==========================================================================

        block size : single random read / dual random read, [MADV_NOHUGEPAGE]
              1024 :    0.0 ns /    0.0 ns
              2048 :    0.0 ns /    0.0 ns
              4096 :    0.0 ns /    0.0 ns
              8192 :    0.0 ns /    0.0 ns
             16384 :    0.0 ns /    0.0 ns
             32768 :    0.0 ns /    0.0 ns
             65536 :    5.3 ns /    9.0 ns
            131072 :    8.1 ns /   12.6 ns
            262144 :   10.1 ns /   14.7 ns
            524288 :   63.5 ns /   99.9 ns
           1048576 :   96.4 ns /  133.6 ns
           2097152 :  113.6 ns /  145.8 ns
           4194304 :  128.4 ns /  158.4 ns
           8388608 :  136.9 ns /  165.8 ns
          16777216 :  141.6 ns /  171.8 ns
          33554432 :  145.3 ns /  176.3 ns
          67108864 :  155.5 ns /  195.7 ns

        block size : single random read / dual random read, [MADV_HUGEPAGE]
              1024 :    0.0 ns /    0.0 ns
              2048 :    0.0 ns /    0.0 ns
              4096 :    0.0 ns /    0.0 ns
              8192 :    0.0 ns /    0.0 ns
             16384 :    0.0 ns /    0.0 ns
             32768 :    0.0 ns /    0.0 ns
             65536 :    5.3 ns /    8.9 ns
            131072 :    8.1 ns /   12.4 ns
            262144 :   10.1 ns /   14.7 ns
            524288 :   63.4 ns /   99.8 ns
           1048576 :   96.5 ns /  133.7 ns
           2097152 :  113.2 ns /  145.4 ns
           4194304 :  121.8 ns /  149.8 ns
           8388608 :  126.4 ns /  151.7 ns
          16777216 :  128.9 ns /  152.4 ns
          33554432 :  130.2 ns /  152.9 ns
          67108864 :  130.9 ns /  153.0 ns
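To put the two ROCK64 runs above side by side, here is a minimal sketch that computes the relative change between the 786MHz and 933MHz results. The figures are copied from the logs in the post; which rows to compare and the helper names (`percent_change`, `summarize`) are my own choices for illustration, not part of tinymembench.

```python
# Selected tinymembench results (MB/s) copied from the two ROCK64 runs:
# (786MHz DDR blob, 933MHz DDR blob). The choice of rows is arbitrary.
RESULTS = {
    "C fill": (5843.3, 6862.1),
    "NEON read": (2797.5, 2944.6),
    "NEON copy": (1644.8, 1836.4),
    "standard memcpy": (1765.6, 1780.1),
    "standard memset": (3444.7, 3444.8),
}


def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


def summarize(results):
    """Map each test name to its percent change, rounded to one decimal."""
    return {
        name: round(percent_change(slow, fast), 1)
        for name, (slow, fast) in results.items()
    }


if __name__ == "__main__":
    print(f"DDR clock        {percent_change(786, 933):+6.1f}%")
    for name, delta in summarize(RESULTS).items():
        print(f"{name:<16} {delta:+6.1f}%")
```

Only the fill tests track the DDR clock increase closely; the copy and read results improve by less, and memset barely moves, so the gain depends a lot on the access pattern.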
  7. Kwiboo

    The VPU driver

    I build mpv for LibreELEC using https://github.com/Kwiboo/LibreELEC.tv/blob/rockchip/projects/Rockchip/packages/mpv-rockchip/package.mk#L63 along with the configure options defined earlier in the file. We are using a patch to update some of the EGL/GLES include files from RK's libmali repo, see https://github.com/Kwiboo/LibreELEC.tv/blob/rockchip/projects/Rockchip/packages/mali-rockchip/patches/mali-rockchip-0001-update-include-files.patch There are also some GL include files included because mpv does not play nice with only GLES headers, see https://github.com/Kwiboo/LibreELEC.tv/tree/rockchip/projects/Rockchip/packages/mpv-rockchip/GL
  8. Kwiboo

    The VPU driver

    Unfortunately it is not possible to link kodi krypton and mpv to the same ffmpeg version: mpv requires a rather new ffmpeg version and uses the newer send_packet/receive_frame api and automatic bit-streaming, while kodi krypton uses the older avcodec_decode_video2 api. There were also some audio delays and missed packets when kodi krypton was linked to the newer ffmpeg version. Kodi leia will require the new ffmpeg version, and the patches will be rebased and updated at a later date.

    For optimal media playback a few kernel changes have been required, see my rockchip-4.4 kernel tree at https://github.com/Kwiboo/linux-rockchip/commits/rockchip-4.4

      • Changed to the performance gpu governor as default, the simple_ondemand governor is too slow and limits the gpu rate to ~30fps for gui animations
      • Switched primary and overlay drm planes so we can have gui on top of video
      • Raised the vpu clock to 600MHz for hevc decoding on rk3288
      • Allow framebuffer and videomodes not to have the same size, for scaling of gui from 1080p to 4k (implemented in plexmediaplayer and on my todo list for kodi)
      • Skip waiting on vblank for set plane drm calls, this was recently merged in Rockchip's release-4.4 tree and allows for rendering video and gui planes using the legacy drm api without double vsync
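To check which governor the gpu is actually running with, the kernel's devfreq class exposes a `governor` file per device in sysfs. A minimal sketch, assuming the gpu is registered as a devfreq device under `/sys/class/devfreq`; the device name varies per SoC and kernel, and `read_devfreq_governors` is a hypothetical helper, not something from the LibreELEC tree.

```python
from pathlib import Path


def read_devfreq_governors(root="/sys/class/devfreq"):
    """Return {device_name: current_governor} for each devfreq device under root.

    Returns an empty dict when the directory does not exist, e.g. on a
    kernel built without devfreq support.
    """
    governors = {}
    base = Path(root)
    if not base.is_dir():
        return governors
    for dev in sorted(base.iterdir()):
        gov_file = dev / "governor"
        if gov_file.is_file():
            governors[dev.name] = gov_file.read_text().strip()
    return governors


if __name__ == "__main__":
    for name, governor in read_devfreq_governors().items():
        print(f"{name}: {governor}")
```

On a kernel with the change described above, the gpu entry would be expected to report `performance` rather than `simple_ondemand`.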
  9. Kwiboo

    Tinker Board Sound

    @TonyMac32 on LibreELEC for TinkerBoard we use a custom alsa config to force the use of device 2, unfortunately it also disables use of any other USB-Audio device, see https://github.com/Kwiboo/LibreELEC.tv/blob/rockchip/projects/Rockchip/devices/TinkerBoard/filesystem/usr/share/alsa/cards/USB-Audio.conf We have also seen issues with DMA using Rockchip's release-4.4 kernel, see https://github.com/Kwiboo/linux-rockchip/issues/16 and https://github.com/Kwiboo/linux-rockchip/compare/rockchip-4.4...rockchip-4.4-pl330 for our current solution (use pl330 driver from upstream 4.12). There is also some sort of buffer issue when switching between 2ch PCM, multi-channel PCM or NL-PCM/HBR HDMI modes, sometimes channels might get shifted one channel or garbage gets played depending on HDMI audio mode. We had a similar issue on the Amlogic S905 SoC that uses the same DesignWare HDMI 2.0 TX IP, the hack used for Amlogic was to force a reset by switching to 2ch PCM mode before switching to any other mode, but on Amlogic multi-channel PCM, NL-PCM and HBR all use the PCM audio driver while on Rockchip I2S is used for all HDMI audio modes.
  10. Kwiboo

    The VPU driver

    Cool, I will check how this VPU driver works with a 4.13 kernel on LibreELEC later this weekend. I would suggest testing the VPU driver and mpp using Rockchip's gstreamer plugins or possibly LongChair's mpv/ffmpeg. I don't know much about gstreamer but it seems to be what Rockchip's developers use to test video decoding. Both mpv/plexmediaplayer and kodi use DRM/KMS, the gbm version of libmali and a few kernel hacks on LibreELEC for smooth 1080p/4k video playback, they most likely need modifications to run on x11/wayland. For a version of ffmpeg that uses mpp for decoding, see https://github.com/LongChair/FFmpeg/commits/rockchip or the rockchip-new branch. LongChair also has an mpv version for both those branches of ffmpeg at https://github.com/LongChair/mpv/commits/rockchip or the rockchip-new branch. I have an ffmpeg 3.1 backport at https://github.com/Kwiboo/FFmpeg/commits/rockchip-krypton and Kodi Krypton patches at https://github.com/Kwiboo/plex-home-theater/commits/rockchip-krypton. The Kodi Leia patches in the rockchip-leia branch will be updated once we have achieved more stable playback on rk3328 devices using mpv and Kodi Krypton.
  11. Kwiboo

    ROCK64

    I think so, out of the three shipments to LibreELEC developers one made it to its destination in France last week, the second got returned by customs and mine never seems to have reached Sweden at all. But I have been told that new shipments will be made next week.
  12. Kwiboo

    ROCK64

    Replacing the prebuilt rk3328_bl31_v1.34.bin with a custom built bl31.elf from the latest upstream ATF source seems to work, and my device has survived past the old ~15-20 minute lockup limit. The generated bl31.bin ends up being 4GiB for some reason, but using bl31.elf instead of bl31.bin as input to the trust_merger tool seems to work. This was tested on a RK3328 tv box as my ROCK64 shipment seems to have been lost in transit :/
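One possible explanation for the 4GiB bl31.bin (not confirmed in the post): a flat binary made with `objcopy -O binary` spans from the lowest to the highest load address in the ELF and fills the gaps, so two segments loaded far apart produce a huge file. A minimal sketch of that arithmetic; the segment addresses and sizes below are hypothetical and are not taken from the actual bl31.elf.

```python
def flat_binary_size(segments):
    """Size in bytes of a flat image covering all (load_address, size) segments.

    A flat binary starts at the lowest load address and runs to the end of
    the highest segment, with the gaps in between padded out.
    """
    lowest = min(addr for addr, _ in segments)
    highest_end = max(addr + size for addr, size in segments)
    return highest_end - lowest


# Hypothetical layout: one segment low in DRAM and one in on-chip SRAM
# near the top of the 32-bit address space.
SEGMENTS = [
    (0x0001_0000, 0x0002_0000),  # hypothetical DRAM code/data
    (0xFF09_0000, 0x0000_1000),  # hypothetical SRAM section
]

if __name__ == "__main__":
    size = flat_binary_size(SEGMENTS)
    print(f"{size} bytes = {size / 2**30:.2f} GiB")
```

An ELF file stores each segment separately with its own load address, which would be consistent with trust_merger coping fine with bl31.elf.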
  13. Kwiboo

    ROCK64

    Until Rockchip finishes the SPL+DRAM init code for RK3328 we have to use the ddr+miniloader blobs to start u-boot. https://github.com/rockchip-linux/build/blob/debian/mk-uboot.sh#L48-L77 generates a working idbloader.img, uboot.img and trust.img using tools and blobs from the rkbin repo. The BootROM will look for idbloader.img at 0x40 on SPI/eMMC/SD, and the miniloader will expect uboot.img+trust.img at 0x4000 and 0x6000 on the same device it loaded idbloader.img from. The miniloader before v2.43 had a bug that would always load uboot+trust from eMMC (or possibly SPI). http://opensource.rock-chips.com/wiki_Boot_option has some details on the boot process.
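The offsets above can be turned into byte positions and dd commands with a short sketch. It assumes the offsets are counted in 512-byte sectors (the post does not state the unit; this is how the Rockchip boot documentation describes them), and `/dev/sdX` is a placeholder target device, not a real path.

```python
SECTOR_SIZE = 512  # assumption: the offsets are in 512-byte sectors

# Sector offsets as given in the post.
IMAGE_SECTORS = {
    "idbloader.img": 0x40,
    "uboot.img": 0x4000,
    "trust.img": 0x6000,
}


def byte_offset(sector):
    """Convert a sector offset to a byte offset on the device."""
    return sector * SECTOR_SIZE


def dd_command(image, device):
    """Build a dd command line that writes image at its sector offset.

    dd uses 512-byte blocks by default, so seek= is given in sectors.
    """
    return f"dd if={image} of={device} seek={IMAGE_SECTORS[image]} conv=notrunc"


if __name__ == "__main__":
    for image, sector in IMAGE_SECTORS.items():
        kib = byte_offset(sector) // 1024
        print(f"{image:<14} sector {sector:#07x} = {kib} KiB")
        print("  " + dd_command(image, "/dev/sdX"))
```

Under that assumption idbloader.img sits at 32 KiB, uboot.img at 8 MiB and trust.img at 12 MiB from the start of the device.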
  14. Kwiboo

    ROCK64

    You should be able to use whatever partition table and filesystem you like, as long as u-boot can recognize it and finds a bootable partition with the extlinux/extlinux.conf file. That is if you use the 'release' branch from https://github.com/rockchip-linux/u-boot, if you use the Android 'rkproduct' branch it has special needs... For the LibreELEC Rockchip endeavor we currently use a MBR partition table with a bootable fat partition and a second ext4 storage partition, but will probably switch to a GPT partition table in the next update. We also make the first partition start at 16MiB to match https://github.com/rockchip-linux/build images.
  15. Kwiboo

    Asus Tinkerboard

    Is this still a problem? I have never had any issue with getting a persistent ethernet mac address since I started using a u-boot build that included https://github.com/u-boot/u-boot/commit/ecc3bd73b35398d8337096b19493028a29ed038e @Myy @TonyMac32 I was able to get bluetooth working on LibreELEC with the Rockchip LSK 4.4.70 kernel using a pre-compiled rtk_hciattach taken from https://github.com/rockchip-linux/rk-rootfs-build/tree/master/overlay-firmware See https://github.com/Kwiboo/LibreELEC.tv/commit/e4af07d34b0a919adb3d1d3ff2f32c08cf7511b7 for the changes I made, I am still not sure whether the chip_enable_h related changes were necessary or not. Another change that is not part of the commit diff is that I now have CONFIG_RFKILL_GPIO enabled in my kernel config.