Change CMA memory allocation size


Solved by TonyMac32

Posted

Hi,

I use an AML-S905X-CC board, and since the last kernel update things have started moving on the hardware acceleration side: I got mpv hwdec working. I would like to run some tests and start using Kodi for media playback, but I'm worried that the default 256M of CMA memory is not enough.

I have tried to set the CMA memory size through kernel parameters using armbian-config.

I have tried adding "cma=350M" and "extraargs=cma=350M" to the boot environment.

Neither setting works.

The strange thing is that I'm sure "extraargs=cma=350M" passes the option to the kernel parameters, because I have already used extraargs to disable a faulty composition output.
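
One way to confirm that the option really reached the kernel is to look for it in /proc/cmdline. The sketch below is only illustrative: the function name `cma_cmdline_param` and its optional file argument are assumptions made here for testability, not part of any Armbian tool. Note that finding `cma=` on the command line only shows it was passed, not that the kernel honored it.

```shell
# Print the "cma=..." option from a kernel command line file, if present.
# The file defaults to /proc/cmdline; the optional argument is a
# hypothetical convenience so the function can be tried on sample text.
cma_cmdline_param() {
    cmdline_file="${1:-/proc/cmdline}"
    # Put each option on its own line, then keep only the cma= entry.
    tr ' ' '\n' < "$cmdline_file" | grep '^cma='
}
```

Calling `cma_cmdline_param` with no argument prints nothing and returns a non-zero status when the running kernel was booted without a `cma=` option.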

I have checked the syntax of the cma kernel parameter in the Linux kernel documentation and it SHOULD be OK, but I'm not 100% sure:

cma=nn[MG]@[start[MG][-end[MG]]] [ARM,X86,KNL] Sets the size of kernel global memory area for contiguous memory allocations and optionally the placement constraint by the physical address range of memory allocations. A value of 0 disables CMA altogether. For more information, see include/linux/dma-contiguous.h

 

I checked the CMA allocation in dmesg, which reports:

[    0.000000] Reserved memory: created CMA memory pool at 0x000000006b000000, size 256 MiB

 

I also checked it with "cat /proc/meminfo"; near the end it shows:

CmaTotal:         262144 kB
CmaFree:           12772 kB
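
The same check can be condensed into a small helper that reports the pool in MiB. This is a minimal sketch: the name `cma_usage` and the optional file argument are made up for illustration, and it relies only on the `CmaTotal` and `CmaFree` fields shown above.

```shell
# Summarize CMA usage from a meminfo-style file (default: /proc/meminfo).
# cma_usage and its file argument are hypothetical names for this sketch.
cma_usage() {
    meminfo_file="${1:-/proc/meminfo}"
    awk '
        /^CmaTotal:/ { total = $2 }   # second field is the value in kB
        /^CmaFree:/  { free = $2 }
        END {
            if (total == "") { print "no CMA fields found"; exit 1 }
            # %d truncates, so the figures are rounded down to whole MiB
            printf "CMA total %d MiB, free %d MiB, used %d MiB\n",
                   total / 1024, free / 1024, (total - free) / 1024
        }
    ' "$meminfo_file"
}
```

With the two values quoted above, this prints `CMA total 256 MiB, free 12 MiB, used 243 MiB`.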

 

So the question is: how can I change the CMA allocation size, without recompiling the kernel if possible?

Is it possible that the CMA size is hardcoded at compile time and runtime changes are ignored?

 

Thanks for the attention,

have a good day

Posted (edited)

On 8/25/2020 at 4:30 PM, UniformBuffer said: (opening post, quoted in full)

Hi UniformBuffer, what mpv.conf arguments are you using?

hwdec=auto?

 

Thanks

Edited by dante6913
Posted
3 hours ago, dante6913 said: (previous post, quoted in full)

Hi,
I don't have any particular mpv conf. I have only added "ytdl-format=bestvideo[height<=?720][fps<=?60][vcodec!=?vp9]+bestaudio/best" to get YouTube video directly at the resolution of my monitor, nothing more.
Your question made me run some tests: unfortunately I don't get hardware acceleration for YouTube videos (maybe I have to change something), but I'm sure I get it for local video.
Running a VP9 video of my own recorded desktop with "mpv --hwdec=yes ./myvid.mkv", I got:
```
 (+) Video --vid=1 (*) (vp9 1280x800)
[vaapi] libva: vaGetDriverNameByIndex() failed with unknown libva error, driver_name = (null)
Failed to open VDPAU backend libvdpau_nvidia.so: cannot open shared object file: No such file or directory
Using hardware decoding (v4l2m2m-copy).
VO: [gpu] 1280x800 nv12
V: 00:00:01 / 00:00:18 (10%)
....
```

YouTube videos do not use nv12, so maybe that is why mpv does not enable hardware acceleration (I do not get "Using hardware decoding (v4l2m2m-copy).").

"VO: [gpu]" should mean that the video output is GPU accelerated, but I don't think it also means hardware decoding; I have to check this. If I get them accelerated I will send you a message ;)

 

PS: don't take my words as the truth, I'm a beginner on this topic and I'm experimenting to gain some knowledge.

 

 

EDIT: got it. Using "mpv --hwdec=yes --hwdec-codecs=vp9 *youtube video address*" gives hardware decoding, but unfortunately the performance is worse than software decoding. Anyway, I'm glad that something has started working. Good job Armbian team, I know this is not easy!

Posted
23 hours ago, UniformBuffer said: (previous post, quoted in full)

I tried balbes' latest image (ver. 20200828, kernel 5.7.16), but I only got POLLERR output, so hardware decoding failed with all videos, even mp4.

Posted
16 hours ago, dante6913 said: (previous post, quoted in full)

That's strange, what is your Mesa version? Mine is currently 20.1.5. Also, my kernel is 5.7.15, which Armbian currently uses by default.

Posted
3 hours ago, UniformBuffer said: (previous post, quoted in full)

It's from ppa:oibaf/graphics-drivers

Posted
2 hours ago, dante6913 said: (previous post, quoted in full)

I personally don't use PPA repositories, so I don't know what version they provide. You can check yours by simply running

glxinfo | grep "OpenGL version"

Mine gives "OpenGL version string: 2.1 Mesa 20.1.5"

 

My drivers are a bit newer than standard because I added Debian Sid as an additional repository, so I get very cutting-edge updates.

Also, just to be sure, do you have the "meson_vdec" module loaded? It is the kernel module that enables hardware decoding. It should be loaded by default, but who knows?

To check, you can run "lsmod" and look for meson_vdec in the output.

If it is not loaded, you can load it with

sudo modprobe meson-vdec

(note that lsmod lists the module as "meson_vdec"; modprobe treats "-" and "_" in module names as equivalent, so either spelling works when loading it)

This loads it for the current session only, so it does not survive a reboot. To load it on every boot, add "meson-vdec" to /etc/modules.
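
The lsmod check can also be scripted. The helper below is a sketch under stated assumptions: `module_loaded` is a made-up name, and it reads /proc/modules (the file lsmod itself formats), with an optional file argument added only so it can be tried on sample text.

```shell
# Succeed if the named module appears in a /proc/modules-style file.
# module_loaded and its optional file argument are hypothetical.
module_loaded() {
    # The kernel lists loaded modules with underscores, so normalize
    # "meson-vdec" to "meson_vdec" before searching.
    name=$(printf '%s' "$1" | tr '-' '_')
    modules_file="${2:-/proc/modules}"
    # Each line starts with the module name followed by a space.
    grep -q "^${name} " "$modules_file"
}
```

A possible use is `module_loaded meson-vdec || sudo modprobe meson-vdec`, which loads the module only when it is missing.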

 

Posted

Hi,

I'm running some tests to get meson_vdec working. Currently I get v4l2m2m hardware decoding on both Xorg and Wayland, but the video stutters a lot. While playing, I noticed some errors in dmesg like:

[   75.219223] alloc_contig_range: [70c00, 70cf0) PFNs busy
[   75.219421] alloc_contig_range: [70d00, 70df0) PFNs busy
[   75.221725] alloc_contig_range: [70c00, 70cf0) PFNs busy
[   75.221952] alloc_contig_range: [70d00, 70df0) PFNs busy
[   75.223993] alloc_contig_range: [70c00, 70c78) PFNs busy
[   75.224134] alloc_contig_range: [70c80, 70cf8) PFNs busy
[   75.224270] alloc_contig_range: [70d00, 70d78) PFNs busy
[   75.224490] alloc_contig_range: [70d80, 70df8) PFNs busy
[   75.230638] alloc_contig_range: [70c00, 70cf0) PFNs busy
[   75.230801] alloc_contig_range: [70d00, 70df0) PFNs busy
[  168.034741] alloc_contig_range: 82 callbacks suppressed
[  168.034748] alloc_contig_range: [6e900, 705a0) PFNs busy
[  168.038374] alloc_contig_range: [6ea00, 706a0) PFNs busy
[  168.041621] alloc_contig_range: [6ea00, 707a0) PFNs busy
[  168.044211] alloc_contig_range: [6ec00, 708a0) PFNs busy
[  168.046937] alloc_contig_range: [6ed00, 709a0) PFNs busy
[  168.050334] alloc_contig_range: [6ee00, 70aa0) PFNs busy
[  168.054546] alloc_contig_range: [6ee00, 70ba0) PFNs busy
[  168.056783] alloc_contig_range: [6f000, 70ca0) PFNs busy
[  168.057812] alloc_contig_range: [6f100, 70da0) PFNs busy
[  168.059887] alloc_contig_range: [6f200, 70ea0) PFNs busy
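
To get a feel for how often these failures occur and how large the requested ranges are, the log can be summarized with two small helpers. Both names are hypothetical, and the size calculation assumes 4 KiB pages, which is an assumption of this sketch and not something the log states.

```shell
# Count "PFNs busy" messages in kernel log text read from stdin.
# Lines such as "82 callbacks suppressed" are not counted.
count_cma_busy() {
    grep -c 'alloc_contig_range: \[.*) PFNs busy' || true
}

# Size in KiB of a PFN range "[start, end)" given as two hex strings
# without the 0x prefix. ASSUMPTION: 4 KiB pages.
pfn_range_kib() {
    echo $(( (0x$2 - 0x$1) * 4 ))
}
```

For example, `dmesg | count_cma_busy` counts the failures logged so far. Under the 4 KiB assumption, `pfn_range_kib 70c00 70cf0` gives 960 (under 1 MiB), while `pfn_range_kib 6e900 705a0` gives 29312 (about 28.6 MiB), so the later failures concern much larger contiguous requests.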

 

From what I have read, this has something to do with CMA, and looking at "cat /proc/meminfo" I got:

cat /proc/meminfo
MemTotal:        1939744 kB
MemFree:          467424 kB
MemAvailable:    1017360 kB
Buffers:          103756 kB
Cached:           640368 kB
SwapCached:            0 kB
Active:           949408 kB
Inactive:         287908 kB
Active(anon):     528780 kB
Inactive(anon):    61796 kB
Active(file):     420628 kB
Inactive(file):   226112 kB
Unevictable:       35076 kB
Mlocked:            1280 kB
SwapTotal:        969868 kB
SwapFree:         969868 kB
Dirty:                 0 kB
Writeback:             0 kB
AnonPages:        523500 kB
Mapped:           318244 kB
Shmem:             96236 kB
KReclaimable:      61964 kB
Slab:             122056 kB
SReclaimable:      61964 kB
SUnreclaim:        60092 kB
KernelStack:        6076 kB
PageTables:         7564 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:     1939740 kB
Committed_AS:    1838164 kB
VmallocTotal:   135290159040 kB
VmallocUsed:       22676 kB
VmallocChunk:          0 kB
Percpu:             1616 kB
HardwareCorrupted:     0 kB
AnonHugePages:     67584 kB
ShmemHugePages:        0 kB
ShmemPmdMapped:        0 kB
FileHugePages:         0 kB
FilePmdMapped:         0 kB
CmaTotal:         262144 kB
CmaFree:            2712 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
Hugetlb:               0 kB

As you can see, I have only about 2 MB of free CMA memory, so I suppose the bad playback performance COULD be caused by not having enough CMA memory. So I would like to ask again whether someone knows a way to change the CMA size.

As I said in the previous posts, setting it through kernel parameters in armbianEnv.txt does not work. Maybe the CMA size has been hardcoded in the kernel config?

Posted

Hello uniformBuffer,

 

The CMA memory is set by a device tree entry, so to my knowledge that is the only way to change it.  The vdec/venc drivers are presently extremely unoptimized, and the recommended setting is over 800 MB.  This is an obvious problem for a general-purpose distribution: it renders the La Frite 512 MB boards unbootable and leaves no real room for any working tasks on the 1 GB ones.  It may be possible to implement this as an overlay, but in the meantime modifying the device tree is the only route:

 

https://github.com/armbian/build/blob/cc7ab6a6b1d91977bd9e154245307e85f7f76519/patch/kernel/meson64-current/0302-arm64-dts-meson-set-dma-pool-to-896MB.patch

 

This patch gives you the handle and value to get the required 896 MB CMA pool.

Posted
13 hours ago, TonyMac32 said: (previous post, quoted in full)

Thanks for the answer, this explains why hardware decoding stutters so much. Unfortunately I'm not skilled enough to recompile the kernel, and I'm afraid of making the device unbootable (that has happened a lot of times). Also, I don't have an x86 machine to run the Armbian tools. Since, as you said, increasing the CMA that much makes the board useless for any other purpose, I will wait for some improvements. Thanks anyway for the answer!

 

PS: I would like to ask a question out of personal curiosity: do you know of a board that has hardware decoding enabled on the mainline kernel? As far as I know, even Allwinner devices, which have been completely reverse engineered, have trouble enabling it because applications need to support the v4l2_request API. Finding a unicorn seems much easier. This is absolutely not a criticism; I know it is very difficult to get working. I would simply like to know whether there is some other board supported by Armbian that has it enabled, because I would like to use it as a general-purpose/media device, and... I would like to avoid Android as much as possible.

 

Thanks for the attention,

Have a good day

Posted
5 hours ago, UniformBuffer said: (previous post, quoted in part)

 

Thankfully the kernel does not need to be recompiled for this, only the device tree.  See the post below; I believe it is still accurate as far as the device tree decompile/recompile method goes, and it can be done on the device (use the correct device tree for your board ;) )
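
The referenced instructions are not reproduced in this thread, so the following is only a sketch of a typical decompile/recompile round trip with `dtc` (from the device-tree-compiler package), not necessarily the exact method that post describes. The helper names and the `DRY_RUN` switch are assumptions introduced here; by default the functions only print the commands, so nothing under /boot is touched until you opt in.

```shell
# Print each command instead of running it unless DRY_RUN=0 is set.
# (run, dtb_decompile and dtb_compile are hypothetical helper names.)
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Back up a compiled device tree, then decompile it to editable source.
dtb_decompile() {
    dtb="$1"; dts="$2"
    run cp "$dtb" "$dtb.bak"
    run dtc -I dtb -O dts -o "$dts" "$dtb"
}

# Compile the edited source back over the original .dtb.
dtb_compile() {
    dts="$1"; dtb="$2"
    run dtc -I dts -O dtb -o "$dtb" "$dts"
}
```

A session would be `dtb_decompile <your-board>.dtb /tmp/board.dts`, then editing the size value under "reserved-memory" in /tmp/board.dts, then `dtb_compile /tmp/board.dts <your-board>.dtb` and a reboot. Running with `DRY_RUN=0` needs root because the files live under /boot, and the `.bak` copy is what you restore if the board stops booting.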

 

  • Solution
Posted

I just did a quick test on Le Potato: increasing CMA to 512 MB was very easy following the above instructions with the /boot/dtb/amlogic/meson-gxl-s905x-libretech-cc.dtb device tree.  The property you want to change is right at the top of the file, as a handle under "reserved-memory".

 

Any value that is 0x400000 aligned should be valid, so you can experiment.
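
To turn a pool size in MiB into the hex value for that property and check the alignment rule at the same time, a helper along these lines can be used. The name `cma_size_hex` is made up for this sketch. It is also an assumption that the decompiled property is a two-cell value such as `size = <0x00 0x20000000>;`, where only the second cell needs changing for pools below 4 GiB; check your own .dts before editing.

```shell
# Print the byte size of an N MiB pool as hex, or fail if that size is
# not a multiple of 0x400000 (4 MiB). cma_size_hex is a hypothetical name.
cma_size_hex() {
    bytes=$(( $1 * 1024 * 1024 ))
    if [ $(( bytes % 0x400000 )) -ne 0 ]; then
        echo "size $1 MiB is not 0x400000 aligned" >&2
        return 1
    fi
    printf '0x%x\n' "$bytes"
}
```

`cma_size_hex 512` prints `0x20000000` and `cma_size_hex 896` prints `0x38000000`, while `cma_size_hex 350` is rejected because 350 MiB is not a multiple of 4 MiB.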

 

 

Posted
19 hours ago, TonyMac32 said: (previous post, quoted in full)

Thanks for the info. As I said, I'm not very skilled with kernel things, but thanks to your guide I was able to change the CMA allocation from 256M to 512M.

I also tried to set 1GB with 0x40000000, which should be aligned to 0x400000, but after setting it I got no monitor output (from the LED pattern I can tell it was running, and the magic SysRq keys also worked, so the kernel was up).

Anyway, that's not fundamental. I also tried setting the CMA to the same value as the patch you linked (0x38000000) and it worked, making the CMA 896MB.

 

I have tested some H.264 and VP9 videos and playback is still choppy (after increasing to 512MB it became a little better).

I tried mpv with `mpv --hwdec=yes video.mkv` and ffplay with `ffplay -vcodec h264_v4l2m2m video_h264.mkv` and `ffplay -vcodec vp9_v4l2m2m video_vp9.webm`.

So far I get the best performance with VP9 using ffplay.

A strange thing is that ffplay performs better than mpv, even though mpv uses ffmpeg just like ffplay, so they should have more or less the same performance. :huh:

Anyway, the low performance seems to be related to VERY single-threaded behavior. After increasing CMA to 512MB I had some free CMA left (40-80MB), and increasing again to 896MB increased the free CMA proportionally, so it seems that meson_vdec does not benefit from memory beyond ~512MB.

 

Anyway, even if there are some problems, I'm happy to see progress with my own eyes! :D

Thanks again for the help and for your hard work :beer:

This topic is now closed to further replies.