LZO-RLE and CONFIG_UACCESS_WITH_MEMCPY

Sven · June 26, 2020

Hello,

I've got two questions for you folks:

1.) Why is LZO being used over LZO-RLE for /etc/default/armbian-zram-config?

2.) Are there any downsides or risks currently known from using the kernel option CONFIG_UACCESS_WITH_MEMCPY?

To 1.) I've been looking at LZO-RLE, an improved version of the LZO algorithm. Since it was made part of the kernel a while ago, I was expecting some kind of kernel configuration option to enable it. However, it is included by default whenever LZO is configured, and one just needs to specify "lzo-rle" instead of "lzo" as the algorithm for zramctl. This means one only needs to replace "lzo" with "lzo-rle" in /etc/default/armbian-zram-config to use it. Because it has been part of the kernel since 5.1, I am wondering why this hasn't been made the default in Armbian. It could have been overlooked, of course, but perhaps there is more to it, which is why I'm asking here. zramctl itself doesn't list "lzo-rle" as one of its algorithms, but it appears to be working:

# zramctl
NAME       ALGORITHM DISKSIZE  DATA COMPR TOTAL STREAMS MOUNTPOINT
/dev/zram1 lzo-rle     246.1M 56.9M  3.5M  5.1M       4 [SWAP]

In /usr/lib/armbian/armbian-zram-config, line 53, "lzo" is used as the default.
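
Since zramctl doesn't list "lzo-rle" itself, one way to double-check what the kernel offers is the zram sysfs file comp_algorithm, which lists the available algorithms and marks the active one in square brackets. The following is only a rough sketch: the helper names selected_algo and has_algo are made up for illustration, and the sample string stands in for whatever your own device reports.

```shell
# Print the currently selected algorithm from a comp_algorithm-style string,
# e.g. "lzo [lzo-rle] lz4 zstd" -> "lzo-rle".
selected_algo() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Succeed if the named algorithm appears anywhere in the list.
has_algo() {
    # Strip the brackets, put one name per line, then match the whole line.
    printf '%s\n' "$1" | tr -d '[]' | tr ' ' '\n' | grep -qx -- "$2"
}

# Sample string for illustration only; on a real system read the sysfs file:
#   algos=$(cat /sys/block/zram1/comp_algorithm)
algos='lzo [lzo-rle] lz4 zstd'
selected_algo "$algos"
has_algo "$algos" lzo-rle && echo "lzo-rle is available"
```

If "lzo-rle" shows up in that list, the kernel supports it regardless of what the zramctl help text mentions.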

To 2.) I'm running Armbian on an Allwinner H2 SBC, and the kernel option CONFIG_UACCESS_WITH_MEMCPY gives a minor performance gain, but the option isn't enabled in the default kernel for this SBC. As a reminder, here is the kernel help text for this option:

Implement faster copy_to_user and clear_user methods for CPU cores where a 8-word STM instruction give significantly higher memory write throughput than a sequence of individual 32bit stores.

A possible side effect is a slight increase in scheduling latency between threads sharing the same address space if they invoke such copy operations with large buffers.

However, if the CPU data cache is using a write-allocate mode, this option is unlikely to provide any performance gain.
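
For anyone who wants to check whether their running kernel was built with this option, here is a minimal sketch. It assumes the kernel config is installed as /boot/config-<release>, which may not be the case on every image, and kconfig_state is just a hypothetical helper name.

```shell
# Report the state of a kernel option in a given config file.
# Prints the value (e.g. "y") if the option is set, otherwise "not set".
kconfig_state() {
    option=$1
    file=$2
    value=$(sed -n "s/^${option}=\(.*\)$/\1/p" "$file")
    if [ -n "$value" ]; then
        printf '%s\n' "$value"
    else
        printf 'not set\n'
    fi
}

# Assumption: the config of the running kernel lives in /boot.
cfg="/boot/config-$(uname -r)"
if [ -r "$cfg" ]; then
    kconfig_state CONFIG_UACCESS_WITH_MEMCPY "$cfg"
fi
```

On kernels that expose /proc/config.gz, the same function can be pointed at an unpacked copy of that file instead.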

I've so far only seen gains; they are not very large, but still noticeable. One can see them with "armbianmonitor -z", but also with dd. If anyone has experimented with this option or knows more about it, especially about its risks and downsides, I'd appreciate it if you could drop a comment.

Without the kernel option:

# dd if=/dev/zero bs=16M count=1024 of=/dev/null
1024+0 records in
1024+0 records out
17179869184 bytes (17 GB, 16 GiB) copied, 9.73097 s, 1.8 GB/s

With CONFIG_UACCESS_WITH_MEMCPY enabled:

# dd if=/dev/zero bs=16M count=1024 of=/dev/null
1024+0 records in
1024+0 records out
17179869184 bytes (17 GB, 16 GiB) copied, 8.32831 s, 2.1 GB/s
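
To put a number on the difference, the two elapsed times above can be turned into a relative throughput gain. This is only a quick calculation on the single pair of runs shown here, not a proper benchmark, and speedup_percent is a helper name made up for this example.

```shell
# Relative throughput gain for the same amount of data, in percent.
# Arguments: elapsed time without the option, elapsed time with it (seconds).
speedup_percent() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (before / after - 1) * 100 }'
}

# Times taken from the two dd runs above.
speedup_percent 9.73097 8.32831
```

For these two runs it prints 16.8, i.e. roughly 17% more throughput with the option enabled, which matches dd's own rounded figures of 1.8 GB/s versus 2.1 GB/s.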

Cheers,

Sven

Edited June 26, 2020 by Sven
Added an example
