Sven

LZO-RLE and CONFIG_UACCESS_WITH_MEMCPY

Hello,

 

I've got two questions for you folks:

 

1.) Why is LZO being used over LZO-RLE for /etc/default/armbian-zram-config?

2.) Are there any downsides or risks currently known from using the kernel option CONFIG_UACCESS_WITH_MEMCPY?

 

 

To 1.) I've been looking at LZO-RLE, an improved variant of the LZO algorithm. Since it was made part of the kernel a while ago, I was expecting some kernel configuration option to enable it. It is, however, included by default whenever LZO is configured, and one only needs to specify "lzo-rle" instead of "lzo" as the algorithm for zramctl. In other words, replacing "lzo" with "lzo-rle" in /etc/default/armbian-zram-config is enough to use it. Because it has been part of the kernel since 5.1, I am wondering why it hasn't been made the default in Armbian. It could have been overlooked, of course, but perhaps there is more to it, which is why I'm asking here. zramctl itself doesn't list "lzo-rle" as one of its algorithms, but it appears to be working:

# zramctl
NAME       ALGORITHM DISKSIZE  DATA COMPR TOTAL STREAMS MOUNTPOINT
/dev/zram1 lzo-rle     246.1M 56.9M  3.5M  5.1M       4 [SWAP]

In /usr/lib/armbian/armbian-zram-config, line 53, "lzo" is used as the default.
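
A quick way to see what the running kernel offers is the zram sysfs attribute comp_algorithm, which lists the supported algorithms and marks the active one in square brackets. The following is only a sketch: the helper names are made up for illustration, and zram0 is an assumed device name (in the zramctl output above the swap device is zram1).

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of Armbian or util-linux.

# Print the active algorithm from a comp_algorithm string,
# e.g. "lzo [lzo-rle] lz4" -> "lzo-rle".
active_algo() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Succeed if the given algorithm name appears in the list (active or not).
supports_algo() {
    printf '%s\n' "$1" | sed 's/[][]//g' | tr ' ' '\n' | grep -qxF "$2"
}

# Assumed device name; adjust to the zram device actually in use.
attr=/sys/block/zram0/comp_algorithm
if [ -r "$attr" ]; then
    list=$(cat "$attr")
    echo "active: $(active_algo "$list")"
    if supports_algo "$list" lzo-rle; then
        echo "lzo-rle is available"
    fi
fi
```

If "lzo-rle" shows up in that list, the kernel side is in place and only the configured algorithm name needs to change.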

 

 

To 2.) I'm running Armbian on an Allwinner H2 SBC, and the kernel option CONFIG_UACCESS_WITH_MEMCPY gives a minor performance gain, but the option isn't enabled in the default kernel for this SBC. As a reminder, here is the kernel help text for this option:

 

Implement faster copy_to_user and clear_user methods for CPU cores where a 8-word STM instruction give significantly higher memory write throughput than a sequence of individual 32bit stores.

A possible side effect is a slight increase in scheduling latency between threads sharing the same address space if they invoke such copy operations with large buffers.

However, if the CPU data cache is using a write-allocate mode, this option is unlikely to provide any performance gain.
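
To check whether a given kernel was built with the option, one can look it up in the kernel config. This is a sketch under assumptions: it expects the config to be exposed either as /proc/config.gz (which requires CONFIG_IKCONFIG_PROC) or as /boot/config-$(uname -r), and the function name is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read kernel config text on stdin and print
# "y" if the named option is built in, otherwise "n".
option_state() {
    # "CONFIG_FOO=y" counts as enabled; "# CONFIG_FOO is not set"
    # or a missing line counts as disabled.
    if grep -qxF "$1=y"; then
        echo y
    else
        echo n
    fi
}

opt=CONFIG_UACCESS_WITH_MEMCPY
if [ -r /proc/config.gz ]; then
    echo "$opt: $(zcat /proc/config.gz | option_state "$opt")"
elif [ -r "/boot/config-$(uname -r)" ]; then
    echo "$opt: $(option_state "$opt" < "/boot/config-$(uname -r)")"
fi
```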

 

So far I've only seen gains; they are not very large, but still noticeable. One can see them with "armbianmonitor -z", but also with dd. If anyone has experimented with this option or knows more about it, especially about its risks and downsides, I'd appreciate it if you could drop a comment.

 

Without the kernel option:

# dd if=/dev/zero bs=16M count=1024 of=/dev/null
1024+0 records in
1024+0 records out
17179869184 bytes (17 GB, 16 GiB) copied, 9.73097 s, 1.8 GB/s


With CONFIG_UACCESS_WITH_MEMCPY enabled:

# dd if=/dev/zero bs=16M count=1024 of=/dev/null
1024+0 records in
1024+0 records out
17179869184 bytes (17 GB, 16 GiB) copied, 8.32831 s, 2.1 GB/s
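
Put into a number, the two runs differ by roughly 17% in throughput. Below is a small sketch of the arithmetic; the function name is made up, and the timings are the ones reported by dd above. Note that with /dev/null as the destination this mostly measures how fast the kernel zero-fills the user buffer when reading /dev/zero, which as far as I understand goes through clear_user, so it says little about copy_to_user with real data.

```shell
#!/bin/sh
# Hypothetical helper: percentage throughput gain when the same amount of
# data is copied in t_new seconds instead of t_old seconds.
speedup_percent() {
    awk -v old="$1" -v new="$2" 'BEGIN { printf "%.1f\n", (old / new - 1) * 100 }'
}

# Timings taken from the two dd runs above (without / with the option).
speedup_percent 9.73097 8.32831   # prints 16.8
```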

 

Cheers,

Sven

Edited by Sven
Added an example
