jock Posted March 9, 2022
This looks very interesting. kswapd is a CPU-hungry process in most common tasks that move memory around, so I am glad to hear there is work underway to bring down CPU usage and make page swapping better. I would like to bring the patches into some kernel and see whether things improve for light/medium desktop workloads. Does anyone have experience with them?
hexdump Posted March 10, 2022
i have had this on my radar and my todo list for a while, but did not yet get to testing it - it looks very promising for lower end systems like the ones we are often dealing with here, especially i assume it will give a big push in usability for 4gb systems (i.e. enough ram to really make something out of it)
jock Posted March 11, 2022 Author
Well, from what I see, even CPU-intensive threads that generate a lot of memory pressure keep the kswapd kernel thread very busy, no matter how much memory the board has. My first thought goes to heavily taxed network routing jobs with gigabit ethernet interfaces, for example. In fact I often see kswapd very busy on regular x86 machines too.
jock Posted April 25, 2022 Author
Well, some initial tests didn't provide an acceptable result. Applying the MGLRU v9 patch to the rk322x family, which is the lowest end of what Rockchip offers and very commonly found in TV boxes/boards with 1GB of RAM, results in an unusable system because of frequent crashes caused by memory and filesystem mismanagement. I guess the patch has never really been tested on armhf targets...
hexdump Posted April 26, 2022
@jock - does it even have 32bit arm support? i thought it was only for x86_64 and arm64 and there are also some extra code parts which are only in there for those two arches. best wishes - hexdump
hexdump Posted May 21, 2022
@jock - after running the patches successfully for a while on aarch64 i gave it a try on 32bit arm too and failed in a similar way as you did: those patches simply do not seem to work on 32bit arm - i sent an email to the patch series author and will let you know once i get a response ...
jock Posted May 21, 2022 Author
@hexdump Thanks a lot! I later checked the patch series and its description effectively omits the armhf architecture, so it is probably untested and not expected to work. Still, I don't know how it could enter mainline if it is allowed to compile on armhf but causes heavy issues: the Kconfig file should at least be patched to allow compilation only on amd64 and arm64 architectures. I hope they will provide a fix and tests for armhf too, and hopefully for 32-bit x86 as well. Those 32-bit architectures are going to benefit a lot from this patch, since most memory-constrained devices are 32-bit only.
hexdump Posted May 26, 2022
@jock - i already sent you a pm about this, but meanwhile i tested it to be working for me, so maybe it's better to post it here as well in case others are interested too: i got a response from the mglru patch series author and a patch which seems to fix the problem on 32bit arm for me (tested on rk3288 with 2gb ram so far) - the patch can be found here:
https://github.com/hexdump0815/kernel-extra-patches/blob/main/multi-gen-lru/v11/v11-15-extra-patch-from-author-with-armv7l-fix.diff
good luck and best wishes - hexdump
hexdump Posted May 30, 2022
@jock - just fyi: there is now an updated version of the extra patch from the patch series author against mglru v11:
https://github.com/hexdump0815/kernel-extra-patches/commit/87ba91b3503f95d625ab5eed403e47e65986fd89
still got some warnings with the old version
jock Posted May 30, 2022 Author
46 minutes ago, hexdump said: @jock - just fyi: there is now an updated version of the extra patch from the patch series author against mglru v11: https://github.com/hexdump0815/kernel-extra-patches/commit/87ba91b3503f95d625ab5eed403e47e65986fd89 still got some warnings with the old version

Thanks a lot. As it turned out, yesterday I tested kernel v5.18.0 on rk322x with the old version of the extra patch by compiling the whole Debian mesa package ecosystem, with success. The box had just 1gb of RAM, 512mb of zram swap space and a 2gb extra swap file on a USB HDD. The conditions were absolutely heavy and unhealthy, but the rebuild of all the packages from source finally completed without errors, even after extreme swapping and hours of compilation time. The system always stayed responsive to SSH shells, which is a great achievement in itself!
hexdump Posted June 15, 2022
@jock - little update: v12 of the patches is out - it is essentially v11 plus the above mentioned fixes and rebased for v5.19:
https://lore.kernel.org/lkml/20220614071650.206064-1-yuzhao@google.com/
https://patchwork.kernel.org/project/linux-mm/list/?series=650073
https://www.phoronix.com/scan.php?page=news_item&px=MGLRU-v12-For-Linux-5.19-rc
https://github.com/hexdump0815/kernel-extra-patches/tree/main/multi-gen-lru/v12
best wishes - hexdump
jock Posted June 15, 2022 Author
Thanks a lot, I saw the news today on phoronix. I would like to let the rk3318 people test it on aarch64 too very soon
usual user Posted June 16, 2022
8 hours ago, hexdump said: little update: v12 of the patches is out

Thank you for the service of tracking the progress. Now I can pick up MGLRU again for the next 5.19.0-rc3 build. I have been using the MGLRU patches since the first post of @jock.

I have a habit of opening many pages in tabs in my browser. I use this as a reminder function for pages I want to follow for a short time. Open tabs tend to request more and more memory, even if they are not actively viewed, so sooner or later memory swapping begins. When swapping starts, all swap space is used up very quickly and the system becomes very laggy. My current workaround is to close and reopen the browser to free up the used memory. I then reactivate only the tabs of most interest, but over time I reactivate more tabs and the game starts again. With the MGLRU feature in place, the situation seems to improve: swapping seems to kick in much later, and once it starts, the swap space seems to fill up much more slowly.

8 hours ago, jock said: I would like to let the rk3318 people test it on aarch64 too very soon

If you want, you can use my kernel build for a quick test. Because it is a generic build, it should work on your devices. You probably need to use a suitable DTB, but this should not be a problem as long as it obeys the mainline bindings. You can put it alongside your currently running kernel and decide at boot time which one to run. If the experiment with my kernel fails in the end, at least you will have learned how to keep as many kernels installed at the same time as your persistent storage allows. See this thread for more details. For the first test you can use the 5.18.0-0.rc3 build offered there and then switch to my upcoming 5.19.0-rc3 build if suitable. Btw, my kernel is at mainline 5.18.0+ media subsystem wise.
jock Posted June 17, 2022 Author
@usual user thanks for the offer. It was so straightforward to apply the whole bundled patchset to the rockchip and rk322x families that I guess it is just a matter of copying the patch from one directory to another and recompiling to get a fresh kernel with the working feature for the rockchip64 family too. In case of issues, I will surely ask for help 😉
yuzhaogoogle Posted June 17, 2022
@hexdump pointed me to this discussion -- thank you for all the testing, much appreciated! If you have MGLRU related questions, please feel free to shoot me emails.

The following option occasionally causes problems, so please set it to zero. The analyses are from Ubuntu, Debian and a few others I'm too lazy to quote.

  $ cat /proc/sys/vm/watermark_boost_factor
  0

I'll submit a fix later today and hopefully it'll be in v5.20.
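A minimal sketch of applying the advice above, assuming a POSIX shell and root privileges for the write. The function name `disable_watermark_boost` and the `WBF_PATH` override are hypothetical helpers introduced here for illustration; only the `/proc/sys/vm/watermark_boost_factor` path comes from the post.

```shell
#!/bin/sh
# Sketch: read vm.watermark_boost_factor and set it to 0 if it is not already.
# WBF_PATH is a hypothetical override so the logic can be tried on a scratch file.
WBF_PATH="${WBF_PATH:-/proc/sys/vm/watermark_boost_factor}"

disable_watermark_boost() {
    path="${1:-$WBF_PATH}"
    if [ ! -r "$path" ]; then
        echo "cannot read $path" >&2
        return 1
    fi
    current=$(cat "$path")
    if [ "$current" = "0" ]; then
        echo "watermark_boost_factor already 0"
        return 0
    fi
    # Writing the real /proc file requires root.
    if echo 0 > "$path"; then
        echo "watermark_boost_factor changed from $current to 0"
    else
        echo "failed to write $path (are you root?)" >&2
        return 1
    fi
}
```

Calling `disable_watermark_boost` with no argument as root acts on the real file. The change does not survive a reboot; one common way to persist it is a line `vm.watermark_boost_factor = 0` in a file under `/etc/sysctl.d/`.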
jock Posted June 21, 2022 Author
@yuzhaogoogle I managed to build some v5.18 kernels and give the v11 patches a shot, but v5.18 itself seems to have some glitchy behaviour on rockchip devices. I tried both on rk3399 (without MGLRU patches) and rk3318 (with MGLRU patches) and have had severe mmc controller issues. Anyway, after many trials and freezes, I got some stack traces that seem to be related to MGLRU. I attach the full dmesg.log with a stack trace that later turned into a kernel crash (also attached), along with the kernel map for debugging. Since v5.18 looks faulty by itself on these devices, I don't know whether it is worth checking the logs until I get something stable without MGLRU.

update: I double checked and found that a suspect patch was actually making my system very unstable, so these logs and errors are, with high probability, caused by the other patch and not related to MGLRU at all.

Attachment: armbian-kernel-crash.txz
yuzhaogoogle Posted June 22, 2022 (edited)
Hi @jock

According to the following warning from dmesg.log, it seems the first version of the fix @hexdump posted was used -- unfortunately it's also buggy (sorry)...

  [ 1235.795803] ------------[ cut here ]------------
  [ 1235.795827] WARNING: CPU: 2 PID: 55 at mm/vmscan.c:4464 lru_gen_look_around+0x3fc/0x728

I do have the latest MGLRU backported to v5.18, and you can apply it by

  git fetch https://linux-mm.googlesource.com/page-reclaim refs/changes/17/1617/2 && git cherry-pick FETCH_HEAD~14..FETCH_HEAD

I'm also trying to attach the patch file here but it seems I'm too new to be allowed to attach files. Please feel free to send me an email if you need the patch file or anything else.

From kernel_panic.log, it seems the bad thing had already happened before these lines:

  root@rk3318-box:~# [ 9596.639183] BUG: Bad page state in process kswapd0 pfn:1d8b6
  [ 9596.640943] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000a60

Usually the Bad page state warning appears when there is double-free or use-after-free memory corruption. It'd be helpful if there is a way to grab the log preceding this error.

Alternatively, if we can't fix the v5.18 baseline kernel, I can further backport MGLRU to v5.17.

Edited June 22, 2022 by yuzhaogoogle
jock Posted June 22, 2022 Author
@yuzhaogoogle thanks a lot for the answer; I finally managed to recompile kernel v5.18 with the v11 patch as-is (without the additional fix mentioned by @hexdump). I removed an offending patch that was in the armbian patch set and the system is now just as stable as I expected it to be. Doing the usual desktop business for a while (browsing on half a dozen firefox tabs, streaming music via bluetooth, moving files via samba, a couple of terminals, ...) and keeping the device busy for a whole night resulted in perfect stability so far, a clean dmesg and 662 megabytes of swap usage (zram) out of 1gb of swap available.

I look forward to contacting you about the new patchset backported to v5.18 with the latest fixes; I think the forum community will be happy to test and give feedback 😉
fik Posted July 8, 2022
The new MGLRU v13 should fix the "BUG: kernel NULL pointer dereference, address:..." problem introduced in v11 and v12.
https://lore.kernel.org/lkml/20220706220022.968789-1-yuzhao@google.com/
yuzhaogoogle Posted July 11, 2022
On 7/8/2022 at 5:19 AM, fik said: The new MGLRU v13 should fix the "BUG: kernel NULL pointer dereference, address:..." problem introduced in v11 and v12. https://lore.kernel.org/lkml/20220706220022.968789-1-yuzhao@google.com/

I guess most users won't hit this problem unless they poke into the debugfs interface.
turkerali Posted July 11, 2022
On 6/22/2022 at 9:17 AM, yuzhaogoogle said: Alternatively, if we can't fix the v5.18 baseline kernel, I can further backport MGLRU to v5.17.

Hi Yu,
If you could set up a git repository and backport the latest MGLRU to at least:
- Last stable kernel (5.18.10 at the moment)
- Last LTS kernel (5.15.53 at the moment)
it would be really beneficial for people who want to test MGLRU.
Regards.
yuzhaogoogle Posted July 13, 2022
Yeah, I'm working on it. Will keep you posted. Thanks.
usual user Posted October 11, 2022
Multi-Gen LRU has just landed. In 6.1.0, the support can therefore be used out-of-the-box.
jock Posted October 11, 2022 Author
1 hour ago, usual user said: Multi-Gen LRU has just landed. In 6.1.0, the support can therefore be used out-of-the-box.

Finally! In my personal testing on 5.19.x I never had any issue on either armhf or arm64.
yuzhaogoogle Posted October 11, 2022
3 hours ago, jock said: Finally! On my personal testing on 5.19.x never had any issue with both armhf and arm64 architectures

Thanks. Now we may consider switching all boards to MGLRU on 6.1 😀
hartraft Posted December 23, 2022
Hi, I'm quite new to MGLRU. I've got the community build of armbian running on 6.1.0, but mg-lru-helper wants to set /sys/kernel/mm/lru_gen/enabled and the /sys/kernel/mm/lru_gen path doesn't exist. Is there something I need to do to get the kernel module?
jock Posted December 23, 2022 Author
@hartraft rk322x and rockchip (32 bit) already ship with MGLRU active in the kernel config, but I don't know whether other families have enabled MGLRU yet. What is your board?
hartraft Posted December 23, 2022
I have an Odroid XU4. My other devices are a tinkerboard (rk3288) and a Rock 5b (rk3588, but I don't think it's on kernel 6.1 yet). If you point me in the right direction I could take a look at updating it.
hexdump Posted December 23, 2022
@hartraft - there are two kernel options for mglru: CONFIG_LRU_GEN=y and CONFIG_LRU_GEN_ENABLED=y - the first is to have mglru built into the kernel and the second is to have it enabled by default - if they are not in your kernel config it might be required to rebuild the kernel with them (or at least the first) enabled
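The difference between "built in" and "enabled by default" can also be seen at runtime. The sketch below is an illustration, not part of any Armbian tooling: the function name `mglru_runtime_state` is hypothetical, and it assumes the behaviour described in the kernel's Multi-Gen LRU admin guide, where `/sys/kernel/mm/lru_gen/enabled` exists only when CONFIG_LRU_GEN is built in and reads back a hex bitmask such as `0x0007` (or `0x0000` when disabled).

```shell
#!/bin/sh
# Sketch: classify the MGLRU runtime state from the sysfs "enabled" file.
# The optional argument exists only so the logic can be tried on a scratch file.
mglru_runtime_state() {
    f="${1:-/sys/kernel/mm/lru_gen/enabled}"
    if [ ! -e "$f" ]; then
        # No sysfs entry: the kernel was most likely built without CONFIG_LRU_GEN.
        echo "not built in"
        return 0
    fi
    val=$(cat "$f")
    case "$val" in
        0x0000) echo "built in, disabled" ;;
        *)      echo "built in, enabled ($val)" ;;
    esac
}
```

If the state is "built in, disabled", running `echo y > /sys/kernel/mm/lru_gen/enabled` as root should turn it on without a rebuild. If the path is missing altogether, no runtime switch helps and the kernel has to be rebuilt with at least CONFIG_LRU_GEN=y.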
jock Posted December 23, 2022 Author
@hartraft Currently, I can assure you that the tinkerboard with the armbian edge kernel (6.1) gets MGLRU compiled in and enabled by default, because I maintain the rk3288 (rockchip 32 bit) family. The other two boards are not under my maintenance so I can't say anything about them, but you can check the config file in your /boot directory to see whether the kernel was compiled with the options pointed out by @hexdump
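A minimal sketch of that config check, assuming the config file sits at the usual `/boot/config-$(uname -r)` location (an assumption; the exact file name depends on the distribution and image). The function name `mglru_config_status` is hypothetical.

```shell
#!/bin/sh
# Sketch: report whether the two MGLRU options are set to "y" in a kernel config file.
mglru_config_status() {
    cfg="$1"
    for opt in CONFIG_LRU_GEN CONFIG_LRU_GEN_ENABLED; do
        # Anchor both ends so CONFIG_LRU_GEN does not match CONFIG_LRU_GEN_ENABLED.
        if grep -q "^${opt}=y\$" "$cfg"; then
            echo "$opt=y"
        else
            echo "$opt not set"
        fi
    done
}
```

A typical call would be `mglru_config_status "/boot/config-$(uname -r)"`. On kernels that expose `/proc/config.gz`, decompressing it to a temporary file with `zcat` and passing that file works the same way.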