eselarm Posted 6 hours ago

I think this is now the third time this has happened. The OS is Armbian Bookworm with the vendor, current and edge kernels installed. It turns out that there is no issue with the vendor kernel, but it fails with the mainline-based rockchip64 kernels, at least edge. I am not sure about current (6.6 or 6.12), as I am not really using those.

I first discovered it with a 6.13.0-rc* kernel. I assumed something else might have been wrong, so I more or less ignored it and rebooted with the vendor kernel. That happened on both the nanopi-r6c and the rock-3a. I also already rebooted the rock-3a on a later occasion, just to make sure the problem would not occur.

Yesterday it happened on the nanopi-r6c with kernel 6.15.1-edge-rockchip64. I booted with this kernel but with an openSUSE Tumbleweed userspace (VERSION_ID="20250610"). The following command, run as root, should move the Armbian Bookworm rootfs from eMMC to NVMe:

    btrfs replace start /dev/mmcblk1p2 /dev/nvme0n1p2 /local/armbid

Then in dmesg:

    Jun 19 18:44:38 ranc kernel: BTRFS info (device mmcblk1p2): dev_replace from /dev/mmcblk1p2 (devid 1) to /dev/nvme0n1p2 started
    Jun 19 18:45:49 ranc kernel: BTRFS warning (device mmcblk1p2): failed setting block group ro: -28
    Jun 19 18:45:49 ranc kernel: BTRFS error (device mmcblk1p2): btrfs_scrub_dev(/dev/mmcblk1p2, 1, /dev/nvme0n1p2) failed -28

After 5 minutes or so I discovered that the filesystem was corrupted and would not open:

    open_ctree failed: -5

As I do (very) frequent backups (differential send|receive to a NAS), recreating the filesystem and playing back the latest single subvolume snapshot is easy, but it would become quite a drama with a large multi-TB filesystem with hundreds of snapshots/subvolumes.

As the OS userspace was rolling-release Tumbleweed, btrfs-progs is very new, not the 6.2 from Bookworm. So it seems to be something in the mainline rockchip64 kernel. I will try to pinpoint it further.

The NanoPi-R6C is a bit of a DUT at the moment, but I could also test with various other combinations, ROCK3A or ROCK5B, although those have a dedicated task/use case in my home. The 5B has 16G RAM, so it is easy to run VMs, loop devices, etc. I think this is not hardware-related, but who knows, maybe some detailed RK35xx optimization or so.
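The negative numbers in the kernel messages above (-28 and -5) are negated errno values. As a minimal sketch, assuming a Linux system with Python 3 (the helper name `describe_kernel_error` is made up for illustration), they can be decoded like this:

```python
import errno
import os


def describe_kernel_error(code: int) -> str:
    """Return 'NAME: description' for a negative kernel return code such as -28."""
    number = abs(code)
    # errno.errorcode maps the numeric value to its symbolic name
    name = errno.errorcode.get(number, "UNKNOWN")
    return f"{name}: {os.strerror(number)}"


# The two codes that appear in the log excerpts of this post
for code in (-28, -5):
    print(code, describe_kernel_error(code))
```

On Linux, 28 is ENOSPC (no space left on device) and 5 is EIO (input/output error), so the replace failed for lack of space and the later mount failed with an I/O error.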
eselarm Posted 5 hours ago (edited)

I could not reproduce it on a vanilla openSUSE Tumbleweed installation (EDK2 UEFI in SPI flash on the ROCK5B) when using a rather newly created filesystem that had only been in use for 15 minutes or so. However, I can reproduce it if the image is my armbian64 virtual machine test/template image, which is between 1 and 2 years old and has been through many operations, like resize, shrink, extend, etc. So it seems to be some complex out-of-space issue for a certain part:

    rock5b:/local/ssdata/nocow # uname -a ; grep VERSION_ID /etc/os-release
    Linux rock5b 6.15.1-1-default #1 SMP PREEMPT_DYNAMIC Thu Jun 5 14:29:05 UTC 2025 (75961ad) aarch64 aarch64 aarch64 GNU/Linux
    VERSION_ID="20250610"

- So it depends on the image: with the rather fresh, newly constructed serv.img there is no problem.
- -28 is the no-space error AFAIK, so check the internals of the image:

    rock5b:/local/ssdata/nocow # df | grep loop0p2
    /dev/loop0p2 12582912 7904480 2852032 74% /local/ssdata/nocow/2

    rock5b:/local/ssdata/nocow # btrfs fi df 2
    Data, single: total=10.01GiB, used=7.29GiB
    System, single: total=4.00MiB, used=16.00KiB
    Metadata, single: total=1.99GiB, used=218.05MiB
    GlobalReserve, single: total=26.95MiB, used=0.00B

    rock5b:/local/ssdata/nocow # btrfs de us 2
    /dev/loop0p2, ID: 1
       Device size: 12.00GiB
       Device slack: 0.00B
       Data,single: 10.01GiB
       Metadata,single: 1.99GiB
       System,single: 4.00MiB
       Unallocated: 1.00MiB

    rock5b:/local/ssdata/nocow # btrfs fi us 2
    Overall:
        Device size: 12.00GiB
        Device allocated: 12.00GiB
        Device unallocated: 1.00MiB
        Device missing: 0.00B
        Device slack: 0.00B
        Used: 7.51GiB
        Free (estimated): 2.72GiB (min: 2.72GiB)
        Free (statfs, df): 2.72GiB
        Data ratio: 1.00
        Metadata ratio: 1.00
        Global reserve: 26.95MiB (used: 0.00B)
        Multiple profiles: no

    Data,single: Size:10.01GiB, Used:7.29GiB (72.82%)
       /dev/loop0p2 10.01GiB

    Metadata,single: Size:1.99GiB, Used:229.44MiB (11.27%)
       /dev/loop0p2 1.99GiB

    System,single: Size:4.00MiB, Used:16.00KiB (0.39%)
       /dev/loop0p2 4.00MiB

    Unallocated:
       /dev/loop0p2 1.00MiB

From 10+ years of Btrfs usage and a quick look at those numbers, I think the problem occurs because it is not possible to easily/directly create a new chunk. The filesystem is still mounted. If I unmount, a new mount fails because of an undefined state (replace ongoing but no mounted filesystem, and the target device has temporary ID 0 and cannot be found?). So I will see whether balancing first with the proper options gets it done.
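The numbers above separate two things: 2.72GiB is free inside already allocated chunks, but only 1.00MiB is unallocated, and a new chunk can only come from unallocated space. If that reading is right, a pre-flight check before a replace could look like the sketch below. It assumes the text layout of `btrfs fi us` as pasted above; the function names and the 1 GiB threshold are hypothetical and chosen only for illustration.

```python
import re

UNITS = {"B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}


def parse_size(text: str) -> int:
    """Convert a size such as '1.00MiB' to bytes."""
    match = re.fullmatch(r"([0-9.]+)(B|KiB|MiB|GiB|TiB)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    return int(float(match.group(1)) * UNITS[match.group(2)])


def unallocated_bytes(usage_output: str) -> int:
    """Extract the 'Device unallocated' value from `btrfs fi us` text."""
    for line in usage_output.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "Device unallocated":
            return parse_size(value)
    raise ValueError("no 'Device unallocated' line found")


def has_headroom(usage_output: str, minimum: int = 1024**3) -> bool:
    # The 1 GiB default is an assumption for illustration, not a Btrfs rule.
    return unallocated_bytes(usage_output) >= minimum


# Excerpt of the output pasted in the post above
SAMPLE = """Overall:
    Device size: 12.00GiB
    Device allocated: 12.00GiB
    Device unallocated: 1.00MiB
"""

print(unallocated_bytes(SAMPLE), has_headroom(SAMPLE))
```

With the sample from this image the check reports no headroom, which fits the idea of balancing first to return some chunks to the unallocated pool.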
eselarm Posted 4 hours ago

OK, as I thought, not enough 'headroom':

    rock5b:/local/ssdata/nocow # btrfs inspect-internal list-chunks 2
    Devid PNumber      Type/profile   PStart    Length     PEnd LNumber   LStart Usage%
    ----- ------- ----------------- -------- --------- -------- ------- -------- ------
        1       1     System/single  1.00MiB   4.00MiB  5.00MiB       4  2.30GiB   0.39
        1       2   Metadata/single  5.00MiB   8.00MiB 13.00MiB       1  5.00MiB  35.74
        1       3       Data/single 13.00MiB   8.00MiB 21.00MiB       2 13.00MiB 100.00
        1       4       Data/single 21.00MiB   1.00GiB  1.02GiB       3 21.00MiB  53.20
        1       5   Metadata/single  1.02GiB 208.00MiB  1.22GiB       5  2.31GiB  23.51
        1       6       Data/single  1.22GiB 512.00MiB  1.72GiB       6  2.51GiB  84.35
        1       7       Data/single  1.72GiB 512.00MiB  2.22GiB       7  3.01GiB  86.48
        1       8   Metadata/single  2.22GiB 256.00MiB  2.47GiB       8  3.51GiB  33.67
        1       9   Metadata/single  2.47GiB 256.00MiB  2.72GiB       9  3.76GiB  14.43
        1      10       Data/single  2.72GiB   1.00GiB  3.72GiB      10  4.01GiB  83.88
        1      11       Data/single  3.72GiB   1.00GiB  4.72GiB      11  5.01GiB  97.23
        1      12       Data/single  4.72GiB   1.00GiB  5.72GiB      12  6.01GiB  55.02
        1      13       Data/single  5.72GiB   1.00GiB  6.72GiB      13  7.01GiB  85.21
        1      14       Data/single  6.72GiB   1.00GiB  7.72GiB      14  8.01GiB  58.04
        1      15       Data/single  7.72GiB   1.00GiB  8.72GiB      15  9.01GiB  82.03
        1      16       Data/single  8.72GiB   1.00GiB  9.72GiB      16 10.01GiB  28.15
        1      17   Metadata/single  9.72GiB 256.00MiB  9.97GiB      17 11.01GiB  10.30
        1      18       Data/single  9.97GiB   1.00GiB 10.97GiB      18 11.26GiB  99.83
        1      19   Metadata/single 10.97GiB 256.00MiB 11.22GiB      19 12.26GiB   0.58
        1      20   Metadata/single 11.22GiB 256.00MiB 11.47GiB      20 12.51GiB   5.22
        1      21   Metadata/single 11.47GiB 256.00MiB 11.72GiB      21 12.76GiB   0.49
        1      22   Metadata/single 11.72GiB 256.00MiB 11.97GiB      22 13.01GiB   0.27
        1      23   Metadata/single 11.97GiB  27.00MiB 12.00GiB      23 13.26GiB   0.06

Then I did various balance actions and ran the replace again; see the result below:

    [334356.515225] [ T207531] BTRFS info (device loop0p2): balance: start -mlimit=3 -slimit=3
    [334356.516261] [ T207531] BTRFS info (device loop0p2): relocating block group 14236516352 flags metadata
    [334356.608520] [ T207531] BTRFS info (device loop0p2): found 1 extents, stage: move data extents
    [334356.655369] [ T207531] BTRFS info (device loop0p2): relocating block group 13968080896 flags metadata
    [334356.882397] [ T207531] BTRFS info (device loop0p2): found 44 extents, stage: move data extents
    [334356.974962] [ T207531] BTRFS info (device loop0p2): relocating block group 13699645440 flags metadata
    [334357.145812] [ T207531] BTRFS info (device loop0p2): found 77 extents, stage: move data extents
    [334357.223337] [ T207531] BTRFS info (device loop0p2): balance: ended with status: 0
    [334402.346978] [ T207580] BTRFS info (device loop0p2): balance: start -mlimit=6 -slimit=6
    [334402.347912] [ T207580] BTRFS info (device loop0p2): relocating block group 13431209984 flags metadata
    [334403.120645] [ T207580] BTRFS info (device loop0p2): found 841 extents, stage: move data extents
    [334403.372968] [ T207580] BTRFS info (device loop0p2): relocating block group 13162774528 flags metadata
    [334403.613848] [ T207580] BTRFS info (device loop0p2): found 94 extents, stage: move data extents
    [334403.743610] [ T207580] BTRFS info (device loop0p2): relocating block group 11820597248 flags metadata
    [334404.526435] [ T207580] BTRFS info (device loop0p2): found 1666 extents, stage: move data extents
    [334404.798554] [ T207580] BTRFS info (device loop0p2): relocating block group 4035969024 flags metadata
    [334406.059117] [ T207580] BTRFS info (device loop0p2): found 2363 extents, stage: move data extents
    [334406.374011] [ T207580] BTRFS info (device loop0p2): relocating block group 3767533568 flags metadata
    [334407.879438] [ T207580] BTRFS info (device loop0p2): found 5433 extents, stage: move data extents
    [334408.284506] [ T207580] BTRFS info (device loop0p2): relocating block group 2475687936 flags metadata
    [334413.884083] [ T207580] BTRFS info (device loop0p2): found 9463 extents, stage: move data extents
    [334414.690177] [ T207580] BTRFS info (device loop0p2): balance: ended with status: 0
    [334530.694011] [ T207693] BTRFS info (device loop0p2): balance: start -dlimit=2
    [334530.694839] [ T207693] BTRFS info (device loop0p2): relocating block group 12089032704 flags data
    [334534.205097] [ T207693] BTRFS info (device loop0p2): found 1897 extents, stage: move data extents
    [334534.491352] [ T207693] BTRFS info (device loop0p2): found 1897 extents, stage: update data pointers
    [334534.625126] [ T207693] BTRFS info (device loop0p2): relocating block group 10746855424 flags data
    [334536.605759] [ T207693] BTRFS info (device loop0p2): found 16316 extents, stage: move data extents
    [334537.056028] [ T207693] BTRFS info (device loop0p2): found 16316 extents, stage: update data pointers
    [334537.361629] [ T207693] BTRFS info (device loop0p2): balance: ended with status: 0
    [334614.951029] [ T207803] BTRFS info (device loop0p2): dev_replace from /dev/loop0p2 (devid 1) to /dev/loop1p2 started
    [334654.572435] [ T207803] BTRFS info (device loop0p2): dev_replace from /dev/loop0p2 (devid 1) to /dev/loop1p2 finished

So I would need to do the same with the vendor kernel on the same images. I made exact copies, so it can be done. However, starting with the vendor kernel requires a system reboot (and another bootloader), so I am not sure when I can do this.
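To put a number on the missing headroom, the Metadata/single rows of the chunk listing in this post can be summed. This is a rough sketch with the lengths (MiB) and usage percentages copied by hand from that listing; the repacking estimate assumes 256 MiB metadata chunks, as most rows show, and is only a lower bound on what a balance could achieve.

```python
import math

# (length in MiB, usage in %) for every Metadata/single row of the listing
METADATA_CHUNKS = [
    (8, 35.74), (208, 23.51), (256, 33.67), (256, 14.43), (256, 10.30),
    (256, 0.58), (256, 5.22), (256, 0.49), (256, 0.27), (27, 0.06),
]

allocated_mib = sum(length for length, _ in METADATA_CHUNKS)
used_mib = sum(length * pct / 100 for length, pct in METADATA_CHUNKS)

# Lower bound on chunks needed if all metadata were repacked into 256 MiB chunks
min_chunks = math.ceil(used_mib / 256)
reclaimable_mib = allocated_mib - min_chunks * 256

print(f"allocated: {allocated_mib} MiB, used: {used_mib:.1f} MiB")
print(f"at least {min_chunks} chunk(s) needed, up to {reclaimable_mib} MiB reclaimable")
```

The sum comes to roughly 218 MiB used out of 2035 MiB allocated, which agrees with the `Metadata, single: total=1.99GiB, used=218.05MiB` line from `btrfs fi df` earlier in the thread. Most of the metadata allocation is therefore nearly empty chunks that a balance can hand back as unallocated space.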
eselarm Posted 2 hours ago (edited)

Restarted with the vendor kernel, then performed the same actions:

    [ 580.015977] BTRFS info (device loop0p2): dev_replace from /dev/loop0p2 (devid 1) to /dev/loop1p2 started
    [ 599.501402] cma: cma_alloc: cma: alloc failed, req-size: 1024 pages, ret: -12
    [ 599.502431] kworker/6:4: page allocation failure: order:10, mode:0xcc0(GFP_KERNEL), nodemask=(null),cpuset=/,mems_allowed=0
    [ 599.502452] CPU: 6 PID: 239 Comm: kworker/6:4 Tainted: G W 6.1.115-vendor-rk35xx #1
    [ 599.502457] Hardware name: FriendlyElec NanoPi R6C/NanoPi R6C, BIOS v1.1 04/09/2025
    [ 599.502461] Workqueue: events atomic_pool_work_fn
    [ 599.502471] Call trace:
    [ 599.502474] dump_backtrace+0xf0/0x12c
    [ 599.502480] show_stack+0x20/0x30
    [ 599.502484] dump_stack_lvl+0x7c/0xa0
    [ 599.502489] dump_stack+0x18/0x34
    [ 599.502493] warn_alloc+0xe0/0x17c
    [ 599.502498] __alloc_pages+0x524/0x854
    [ 599.502501] atomic_pool_expand+0x8c/0x268
    [ 599.502506] atomic_pool_resize+0x50/0x64
    [ 599.502510] atomic_pool_work_fn+0x44/0x54
    [ 599.502515] process_one_work+0x1c0/0x274
    [ 599.502521] worker_thread+0x1dc/0x274
    [ 599.502525] kthread+0xc4/0xd4
    [ 599.502529] ret_from_fork+0x10/0x20
    [ 599.502534] Mem-Info:
    [ 599.502537] active_anon:465 inactive_anon:13222 isolated_anon:0 active_file:37602 inactive_file:1830958 isolated_file:0 unevictable:4 dirty:1679 writeback:166667 slab_reclaimable:15924 slab_unreclaimable:27052 mapped:16566 shmem:499 pagetables:621 sec_pagetables:0 bounce:0 kernel_misc_reclaimable:0 free:12582 free_pcp:127 free_cma:0
    [ 599.502546] Node 0 active_anon:1860kB inactive_anon:52888kB active_file:150408kB inactive_file:7323832kB unevictable:16kB isolated(anon):0kB isolated(file):0kB mapped:66264kB dirty:6716kB writeback:666668kB shmem:1996kB writeback_tmp:0kB kernel_stack:4416kB pagetables:2484kB sec_pagetables:0kB all_unreclaimable? no
    [ 599.502554] DMA free:24920kB boost:0kB min:5448kB low:9248kB high:13048kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:956kB inactive_file:3784292kB unevictable:0kB writepending:29608kB present:3929344kB managed:3825428kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
    [ 599.502561] lowmem_reserve[]: 0 0 3902 3902
    [ 599.502567] Normal free:25408kB boost:0kB min:5712kB low:9696kB high:13680kB reserved_highatomic:2048KB active_anon:1860kB inactive_anon:52888kB active_file:149060kB inactive_file:3539556kB unevictable:16kB writepending:643228kB present:4194304kB managed:3996056kB mlocked:16kB bounce:0kB free_pcp:556kB local_pcp:0kB free_cma:0kB
    [ 599.502574] lowmem_reserve[]: 0 0 0 0
    [ 599.502580] DMA: 97*4kB (UM) 103*8kB (UM) 95*16kB (UME) 101*32kB (UME) 61*64kB (UME) 22*128kB (UME) 7*256kB (U) 7*512kB (UME) 5*1024kB (UME) 1*2048kB (U) 0*4096kB = 25228kB
    [ 599.502601] Normal: 962*4kB (UME) 377*8kB (UME) 140*16kB (UMEH) 43*32kB (UMEH) 22*64kB (UME) 25*128kB (UME) 17*256kB (UM) 4*512kB (ME) 5*1024kB (UME) 0*2048kB 0*4096kB = 26608kB
    [ 599.502621] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
    [ 599.502624] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=32768kB
    [ 599.502628] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
    [ 599.502631] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=64kB
    [ 599.502634] 1869148 total pagecache pages
    [ 599.502636] 0 pages in swap cache
    [ 599.502639] Free swap = 10873596kB
    [ 599.502641] Total swap = 10873852kB
    [ 599.502643] 2030912 pages RAM
    [ 599.502645] 0 pages HighMem/MovableOnly
    [ 599.502648] 75541 pages reserved
    [ 599.502650] 2048 pages cma reserved
    [ 624.681251] BTRFS info (device loop0p2): dev_replace from /dev/loop0p2 (devid 1) to /dev/loop1p2 finished

So although it works, a rather strange side effect with CMA makes me feel that the system could be unstable now. I have seen CMA-related errors before on RK35xx, but that was with U-Boot; this is with UEFI firmware, where I maybe forgot to change a setting or so. At least the HDMI display is blank, while the serial console and ssh work.
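The failed allocation in the trace is order:10, i.e. 1024 contiguous pages, which is 4096kB with the 4kB pages this log implies. As a small sketch (the helper name `free_blocks` is hypothetical), parsing the two per-zone free lists copied from the log shows that neither zone had a free block of that size at that moment. It only illustrates the fragmentation at the time of the warning and says nothing about why CMA was exhausted.

```python
import re
from typing import Dict

# Free-list lines copied from the log in this post (timestamps removed)
ZONES = {
    "DMA": "97*4kB (UM) 103*8kB (UM) 95*16kB (UME) 101*32kB (UME) 61*64kB (UME) "
           "22*128kB (UME) 7*256kB (U) 7*512kB (UME) 5*1024kB (UME) 1*2048kB (U) "
           "0*4096kB = 25228kB",
    "Normal": "962*4kB (UME) 377*8kB (UME) 140*16kB (UMEH) 43*32kB (UMEH) "
              "22*64kB (UME) 25*128kB (UME) 17*256kB (UM) 4*512kB (ME) "
              "5*1024kB (UME) 0*2048kB 0*4096kB = 26608kB",
}


def free_blocks(line: str) -> Dict[int, int]:
    """Map block size in kB to the number of free blocks of that size."""
    pairs = re.findall(r"(\d+)\*(\d+)kB", line)
    return {int(size): int(count) for count, size in pairs}


for zone, line in ZONES.items():
    blocks = free_blocks(line)
    total_kb = sum(size * count for size, count in blocks.items())
    print(zone, "total free:", total_kb, "kB; free 4096kB blocks:", blocks[4096])
```

The per-zone totals computed this way match the `= 25228kB` and `= 26608kB` figures printed by the kernel, so the parse is consistent with the log.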