Posted

I think this is now the 3rd time this has happened. The OS is Armbian Bookworm with vendor, current and edge kernels installed. It turns out that with the vendor kernel there is no issue, but it fails with mainline-based rockchip64 kernels, at least edge. I am not sure about current (6.6 or 6.12), as I am not really using those. I first ran into it with a 6.13.0-rc* kernel; I assumed something else might have been wrong, so I more or less ignored it and rebooted with the vendor kernel. That happened on both the nanopi-r6c and the rock-3a. I also rebooted the rock-3a on a later occasion just to make sure the problem would not occur.

 

Yesterday it happened on the nanopi-r6c with kernel 6.15.1-edge-rockchip64. I booted with this kernel but with an openSUSE Tumbleweed userspace (VERSION_ID="20250610"). The following command, run as root, should move the Armbian Bookworm rootfs from eMMC to NVMe:

btrfs replace start /dev/mmcblk1p2 /dev/nvme0n1p2 /local/armbid

 

Then in dmesg:

Jun 19 18:44:38 ranc kernel: BTRFS info (device mmcblk1p2): dev_replace from /dev/mmcblk1p2 (devid 1) to /dev/nvme0n1p2 started
Jun 19 18:45:49 ranc kernel: BTRFS warning (device mmcblk1p2): failed setting block group ro: -28
Jun 19 18:45:49 ranc kernel: BTRFS error (device mmcblk1p2): btrfs_scrub_dev(/dev/mmcblk1p2, 1, /dev/nvme0n1p2) failed -28
 

After 5 minutes or so I discovered that the filesystem was corrupted and would no longer open: open_ctree failed: -5
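
The negative numbers in these kernel messages are negated Linux errno values. A minimal sketch for decoding them with Python's standard `errno` module, assuming it is run on a Linux machine (the numeric values are platform-specific); the helper name `decode_kernel_error` is only for illustration:

```python
import errno
import os


def decode_kernel_error(code: int) -> str:
    """Translate a negated errno from a kernel log line, e.g. -28."""
    number = abs(code)
    name = errno.errorcode.get(number, "UNKNOWN")
    return f"{name}: {os.strerror(number)}"


# Codes that appear in the BTRFS messages above.
for code in (-28, -5):
    print(code, decode_kernel_error(code))
```

On Linux, -28 maps to ENOSPC (no space left on device) and -5 to EIO (input/output error), so the replace ran out of space first and the later mount attempt failed with an I/O error.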

As I do (very) frequent backups (differential send|receive to NAS), recreating the filesystem and playing back the latest single subvolume snapshot is easy, but it would become quite a drama with a large multi-TB filesystem holding hundreds of snapshots/subvolumes.

 

As the OS userspace was rolling-release Tumbleweed, btrfs-progs is very new, not the 6.2 from Bookworm. So it seems to be something in the mainline rockchip64 kernel.

 

I will try to pinpoint this further. The NanoPi-R6C is a bit of a DUT (device under test) ATM, but I could also test with various other combinations, ROCK3A or ROCK5B, although those have a dedicated task/use-case in my home. The 5B has 16G RAM, so it is easy to run VMs, loop devices etc. I think this is not HW related, but who knows, maybe some detailed RK35xx optimization or so.

 

Posted (edited)

I could not reproduce it on a vanilla openSUSE Tumbleweed installation (EDK2-UEFI in SPI flash on ROCK5B) when using a rather newly created filesystem that had only been used for 15 minutes or so.

 

However, I can reproduce it if the image is that of my armbian64 virtual machine (test/template), which is between 1 and 2 years old and has seen many operations such as resize, shrink, extend etc. So it seems to be some complex out-of-space issue in a certain part of the filesystem:

 

rock5b:/local/ssdata/nocow # uname -a ; grep VERSION_ID /etc/os-release
Linux rock5b 6.15.1-1-default #1 SMP PREEMPT_DYNAMIC Thu Jun  5 14:29:05 UTC 2025 (75961ad) aarch64 aarch64 aarch64 GNU/Linux
VERSION_ID="20250610"
- So it depends on the image: a rather fresh, newly constructed serv.img shows no problem.
- -28 is the no-space error (ENOSPC) AFAIK, so check the internals of the image.
rock5b:/local/ssdata/nocow # df | grep loop0p2
/dev/loop0p2                          12582912    7904480    2852032  74% /local/ssdata/nocow/2
rock5b:/local/ssdata/nocow # btrfs fi df 2
Data, single: total=10.01GiB, used=7.29GiB
System, single: total=4.00MiB, used=16.00KiB
Metadata, single: total=1.99GiB, used=218.05MiB
GlobalReserve, single: total=26.95MiB, used=0.00B
rock5b:/local/ssdata/nocow # btrfs de us 2
/dev/loop0p2, ID: 1
   Device size:            12.00GiB
   Device slack:              0.00B
   Data,single:            10.01GiB
   Metadata,single:         1.99GiB
   System,single:           4.00MiB
   Unallocated:             1.00MiB

rock5b:/local/ssdata/nocow # btrfs fi us 2
Overall:
    Device size:                  12.00GiB
    Device allocated:             12.00GiB
    Device unallocated:            1.00MiB
    Device missing:                  0.00B
    Device slack:                    0.00B
    Used:                          7.51GiB
    Free (estimated):              2.72GiB      (min: 2.72GiB)
    Free (statfs, df):             2.72GiB
    Data ratio:                       1.00
    Metadata ratio:                   1.00
    Global reserve:               26.95MiB      (used: 0.00B)
    Multiple profiles:                  no

Data,single: Size:10.01GiB, Used:7.29GiB (72.82%)
   /dev/loop0p2   10.01GiB

Metadata,single: Size:1.99GiB, Used:229.44MiB (11.27%)
   /dev/loop0p2    1.99GiB

System,single: Size:4.00MiB, Used:16.00KiB (0.39%)
   /dev/loop0p2    4.00MiB

Unallocated:
   /dev/loop0p2    1.00MiB

From 10+ years of Btrfs usage and a quick look at those numbers, I think the problem occurs because a new chunk cannot easily/directly be created: the device is fully allocated, with only 1.00MiB unallocated.
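
That reasoning can be turned into a quick pre-flight check. A minimal sketch, assuming the `btrfs device usage` text format shown above; the 1 GiB threshold is an assumption about how much room a new chunk may want, not a documented limit, and `parse_size` and `unallocated_bytes` are hypothetical helper names:

```python
import re

UNITS = {"B": 1, "KiB": 2**10, "MiB": 2**20, "GiB": 2**30, "TiB": 2**40}


def parse_size(text: str) -> int:
    """Convert a btrfs-progs size such as '1.00MiB' to bytes."""
    match = re.fullmatch(r"([\d.]+)(B|KiB|MiB|GiB|TiB)", text.strip())
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    return int(float(match.group(1)) * UNITS[match.group(2)])


def unallocated_bytes(device_usage: str) -> int:
    """Return the 'Unallocated:' figure from `btrfs device usage` output."""
    for line in device_usage.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "Unallocated":
            return parse_size(value)
    raise ValueError("no 'Unallocated:' line found")


# Output of `btrfs de us 2` copied from above.
SAMPLE = """\
/dev/loop0p2, ID: 1
   Device size:            12.00GiB
   Device slack:              0.00B
   Data,single:            10.01GiB
   Metadata,single:         1.99GiB
   System,single:           4.00MiB
   Unallocated:             1.00MiB
"""

ASSUMED_CHUNK_BYTES = 2**30  # assumption: room for one 1 GiB chunk
free = unallocated_bytes(SAMPLE)
print(f"unallocated: {free} bytes, enough headroom: {free >= ASSUMED_CHUNK_BYTES}")
```

With the numbers from this image the check reports no headroom, even though `df` still shows 2.72GiB free inside the already allocated chunks.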

The filesystem is still mounted. If I unmount it, a new mount fails because of an undefined state (replace ongoing but no mounted filesystem, and the target device has temporary devid 0 and cannot be found?).

 

So I will see if balancing first with the proper options gets it done.

 

 

Edited by eselarm
Posted

OK, as I thought, not enough 'headroom'

 

rock5b:/local/ssdata/nocow # btrfs inspect-internal list-chunks 2
Devid PNumber      Type/profile   PStart    Length     PEnd LNumber   LStart Usage%
----- ------- ----------------- -------- --------- -------- ------- -------- ------
    1       1     System/single  1.00MiB   4.00MiB  5.00MiB       4  2.30GiB   0.39
    1       2   Metadata/single  5.00MiB   8.00MiB 13.00MiB       1  5.00MiB  35.74
    1       3       Data/single 13.00MiB   8.00MiB 21.00MiB       2 13.00MiB 100.00
    1       4       Data/single 21.00MiB   1.00GiB  1.02GiB       3 21.00MiB  53.20
    1       5   Metadata/single  1.02GiB 208.00MiB  1.22GiB       5  2.31GiB  23.51
    1       6       Data/single  1.22GiB 512.00MiB  1.72GiB       6  2.51GiB  84.35
    1       7       Data/single  1.72GiB 512.00MiB  2.22GiB       7  3.01GiB  86.48
    1       8   Metadata/single  2.22GiB 256.00MiB  2.47GiB       8  3.51GiB  33.67
    1       9   Metadata/single  2.47GiB 256.00MiB  2.72GiB       9  3.76GiB  14.43
    1      10       Data/single  2.72GiB   1.00GiB  3.72GiB      10  4.01GiB  83.88
    1      11       Data/single  3.72GiB   1.00GiB  4.72GiB      11  5.01GiB  97.23
    1      12       Data/single  4.72GiB   1.00GiB  5.72GiB      12  6.01GiB  55.02
    1      13       Data/single  5.72GiB   1.00GiB  6.72GiB      13  7.01GiB  85.21
    1      14       Data/single  6.72GiB   1.00GiB  7.72GiB      14  8.01GiB  58.04
    1      15       Data/single  7.72GiB   1.00GiB  8.72GiB      15  9.01GiB  82.03
    1      16       Data/single  8.72GiB   1.00GiB  9.72GiB      16 10.01GiB  28.15
    1      17   Metadata/single  9.72GiB 256.00MiB  9.97GiB      17 11.01GiB  10.30
    1      18       Data/single  9.97GiB   1.00GiB 10.97GiB      18 11.26GiB  99.83
    1      19   Metadata/single 10.97GiB 256.00MiB 11.22GiB      19 12.26GiB   0.58
    1      20   Metadata/single 11.22GiB 256.00MiB 11.47GiB      20 12.51GiB   5.22
    1      21   Metadata/single 11.47GiB 256.00MiB 11.72GiB      21 12.76GiB   0.49
    1      22   Metadata/single 11.72GiB 256.00MiB 11.97GiB      22 13.01GiB   0.27
    1      23   Metadata/single 11.97GiB  27.00MiB 12.00GiB      23 13.26GiB   0.06
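
The table shows where the space went: many metadata chunks are allocated but almost empty. A minimal sketch that sums the metadata rows above (PNumber, Length and Usage% copied by hand from the listing; `to_bytes` is a hypothetical helper and the 1% cut-off for "nearly empty" is an arbitrary assumption):

```python
UNITS = {"MiB": 2**20, "GiB": 2**30}

# (PNumber, Length, Usage%) of every Metadata/single row in the listing above.
METADATA_CHUNKS = [
    (2, "8.00MiB", 35.74),
    (5, "208.00MiB", 23.51),
    (8, "256.00MiB", 33.67),
    (9, "256.00MiB", 14.43),
    (17, "256.00MiB", 10.30),
    (19, "256.00MiB", 0.58),
    (20, "256.00MiB", 5.22),
    (21, "256.00MiB", 0.49),
    (22, "256.00MiB", 0.27),
    (23, "27.00MiB", 0.06),
]


def to_bytes(size: str) -> int:
    """Convert a size like '256.00MiB' (MiB or GiB only) to bytes."""
    return int(float(size[:-3]) * UNITS[size[-3:]])


allocated = sum(to_bytes(length) for _, length, _ in METADATA_CHUNKS)
used = sum(to_bytes(length) * pct / 100 for _, length, pct in METADATA_CHUNKS)
nearly_empty = [number for number, _, pct in METADATA_CHUNKS if pct < 1.0]

print(f"metadata allocated: {allocated / 2**20:.0f} MiB")
print(f"metadata used:      {used / 2**20:.0f} MiB")
print(f"chunks under 1% usage: {nearly_empty}")
```

The totals agree with the earlier `btrfs fi df` figures (about 1.99GiB allocated, about 218MiB used), which suggests that compacting the metadata chunks should free well over 1 GiB for new allocations.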

Then I did various balance actions (metadata/system with limit=3, then limit=6, then data with limit=2), followed by the replace action again; see the result below:

[334356.515225] [ T207531] BTRFS info (device loop0p2): balance: start -mlimit=3 -slimit=3
[334356.516261] [ T207531] BTRFS info (device loop0p2): relocating block group 14236516352 flags metadata
[334356.608520] [ T207531] BTRFS info (device loop0p2): found 1 extents, stage: move data extents
[334356.655369] [ T207531] BTRFS info (device loop0p2): relocating block group 13968080896 flags metadata
[334356.882397] [ T207531] BTRFS info (device loop0p2): found 44 extents, stage: move data extents
[334356.974962] [ T207531] BTRFS info (device loop0p2): relocating block group 13699645440 flags metadata
[334357.145812] [ T207531] BTRFS info (device loop0p2): found 77 extents, stage: move data extents
[334357.223337] [ T207531] BTRFS info (device loop0p2): balance: ended with status: 0
[334402.346978] [ T207580] BTRFS info (device loop0p2): balance: start -mlimit=6 -slimit=6
[334402.347912] [ T207580] BTRFS info (device loop0p2): relocating block group 13431209984 flags metadata
[334403.120645] [ T207580] BTRFS info (device loop0p2): found 841 extents, stage: move data extents
[334403.372968] [ T207580] BTRFS info (device loop0p2): relocating block group 13162774528 flags metadata
[334403.613848] [ T207580] BTRFS info (device loop0p2): found 94 extents, stage: move data extents
[334403.743610] [ T207580] BTRFS info (device loop0p2): relocating block group 11820597248 flags metadata
[334404.526435] [ T207580] BTRFS info (device loop0p2): found 1666 extents, stage: move data extents
[334404.798554] [ T207580] BTRFS info (device loop0p2): relocating block group 4035969024 flags metadata
[334406.059117] [ T207580] BTRFS info (device loop0p2): found 2363 extents, stage: move data extents
[334406.374011] [ T207580] BTRFS info (device loop0p2): relocating block group 3767533568 flags metadata
[334407.879438] [ T207580] BTRFS info (device loop0p2): found 5433 extents, stage: move data extents
[334408.284506] [ T207580] BTRFS info (device loop0p2): relocating block group 2475687936 flags metadata
[334413.884083] [ T207580] BTRFS info (device loop0p2): found 9463 extents, stage: move data extents
[334414.690177] [ T207580] BTRFS info (device loop0p2): balance: ended with status: 0
[334530.694011] [ T207693] BTRFS info (device loop0p2): balance: start -dlimit=2
[334530.694839] [ T207693] BTRFS info (device loop0p2): relocating block group 12089032704 flags data
[334534.205097] [ T207693] BTRFS info (device loop0p2): found 1897 extents, stage: move data extents
[334534.491352] [ T207693] BTRFS info (device loop0p2): found 1897 extents, stage: update data pointers
[334534.625126] [ T207693] BTRFS info (device loop0p2): relocating block group 10746855424 flags data
[334536.605759] [ T207693] BTRFS info (device loop0p2): found 16316 extents, stage: move data extents
[334537.056028] [ T207693] BTRFS info (device loop0p2): found 16316 extents, stage: update data pointers
[334537.361629] [ T207693] BTRFS info (device loop0p2): balance: ended with status: 0
[334614.951029] [ T207803] BTRFS info (device loop0p2): dev_replace from /dev/loop0p2 (devid 1) to /dev/loop1p2 started
[334654.572435] [ T207803] BTRFS info (device loop0p2): dev_replace from /dev/loop0p2 (devid 1) to /dev/loop1p2 finished
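
The block group numbers in the balance log are logical start addresses in bytes, so they can be matched against the LStart column of the list-chunks table. A minimal sketch, assuming two-decimal GiB rounding for the comparison; `as_gib` is a hypothetical helper:

```python
GIB = 2**30

# Block group addresses from the first balance run (-mlimit=3) in the log above.
RELOCATED = [14236516352, 13968080896, 13699645440]


def as_gib(logical_address: int) -> str:
    """Format a byte address with two decimals in GiB, like the LStart column."""
    return f"{logical_address / GIB:.2f}GiB"


for address in RELOCATED:
    print(address, "->", as_gib(address))
```

These come out as 13.26GiB, 13.01GiB and 12.76GiB, which are the LStart values of chunks 23, 22 and 21, i.e. the three nearly empty metadata chunks at the end of the device.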

So I would need to do the same with the vendor kernel on the same images. I made exact copies, so it can be done; however, starting the vendor kernel requires a system reboot (and another bootloader), so I am not sure when I can do this.

Posted (edited)

Restarted with the vendor kernel, then the same actions:

[  580.015977] BTRFS info (device loop0p2): dev_replace from /dev/loop0p2 (devid 1) to /dev/loop1p2 started
[  599.501402] cma: cma_alloc: cma: alloc failed, req-size: 1024 pages, ret: -12
[  599.502431] kworker/6:4: page allocation failure: order:10, mode:0xcc0(GFP_KERNEL), nodemask=(null),cpuset=/,mems_allowed=0
[  599.502452] CPU: 6 PID: 239 Comm: kworker/6:4 Tainted: G        W          6.1.115-vendor-rk35xx #1
[  599.502457] Hardware name: FriendlyElec NanoPi R6C/NanoPi R6C, BIOS v1.1 04/09/2025
[  599.502461] Workqueue: events atomic_pool_work_fn
[  599.502471] Call trace:
[  599.502474]  dump_backtrace+0xf0/0x12c
[  599.502480]  show_stack+0x20/0x30
[  599.502484]  dump_stack_lvl+0x7c/0xa0
[  599.502489]  dump_stack+0x18/0x34
[  599.502493]  warn_alloc+0xe0/0x17c
[  599.502498]  __alloc_pages+0x524/0x854
[  599.502501]  atomic_pool_expand+0x8c/0x268
[  599.502506]  atomic_pool_resize+0x50/0x64
[  599.502510]  atomic_pool_work_fn+0x44/0x54
[  599.502515]  process_one_work+0x1c0/0x274
[  599.502521]  worker_thread+0x1dc/0x274
[  599.502525]  kthread+0xc4/0xd4
[  599.502529]  ret_from_fork+0x10/0x20
[  599.502534] Mem-Info:
[  599.502537] active_anon:465 inactive_anon:13222 isolated_anon:0
                active_file:37602 inactive_file:1830958 isolated_file:0
                unevictable:4 dirty:1679 writeback:166667
                slab_reclaimable:15924 slab_unreclaimable:27052
                mapped:16566 shmem:499 pagetables:621
                sec_pagetables:0 bounce:0
                kernel_misc_reclaimable:0
                free:12582 free_pcp:127 free_cma:0
[  599.502546] Node 0 active_anon:1860kB inactive_anon:52888kB active_file:150408kB inactive_file:7323832kB unevictable:16kB isolated(anon):0kB isolated(file):0kB mapped:66264kB dirty:6716kB writeback:666668kB shmem:1996kB writeback_tmp:0kB kernel_stack:4416kB pagetables:2484kB sec_pagetables:0kB all_unreclaimable? no
[  599.502554] DMA free:24920kB boost:0kB min:5448kB low:9248kB high:13048kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:956kB inactive_file:3784292kB unevictable:0kB writepending:29608kB present:3929344kB managed:3825428kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[  599.502561] lowmem_reserve[]: 0 0 3902 3902
[  599.502567] Normal free:25408kB boost:0kB min:5712kB low:9696kB high:13680kB reserved_highatomic:2048KB active_anon:1860kB inactive_anon:52888kB active_file:149060kB inactive_file:3539556kB unevictable:16kB writepending:643228kB present:4194304kB managed:3996056kB mlocked:16kB bounce:0kB free_pcp:556kB local_pcp:0kB free_cma:0kB
[  599.502574] lowmem_reserve[]: 0 0 0 0
[  599.502580] DMA: 97*4kB (UM) 103*8kB (UM) 95*16kB (UME) 101*32kB (UME) 61*64kB (UME) 22*128kB (UME) 7*256kB (U) 7*512kB (UME) 5*1024kB (UME) 1*2048kB (U) 0*4096kB = 25228kB
[  599.502601] Normal: 962*4kB (UME) 377*8kB (UME) 140*16kB (UMEH) 43*32kB (UMEH) 22*64kB (UME) 25*128kB (UME) 17*256kB (UM) 4*512kB (ME) 5*1024kB (UME) 0*2048kB 0*4096kB = 26608kB
[  599.502621] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[  599.502624] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=32768kB
[  599.502628] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[  599.502631] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=64kB
[  599.502634] 1869148 total pagecache pages
[  599.502636] 0 pages in swap cache
[  599.502639] Free swap  = 10873596kB
[  599.502641] Total swap = 10873852kB
[  599.502643] 2030912 pages RAM
[  599.502645] 0 pages HighMem/MovableOnly
[  599.502648] 75541 pages reserved
[  599.502650] 2048 pages cma reserved
[  624.681251] BTRFS info (device loop0p2): dev_replace from /dev/loop0p2 (devid 1) to /dev/loop1p2 finished

So although it works, the rather strange side effect with CMA makes me feel that the system could be unstable now.
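
The trace itself gives some hints about what failed. A minimal sketch that works through the numbers, assuming a 4 KiB page size (consistent with `active_anon:465` pages being reported as 1860kB); the two free-list lines are copied from the trace with timestamps stripped, and `order_to_kib` and `free_blocks` are hypothetical helpers:

```python
import re

PAGE_KIB = 4  # assumption: 4 KiB pages, as the page/kB counters above imply

# Buddy allocator free lists per zone, copied from the trace above.
BUDDY = {
    "DMA": "97*4kB (UM) 103*8kB (UM) 95*16kB (UME) 101*32kB (UME) "
           "61*64kB (UME) 22*128kB (UME) 7*256kB (U) 7*512kB (UME) "
           "5*1024kB (UME) 1*2048kB (U) 0*4096kB = 25228kB",
    "Normal": "962*4kB (UME) 377*8kB (UME) 140*16kB (UMEH) 43*32kB (UMEH) "
              "22*64kB (UME) 25*128kB (UME) 17*256kB (UM) 4*512kB (ME) "
              "5*1024kB (UME) 0*2048kB 0*4096kB = 26608kB",
}


def order_to_kib(order: int) -> int:
    """Size in KiB of one contiguous allocation of the given order."""
    return (1 << order) * PAGE_KIB


def free_blocks(line: str) -> dict:
    """Map block size in kB to the number of free blocks of that size."""
    return {int(size): int(count)
            for count, size in re.findall(r"(\d+)\*(\d+)kB", line)}


needed = order_to_kib(10)  # the failed request was order:10 (1024 pages)
for zone, line in BUDDY.items():
    print(zone, "free blocks of", needed, "kB:", free_blocks(line).get(needed, 0))
```

Under that assumption the order:10 request is one contiguous 4 MiB block, and neither zone has a free 4096kB block, while `free_cma` is 0 as well. That reads more like a shortage of contiguous memory during heavy writeback (`ret: -12` is ENOMEM) than a general out-of-memory condition, although the trace alone does not prove it.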

 

I have seen CMA-related errors before on RK35xx, but that was with U-Boot; this is with UEFI firmware, where I may have forgotten to change a setting or so. At least the HDMI display is blank, while serial console and ssh work.

Edited by eselarm
