Posted

I think this is now the third time this has happened. The OS is Armbian Bookworm with vendor, current, and edge kernels installed. It turns out that with the vendor kernel there is no issue, but it fails with mainline-based rockchip64 kernels, at least edge. I am not sure about current (6.6 or 6.12) as I am not really using those. I first discovered it with a 6.13.0-rc* kernel; I assumed something else might have been wrong, so I more or less ignored it and rebooted with the vendor kernel. That happened on both the nanopi-r6c and the rock-3a. I also rebooted the rock-3a on a later occasion just to make sure the problem would not occur.

 

Yesterday it happened on the nanopi-r6c with kernel 6.15.1-edge-rockchip64. I booted with this kernel but with an openSUSE Tumbleweed userspace (VERSION_ID="20250610"). The following command, run as root, should move the Armbian Bookworm rootfs from eMMC to NVMe:

btrfs replace start /dev/mmcblk1p2 /dev/nvme0n1p2 /local/armbid

 

Then in dmesg:

Jun 19 18:44:38 ranc kernel: BTRFS info (device mmcblk1p2): dev_replace from /dev/mmcblk1p2 (devid 1) to /dev/nvme0n1p2 started
Jun 19 18:45:49 ranc kernel: BTRFS warning (device mmcblk1p2): failed setting block group ro: -28
Jun 19 18:45:49 ranc kernel: BTRFS error (device mmcblk1p2): btrfs_scrub_dev(/dev/mmcblk1p2, 1, /dev/nvme0n1p2) failed -28
 

After 5 minutes or so I discovered that the filesystem was corrupted and would not mount: open_ctree failed: -5
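
For readers who do not know the numeric codes by heart: on Linux, -28 is ENOSPC and -5 is EIO. Below is a minimal sketch that pulls the trailing error code out of BTRFS warning/error log lines and names it. The function names `errno_name` and `decode_btrfs_errors` are made up for this illustration, and only the two codes seen in this thread are mapped.

```shell
#!/bin/sh
# Map the (negated) Linux errno values seen in this thread to their names.
# Only the two codes from the logs are covered; anything else is "unknown".
errno_name() {
    case "${1#-}" in
        5)  echo "EIO (I/O error)" ;;
        28) echo "ENOSPC (no space left on device)" ;;
        *)  echo "unknown" ;;
    esac
}

# Read kernel log lines on stdin and decode the trailing error code of
# every BTRFS warning/error line that ends in a negative number.
decode_btrfs_errors() {
    awk '/BTRFS (warning|error)/ && $NF ~ /^-[0-9]+$/ { print $NF }' |
    while read -r code; do
        printf '%s %s\n' "$code" "$(errno_name "$code")"
    done
}
```

It could be fed from the kernel log, for example `dmesg | decode_btrfs_errors`.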

As I do (very) frequent backups (differential send|receive to a NAS), recreating the filesystem and playing back the latest snapshot of the single subvolume is easy, but it would become quite a drama with a large multi-TB filesystem holding hundreds of snapshots/subvolumes.

 

As the OS userspace was the rolling-release Tumbleweed, btrfs-progs is very new, not the 6.2 from Bookworm. So it seems to be something in the mainline rockchip64 kernel.

 

I will try to pinpoint this further. The NanoPi-R6C is a bit of a DUT at the moment, but I could also test with various other combinations, ROCK3A or ROCK5B, although those have a dedicated task/use-case in my home. The 5B has 16G RAM, so it is easy to run VMs, loop devices, etc. I think this is not HW related, but who knows, maybe some detailed RK35xx optimization or so.

 

Posted (edited)

I could not reproduce it on a vanilla openSUSE Tumbleweed installation (EDK2-UEFI in SPI flash on the ROCK5B) when using a rather newly created filesystem that had only been in use for 15 minutes or so.

 

However, I can reproduce it if the image is my armbian64 virtual machine test/template image, which is between 1 and 2 years old and has seen many operations such as resize, shrink, extend, etc. So it seems to be some complex out-of-space issue in a certain part of the filesystem:

 

rock5b:/local/ssdata/nocow # uname -a ; grep VERSION_ID /etc/os-release
Linux rock5b 6.15.1-1-default #1 SMP PREEMPT_DYNAMIC Thu Jun  5 14:29:05 UTC 2025 (75961ad) aarch64 aarch64 aarch64 GNU/Linux
VERSION_ID="20250610"
- So it depends on the image: with a rather fresh, newly constructed image (serv.img) there is no problem.
- -28 is the no-space error (ENOSPC) AFAIK, so check the internals of the image:
rock5b:/local/ssdata/nocow # df | grep loop0p2
/dev/loop0p2                          12582912    7904480    2852032  74% /local/ssdata/nocow/2
rock5b:/local/ssdata/nocow # btrfs fi df 2
Data, single: total=10.01GiB, used=7.29GiB
System, single: total=4.00MiB, used=16.00KiB
Metadata, single: total=1.99GiB, used=218.05MiB
GlobalReserve, single: total=26.95MiB, used=0.00B
rock5b:/local/ssdata/nocow # btrfs de us 2
/dev/loop0p2, ID: 1
   Device size:            12.00GiB
   Device slack:              0.00B
   Data,single:            10.01GiB
   Metadata,single:         1.99GiB
   System,single:           4.00MiB
   Unallocated:             1.00MiB

rock5b:/local/ssdata/nocow # btrfs fi us 2
Overall:
    Device size:                  12.00GiB
    Device allocated:             12.00GiB
    Device unallocated:            1.00MiB
    Device missing:                  0.00B
    Device slack:                    0.00B
    Used:                          7.51GiB
    Free (estimated):              2.72GiB      (min: 2.72GiB)
    Free (statfs, df):             2.72GiB
    Data ratio:                       1.00
    Metadata ratio:                   1.00
    Global reserve:               26.95MiB      (used: 0.00B)
    Multiple profiles:                  no

Data,single: Size:10.01GiB, Used:7.29GiB (72.82%)
   /dev/loop0p2   10.01GiB

Metadata,single: Size:1.99GiB, Used:229.44MiB (11.27%)
   /dev/loop0p2    1.99GiB

System,single: Size:4.00MiB, Used:16.00KiB (0.39%)
   /dev/loop0p2    4.00MiB

Unallocated:
   /dev/loop0p2    1.00MiB

From 10+ years of Btrfs usage and a quick look at those numbers, I think the problem occurs because there is no room to easily/directly create a new chunk: the device is fully allocated, with only 1.00MiB unallocated.
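
Following that reasoning, a pre-flight check before `btrfs replace start` could look roughly like the sketch below. It reads the output of `btrfs device usage -b <path>` (raw byte counts) and compares the unallocated space against a threshold. The helper names and the 1 GiB threshold are assumptions chosen for illustration, not a documented requirement of replace, and only the first device in the output is considered.

```shell
#!/bin/sh
# Print the "Unallocated:" value from `btrfs device usage -b` output (stdin).
# With -b the sizes are raw byte counts, so the second field is a plain number.
# Only the first device listed is considered.
unallocated_bytes() {
    awk '$1 == "Unallocated:" { print $2; exit }'
}

# Succeed if $1 (bytes unallocated) is at least $2 (bytes required).
has_headroom() {
    [ "$1" -ge "$2" ]
}

# Hypothetical threshold: room for one 1 GiB data chunk.
MIN_UNALLOCATED=$((1024 * 1024 * 1024))

# Example use (the mount point is a placeholder):
#   free=$(btrfs device usage -b /mnt/point | unallocated_bytes)
#   has_headroom "$free" "$MIN_UNALLOCATED" || echo "balance first"
```

With the numbers shown above (1.00MiB unallocated) this check would fail and suggest balancing before the replace.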

The filesystem is still mounted. If I unmount it, a new mount fails because of an undefined state (the replace is ongoing but there is no mounted filesystem, and the target device has temporary ID 0 and cannot be found?).

 

So I will see whether balancing first with the proper options gets it done.

 

 

Edited by eselarm
Posted

OK, as I thought, not enough 'headroom'

 

rock5b:/local/ssdata/nocow # btrfs inspect-internal list-chunks 2
Devid PNumber      Type/profile   PStart    Length     PEnd LNumber   LStart Usage%
----- ------- ----------------- -------- --------- -------- ------- -------- ------
    1       1     System/single  1.00MiB   4.00MiB  5.00MiB       4  2.30GiB   0.39
    1       2   Metadata/single  5.00MiB   8.00MiB 13.00MiB       1  5.00MiB  35.74
    1       3       Data/single 13.00MiB   8.00MiB 21.00MiB       2 13.00MiB 100.00
    1       4       Data/single 21.00MiB   1.00GiB  1.02GiB       3 21.00MiB  53.20
    1       5   Metadata/single  1.02GiB 208.00MiB  1.22GiB       5  2.31GiB  23.51
    1       6       Data/single  1.22GiB 512.00MiB  1.72GiB       6  2.51GiB  84.35
    1       7       Data/single  1.72GiB 512.00MiB  2.22GiB       7  3.01GiB  86.48
    1       8   Metadata/single  2.22GiB 256.00MiB  2.47GiB       8  3.51GiB  33.67
    1       9   Metadata/single  2.47GiB 256.00MiB  2.72GiB       9  3.76GiB  14.43
    1      10       Data/single  2.72GiB   1.00GiB  3.72GiB      10  4.01GiB  83.88
    1      11       Data/single  3.72GiB   1.00GiB  4.72GiB      11  5.01GiB  97.23
    1      12       Data/single  4.72GiB   1.00GiB  5.72GiB      12  6.01GiB  55.02
    1      13       Data/single  5.72GiB   1.00GiB  6.72GiB      13  7.01GiB  85.21
    1      14       Data/single  6.72GiB   1.00GiB  7.72GiB      14  8.01GiB  58.04
    1      15       Data/single  7.72GiB   1.00GiB  8.72GiB      15  9.01GiB  82.03
    1      16       Data/single  8.72GiB   1.00GiB  9.72GiB      16 10.01GiB  28.15
    1      17   Metadata/single  9.72GiB 256.00MiB  9.97GiB      17 11.01GiB  10.30
    1      18       Data/single  9.97GiB   1.00GiB 10.97GiB      18 11.26GiB  99.83
    1      19   Metadata/single 10.97GiB 256.00MiB 11.22GiB      19 12.26GiB   0.58
    1      20   Metadata/single 11.22GiB 256.00MiB 11.47GiB      20 12.51GiB   5.22
    1      21   Metadata/single 11.47GiB 256.00MiB 11.72GiB      21 12.76GiB   0.49
    1      22   Metadata/single 11.72GiB 256.00MiB 11.97GiB      22 13.01GiB   0.27
    1      23   Metadata/single 11.97GiB  27.00MiB 12.00GiB      23 13.26GiB   0.06
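
The table above shows where the headroom went: five metadata chunks (PNumber 19 to 23) are each under 10% used but still occupy allocated space. A small sketch to list such nearly empty chunks from `btrfs inspect-internal list-chunks` output follows; the function name `sparse_chunks` and the 10% cut-off are my own choices for illustration, and the column positions are assumed to match the output shown here.

```shell
#!/bin/sh
# List chunks of a given type whose usage is below a threshold.
# Reads `btrfs inspect-internal list-chunks <path>` output on stdin.
#   $1 = chunk type prefix, e.g. Metadata or Data
#   $2 = usage threshold in percent
# Prints: PNumber Length Usage% for every matching chunk.
sparse_chunks() {
    awk -v type="$1" -v max="$2" '
        # Data rows start with a numeric devid; header and ruler lines do not.
        $1 ~ /^[0-9]+$/ && index($3, type "/") == 1 && ($9 + 0) < max {
            print $2, $5, $9
        }
    '
}

# Example use (the mount point is a placeholder):
#   btrfs inspect-internal list-chunks /mnt/point | sparse_chunks Metadata 10
```

Chunks like these are what a filtered balance can compact to return space to the unallocated pool.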

I then ran various balance actions and retried the replace; see the result below:

[334356.515225] [ T207531] BTRFS info (device loop0p2): balance: start -mlimit=3 -slimit=3
[334356.516261] [ T207531] BTRFS info (device loop0p2): relocating block group 14236516352 flags metadata
[334356.608520] [ T207531] BTRFS info (device loop0p2): found 1 extents, stage: move data extents
[334356.655369] [ T207531] BTRFS info (device loop0p2): relocating block group 13968080896 flags metadata
[334356.882397] [ T207531] BTRFS info (device loop0p2): found 44 extents, stage: move data extents
[334356.974962] [ T207531] BTRFS info (device loop0p2): relocating block group 13699645440 flags metadata
[334357.145812] [ T207531] BTRFS info (device loop0p2): found 77 extents, stage: move data extents
[334357.223337] [ T207531] BTRFS info (device loop0p2): balance: ended with status: 0
[334402.346978] [ T207580] BTRFS info (device loop0p2): balance: start -mlimit=6 -slimit=6
[334402.347912] [ T207580] BTRFS info (device loop0p2): relocating block group 13431209984 flags metadata
[334403.120645] [ T207580] BTRFS info (device loop0p2): found 841 extents, stage: move data extents
[334403.372968] [ T207580] BTRFS info (device loop0p2): relocating block group 13162774528 flags metadata
[334403.613848] [ T207580] BTRFS info (device loop0p2): found 94 extents, stage: move data extents
[334403.743610] [ T207580] BTRFS info (device loop0p2): relocating block group 11820597248 flags metadata
[334404.526435] [ T207580] BTRFS info (device loop0p2): found 1666 extents, stage: move data extents
[334404.798554] [ T207580] BTRFS info (device loop0p2): relocating block group 4035969024 flags metadata
[334406.059117] [ T207580] BTRFS info (device loop0p2): found 2363 extents, stage: move data extents
[334406.374011] [ T207580] BTRFS info (device loop0p2): relocating block group 3767533568 flags metadata
[334407.879438] [ T207580] BTRFS info (device loop0p2): found 5433 extents, stage: move data extents
[334408.284506] [ T207580] BTRFS info (device loop0p2): relocating block group 2475687936 flags metadata
[334413.884083] [ T207580] BTRFS info (device loop0p2): found 9463 extents, stage: move data extents
[334414.690177] [ T207580] BTRFS info (device loop0p2): balance: ended with status: 0
[334530.694011] [ T207693] BTRFS info (device loop0p2): balance: start -dlimit=2
[334530.694839] [ T207693] BTRFS info (device loop0p2): relocating block group 12089032704 flags data
[334534.205097] [ T207693] BTRFS info (device loop0p2): found 1897 extents, stage: move data extents
[334534.491352] [ T207693] BTRFS info (device loop0p2): found 1897 extents, stage: update data pointers
[334534.625126] [ T207693] BTRFS info (device loop0p2): relocating block group 10746855424 flags data
[334536.605759] [ T207693] BTRFS info (device loop0p2): found 16316 extents, stage: move data extents
[334537.056028] [ T207693] BTRFS info (device loop0p2): found 16316 extents, stage: update data pointers
[334537.361629] [ T207693] BTRFS info (device loop0p2): balance: ended with status: 0
[334614.951029] [ T207803] BTRFS info (device loop0p2): dev_replace from /dev/loop0p2 (devid 1) to /dev/loop1p2 started
[334654.572435] [ T207803] BTRFS info (device loop0p2): dev_replace from /dev/loop0p2 (devid 1) to /dev/loop1p2 finished

So I would need to do the same with the vendor kernel on the same images. I made exact copies, so it can be done; however, starting the vendor kernel requires a system reboot (and another bootloader), so I am not sure when I can do this.
