abugher Posted 1 hour ago

I attempted to file a bug using `reportbug`, but since the kernel package is from Armbian and the report went to Debian, I assume that is a dead end.

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1137721

I will mostly copy/paste from that report here, but I also ran `armbianmonitor -u`.

https://paste.armbian.com/wenegucita

I run a home backup server with a 1TB SSD attached to an ODroid HC1 board. There is no redundancy, but I do keep a limited history of snapshots, taken hourly and trimmed down to a small number of hourly, daily, weekly, monthly, and yearly snapshots. There should be at most about 20 snapshots existing at a time. I have a daily cron job to run a balance operation. The mount point is `/backup`.

Relevant line from /etc/fstab:

    /dev/mapper/backup /backup btrfs noatime,noauto,nodev,nosuid,noexec,compress=zlib:9 0 0

Command used for daily balance:

    btrfs balance start --full-balance '/backup'

The filesystem only seems to be about half full:

    ansible@backup:~ $ sudo btrfs filesystem df --si /backup
    Data, single: total=430.57GB, used=428.24GB
    System, DUP: total=67.11MB, used=81.92kB
    Metadata, DUP: total=3.22GB, used=2.48GB
    GlobalReserve, single: total=536.87MB, used=49.15kB
    ansible@backup:~ $

My best understanding is that the messages that follow indicate that something is attempting to address space beyond 16TB, and that seems like a bug because my disk is only 1TB total. I would appreciate anyone with a firmer grasp of the topic at least confirming whether that is an accurate reading of the warnings and errors. If not, I might suggest that the warnings and errors could be improved, but I am open to correction if I just failed to understand what they are telling me.

The problem seems to be triggered by any attempt to rebalance metadata or system data with a usage limit of 1 or higher. Excerpt from a command session:

    ansible@backup:~ $ time sudo btrfs balance start -musage=1 /backup; echo $?
    ERROR: error during balancing '/backup': Read-only file system
    There may be more info in syslog - try dmesg | tail
    real 0m0.304s
    user 0m0.040s
    sys 0m0.082s
    1
    ansible@backup:~ $ sudo umount /backup && sudo mount -o skip_balance /backup && sudo btrfs balance cancel /backup
    ansible@backup:~ $ time sudo btrfs balance start -f -susage=1 /backup; echo $?
    ERROR: error during balancing '/backup': Read-only file system
    There may be more info in syslog - try dmesg | tail
    real 0m0.321s
    user 0m0.023s
    sys 0m0.096s
    1
    ansible@backup:~ $

Excerpt from dmesg from that time:

    [114023.693265] BTRFS info (device dm-0): balance: start -musage=1 -susage=1
    [114023.695656] BTRFS error (device dm-0): extent buffer 18905857064960 is beyond 32bit page cache limit
    [114023.695669] BTRFS error (device dm-0): reached 32bit limit for logical addresses
    [114023.695677] BTRFS error (device dm-0): due to page cache limit on 32bit systems, metadata beyond 16T can't be accessed
    [114023.695686] BTRFS error (device dm-0): please consider upgrading to 64bit kernel/hardware
    [114023.695699] ------------[ cut here ]------------
    [114023.695706] WARNING: CPU: 4 PID: 30371 at fs/btrfs/space-info.h:208 btrfs_space_info_update_bytes_may_use+0x9c/0x1c0 [btrfs]
    [114023.695813] Modules linked in: aes_arm_bs crypto_simd dm_crypt cpufreq_conservative cpufreq_userspace cpufreq_powersave sunrpc zram zsmalloc binfmt_misc evdev sg nfnetlink ip_tables x_tables ipv6 autofs4 btrfs blake2b_generic xor xor_neon raid6_pq libcrc32c sd_mod t10_pi crc64_rocksoft uas usb_storage scsi_mod scsi_common gpio_keys
    [114023.695959] CPU: 4 PID: 30371 Comm: btrfs Tainted: G W 6.6.122-current-odroidxu4 #1
    [114023.695969] Hardware name: Samsung Exynos (Flattened Device Tree)
    [114023.695980] unwind_backtrace from show_stack+0x10/0x14
    [114023.696000] show_stack from dump_stack_lvl+0x40/0x4c
    [114023.696014] dump_stack_lvl from __warn+0x78/0x154
    [114023.696025] __warn from warn_slowpath_fmt+0x1b4/0x1bc
    [114023.696035] warn_slowpath_fmt from
    btrfs_space_info_update_bytes_may_use+0x9c/0x1c0 [btrfs]
    [114023.696127] btrfs_space_info_update_bytes_may_use [btrfs] from btrfs_block_rsv_release+0x1f4/0x2f4 [btrfs]
    [114023.696300] btrfs_block_rsv_release [btrfs] from btrfs_alloc_tree_block+0x118/0x6b4 [btrfs]
    [114023.696465] btrfs_alloc_tree_block [btrfs] from btrfs_force_cow_block+0x148/0xa60 [btrfs]
    [114023.696628] btrfs_force_cow_block [btrfs] from btrfs_cow_block+0xe4/0x2bc [btrfs]
    [114023.696792] btrfs_cow_block [btrfs] from btrfs_search_slot+0x6dc/0xc08 [btrfs]
    [114023.696957] btrfs_search_slot [btrfs] from btrfs_update_device+0xa4/0x248 [btrfs]
    [114023.697122] btrfs_update_device [btrfs] from btrfs_chunk_alloc_add_chunk_item+0xcc/0x638 [btrfs]
    [114023.697287] btrfs_chunk_alloc_add_chunk_item [btrfs] from reserve_chunk_space+0xec/0x180 [btrfs]
    [114023.697451] reserve_chunk_space [btrfs] from check_system_chunk+0x6c/0x74 [btrfs]
    [114023.697615] check_system_chunk [btrfs] from btrfs_inc_block_group_ro+0x228/0x234 [btrfs]
    [114023.697780] btrfs_inc_block_group_ro [btrfs] from btrfs_relocate_block_group+0x94/0x48c [btrfs]
    [114023.697945] btrfs_relocate_block_group [btrfs] from btrfs_relocate_chunk+0x3c/0x180 [btrfs]
    [114023.698108] btrfs_relocate_chunk [btrfs] from btrfs_balance+0x864/0x13b8 [btrfs]
    [114023.698272] btrfs_balance [btrfs] from btrfs_ioctl+0x248c/0x28c4 [btrfs]
    [114023.698436] btrfs_ioctl [btrfs] from sys_ioctl+0x288/0xc60
    [114023.698533] sys_ioctl from ret_fast_syscall+0x0/0x54
    [114023.698547] Exception stack(0xf2575fa8 to 0xf2575ff0)
    [114023.698556] 5fa0: 00000000 00000003 00000003 c4009420 bebe7fd0 bebe7f70
    [114023.698565] 5fc0: 00000000 00000003 bebe7fd0 00000036 005bfb78 bebe86f4 00000001 0058b190
    [114023.698572] 5fe0: 00000036 bebe7f58 b6d718b1 b6cdf736
    [114023.698580] ---[ end trace 0000000000000000 ]---
    [114023.698802] BTRFS error (device dm-0): extent buffer 18905857064960 is beyond 32bit page cache limit
    [114023.698821] ------------[ cut here ]------------
    [114023.698827] WARNING: CPU:
    4 PID: 30371 at fs/btrfs/block-group.c:2783 btrfs_create_pending_block_groups+0x674/0x67c [btrfs]
    [114023.698925] BTRFS: Transaction aborted (error -75)
    [114064.918054] Modules linked in: aes_arm_bs crypto_simd dm_crypt cpufreq_conservative cpufreq_userspace cpufreq_powersave sunrpc zram zsmalloc binfmt_misc evdev sg nfnetlink ip_tables x_tables ipv6 autofs4 btrfs blake2b_generic xor xor_neon raid6_pq libcrc32c sd_mod t10_pi crc64_rocksoft uas usb_storage scsi_mod scsi_common gpio_keys
    [114064.918234] CPU: 5 PID: 30410 Comm: btrfs Tainted: G W 6.6.122-current-odroidxu4 #1
    [114064.918245] Hardware name: Samsung Exynos (Flattened Device Tree)
    [114064.918256] unwind_backtrace from show_stack+0x10/0x14
    [114064.918272] show_stack from dump_stack_lvl+0x40/0x4c
    [114064.918284] dump_stack_lvl from __warn+0x78/0x154
    [114064.918296] __warn from warn_slowpath_fmt+0x130/0x1bc
    [114064.918307] warn_slowpath_fmt from btrfs_create_pending_block_groups+0x674/0x67c [btrfs]
    [114064.918399] btrfs_create_pending_block_groups [btrfs] from __btrfs_end_transaction+0x38/0x2a0 [btrfs]
    [114064.918567] __btrfs_end_transaction [btrfs] from btrfs_inc_block_group_ro+0x1f0/0x234 [btrfs]
    [114064.918733] btrfs_inc_block_group_ro [btrfs] from btrfs_relocate_block_group+0x94/0x48c [btrfs]
    [114064.918899] btrfs_relocate_block_group [btrfs] from btrfs_relocate_chunk+0x3c/0x180 [btrfs]
    [114064.919065] btrfs_relocate_chunk [btrfs] from btrfs_balance+0x864/0x13b8 [btrfs]
    [114064.919231] btrfs_balance [btrfs] from btrfs_ioctl+0x248c/0x28c4 [btrfs]
    [114064.919397] btrfs_ioctl [btrfs] from sys_ioctl+0x288/0xc60
    [114064.919491] sys_ioctl from ret_fast_syscall+0x0/0x54
    [114064.919505] Exception stack(0xf26b9fa8 to 0xf26b9ff0)
    [114064.919516] 9fa0: 00000000 00000003 00000003 c4009420 beae9fc0 beae9f60
    [114064.919527] 9fc0: 00000000 00000003 beae9fc0 00000036 0057fb78 beaea6f1 00000001 0054b190
    [114064.919537] 9fe0: 00000036 beae9f48 b6d118b1 b6c7f736
    [114064.919614] ---[ end trace 0000000000000000 ]---
    [114064.919632] BTRFS: error (device dm-0: state A) in btrfs_create_pending_block_groups:2783: errno=-75 unknown
    [114064.919647] BTRFS info (device dm-0: state EA): forced readonly
    [114064.920025] ------------[ cut here ]------------
    [114064.920039] WARNING: CPU: 5 PID: 30410 at fs/btrfs/space-info.h:208 btrfs_space_info_update_bytes_may_use+0x9c/0x1c0 [btrfs]
    [114064.920182] Modules linked in: aes_arm_bs crypto_simd dm_crypt cpufreq_conservative cpufreq_userspace cpufreq_powersave sunrpc zram zsmalloc binfmt_misc evdev sg nfnetlink ip_tables x_tables ipv6 autofs4 btrfs blake2b_generic xor xor_neon raid6_pq libcrc32c sd_mod t10_pi crc64_rocksoft uas usb_storage scsi_mod scsi_common gpio_keys
    [114064.920385] CPU: 5 PID: 30410 Comm: btrfs Tainted: G W 6.6.122-current-odroidxu4 #1
    [114064.920398] Hardware name: Samsung Exynos (Flattened Device Tree)
    [114064.920409] unwind_backtrace from show_stack+0x10/0x14
    [114064.920426] show_stack from dump_stack_lvl+0x40/0x4c
    [114064.920441] dump_stack_lvl from __warn+0x78/0x154
    [114064.920454] __warn from warn_slowpath_fmt+0x1b4/0x1bc
    [114064.920466] warn_slowpath_fmt from btrfs_space_info_update_bytes_may_use+0x9c/0x1c0 [btrfs]
    [114064.920559] btrfs_space_info_update_bytes_may_use [btrfs] from btrfs_block_rsv_release+0x1f4/0x2f4 [btrfs]
    [114064.920729] btrfs_block_rsv_release [btrfs] from btrfs_trans_release_chunk_metadata+0x2c/0x40 [btrfs]
    [114064.920900] btrfs_trans_release_chunk_metadata [btrfs] from __btrfs_end_transaction+0x38/0x2a0 [btrfs]
    [114064.921070] __btrfs_end_transaction [btrfs] from btrfs_inc_block_group_ro+0x1f0/0x234 [btrfs]
    [114064.921239] btrfs_inc_block_group_ro [btrfs] from btrfs_relocate_block_group+0x94/0x48c [btrfs]
    [114064.921406] btrfs_relocate_block_group [btrfs] from btrfs_relocate_chunk+0x3c/0x180 [btrfs]
    [114064.921572] btrfs_relocate_chunk [btrfs] from btrfs_balance+0x864/0x13b8 [btrfs]
    [114064.921741] btrfs_balance [btrfs] from btrfs_ioctl+0x248c/0x28c4 [btrfs]
    [114064.921913] btrfs_ioctl [btrfs]
    from sys_ioctl+0x288/0xc60
    [114064.922009] sys_ioctl from ret_fast_syscall+0x0/0x54
    [114064.922024] Exception stack(0xf26b9fa8 to 0xf26b9ff0)
    [114064.922036] 9fa0: 00000000 00000003 00000003 c4009420 beae9fc0 beae9f60
    [114064.922046] 9fc0: 00000000 00000003 beae9fc0 00000036 0057fb78 beaea6f1 00000001 0054b190
    [114064.922057] 9fe0: 00000036 beae9f48 b6d118b1 b6c7f736
    [114064.922067] ---[ end trace 0000000000000000 ]---
    [114064.922111] BTRFS info (device dm-0: state EA): balance: ended with status: -30

I tried balancing data, starting with low usage limits and moving up to no limit. That all went smoothly. The problem still occurred when trying to balance metadata, but the extent buffer number named in the dmesg errors changed. I do not know whether that is significant or whether balancing data should be expected to be of any benefit in this situation. I did not keep full context from dmesg, but these are the commands issued:

    btrfs balance start -dusage=1 /backup
    btrfs balance start -dusage=10 /backup
    btrfs balance start -dusage=50 /backup
    btrfs balance start -dusage=75 /backup
    btrfs balance start -dusage=85 /backup
    btrfs balance start -dusage=95 /backup
    btrfs balance start -d /backup

Most of those were pretty fast. The last one took several hours.

I also checked my SSD for SMART errors, running a short self test and then a long one. No errors were reported.

I do not know whether this is an architecture-specific bug. It may be specific to 32 bit systems, or to ARM 32 bit, or it may not.

I do not know whether this is an upstream bug, but it seems likely. The behavior in question seems to be related to the btrfs kernel component.

The recommended fix is to attach the SSD to a 64 bit system to complete any operations requiring addresses that cannot be handled by the 32 bit system. I have not done that yet, and I will hold off for a while, as it seems like that will only be a temporary fix.
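
As a sanity check on the 16T reading, the arithmetic behind the error can be sketched as below. This is only a minimal sketch, assuming 4 KiB pages and a 32-bit page cache index; neither value is stated in the log, and the kernel's exact limit constant may differ slightly. The names `PAGE_SIZE`, `INDEX_BITS`, `LIMIT`, `extent_buffer_addresses`, and `beyond_limit` are made up for illustration and are not btrfs identifiers.

```python
import re

PAGE_SIZE = 4096   # assumption: 4 KiB pages on this armhf kernel
INDEX_BITS = 32    # assumption: the page cache index is a 32-bit value

# Approximate number of bytes addressable with a 32-bit page index.
LIMIT = (1 << INDEX_BITS) * PAGE_SIZE


def extent_buffer_addresses(log_text):
    """Return the addresses named in 'extent buffer N is beyond' messages."""
    return [int(n) for n in re.findall(r"extent buffer (\d+) is beyond", log_text)]


def beyond_limit(address, limit=LIMIT):
    """True if the address is at or past the assumed 32-bit limit."""
    return address >= limit


# One line copied from the dmesg excerpt in this post.
sample = (
    "[114023.695656] BTRFS error (device dm-0): extent buffer "
    "18905857064960 is beyond 32bit page cache limit"
)

for addr in extent_buffer_addresses(sample):
    status = "beyond limit" if beyond_limit(addr) else "within limit"
    print(f"{addr} bytes = {addr / 2**40:.2f} TiB ({status})")
```

Under those assumptions the limit works out to 2^44 bytes (16 TiB), and the address in the log is roughly 17.2 TiB, which is consistent with the error text. It only checks the arithmetic; it does not explain why an address that high exists on a 1TB disk.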
If I can gather any more relevant information while the system is still in this state, please let me know.

-- System Information:
Debian Release: 13.4
APT prefers stable-updates
APT policy: (500, 'stable-updates'), (500, 'stable-security'), (500, 'stable')
Architecture: armhf (armv7l)
Kernel: Linux 6.6.122-current-odroidxu4 (SMP w/8 CPU threads; PREEMPT)
Kernel taint flags: TAINT_WARN
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8) (ignored: LC_ALL set to en_US.UTF-8), LANGUAGE=en_US.UTF-8
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)

-- no debconf information