RockBian

  • Posts

    15

  1. # smartctl -a /dev/sdb
     smartctl 6.6 2017-11-05 r4594 [aarch64-linux-5.10.35-rockchip64] (local build)
     Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

     === START OF INFORMATION SECTION ===
     Device Model:     TOSHIBA HDWD240
     Serial Number:    Z9J1S0I9S5HH
     LU WWN Device Id: 5 000039 9b560c2c2
     Firmware Version: KQ000A
     User Capacity:    4,000,787,030,016 bytes [4.00 TB]
     Sector Sizes:     512 bytes logical, 4096 bytes physical
     Rotation Rate:    5400 rpm
     Form Factor:      3.5 inches
     Device is:        Not in smartctl database [for details use: -P showall]
     ATA Version is:   ACS-3 T13/2161-D revision 5
     SATA Version is:  SATA 3.3, 6.0 Gb/s (current: 6.0 Gb/s)
     Local Time is:    Mon Jul 26 22:46:35 2021 CEST
     SMART support is: Available - device has SMART capability.
     SMART support is: Enabled

     === START OF READ SMART DATA SECTION ===
     SMART overall-health self-assessment test result: PASSED

     General SMART Values:
     Offline data collection status:  (0x00) Offline data collection activity
                                             was never started.
                                             Auto Offline Data Collection: Disabled.
     Self-test execution status:      (   0) The previous self-test routine completed
                                             without error or no self-test has ever
                                             been run.
     Total time to complete Offline
     data collection:                 ( 120) seconds.
     Offline data collection
     capabilities:                    (0x5b) SMART execute Offline immediate.
                                             Auto Offline data collection on/off support.
                                             Suspend Offline collection upon new
                                             command.
                                             Offline surface scan supported.
                                             Self-test supported.
                                             No Conveyance Self-test supported.
                                             Selective Self-test supported.
     SMART capabilities:            (0x0003) Saves SMART data before entering
                                             power-saving mode.
                                             Supports SMART auto save timer.
     Error logging capability:        (0x01) Error logging supported.
                                             General Purpose Logging supported.
     Short self-test routine
     recommended polling time:        (   2) minutes.
     Extended self-test routine
     recommended polling time:        ( 502) minutes.
     SCT capabilities:              (0x003d) SCT Status supported.
                                             SCT Error Recovery Control supported.
                                             SCT Feature Control supported.
                                             SCT Data Table supported.

     SMART Attributes Data Structure revision number: 16
     Vendor Specific SMART Attributes with Thresholds:
     ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
       1 Raw_Read_Error_Rate     0x000b   100   100   050    Pre-fail  Always       -       0
       2 Throughput_Performance  0x0005   100   100   050    Pre-fail  Offline      -       0
       3 Spin_Up_Time            0x0027   100   100   001    Pre-fail  Always       -       8060
       4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       62
       5 Reallocated_Sector_Ct   0x0033   100   100   050    Pre-fail  Always       -       0
       7 Seek_Error_Rate         0x000b   100   100   050    Pre-fail  Always       -       0
       8 Seek_Time_Performance   0x0005   100   100   050    Pre-fail  Offline      -       0
       9 Power_On_Hours          0x0032   078   078   000    Old_age   Always       -       8809
      10 Spin_Retry_Count        0x0033   101   100   030    Pre-fail  Always       -       0
      12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       10
     191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       1
     192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       2
     193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       244
     194 Temperature_Celsius     0x0022   100   100   000    Old_age   Always       -       18 (Min/Max 12/31)
     196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
     197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       0
     198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       0
     199 UDMA_CRC_Error_Count    0x0032   200   253   000    Old_age   Always       -       0
     220 Disk_Shift              0x0002   100   100   000    Old_age   Always       -       0
     222 Loaded_Hours            0x0032   099   099   000    Old_age   Always       -       562
     223 Load_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
     224 Load_Friction           0x0022   100   100   000    Old_age   Always       -       0
     226 Load-in_Time            0x0026   100   100   000    Old_age   Always       -       803
     240 Head_Flying_Hours       0x0001   100   100   001    Pre-fail  Offline      -       0

     SMART Error Log Version: 1
     ATA Error Count: 3
             CR = Command Register [HEX]
             FR = Features Register [HEX]
             SC = Sector Count Register [HEX]
             SN = Sector Number Register [HEX]
             CL = Cylinder Low Register [HEX]
             CH = Cylinder High Register [HEX]
             DH = Device/Head Register [HEX]
             DC = Device Command Register [HEX]
             ER = Error register [HEX]
             ST = Status register [HEX]
     Powered_Up_Time is measured from power on, and printed as
     DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
     SS=sec, and sss=millisec. It "wraps" after 49.710 days.

     Error 3 occurred at disk power-on lifetime: 8764 hours (365 days + 4 hours)
       When the command that caused the error occurred, the device was active or idle.

       After command completion occurred, registers were:
       ER ST SC SN CL CH DH
       -- -- -- -- -- -- --
       40 41 98 28 0e c0 40  Error: UNC at LBA = 0x00c00e28 = 12586536

       Commands leading to the command that caused the error were:
       CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
       -- -- -- -- -- -- -- --  ----------------  --------------------
       60 28 98 10 0e c0 40 00  22d+20:33:25.353  READ FPDMA QUEUED
       ef 10 02 00 00 00 a0 00  22d+20:33:25.353  SET FEATURES [Enable SATA feature]
       27 00 00 00 00 00 e0 00  22d+20:33:25.352  READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
       ec 00 00 00 00 00 a0 00  22d+20:33:25.352  IDENTIFY DEVICE
       ef 03 45 00 00 00 a0 00  22d+20:33:25.351  SET FEATURES [Set transfer mode]

     Error 2 occurred at disk power-on lifetime: 8764 hours (365 days + 4 hours)
       When the command that caused the error occurred, the device was active or idle.

       After command completion occurred, registers were:
       ER ST SC SN CL CH DH
       -- -- -- -- -- -- --
       40 41 28 08 0e c0 40  Error: UNC at LBA = 0x00c00e08 = 12586504

       Commands leading to the command that caused the error were:
       CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
       -- -- -- -- -- -- -- --  ----------------  --------------------
       60 30 28 08 0e c0 40 00  22d+20:33:22.242  READ FPDMA QUEUED
       60 08 20 60 0d c1 40 00  22d+20:33:22.241  READ FPDMA QUEUED
       60 08 18 78 0d c1 40 00  22d+20:33:22.240  READ FPDMA QUEUED
       60 08 10 70 0d c1 40 00  22d+20:33:22.239  READ FPDMA QUEUED
       60 08 08 68 0d c1 40 00  22d+20:33:22.238  READ FPDMA QUEUED

     Error 1 occurred at disk power-on lifetime: 8764 hours (365 days + 4 hours)
       When the command that caused the error occurred, the device was active or idle.

       After command completion occurred, registers were:
       ER ST SC SN CL CH DH
       -- -- -- -- -- -- --
       40 41 28 00 0d c0 40  Error: UNC at LBA = 0x00c00d00 = 12586240

       Commands leading to the command that caused the error were:
       CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
       -- -- -- -- -- -- -- --  ----------------  --------------------
       60 08 28 00 0d c0 40 00  22d+20:32:39.403  READ FPDMA QUEUED
       60 08 20 80 08 c0 40 00  22d+20:32:39.292  READ FPDMA QUEUED
       60 08 18 10 17 01 40 00  22d+20:32:39.291  READ FPDMA QUEUED
       60 08 10 08 17 01 40 00  22d+20:32:39.290  READ FPDMA QUEUED
       60 08 08 f8 16 01 40 00  22d+20:32:39.289  READ FPDMA QUEUED

     SMART Self-test log structure revision number 1
     No self-tests have been logged.  [To run self-tests, use: smartctl -t]

     SMART Selective self-test log data structure revision number 1
      SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
         1        0        0  Not_testing
         2        0        0  Not_testing
         3        0        0  Not_testing
         4        0        0  Not_testing
         5        0        0  Not_testing
     Selective self-test flags (0x0):
       After scanning selected spans, do NOT read-scan remainder of disk.
     If Selective self-test is pending on power-up, resume after 0 minute delay.
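The LBA that smartctl prints next to each error can be rebuilt from the register dump. Below is a minimal Python sketch, assuming this summary error log uses the legacy 28-bit layout (SN, CL and CH hold LBA bits 0-23, the low nibble of DH holds bits 24-27). The function name `lba28_from_registers` is made up for illustration and is not part of smartmontools.

```python
def lba28_from_registers(sn: int, cl: int, ch: int, dh: int) -> int:
    """Rebuild a 28-bit LBA from the SN/CL/CH/DH registers of one error entry.

    Assumption: legacy 28-bit layout, low nibble of DH = LBA bits 24-27.
    """
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


# Register dumps (ER ST SC SN CL CH DH) copied from the three error entries.
error_registers = {
    3: "40 41 98 28 0e c0 40",
    2: "40 41 28 08 0e c0 40",
    1: "40 41 28 00 0d c0 40",
}

decoded = {}
for number, dump in error_registers.items():
    er, st, sc, sn, cl, ch, dh = (int(value, 16) for value in dump.split())
    decoded[number] = lba28_from_registers(sn, cl, ch, dh)
    print(f"Error {number}: LBA = 0x{decoded[number]:08x} = {decoded[number]}")
```

These values are much smaller than the sector number in the kernel log for the same disk (2210401536 = 0x83c00d00), and the low 24 bits of that sector match Error 1. That suggests this log only carries the low bits of the address; `smartctl -l xerror` reads the extended error log, which can hold 48-bit LBAs.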
  2. The problem begins with a READ FPDMA QUEUED error and ends with a read-only filesystem due to I/O errors. The result may be different (maybe because I have no raid/zfs), but the source looks the same to me. It worked for 8 months without problems, and now it works fine in slot 3 (where slot 1 is the farthest from the mobo).

     And solder it onto the power connectors of the new cables? Hm. I had hoped for a simpler solution.
  3. @meymarce: How did you do the power? Don't know if the cable on Ali is still the same, for me it's this one:

     My box had been going strong since November, but this morning I got a mail that a backup had failed because the target filesystem was read-only. A look at dmesg showed it was the dreaded READ FPDMA QUEUED:

     [Sun Jul 25 00:57:14 2021] EXT4-fs (sdb1): mounted filesystem with ordered data mode. Opts: errors=remount-ro
     [Sun Jul 25 00:58:55 2021] ata4.00: exception Emask 0x0 SAct 0x20 SErr 0x0 action 0x0
     [Sun Jul 25 00:58:55 2021] ata4.00: irq_stat 0x40000000
     [Sun Jul 25 00:58:55 2021] ata4.00: failed command: READ FPDMA QUEUED
     [Sun Jul 25 00:58:55 2021] ata4.00: cmd 60/08:28:00:0d:c0/01:00:83:00:00/40 tag 5 ncq dma 135168 in
                                         res 43/40:08:00:0d:c0/00:01:83:00:00/40 Emask 0x409 (media error) <F>
     [Sun Jul 25 00:58:55 2021] ata4.00: status: { DRDY SENSE ERR }
     [Sun Jul 25 00:58:55 2021] ata4.00: error: { UNC }
     [Sun Jul 25 00:58:55 2021] ata4.00: configured for UDMA/100
     [Sun Jul 25 00:58:55 2021] sd 3:0:0:0: [sdb] tag#5 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08 cmd_age=3s
     [Sun Jul 25 00:58:55 2021] sd 3:0:0:0: [sdb] tag#5 Sense Key : 0x3 [current]
     [Sun Jul 25 00:58:55 2021] sd 3:0:0:0: [sdb] tag#5 ASC=0x11 ASCQ=0x4
     [Sun Jul 25 00:58:55 2021] sd 3:0:0:0: [sdb] tag#5 CDB: opcode=0x88 88 00 00 00 00 00 83 c0 0d 00 00 00 01 08 00 00
     [Sun Jul 25 00:58:55 2021] blk_update_request: I/O error, dev sdb, sector 2210401536 op 0x0:(READ) flags 0x80700 phys_seg 33 prio class 0
     <snip>
     [Sun Jul 25 00:59:42 2021] Aborting journal on device sdb1-8.
     [Sun Jul 25 00:59:42 2021] EXT4-fs (sdb1): Remounting filesystem read-only
     [Sun Jul 25 00:59:42 2021] EXT4-fs error (device sdb1) in ext4_reserve_inode_write:5751: IO failure
     [Sun Jul 25 00:59:42 2021] EXT4-fs error (device sdb1): ext4_ext_tree_init:828: inode #69077613: comm rsnapshot: mark_inode_dirty error
     [Sun Jul 25 00:59:44 2021] EXT4-fs error (device sdb1) in ext4_reserve_inode_write:5751: Journal has aborted
     [Sun Jul 25 00:59:44 2021] EXT4-fs error (device sdb1): __ext4_new_inode:1335: inode #69077613: comm rsnapshot: mark_inode_dirty error
     [Sun Jul 25 00:59:44 2021] EXT4-fs error (device sdb1) in __ext4_new_inode:1337: Journal has aborted
     [Sun Jul 25 00:59:44 2021] EXT4-fs error (device sdb1) in ext4_evict_inode:255: Journal has aborted
     [Sun Jul 25 00:58:55 2021] ata4: EH complete

     After repairing the filesystem I retried and got the same result. Then I hotplugged the disk from slot 4 to slot 3, and the backup succeeded without problems.

     I am running Armbian 21.05.1 Buster with Linux 5.10.35-rockchip64, from a SATA disk in the M.2 slot, and I had two 4TB disks in slots 4 and 5. The disk in slot 4 is normally not mounted; once a week the backup script mounts it and unmounts it again when the backup is done. Both disks have a simple ext4 filesystem without raid.
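The `cmd 60/08:28:...` field in the dmesg output already contains the failing address and transfer size. Here is a rough Python sketch of how it can be decoded, assuming the field order is command/feature:count:lbal:lbam:lbah/hob_feature:hob_count:hob_lbal:hob_lbam:hob_lbah/device, and that for an NCQ read the sector count travels in the feature bytes and the tag in the upper bits of the count byte. The function name `decode_libata_cmd` is hypothetical.

```python
def decode_libata_cmd(field: str) -> dict:
    """Decode the hex taskfile printed after 'cmd ' in a libata error message.

    Assumed layout:
    command/feature:count:lbal:lbam:lbah/hob_feature:hob_count:hob_lbal:hob_lbam:hob_lbah/device
    """
    command, low, high, device = field.split("/")
    feature, count, lbal, lbam, lbah = (int(x, 16) for x in low.split(":"))
    hob_feature, hob_count, hob_lbal, hob_lbam, hob_lbah = (
        int(x, 16) for x in high.split(":")
    )
    # 48-bit LBA: the "hob" (high order byte) registers supply bits 24-47.
    lba = (
        (hob_lbah << 40) | (hob_lbam << 32) | (hob_lbal << 24)
        | (lbah << 16) | (lbam << 8) | lbal
    )
    return {
        "command": int(command, 16),
        "lba": lba,
        # NCQ assumption: sector count is carried in the feature bytes.
        "sectors": (hob_feature << 8) | feature,
        # NCQ assumption: tag is carried in bits 7:3 of the count byte.
        "tag": count >> 3,
        "device": int(device, 16),
    }


decoded_cmd = decode_libata_cmd("60/08:28:00:0d:c0/01:00:83:00:00/40")
print(decoded_cmd["lba"], decoded_cmd["sectors"] * 512, decoded_cmd["tag"])
```

With the line from the log this yields LBA 2210401536, 135168 bytes and tag 5, which agrees with the `sector 2210401536`, `ncq dma 135168` and `tag 5` values that the kernel printed itself.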
  4. Have a look here: https://www.linuxquestions.org/questions/linux-general-1/raid-arrays-not-assembling-4175662774/
  5. I'm not sure, but maybe this will give more info: https://forum.armbian.com/topic/16185-helios64-freeze-whatever-the-kernel-is/?do=findComment&comment=121024
  6. @tommitytom: My system is solid too. Running kernel 5.10.21-rockchip64 on Buster. No ZFS, no raid, no OMV. Booting from an M.2 SATA disk. Since I got the box in November it has only rebooted for kernel updates.
  7. The problem is not that it's slow, but that it doesn't work at all. Or did you mean speeding up the solution?
  8. Today I updated and upgraded my system, and after a reboot 'apt list' no longer worked. It shows 'Listing... done', and then uses 100% CPU on a single core without producing anything. According to strace it is looping on reading a file:

     openat(AT_FDCWD, "/var/lib/apt/lists/deb.debian.org_debian_dists_buster_main_binary-arm64_Packages.lz4", O_RDONLY) = 28
     read(28, "\4\"M\30@@\300\206z\0\0\370\377\31Package: 0ad-data\n"..., 65536) = 65536
     read(28, "ta$E\17p\2+\377\21c376d3f140540d9424753d"..., 65536) = 65536
     read(28, "a5180e3e2d3f503ddcb5dd7b93\201\2\2\275\0\17"..., 65536) = 65536
     read(28, "\2\3700071880bc23623a4ea3664b773a1b6"..., 65536) = 65536
     read(28, "\f1'da?\2Jdan_\207\1\v=\2f215438\272\t\365\0213e4c"..., 65536) = 65536
     read(28, "a6f6005b836871a9cc032472bbece3b7"..., 65536) = 65536
     read(28, "3783a1460d749c6f887843fdd95ec981"..., 65536) = 65536
     read(28, "\4\0032 - \16\1\0022\32\t?\27\177foreign\270\0041\370\21f1e69"..., 65536) = 65536
     read(28, "dR\203\1\6\4\6T\37\3\324\0\",\ng_zplaying\2759\vt\203\1\233"..., 65536) = 65536
     read(28, "\20u://ftp.\24'_nu/bc\2\3\0\363\20b8da7e3f11"..., 65536) = 65536
     close(28) = 0
     openat(AT_FDCWD, "/var/lib/apt/lists/deb.debian.org_debian_dists_buster_main_binary-arm64_Packages.lz4", O_RDONLY) = 28
     read(28, "\4\"M\30@@\300\206z\0\0\370\377\31Package: 0ad-data\n"..., 65536) = 65536
     read(28, "ta$E\17p\2+\377\21c376d3f140540d9424753d"..., 65536) = 65536
     read(28, "a5180e3e2d3f503ddcb5dd7b93\201\2\2\275\0\17"..., 65536) = 65536
     read(28, "\2\3700071880bc23623a4ea3664b773a1b6"..., 65536) = 65536
     read(28, "\f1'da?\2Jdan_\207\1\v=\2f215438\272\t\365\0213e4c"..., 65536) = 65536
     read(28, "a6f6005b836871a9cc032472bbece3b7"..., 65536) = 65536
     read(28, "3783a1460d749c6f887843fdd95ec981"..., 65536) = 65536
     read(28, "\4\0032 - \16\1\0022\32\t?\27\177foreign\270\0041\370\21f1e69"..., 65536) = 65536
     read(28, "dR\203\1\6\4\6T\37\3\324\0\",\ng_zplaying\2759\vt\203\1\233"..., 65536) = 65536
     read(28, "\20u://ftp.\24'_nu/bc\2\3\0\363\20b8da7e3f11"..., 65536) = 65536
     close(28) = 0
     openat(AT_FDCWD, "/var/lib/apt/lists/deb.debian.org_debian_dists_buster_main_binary-arm64_Packages.lz4", O_RDONLY) = 28
     read(28, "\4\"M\30@@\300\206z\0\0\370\377\31Package: 0ad-data\n"..., 65536) = 65536
     read(28, "ta$E\17p\2+\377\21c376d3f140540d9424753d"..., 65536) = 65536
     read(28, "a5180e3e2d3f503ddcb5dd7b93\201\2\2\275\0\17"..., 65536) = 65536
     read(28, "\2\3700071880bc23623a4ea3664b773a1b6"..., 65536) = 65536
     read(28, "\f1'da?\2Jdan_\207\1\v=\2f215438\272\t\365\0213e4c"..., 65536) = 65536
     read(28, "a6f6005b836871a9cc032472bbece3b7"..., 65536) = 65536
     read(28, "3783a1460d749c6f887843fdd95ec981"..., 65536) = 65536
     read(28, "\4\0032 - \16\1\0022\32\t?\27\177foreign\270\0041\370\21f1e69"..., 65536) = 65536
     read(28, "dR\203\1\6\4\6T\37\3\324\0\",\ng_zplaying\2759\vt\203\1\233"..., 65536) = 65536
     read(28, "\20u://ftp.\24'_nu/bc\2\3\0\363\20b8da7e3f11"..., 65536) = 65536
     close(28) = 0

     This repeats over and over. I checked the file /var/lib/apt/lists/deb.debian.org_debian_dists_buster_main_binary-arm64_Packages.lz4, and it seems valid, in that lz4 decompresses it without complaints. When I put the decompressed /var/lib/apt/lists/deb.debian.org_debian_dists_buster_main_binary-arm64_Packages in the same directory, 'apt list' works.

     What is going on? It might be size related. The file is the biggest *_Packages.lz4 file in that directory. There are bigger lz4 files, but according to strace they are not accessed. The timestamp of deb.debian.org_debian_dists_buster_main_binary-arm64_Packages.lz4 is Dec 5 11:03, so I don't think it grew during the last update.

     The packages installed and upgraded were:

     Start-Date: 2021-01-17 14:07:30
     Commandline: apt upgrade
     Requested-By: rockbian (1000)
     Install: javascript-common:arm64 (11, automatic), php-symfony-yaml:arm64 (3.4.22+dfsg-2+deb10u1, automatic), libjs-codemirror:arm64 (5.43.0-1+deb10u1, automatic), libjs-popper.js:arm64 (1.14.6+ds2-1, automatic), libjs-jquery-mousewheel:arm64 (1:3.1.13-2, automatic), php-symfony-filesystem:arm64 (3.4.22+dfsg-2+deb10u1, automatic), libjs-jquery-timepicker:arm64 (1.2-1, automatic), php-symfony-dependency-injection:arm64 (3.4.22+dfsg-2+deb10u1, automatic), libjs-jquery-ui:arm64 (1.12.1+dfsg-5, automatic), libjs-bootstrap4:arm64 (4.3.1+dfsg2-1, automatic), php-symfony-config:arm64 (3.4.22+dfsg-2+deb10u1, automatic), php-twig-i18n-extension:arm64 (3.0.0-2~bpo10+1, automatic)
     Upgrade: nodejs:arm64 (10.21.0~dfsg-1~deb10u1, 10.23.1~dfsg-1~deb10u1), python-apt-common:arm64 (1.8.4.2, 1.8.4.3), phpmyadmin:arm64 (4:4.9.7+dfsg1-1~bpo10+1, 4:5.0.4+dfsg2-1~bpo10+1), php-google-recaptcha:arm64 (1.2.4-1~bpo10+1, 1.2.4-3~bpo10+1), linux-buster-root-current-helios64:arm64 (20.11.4, 20.11.6), armbian-config:arm64 (20.11.3, 20.11.6), libnode64:arm64 (10.21.0~dfsg-1~deb10u1, 10.23.1~dfsg-1~deb10u1), libp11-kit0:arm64 (0.23.15-2, 0.23.15-2+deb10u1), php-mariadb-mysql-kbs:arm64 (1.2.11-1~bpo10+1, 1.2.12-1~bpo10+1), python3-apt:arm64 (1.8.4.2, 1.8.4.3), tzdata:arm64 (2020d-0+deb10u1, 2020e-0+deb10u1), linux-u-boot-helios64-current:arm64 (20.11.4, 20.11.6)
     End-Date: 2021-01-17 14:08:10
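As a small cross-check of the "file seems valid" observation, the first bytes that strace shows for each `read()` cycle can be compared against the LZ4 frame header. This is only a sketch in Python, assuming the standard LZ4 frame layout (4-byte little-endian magic, then the FLG and BD bytes); the helper name `describe_lz4_header` is invented for illustration, and the check says nothing about why apt loops.

```python
import struct

LZ4_FRAME_MAGIC = 0x184D2204  # stored little-endian as 04 22 4D 18
BLOCK_MAX_SIZES = {4: "64 KB", 5: "256 KB", 6: "1 MB", 7: "4 MB"}


def describe_lz4_header(header: bytes) -> dict:
    """Interpret the magic number, FLG and BD bytes of an LZ4 frame header."""
    magic, flg, bd = struct.unpack_from("<IBB", header)
    return {
        "is_lz4_frame": magic == LZ4_FRAME_MAGIC,
        "version": flg >> 6,  # top two bits of FLG
        "block_max_size": BLOCK_MAX_SIZES.get((bd >> 4) & 0x7, "unknown"),
    }


# First seven bytes of the first read(), written with the same octal
# escapes that strace printed: \4 \" M \30 @ @ \300
first_bytes = b'\4"M\30@@\300'
header_info = describe_lz4_header(first_bytes)
print(header_info)
```

For these bytes the magic number matches, the version field is 1 and the block maximum size field decodes to 64 KB, so at least the start of the file looks like a regular LZ4 frame.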
  9. $ cat /etc/apt/sources.list
     deb http://deb.debian.org/debian buster main contrib non-free
     #deb-src http://deb.debian.org/debian buster main contrib non-free

     deb http://deb.debian.org/debian buster-updates main contrib non-free
     #deb-src http://deb.debian.org/debian buster-updates main contrib non-free

     deb http://deb.debian.org/debian buster-backports main contrib non-free
     #deb-src http://deb.debian.org/debian buster-backports main contrib non-free

     deb http://security.debian.org/ buster/updates main contrib non-free
     #deb-src http://security.debian.org/ buster/updates main contrib non-free

     $ cat /etc/apt/sources.list.d/armbian.list
     deb http://apt.armbian.com buster main buster-utils buster-desktop
  10. What is a normal CPU temperature at idle? Mine is never lower than 40°C at an ambient temperature of 19°C, which makes me wonder how much electricity it draws. Is it possible that the GPU is actively doing nothing, or something like that? If so, can it be switched off? The NAS has been placed vertically, with the CPU at the bottom. The disk in slot 5 is at 22°C.
  11. The boot.scr script is damaged. You can try to recreate it. Boot from SD once again, mount the eMMC filesystem and recreate the script:

      mkdir /tmp/mmc
      mount /dev/mmcblk1p1 /tmp/mmc
      chroot /tmp/mmc
      mkimage -C none -A arm -T script -d /boot/boot.cmd /boot/boot.scr
  12. As far as I can see it booted from SD, or at least the root filesystem is on a 64GB SD card. To boot from eMMC you have to remove the SD card.

      OK, thanks.
  13. So far I have found that the I2C bus generates ~20k interrupts per second:

      root@helios64:/proc# cat interrupts && sleep 10 && cat interrupts
      <snip>
       43:    4107025          0          0          0          0          0     GICv3  88 Level     ff3d0000.i2c
      <snip>
       43:    4311805          0          0          0          0          0     GICv3  88 Level     ff3d0000.i2c

      but I don't know what exactly that means. A simple i2cdetect on that bus generates 351 interrupts. So it's not bits.

      Yes I did. I wrote about it in the comments section of the SATA page in the wiki: https://wiki.kobol.io/helios64/sata/
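The "~20k interrupts per second" figure follows from the two samples taken 10 seconds apart. A minimal Python sketch of that arithmetic, assuming the usual /proc/interrupts layout in which the per-CPU counters directly follow the IRQ number; the helper name `irq_total` is made up for illustration.

```python
def irq_total(line: str) -> int:
    """Sum the per-CPU counters of a single /proc/interrupts line."""
    total = 0
    for field in line.split()[1:]:  # skip the "43:" IRQ label
        if not field.isdigit():     # counters end where the controller name starts
            break
        total += int(field)
    return total


# The two samples of IRQ 43 from the output above, taken 10 seconds apart.
before = "43: 4107025 0 0 0 0 0 GICv3 88 Level ff3d0000.i2c"
after = "43: 4311805 0 0 0 0 0 GICv3 88 Level ff3d0000.i2c"
interval_seconds = 10

rate = (irq_total(after) - irq_total(before)) / interval_seconds
print(f"{rate:.0f} interrupts per second")
```

The difference between the samples is 204780 interrupts, so the rate works out to roughly 20.5k per second, all of them handled by the first CPU.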
  14. On my system (kernel 5.8.17) a thread kworker/fusb302_wq was running, permanently eating 18% of one core and causing the load average to always be at least 1.0. The module fusb30x could not be unloaded; rmmod just hung, although its use count was zero. After blacklisting the module and rebooting, the problem was solved. Now the question is: what did I disable? Some googling gave me this document:

      So it only has to do with the USB-C port? The serial connection is still working fine.