
About This Club

Dedicated section for talk & support for the Helios4 and Helios64 open source NAS. Led and moderated by the Kobol Team.
What's new in this club
  2. FYI, I am running a custom kernel based on selected Armbian patches. It seems I need to add some more... Re-patching and re-compiling until I get the kernel running (some Armbian patches don't compile on 5.10.55). EDIT: reverted to an older kernel (5.10.34) and the eMMC is operational :-)
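For anyone wanting to try the same, a rough sketch of rebuilding such a kernel with the Armbian build framework (parameter names and the userpatches path are from memory, so double-check them against the build documentation; the patch file name is just a placeholder):
     # get the Armbian build framework
     git clone https://github.com/armbian/build
     cd build
     # extra / selected patches go here (rockchip64 is the Helios64 kernel family)
     mkdir -p userpatches/kernel/rockchip64-current
     cp /path/to/my-fix.patch userpatches/kernel/rockchip64-current/
     # build only the kernel packages, without interactive kernel configuration
     ./compile.sh BOARD=helios64 BRANCH=current KERNEL_ONLY=yes KERNEL_CONFIGURE=no
The resulting .deb packages end up under output/debs/ and can be installed with dpkg -i.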
  3. Hi, I got write errors on the eMMC, and the filesystem became read-only. Dmesg shows this:
     [ 304.982737] mmc2: running CQE recovery
     [ 304.987677] mmc2: running CQE recovery
     [ 304.989762] blk_update_request: I/O error, dev mmcblk2, sector 26826008 op 0x1:(WRITE) flags 0x4000 phys_seg 127 prio class 0
     [ 304.989791] EXT4-fs warning (device mmcblk2p1): ext4_end_bio:345: I/O error 10 writing to inode 789645 starting block 3353569)
     [ 305.021383] mmc2: running CQE recovery
     [ 305.026213] mmc2: running CQE recovery
     [ 305.030808] mmc2: running CQE recovery
     [ 305.032702] blk_update_request: I/O error, dev mmcblk2, sector 26265544 op 0x1:(WRITE) flags 0x4000 phys_seg 111 prio class 0
     [ 305.033382] EXT4-fs warning (device mmcblk2p1): ext4_end_bio:345: I/O error 10 writing to inode 793979 starting block 3283345)
     [ 305.109165] mmc2: running CQE recovery
     [ 305.114772] mmc2: running CQE recovery
     [ 305.120284] mmc2: running CQE recovery
     [ 305.125405] mmc2: running CQE recovery
     [ 305.126960] blk_update_request: I/O error, dev mmcblk2, sector 25560376 op 0x1:(WRITE) flags 0x0 phys_seg 15 prio class 0
     [ 305.126976] EXT4-fs warning (device mmcblk2p1): ext4_end_bio:345: I/O error 10 writing to inode 789517 starting block 3195094)
     [ 305.128183] mmc2: running CQE recovery
     [ 305.132052] mmc2: running CQE recovery
     [ 305.138255] mmc2: running CQE recovery
     [ 305.143172] mmc2: running CQE recovery
     [ 305.143705] blk_update_request: I/O error, dev mmcblk2, sector 25568256 op 0x1:(WRITE) flags 0x0 phys_seg 11 prio class 0
     [ 305.143719] EXT4-fs warning (device mmcblk2p1): ext4_end_bio:345: I/O error 10 writing to inode 789519 starting block 3196066)
     [ 305.146059] mmc2: running CQE recovery
     [ 305.150856] mmc2: running CQE recovery
     [ 305.153652] blk_update_request: I/O error, dev mmcblk2, sector 25566136 op 0x1:(WRITE) flags 0x0 phys_seg 14 prio class 0
     [ 305.153674] EXT4-fs warning (device mmcblk2p1): ext4_end_bio:345: I/O error 10 writing to inode 789524 starting block 3195813)
     [ 305.156107] mmc2: running CQE recovery
     [ 305.161761] mmc2: running CQE recovery
     [ 305.166281] mmc2: running CQE recovery
     [ 305.173900] mmc2: running CQE recovery
     [ 305.179678] mmc2: running CQE recovery
     [ 305.184108] mmc2: running CQE recovery
     [ 305.189973] mmc2: running CQE recovery
     [ 305.192654] blk_update_request: I/O error, dev mmcblk2, sector 25550264 op 0x1:(WRITE) flags 0x0 phys_seg 6 prio class 0
     [ 305.192674] EXT4-fs warning (device mmcblk2p1): ext4_end_bio:345: I/O error 10 writing to inode 789534 starting block 3193811)
     [ 305.195645] mmc2: running CQE recovery
     [ 305.201382] mmc2: running CQE recovery
     [ 305.206397] mmc2: running CQE recovery
     [ 305.209876] mmc2: running CQE recovery
     [ 305.212505] EXT4-fs warning (device mmcblk2p1): ext4_end_bio:345: I/O error 10 writing to inode 789547 starting block 3194323)
     [ 305.215282] mmc2: running CQE recovery
     [ 305.267254] mmc2: running CQE recovery
     [ 305.272126] mmc2: running CQE recovery
     [ 305.276368] mmc2: running CQE recovery
     [ 305.282479] mmc2: running CQE recovery
     [ 305.287200] mmc2: running CQE recovery
     [ 305.291543] mmc2: running CQE recovery
     [ 305.315011] JBD2: Detected IO errors while flushing file data on mmcblk2p1-8
     [ 305.317983] mmc2: running CQE recovery
     [ 305.322443] mmc2: running CQE recovery
     [ 305.329227] mmc2: running CQE recovery
     [ 305.330168] Aborting journal on device mmcblk2p1-8.
     [ 305.332721] mmc2: running CQE recovery
     [ 305.333640] EXT4-fs error (device mmcblk2p1): ext4_journal_check_start:83: Detected aborted journal
     [ 305.334031] EXT4-fs (mmcblk2p1): Remounting filesystem read-only
     [ 305.334048] EXT4-fs (mmcblk2p1): failed to convert unwritten extents to written extents -- potential data loss! (inode 794035, error -30)
     [ 305.334380] EXT4-fs error (device mmcblk2p1): ext4_journal_check_start:83: Detected aborted journal
     [ 305.335183] EXT4-fs (mmcblk2p1): failed to convert unwritten extents to written extents -- potential data loss! (inode 794033, error -30)
     [ 305.336335] EXT4-fs (mmcblk2p1): failed to convert unwritten extents to written extents -- potential data loss! (inode 794038, error -30)
     [ 305.337472] EXT4-fs (mmcblk2p1): failed to convert unwritten extents to written extents -- potential data loss! (inode 794037, error -30)
     [ 305.337497] EXT4-fs (mmcblk2p1): failed to convert unwritten extents to written extents -- potential data loss! (inode 794036, error -30)
     [ 305.339676] EXT4-fs (mmcblk2p1): failed to convert unwritten extents to written extents -- potential data loss! (inode 794039, error -30)
I did a fsck and rebooted, but it gives the same kind of errors after that. It seems I cannot write to the eMMC anymore. I use the eMMC to store the boot script plus a rescue subsystem (I boot it by renaming /boot.rescue to /boot). It is formatted as an ext4 partition. What can I do to get back to a sane state? Kind regards, Xavier Miller.
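For anyone hitting the same thing, a sketch of what can be checked from a system booted off the SD card so the eMMC is idle (device names taken from the log above; mmc-utils may need to be installed first):
     sudo apt install mmc-utils
     # wear / end-of-life estimates reported by the eMMC itself
     sudo mmc extcsd read /dev/mmcblk2 | grep -i -E 'life|eol'
     # repair the filesystem while it is not mounted
     sudo fsck.ext4 -f /dev/mmcblk2p1
If the life-time estimates are already at their maximum, the eMMC is simply worn out and no fsck will bring it back.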
  4. Thank you Sir! I thought that was the case, but needed confirmation.
  5. @jsfrederick You can just write the image to the SD card and boot. For SSH, as long as you know what IP address the Helios64 gets, then you can just use that to connect. The serial terminal is just to check whether things booted correctly, or in case of an error that needs to be diagnosed. You should be fine with your usual process.
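If the DHCP lease list is not handy, a quick sketch for finding the box on the network (the subnet is only an example, adjust it to yours):
     # ping-scan the LAN and look for the Helios64 entry / its MAC address
     sudo nmap -sn 192.168.1.0/24
     # first login on a fresh Armbian image is root with password 1234
     ssh root@<ip-of-the-helios64>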
  6. I finally assembled my Helios64 today. It went well, it just took a while. I'm reading the Kobol wiki page for installing the OS on the internal eMMC and am a bit confused; hoping someone can point me in the right direction. I usually boot an SBC the first time from an SD card to do some testing, then use armbian-config to install the OS onto the eMMC when I am ready. The Kobol wiki talks about first installing u-boot from an SD card. Is that actually needed? Also, do I really need to set up a serial terminal session? I usually just SSH into the box (I find the IP address via my DHCP server) and configure everything from there.
  7. @RockBian hmm, that output looks more like there's a problem with the disk itself. At least none of the SATA errors I've experienced have been logged by the disks themselves, but that could be a difference between manufacturers. I'd recommend running a short and a long self-test (via smartctl) on the disk.
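Something along these lines (sdX is a placeholder for the actual drive):
     sudo smartctl -t short /dev/sdX    # takes a couple of minutes
     sudo smartctl -t long /dev/sdX     # takes several hours on a 4 TB disk
     sudo smartctl -a /dev/sdX          # afterwards: check the self-test log, pending and reallocated sector counts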
  8. Thank you for clarifying and pointing that out. Logging and swapping to a USB drive seems like a neat workaround. It would be better, though, if we could work out how to avoid the kernel panics without having to change when zfs mounts.
  9. Sorry, the line about disabling was just to make sure you disable the armbian-zram-config service by setting ENABLED to false. As a warning, though, I found some problems with this if you log to a zfs share. It seems you have to make sure zfs gets loaded before the logging starts up, otherwise you can get an occasional kernel panic. I didn't really dig into how to fix this, so I just log and swap to a USB drive instead now.
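In other words, the only change needed in /etc/default/armbian-zram-config is the ENABLED line (a minimal sketch, everything else in the file stays at its defaults):
     # /etc/default/armbian-zram-config
     ENABLED=false    # stop the armbian-zram-config service from creating zram swap
A reboot (or restarting the armbian-zram-config service) makes it take effect.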
  10. Hi, I had that same idea and I'm just setting this up based on your suggestions. Thank you for starting the thread. Just one thing: There is no option regarding compression in /etc/default/armbian-zram-config, or did you mean to disable another option?
  11. Thanks! Dug a little bit deeper - the rules are actually mostly from openmediavault; I did not realize that some were already set up! Thanks to both of you for pointing me in the right direction!
  12. Armbian has no default iptables rules configured. So all rules are from either Docker or OpenMediaVault.
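One way to confirm where each rule comes from (a sketch; the ports are taken from the iptables listing in the question below):
     # dump the rules in iptables-save format; the DOCKER-* chains are created by the Docker daemon
     sudo iptables -S | less
     # map the opened ports back to the services listening on them (OpenMediaVault manages most of these)
     sudo ss -tlnp | grep -E ':8384|:3128|:1443'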
  13. Hi, sorry, probably a stupid question, but I have not managed to figure out the answer so far. I have a Helios64 running a pretty standard installation, including openmediavault and docker, as described on the Kobol help page. I realized that some firewall/iptables rules are set, for example:
     tester@helios64:/etc# sudo iptables -L
     Chain INPUT (policy DROP)
     target prot opt source destination
     ACCEPT all  -- anywhere anywhere ctstate RELATED,ESTABLISHED
     ACCEPT icmp -- anywhere anywhere state NEW,RELATED,ESTABLISHED
     ACCEPT tcp  -- anywhere anywhere tcp dpt:http
     ACCEPT tcp  -- anywhere anywhere tcp dpt:https
     ACCEPT tcp  -- anywhere anywhere tcp dpt:8384
     ACCEPT tcp  -- anywhere anywhere tcp dpt:3128
     ACCEPT tcp  -- anywhere anywhere tcp dpt:1443
     ACCEPT udp  -- anywhere anywhere udp dpt:domain
     ACCEPT tcp  -- anywhere anywhere tcp dpt:domain
     ACCEPT tcp  -- anywhere anywhere tcp dpt:microsoft-ds
     ACCEPT tcp  -- anywhere anywhere tcp dpt:netbios-ssn
     ACCEPT udp  -- anywhere anywhere udp dpt:netbios-dgm
     ACCEPT udp  -- anywhere anywhere udp dpt:netbios-ns
     ACCEPT tcp  -- anywhere anywhere tcp dpt:ftp
     ACCEPT tcp  -- anywhere anywhere tcp dpt:49152
     ACCEPT tcp  -- anywhere anywhere tcp dpt:22000
     ACCEPT udp  -- anywhere anywhere udp dpt:1900
     Chain FORWARD (policy ACCEPT)
     target prot opt source destination
     DOCKER-USER all -- anywhere anywhere
     DOCKER-ISOLATION-STAGE-1 all -- anywhere anywhere
     ACCEPT all -- anywhere anywhere ctstate RELATED,ESTABLISHED
     DOCKER all -- anywhere anywhere
     ACCEPT all -- anywhere anywhere
     ACCEPT all -- anywhere anywhere
     ...
What I was not able to figure out is where these rules come from. In openmediavault no firewall rules are set, and I was not able to find any iptables settings in /etc/network either. Does anyone have an idea where these rules are configured, and by which service they are set up? Thanks!
  14. I'm getting the exact same kernel oops with Linux helios64 5.10.43-rockchip64 #21.05.4 SMP PREEMPT Wed Jun 16 08:02:12 UTC 2021 aarch64 GNU/Linux. Does your unit successfully reboot? Mine fails to find the SD card on an automated soft reboot.
  15. For future reference, the *-u-boot* packages contain the u-boot bootloader. When installed, the package puts some files into /usr/lib/u-boot and /usr/lib/<packagename>. These files can be used to install or update the actual u-boot on the board, although the process is not started automatically, since updating u-boot is mostly unnecessary and might lead to problems that are only fixable with serial UART access. For actually updating the bootloader, the armbian-config utility can be used (or check what the platform install file in /usr/lib/u-boot does and do it manually). If you receive updates from a future version, you might be on the 'nightly' branch / mode.
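As a sketch (the armbian-config menu wording and the script name are from memory; dpkg -L shows exactly what the package shipped):
     # list the files the u-boot package installed
     dpkg -L linux-u-boot-helios64-current
     # either let armbian-config flash it (System -> Install -> install/update the bootloader)...
     sudo armbian-config
     # ...or read /usr/lib/u-boot/platform_install.sh and run the equivalent write steps by hand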
  16. So I've had the crash happen again after almost exactly 24 hours, this time with logging at verbosity 5, so I actually got a bit of data. This was under no load at all: no one logged on, just a single PC on the network capturing the USB-C serial log. I have no idea how to interpret the data, but I'm investigating what it means to the best of my ability; any pointers would be great.
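In case "verbosity 5" here means the Armbian boot verbosity: that is controlled from /boot/armbianEnv.txt, roughly like this (a sketch, not anyone's exact file):
     # /boot/armbianEnv.txt
     verbosity=7        # raise the kernel console loglevel (the default is 1)
     console=serial     # keep the messages on the serial/USB-C console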
  17. Probably a mistake since Armbian 21.08 has not been released yet. Should not cause harm though.
  18. I checked for any available updates yesterday, and this (linux-u-boot-helios64-current 21.08.1) was applied. Can't find what was new or if it was only a version bump. By the way, nothing broke after the update.
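For checking the same thing, this is roughly what shows which version is installed and which repository it came from:
     apt-cache policy linux-u-boot-helios64-current
     dpkg -l | grep u-boot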
  19. Apparently my searching skills are rubbish; I found a thread with the FPDMA QUEUED fault and will investigate that, but it seems that's a separate issue from the reboot.
  20. Hi all, I've been trying to sort out a stability problem with what was a rock-solid Helios64. I moved from running Armbian on the eMMC to a 256 GB SanDisk Extreme SD card because Plex kept filling the eMMC up. I'm currently running Armbian 21.05.6 Buster with Linux 5.10.43-rockchip64, with OMV, Plex and ZFS and not much more, and this appears to be where things go pear-shaped. I've already given it a bump in voltage as recommended in the config file, but I can't tell where to check whether the setting took effect; the frequency governor is set to ondemand in armbian-config. Does anyone have any ideas? Thanks in advance, Krita
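A rough way to check whether the governor and voltage changes actually took effect (these are the generic cpufreq/regulator sysfs paths, nothing Helios64-specific):
     # policy0 = the A53 cores, policy4 = the A72 cores on the RK3399
     cat /sys/devices/system/cpu/cpufreq/policy0/scaling_governor
     cat /sys/devices/system/cpu/cpufreq/policy4/scaling_governor
     cat /sys/devices/system/cpu/cpufreq/policy4/scaling_cur_freq
     # voltages the regulator framework currently reports
     grep . /sys/class/regulator/regulator.*/microvolts 2>/dev/null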
  21. # smartctl -a /dev/sdb
     smartctl 6.6 2017-11-05 r4594 [aarch64-linux-5.10.35-rockchip64] (local build)
     Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org
     === START OF INFORMATION SECTION ===
     Device Model: TOSHIBA HDWD240
     Serial Number: Z9J1S0I9S5HH
     LU WWN Device Id: 5 000039 9b560c2c2
     Firmware Version: KQ000A
     User Capacity: 4,000,787,030,016 bytes [4.00 TB]
     Sector Sizes: 512 bytes logical, 4096 bytes physical
     Rotation Rate: 5400 rpm
     Form Factor: 3.5 inches
     Device is: Not in smartctl database [for details use: -P showall]
     ATA Version is: ACS-3 T13/2161-D revision 5
     SATA Version is: SATA 3.3, 6.0 Gb/s (current: 6.0 Gb/s)
     Local Time is: Mon Jul 26 22:46:35 2021 CEST
     SMART support is: Available - device has SMART capability.
     SMART support is: Enabled
     === START OF READ SMART DATA SECTION ===
     SMART overall-health self-assessment test result: PASSED
     General SMART Values:
     Offline data collection status: (0x00) Offline data collection activity was never started. Auto Offline Data Collection: Disabled.
     Self-test execution status: ( 0) The previous self-test routine completed without error or no self-test has ever been run.
     Total time to complete Offline data collection: ( 120) seconds.
     Offline data collection capabilities: (0x5b) SMART execute Offline immediate. Auto Offline data collection on/off support. Suspend Offline collection upon new command. Offline surface scan supported. Self-test supported. No Conveyance Self-test supported. Selective Self-test supported.
     SMART capabilities: (0x0003) Saves SMART data before entering power-saving mode. Supports SMART auto save timer.
     Error logging capability: (0x01) Error logging supported. General Purpose Logging supported.
     Short self-test routine recommended polling time: ( 2) minutes.
     Extended self-test routine recommended polling time: ( 502) minutes.
     SCT capabilities: (0x003d) SCT Status supported. SCT Error Recovery Control supported. SCT Feature Control supported. SCT Data Table supported.
     SMART Attributes Data Structure revision number: 16
     Vendor Specific SMART Attributes with Thresholds:
     ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
       1 Raw_Read_Error_Rate 0x000b 100 100 050 Pre-fail Always - 0
       2 Throughput_Performance 0x0005 100 100 050 Pre-fail Offline - 0
       3 Spin_Up_Time 0x0027 100 100 001 Pre-fail Always - 8060
       4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 62
       5 Reallocated_Sector_Ct 0x0033 100 100 050 Pre-fail Always - 0
       7 Seek_Error_Rate 0x000b 100 100 050 Pre-fail Always - 0
       8 Seek_Time_Performance 0x0005 100 100 050 Pre-fail Offline - 0
       9 Power_On_Hours 0x0032 078 078 000 Old_age Always - 8809
      10 Spin_Retry_Count 0x0033 101 100 030 Pre-fail Always - 0
      12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 10
     191 G-Sense_Error_Rate 0x0032 100 100 000 Old_age Always - 1
     192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 2
     193 Load_Cycle_Count 0x0032 100 100 000 Old_age Always - 244
     194 Temperature_Celsius 0x0022 100 100 000 Old_age Always - 18 (Min/Max 12/31)
     196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 0
     197 Current_Pending_Sector 0x0032 100 100 000 Old_age Always - 0
     198 Offline_Uncorrectable 0x0030 100 100 000 Old_age Offline - 0
     199 UDMA_CRC_Error_Count 0x0032 200 253 000 Old_age Always - 0
     220 Disk_Shift 0x0002 100 100 000 Old_age Always - 0
     222 Loaded_Hours 0x0032 099 099 000 Old_age Always - 562
     223 Load_Retry_Count 0x0032 100 100 000 Old_age Always - 0
     224 Load_Friction 0x0022 100 100 000 Old_age Always - 0
     226 Load-in_Time 0x0026 100 100 000 Old_age Always - 803
     240 Head_Flying_Hours 0x0001 100 100 001 Pre-fail Offline - 0
     SMART Error Log Version: 1
     ATA Error Count: 3
     CR = Command Register [HEX]
     FR = Features Register [HEX]
     SC = Sector Count Register [HEX]
     SN = Sector Number Register [HEX]
     CL = Cylinder Low Register [HEX]
     CH = Cylinder High Register [HEX]
     DH = Device/Head Register [HEX]
     DC = Device Command Register [HEX]
     ER = Error register [HEX]
     ST = Status register [HEX]
     Powered_Up_Time is measured from power on, and printed as DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes, SS=sec, and sss=millisec. It "wraps" after 49.710 days.
     Error 3 occurred at disk power-on lifetime: 8764 hours (365 days + 4 hours)
     When the command that caused the error occurred, the device was active or idle.
     After command completion occurred, registers were:
     ER ST SC SN CL CH DH
     -- -- -- -- -- -- --
     40 41 98 28 0e c0 40  Error: UNC at LBA = 0x00c00e28 = 12586536
     Commands leading to the command that caused the error were:
     CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
     -- -- -- -- -- -- -- --  ---------------- --------------------
     60 28 98 10 0e c0 40 00  22d+20:33:25.353  READ FPDMA QUEUED
     ef 10 02 00 00 00 a0 00  22d+20:33:25.353  SET FEATURES [Enable SATA feature]
     27 00 00 00 00 00 e0 00  22d+20:33:25.352  READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
     ec 00 00 00 00 00 a0 00  22d+20:33:25.352  IDENTIFY DEVICE
     ef 03 45 00 00 00 a0 00  22d+20:33:25.351  SET FEATURES [Set transfer mode]
     Error 2 occurred at disk power-on lifetime: 8764 hours (365 days + 4 hours)
     When the command that caused the error occurred, the device was active or idle.
     After command completion occurred, registers were:
     ER ST SC SN CL CH DH
     -- -- -- -- -- -- --
     40 41 28 08 0e c0 40  Error: UNC at LBA = 0x00c00e08 = 12586504
     Commands leading to the command that caused the error were:
     CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
     -- -- -- -- -- -- -- --  ---------------- --------------------
     60 30 28 08 0e c0 40 00  22d+20:33:22.242  READ FPDMA QUEUED
     60 08 20 60 0d c1 40 00  22d+20:33:22.241  READ FPDMA QUEUED
     60 08 18 78 0d c1 40 00  22d+20:33:22.240  READ FPDMA QUEUED
     60 08 10 70 0d c1 40 00  22d+20:33:22.239  READ FPDMA QUEUED
     60 08 08 68 0d c1 40 00  22d+20:33:22.238  READ FPDMA QUEUED
     Error 1 occurred at disk power-on lifetime: 8764 hours (365 days + 4 hours)
     When the command that caused the error occurred, the device was active or idle.
     After command completion occurred, registers were:
     ER ST SC SN CL CH DH
     -- -- -- -- -- -- --
     40 41 28 00 0d c0 40  Error: UNC at LBA = 0x00c00d00 = 12586240
     Commands leading to the command that caused the error were:
     CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
     -- -- -- -- -- -- -- --  ---------------- --------------------
     60 08 28 00 0d c0 40 00  22d+20:32:39.403  READ FPDMA QUEUED
     60 08 20 80 08 c0 40 00  22d+20:32:39.292  READ FPDMA QUEUED
     60 08 18 10 17 01 40 00  22d+20:32:39.291  READ FPDMA QUEUED
     60 08 10 08 17 01 40 00  22d+20:32:39.290  READ FPDMA QUEUED
     60 08 08 f8 16 01 40 00  22d+20:32:39.289  READ FPDMA QUEUED
     SMART Self-test log structure revision number 1
     No self-tests have been logged. [To run self-tests, use: smartctl -t]
     SMART Selective self-test log data structure revision number 1
     SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
        1       0       0 Not_testing
        2       0       0 Not_testing
        3       0       0 Not_testing
        4       0       0 Not_testing
        5       0       0 Not_testing
     Selective self-test flags (0x0): After scanning selected spans, do NOT read-scan remainder of disk.
     If Selective self-test is pending on power-up, resume after 0 minute delay.
  22. It does look like the same issue to me, and the filesystem does not matter; these SATA errors can present themselves simply by reading the disk, so filesystem corruption as a result is not unexpected (ZFS does protect us from it, though). But it can't be discounted that this could be a disk error as well; perhaps it's nearing its end of life. Which brand / model of disk do you have?
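To help separate a cabling/SATA issue from a dying disk, a sketch of what I would run (the pool name and device are placeholders):
     zpool status -v            # per-device read/write/checksum error counters
     sudo zpool scrub tank      # force a full read of every block to surface bad sectors
     sudo smartctl -a /dev/sdX  # compare with what the disk itself logs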
  23. Nah, no soldering involved. Just cut off the data part of the connector. It destroys that bit irreversibly, but then you are left with just the power connector, and you can plug that onto the back of the new connector. I took the assembly image from the wiki and used my amazing paint.net skills to draw lines showing where to cut; just make sure you leave the power connector intact.
  24. The problem begins with a READ FPDMA QUEUED error and ends with a read-only filesystem due to I/O errors. The result may be different (maybe because there is no RAID/ZFS here), but the source looks the same to me. It worked for 8 months without problems, and now it works fine in slot 3 (where slot 1 is the farthest from the mobo). And solder it onto the power connectors of the new cables? Hm. I had hoped for a simpler solution.