SR-G

SR-G's Achievements

  1. Cancelled: the power unit cable was not correctly plugged in and the NAS was running on battery, hence the automated shutdowns, I think (this is not clearly printed / displayed anywhere, though; that would have helped).
  2. The NAS was on during the night, and this morning: no more RAID5 (with 5 WD HDDs). After a reboot, I see that two disks are missing:

     ```
     [Sun Jul 18 11:49:11 2021] ata1: illegal qc_active transition (00000000->00000001)
     [Sun Jul 18 11:49:14 2021] ata2: softreset failed (device not ready)
     [Sun Jul 18 11:49:23 2021] ata1: illegal qc_active transition (00000000->00000001)
     [Sun Jul 18 11:49:24 2021] ata2: softreset failed (device not ready)
     [Sun Jul 18 11:49:38 2021] ata1: illegal qc_active transition (00000000->00000001)
     [Sun Jul 18 11:49:38 2021] ata1.00: failed to set xfermode (err_mask=0x40)
     [Sun Jul 18 11:49:56 2021] ata2: illegal qc_active transition (00000000->00000001)
     [Sun Jul 18 11:49:56 2021] ata2: illegal qc_active transition (00000000->00000001)
     [Sun Jul 18 11:49:57 2021] ata1: illegal qc_active transition (00000000->00000001)
     [Sun Jul 18 11:50:06 2021] ata1: illegal qc_active transition (00000000->00000001)
     ```

     ```
     11:49 root@helios64 /mnt/internal# ll /dev/sd*
     brw-rw---- 1 root disk 8, 32 2021-07-18 11:48 /dev/sdc
     brw-rw---- 1 root disk 8, 33 2021-07-18 11:48 /dev/sdc1
     brw-rw---- 1 root disk 8, 48 2021-07-18 11:48 /dev/sdd
     brw-rw---- 1 root disk 8, 49 2021-07-18 11:48 /dev/sdd1
     brw-rw---- 1 root disk 8, 64 2021-07-18 11:48 /dev/sde
     brw-rw---- 1 root disk 8, 65 2021-07-18 11:48 /dev/sde1
     ```

     What's happening? Is it also a PSU that failed (as with the HELIOS4 NAS ...)? The PSU light seems OK (blue LED on the power supply itself, not flashing).
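When disks drop out like this, the state of the md array itself can be read from `/proc/mdstat`. The following is a minimal sketch, assuming the usual mdstat layout (an `mdN : active|inactive ...` header line followed by a status line containing `[total/active]`); the function name `md_health` is made up for illustration and is not part of mdadm.

```shell
#!/usr/bin/env bash
# Report md arrays that are inactive or running with missing members.
# Reads mdstat-formatted text from the file given as $1 (default: /proc/mdstat).
# NOTE: md_health is a hypothetical helper name, not an mdadm command.
md_health() {
    awk '
        # Array header line, e.g. "md0 : active raid5 sdc1[2] sdd1[3] ..."
        /^md[0-9]+ :/ {
            name = $1
            if ($3 == "inactive") { print name ": inactive"; name = "" }
            next
        }
        # Status line, e.g. "... algorithm 2 [5/4] [_UUUU]"
        name != "" && match($0, /\[[0-9]+\/[0-9]+\]/) {
            split(substr($0, RSTART + 1, RLENGTH - 2), n, "/")
            if (n[2] + 0 < n[1] + 0)
                print name ": degraded (" n[2] " of " n[1] " members active)"
            name = ""
        }
    ' "${1:-/proc/mdstat}"
}
```

Calling `md_health` with no argument reads the live `/proc/mdstat`; no output means every listed array is active with all of its members. A RAID5 array that has lost two members would normally show up as inactive rather than degraded.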
  3. So with a new PSU (30€ on amazon) my HELIOS4 NAS is working again.
  4. Hello,

     My HELIOS4 system was running (and doing nothing) when it suddenly crashed. After a restart, nothing loaded automatically (system stuck in emergency mode). Once the USB cable was plugged in, I managed to log into the system only once, and discovered that no disks were mounted anymore (not even available / detected by the board). (It's not so easy, as there seem to be some freezes, maybe due to the hardware errors related to the missing HDD links.)

     13:00 root@helios4 ~# ll /dev/sd*
     zsh: no matches found: /dev/sd*

     Is this a failing PSU (like in other threads)? To be noted: the PSU has a blinking green light ... I can't remember if it was like that before.

     In journalctl:

     May 26 12:57:03 helios4 kernel: ata1: SATA link down (SStatus 0 SControl 300)
     May 26 12:57:04 helios4 kernel: ata1: SATA link down (SStatus 0 SControl 300)
     May 26 12:57:06 helios4 kernel: ata1: COMRESET failed (errno=-32)
     May 26 12:57:06 helios4 kernel: ata1: reset failed (errno=-32), retrying in 8 se
     May 26 12:57:06 helios4 kernel: ata2: SATA link down (SStatus 0 SControl 300)
     May 26 12:57:07 helios4 kernel: ata2: SATA link down (SStatus 0 SControl 300)
     May 26 12:57:08 helios4 kernel: ata2: SATA link down (SStatus 0 SControl 300)
     May 26 12:57:11 helios4 kernel: ata2: COMRESET failed (errno=-32)
     May 26 12:57:11 helios4 kernel: ata2: reset failed (errno=-32), retrying in 8 se
     May 26 12:57:12 helios4 kernel: ata3: SATA link down (SStatus 0 SControl 300)
     May 26 12:57:12 helios4 kernel: ata4: COMRESET failed (errno=-32)
     May 26 12:57:12 helios4 kernel: ata4: reset failed (errno=-32), retrying in 8 se
     May 26 12:57:13 helios4 kernel: ata3: SATA link down (SStatus 0 SControl 300)
     May 26 12:57:14 helios4 kernel: ata3: SATA link down (SStatus 0 SControl 300)
     May 26 12:57:15 helios4 kernel: ata3: SATA link down (SStatus 0 SControl 300)
     May 26 12:57:15 helios4 kernel: ata1: SATA link down (SStatus 0 SControl 300)
     May 26 12:57:16 helios4 kernel: ata3: SATA link down (SStatus 0 SControl 300)
     May 26 12:57:16 helios4 kernel: ata3: SATA link down (SStatus 0 SControl 300)
     May 26 12:57:17 helios4 kernel: ata1: SATA link down (SStatus 0 SControl 300)
     May 26 12:57:17 helios4 kernel: ata3: SATA link down (SStatus 0 SControl 300)
     May 26 12:57:17 helios4 kernel: ata1: SATA link down (SStatus 0 SControl 300)
     May 26 12:57:18 helios4 kernel: ata3: SATA link down (SStatus 0 SControl 300)
     May 26 12:57:19 helios4 kernel: ata3: SATA link down (SStatus 0 SControl 300)
     May 26 12:57:19 helios4 kernel: ata3: SATA link down (SStatus 0 SControl 300)

     Full boot log :
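A wall of repeated ATA messages like the one above is easier to read once condensed per port. Here is a rough sketch that counts "SATA link down" and failed-reset messages for each `ataN` port from log text on stdin; the function name `ata_summary` and its output format are invented for this example.

```shell
#!/usr/bin/env bash
# Summarize kernel ATA messages per port from log text on stdin.
# Prints one line per port: "<port> link_down=<n> reset_failed=<n>".
# NOTE: ata_summary is a hypothetical helper, not an existing tool.
ata_summary() {
    awk '
        match($0, /ata[0-9]+(\.[0-9]+)?:/) {
            port = substr($0, RSTART, RLENGTH - 1)
            sub(/\..*/, "", port)            # fold "ata1.00" into "ata1"
            seen[port] = 1
            if ($0 ~ /SATA link down/) down[port]++
            # Matches "COMRESET failed", "reset failed" and "softreset failed"
            if ($0 ~ /(COMRESET|reset) failed/) failed[port]++
        }
        END {
            for (p in seen)
                printf "%s link_down=%d reset_failed=%d\n", p, down[p], failed[p]
        }
    ' | sort
}
```

It can be fed from the current boot's kernel log with `journalctl -k -b | ata_summary` or `dmesg | ata_summary`. When every port reports link-down or reset failures at the same time, as in the log above, a cause shared by all disks (such as power) looks more plausible than several drives failing at once, though the counts alone do not prove it.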
  5. So on my side, after my latest reinstallation (due to a corrupted OS):

     - with the default installation / configuration out of the box, I had one freeze every 24h
     - after switching to "powersave" or to "performance" mode, with min CPU frequency = max CPU frequency = either 1.8 GHz or 1.6 GHz: still the same (one freeze per day)
     - after switching to "performance" mode, with min CPU frequency = max CPU frequency = 1.4 GHz, it now seems more stable (uptime = 5 days so far)

     So my gut feeling is really that these issues:

     - are mainly related to the cpufreq mechanism
     - and are probably related to what was nicely spotted before (by Vin): the fact that 2 cores have a different max frequency range (as expected per the specs, but maybe with a corner case in the cpufreq governance)
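The "min = max" workaround described above can also be applied directly through the cpufreq sysfs files, which is handy for quick experiments. This is only a sketch under assumptions: `pin_cpufreq` is a made-up helper name, it must run as root, the change does not survive a reboot, and the frequency passed in must be one the board actually offers (check `scaling_available_frequencies`).

```shell
#!/usr/bin/env bash
# Pin every cpufreq policy to a single frequency with a given governor.
# $1 = governor, $2 = frequency in kHz, $3 = cpufreq sysfs dir (optional).
# NOTE: pin_cpufreq is a hypothetical helper; run as root, not persistent.
pin_cpufreq() {
    local governor=$1 khz=$2 root=${3:-/sys/devices/system/cpu/cpufreq}
    local policy cur_max
    for policy in "$root"/policy*; do
        [ -d "$policy" ] || continue
        echo "$governor" > "$policy/scaling_governor"
        cur_max=$(cat "$policy/scaling_max_freq")
        # Keep min <= max at every step: raise max first when the target is
        # above the current max, otherwise set min first.
        if [ "$khz" -gt "$cur_max" ]; then
            echo "$khz" > "$policy/scaling_max_freq"
            echo "$khz" > "$policy/scaling_min_freq"
        else
            echo "$khz" > "$policy/scaling_min_freq"
            echo "$khz" > "$policy/scaling_max_freq"
        fi
    done
}
```

For example, `pin_cpufreq performance 1416000` would correspond to the 1.4 GHz setting, assuming 1416000 is listed as an available frequency on the board.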
  6. I'm a bit confused not to have the same values on my side for all policies ... Whereas: (> I set min = max = 1.6 GHz through armbian-config). What are these two different policies in /cpufreq/? (policy0 + policy4 in /cpufreq/ on my side.) Is it that "policy0" is used by the "performance" governor mode and "policy4" by "powersave"? (In which case it would make sense for me to have different values.)
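On the policy0 / policy4 question: in the kernel's cpufreq layout, each `policyN` directory covers a group of CPUs that share one frequency domain and is named after the first CPU in that group; it is not tied to a governor. On the Helios64's RK3399, with its two CPU clusters, that would presumably be CPUs 0-3 and CPUs 4-5, each with its own hardware limits, which could explain different values per policy. A small sketch to check this on a given board (the helper name `show_cpufreq_policies` and its output format are invented here):

```shell
#!/usr/bin/env bash
# Print, for each cpufreq policy, the CPUs it covers and its current settings.
# $1 = cpufreq sysfs dir (optional, defaults to the standard location).
# NOTE: show_cpufreq_policies is a hypothetical helper name.
show_cpufreq_policies() {
    local root=${1:-/sys/devices/system/cpu/cpufreq}
    local policy
    for policy in "$root"/policy*; do
        [ -d "$policy" ] || continue
        # related_cpus is space-separated; print it comma-separated instead
        printf '%s cpus=%s governor=%s min=%s max=%s hw_max=%s\n' \
            "${policy##*/}" \
            "$(tr ' ' ',' < "$policy/related_cpus")" \
            "$(cat "$policy/scaling_governor")" \
            "$(cat "$policy/scaling_min_freq")" \
            "$(cat "$policy/scaling_max_freq")" \
            "$(cat "$policy/cpuinfo_max_freq")"
    done
}
```

If `hw_max` for one policy is lower than the requested frequency, that policy cannot show the same min/max as the other one, whatever was set in armbian-config.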
  7. On my side:

     - OS (Debian Helios64 image) installed on an SD card; the SD card is a Samsung one (128G)
     - 5x Western Digital HDD (all the same), WDBBGB0140HBK-EESN 14TB, plugged in the regular way, sda > sde (so obviously no M.2 plugged in)
     - I have the internal battery plugged in

     At OS level:

     - docker with a netdata container (and nothing else for now)
     - mdadm activated for the RAID-5 array
     - SMBFS with a few shares
     - NFS with shares mounted on other servers (crashes were already happening before switching to NFS instead of SSHFS for regular accesses)

     Current workarounds:

     - latest kernel 5.10.21
     - with patched boot.scr
     - governor = powersave, min speed = max speed = 1.6 GHz (and not 1.8 GHz); it seems to be the "least problematic" configuration (one crash every two days and not every day ...)

     About load:

     - rclone each day for a few hours to mirror everything to the cloud (limited to 750GB per day, so it takes quite some days): nearly no freezes during this
     - borgbackup each night, fetching some files to back up from another server through NFS: I suspect some freezes there
     - some NFS shares being accessed for various tasks all the time (sometimes with a lot of IO): I suspect some freezes there
     - all this is quite reasonable in terms of load and does not generate a lot of IO in the end

     Helios64 board ordered on 2020, Jan 12 (order 1312), sent on 2020, Sept 21 and received sometime around the beginning of October (NAS not installed before December).

     By the way, I have also had a Helios4 board for a long time, and I never got any freeze with it.
  8. I had a stable system (with the previous kernel) for 30 days, then one freeze, then a corrupted system, then a full reinstall, and now several freezes per day (at first with the vanilla Armbian config).

     Same kernel as you:

     Linux helios64 5.10.21-rockchip64 #21.02.3 SMP PREEMPT Mon Mar 8 01:05:08 UTC 2021 aarch64 GNU/Linux

     I can't test different drives; I have 5 Western Digital drives plugged in as a RAID5 array.
  9. Many additional freezes in the meantime. Now (with the latest kernel) I'm unable to get a stable situation whatever I do:

     - latest kernel
     - boot.scr put back
     - same min and max freq
     - governor on "performance" or "schedutil" or whatever

     I always get freezes. I'm at the point where I'm about to be DISGUSTED by this NAS - I've never lost so much time with an electronic device.

     What is the expected timeframe before having something stable for this NAS? Is it only worked on by KOBOL? How many people have a stable NAS versus an unstable NAS? Is my device faulty in any way? What is KOBOL's refund policy?
  10. And another freeze last night (still with a fresh install / latest kernel + modified boot.scr put back in place).
  11. And (after having lost 2 hours yesterday reinstalling the system), today: yet another freeze (this time with the latest image / kernel and the default out-of-the-box configuration). This is really starting to get insane, and the NAS is nearly unusable.
  12. And a second freeze one hour after the first one (blinking red light), while upgrading the kernel. Now of course nothing boots up.
  13. Okay, so I got a freeze today: even in my previous situation (as described in previous posts) it was not 100% stable (but still way better than at first).
  14. So 26 days of uptime now - it seems better.