FredK Posted July 23, 2020
I have a similar problem to @bigbrovar regarding the dual USB 3.0 ports. If I want to attach a 2.5" USB 3.0 disk to the Helios4, I have to use the lower USB port. The upper one "seems" to be dead, but I'm able to connect a thumb drive to the upper port in parallel. I can live with the situation.
aprayoga Posted July 27, 2020
@FredK and @bigbrovar, thanks for the info. On Friday I did some initial investigation and, as you said, the upper USB port seems to work only at 2.0 (high-speed USB device), not 3.0:
[ 42.581478] usb 4-1: new high-speed USB device number 2 using xhci-hcd
[ 42.747232] usb 4-1: New USB device found, idVendor=0480, idProduct=0820, bcdDevice= 3.15
[ 42.755452] usb 4-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[ 42.762635] usb 4-1: Product: External USB 3.0
[ 42.767095] usb 4-1: Manufacturer: TOSHIBA
[ 42.771225] usb 4-1: SerialNumber: XXXXXXXXXXX6F
[ 42.777009] usb-storage 4-1:1.0: USB Mass Storage device detected
[ 42.785088] scsi host4: usb-storage 4-1:1.0
[ 42.839846] usbcore: registered new interface driver uas
[ 49.139357] scsi 4:0:0:0: Direct-Access TOSHIBA External USB 3.0 5438 PQ: 0 ANSI: 6
[ 49.149674] sd 4:0:0:0: Attached scsi generic sg2 type 0
[ 49.161946] sd 4:0:0:0: [sdc] Very big device. Trying to use READ CAPACITY(16).
[ 49.171592] sd 4:0:0:0: [sdc] 7814037164 512-byte logical blocks: (4.00 TB/3.64 TiB)
[ 49.179392] sd 4:0:0:0: [sdc] 4096-byte physical blocks
[ 49.187768] sd 4:0:0:0: [sdc] Write Protect is off
[ 49.195302] sd 4:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 49.334260] sdc: sdc1 sdc2
[ 49.339876] sd 4:0:0:0: [sdc] Attached SCSI disk
This morning I tried again, and it works as 3.0 (SuperSpeed Gen 1 USB device):
[ 90.585477] usb 4-1: new high-speed USB device number 2 using xhci-hcd
[ 91.477501] usb 5-1: new SuperSpeed Gen 1 USB device number 2 using xhci-hcd
[ 91.503435] usb 5-1: New USB device found, idVendor=0480, idProduct=0820, bcdDevice= 3.15
[ 91.511642] usb 5-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[ 91.518807] usb 5-1: Product: External USB 3.0
[ 91.523274] usb 5-1: Manufacturer: TOSHIBA
[ 91.527390] usb 5-1: SerialNumber: XXXXXXXXXXX6F
[ 91.534376] usb-storage 5-1:1.0: USB Mass Storage device detected
[ 91.540743] scsi host4: usb-storage 5-1:1.0
[ 91.583403] usbcore: registered new interface driver uas
[ 97.211091] scsi 4:0:0:0: Direct-Access TOSHIBA External USB 3.0 5438 PQ: 0 ANSI: 6
[ 97.221100] sd 4:0:0:0: Attached scsi generic sg2 type 0
[ 97.229812] sd 4:0:0:0: [sdc] Very big device. Trying to use READ CAPACITY(16).
[ 97.238524] sd 4:0:0:0: [sdc] 7814037164 512-byte logical blocks: (4.00 TB/3.64 TiB)
[ 97.246315] sd 4:0:0:0: [sdc] 4096-byte physical blocks
[ 97.253361] sd 4:0:0:0: [sdc] Write Protect is off
[ 97.258537] sd 4:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 97.394956] sdc: sdc1 sdc2
[ 97.402343] sd 4:0:0:0: [sdc] Attached SCSI disk
We will need to investigate further.
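For anyone comparing captures like the two above, here is a minimal Python sketch that pulls the enumeration lines out of saved dmesg text and reports the negotiated speed. It assumes the kernel wording seen in these logs ("new high-speed USB device", "new SuperSpeed Gen 1 USB device"); the function names are made up for illustration and other kernels may word these lines differently.

```python
import re

# Matches enumeration lines such as:
# "[ 91.477501] usb 5-1: new SuperSpeed Gen 1 USB device number 2 using xhci-hcd"
# The lowercase "new" deliberately skips "New USB device found" lines.
ENUM_RE = re.compile(r"usb (\S+): new (.+?) USB device number (\d+) using (\S+)")


def usb_enumerations(dmesg_text):
    """Return a list of (port, speed, driver) for each enumeration line."""
    return [(m.group(1), m.group(2), m.group(4)) for m in ENUM_RE.finditer(dmesg_text)]


def negotiated_superspeed(dmesg_text):
    """True if any device in the capture enumerated at a SuperSpeed rate."""
    return any(speed.startswith("SuperSpeed") for _, speed, _ in usb_enumerations(dmesg_text))


if __name__ == "__main__":
    capture = (
        "[ 90.585477] usb 4-1: new high-speed USB device number 2 using xhci-hcd\n"
        "[ 91.477501] usb 5-1: new SuperSpeed Gen 1 USB device number 2 using xhci-hcd\n"
    )
    for port, speed, driver in usb_enumerations(capture):
        print(port, speed, driver)
    print("SuperSpeed seen:", negotiated_superspeed(capture))
```

Feeding it the Friday capture should report only a high-speed entry on 4-1, while the later capture also shows the SuperSpeed entry on 5-1.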
Jeremy Carter Posted August 1, 2020
So today my Helios4 has died. The power supply is not supplying the correct 5 V and 12 V voltages through the Molex connector, and the drives are sadly just sitting there making a ticking noise. One drive has now failed completely. I'll be honest, I've been running this non-stop since September 2017 with 99.999% uptime. I am now unable to power even one SATA drive stably.
gprovost Posted August 3, 2020 Author
@Jeremy Carter First, you should stop using the PSU if it's unable to supply the correct voltage, otherwise you might damage the drives. The ticking noise is the drives trying to spin up: the PSU fails to absorb the current spike, the voltage drops, and the drives reset. You can try to look for a replacement online, but I know that in some countries it is not easy to find one. An alternative would be to find a 12V / 8A PSU with the normal DC-IN jack (easier to find), then cut off its DC connector and replace it with the cable and connector from the failed PSU. If none of the above works for you, you can PM us and we can sell you one of the spare Helios4 PSUs we still have in stock.
Jeremy Carter Posted August 3, 2020
@gprovost I am in Australia. Do you think that it's a problem with the PSU brick or the Helios4 board? I should be able to source a Synology PSU, if that fits?
gprovost Posted August 3, 2020 Author
That's for sure a failing PSU, not a board problem. Yes, we are using the same 4-pin pinout as Synology, so it will be OK.
tbendiksen Posted August 4, 2020
It seems I will be joining the league of failing PSU owners. This is on my second Helios4, from batch 3, and it has been crashing lately. Roughly every second day it freezes completely: the OLED is frozen and nothing new appears on the serial port. The system had only 2 drives in it at the time, and one was failing, so for a long time I assumed it was the dying drive that was making it crash. The drive has been replaced with a new, working drive, but the crashes still occur. I started looking this up and found others with similar symptoms here. I have tried to limit the CPU speed to 800MHz as mentioned by others. This was yesterday; no crashes yet, but we'll see how that goes. But that is not a permanent solution of course.
My concern is that even with only 2 drives, and the system at roughly half load doing GlusterFS replication with the other Helios4 (batch 1 with 4 drives, no unintended downtime since initial install), the PSU gets very warm. It is sitting in a cupboard without a lot of air circulation (ambient temperature ~28C), and the PSU is uncomfortable to touch for more than 5 seconds. No smell of burning plastic or anything, thankfully, but it doesn't make me too happy. As mentioned, this is with only 2 drives and the CPU throttled to 800MHz. The second Helios4 is not physically reachable at the moment, so I cannot confirm whether it is in the same situation. Also, both are in use 24/7, so testing is possible but limited.
So it seems that I need to get a new PSU asap. Any recommendations in NL / EU?
gprovost Posted August 4, 2020 Author
50 minutes ago, tbendiksen said: So it seems that I need to get a new PSU asap. Any recommendations in NL / EU?
Go on Amazon and look for a TAIFU or KFD 12V 8.33A PSU. https://www.amazon.de/dp/B07Q72NBQK https://www.amazon.de/dp/B07P9PR5VB
1 hour ago, tbendiksen said: I have tried to limit the CPU speed to 800MHz as mentioned by others. This was yesterday, while no crashes yet we'll see how that goes. But that is not a permanent solution of course.
OK, keep us updated.
Jeremy Carter Posted August 5, 2020
On 8/3/2020 at 2:36 PM, gprovost said: That's for sure a failing PSU, not a board problem. Yes, we are using the same 4-pin pinout as Synology, so it will be OK.
Got my new PSU today, everything is back to normal. Now onwards for another 3 years of continuous usage! For the record, everyone: I got a 100W Synology 4-pin power supply for $89 AUD. You'll need 100W if you're going to be powering all 4 SATA drives at once.
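To relate the amp ratings quoted earlier in the thread to the 100W figure, here is a small Python sketch of the arithmetic (power = voltage x current). It only compares nameplate ratings; it says nothing about how well a particular PSU copes with the spin-up current spike, and the function name is just for illustration.

```python
def rated_watts(volts, amps):
    """Nameplate output power of a DC supply: P = V * I."""
    return volts * amps


if __name__ == "__main__":
    # Ratings mentioned in the posts above.
    for label, volts, amps in [
        ("12V / 8A", 12, 8),
        ("12V / 8.33A", 12, 8.33),
    ]:
        print(f"{label}: about {round(rated_watts(volts, amps))} W")
```

So the 12V / 8.33A units suggested above sit right at the 100W mark, and a 12V / 8A unit is slightly below it on paper.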
sandeepnd Posted August 6, 2020
Hello Team, I have been using a Helios4 NAS for the last year or so. It had been working great until now. I use 4 Seagate 4TB hard disks (ST4000DM004). I had always used a UPS with the device and it was safe. More recently we had some problems with the UPS and I had to connect the NAS directly to the power source. About two days ago, I tried doing an apt update / apt upgrade and the NAS came crashing down. It now no longer recognizes my hard disks. After a lot of troubleshooting, we found out that the SATA controller is not recognizing the hard disks. There are (very rare) times when the disks are recognized, but most of the time it fails to detect and mount the drives. I had used OpenMediaVault to configure the RAID5 array. The NAS boots Armbian from an SD card (16GB). We tried using lsscsi and it doesn’t detect any drives. OpenMediaVault detects the drives sometimes, but other times it does not. This morning it loaded up all the drives, but as I tried to back up the data, it started crashing and rebooting. Has anybody faced similar problems in the past? I have a feeling the on-board SATA controller is not working. What can I do to troubleshoot the problem and look for an easy fix? Any suggestions would be greatly appreciated. PS: I have attached all logs from /var/log and the output of my PuTTY session. Thanks, Sandeep
Attachments: Archive.zip, putty.zip
Guest Posted August 6, 2020
Hi everyone. I just updated my system and it seems that `/dev/thermal-cpu` has vanished, meaning that fancontrol cannot find it. How exactly is this directory created and how can I fix it?
Edit: I was able to solve the issue by changing `armada_thermal` to `f10e4078.thermal` in `/etc/udev/rules.d/90-helios4-hwmon.rules`.
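If you prefer to script that edit rather than do it by hand, a minimal Python sketch of the substitution could look like this. It works on text only and does not touch the file; the sample line is a made-up placeholder, not the real content of 90-helios4-hwmon.rules, and `patch_rules` is a hypothetical helper name.

```python
OLD_NAME = "armada_thermal"
NEW_NAME = "f10e4078.thermal"


def patch_rules(text, old=OLD_NAME, new=NEW_NAME):
    """Return (patched_text, number_of_replacements) without writing any file."""
    count = text.count(old)
    return text.replace(old, new), count


if __name__ == "__main__":
    # Hypothetical placeholder line, NOT the actual udev rule shipped with Helios4.
    sample = 'PLACEHOLDER_MATCH=="armada_thermal"'
    patched, replaced = patch_rules(sample)
    print(replaced, patched)
```

Checking the replacement count before writing anything back is a cheap way to notice that the file no longer contains the old name. Keep a backup of the original rules file, and expect to reload udev rules or reboot before `/dev/thermal-cpu` reappears.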
Igor Posted August 6, 2020
53 minutes ago, sandeepnd said: I have a feeling the on-board SATA controller is not working.
Does the board work fine with a clean, latest Armbian and some other drive?
sandeepnd Posted August 6, 2020
35 minutes ago, Igor said: Does the board work fine with a clean, latest Armbian and some other drive?
I haven't tried that. My first priority is to recover some of my data :(. Is there any way I can connect the hard disks to a desktop and use some software to recover the data? It is RAID5 with 4 hard disks.
Igor Posted August 6, 2020
2 hours ago, sandeepnd said: My first priority is to recover some of my data :(.
I do understand that. Just trying to narrow the problem down.
2 hours ago, sandeepnd said: Is there any way I can connect the hard disks to a desktop and use some software to recover the data? It is RAID5 with 4 hard disks.
Sorry, I am not a hard disk recovery expert. No idea.
lanefu Posted August 6, 2020
7 hours ago, sandeepnd said: I haven't tried that. My first priority is to recover some of my data :(. Is there any way I can connect the hard disks to a desktop and use some software to recover the data? It is RAID5 with 4 hard disks.
Yes, plug them into another Linux box and read up on mdraid / mdadm recovery. https://raid.wiki.kernel.org/index.php/RAID_Recovery
gprovost Posted August 7, 2020 Author
@sandeepnd Most probably it's again a dying PSU not providing enough voltage for the HDDs to spin up. If you read this thread you will see that it has been a recurrent problem. We provided instructions in past posts on how to check the voltage on the Molex connector. Which country are you located in? I can provide you a link to a good replacement. Yes, you can plug your HDDs into any other machine that runs Linux; you might need to install the mdadm package if it is not already present: https://wiki.kobol.io/helios4/mdadm/#import-an-existing-raid-array
@bramirez Good to see you fixed it. Out of curiosity, when you say update, you mean an upgrade from what version to what version?
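As a rough illustration of what importing the array on another Linux machine involves, here is a Python sketch that only builds and prints the mdadm command lines, so you can review them before running anything as root. It assumes mdadm is installed and that the array can be found by scanning; starting it with `--readonly` is my own cautious choice for a recovery situation, not something stated in the posts above, and the helper name is made up.

```python
import shlex


def mdadm_import_commands(readonly=True):
    """Build command lines for importing an existing md RAID array."""
    assemble = ["mdadm", "--assemble", "--scan"]
    if readonly:
        # Assumption: start the array read-only so nothing is written during recovery.
        assemble.append("--readonly")
    return [
        ["mdadm", "--examine", "--scan"],  # list arrays described by on-disk superblocks
        assemble,                          # assemble every array that was found
        ["cat", "/proc/mdstat"],           # check which md devices are now active
    ]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for cmd in mdadm_import_commands():
        print(shlex.join(cmd))
```

Once the array shows up in /proc/mdstat, it can be mounted (read-only, ideally) and the data copied off. The wiki page linked above remains the reference for the actual procedure.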
sandeepnd Posted August 10, 2020
@gprovost Thank you for your recommendations. We were able to recover the data. Could you please recommend a good power adapter to purchase in India? Also, we found something local that is rated 12V and 12A. Would that be OK? I assume the higher amp rating wouldn't be a problem?
gprovost Posted August 11, 2020 Author
@sandeepnd It is hard to find a replacement in India. I will contact you by PM to find another solution.
fri.K Posted August 28, 2020
Hi, I'm struggling with my Helios4 rebooting under high SATA load. My Helios4 runs 4x 2TB HGST SATA3 disks in RAID5 and a Samsung Evo SD card. Since the beginning there were some stability problems, but they were quite rare (about once per 3 months), so I didn't consider it a big problem. But when I upgraded the OS to
$ cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux 10 (buster)"
NAME="Debian GNU/Linux"
VERSION_ID="10"
the issue became a real problem, because the system can handle a 40-50MB/s transfer for only about a minute or less; then all LEDs stop blinking, and after a short while it reboots. I connected via the serial console, but tail -f /var/log/syslog didn't show anything at the moment of the freeze; the serial console seems to freeze as well. How can I debug it further, any ideas? It makes no difference whether I use the current or the legacy kernel, and the system is up to date. I do not use SPI nor flash.
My PSU was replaced about a month ago because the original one died, like many here. Unfortunately I had to buy a used PSU, as there was no other fast solution in my country at that time. Should I consider it a potential problem?
Quote
EDAC 12V - 12.5A
Model: EA11803A-120
Input: 100-240V, 50-60Hz
Output: 12V - 12,5A - max 150W
I made some more tests and I don't think the PSU is to blame. I ran stress in many configurations, e.g. stress -c 2 -i 2 -m 2 -d 4, and the Helios4 was stable for 20 minutes. Then I ran iperf3 -s and stressed it with a client for 20 minutes, then everything together, and no problem could be observed. Then I downloaded a 16GB file over HTTP and there was no problem; what is more, the speed was quite good, ~75MB/s. At the end I started a download of the same file over NFS at ~35MB/s, and after 30 seconds it froze. So the problem is with NFS:
nfs-kernel-server 1:1.3.4-2.5+deb10u1 armhf
Any idea what can be wrong with it?
$ cat /etc/exports
(...)
/path/Data 192.168.19.0/24(fsid=10,rw,subtree_check,crossmnt)
/path 192.168.19.0/24(ro,fsid=0,root_squash,no_subtree_check,hide)
(...)
Both standard PCs (x86_64) and an Odroid N2 (aarch64) are clients of this NFS server.
gprovost Posted September 1, 2020 Author
@fri.K That's very interesting info, and not the first time NFS has been pointed to as a possible root cause. Have you tried different NFS block size settings (wsize and rsize) on your client side? Or maybe reducing the number of NFS daemons? Most probably, under NFS load with default settings, the system reaches an unresponsive state, and the hardware watchdog kicks in and resets the system. Check if the watchdog service is running (systemctl status watchdog.service); if yes, then you could also disable it to remove it from the equation during your troubleshooting.
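To make the client-side suggestion concrete, here is a minimal Python sketch that builds a mount command with explicit `rsize`/`wsize` values so different block sizes can be tried one after another. The server address, mount point and the 32768-byte value are illustrative assumptions, not tested recommendations; only the export path comes from the post above.

```python
def nfs_mount_command(server, export, mountpoint, rsize=32768, wsize=32768):
    """Build a mount command line with explicit NFS read/write block sizes."""
    options = f"rsize={rsize},wsize={wsize}"
    return ["mount", "-t", "nfs", "-o", options, f"{server}:{export}", mountpoint]


if __name__ == "__main__":
    # Hypothetical server IP and mount point; adjust to your own setup.
    for size in (65536, 32768, 8192):
        cmd = nfs_mount_command("192.168.19.10", "/path/Data", "/mnt/helios4", size, size)
        print(" ".join(cmd))
```

The number of NFS daemons is a server-side setting; as far as I know, on Debian it is the RPCNFSDCOUNT value in /etc/default/nfs-kernel-server, but please verify that on your own install before changing it.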
Mangix Posted September 5, 2020
I wonder if I will join the same failed PSU club... I have two power supplies that are the same as the one included, EXCEPT for the fact that they have female connectors, not male. Is there any way to adapt them so that they work? Also, does anyone know if kernel 5.7 is stable? I'm running 4.19.63, as 4.19.108 (or whatever it was) had stability problems (random reboots). Same with some of the earlier 5.4 kernels.
Werner Posted September 6, 2020
4 hours ago, Mangix said: I have two power supplies that are the same as the one included EXCEPT for the fact that they are female connectors, not male. Any way to adapt them so that they work?
If they have the exact same voltage and current ratings, I'd say (assuming you are a bit familiar with the basics) pick up a couple of fitting connectors and solder them to the wires.
PEW Posted September 9, 2020
I recently upgraded from OMV4 to OMV5, and about two weeks after the upgrade I noticed the Helios4 was rebooting randomly. It has been doing this since late June. I decided to replace the hard disks because smartctl was showing them in red, and replaced all 4 with good drives. During the RAID rebuild process I was getting random reboots as well, until the final drive. I ran fsck on the RAID volume and fixed the file system. Random reboots are still happening. I looked at all the logs and could not find anything. I have not tried any stress tests or I/O tests. I was wondering if this has been an issue for others, or whether it is best to revert to the legacy kernel (4.19). I am also noticing an ERROR in U-Boot, "reserving fdt memory region failed (addr=2040000 size=6c000)", then "loading device tree ... OK", then "starting kernel". Any other ideas? Thanks
gprovost Posted September 10, 2020 Author
@PEW Can you check if the watchdog service is running (systemctl status watchdog.service)? If yes, then you could disable it and see whether you still experience reboots.
PEW Posted September 10, 2020
Yes. It is running. I will disable it.
patrick@helios4-PW:~$ sudo systemctl status watchdog.service | more
● watchdog.service - watchdog daemon
Loaded: loaded (/lib/systemd/system/watchdog.service; enabled; vendor preset: enabled)
Active: active (running) since Wed 2020-09-09 09:24:25 EDT; 1 day 1h ago
Process: 3723 ExecStartPre=/bin/sh -c [ -z "${watchdog_module}" ] || [ "${watchdog_module}" = "none" ] || /sbin/modprobe $watchdog_module (code=exited, status=0/SUCCESS)
Process: 3726 ExecStart=/bin/sh -c [ $run_watchdog != 1 ] || exec /usr/sbin/watchdog $watchdog_options (code=exited, status=0/SUCCESS)
Main PID: 3737 (watchdog)
Tasks: 1 (limit: 4503)
Memory: 752.0K
CGroup: /system.slice/watchdog.service
└─3737 /usr/sbin/watchdog
Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.
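When checking this on several boxes, it can help to pull just the relevant fields out of saved `systemctl status` output. Below is a minimal Python sketch that does so, assuming the "Loaded:" and "Active:" line layout shown in the output above; `unit_state` is a hypothetical helper name.

```python
import re

# "Loaded: loaded (/lib/systemd/system/watchdog.service; enabled; vendor preset: enabled)"
LOADED_RE = re.compile(r"Loaded:\s+loaded\s+\([^;]+;\s*(\w+);")
# "Active: active (running) since Wed 2020-09-09 09:24:25 EDT; 1 day 1h ago"
ACTIVE_RE = re.compile(r"Active:\s+(\S+)\s+\(([^)]+)\)")


def unit_state(status_text):
    """Return (enablement, active_state, sub_state); None for fields not found."""
    loaded = LOADED_RE.search(status_text)
    active = ACTIVE_RE.search(status_text)
    return (
        loaded.group(1) if loaded else None,
        active.group(1) if active else None,
        active.group(2) if active else None,
    )


if __name__ == "__main__":
    sample = (
        "Loaded: loaded (/lib/systemd/system/watchdog.service; enabled; vendor preset: enabled)\n"
        "Active: active (running) since Wed 2020-09-09 09:24:25 EDT; 1 day 1h ago\n"
    )
    print(unit_state(sample))
```

The first field tells you whether the unit would start again at the next boot, which is the part that matters once the service has been disabled and the board rebooted.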
Nova Posted September 10, 2020
Problem with Helios4 and a read-only filesystem:
rec_len=4, name_len=1, size=4096
[22707940.214601] Aborting journal on device mmcblk0p1-8.
[22707940.224332] EXT4-fs (mmcblk0p1): Remounting filesystem read-only
[22707940.230639] EXT4-fs error (device mmcblk0p1) in ext4_do_update_inode:5337: Journal has aborted
[22707940.244161] EXT4-fs error (device mmcblk0p1) in ext4_evict_inode:258: Journal has aborted
[22707940.257757] EXT4-fs error (device mmcblk0p1) in ext4_mkdir:2683: IO failure
This is the second SD card with the same problem. Is it possible to repair it, or is it a problem with the hardware or the quality of my SD card? Thanks
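To see at a glance which device is affected and how often, a saved kernel log can be summarized with a few lines of Python. This is only a sketch that assumes the message formats shown in the log above; it does not diagnose whether the card or the board is at fault, and the function name is made up.

```python
import re
from collections import Counter

# "EXT4-fs error (device mmcblk0p1) in ext4_mkdir:2683: IO failure"
ERROR_RE = re.compile(r"EXT4-fs error \(device (\w+)\) in (\w+):(\d+): (.+)")
# "EXT4-fs (mmcblk0p1): Remounting filesystem read-only"
REMOUNT_RE = re.compile(r"EXT4-fs \((\w+)\): Remounting filesystem read-only")


def summarize_ext4(log_text):
    """Return (error count per device, set of devices remounted read-only)."""
    errors = Counter()
    for line in log_text.splitlines():
        match = ERROR_RE.search(line)
        if match:
            errors[match.group(1)] += 1
    return dict(errors), set(REMOUNT_RE.findall(log_text))


if __name__ == "__main__":
    sample = (
        "[22707940.224332] EXT4-fs (mmcblk0p1): Remounting filesystem read-only\n"
        "[22707940.257757] EXT4-fs error (device mmcblk0p1) in ext4_mkdir:2683: IO failure\n"
    )
    print(summarize_ext4(sample))
```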
gprovost Posted September 11, 2020 Author
13 hours ago, Nova said: Problem with Helios4 and read-only filesystem.
Were they new microSD cards? Which model? How long was it before they started to show this issue?
gprovost Posted September 11, 2020 Author
@PEW Don't forget to reboot after disabling the watchdog.
Nova Posted September 11, 2020
33 minutes ago, gprovost said: Were they new microSD cards? Which model? How long was it before they started to show this issue?
Kingston Canvas Select Plus microSD (HC I, V10, A1). About 4 months. The issue appeared before I updated the system.
PEW Posted September 11, 2020
@gprovost Yes, I did. I was also fixing other issues. Please read: https://forum.openmediavault.org/index.php?thread/31517-log2ram/&postID=250089#post250172 I am trying to separate Armbian issues from OMV5 issues involving cron and the Armbian scripts. Thanks