Helios4 Support



I have a similar problem to @bigbrovar's with the dual-port USB 3.0 connector. To attach a 2.5" USB 3.0 disk to the Helios4 I have to use the lower USB port. The upper one seems to be dead for the disk, although I can still connect a thumb drive to the upper port at the same time.

 

I can live with the situation.


@FredK and @bigbrovar, thanks for the info.
On Friday I did some initial investigation and, as you said, the upper USB port seemed to work only at USB 2.0 speed (high-speed USB device), not 3.0:

[   42.581478] usb 4-1: new high-speed USB device number 2 using xhci-hcd
[   42.747232] usb 4-1: New USB device found, idVendor=0480, idProduct=0820, bcdDevice= 3.15
[   42.755452] usb 4-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[   42.762635] usb 4-1: Product: External USB 3.0
[   42.767095] usb 4-1: Manufacturer: TOSHIBA
[   42.771225] usb 4-1: SerialNumber: XXXXXXXXXXX6F
[   42.777009] usb-storage 4-1:1.0: USB Mass Storage device detected
[   42.785088] scsi host4: usb-storage 4-1:1.0
[   42.839846] usbcore: registered new interface driver uas
[   49.139357] scsi 4:0:0:0: Direct-Access     TOSHIBA  External USB 3.0 5438 PQ: 0 ANSI: 6
[   49.149674] sd 4:0:0:0: Attached scsi generic sg2 type 0
[   49.161946] sd 4:0:0:0: [sdc] Very big device. Trying to use READ CAPACITY(16).
[   49.171592] sd 4:0:0:0: [sdc] 7814037164 512-byte logical blocks: (4.00 TB/3.64 TiB)
[   49.179392] sd 4:0:0:0: [sdc] 4096-byte physical blocks
[   49.187768] sd 4:0:0:0: [sdc] Write Protect is off
[   49.195302] sd 4:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[   49.334260]  sdc: sdc1 sdc2
[   49.339876] sd 4:0:0:0: [sdc] Attached SCSI disk

This morning I tried again, and it works at USB 3.0 speed (SuperSpeed Gen 1 USB device):

[   90.585477] usb 4-1: new high-speed USB device number 2 using xhci-hcd
[   91.477501] usb 5-1: new SuperSpeed Gen 1 USB device number 2 using xhci-hcd
[   91.503435] usb 5-1: New USB device found, idVendor=0480, idProduct=0820, bcdDevice= 3.15
[   91.511642] usb 5-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[   91.518807] usb 5-1: Product: External USB 3.0
[   91.523274] usb 5-1: Manufacturer: TOSHIBA
[   91.527390] usb 5-1: SerialNumber: XXXXXXXXXXX6F
[   91.534376] usb-storage 5-1:1.0: USB Mass Storage device detected
[   91.540743] scsi host4: usb-storage 5-1:1.0
[   91.583403] usbcore: registered new interface driver uas
[   97.211091] scsi 4:0:0:0: Direct-Access     TOSHIBA  External USB 3.0 5438 PQ: 0 ANSI: 6
[   97.221100] sd 4:0:0:0: Attached scsi generic sg2 type 0
[   97.229812] sd 4:0:0:0: [sdc] Very big device. Trying to use READ CAPACITY(16).
[   97.238524] sd 4:0:0:0: [sdc] 7814037164 512-byte logical blocks: (4.00 TB/3.64 TiB)
[   97.246315] sd 4:0:0:0: [sdc] 4096-byte physical blocks
[   97.253361] sd 4:0:0:0: [sdc] Write Protect is off
[   97.258537] sd 4:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[   97.394956]  sdc: sdc1 sdc2
[   97.402343] sd 4:0:0:0: [sdc] Attached SCSI disk

We will need to investigate further.
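To see which speed a port negotiated without reading the whole log, the enumeration lines can be filtered. This is only a minimal sketch, assuming kernel messages in the same format as the two logs above; the function name `usb_link_speeds` is made up for illustration.

```shell
#!/bin/sh
# Print the bus-port ID and the negotiated speed for each USB enumeration
# line read from stdin. Assumes the "usb X-Y: new <speed> USB device number"
# message format seen in the logs above.
usb_link_speeds() {
    sed -n 's/.*usb \([0-9][0-9.-]*\): new \(.*\) USB device number.*/\1 \2/p'
}

# Typical use on the Helios4 (needs permission to read the kernel log):
#   dmesg | usb_link_speeds
```

The kernel also exposes the negotiated speed in Mbit/s under `/sys/bus/usb/devices/<bus-port>/speed`, where 480 corresponds to high-speed and 5000 to SuperSpeed Gen 1.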

 

 


So today my Helios4 has died. The power supply is not supplying the correct 5V and 12V voltages through the Molex connector, and the drives are sadly just sitting there making a ticking noise. One drive has now failed completely. I'll be honest, I've been running this non-stop since September 2017 with 99.999% uptime. I am now unable to power even one SATA drive stably.


@Jeremy Carter First, you should stop using the PSU if it's unable to supply the correct voltage, otherwise you might damage the drives. The ticking noise is the drives trying to spin up: the PSU fails to absorb the current spike, the voltage drops, and the drives reset.

You can try to look for a replacement online, but I know that in some countries it is not easy to find. An alternative would be to find a 12V / 8A PSU with the normal DC-IN jack (easier to find), then cut off its DC connector and replace it with the connector cable from the failed PSU. If none of the above works for you, you can PM us and we can sell you one of the spare Helios4 PSUs we still have in stock.

1 hour ago, gprovost said:

@Jeremy Carter First, you should stop using the PSU if it's unable to supply the correct voltage, otherwise you might damage the drives. The ticking noise is the drives trying to spin up: the PSU fails to absorb the current spike, the voltage drops, and the drives reset.

You can try to look for a replacement online, but I know that in some countries it is not easy to find. An alternative would be to find a 12V / 8A PSU with the normal DC-IN jack (easier to find), then cut off its DC connector and replace it with the connector cable from the failed PSU. If none of the above works for you, you can PM us and we can sell you one of the spare Helios4 PSUs we still have in stock.

 

I am in Australia. Do you think that it's a problem with the PSU brick or with the Helios4 board? I should be able to source a Synology PSU, if that fits?


It seems I will be joining the league of failing PSU owners.

This is on my second Helios4, from batch 3, and it has been crashing lately. Roughly every second day it freezes completely: the OLED is frozen and nothing new appears on the serial port.

The system had only 2 drives in it at the time, and one was failing, so for a long time I assumed it was the dying drive that was making it crash. The drive has been replaced with a new, working drive, but the crashing still occurs. I started looking this up and found others with similar symptoms here.

I have tried to limit the CPU speed to 800MHz as mentioned by others. This was yesterday, while no crashes yet we'll see how that goes. But that is not a permanent solution of course.
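For anyone wanting to try the same 800MHz cap, here is a minimal sketch that assumes the standard Linux cpufreq sysfs interface, where `scaling_max_freq` takes a value in kHz. The function name and its optional second argument are only illustrative, and the setting does not survive a reboot.

```shell
#!/bin/sh
# Cap the maximum CPU frequency (in kHz) for every cpufreq policy.
# The second argument is the sysfs root; it defaults to the real location and
# exists only so the function can be tried against a scratch directory.
cap_cpu_freq() {
    khz=$1
    root=${2:-/sys/devices/system/cpu/cpufreq}
    for f in "$root"/policy*/scaling_max_freq; do
        [ -e "$f" ] || continue
        echo "$khz" > "$f"
    done
}

# 800 MHz = 800000 kHz; run as root on the Helios4:
#   cap_cpu_freq 800000
```

Armbian images usually also read a `MAX_SPEED` value from `/etc/default/cpufrequtils` at boot, but check your own image before relying on that.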

My concern is that even with only 2 drives, and the system at roughly half load doing GlusterFS replication with the other Helios4 (batch 1 with 4 drives, no unintended downtime since initial install), the PSU gets very warm. It is sitting in a cupboard without a lot of air circulation (ambient temperature ~28C), but the PSU is uncomfortable to touch for more than 5 seconds. No smell of burning plastic or anything, thankfully, but it doesn't make me too happy. As mentioned, this is with only 2 drives and the CPU throttled to 800MHz.

The second helios4 is not physically reachable at the moment, so I cannot confirm if this is the same situation.

Also, both are in use 24/7, so testing is possible but limited.

 

So it seems that I need to get a new PSU asap. Any recommendations in NL / EU?

50 minutes ago, tbendiksen said:

So it seems that I need to get a new PSU asap. Any recommendations in NL / EU?

 

Go on Amazon and look for a TAIFU or KFD 12V 8.33A supply.

https://www.amazon.de/dp/B07Q72NBQK

https://www.amazon.de/dp/B07P9PR5VB

 

1 hour ago, tbendiksen said:

I have tried to limit the CPU speed to 800MHz as mentioned by others. This was yesterday, while no crashes yet we'll see how that goes. But that is not a permanent solution of course.

 Ok keep us updated.

On 8/3/2020 at 2:36 PM, gprovost said:

That's for sure a failing PSU, not a board problem. Yes, we use the same 4-pin pinout as Synology, so it will be OK.

 

Got my new PSU today, everything back to normal. Now onwards for another 3 years of continuous usage!

 

For the record, everyone: I got a 100W Synology 4-pin power supply for $89 AUD. You'll need 100W if you're going to be powering all 4 SATA drives at once.
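As a rough cross-check of the ratings mentioned in this thread, rated output power is simply voltage times current. A small sketch follows (the helper name `psu_watts` is made up for illustration). It only compares label ratings and says nothing about what the drives actually draw at spin-up.

```shell
#!/bin/sh
# Rated output power in watts for a DC supply: P = V * I.
psu_watts() {
    awk -v v="$1" -v i="$2" 'BEGIN { printf "%.1f\n", v * i }'
}

psu_watts 12 8      # the 12V / 8A replacement suggested earlier
psu_watts 12 8.33   # the 12V / 8.33A units suggested for the EU
```

Both land at roughly 96W to 100W, in the same range as the 100W Synology unit.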


Hello Team,

 

I have been using a Helios4 NAS for the last year or so. It had been working great until now. I use 4 Seagate 4TB hard disks (ST4000DM004). I had always used a UPS with the device and it was safe. More recently we had some problems with the UPS and I had to connect the NAS directly to the power source. About two days ago, I tried doing an apt update/upgrade and the NAS came crashing down. It no longer recognizes my hard disks. After a lot of troubleshooting, we found that the SATA controller is not recognizing the hard disks. Very rarely the disks are recognized, but most of the time the system fails to detect and mount the drives.

I had used OpenMediaVault to configure the RAID5 array. The NAS boots Armbian from an SD card (16GB). We tried lsscsi and it doesn't detect any drives. OpenMediaVault detects the drives sometimes, but other times it does not. This morning it loaded all the drives, but as I tried to back up the data, it started crashing and rebooting.

Has anybody faced similar problems in the past? I have a feeling the on-board SATA controller is not working. What can I do to troubleshoot the problem and look for an easy fix? Any suggestions would be greatly appreciated. 

 

PS: I have attached all logs from /var/log and the output of my connection to putty.

 

Thanks,

Sandeep

Archive.zip putty.zip


Hi everyone. I just updated my system and it seems that `/dev/thermal-cpu` has vanished, so fancontrol cannot find it. How exactly is it created, and how can I fix it?

 

edit: I was able to solve the issue by changing `armada_thermal` to `f10e4078.thermal` in `/etc/udev/rules.d/90-helios4-hwmon.rules`
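For others hitting the same thing, the edit described above can be scripted. This is just a sketch: the function name is made up, the default path and both device names are the ones from the post, and I have not checked what else the rules file contains on other images.

```shell
#!/bin/sh
# Replace the old thermal device name with the new one in a udev rules file,
# keeping a .bak copy next to it. The default path is the one named above.
fix_thermal_rule() {
    rules=${1:-/etc/udev/rules.d/90-helios4-hwmon.rules}
    sed -i.bak 's/armada_thermal/f10e4078.thermal/g' "$rules"
}

# On the Helios4 (as root), then reload the rules and re-trigger them:
#   fix_thermal_rule
#   udevadm control --reload-rules && udevadm trigger
```

A reboot after editing the file should have the same effect as the two `udevadm` commands.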

 

 

35 minutes ago, Igor said:

Does the board work fine with a clean, latest Armbian and some other drive?

 

I haven't tried that. My first priority is to recover some of my data :(.

Is there any way I can connect the hard disks to a desktop and use software to recover the data? It is RAID5 with 4 hard disks.

2 hours ago, sandeepnd said:

My first priority is to recover some of my data :(.


I do understand that. Just trying to narrow the problem down.

 

2 hours ago, sandeepnd said:

Is there any way I can connect the hard disks to a desktop and use software to recover the data? It is RAID5 with 4 hard disks.


Sorry, I am not a hard disk recovery expert. No idea.


@sandeepnd Most probably it's again a dying PSU not providing enough voltage for the HDDs to spin up. If you read this thread you will see that it has been a recurring problem. We provided instructions in earlier posts for checking the voltage on the Molex connector. Which country are you located in? We can provide you with a link to a good replacement.

 

Yes, you can plug your HDDs into any other machine that runs Linux. You might need to install the mdadm package if it is not already present: https://wiki.kobol.io/helios4/mdadm/#import-an-existing-raid-array
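Roughly, the import on the other machine looks like the sketch below. It assumes the array appears as `/dev/md0` and that the filesystem sits directly on it; both are guesses, as is the `/mnt/recovery` mount point, so check `/proc/mdstat` for the real device name. With `DRY_RUN=1` the commands are only printed.

```shell
#!/bin/sh
# Print (DRY_RUN=1) or execute the steps to bring up an existing md array
# on another Linux machine and mount it read-only.
# /dev/md0 and /mnt/recovery are assumed names, not taken from this thread.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then echo "$*"; else "$@"; fi
}

import_array() {
    md=${1:-/dev/md0}
    mnt=${2:-/mnt/recovery}
    run mdadm --assemble --scan      # find and assemble arrays from their superblocks
    run cat /proc/mdstat             # confirm the array and its member disks
    run mkdir -p "$mnt"
    run mount -o ro "$md" "$mnt"     # read-only, so nothing is written during recovery
}

# Preview the commands first, then run as root without DRY_RUN:
#   DRY_RUN=1 import_array
```

Mounting read-only is a deliberate choice here: until the data has been copied off, nothing should write to the array.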

 

 

@bramirez Good to see you fixed it. Out of curiosity, when you say update, which version did you upgrade from, and to which version?

 


Hi,

I'm struggling with my Helios4 rebooting under high SATA load. It runs 4x 2TB HGST SATA3 disks in RAID5 and a Samsung Evo SD card.

From the beginning there were some stability problems, but they were rare (about once every 3 months), so I didn't consider them a big problem. But when I upgraded the OS to

$ cat /etc/os-release 
PRETTY_NAME="Debian GNU/Linux 10 (buster)"
NAME="Debian GNU/Linux"
VERSION_ID="10"

the issue became a real problem: the system can handle a 40-50MB/s transfer for about a minute or less, then all LEDs stop blinking, and after a short while it reboots.

I connected via serial console, but

tail -f /var/log/syslog

didn't show anything at the moment of the freeze; the serial console seems to freeze as well.

How can I debug it further, any ideas?

It makes no difference whether I use the current or the legacy kernel, and the system is up to date. I do not use SPI or flash. My PSU was replaced about a month ago because the original one died, like many here. Unfortunately I had to buy a used PSU as there was no other fast solution in my country at that time. Should I consider it a potential problem?
 

Quote

EDAC 12V - 12.5A
Model: EA11803A-120

Input: 100-240V , 50-60Hz

Output: 12V - 12,5A - max 150W

 

 

I ran some more tests and I don't think that the PSU is to blame. I ran stress in many configurations,

stress -c 2 -i 2 -m 2 -d 4

but Helios4 was stable for 20 minutes, then I ran

iperf3 -s

and stressed it with client for 20 minutes, then all together and no problem could be observed.

Then I downloaded a 16GB file over HTTP and there was no problem; what is more, the speed was quite good at ~75MB/s. Finally I started downloading the same file over NFS at ~35MB/s, and after 30 seconds the system froze. So the problem is with NFS.

nfs-kernel-server 1:1.3.4-2.5+deb10u1 armhf

Any idea what can be wrong with it?

$ cat /etc/exports
(...)
/path/Data 192.168.19.0/24(fsid=10,rw,subtree_check,crossmnt)
/path 192.168.19.0/24(ro,fsid=0,root_squash,no_subtree_check,hide)
(...)

both standard PCs (x86_64) and Odroid N2 (aarch64) are clients to this NFS server


@fri.K That's very interesting info, and it's not the first time NFS has been pointed to as a possible root cause.

 

Have you tried different NFS block size settings (wsize and rsize) on the client side, or reducing the number of NFS daemons?

Most probably, under NFS load with default settings, the system reaches an unresponsive state and the hardware watchdog kicks in and resets the system.

Check whether the watchdog service is running (systemctl status watchdog.service). If it is, you could disable it to remove it from the equation during your troubleshooting.
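The NFS tuning suggested above could be tried along these lines. This is a sketch only: the server name, export path, mount point and the 32768-byte block size are placeholders rather than recommended values, and `nfs_mount_cmd` is a made-up helper that prints the command instead of running it.

```shell
#!/bin/sh
# Build a client-side mount command with explicit NFS block sizes.
# Server, export path and mount point are placeholders, as is the
# 32768 default for rsize/wsize.
nfs_mount_cmd() {
    server=$1; export_path=$2; mnt=$3; bs=${4:-32768}
    echo "mount -t nfs -o rsize=$bs,wsize=$bs $server:$export_path $mnt"
}

# Example with placeholder values (prints the command, does not run it):
nfs_mount_cmd nas.example /export/data /mnt/data 32768

# On the server, the number of nfsd threads can be lowered at runtime, e.g.:
#   rpc.nfsd 4
# On Debian 10 the persistent value is RPCNFSDCOUNT in
# /etc/default/nfs-kernel-server.
```

Changing one setting at a time (block size first, then thread count) makes it easier to tell which one, if any, affects the freeze.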

 

 

 


I wonder if I will join the same failed PSU club...

 

I have two power supplies that are the same as the one included EXCEPT for the fact that they are female connectors, not male. Any way to adapt them so that they work?

 

Also, does anyone know if kernel 5.7 is stable? I'm running 4.19.63, as 4.19.108 (or whatever it was) had stability problems (random reboots). Same with some of the earlier 5.4 kernels.

4 hours ago, Mangix said:

I have two power supplies that are the same as the one included EXCEPT for the fact that they are female connectors, not male. Any way to adapt them so that they work?

If they have the exact same voltage and current ratings, I'd say (assuming you are a bit familiar with the basics) pick up a couple of fitting connectors and solder them to the wires.


I recently upgraded from OMV4 to OMV5, and about two weeks after the upgrade I noticed the Helios4 was rebooting randomly. It has been doing this since late June.

 

I decided to replace the hard disks because smartctl was showing red, and replaced all 4 with good drives. During the RAID rebuild I was getting random reboots as well, until the final drive. I ran fsck on the RAID volume and fixed the file system. Random reboots are still happening. I looked at all the logs and could not find anything.

 

I have not tried any stress tests or io tests.

 

I was wondering whether this has been an issue for others, or whether it is best to revert to the legacy kernel (4.19).

I am also noticing an ERROR in U-Boot, "reserving fdt memory region failed (addr=2040000 size=6c000)", then "loading device tree ... OK", then "starting kernel".

 

Any other ideas?

 

Thanks


Yes, it is running. I will disable it.

 

patrick@helios4-PW:~$ sudo systemctl status watchdog.service | more
● watchdog.service - watchdog daemon
   Loaded: loaded (/lib/systemd/system/watchdog.service; enabled; vendor preset: enabled)
   Active: active (running) since Wed 2020-09-09 09:24:25 EDT; 1 day 1h ago
  Process: 3723 ExecStartPre=/bin/sh -c [ -z "${watchdog_module}" ] || [ "${watchdog_module}" = "none" ] || /sbin/modprobe $watchdog_module (code=exited, status=0/SUCCESS)
  Process: 3726 ExecStart=/bin/sh -c [ $run_watchdog != 1 ] || exec /usr/sbin/watchdog $watchdog_options (code=exited, status=0/SUCCESS)
 Main PID: 3737 (watchdog)
    Tasks: 1 (limit: 4503)
   Memory: 752.0K
   CGroup: /system.slice/watchdog.service
           └─3737 /usr/sbin/watchdog

Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.
 


Problem with Helios4 and a read-only filesystem.

 

rec_len=4, name_len=1, size=4096
[22707940.214601] Aborting journal on device mmcblk0p1-8.
[22707940.224332] EXT4-fs (mmcblk0p1): Remounting filesystem read-only
[22707940.230639] EXT4-fs error (device mmcblk0p1) in ext4_do_update_inode:5337: Journal has aborted
[22707940.244161] EXT4-fs error (device mmcblk0p1) in ext4_evict_inode:258: Journal has aborted
[22707940.257757] EXT4-fs error (device mmcblk0p1) in ext4_mkdir:2683: IO failure

 

This is the second SD card with the same problem. Is it possible to repair, or is it a problem with the hardware or the quality of my SD cards?

 

Thanks

 

 
