
Helios64 Support


gprovost


3 minutes ago, dancgn said:

I tried to install it through armbian-config, but it doesn't work.

You don't have to use armbian-config for such tasks. You can always install things the usual way, or however the documentation of the software in question describes it ;)

armbian-config may have been a bit neglected in the past due to a lack of resources.


48 minutes ago, Werner said:

You don't have to use armbian-config for such tasks. You can always install things the usual way, or however the documentation of the software in question describes it ;)

armbian-config may have been a bit neglected in the past due to a lack of resources.

 

I understand, but there is no official Docker image for arm64, so I tried it with armbian-config.


1 hour ago, dancgn said:

 

I understand, but there is no official Docker image for arm64, so I tried it with armbian-config.

The official ones work on arm64 without any problems. Only the reverse proxy doesn't.


19 minutes ago, flower said:

The official ones work on arm64 without any problems. Only the reverse proxy doesn't.

Hm, so I have to google a little bit. Thx

 

@flower

standard_init_linux.go:211: exec user process caused "exec format error"

 

This is what I get...
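For context: `exec format error` usually means the binary inside the container image was built for a different CPU architecture than the host (the Helios64 is an aarch64 machine). A minimal sketch for comparing the two, assuming the `docker` CLI is installed; the function name `docker_arch_for` is made up for illustration and covers only a few common machine names:

```shell
#!/bin/sh
# Map a `uname -m` machine name to the architecture label Docker uses.
# Only a few common names are handled; anything else is passed through.
docker_arch_for() {
    case "$1" in
        aarch64|arm64) echo arm64 ;;
        x86_64|amd64)  echo amd64 ;;
        armv7l)        echo arm ;;
        *)             echo "$1" ;;
    esac
}

# Usage (not run here): compare the host with what the image was built for.
#   docker_arch_for "$(uname -m)"
#   docker image inspect --format '{{.Architecture}}' IMAGE_NAME
```

If the two values differ, the image has no build for this board and a different tag or image is needed.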


On 10/26/2020 at 10:35 PM, gprovost said:

 

You're right. No excuse on our side: we are behind schedule and not up to expectations on software maturity. Maybe we should have stuck with LK4.4 (from Rockchip) and forgotten about Linux mainline for now :-/

However, we are still working on improving the stability, and we are optimistic that it will get better very soon.

Right now, as you know, LK5.8.16 is for some reason (which we still can't figure out) unstable compared to 5.8.14.

We also realized that the OMV install was removing some tx offloading tweaks. So there are a lot of little things here and there that we only discover along the way.

BTW, regarding this crash, are you sure tx offload on eth1 was disabled?

I respect your transparency and your vision, do you have a GitHub where I can contribute?

 


1 hour ago, dancgn said:

 

Hm, so I have to google a little bit. Thx

 

@flower

standard_init_linux.go:211: exec user process caused "exec format error"

 

This is what I get...

Try this one: https://github.com/nextcloud/docker/tree/master/.examples/docker-compose/with-nginx-proxy-self-signed-ssl/mariadb/fpm

omgwtfssl doesn't work though. You have to do the cert stuff yourself.

Here is my reverse proxy, which handles certs: https://github.com/flower1024/proxy

(But you have to tweak it... it also does dyndns updates with spdyn.de and has a fixed Europe/Berlin timezone. It's not really built for sharing.)


3 minutes ago, flower said:

Try this one: https://github.com/nextcloud/docker/tree/master/.examples/docker-compose/with-nginx-proxy-self-signed-ssl/mariadb/fpm

omgwtfssl doesn't work though. You have to do the cert stuff yourself.

Here is my reverse proxy, which handles certs: https://github.com/flower1024/proxy

(But you have to tweak it... it also does dyndns updates with spdyn.de and has a fixed Europe/Berlin timezone. It's not really built for sharing.)

 

But isn't that for Nextcloud? I'm looking for a document program with full-text search and OCR. I'm scanning the letters I receive.

I'm also fighting with my own Bitwarden server for passwords. :wacko:


2 minutes ago, dancgn said:

 

But isn't that for Nextcloud? I'm looking for a document program with full-text search and OCR. I'm scanning the letters I receive.

I'm also fighting with my own Bitwarden server for passwords. :wacko:

Oh, sorry. I was confused and thought you were talking about NC.


Hi,

After fixing the LED issue, I started checking whether snapraid works. On the Helios4, snapraid ran into issues due to the number of files on the snapraid "array": 32-bit addressing constraints caused snapraid to bork out regularly. No matter how I tweaked the snapraid configuration, it kept requiring more than 4GB of address space.

After running "sync" and "scrub" for the first time on the Helios64, I noticed an uncomfortable number of alleged ATA I/O errors like the ones below:

ata1.00: failed command: READ FPDMA QUEUED

Oct 29 21:13:46 localhost kernel: [642537.722453] ata1.00: exception Emask 0x2 SAct 0x80018000 SErr 0x400 action 0x6
Oct 29 21:13:46 localhost kernel: [642537.723136] ata1.00: irq_stat 0x08000000
Oct 29 21:13:46 localhost kernel: [642537.723514] ata1: SError: { Proto }
Oct 29 21:13:46 localhost kernel: [642537.723854] ata1.00: failed command: READ FPDMA QUEUED
Oct 29 21:13:46 localhost kernel: [642537.724351] ata1.00: cmd 60/00:78:40:bf:22/02:00:1e:00:00/40 tag 15 ncq dma 262144 in
Oct 29 21:13:46 localhost kernel: [642537.724351]          res 40/00:f8:58:23:61/00:00:00:00:00/40 Emask 0x2 (HSM violation)
Oct 29 21:13:46 localhost kernel: [642537.725771] ata1.00: status: { DRDY }
Oct 29 21:13:46 localhost kernel: [642537.726120] ata1.00: failed command: READ FPDMA QUEUED
Oct 29 21:13:46 localhost kernel: [642537.726685] ata1.00: cmd 60/00:80:40:c1:22/02:00:1e:00:00/40 tag 16 ncq dma 262144 in
Oct 29 21:13:46 localhost kernel: [642537.726685]          res 40/00:f8:58:23:61/00:00:00:00:00/40 Emask 0x2 (HSM violation)
Oct 29 21:13:46 localhost kernel: [642537.728134] ata1.00: status: { DRDY }
Oct 29 21:13:46 localhost kernel: [642537.728485] ata1.00: failed command: READ FPDMA QUEUED
Oct 29 21:13:46 localhost kernel: [642537.728989] ata1.00: cmd 60/08:f8:58:23:61/00:00:00:00:00/40 tag 31 ncq dma 4096 in
Oct 29 21:13:46 localhost kernel: [642537.728989]          res 40/00:f8:58:23:61/00:00:00:00:00/40 Emask 0x2 (HSM violation)
Oct 29 21:13:46 localhost kernel: [642537.730432] ata1.00: status: { DRDY }
Oct 29 21:13:46 localhost kernel: [642537.730792] ata1: hard resetting link
Oct 29 21:13:47 localhost kernel: [642538.206413] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Oct 29 21:13:47 localhost kernel: [642538.207993] ata1.00: configured for UDMA/133
Oct 29 21:13:47 localhost kernel: [642538.208300] sd 0:0:0:0: [sdb] tag#15 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08 cmd_age=0s
Oct 29 21:13:47 localhost kernel: [642538.208319] sd 0:0:0:0: [sdb] tag#15 Sense Key : 0x5 [current] 
Oct 29 21:13:47 localhost kernel: [642538.208336] sd 0:0:0:0: [sdb] tag#15 ASC=0x21 ASCQ=0x4 
Oct 29 21:13:47 localhost kernel: [642538.208355] sd 0:0:0:0: [sdb] tag#15 CDB: opcode=0x88 88 00 00 00 00 00 1e 22 bf 40 00 00 02 00 00 00
Oct 29 21:13:47 localhost kernel: [642538.208373] blk_update_request: I/O error, dev sdb, sector 505593664 op 0x0:(READ) flags 0x80700 phys_seg 64 prio class 0
Oct 29 21:13:47 localhost kernel: [642538.209577] sd 0:0:0:0: [sdb] tag#16 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08 cmd_age=0s
Oct 29 21:13:47 localhost kernel: [642538.209595] sd 0:0:0:0: [sdb] tag#16 Sense Key : 0x5 [current] 
Oct 29 21:13:47 localhost kernel: [642538.209610] sd 0:0:0:0: [sdb] tag#16 ASC=0x21 ASCQ=0x4 
Oct 29 21:13:47 localhost kernel: [642538.209626] sd 0:0:0:0: [sdb] tag#16 CDB: opcode=0x88 88 00 00 00 00 00 1e 22 c1 40 00 00 02 00 00 00
Oct 29 21:13:47 localhost kernel: [642538.209642] blk_update_request: I/O error, dev sdb, sector 505594176 op 0x0:(READ) flags 0x80700 phys_seg 63 prio class 0
Oct 29 21:13:47 localhost kernel: [642538.210826] ata1: EH complete
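To see at a glance which devices are affected in a log like the one above, the failed commands can be tallied per ATA device. A minimal sketch; the function name `count_ata_failures` is hypothetical, and it simply matches the `failed command:` lines in whatever kernel log text is piped in:

```shell
# Count "failed command" lines per ATA device in kernel log text read
# from stdin. Prints "<device> <count>" for each device, sorted by name.
count_ata_failures() {
    sed -n 's/.*\(ata[0-9][0-9]*\.[0-9][0-9]*\): failed command:.*/\1/p' \
        | sort | uniq -c | awk '{print $2, $1}'
}

# Usage (not run here):
#   dmesg | count_ata_failures
#   journalctl -k | count_ata_failures
```

If only one port shows up, a cable or connector problem on that port is a plausible suspect; errors spread across all ports point more towards the controller or its settings.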

 

After some searching on the internet, it appeared that these errors can be prevented by limiting the SATA link speed. Checking other server deployments, I had seen this behavior in an 8-disk mdadm RAID setup as well, where [new] WD Blue disks also showed these READ FPDMA QUEUED errors, which disappeared once ATA error handling started to turn the SATA link speed down to 3Gbps.

 

To test this out, I added the following to /boot/armbianEnv.txt:

extraargs=libata.force=3.0

Upon rebooting the box, it appears that libata indeed limited the SATA link speed for all drives to 3Gbps:

Oct 29 22:01:59 localhost kernel: [    3.143259] ata1: FORCE: PHY spd limit set to 3.0Gbps
Oct 29 22:01:59 localhost kernel: [    3.143728] ata1: SATA max UDMA/133 abar m8192@0xfa010000 port 0xfa010100 irq 238
Oct 29 22:01:59 localhost kernel: [    3.143736] ata2: FORCE: PHY spd limit set to 3.0Gbps
Oct 29 22:01:59 localhost kernel: [    3.144192] ata2: SATA max UDMA/133 abar m8192@0xfa010000 port 0xfa010180 irq 239
Oct 29 22:01:59 localhost kernel: [    3.144199] ata3: FORCE: PHY spd limit set to 3.0Gbps
Oct 29 22:01:59 localhost kernel: [    3.144654] ata3: SATA max UDMA/133 abar m8192@0xfa010000 port 0xfa010200 irq 240
Oct 29 22:01:59 localhost kernel: [    3.144661] ata4: FORCE: PHY spd limit set to 3.0Gbps
Oct 29 22:01:59 localhost kernel: [    3.145115] ata4: SATA max UDMA/133 abar m8192@0xfa010000 port 0xfa010280 irq 241
Oct 29 22:01:59 localhost kernel: [    3.145122] ata5: FORCE: PHY spd limit set to 3.0Gbps
Oct 29 22:01:59 localhost kernel: [    3.145603] ata5: SATA max UDMA/133 abar m8192@0xfa010000 port 0xfa010300 irq 242

Redoing the snapraid scrub, the READ FPDMA QUEUED errors had indeed disappeared. As the disks in the box are WD Red HDDs, there is not really a point in having a 6Gbps (~600MB/s) SATA link speed anyway: disk performance is rated at less than 300MB/s throughput. (Occasionally it reaches sustained sequential reads of around 130MiB/s for large files.)
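The "~600MB/s" figure follows from SATA's 8b/10b line encoding: every payload byte takes 10 bits on the wire, so the usable rate is the line rate divided by 10. A small sketch of that arithmetic; the function name `sata_payload_mb_s` is made up for illustration, and protocol overhead beyond the encoding is ignored:

```shell
# Approximate usable SATA bandwidth in MB/s for a line rate given in Gbps.
# With 8b/10b encoding, each payload byte costs 10 bits on the wire.
sata_payload_mb_s() {
    awk -v gbps="$1" 'BEGIN { printf "%d\n", gbps * 1000 / 10 }'
}

sata_payload_mb_s 6.0   # prints 600
sata_payload_mb_s 3.0   # prints 300
```

So even at the forced 3.0 Gbps the link still has roughly twice the headroom of the ~130MiB/s observed from a single disk.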

 

Note that YMMV.

 

Regards,


 


16 hours ago, usual user said:

Out of curiosity, with mainline kernel and ondemand governor in place.
Apply this:




echo 40000 > /sys/devices/system/cpu/cpufreq/policy0/ondemand/sampling_rate
echo 465000 > /sys/devices/system/cpu/cpufreq/policy4/ondemand/sampling_rate

Does it still crash in your use case?
If it does not crash any longer I will explain what is going on.

I just switched to "ondemand" and applied your settings:


sh -x /usr/local/bin/switchcpuondemand
+ cpufreq-set -c 0 -g ondemand
+ cpufreq-set -c 4 -g ondemand
+ echo 40000
+ echo 465000
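The `switchcpuondemand` script itself is not shown in the thread. A sketch of what such a helper might look like, assuming the sysfs layout from the quoted commands; the function name `set_ondemand_sampling_rate` and the `SYSFS_CPUFREQ` override are hypothetical, the latter only there so the function can be tried against a scratch directory:

```shell
# Root of the cpufreq sysfs tree; overridable so the function can be dry-run.
SYSFS_CPUFREQ="${SYSFS_CPUFREQ:-/sys/devices/system/cpu/cpufreq}"

# Write an ondemand sampling_rate for one policy and read it back.
# $1 = policy name (e.g. policy0), $2 = sampling rate in microseconds.
set_ondemand_sampling_rate() {
    f="$SYSFS_CPUFREQ/$1/ondemand/sampling_rate"
    if [ ! -w "$f" ]; then
        echo "$f not writable (is the ondemand governor active?)" >&2
        return 1
    fi
    echo "$2" > "$f" && cat "$f"
}

# Usage as root, with the values quoted above:
#   set_ondemand_sampling_rate policy0 40000
#   set_ondemand_sampling_rate policy4 465000
```

Reading the value back makes it obvious when the write was rejected or the governor was not switched first.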

 

Now wait

 

*UPDATE*

 

Crashed within _minutes_

On the serial console:


root@filer1:~# [70583.943650] BUG: spinlock lockup suspected on CPU#0, kswapd0/70
[70583.947652]  lock: 0xffffffc0f7f3f200, .magic: dead4ead, .owner: swapper/5/0, .owner_cpu: 5
[70583.974491] BUG: spinlock lockup suspected on CPU#4, kworker/4:0/4392
[70583.977415]  lock: 0xffffffc0f7f3f200, .magic: dead4ead, .owner: swapper/5/0, .owner_cpu: 5
[70583.978539] BUG: spinlock lockup suspected on CPU#2, kworker/2:1/12393
[70583.978545]  lock: 0xffffffc0f7f3f200, .magic: dead4ead, .owner: swapper/5/0, .owner_cpu: 5

 


Hi all, we finally figured out the issue with the latest build (20.08.13), and it actually wasn't related to the kernel. There was a silly mistake in the Armbian hardware optimization script that ended up skipping all the required performance and stability tweaks for Helios64. This has been fixed. Now we need to wait for the new images and packages to be generated and propagated to the repos.

Actually, the instructions we gave in our latest blog post were a bit pointless, since the issue wasn't bound to the kernel version :unsure: We will post something on our blog as soon as the new Armbian version is ready.

 

 

 

@djurny Thanks for the snapraid feedback and the suggested SATA speed tweak to avoid I/O errors. We will look at that closely to confirm it isn't rather an issue with your harness cable. The SATA 1 port is the one with the longest cable, so that is worth a close look.

 

@fromport Yeah, I think in the end it wasn't really a DVFS issue but just the missing hardware optimization tweaks, as stated above. For now, stick to the performance governor. Once the new Armbian packages for Helios64 are ready and you have upgraded, you can try reverting to the ondemand governor (default setting) and let us know how it works.

 

@eClapton Noted on the instructions / clarification required. We are improving our Wiki every week.

The hot swap procedure is a good point ;-) We didn't have this one on our TODO list.

We also welcome any contributions to our wiki: https://github.com/kobol-io/wiki

 

@Dmitry Antipov Thanks for the pointer, we will look at it. It doesn't seem that anyone has tried to port the Rockchip RNG engine to mainline. I guess we could add it via a patch to Armbian @piter75

On a somewhat similar topic, people have asked us how the crypto engine works on RK3399. It's something we will address soon.

 

@Jaques-Ludwig Yeah, as pointed out by @Werner, you can refer to the official install doc.

You can also find some pointers in an old guide we wrote for our previous product. The steps are the same, only the component versions are older: https://wiki.kobol.io/helios4/nextcloud/

But anyhow, it would be great if the install via armbian-config worked properly. Let's put that on the TODO list.

 

@dancgn You should create a dedicated Armbian thread on the Mayan install via armbian-config, since it's not really related to hardware. It would be easier to troubleshoot; otherwise it's going to get lost here.

You can also raise an issue here with detailed logs. The armbian-config tool is in big need of contributors, so don't expect it to be fixed soon. Maybe you can try to figure it out yourself and fix it in armbian-config ;-)

 

 

 

 


9 hours ago, fromport said:

Crashed within _minutes_

It was only a shot in the dark. I was inspired by this post. Attempting to trigger a next frequency change while a previous one is still in progress would have been a good explanation, as the crash appears to occur through dynamic frequency scaling. But having this configuration right is a good idea in any case. It's not always about it working or not. Most of the time, it's about not working as well as possible and all the small drawbacks add up in overall performance.


5 hours ago, Igor said:

Updated images and packages via apt that affect Helios64

 

Update to Armbian 20.08.21 Buster with Linux 5.8.17-rockchip64 was without any issues. The only little glitch I could find in syslog relates to Armbian hardware optimization:

 

Oct 31 19:53:46 helios64 armbian-hardware-optimization[701]: /usr/lib/armbian/armbian-hardware-optimization: line 214: /proc/irq/$(awk -F":" "/eth0/ {print \$1}" </proc/interrupts | sed 's/\ //g')/smp_affinity: No such file or directory

 


19 minutes ago, ebin-dev said:

 

Update to Armbian 20.08.21 Buster with Linux 5.8.17-rockchip64 was without any issues. The only little glitch I could find in syslog relates to Armbian hardware optimization:

 



Oct 31 19:53:46 helios64 armbian-hardware-optimization[701]: /usr/lib/armbian/armbian-hardware-optimization: line 214: /proc/irq/$(awk -F":" "/eth0/ {print \$1}" </proc/interrupts | sed 's/\ //g')/smp_affinity: No such file or directory

 

Hm, shouldn't the optimizations for rockchip64 be used here?

 

https://github.com/armbian/build/blob/5b43356d17d4e28d4f0453cc65529e7d73dffc7f/packages/bsp/common/usr/lib/armbian/armbian-hardware-optimization#L214


1 hour ago, ebin-dev said:

The only little glitch I could find in syslog relates to Armbian hardware optimization:


Oct 31 19:53:46 helios64 armbian-hardware-optimization[701]: /usr/lib/armbian/armbian-hardware-optimization: line 214: /proc/irq/$(awk -F":" "/eth0/ {print \$1}" </proc/interrupts | sed 's/\ //g')/smp_affinity: No such file or directory

 

There seems to be a race condition involving eth0 device discovery and armbian-hardware-optimize service.

 

Restarting the service after logging in with:

systemctl restart armbian-hardware-optimize

does not exhibit this issue.
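The error message fits that explanation: the failing line interpolates the result of an `awk` lookup directly into a `/proc/irq/...` path, so when eth0 is not yet listed in `/proc/interrupts` the lookup comes back empty and the path does not exist. A sketch of a guarded lookup, for illustration only; `irq_for_iface` is a hypothetical helper and not what the Armbian script does:

```shell
# Print the IRQ number(s) whose /proc/interrupts line ends with the given
# name. Reads the table from stdin; returns non-zero if nothing matched.
irq_for_iface() {
    awk -v name="$1" '
        $NF == name { sub(/:$/, "", $1); print $1; found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# Usage: only touch smp_affinity when the lookup succeeded.
#   if irq=$(irq_for_iface eth0 < /proc/interrupts); then
#       echo "eth0 uses IRQ $irq"
#   else
#       echo "eth0 has no IRQ yet, try again later" >&2
#   fi
```

A caller could retry for a few seconds on failure instead of writing to a path built from an empty value.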

 

49 minutes ago, Werner said:

Hm, shouldn't the optimizations for rockchip64 be used here?

Nope, the board is based on rk3399 and it uses separate optimisations - this is part of the ongoing rockchip64 / rk3399 mixup saga ;-)


I'm having a problem where ata5 on my device isn't responding. From dmesg I get the following:

[    3.499599] ata1: SATA link down (SStatus 0 SControl 300)
[    3.811439] ata2: SATA link down (SStatus 0 SControl 300)
[    4.123422] ata3: SATA link down (SStatus 0 SControl 300)
[    4.435415] ata4: SATA link down (SStatus 0 SControl 300)
[    4.747416] ata5: SATA link down (SStatus 0 SControl 300)
[    5.461077] ata5: SATA link down (SStatus 0 SControl 300)
[    6.185068] ata5: SATA link down (SStatus 0 SControl 300)
[    6.913043] ata5: SATA link down (SStatus 0 SControl 300)
[    7.637054] ata5: SATA link down (SStatus 0 SControl 300)
[    8.361129] ata5: SATA link down (SStatus 0 SControl 300)
[    9.081132] ata5: SATA link down (SStatus 0 SControl 300)
[    9.805046] ata5: SATA link down (SStatus 0 SControl 300)
[   10.517028] ata5: SATA link down (SStatus 0 SControl 300)
[   11.237033] ata5: SATA link down (SStatus 0 SControl 300)
[   11.952977] ata5: SATA link down (SStatus 0 SControl 300)
[   12.677074] ata5: SATA link down (SStatus 0 SControl 300)

 

That will keep going as long as I have a drive plugged into ata5. However, the same drive plugged into any other spot works just fine. To help narrow down the problem, I disconnected the SATA cables from spots 4 and 5 and switched them. Now spot 5 (plugged into 4) works. It looks like it's not a problem with the connection or the drive, but rather with the ata5 connection on the board. Any ideas on what I should do?
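A note on reading these lines: the kernel prints `SStatus` in hex, and its three low nibbles are the DET (device detection), SPD (negotiated speed generation) and IPM (interface power state) fields of the SATA status register. `SStatus 0` therefore means the port sees no device at all, whereas the `SStatus 133` in the earlier snapraid log is a detected device on an active 6.0 Gbps link. A minimal decoding sketch; `decode_sstatus` is a made-up name:

```shell
# Split an SStatus value (hex, as printed by the kernel) into its fields:
# DET (bits 0-3), SPD (bits 4-7) and IPM (bits 8-11).
decode_sstatus() {
    v=$((0x$1))
    echo "det=$(( $v & 0xf )) spd=$(( ($v >> 4) & 0xf )) ipm=$(( ($v >> 8) & 0xf ))"
}

decode_sstatus 133   # prints det=3 spd=3 ipm=1
decode_sstatus 0     # prints det=0 spd=0 ipm=0
```

A port that stays at `det=0` with a known-good drive and cable never gets as far as PHY communication, which is consistent with the fault following the board connector rather than the disk.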

 


Hi,

Are there any plans to make a toddler-proof version of the front grille that covers the buttons? For now I applied some lo-fi containment by simply flipping the front grille so it covers the front panel. Perhaps some snap-in plexiglass for the panel cutout, with a little doorknob type of thing?

I have not yet checked whether the buttons can be disabled in software (https://wiki.kobol.io/helios64/button/); perhaps the PMIC can be programmed from user space?

Regards,


9 hours ago, wurmfood said:

I'm having a problem where ata5 on my device isn't responding. From dmesg I get the following:


[    3.499599] ata1: SATA link down (SStatus 0 SControl 300)
[    3.811439] ata2: SATA link down (SStatus 0 SControl 300)
[    4.123422] ata3: SATA link down (SStatus 0 SControl 300)
[    4.435415] ata4: SATA link down (SStatus 0 SControl 300)
[    4.747416] ata5: SATA link down (SStatus 0 SControl 300)
[    5.461077] ata5: SATA link down (SStatus 0 SControl 300)
[    6.185068] ata5: SATA link down (SStatus 0 SControl 300)
[    6.913043] ata5: SATA link down (SStatus 0 SControl 300)
[    7.637054] ata5: SATA link down (SStatus 0 SControl 300)
[    8.361129] ata5: SATA link down (SStatus 0 SControl 300)
[    9.081132] ata5: SATA link down (SStatus 0 SControl 300)
[    9.805046] ata5: SATA link down (SStatus 0 SControl 300)
[   10.517028] ata5: SATA link down (SStatus 0 SControl 300)
[   11.237033] ata5: SATA link down (SStatus 0 SControl 300)
[   11.952977] ata5: SATA link down (SStatus 0 SControl 300)
[   12.677074] ata5: SATA link down (SStatus 0 SControl 300)

 

That will keep going as long as I have a drive plugged into ata5. However, the same drive plugged into any other spot works just fine. To help narrow down the problem, I disconnected the SATA cables from spots 4 and 5 and switched them. Now spot 5 (plugged into 4) works. It looks like it's not a problem with the connection or the drive, but rather with the ata5 connection on the board. Any ideas on what I should do?

 

I'm having the same issue with the SATA 2 port.
 




[    3.995846] ata2: SATA link down (SStatus 0 SControl 300)
[  610.677957] ata2: SATA link down (SStatus 0 SControl 300)
[  616.008066] ata2: SATA link down (SStatus 0 SControl 300)
[  621.383941] ata2: SATA link down (SStatus 0 SControl 300)
[ 8270.418638] ata2: SATA link down (SStatus 0 SControl 300)
[ 8275.824876] ata2: SATA link down (SStatus 0 SControl 300)
[ 8281.200684] ata2: SATA link down (SStatus 0 SControl 300)
[10315.187401] ata2: SATA link down (SStatus 0 SControl 300)


 

Disk is absolutely fine. Could the port be blown?


Apparently the Kobol team recommended staying off the new kernels and using their recommended ones. The problem is that every few days the red System Error LED starts flashing and the box becomes unreachable, so I have to force a restart to get access to it again.

This is what I see in dmesg so far:

[ 0.017590] ARM_SMCCC_ARCH_WORKAROUND_1 missing from firmware
[ 2.656869] Serial: AMBA driver
[ 2.658516] cacheinfo: Unable to detect cache hierarchy for CPU 0
[ 2.670600] loop: module loaded
[ 3.034951] pci 0000:00:00.0: PME# supported from D0 D1 D3hot
[ 3.040546] pci 0000:00:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring
[ 3.040753] pci 0000:01:00.0: [197b:0585] type 00 class 0x010601
[ 3.040874] pci 0000:01:00.0: reg 0x10: initial BAR value 0x00000000 invalid
[ 3.040882] pci 0000:01:00.0: reg 0x10: [io size 0x0080]
[ 3.040917] pci 0000:01:00.0: reg 0x14: initial BAR value 0x00000000 invalid
[ 3.040925] pci 0000:01:00.0: reg 0x14: [io size 0x0080]
[ 3.040959] pci 0000:01:00.0: reg 0x18: initial BAR value 0x00000000 invalid
[ 3.040966] pci 0000:01:00.0: reg 0x18: [io size 0x0080]
[ 3.041000] pci 0000:01:00.0: reg 0x1c: initial BAR value 0x00000000 invalid
[ 3.041007] pci 0000:01:00.0: reg 0x1c: [io size 0x0080]
[ 3.041041] pci 0000:01:00.0: reg 0x20: initial BAR value 0x00000000 invalid
[ 3.041048] pci 0000:01:00.0: reg 0x20: [io size 0x0080]
[ 3.041083] pci 0000:01:00.0: reg 0x24: [mem 0x00000000-0x00001fff]
[ 3.041118] pci 0000:01:00.0: reg 0x30: [mem 0x00000000-0x0000ffff pref]
[ 3.041156] pci 0000:01:00.0: Max Payload Size set to 256 (was 128, max 512)
[ 3.047177] pci_bus 0000:01: busn_res: [bus 01-1f] end is updated to 01
[ 3.047208] pci 0000:00:00.0: BAR 14: assigned [mem 0xfa000000-0xfa0fffff]
[ 3.047232] pci 0000:01:00.0: BAR 6: assigned [mem 0xfa000000-0xfa00ffff pref]
[ 3.047242] pci 0000:01:00.0: BAR 5: assigned [mem 0xfa010000-0xfa011fff]
[ 3.047262] pci 0000:01:00.0: BAR 0: no space for [io size 0x0080]
[ 3.047270] pci 0000:01:00.0: BAR 0: failed to assign [io size 0x0080]
[ 3.047278] pci 0000:01:00.0: BAR 1: no space for [io size 0x0080]
[ 3.047285] pci 0000:01:00.0: BAR 1: failed to assign [io size 0x0080]
[ 3.047292] pci 0000:01:00.0: BAR 2: no space for [io size 0x0080]
[ 3.047299] pci 0000:01:00.0: BAR 2: failed to assign [io size 0x0080]
[ 3.047306] pci 0000:01:00.0: BAR 3: no space for [io size 0x0080]
[ 3.047313] pci 0000:01:00.0: BAR 3: failed to assign [io size 0x0080]
[ 3.047320] pci 0000:01:00.0: BAR 4: no space for [io size 0x0080]
[ 3.047327] pci 0000:01:00.0: BAR 4: failed to assign [io size 0x0080]
[ 3.047338] pci 0000:00:00.0: PCI bridge to [bus 01]
[ 9.989921] input: adc-keys as /devices/platform/adc-keys/input/input1
[ 10.059999] rk_gmac-dwmac fe300000.ethernet: IRQ eth_wake_irq not found
[ 10.060011] rk_gmac-dwmac fe300000.ethernet: IRQ eth_lpi not found
[ 10.060164] rk_gmac-dwmac fe300000.ethernet: PTP uses main clock
[ 10.060327] rk_gmac-dwmac fe300000.ethernet: clock input or output? (input).
[ 10.060339] rk_gmac-dwmac fe300000.ethernet: TX delay(0x28).
[ 10.060349] rk_gmac-dwmac fe300000.ethernet: RX delay(0x20).
[ 10.060364] rk_gmac-dwmac fe300000.ethernet: integrated PHY? (no).
[ 10.060450] rk_gmac-dwmac fe300000.ethernet: cannot get clock clk_mac_speed
[ 10.060458] rk_gmac-dwmac fe300000.ethernet: clock input from PHY
[ 10.065534] rk_gmac-dwmac fe300000.ethernet: init for RGMII

[ 11.571344] dw-apb-uart ff1a0000.serial: forbid DMA for kernel console
[ 13.327166] usb 2-1.4: reset SuperSpeed Gen 1 USB device number 3 using xhci-hcd
[ 13.730681] EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: user_xattr,acl
[ 13.875498] zram: Added device: zram0
[ 13.876283] zram: Added device: zram1
[ 13.876953] zram: Added device: zram2
[ 13.916911] zram1: detected capacity change from 0 to 1992331264
[ 14.049513] cdn-dp fec00000.dp: Direct firmware load for rockchip/dptx.bin failed with error -2
[ 14.049764] rockchip-drm display-subsystem: [drm] Cannot find any crtc or sizes
[ 14.977397] Adding 1945632k swap on /dev/zram1. Priority:5 extents:1 across:1945632k SSFS
[ 15.109777] zram0: detected capacity change from 0 to 52428800
[ 16.065460] cdn-dp fec00000.dp: Direct firmware load for rockchip/dptx.bin failed with error -2
[ 17.158086] systemd[1]: Started Armbian ZRAM config.
[ 17.163966] systemd[1]: Starting Armbian memory supported logging...
[ 17.224816] EXT4-fs (zram0): mounted filesystem without journal. Opts: discard
[ 17.224855] ext4 filesystem being mounted at /var/log supports timestamps until 2038 (0x7fffffff)
[ 19.275437] systemd[1]: Started Armbian memory supported logging.
[ 19.280814] systemd[1]: Starting Journal Service...
[ 19.430698] systemd[1]: Started Journal Service.
[ 19.478887] systemd-journald[727]: Received request to flush runtime journal from PID 1
[ 19.603930] random: crng init done
[ 19.603939] random: 7 urandom warning(s) missed due to ratelimiting
[ 19.852787] cfg80211: Loading compiled-in X.509 certificates for regulatory database
[ 19.857488] cfg80211: Loaded X.509 cert 'sforshee: 00b28ddf47aef9cea7'
[ 19.858097] platform regulatory.0: Direct firmware load for regulatory.db failed with error -2
[ 19.858111] cfg80211: failed to load regulatory.db
[ 19.928741] rk_gmac-dwmac fe300000.ethernet eth0: PHY [stmmac-0:00] driver [RTL8211F Gigabit Ethernet] (irq=POLL)
[ 19.934085] rk_gmac-dwmac fe300000.ethernet eth0: No Safety Features support found
[ 19.934109] rk_gmac-dwmac fe300000.ethernet eth0: PTP not supported by HW
[ 19.934122] rk_gmac-dwmac fe300000.ethernet eth0: configuring for phy/rgmii link mode
[ 20.112591] r8152 2-1.4:1.0 eth1: carrier on
[ 20.193411] cdn-dp fec00000.dp: Direct firmware load for rockchip/dptx.bin failed with error -2
[ 21.368123] NFSD: Using UMH upcall client tracking operations.
[ 21.368136] NFSD: starting 90-second grace period (net f00000a1)
[ 13.462038] r8152 2-1.4:1.0 (unnamed net_device) (uninitialized): netif_napi_add() called with weight 256


and many, many others. I also don't see any network activity on the front panel; is that still pending as well? It would be nice to have something functional that does not crash. Is this a hardware issue or just a software issue?

Running your image 5.8.14-rockchip64 #20.08.10 SMP PREEMPT Tue Oct 13 16:58:01 CEST 2020 aarch64 GNU/Linux


On 10/19/2020 at 9:47 AM, gprovost said:

 

We will do an announcement soon but we realized we made a design mistake on the Helios64 2.5GbE port (LAN2 | eth1) which makes it not perform well in 1000Mb/s mode, however no impact in 2500baseT mode.

 

Regarding your iperf test, seems a bit funny. You sure your routing table is correct for traffic to go really to the expected interface since you are on the same subnet.

Can you do the same tests, but for each test disconnect the cable of the interface you are not testing.... or at least use different subnet.

Also can you check interface speed with ethtool (ethtool eth0 and ethtool eth1).

Could we please know what the design mistake is, and whether it means that we cannot use the second port? So we have a 2.5G port, and if we want to use it at 1Gbps it fails, I assume. Is there any fix for this or not? I'm currently using this as a 1Gbps connection, and it seems that every few days I get a kernel crash or something similar: the System Error LED starts blinking and I have to force a restart with the button to regain access.

The funny part is that I have both connections up, but once it fails I cannot reach the box from any network, so it is completely dead.


14 minutes ago, AurelianRQ said:

Could we please know what the design mistake is, and whether it means that we cannot use the second port? So we have a 2.5G port, and if we want to use it at 1Gbps it fails, I assume. Is there any fix for this or not? I'm currently using this as a 1Gbps connection, and it seems that every few days I get a kernel crash or something similar: the System Error LED starts blinking and I have to force a restart with the button to regain access.

The funny part is that I have both connections up, but once it fails I cannot reach the box from any network, so it is completely dead.

I just ran a test. On eth0 I get:

[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-30.00  sec  3.18 GBytes   912 Mbits/sec  876             sender
[  5]   0.00-30.00  sec  3.18 GBytes   911 Mbits/sec                  receiver

which is normal. The same test on eth1 gives:

[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-30.00  sec   403 MBytes   113 Mbits/sec  5610             sender
[  5]   0.00-30.00  sec   402 MBytes   113 Mbits/sec                  receiver

And between eth1 on both Helios64 units I get:

[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-30.00  sec   348 MBytes  97.3 Mbits/sec  5817             sender
[  5]   0.00-30.00  sec   348 MBytes  97.2 Mbits/sec                  receiver

 

which is a total mess, and I guess I cannot use this connection.
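The retransmit column is the telling number in these runs: 876 retransmits at 912 Mbits/sec on eth0 versus more than 5600 at roughly 100 Mbits/sec on eth1. To compare several runs side by side, the sender summary lines can be reduced to just those two values. A minimal sketch, assuming iperf3's default text output as shown above; `iperf_sender_summary` is a hypothetical name:

```shell
# Pull bitrate and retransmit count out of iperf3 "sender" summary lines
# read from stdin. Prints "<bitrate> <unit> retr=<count>" for each one.
iperf_sender_summary() {
    awk '$NF == "sender" { print $(NF-3), $(NF-2), "retr=" $(NF-1) }'
}

# Usage (not run here), with SERVER standing in for the iperf3 server:
#   iperf3 -c SERVER -t 30 | iperf_sender_summary
```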


48 minutes ago, AurelianRQ said:

which is a total mess, and I guess I cannot use this connection.

You could get yourself a 2.5GbE switch (in case you need both connections, e.g. for a management network). They are not expensive anymore.


Hi all,

 

Last week I rendered my device unbootable by upgrading to 20.08.13 (it gets stuck in U-Boot rather than while starting the kernel). This is 'known territory' though. So, starting again from a vanilla install, 20.08.10 gets stuck in U-Boot too, with the following error:

 


In
channel 0
CS = 0
MR0=0x18
MR4=0x1
MR5=0x1
MR8=0x10
MR12=0x72
MR14=0x72
MR18=0x0
MR19=0x0
MR24=0x8
MR25=0x0
channel 1
CS = 0
MR0=0x18
MR4=0x1
MR5=0x1
MR8=0x10
MR12=0x72
MR14=0x72
MR18=0x0
MR19=0x0
MR24=0x8
MR25=0x0
channel 0 training pass!
channel 1 training pass!
change freq to 416MHz 0,1
Channel 0: LPDDR4,416MHz
Bus Width=32 Col=10 Bank=8 Row=16 CS=1 Die Bus-Width=16 Size=2048MB
Channel 1: LPDDR4,416MHz
Bus Width=32 Col=10 Bank=8 Row=16 CS=1 Die Bus-Width=16 Size=2048MB
256B stride
channel 0
CS = 0
MR0=0x18
MR4=0x1
MR5=0x1
MR8=0x10
MR12=0x72
MR14=0x72
MR18=0x0
MR19=0x0
MR24=0x8
MR25=0x0
channel 1
CS = 0
MR0=0x18
MR4=0x1
MR5=0x1
MR8=0x10
MR12=0x72
MR14=0x72
MR18=0x0
MR19=0x0
MR24=0x8
MR25=0x0
channel 0 training pass!
channel 1 training pass!
channel 0, cs 0, advanced training done
channel 1, cs 0, advanced training done
change freq to 856MHz 1,0
ch 0 ddrconfig = 0x101, ddrsize = 0x40
ch 1 ddrconfig = 0x101, ddrsize = 0x40
pmugrf_os_reg[2] = 0x32C1F2C1, stride = 0xD
ddr_set_rate to 328MHZ
ddr_set_rate to 666MHZ
ddr_set_rate to 928MHZ
channel 0, cs 0, advanced training done
channel 1, cs 0, advanced training done
ddr_set_rate to 416MHZ, ctl_index 0
ddr_set_rate to 856MHZ, ctl_index 1
support 416 856 328 666 928 MHz, current 856MHz
OUT
Boot1: 2019-03-14, version: 1.19
CPUId = 0x0
ChipType = 0x10, 253
SdmmcInit=2 0
BootCapSize=100000
UserCapSize=14910MB
FwPartOffset=2000 , 100000
mmc0:cmd5,20
SdmmcInit=0 0
BootCapSize=0
UserCapSize=14792MB
FwPartOffset=2000 , 0
StorageInit ok = 66004
SecureMode = 0
SecureInit read PBA: 0x4
SecureInit read PBA: 0x404
SecureInit read PBA: 0x804
SecureInit read PBA: 0xc04
SecureInit read PBA: 0x1004
SecureInit read PBA: 0x1404
SecureInit read PBA: 0x1804
SecureInit read PBA: 0x1c04
SecureInit ret = 0, SecureMode = 0
atags_set_bootdev: ret:(0)
GPT 0x3380ec0 signature is wrong
recovery gpt...
GPT 0x3380ec0 signature is wrong
recovery gpt fail!
LoadTrust Addr:0x4000
No find bl30.bin
No find bl32.bin
Load uboot, ReadLba = 2000
Load OK, addr=0x200000, size=0xded88
RunBL31 0x40000
NOTICE:  BL31: v1.3(debug):42583b6
NOTICE:  BL31: Built : 07:55:13, Oct 15 2019
NOTICE:  BL31: Rockchip release version: v1.1
INFO:    GICv3 with legacy support detected. ARM GICV3 driver initialized in EL3
INFO:    Using opteed sec cpu_context!
INFO:    boot cpu mask: 0
INFO:    plat_rockchip_pmu_init(1190): pd status 3e
INFO:    BL31: Initializing runtime services
WARNING: No OPTEE provided by BL2 boot loader, Booting device without OPTEE initialization. SMC`s destined for OPTEE will return SMC_UNK
ERROR:   Error initializing runtime service opteed_fast
INFO:    BL31: Preparing for EL3 exit to normal world
INFO:    Entry point address = 0x200000
INFO:    SPSR = 0x3c9


U-Boot 2020.07-armbian (Sep 23 2020 - 17:44:09 +0200)

SoC: Rockchip rk3399
Reset cause: POR
DRAM:  3.9 GiB
PMIC:  RK808
SF: Detected w25q128 with page size 256 Bytes, erase size 4 KiB, total 16 MiB
MMC:   mmc@fe320000: 1, sdhci@fe330000: 0
Loading Environment from MMC... unable to select a mode
*** Warning - No block device, using default environment

In:    serial
Out:   serial
Err:   serial
Model: Helios64
Revision: 1.2 - 4GB non ECC
Net:   eth0: ethernet@fe300000
scanning bus for devices...
Hit any key to stop autoboot:  0
switch to partitions #0, OK
mmc1 is current device
Scanning mmc 1:1...
Found U-Boot script /boot/boot.scr
3185 bytes read in 5 ms (622.1 KiB/s)
## Executing script at 00500000
Boot script loaded from mmc 1
166 bytes read in 5 ms (32.2 KiB/s)
14089265 bytes read in 599 ms (22.4 MiB/s)
27275776 bytes read in 1156 ms (22.5 MiB/s)
libfdt fdt_check_header(): FDT_ERR_BADMAGIC
No FDT memory address configured. Please configure
the FDT address via "fdt addr <address>" command.
Aborting!
2698 bytes read in 8 ms (329.1 KiB/s)
Applying kernel provided DT fixup script (rockchip-fixup.scr)
## Executing script at 09000000
## Loading init Ramdisk from Legacy Image at 06000000 ...
   Image Name:   uInitrd
   Image Type:   AArch64 Linux RAMDisk Image (gzip compressed)
   Data Size:    14089201 Bytes = 13.4 MiB
   Load Address: 00000000
   Entry Point:  00000000
   Verifying Checksum ... OK
ERROR: Did not find a cmdline Flattened Device Tree
   Loading Ramdisk to f5174000, end f5ee3bf1 ... OK
FDT and ATAGS support not compiled in - hanging
### ERROR ### Please RESET the board ###

Installing to either SD or eMMC produced the same issue.

I had to revert to 20.08.4 for my device to boot properly again (from SD, as eMMC does not work yet for this release). Since the 5.8.14 kernel is considered stable, I followed https://blog.kobol.io/2020/10/27/helios64-software-issue/ to upgrade to that kernel from 20.08.4 (using armbian-config). Fearing I would pull in the bad 5.8.16 kernel, I did not update anything else from 20.08.4. Unfortunately, this once again rendered my device unbootable, with the same error message as above. So for me, 20.08.10 does not seem to be stable either. The Helios64 wiki claims it is, though, and many users of this forum seem to have no issues with 20.08.10 either.

After another fresh install of 20.08.4, with no further upgrades, the Helios64 has been running stably for over a week now (it had previously run stably on this version for a month too).

So this finally gets me to my question: Does anyone have any idea what goes wrong here? Does anyone have any suggestions on how to debug this? Why is my device unable to boot a fresh image for the 'stable' 20.08.10 release? At no point in the lifetime of the NAS did I tinker with U-boot.
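One check that might narrow this down: FDT_ERR_BADMAGIC means that whatever U-Boot found at the FDT address did not start with the flattened device tree magic number (0xd00dfeed, stored big-endian). The sketch below only tests whether a given file starts with that magic, so the .dtb on the boot medium can be inspected from another machine. The function names are made up for illustration, and the path to the .dtb is something you have to supply yourself; nothing here is taken from the Armbian tooling.

```python
import struct

# Magic number at the start of every flattened device tree blob (big-endian).
FDT_MAGIC = 0xD00DFEED


def has_fdt_magic(blob: bytes) -> bool:
    """Return True if the first four bytes are the FDT magic number."""
    if len(blob) < 4:
        return False
    (magic,) = struct.unpack(">I", blob[:4])
    return magic == FDT_MAGIC


def check_dtb_file(path: str) -> bool:
    """Read the first four bytes of the file at `path` and test the magic.

    `path` is whatever .dtb file you want to inspect (hypothetical; supply
    the real location from your own boot partition).
    """
    with open(path, "rb") as fh:
        return has_fdt_magic(fh.read(4))
```

If check_dtb_file() returns False for the device tree file the boot script is supposed to load, or the file is missing altogether, that would point at the image rather than at U-Boot itself. A True result only says the header magic is right, not that the rest of the blob is intact.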

Link to comment
Share on other sites

Subject: Question on disk usage

I completed my build of my Helios64 with five 12 TB WD drives in a RAID 5 array and got it up and running with only minor issues. (I had to manually install Open Media Vault, as the box would appear to hang if I attempted to install it using armbian-config.)

 

I've transferred over about 16TB of movies without issues and got Plex configured to use it. All is working well!

 

The one issue I've seen is that the box's hard drive lights will show lots of blinking and activity for several hours when no one is accessing the box at all. If I reboot, the lights/activity start up as soon as it reboots. But most of the time the lights are solidly lit when there is no activity, as I would expect.

Is there some kind of housekeeping being done periodically or other reason for the hard drives to be accessed?

 

Kirk

Link to comment
Share on other sites

12 minutes ago, kirkusinnc said:

Subject: Question on disk usage

I completed my build of my Helios64 with five 12 TB WD drives in a RAID 5 array and got it up and running with only minor issues. (I had to manually install Open Media Vault, as the box would appear to hang if I attempted to install it using armbian-config.)

 

I've transferred over about 16TB of movies without issues and got Plex configured to use it. All is working well!

 

The one issue I've seen is that the box's hard drive lights will show lots of blinking and activity for several hours when no one is accessing the box at all. If I reboot, the lights/activity start up as soon as it reboots. But most of the time the lights are solidly lit when there is no activity, as I would expect.

Is there some kind of housekeeping being done periodically or other reason for the hard drives to be accessed?

 

Kirk

Your array is probably still building. What does cat /proc/mdstat show you?
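If you would rather not read the raw output by eye, here is a minimal sketch that pulls the sync progress out of /proc/mdstat text. The function name and the sample text are made up for illustration (the sample is not output from this NAS), and the regular expressions assume the usual md progress line such as "recovery =  8.5%".

```python
import re

# Lines like "md0 : active raid5 ..." start a new array section.
_DEVICE = re.compile(r"^(md\w+)\s*:")
# Progress lines look like "... recovery =  8.5% (...) finish=...".
_PROGRESS = re.compile(r"(resync|recovery|reshape|check)\s*=\s*([\d.]+)%")

# Illustrative sample only; the numbers are invented.
SAMPLE_MDSTAT = """\
Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 sde[5] sdd[3] sdc[2] sdb[1] sda[0]
      46883364864 blocks super 1.2 level 5, 512k chunk, algorithm 2 [5/4] [UUUU_]
      [=>...................]  recovery =  8.5% (1000000000/11720841216) finish=900.0min speed=200000K/sec

unused devices: <none>
"""


def sync_progress(mdstat_text):
    """Map each md device with a running sync operation to (operation, percent)."""
    progress = {}
    device = None
    for line in mdstat_text.splitlines():
        dev_match = _DEVICE.match(line)
        if dev_match:
            device = dev_match.group(1)
            continue
        prog_match = _PROGRESS.search(line)
        if prog_match and device is not None:
            progress[device] = (prog_match.group(1), float(prog_match.group(2)))
    return progress

# On the NAS itself one would pass in open("/proc/mdstat").read() instead of
# the sample. An empty result means no resync/recovery/check is in progress.
```

An empty dictionary on the real file would suggest the drive activity has some other cause than the array build.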

Link to comment
Share on other sites

On my system (kernel 5.8.17), a kworker/fusb302_wq thread was running, permanently eating 18% of one core and keeping the load average at 1.0 or higher. The fusb30x module could not be unloaded; rmmod just hung, although its use count was zero. After blacklisting the module and rebooting, the problem was solved.
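For anyone who wants to check the same thing on their box, the use count and state come from /proc/modules, where each line is "name size use_count used_by state address". Below is a small sketch that extracts those fields for one module. The function name is made up, and the sample line is illustrative (the size and address are invented, not copied from my system).

```python
# Illustrative sample line in /proc/modules format; values are invented.
SAMPLE_MODULES = "fusb30x 49152 0 - Live 0x0000000000000000\n"


def module_info(proc_modules_text, name):
    """Return use count, dependants and state for `name`, or None if not loaded."""
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if len(fields) >= 5 and fields[0] == name:
            # The use count may be "-" on kernels built without module unloading.
            count = fields[2]
            use_count = int(count) if count.lstrip("-").isdigit() else None
            # "used_by" is "-" when empty, otherwise a comma-terminated list.
            used_by = [] if fields[3] == "-" else [d for d in fields[3].split(",") if d]
            return {"use_count": use_count, "used_by": used_by, "state": fields[4]}
    return None

# On a live system: module_info(open("/proc/modules").read(), "fusb30x")
```

The state field can be Live, Loading or Unloading, so looking at it while rmmod hangs might show whether the module is stuck in the middle of unloading.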

Now the question is: what did I disable? Some googling gave me this document:

Quote

The FUSB301 is a fully autonomous Type-C controller optimized for <15W applications. The FUSB301 offers CC logic detection for Source Mode, Sink Mode, Dual Role Port Mode, accessory detection support, and dead battery support. The FUSB301 features an external switch pin (SS_SW) to enable an external USB Super Speed Switch without interrupting the processor. The FUSB301 features ultra low power during operation, and an ultra thin, 10-Lead TMLP package.

So it only has to do with the USB-C port? The serial connection is still working fine.

Link to comment
Share on other sites

1 hour ago, RockBian said:

what did I disable?

Power delivery through USB-C.

 

Oh, and mode switching. So if it is set to the mode you want by default, it should be OK, but the better question is why it was eating clock cycles.

Link to comment
Share on other sites

This topic is now closed to further replies.