SteeMan Posted May 6
@ebin-dev Thanks for the explanation (I failed to notice that you were setting io_is_busy to a different value). I know that you are still working to stabilize this board, but at the end of this process it should be possible to submit a PR to incorporate your settings into the base builds, so I'm looking for things that would make that difficult. I just added a comment to PR6507 stating that the way io_is_busy is now set doesn't allow it to be overridden if needed.
ebin-dev Posted May 7 Author (edited)
On 5/6/2024 at 10:52 PM, SteeMan said:
    @ebin-dev Thanks for the explanation (i failed to notice that you were setting io_is_busy to a different value). I know that you are still working to stabilize this board. But at the end of this process, it should be possible to submit a PR to incorporate your settings into the base builds. Thus I'm looking for things that will make that difficult. I just added a comment to PR6507 stating that the way io_is_busy is now set doesn't allow it to be overridden if needed.

Thanks for the initiative with the PR. Here is what happens: a generic sampling_rate of 200000 is set for each cpufreq policy (this is how often the governor's worker routine should run, in microseconds). The sampling_rate should instead be set to a nominal value close to the cpuinfo_transition_latency (49000 and 40000 nanoseconds for the two clusters of the rk3399); since the latency is reported in nanoseconds but the sampling_rate is in microseconds, the resulting rate is effectively about 1000 times as large as the latency. For rk3399 boards the cluster sampling_rates of 49000 (big) and 40000 (little) are very responsive and not too demanding for the cpus. In that case no problems occur if io_is_busy is set to 1! I added a comment to PR6507 and requested that the sampling_rate should be a parameter (per cpu cluster) and be set to the value of the cpuinfo_transition_latency of that cpu.
Edited May 7 by ebin-dev
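[Editor's note] Not from the post, but a read-only way to check these numbers on your own board: the sketch below prints each cluster's cpuinfo_transition_latency (nanoseconds) next to the ondemand governor's current sampling_rate (microseconds). The policy numbers 0 and 4 match the rk3399's two clusters; show_sampling is a hypothetical helper, and its sysfs root is a parameter only so the logic can be tried outside a real board.

```shell
# show_sampling [ROOT]: print transition latency vs. sampling_rate for
# cpufreq policies 0 and 4. Clusters that do not exist are skipped.
# Reading ondemand/sampling_rate assumes the ondemand governor is active.
show_sampling() {
    root="${1:-/sys/devices/system/cpu/cpufreq}"
    for p in 0 4; do
        d="$root/policy${p}"
        [ -d "$d" ] || continue
        lat=$(cat "$d/cpuinfo_transition_latency")
        rate=$(cat "$d/ondemand/sampling_rate")
        echo "policy${p}: latency=${lat}ns sampling_rate=${rate}us"
    done
}

# On a Helios64, simply run (no root needed, it only reads):
# show_sampling
```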
The fine-tuned values for Helios64 are these (io_is_busy set to 1 again, sampling_rate set to more responsive values):

for cpufreqpolicy in 0 4 ; do
    echo 1 > /sys/devices/system/cpu/cpufreq/policy${cpufreqpolicy}/ondemand/io_is_busy
    echo 25 > /sys/devices/system/cpu/cpufreq/policy${cpufreqpolicy}/ondemand/up_threshold
    echo 10 > /sys/devices/system/cpu/cpufreq/policy${cpufreqpolicy}/ondemand/sampling_down_factor
    echo $(cat /sys/devices/system/cpu/cpufreq/policy${cpufreqpolicy}/cpuinfo_transition_latency) > /sys/devices/system/cpu/cpufreq/policy${cpufreqpolicy}/ondemand/sampling_rate
done

Edit: With these changes the timeout issues of the 2.5G interface are resolved for 6.6.29 and 6.6.30 too. Helios64 is one of the best ARM-based NAS systems I have seen so far. I am not going to replace it anytime soon.
Edited May 15 by ebin-dev
BipBip1981 Posted May 9 (edited)
Hi, I use my own build from scratch with the official Armbian 24.05 framework, kernel 6.6.30 and the DTB file posted by ebin-dev here, with the 408-1800 MHz ondemand governor, and for the first time since I have owned it, it is stable and usable for my use cases and it passes my pattern stability test! Many thanks to ebin-dev for the great work.
Edited May 9 by BipBip1981
Solution ebin-dev Posted May 14 Author (edited)
There are new Armbian 24.05 images available on the Helios64 download page: both images, Bookworm minimal and Jammy Desktop, are based on Linux 6.6.30 (download them!). Again, the rtl_nic firmware (in /lib/firmware/rtl_nic) should be replaced by the version downloaded from git.kernel.org, so that the 2.5G LAN interface works correctly. I would also recommend copying the dtb attached below to /boot/dtb/rockchip/rk3399-kobol-helios64.dtb (execute 'update-initramfs -u' after that). It includes the 75 mV bump of the opp states for the fast cores as suggested by @prahal, it enables the L2 cache info, and it enables hs400 speed on eMMC again. In particular the 75 mV bump has a very positive effect on stability. The bootloader that comes with it would appear to contain the Rockchip DDR blob; it should be fine. If you have an issue with u-boot, just flash linux-u-boot-edge-helios64_22.02.1_arm64 as recommended before. The cpufreq ondemand governor is still the best choice. Good settings are:

# cat /etc/default/cpufrequtils
ENABLE=true
MIN_SPEED=600000
MAX_SPEED=1800000
GOVERNOR=ondemand

Enjoy.
P.S.: If you would like a system more responsive to server tasks, or want to push the 2.5G interface to its limits, some fine tuning is helpful.
rk3399-kobol-helios64.dtb-6.6.30-L2-hs400-opp
Edited May 20 by ebin-dev
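[Editor's note] The firmware-replacement step above can be sketched as follows. The clone URL is the standard linux-firmware repository on git.kernel.org; stage_rtl_nic is a hypothetical helper, parameterised so the copy logic can be tried on scratch directories before touching /lib/firmware.

```shell
# Fetch the current firmware tree (network step, shown commented out):
# git clone --depth 1 \
#   https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git

# stage_rtl_nic SRC DST: copy every rtl_nic firmware blob from a
# linux-firmware checkout SRC into DST (e.g. /lib/firmware/rtl_nic).
stage_rtl_nic() {
    src="$1"
    dst="$2"
    mkdir -p "$dst"
    cp "$src"/rtl_nic/*.fw "$dst"/
}

# On the board, as root:
# stage_rtl_nic /path/to/linux-firmware /lib/firmware/rtl_nic
# update-initramfs -u
```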
SIGSEGV Posted May 16 (edited)
Thank you @ebin-dev for the dtb file & @prahal for figuring out what changes to make. I just reinstalled my NAS with the 6.6.30 kernel and the Bookworm image. The dtb file is the only change I've made to the configuration. Haven't had any issues yet: recompiling the ZFS kmod 2.2.4 to run with the new kernel hasn't crashed, max CPU frequency for the big cores is 1.8 GHz, NFS and SMB are working fine, and sbc-bench is stable as well.
Edited May 18 by SIGSEGV
snakekick Posted May 17 (edited)
Thank you @ebin-dev, same here: 6.6.29 is (much more) stable with your dtb. I know we are all the community and can make changes ourselves, but is it possible for you to open a ticket/change request (or whatever it is called) to add your changes to the main Armbian image, so that we have a well-running Helios out of the box? That would be fantastic!
Edited May 17 by snakekick
ebin-dev Posted May 17 Author
Actually the changes to the dtb were figured out and proposed by @prahal, so it would be up to Prahal to submit a PR. From my perspective there is nothing against a PR right now, as the changes are sufficiently tested and have a very positive effect on stability.
grek Posted May 17
Thanks guys. Last time I used Armbian_23.5.4 with the 5.15 kernel, and I had about 60 days of uptime. Today I switched to Armbian 24.05 with the patched DTB. I'm using ZFS, so all I do is stay on the older packages and put a hold on future updates. I have SMB and NFS configured; I ran some tests and everything looks very good. Even my power consumption is down (I have 2x WD RED 4TB + Exos 18TB + WD RED 14TB + 1 SSD) and is now about 23 W. Currently I still have the 'previous' cpufreq values (for comparison):

root@helios64:~# cat /etc/default/cpufrequtils
ENABLE=true
MIN_SPEED=408000
MAX_SPEED=1200000
GOVERNOR=ondemand
SIGSEGV Posted May 18
@ebin-dev - one quick question: applying this script at boot should change the sampling rate to the desired values?

for cpufreqpolicy in 0 4 ; do
    echo 1 > /sys/devices/system/cpu/cpufreq/policy${cpufreqpolicy}/ondemand/io_is_busy
    echo 25 > /sys/devices/system/cpu/cpufreq/policy${cpufreqpolicy}/ondemand/up_threshold
    echo 10 > /sys/devices/system/cpu/cpufreq/policy${cpufreqpolicy}/ondemand/sampling_down_factor
    echo $(cat /sys/devices/system/cpu/cpufreq/policy${cpufreqpolicy}/cpuinfo_transition_latency) > /sys/devices/system/cpu/cpufreq/policy${cpufreqpolicy}/ondemand/sampling_rate
done
ebin-dev Posted May 18 Author (edited)
Due to its current status I have disabled armbian-hardware-optimization (some 'volunteers' are needed to fix it) and run the following code in /etc/rc.local instead. Alternatively you could look for the line 'echo 200000' in armbian-hardware-optimization and change the lines around it accordingly.

@SIGSEGV Yes - that sets the per-cluster sampling_rates to 40000 and 51000 on the current kernel (done by the line starting with 'echo $(cat ...').

#!/bin/sh -e
#
# rc.local
#
# This script is executed at the end of each multiuser runlevel.
# Make sure that the script will "exit 0" on success or any other
# value on error.
#
# In order to enable or disable this script just change the execution
# bits.
#
# By default this script does nothing.

cd /sys/devices/system/cpu/cpufreq
for cpufreqpolicy in 0 4 ; do
    echo 1 > policy${cpufreqpolicy}/ondemand/io_is_busy
    echo 25 > policy${cpufreqpolicy}/ondemand/up_threshold
    echo 10 > policy${cpufreqpolicy}/ondemand/sampling_down_factor
    echo $(cat policy${cpufreqpolicy}/cpuinfo_transition_latency) > policy${cpufreqpolicy}/ondemand/sampling_rate
done

for i in $(awk -F":" "/ahci/ {print \$1}" < /proc/interrupts | sed 's/\ //g'); do
    echo 30 > /proc/irq/$i/smp_affinity
done
for i in $(awk -F":" "/xhci/ {print \$1}" < /proc/interrupts | sed 's/\ //g'); do
    echo 20 > /proc/irq/$i/smp_affinity
done

exit 0

Edited May 18 by ebin-dev
ebin-dev Posted May 18 Author (edited)
18 hours ago, grek said:
    Currently I have 'previous' values of cpu freq: (to have some compare) root@helios64:~# cat /etc/default/cpufrequtils ENABLE=true MIN_SPEED=408000 MAX_SPEED=1200000 GOVERNOR=ondemand

Just a remark: setting MIN_SPEED to 600 MHz would have NO effect on power consumption, but less frequency switching would be required: switching between 408 and 600 MHz would be avoided and the system could transition directly from 600 MHz to the highest frequency (essentially switching only between two states for each cpu). The huge power savings you report are impressive!
Edited May 18 by ebin-dev
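[Editor's note] Before raising MIN_SPEED it is worth confirming that the candidate value is an actual OPP of the cluster. A sketch (freq_available is a hypothetical helper; the path layout is the standard cpufreq sysfs interface, and not every driver exposes scaling_available_frequencies):

```shell
# freq_available ROOT POLICY FREQ_KHZ: succeed if FREQ_KHZ appears in the
# policy's scaling_available_frequencies (a single space-separated line).
freq_available() {
    root="$1"; policy="$2"; freq="$3"
    grep -qw "$freq" "$root/policy${policy}/scaling_available_frequencies"
}

# On the board, check that 600 MHz (600000 kHz) exists on the little cluster:
# freq_available /sys/devices/system/cpu/cpufreq 0 600000 && echo "600 MHz OK"
```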
ebin-dev Posted May 18 Author Posted May 18 (edited) Sorry - I am probably the only one who is interested in network performance of the 2.5G interface (theoretical max 2.35Gbit/s). (iperf 3.17.1 measurements attached) Spoiler # ./iperf3 -c 192.168.xx.30 -p 5201 (client -> helios64) Connecting to host 192.168.xx.30, port 5201 [ 5] local 192.168.xx.54 port 54011 connected to 192.168.xx.30 port 5201 [ ID] Interval Transfer Bitrate [ 5] 0.00-1.01 sec 284 MBytes 2.37 Gbits/sec [ 5] 1.01-2.01 sec 280 MBytes 2.35 Gbits/sec [ 5] 2.01-3.01 sec 281 MBytes 2.36 Gbits/sec [ 5] 3.01-4.00 sec 278 MBytes 2.34 Gbits/sec [ 5] 4.00-5.01 sec 280 MBytes 2.34 Gbits/sec [ 5] 5.01-6.01 sec 281 MBytes 2.36 Gbits/sec [ 5] 6.01-7.01 sec 280 MBytes 2.35 Gbits/sec [ 5] 7.01-8.01 sec 280 MBytes 2.35 Gbits/sec [ 5] 8.01-9.01 sec 281 MBytes 2.36 Gbits/sec [ 5] 9.01-10.01 sec 281 MBytes 2.35 Gbits/sec - - - - - - - - - - - - - - - - - - - - - - - - - [ ID] Interval Transfer Bitrate [ 5] 0.00-10.01 sec 2.74 GBytes 2.35 Gbits/sec sender [ 5] 0.00-10.01 sec 2.74 GBytes 2.35 Gbits/sec receiver iperf Done. 
# ./iperf3 -c 192.168.xx.30 -p 5201 -R (helios64 -> client) Connecting to host 192.168.xx.30, port 5201 Reverse mode, remote host 192.168.xx.30 is sending [ 5] local 192.168.xx.54 port 54013 connected to 192.168.xx.30 port 5201 [ ID] Interval Transfer Bitrate [ 5] 0.00-1.01 sec 276 MBytes 2.30 Gbits/sec [ 5] 1.01-2.01 sec 280 MBytes 2.35 Gbits/sec [ 5] 2.01-3.00 sec 280 MBytes 2.35 Gbits/sec [ 5] 3.00-4.01 sec 280 MBytes 2.35 Gbits/sec [ 5] 4.01-5.01 sec 280 MBytes 2.35 Gbits/sec [ 5] 5.01-6.00 sec 280 MBytes 2.35 Gbits/sec [ 5] 6.00-7.01 sec 281 MBytes 2.35 Gbits/sec [ 5] 7.01-8.00 sec 280 MBytes 2.35 Gbits/sec [ 5] 8.00-9.01 sec 281 MBytes 2.35 Gbits/sec [ 5] 9.01-10.01 sec 280 MBytes 2.35 Gbits/sec - - - - - - - - - - - - - - - - - - - - - - - - - [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.01 sec 2.74 GBytes 2.35 Gbits/sec 16 sender [ 5] 0.00-10.01 sec 2.73 GBytes 2.35 Gbits/sec receiver iperf Done ./iperf3 -c 192.168.xx.30 -p 5201 --bidir (bidirectional) Connecting to host 192.168.xx.30, port 5201 [ 5] local 192.168.xx.54 port 49681 connected to 192.168.xx.30 port 5201 [ 7] local 192.168.xx.54 port 49682 connected to 192.168.xx.30 port 5201 [ ID][Role] Interval Transfer Bitrate [ 5][TX-C] 0.00-1.01 sec 252 MBytes 2.11 Gbits/sec [ 7][RX-C] 0.00-1.01 sec 206 MBytes 1.72 Gbits/sec [ 5][TX-C] 1.01-2.01 sec 252 MBytes 2.11 Gbits/sec [ 7][RX-C] 1.01-2.01 sec 207 MBytes 1.74 Gbits/sec [ 5][TX-C] 2.01-3.01 sec 244 MBytes 2.05 Gbits/sec [ 7][RX-C] 2.01-3.01 sec 213 MBytes 1.78 Gbits/sec [ 5][TX-C] 3.01-4.01 sec 196 MBytes 1.65 Gbits/sec [ 7][RX-C] 3.01-4.01 sec 223 MBytes 1.87 Gbits/sec [ 5][TX-C] 4.01-5.01 sec 221 MBytes 1.86 Gbits/sec [ 7][RX-C] 4.01-5.01 sec 217 MBytes 1.82 Gbits/sec [ 5][TX-C] 5.01-6.01 sec 206 MBytes 1.73 Gbits/sec [ 7][RX-C] 5.01-6.01 sec 223 MBytes 1.87 Gbits/sec [ 5][TX-C] 6.01-7.01 sec 214 MBytes 1.80 Gbits/sec [ 7][RX-C] 6.01-7.01 sec 220 MBytes 1.84 Gbits/sec [ 5][TX-C] 7.01-8.01 sec 215 MBytes 1.80 Gbits/sec [ 7][RX-C] 7.01-8.01 sec 
217 MBytes 1.82 Gbits/sec [ 5][TX-C] 8.01-9.01 sec 187 MBytes 1.57 Gbits/sec [ 7][RX-C] 8.01-9.01 sec 224 MBytes 1.88 Gbits/sec [ 5][TX-C] 9.01-10.01 sec 206 MBytes 1.73 Gbits/sec [ 7][RX-C] 9.01-10.01 sec 218 MBytes 1.83 Gbits/sec - - - - - - - - - - - - - - - - - - - - - - - - - [ ID][Role] Interval Transfer Bitrate Retr [ 5][TX-C] 0.00-10.01 sec 2.14 GBytes 1.84 Gbits/sec sender [ 5][TX-C] 0.00-10.01 sec 2.14 GBytes 1.84 Gbits/sec receiver [ 7][RX-C] 0.00-10.01 sec 2.12 GBytes 1.82 Gbits/sec 13 sender [ 7][RX-C] 0.00-10.01 sec 2.12 GBytes 1.82 Gbits/sec receiver Edited July 27 by TRS-80 move long content inside spoiler 0 Quote
TDCroPower Posted May 19
@ebin-dev You are not alone, I also use the 2.5G interface 😉
grek Posted June 3
I found that I have some ATA errors in the logs... Could you please check yours? Uptime is 14 days, and it is 'stable'...

Linux helios64 6.6.30-current-rockchip64 #1 SMP PREEMPT Thu May 2 14:32:50 UTC 2024 aarch64 GNU/Linux
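[Editor's note] For anyone wanting to check their own logs for the same thing, a loose grep heuristic (not from the thread; check_ata_errors is a hypothetical helper, and the pattern is deliberately broad rather than exhaustive):

```shell
# check_ata_errors FILE...: print lines that look like libata error
# reports ("ataN...: ...error/failed/exception"). Use "-" to read stdin.
check_ata_errors() {
    grep -Ei 'ata[0-9]+.*(error|failed|exception)' "$@"
}

# Typical invocations on the board:
# dmesg | check_ata_errors -
# journalctl -k | check_ata_errors -
```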
BipBip1981 Posted June 3
Hi, I had this problem in the past. You may try to unseat and reseat the hard disk, or move it to another bay if you are not using all of them. The problem on my side was a bad contact in the SATA connector. Bye
Igor Posted June 4
There are some issues with the Helios64 patches for 6.9.y: https://github.com/armbian/build/pull/6691 Please check the notes.
ebin-dev Posted June 4 Author
Checking that note reveals: "Note: at the current status, helios64 board patches have been disabled because the base device tree does not apply on kernel 6.9." So Helios64 is cancelled as of Linux 6.9.x until someone takes care of the issue. Thank you very much.
prahal Posted June 8 (edited)
Started work on syncing the helios64 dts to upstream for 6.9: https://github.com/prahal/build/tree/helios64-6.9 . I removed the overclock-disabling patch, as it disables an overclock that nowadays is not in the included rk3399-opp.dtsi anyway (i.e. cluster0 has no opp6 and cluster1 has no opp8). It was not a high priority beforehand, but as the helios64 dts is starting to change, the patch carries unnecessary work. The patchset applies, the helios64 dts compiles fine to a dtb, and the kernel built and booted (but see below regarding the network connection, i.e. the ethernet MAC fixup). There are not many functional changes on my side (there are in the upstream dts). There are a few differences with the upstream dts I did not bring over, as I don't know whether they are leftovers from the initial patch set or new fixups - and for most of these changes, any issue could already have existed before 6.9. There is at least one new upstream change, 93b36e1d3748c352a70c69aa378715e6572e51d1 "arm64: dts: rockchip: Fix USB interface compatible string on kobol-helios64", that I brought forward. I also brought in the vcc3v0_sd node's "enable-active-high;" and "gpio = <&gpio0 RK_PA1 GPIO_ACTIVE_HIGH>;".

Beware: ethernet MAC change (now working as designed). Keeping the aliases from upstream fixes the application of the eth0 ethernet MAC (the interface is then renamed to end0 by armbian-hardware-optimization) - grabbed from OTP via SPI in u-boot. Thus, if you bound the MAC address that was previously generated by the kernel (instead of read from the hardware) to a host and IP, you will have to find the new IP assigned by the DHCP server to connect to the helios64. I believe this is fine for edge even if not for current (so should be good for 6.9?).
Edited June 8 by prahal - warn about MAC ethernet change
ebin-dev Posted June 22 Author (edited)
@prahal Current kernel 6.9.6 works just fine on Helios64 (headless) thanks to your PR. I am still using linux-u-boot-edge-helios64_22.02.1_arm64. Level 2 cache info is now included:

# lscpu -C
NAME ONE-SIZE ALL-SIZE WAYS TYPE        LEVEL SETS PHY-LINE COHERENCY-SIZE
L1d       32K     192K    4 Data            1  128                      64
L1i       32K     224K    2 Instruction     1  256                      64
L2       512K     1.5M   16 Unified         2  512                      64

The modified dtb (hs400, modified opp-table-1) is attached below (copy it to /boot/dtb/rockchip/rk3399-kobol-helios64.dtb; execute 'update-initramfs -u').
P.S.: SATA performance still needs to be addressed, and helios64-heartbeat-led.service is not working.
rk3399-kobol-helios64.dtb-6.9.6-L2-hs400-opp
Edited June 25 by ebin-dev
ebin-dev Posted July 10 Author
You may have noticed that Helios64 was moved to the supported section again about two weeks ago, thanks to @prahal, who volunteers as a maintainer! I think this should be appreciated, e.g. by reacting to his last message two postings earlier.
snakekick Posted July 18
I now have an uptime of 71 days with your dtb file, @ebin-dev. Will you commit your fixes to the official version?
rjgould Posted July 25
I thought I was one of the only ones out here still making use of one of these. Amazing to see it back in the supported section, thank you @prahal 👏 I need to read through more of the thread, but I saw @ebin-dev & @TDCroPower both mentioning the 2.5G network link. Can I ask you both, did you have to apply the solder fix to get it working? (https://blog.kobol.io/2020/11/13/helios64-2-5g-ethernet-issue/)
TDCroPower Posted July 25
@rjgould Yes, I soldered the pins directly. But I think the fix is for making the 2.5GBit port work with 1GBit connections.
SIGSEGV Posted July 27
Thank you @prahal!! - for getting us back into supported status.
TRS-80 Posted July 27 (edited)
On 7/25/2024 at 9:06 PM, rjgould said:
    I thought I was one of the only one out here still making use of one of these. Amazing to see it back in the supported section, thank you @prahal 👏

I don't even own one of these, but I am fascinated by how you guys finally got it stable, it sounds like - after all these years, and despite the fact that Kobol (our one-time partner) went out of business years ago (I just looked, and we are about a month shy of 3 years now). Such a triumph of the human will (and this project!), it could almost warm an old cynic's heart.

EDIT: I updated (removed) the "instability" comments about Helios64 in my NAS article: So, you want to run a file server (aka NAS)?
Edited July 27 by TRS-80 - linked article update
rjgould Posted July 29
So I don't know about everyone else, but for myself I've had my Helios64 running on Armbian 22.02 with the 5.15.93-rockchip64 kernel for some time. Mine is an on-LAN ZFS replication target for my main NAS, i.e. the overnight, low-power, local part of my backup strategy. I always knew it was not performing to its full potential, but it doesn't fall over. The biggest challenge I had was getting OpenMediaVault 6 on it with the ZFS plugins from the extras repository. The upgrades around those broke so often that I ended up learning Ansible so I could script the steps I was doing by hand. Also, because of the quirks and the "if it breaks, it's going to be painful" situation of it all, I stuck to running from the SD card, of which I keep an image from a stable point on my main NAS. Every once in a while I try to move it all forward to newer versions; when it doesn't go so well, I flash the SD card from the image and start reading the forums, which is what brought me back here 😅
mrjpaxton Posted August 2
Oh wow, it's been a while since I checked this thread. Apparently I'm still on a 2023 build. If I fully upgrade (via eMMC reflash) to Armbian 24.5.3 with the 6.6.36 kernel, will I still need the modified DTB file for improved stability, or has the DTB already been patched?
ebin-dev Posted August 3 Author (edited)
On 8/2/2024 at 7:06 AM, mrjpaxton said:
    If I fully upgrade (via eMMC reflash) to Armbian 24.5.3 with the 6.6.36 kernel, will I still need the modified DTB file for improved stability, or has the DTB already been patched?

It seems that the patches are still missing. Until this is fixed you can use my dtb attached below for the 6.6.x kernel branch (just copy it to /boot/dtb/rockchip/rk3399-kobol-helios64.dtb). The linux files (linux-image, linux-headers, linux-dtb) can be downloaded from beta.armbian.com (to be installed with 'dpkg -i linux*'). Copy the attached 6.6 dtb to its location, run 'update-initramfs -u' and reboot. That's it. I am on 6.10.2 right now - I also include the dtb for the 6.10 branch below.
rk3399-kobol-helios64.dtb-6.6.34-L2-hs400-opp
rk3399-kobol-helios64.dtb-6.10.2-L2-hs400-opp
Edited August 3 by ebin-dev
prahal Posted August 3
Mind the voltage changes: I am uneasy about pushing them to main until I either get no more crashes for a long, long time, or the issue is nailed down to its cause. I might revisit this choice. I plan to get the eMMC hs400 fix into Armbian and vanilla Linux - probably not for the August Armbian release, but I expect for the next one.
mrjpaxton Posted August 4 (edited)
Quote:
    mind the voltage changes I am uneasy to push to main until I either get no more crashes for a long long time or the issue is nailed down to its cause. I might revisit this choice. I plan to have the emmc hs400 fix in Armbian and vanilla linux. Probably not for the august Armbian release but I expect for the next one.

Yes, I have also experienced a crash without the patch. It crashed when I tried to attach an HDD via a SATA-to-USB3 adapter for backup (I made sure its AC adapter was used), so the system is more "stable" and responsive than before, but not completely stable yet. Would it be helpful to disable Armbian's ramlog and get journald to write any dumps to storage so that I can check the logs from the last boot, or are you already aware of *all* the issues that cause these random crashes/freezes?
Edited August 4 by mrjpaxton
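[Editor's note] One way to do what mrjpaxton describes, using standard systemd-journald settings (whether Armbian's ramlog also needs disabling is a separate step): switch journald to persistent storage so the previous boot's log survives a freeze. The edit is sketched as a hypothetical helper so the logic can be tried on a copy of the config first.

```shell
# enable_persistent_journal CONF: force Storage=persistent in a
# journald.conf-style file (uncommenting the key if present,
# appending it otherwise).
enable_persistent_journal() {
    conf="$1"
    if grep -q '^#\?Storage=' "$conf"; then
        sed -i 's/^#\?Storage=.*/Storage=persistent/' "$conf"
    else
        printf 'Storage=persistent\n' >> "$conf"
    fi
}

# On the board, as root:
# mkdir -p /var/log/journal
# enable_persistent_journal /etc/systemd/journald.conf
# systemctl restart systemd-journald
# ...then after the next freeze/reboot, inspect the previous boot:
# journalctl -b -1 -p err
```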