magostinelli Posted March 25, 2024

In my case, none of the suggested kernels works: the system freezes with the red LED blinking, sometimes after 10 seconds, sometimes after a few minutes. How is that possible?
ebin-dev (Author) Posted March 25, 2024

1 hour ago, magostinelli said: In my case, none of the suggested kernels works: the system freezes with the red LED blinking, sometimes after 10 seconds, sometimes after a few minutes. How is that possible?

Did you check that the recommended bootloader is flashed to eMMC, and that the rtl_nic firmware is updated? Did you start from the Armbian image with kernel 6.1.36?
magostinelli Posted March 26, 2024 (edited)

I started from 6.1.36 (the only stable kernel for me). How can I check the bootloader and the rtl_nic firmware? The bootloader was updated when I tried the latest kernel from Armbian. Thanks.

EDIT: I reinstalled 6.6.8 and it now seems to be stable; I will test it.

Edited March 27, 2024 by magostinelli (update)
Trillien Posted March 31, 2024

Hi @ebin-dev,

The link to https://imola.armbian.com/apt/pool/main/l/linux-u-boot-helios64-edge/linux-u-boot-edge-helios64_22.02.1_arm64.deb is dead. There is another version on Imola: linux-u-boot-helios64-edge_24.2.1_arm64__2022.07-Se092-Pe990-H8c72-V65aa-B11a8-R448a.deb

Do you know whether we can use it instead?
Trillien Posted March 31, 2024 (edited)

Hi @TDCroPower,

About the rtl_nic firmware: as instructed, I downloaded the 9 files and copied them into /lib/firmware/rtl_nic. I get an error at startup:

Apr 01 00:10:55 helios64 kernel: r8152 2-1.4:1.0: checksum fail
Apr 01 00:10:55 helios64 kernel: r8152 2-1.4:1.0: unable to load firmware patch rtl_nic/rtl8156a-2.fw (-14)

Do you encounter this error too?

Edited March 31, 2024 by Trillien
ebin-dev (Author) Posted April 1, 2024

9 hours ago, Trillien said: Do you know whether we can use it instead?

I replaced the links to linux-u-boot-edge and to the rtl_nic firmware.
Trillien Posted April 1, 2024

Thank you ebin-dev. Works fine now!

For info, after moving the OS to eMMC, the system failed to start three times. It hit "Internal error: Oops - undefined instruction" during Linux 6.6.8 startup.

I still have an issue with rk3288-crypto, but it doesn't seem to affect the system much:

Apr 01 14:18:12 helios64 kernel: rk3288-crypto ff8b0000.crypto: will run requests pump with realtime priority
Apr 01 14:18:12 helios64 kernel: rk3288-crypto ff8b0000.crypto: Register ecb(aes) as ecb-aes-rk
Apr 01 14:18:12 helios64 kernel: rk3288-crypto ff8b0000.crypto: Register cbc(aes) as cbc-aes-rk
Apr 01 14:18:12 helios64 kernel: rk3288-crypto ff8b0000.crypto: Register ecb(des) as ecb-des-rk
Apr 01 14:18:12 helios64 kernel: rk3288-crypto ff8b0000.crypto: Register cbc(des) as cbc-des-rk
Apr 01 14:18:12 helios64 kernel: rk3288-crypto ff8b0000.crypto: Register ecb(des3_ede) as ecb-des3-ede-rk
Apr 01 14:18:12 helios64 kernel: rk3288-crypto ff8b0000.crypto: Register cbc(des3_ede) as cbc-des3-ede-rk
Apr 01 14:18:12 helios64 kernel: rk3288-crypto ff8b0000.crypto: Register sha1 as rk-sha1
Apr 01 14:18:12 helios64 kernel: rk3288-crypto ff8b0000.crypto: Register sha256 as rk-sha256
Apr 01 14:18:12 helios64 kernel: rk3288-crypto ff8b0000.crypto: Register md5 as rk-md5
Apr 01 14:18:12 helios64 kernel: rk3288-crypto ff8b8000.crypto: can't request region for resource [mem 0xff8b8000-0xff>
Apr 01 14:18:12 helios64 kernel: rk3288-crypto ff8b8000.crypto: Crypto Accelerator not successfully registered
Apr 01 14:18:12 helios64 kernel: rk3288-crypto: probe of ff8b8000.crypto failed with error -16
TDCroPower Posted April 2, 2024

I had a little time to test again today and had an idea: simply install OMV on the eMMC + eMMC combination and then transfer it to the SSD. The installation worked without any problems, but unfortunately armbian-config does not offer a transfer from eMMC + eMMC to eMMC + SSD. So I discarded this idea too and tried to install everything on my microSD and then switch to eMMC + SSD. Well... the installation fails here too! The only combination on which OMV installs cleanly is eMMC + eMMC; everything else fails and then has the known strange cpufrequtils entries.

@ebin-dev do you have any ideas as to what is causing the problem and how I can fix it? I'm slowly running out of good ideas 🥲
Trillien Posted April 2, 2024

I've just been defeated by my Helios64 while trying to run Linux 6.6.8 from eMMC. After I move the OS from the SD card and restart the device from eMMC, I encounter unexpected reboots and freezes. So I'm resigned and will try to get things more stable using the SD card. I'll keep you informed.
ebin-dev (Author) Posted April 3, 2024

19 hours ago, Trillien said: Actually, as I move the OS from the SD card and then restart the device from emmc, I encounter unexpected reboots and freezes.

Just flash the bootloader to eMMC again. Double check that the UUIDs in /etc/fstab and in /boot/armbianEnv.txt are correct. And double check that all your log files are present on eMMC (otherwise rsync them manually): some apps do not recreate them once installed. If your Helios64 is still not running smoothly, reinstall 6.6.8 on eMMC, or alternatively 6.1.71 or 5.15.93.
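The UUID check suggested above can be sketched in shell. This is only a minimal sketch under stated assumptions: the function names are hypothetical, the root filesystem is assumed to be referenced as UUID=... in /etc/fstab and as rootdev=UUID=... in /boot/armbianEnv.txt, and the eMMC root partition is assumed to be /dev/mmcblk1p1.

```shell
#!/bin/sh
# Hypothetical helpers; file locations and formats are assumptions (see above).

# Print the UUID that the fstab file assigns to the root mount point "/".
fstab_root_uuid() {
    awk '$1 ~ /^UUID=/ && $2 == "/" { sub(/^UUID=/, "", $1); print $1; exit }' "${1:-/etc/fstab}"
}

# Print the UUID from the rootdev= line of the Armbian environment file.
env_root_uuid() {
    sed -n 's/^rootdev=UUID=//p' "${1:-/boot/armbianEnv.txt}" | head -n 1
}

# Compare both files against the UUID of the partition you expect to boot from.
check_root_uuid() {
    expected=$1
    fstab_file=${2:-/etc/fstab}
    env_file=${3:-/boot/armbianEnv.txt}
    [ "$(fstab_root_uuid "$fstab_file")" = "$expected" ] || { echo "fstab mismatch"; return 1; }
    [ "$(env_root_uuid "$env_file")" = "$expected" ] || { echo "armbianEnv mismatch"; return 1; }
    echo "ok"
}
```

A possible call, run as root, would be `check_root_uuid "$(blkid -s UUID -o value /dev/mmcblk1p1)"`; adjust the partition name if your layout differs.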
Trillien Posted April 4, 2024

So, I've just tried the procedure once more:

- I flashed the Bookworm image onto an SD card.
- I booted my Helios64 from the SD card (it started like a charm!) and downloaded every piece of the cake.
- I set the CPU limit using "on-demand".
- I disabled armbian.list for apt update.
- I added the files at /lib/firmware/rtl_nic/.
- I replaced the dtb file at /boot/dtb/rockchip for hs400 support and L2 cache.
- I set the NIC offload options.
- At this point I copied the SD card to eMMC using armbian-config. However, when it asked to power off, I refused and quit. Actually, during the last step of the copy, armbian-config automatically writes its own version of u-boot onto eMMC. Thus, even if we have changed the bootloader on mmcblk1 (eMMC) beforehand, it is overwritten during the copy to eMMC.
- After copying the OS onto eMMC, but still running from the SD card, I changed the bootloader on mmcblk1 (eMMC).
- I shut down the system, removed the SD card, and booted from eMMC (still without any issue).
- I switched to Linux 6.6.8 and rebooted. Once more, the system started without trouble.
- I finally ran sbc-bench.

Hi @TDCroPower, thank you very much for your post, which helped me finally achieve a clean install of Linux 6.6.8 on eMMC after hours of frustration. To guide people without confusion about the result, you may want to reorder the steps and put the bootloader update (6.) after copying from SD card to eMMC (10).
TDCroPower Posted April 4, 2024

@Trillien unfortunately, I can no longer edit the instructions in my post; otherwise I would optimize them further with the new tips.
ebin-dev (Author) Posted April 4, 2024

1 hour ago, Trillien said: Actually, during its last step of the copy, armbian-config automatically writes its version of u-boot onto emmc. Thus, even if we've changed the bootloader on mmcblk1 (emmc), it's then overwritten during the copy to emmc.

This is all explained in detail in the message linked next to the recommended u-boot version.
TDCroPower Posted April 8, 2024

Now that OMV is finally installable on the Bookworm image from @ebin-dev (hopefully a proper fix will come from the OMV devs soon), I would like to share the changes I have made to the system for the Helios64. Here are my optimizations...

CPU Frequency:
Since only 2 of the 6 cores of the Helios64 can reach the max of "1800000", the general max setting must unfortunately be set to a value that all 6 cores can reach.

CPU 1: Microarchitecture: Cortex-A53, Max Frequency: 1.416 GHz, Cores: 4 cores, Features: NEON,SHA1,SHA2,AES,CRC32
CPU 2: Microarchitecture: Cortex-A72, Max Frequency: 1.800 GHz, Cores: 2 cores, Features: NEON,SHA1,SHA2,AES,CRC32

Theoretically, all 6 cores can handle a max of "1416000", but unfortunately there have also been frequent problems with this, so a max value of "1200000" is recommended for a stable system.

root@helios64:~# cat /etc/default/cpufrequtils
ENABLE="true"
GOVERNOR="ondemand"
MAX_SPEED="1200000"
MIN_SPEED="408000"

root bashrc:
In order to use the bashrc with the root user, the following .profile file must be created in the /root directory with the following content...

root@helios64:~# cat .profile
if [ "$BASH" ]; then
  if [ -f ~/.bashrc ]; then
    . ~/.bashrc
  fi
fi
mesg n || true

Docker Fix:
To fix the problem, the line with ledtrig-netdev must be commented out so that containers can be started...

root@helios64:~# cat /etc/modules-load.d/modules.conf
lm75
#ledtrig-netdev

Fancontrol:
Here I use the following optimized Fancontrol settings from a user from here...

root@helios64:~# cat /etc/fancontrol
# Helios64 PWM Fan Control Configuration
# Temp source : /dev/thermal-cpu
INTERVAL=10
FCTEMPS=/dev/fan-p6/pwm1=/dev/thermal-cpu/temp1_input /dev/fan-p7/pwm1=/dev/thermal-cpu/temp1_input
MINTEMP=/dev/fan-p6/pwm1=45 /dev/fan-p7/pwm1=45
MAXTEMP=/dev/fan-p6/pwm1=110 /dev/fan-p7/pwm1=110
MINSTART=/dev/fan-p6/pwm1=25 /dev/fan-p7/pwm1=25
MINSTOP=/dev/fan-p6/pwm1=20 /dev/fan-p7/pwm1=20
#MINPWM=20
MINPWM=/dev/fan-p6/pwm1=20 /dev/fan-p7/pwm1=20
MAXPWM=/dev/fan-p6/pwm1=120 /dev/fan-p7/pwm1=120

After the change, restart the fancontrol.service with...

root@helios64:~# systemctl restart fancontrol.service

LED State:
If the flashing of the blue status LEDs of the Helios64 bothers you, you can simply stop the service, then it's quiet...

root@helios64:~# systemctl stop armbian-led-state.service

Alternatively, you can just stop the status LED from flashing the heartbeat by changing the trigger value from heartbeat to none...

root@helios64:~# cat /etc/armbian-leds.conf
[/sys/class/leds/helios64::status]
trigger=none
brightness=0
invert=0

Do you have other optimizations for the system, or different values from mine? Please share them here 😉
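To get a feel for what the fancontrol values above mean, here is a rough shell sketch of the fan curve. It is an approximation and not the fancontrol script itself: it assumes the PWM value is interpolated linearly from MINSTOP at MINTEMP up to MAXPWM at MAXTEMP, with MINPWM used at or below MINTEMP. The function name fan_pwm is hypothetical.

```shell
#!/bin/sh
# Approximate PWM value (0-255) for a temperature in degrees C.
# Assumption: linear interpolation between MINSTOP (at MINTEMP) and
# MAXPWM (at MAXTEMP); this mirrors the config keys, not the real script.
fan_pwm() {
    temp=$1
    mintemp=$2
    maxtemp=$3
    minstop=$4
    minpwm=$5
    maxpwm=$6
    if [ "$temp" -le "$mintemp" ]; then
        echo "$minpwm"
    elif [ "$temp" -ge "$maxtemp" ]; then
        echo "$maxpwm"
    else
        # Integer arithmetic, rounded down.
        echo $(( (temp - mintemp) * (maxpwm - minstop) / (maxtemp - mintemp) + minstop ))
    fi
}
```

With the values from the config above, `fan_pwm 70 45 110 20 20 120` gives 58 under this assumption, so with MAXTEMP=110 and MAXPWM=120 the fans would stay well below half of the 0-255 PWM range at typical temperatures.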
ebin-dev (Author) Posted April 10, 2024

I am using the standard settings - no need to reduce the frequency of the fast cores. Anything related to OMV should go to another thread!

# cat /etc/default/cpufrequtils
ENABLE=true
MIN_SPEED=408000
MAX_SPEED=1800000
GOVERNOR=ondemand
TDCroPower Posted April 10, 2024

@ebin-dev We have already created a separate thread for OMV and described the topic there... You haven't had any problems with the max frequency settings so far? That would be really good, of course, as there were problems with Bullseye when using higher frequencies.
BipBip1981 Posted April 11, 2024

Hello,

Try: min 1200 MHz, max 1200 MHz, governor performance.

These are the only stable settings for me since moving to the 6.6.x kernel with Armbian Bookworm.
TDCroPower Posted April 12, 2024

@BipBip1981 you have set min and max to 1200 MHz? Doesn't this mean considerable power consumption, as the processors are permanently set to 1200 MHz?
BipBip1981 Posted April 12, 2024 (edited)

Hello,

Yes, I set a fixed frequency of 1200 MHz. Power consumption is not a problem for my use case, because I start the Helios64 when I need it and shut it down when I don't.

My big problem is not instability over long periods but instability under intense use. For example, 400-1400 MHz is OK if the Helios64 does nothing, but under heavy network or disk I/O it crashes or freezes.

With 5.1x kernels, the Helios64 is stable at 400-1400 MHz schedutil.
With the first 6.x kernels, the Helios64 is stable at 400-1400 MHz schedutil.
With kernel 6.6 and later, my stability test pattern crashes with the settings above and does not crash with a fixed frequency of 1200 MHz. At 1400 MHz or more, the Helios64 crashes under my test pattern.

My configuration is: 4 x 8 TB RAID 10 -> LVM -> LUKS -> LVM -> BTRFS.

To test stability, I scrub the RAID and at the same time run a BTRFS check with checksum verification, which causes a lot of I/O and frequency changes.

Since I moved to kernel 6.6.x, this test pattern always fails. With 5.1x kernels at 400-1400 MHz schedutil, with the first 6.x kernels at 400-1400 MHz schedutil, and at a fixed 1200 MHz, it always passes; at 400-1200 MHz it always crashes.

If you want a stable, always-on setup, I advise you to keep a 5.1x kernel with 400-1400 MHz. That is not my choice, because I prefer to follow the most recent kernel, and I believe (or dream) that a developer will find the problem with the Helios64 Rockchip and fix it one day.

Edited April 14, 2024 by BipBip1981
Trillien Posted April 15, 2024 (edited)

I was curious about how cpufrequtils can manage different policies over several CPUs. Actually, it can only dictate one set of values for all 6 CPUs. For info, I've just found a post about a change to cpufrequtils that lets it manage different frequency sets across the CPUs (4x CPUs at 400-1400 MHz and 2x CPUs at 400-1800 MHz).

#To get individual cpus current frequencies
$ cat /sys/devices/system/cpu/cpu*/cpufreq/cpuinfo_cur_freq
#To get individual cpus governor (note there is no one governor for one cpu)
$ cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
#To collect existing frequency policies
$ ls -al /sys/devices/system/cpu/cpufreq/

I'm now testing the script...

Edit: Not conclusive! It reboots with the "400-1400MHz schedutil" and "400-1800MHz conservative" governors.

Edited April 15, 2024 by Trillien
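For anyone who wants to experiment with per-cluster limits without patching cpufrequtils, the sysfs policy directories listed above can be written to directly. The following is only a minimal sketch under assumptions: the helper name set_policy is hypothetical, it must run as root, and it assumes the four little cores are grouped under policy0 and the two big cores under policy4 (check with the ls command above first). It is not the script from the linked post.

```shell
#!/bin/sh
# Base directory of the cpufreq policies; overridable for dry runs.
CPUFREQ_ROOT=${CPUFREQ_ROOT:-/sys/devices/system/cpu/cpufreq}

# Apply a governor and min/max limits (in kHz) to one cpufreq policy.
# Usage: set_policy <policy-dir-name> <min_khz> <max_khz> <governor>
set_policy() {
    policy=$1
    min_khz=$2
    max_khz=$3
    governor=$4
    dir="$CPUFREQ_ROOT/$policy"
    [ -d "$dir" ] || { echo "no such policy: $policy" >&2; return 1; }
    echo "$governor" > "$dir/scaling_governor" &&
    echo "$max_khz" > "$dir/scaling_max_freq" &&
    echo "$min_khz" > "$dir/scaling_min_freq"
}
```

A possible use, with the frequency values quoted earlier in the thread, would be `set_policy policy0 408000 1416000 schedutil` followed by `set_policy policy4 408000 1800000 schedutil`. These settings do not survive a reboot, and as noted above, such a split did not prove stable here.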
BipBip1981 Posted April 16, 2024

Hi everybody,

I am testing a self-built Armbian Bookworm 24.05 trunk image with kernel 6.6.27 and... for the moment, I have a stable Helios64 with 400-1800 MHz schedutil and just my tuned fancontrol file.

It has been running my test pattern for more than a day, with no crash or freeze until now!!!!!!

I hope it does not crash during the next few days... If it doesn't, I advise everybody to switch to this kernel.

Keep in touch
TDCroPower Posted April 16, 2024

@BipBip1981 very good to hear that a higher version also runs stable. Did the Python test work for you without errors?

for i in $(seq 1 100);do python3 -c "import pkg_resources" || break;done
ebin-dev (Author) Posted April 17, 2024 (edited)

16 hours ago, BipBip1981 said: I test 24.05 trunk build myself armbian bookworm with 6.6.27 Kernel

A first test with Linux 6.6.27 (downloaded from beta.armbian.com) was successful on my system too, even with the modifications (hs400 speed and cache awareness).

(Here is a link to the linux-6.6.27 debs downloaded today; I also attached modified dtbs for some kernel versions.)

Attachment: dtbs.zip

Edited April 17, 2024 by ebin-dev
BipBip1981 Posted April 17, 2024

Hi,

My test pattern is running, and I ran:

helios64@helios64:~$ for i in $(seq 1 100);do python3 -c "import pkg_resources" || break;done
helios64@helios64:~$

No crash on my side 😉

Until today, I had never had my Helios64 stable at 400-1800 MHz with my test pattern and podman containers running.

I think I am dreaming... yes? no?

A stable Helios64 is AMAZING!!!!!
ebin-dev (Author) Posted April 17, 2024

Linux 6.6.27 is not stable on my system - with or without the modifications (hs400 speed and cache awareness). I will switch back to 6.6.8 now.

# kernel oops
2024-04-17T20:45:02.182350+02:00 helios64 kernel: [20582.266022] Unable to handle kernel paging request at virtual address ffff800481075118
2024-04-17T20:45:02.182405+02:00 helios64 kernel: [20582.266042] Mem abort info:
2024-04-17T20:45:02.182409+02:00 helios64 kernel: [20582.266047] ESR = 0x0000000086000005
2024-04-17T20:45:02.182413+02:00 helios64 kernel: [20582.266053] EC = 0x21: IABT (current EL), IL = 32 bits
2024-04-17T20:45:02.182416+02:00 helios64 kernel: [20582.266062] SET = 0, FnV = 0
2024-04-17T20:45:02.182421+02:00 helios64 kernel: [20582.266068] EA = 0, S1PTW = 0
2024-04-17T20:45:02.182424+02:00 helios64 kernel: [20582.266073] FSC = 0x05: level 1 translation fault
2024-04-17T20:45:02.182428+02:00 helios64 kernel: [20582.266080] swapper pgtable: 4k pages, 48-bit VAs, pgdp=00000000037b7000
2024-04-17T20:45:02.182432+02:00 helios64 kernel: [20582.266088] [ffff800481075118] pgd=10000000f7fff003, p4d=10000000f7fff003, pud=0000000000000000
2024-04-17T20:45:02.182436+02:00 helios64 kernel: [20582.266111] Internal error: Oops: 0000000086000005 [#1] PREEMPT SMP
2024-04-17T20:45:02.182440+02:00 helios64 kernel: [20582.266686] Modules linked in: xt_comment xt_tcpudp nft_compat nf_tables nfnetlink eq3_char_loop(O) rpi_rf_mod_led(O) dummy_rx8130(O) hb_rf_eth(O) generic_raw_uart(O) sunrpc lz4hc lz4 zram binfmt_misc cp210x us>
2024-04-17T20:45:02.182445+02:00 helios64 kernel: [20582.273976] CPU: 5 PID: 13764 Comm: kworker/5:2 Tainted: G C O 6.6.27-current-rockchip64 #2
2024-04-17T20:45:02.182450+02:00 helios64 kernel: [20582.274829] Hardware name: Helios64 (DT)
2024-04-17T20:45:02.182453+02:00 helios64 kernel: [20582.275191] Workqueue: events dbs_work_handler
2024-04-17T20:45:02.182457+02:00 helios64 kernel: [20582.275626] pstate: 40000005 (nZcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
2024-04-17T20:45:02.182461+02:00 helios64 kernel: [20582.276247] pc : 0xffff800481075118
2024-04-17T20:45:02.182464+02:00 helios64 kernel: [20582.276567] lr : 0xffff800481075118
2024-04-17T20:45:02.182467+02:00 helios64 kernel: [20582.276882] sp : ffff800088b23530
2024-04-17T20:45:02.182472+02:00 helios64 kernel: [20582.277182] x29: ffff800088b23530 x28: ffff800081c4d090 x27: ffff0000040a8080
2024-04-17T20:45:02.182499+02:00 helios64 kernel: [20582.277830] x26: 000000003b9aca00 x25: ffff0000040a84f8 x24: 0000000000000000
2024-04-17T20:45:02.182503+02:00 helios64 kernel: [20582.278474] x23: ffff800081a47000 x22: ffff8000816f8008 x21: ffff800088b23578
2024-04-17T20:45:02.182507+02:00 helios64 kernel: [20582.279118] x20: 00000000000000fa x19: 00000001004d6073 x18: 0000000000000000
2024-04-17T20:45:02.182510+02:00 helios64 kernel: [20582.279762] x17: 000000040044ffff x16: 00100074b5503510 x15: 0000000000000000
2024-04-17T20:45:02.182513+02:00 helios64 kernel: [20582.280404] x14: 00000000000000c9 x13: 0000000000000002 x12: 0000000000000000
2024-04-17T20:45:02.182516+02:00 helios64 kernel: [20582.281048] x11: 0000000000000400 x10: 0000000000000a90 x9 : ffff800088b23450
2024-04-17T20:45:02.182520+02:00 helios64 kernel: [20582.281691] x8 : ffff000006da46f0 x7 : 0000000000000000 x6 : 0000000000000070
2024-04-17T20:45:02.182523+02:00 helios64 kernel: [20582.282333] x5 : 00000000410fd080 x4 : 0000000000f0000f x3 : 0000000000000002
2024-04-17T20:45:02.182527+02:00 helios64 kernel: [20582.282976] x2 : 0000000000000000 x1 : ffff000006da3c00 x0 : 00000000124a8be7
2024-04-17T20:45:02.182530+02:00 helios64 kernel: [20582.283621] Call trace:
2024-04-17T20:45:02.182533+02:00 helios64 kernel: [20582.283846] 0xffff800481075118
2024-04-17T20:45:02.182536+02:00 helios64 kernel: [20582.284136] rk3x_i2c_xfer_common.isra.0+0x32c/0x438
2024-04-17T20:45:02.182539+02:00 helios64 kernel: [20582.284589] rk3x_i2c_xfer+0x18/0x24
2024-04-17T20:45:02.182543+02:00 helios64 kernel: [20582.284917] __i2c_transfer+0x1ec/0x80c
2024-04-17T20:45:02.182546+02:00 helios64 kernel: [20582.285269] i2c_transfer+0x60/0x128
2024-04-17T20:45:02.182549+02:00 helios64 kernel: [20582.285598] i2c_transfer_buffer_flags+0x5c/0x90
2024-04-17T20:45:02.182552+02:00 helios64 kernel: [20582.286017] regmap_i2c_write+0x20/0x58
2024-04-17T20:45:02.182556+02:00 helios64 kernel: [20582.286370] _regmap_raw_write_impl+0x7ac/0x8f8
2024-04-17T20:45:02.182559+02:00 helios64 kernel: [20582.286778] _regmap_bus_raw_write+0x60/0x7c
2024-04-17T20:45:02.182563+02:00 helios64 kernel: [20582.287163] _regmap_write+0x60/0x188
2024-04-17T20:45:02.182567+02:00 helios64 kernel: [20582.287496] _regmap_update_bits+0x114/0x134
2024-04-17T20:45:02.182570+02:00 helios64 kernel: [20582.287881] regmap_update_bits_base+0x64/0x98
2024-04-17T20:45:02.182573+02:00 helios64 kernel: [20582.288282] regulator_set_voltage_sel_regmap+0x50/0x9c
2024-04-17T20:45:02.182577+02:00 helios64 kernel: [20582.288759] _regulator_call_set_voltage_sel+0x74/0xc8
2024-04-17T20:45:02.182580+02:00 helios64 kernel: [20582.289224] _regulator_do_set_voltage+0x47c/0x58c
2024-04-17T20:45:02.182584+02:00 helios64 kernel: [20582.289660] regulator_set_voltage_rdev+0x64/0x258
2024-04-17T20:45:02.182586+02:00 helios64 kernel: [20582.290096] regulator_do_balance_voltage+0x1f4/0x428
2024-04-17T20:45:02.182590+02:00 helios64 kernel: [20582.290553] regulator_balance_voltage+0x50/0x9c
2024-04-17T20:45:02.182593+02:00 helios64 kernel: [20582.290974] regulator_set_voltage_unlocked+0xa8/0x12c
2024-04-17T20:45:02.182597+02:00 helios64 kernel: [20582.291438] regulator_set_voltage+0x50/0x98
2024-04-17T20:45:02.182600+02:00 helios64 kernel: [20582.291828] _opp_config_regulator_single+0x50/0x19c
2024-04-17T20:45:02.182603+02:00 helios64 kernel: [20582.292281] _set_opp+0xd8/0x4bc
2024-04-17T20:45:02.182607+02:00 helios64 kernel: [20582.292577] dev_pm_opp_set_rate+0x18c/0x280
2024-04-17T20:45:02.182610+02:00 helios64 kernel: [20582.292963] set_target+0x30/0x3c [cpufreq_dt]
2024-04-17T20:45:02.182613+02:00 helios64 kernel: [20582.293377] __cpufreq_driver_target+0x1d0/0x344
2024-04-17T20:45:02.182616+02:00 helios64 kernel: [20582.293797] od_dbs_update+0xbc/0x1ac
2024-04-17T20:45:02.182620+02:00 helios64 kernel: [20582.294133] dbs_work_handler+0x40/0x7c
2024-04-17T20:45:02.182623+02:00 helios64 kernel: [20582.294486] process_one_work+0x160/0x3a8
2024-04-17T20:45:02.182626+02:00 helios64 kernel: [20582.294854] worker_thread+0x32c/0x438
2024-04-17T20:45:02.182630+02:00 helios64 kernel: [20582.295197] kthread+0x114/0x118
2024-04-17T20:45:02.182633+02:00 helios64 kernel: [20582.295494] ret_from_fork+0x10/0x20
2024-04-17T20:45:02.182636+02:00 helios64 kernel: [20582.295835] Code: ???????? ???????? ???????? ???????? (????????)
2024-04-17T20:45:02.182640+02:00 helios64 kernel: [20582.296383] ---[ end trace 0000000000000000 ]---
prahal Posted April 18, 2024

@ebin-dev I discussed this on IRC in #u-boot, and I believe one of the board designers told me that there could be an issue with the regulator hardware design (CPU big). After looking at the schematics (available in the wiki, in the documents section of the left pane), he suggested that I raise the voltage to the maximum to see whether it fixed my crashes, and this indeed fixed lots of them.

That is, I first tried every opp-table-1 (i.e. cpu-b) entry at 1.2 V, then I tried voltages closer to the vanilla rk3399 ones. In the end I was able to run the cpufreq switching test I gave you 100 times without a crash after raising all the OPP voltages for cpu-b by 75 mV. Each of the OPPs runs mostly stable with only 50 mV, but I still had crashes. So 75 mV up looks fine. I still get a crash around once a day, but not with my cpufreq test case as far as I know. I am now on the ondemand cpufreq governor with frequencies from 408 MHz to 1.8 GHz.

Still, I would really like to be able to reproduce the crashes I still get. They might come from the GPU OPP voltages, as the GPU has the same hardware design as the CPU big cluster. Or from something else. But I doubt the kernel is involved, except that a given kernel version might stress the board less.

For one, I added a big delay between cpu-b frequency switches and still had crashes, so I doubt the speed has anything to do with it. And in my test I tried with all cpu-b OPP voltages to 1.2V except even the lowest one and was still able to get random crashes with my cpufreq test case, so I doubt this has anything to do with high frequencies. Only that 1.6 GHz was the most sensitive to running without the 75 mV increase over the upstream rk3399 OPP voltage values, and 408/600 MHz were the least likely to crash but still crashed from time to time.

I don't know whether you know how to redefine the OPP voltage values for cpu-b. I will try to post my patch asap (currently on my phone).
prahal Posted April 18, 2024 (edited)

On 4/12/2024 at 3:02 PM, BipBip1981 said: To test stabilty i do: - Scrubbing RAID and at same time i do a BTRFS with Checksum check, that do a lot of I/O and Frequency change... Since i pass to Kernel 6.6.X, this pattern to test fail always.

Can you provide the exact commands you run to get the crash? Also, for most of the instability (big CPU cluster), see my comment above, that is, raise the opp-table-1 voltages by 75 mV. Otherwise, I will post the DTS block that raises the opp-table-1 voltages by 75 mV in a few days at most, I hope.

Edited April 18, 2024 by prahal (typo)
BipBip1981 Posted April 18, 2024 (edited)

Hi,

My commands:

To scrub the RAID 10:
echo check > /sys/block/md0/md/sync_action

At the same time I check BTRFS with:
btrfs check --readonly --check-data-csum --progress /dev/disk/by-uuid/1d4e2c84-1c43-4d73-8acb-XXXXXXXXXXXXX

And at the same time I copy/delete/copy random files from a personal computer to the Helios64 Samba share.

One pass finishes after about 36 h... I run the pass 3 times.

For information, I use LUKS over my RAID 10.

My fancontrol file is:

root@helios64:~# cat /etc/fancontrol
# Helios64 PWM Fan Control Configuration
# Temp source : /dev/thermal-cpu
#INTERVAL=10
INTERVAL=30
FCTEMPS=/dev/fan-p6/pwm1=/dev/thermal-cpu/temp1_input /dev/fan-p7/pwm1=/dev/thermal-cpu/temp1_input
MINTEMP=/dev/fan-p6/pwm1=40 /dev/fan-p7/pwm1=40
#MAXTEMP=/dev/fan-p6/pwm1=110 /dev/fan-p7/pwm1=110
MAXTEMP=/dev/fan-p6/pwm1=50 /dev/fan-p7/pwm1=50
#MINSTART=/dev/fan-p6/pwm1=60 /dev/fan-p7/pwm1=60
MINSTART=/dev/fan-p6/pwm1=20 /dev/fan-p7/pwm1=20
#MINSTOP=/dev/fan-p6/pwm1=40 /dev/fan-p7/pwm1=40
MINSTOP=/dev/fan-p6/pwm1=20 /dev/fan-p7/pwm1=20
MINPWM=20

I also installed the full firmware deb package to remove the error from the 2.5G Ethernet driver.

I extended swap with a 6 GB swapfile, and I run 2 containers with podman: Plex and SFTPGo.

Edited April 18, 2024 by BipBip1981
BipBip1981 Posted April 18, 2024 (edited)

....Grrrr

I rebooted... and now I lose the network connection at a random time (approximately 6-10 min) after full boot... and I lose access to the console over the USB cable... Back in the unstable world... I am going CRAZY!

Back to 400-1200 MHz max schedutil; if I lose the connection again... back to 1200/1200 performance (equivalent to a fixed frequency of 1200 MHz).

Edited April 18, 2024 by BipBip1981
ebin-dev (Author) Posted April 18, 2024

4 hours ago, prahal said: redefine the opp voltage values for cpu-b

Do I understand you correctly that you suggest increasing all opp-microvolt values in opp-table-1 by 75 millivolts?

This would, e.g., change opp-microvolt = <0xc96a8 0xc96a8 0x1312d0>; to opp-microvolt = <0xdbba0 0xdbba0 0x1437c8>;

# here are my current values
opp-table-1 {
	compatible = "operating-points-v2";
	opp-shared;
	phandle = <0x0d>;

	opp00 {
		opp-hz = <0x00 0x18519600>;
		opp-microvolt = <0xc96a8 0xc96a8 0x1312d0>;
		clock-latency-ns = <0x9c40>;
	};

	opp01 {
		opp-hz = <0x00 0x23c34600>;
		opp-microvolt = <0xc96a8 0xc96a8 0x1312d0>;
	};

	opp02 {
		opp-hz = <0x00 0x30a32c00>;
		opp-microvolt = <0xc96a8 0xc96a8 0x1312d0>;
	};

	opp03 {
		opp-hz = <0x00 0x3c14dc00>;
		opp-microvolt = <0xd59f8 0xd59f8 0x1312d0>;
	};

	opp04 {
		opp-hz = <0x00 0x47868c00>;
		opp-microvolt = <0xe7ef0 0xe7ef0 0x1312d0>;
	};

	opp05 {
		opp-hz = <0x00 0x54667200>;
		opp-microvolt = <0xfa3e8 0xfa3e8 0x1312d0>;
	};

	opp06 {
		opp-hz = <0x00 0x5fd82200>;
		opp-microvolt = <0x10c8e0 0x10c8e0 0x1312d0>;
	};

	opp07 {
		opp-hz = <0x00 0x6b49d200>;
		opp-microvolt = <0x124f80 0x124f80 0x1312d0>;
	};
};
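The hex arithmetic behind such a change is easy to get wrong by hand, so here is a small shell sketch that does it. The helper names raise_uv and raise_triplet are hypothetical, and whether the third (maximum) value of each triplet should be raised as well is exactly the open question above; the sketch simply adds the offset to every value it is given.

```shell
#!/bin/sh
# Add an offset in millivolts to one opp-microvolt value given in hex.
# Usage: raise_uv <hex_microvolt> <offset_mv>
raise_uv() {
    printf '0x%x\n' "$(( $1 + $2 * 1000 ))"
}

# Raise every value of an opp-microvolt triplet and print a DTS line.
# Usage: raise_triplet <offset_mv> <target> <min> <max>
raise_triplet() {
    offset_mv=$1
    shift
    out=""
    for value in "$@"; do
        out="$out${out:+ }$(raise_uv "$value" "$offset_mv")"
    done
    echo "opp-microvolt = <$out>;"
}
```

For example, `raise_triplet 75 0xc96a8 0xc96a8 0x1312d0` prints `opp-microvolt = <0xdbba0 0xdbba0 0x1437c8>;`, matching the values quoted above (825000 µV becomes 900000 µV). Note that for opp07 the target 0x124f80 (1200000 µV) plus 75 mV is 1275000 µV, which is above the current maximum of 0x1312d0 (1250000 µV).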