Werner Posted October 26, 2020 Posted October 26, 2020 2 hours ago, flower said: if kobol would sit in germany i would send it back. Before you throw the thing in the trash, I'll take it off your hands. Always good to have some additional boards for testing.
gprovost Posted October 27, 2020 Author Posted October 27, 2020 11 hours ago, flower said: this unit was sold as a "high quality nas". i was expecting a little tweaking and some flaws but not those instabilities. they are just unacceptable for a nas. a nas is about data integrity! You're right. No excuse on our side: we are behind schedule and not up to expectations on software maturity. Maybe we should have stuck with LK4.4 (from Rockchip) and forgotten about Linux mainline for now :-/ However, we are still working on improving stability and we are optimistic that it will get better very soon. Right now, as you know, LK5.8.16 is for some reason (which we still can't figure out) unstable compared to 5.8.14. We also realized that the OMV install was removing some TX offloading tweaks. So there are a lot of little things here and there that we only discover along the way. BTW, regarding this crash, are you sure TX offload on eth1 was disabled?
gprovost Posted October 27, 2020 Author Posted October 27, 2020 11 hours ago, flower said: but: no official communication - and even your install page is broken since weeks - thats just too much. What page is broken ??
flower Posted October 27, 2020 Posted October 27, 2020 2 hours ago, gprovost said: What page is broken ?? the original install instructions. yesterday morning it still was broken - now it is gone. i wish for a little bit more communication from your end. i still hope it will work out - and i still really like it. i am just frustrated. regarding your question about tx offloading: i don't use omv and i have a script which disables it on every start. but... something is wrong with networking on eth1 anyway. try syncing ~1tb with nextcloud. it won't work: even if the unit doesn't crash, the nc client will disconnect multiple times and (even worse) you have to auth again (which is not typical for a disconnect). my ssh connections are stable for hours though. not sure what the problem is there. but.. does that imply that it is stable as long as you don't use 2.5GbE? that would be a good example where better communication would have helped.
gprovost Posted October 27, 2020 Author Posted October 27, 2020 13 minutes ago, flower said: but... something is wrong with networking on eth1 anyway. try syncing ~1tb with nextcloud. it wont work: even if the unit doesnt crash the nc client will disconnect multiple times and (even worse) you have to auth again (which is not typical for a disconnect) Have you tuned your nextcloud install for a 4GB RAM system ? 15 minutes ago, flower said: but.. does that imply that it is stable as long as you dont use 2.5GBe? that would be a good example where better communication would have helped. No we are not saying that for now. 18 minutes ago, flower said: the original install instructions. yesterday morning it still was - now it is gone. We split 2 weeks ago the install page into 4 sub pages in order to add the eMMC install instruction. BTW we just added a new section called Recovery, where we explained how to use Maskrom mode.
flower Posted October 27, 2020 Posted October 27, 2020 4 minutes ago, gprovost said: Have you tuned your nextcloud install for a 4GB RAM system ? Yes, all docker container and system together is around 1gb ram usage. It uses zram after a while but that seems unrelated to that sync error. I never saw my ram completely filled. I dont use armbian nextcloud. I tweaked the mysql, redis, alpine variant which now runs stable on my old pc. Emby takes much ram there, but that container wasnt enabled on helios64.
gprovost Posted October 27, 2020 Author Posted October 27, 2020 1 minute ago, flower said: Yes, all docker container and system together is around 1gb ram usage. So you have tuned php-fpm and mariadb? I haven't done extensive testing of NC on Helios64, but I know that on Helios4 (with only 2GB RAM) I had to tune NC properly, otherwise the system would hang (and reset because of the watchdog). Not only because of OOM, but also because too many threads / child processes were spawned, overloading the system. I never tested NC in a container though, so maybe that doesn't apply to your use case. ( Just for reference : https://docs.nextcloud.com/server/19/admin_manual/installation/server_tuning.html )
flower Posted October 27, 2020 Posted October 27, 2020 gprovost said: So you have tuned php-fpm, mariadb ? I haven't done extensive test of NC on Helios64, but I know that on Helios4 (with only 2GB RAM) i had to tune properly NC otherwise system will hangs (and reset because of watchdog). Not only because of OOM but also just too many thread / child process spawned overloading the system. I never tested NC in container though so maybe that doesn't apply to your use case. ( Just for reference : https://docs.nextcloud.com/server/19/admin_manual/installation/server_tuning.html ) Yes, it is tuned. Proxy buffers are very low (reverse proxy and fastcgi), and mariadb and php-fpm are tuned too. I never saw it above 1 GB RAM, except for Linux file caches, which touch zram swap after a while but never filled it. But what is your point about many processes? I do have many. Most of them are sleeping though. Do you think that's a problem?
gprovost Posted October 27, 2020 Author Posted October 27, 2020 3 minutes ago, flower said: But what is your point about many processes? I do have many. Most of them are sleeping though. Do you think thats a problem? My point is just to collect as much info as possible on use cases that generate crash in order to prioritize focus.
flower Posted October 27, 2020 Posted October 27, 2020 gprovost said: My point is just to collect as much info as possible on use cases that generate crash in order to prioritize focus. This afternoon (e.g. around 7h) I will put some smaller old disks in the helios64 to keep an eye on the progress. I can start those containers there again. If you want me to provide any additional info, just tell me. As it will sit there with test data only, I can give you root access too in case you are interested.
fpabernard Posted October 27, 2020 Posted October 27, 2020 Just to say that I set up my Helios64 with LK5.8.16, it's up for 2 days, it copied 12TB of data over eth1 at 100Mbits/s from another OMV server (using rsync). So till now, I do not see the point of freezing kernel updates ... Frédéric
lyuyhn Posted October 28, 2020 Posted October 28, 2020 Did someone try to use the wakeonlan feature of the Helios64? I tried to set g mode on eth0 using ethtool, but nothing happens when I try to send a magic packet. Is there a trick to make this work?
gprovost Posted October 29, 2020 Author Posted October 29, 2020 8 hours ago, lyuyhn said: Did someone try to use the wakeonlan feature of the Helios64? I tried to set g mode on eth0 using ethtool, but nothing happens when I try to send a magic packet. Is there a trick to make this work? For now WoL is not supported yet. Still in progress because suspend mode doesn't work properly.
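For anyone retrying this once suspend works, here is a minimal sketch of how to check what the NIC driver actually reports after `ethtool -s eth0 wol g`. It only parses the `Wake-on:` line that `ethtool <iface>` prints; the helper names `wol_mode` and `wol_magic_enabled` are made up for illustration, and seeing `g` there does not by itself mean the board will wake up.

```shell
#!/bin/sh
# Extract the current "Wake-on:" value from `ethtool <iface>` output on stdin.
# The "Supports Wake-on:" line is skipped because its first field is "Supports".
wol_mode() {
    awk '$1 == "Wake-on:" { print $2; exit }'
}

# Hypothetical helper: succeed only if magic-packet wake (g) is currently enabled.
wol_magic_enabled() {
    case "$(wol_mode)" in
        *g*) return 0 ;;
        *)   return 1 ;;
    esac
}
```

Typical use would be `ethtool eth0 | wol_mode`, run after the `ethtool -s` call, to see whether the setting was accepted.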
Dmitry Antipov Posted October 29, 2020 Posted October 29, 2020 On 10/6/2020 at 10:25 AM, ebin-dev said: I can confirm the issue with the USB-C cable. It does not fit into the USB-C port (the reason may just be the additional layer introduced by the label around the ports). The issue can be resolved by cutting away about 0.5 mm of the plastic around the plug at the end of the USB cable. It will then easily fit into the port. The same problem, fixed with a small file to sharpen the USB-C hole in the metal plate so the plastic part of the plug can be inserted through the plate closer to the port :-).
fromport Posted October 29, 2020 Posted October 29, 2020 Have been testing my helios64 with 5x12TB drives in different setups. First omv & snapraid, but a continuing sync command uses so much RAM that the machine becomes so slow it might as well be described as unusable. Next I tried omv & ZFS. I could get the ZFS module compiled, but then it wouldn't load the rest of the utilities because of missing dependencies (buster is really old). So I finally switched to lizardfs. It crashed on me after a few hours, so I hooked up the serial console and caught this error during the night: https://termbin.com/lsow Is it easy to downgrade to 5.8.14 or even a 4.x kernel ?
flower Posted October 29, 2020 Posted October 29, 2020 1 hour ago, fromport said: Have been testing my helios64 with 5x12TB drives in different setups. omv & snapraid, but continuing a sync command uses so much ram that it become so slow that it might as well be described as unusable. Next I tried omv & ZFS Could get ZFS module compiled but then it wouldn't load the rest of utilities because missing dependencies (buster is real old) So finally switched to try lizardfs. Crashed on me after a few hours, hooked up serial console. Caught this error during the night: https://termbin.com/lsow Is it easy to downgrade to 5.8.14 or even 4.x kernel ?
Downgrading and pinning is possible through armbian-config. it is described in the latest kobol blog (that page seems offline for me atm). i used this to downgrade:
apt install \
  linux-dtb-current-rockchip64=20.08.10 \
  linux-headers-current-rockchip64=20.08.10 \
  linux-image-current-rockchip64=20.08.10 \
  armbian-firmware=20.08.10 \
  linux-buster-root-current-helios64=20.08.10 \
  linux-u-boot-helios64-current=20.08.10
(but be careful... the next update would upgrade them again)
afaik there is no way to go back to a 4.4 kernel. going from 4.4 to 5.8 is possible through armbian-config, but it didn't work for me the last time i tried
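One way to keep the next update from pulling the newer packages back in is `apt-mark hold`. The sketch below only prints the commands (a dry run) so they can be reviewed before running them as root. The package names and version come from the post above; the function name `print_pin_commands` is made up, and whether holding exactly these six packages matches what the armbian-config freeze option does is an assumption.

```shell
#!/bin/sh
# Version and package names as listed in the post above.
VERSION="20.08.10"
PACKAGES="linux-dtb-current-rockchip64 linux-headers-current-rockchip64 linux-image-current-rockchip64 armbian-firmware linux-buster-root-current-helios64 linux-u-boot-helios64-current"

# Print the downgrade command and the matching hold command without running them.
print_pin_commands() {
    specs=""
    for pkg in $PACKAGES; do
        specs="$specs $pkg=$VERSION"
    done
    echo "apt install$specs"
    echo "apt-mark hold $PACKAGES"
}
```

Calling `print_pin_commands` shows both lines; `apt-mark unhold` with the same package list releases the pin again later.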
fromport Posted October 29, 2020 Posted October 29, 2020 3 hours ago, flower said: Downgrading and pinning is possible through armbian-config. it is described in the latest kobol blog (that page seems offline for me atm). i used this to downgrade: apt install \ linux-dtb-current-rockchip64=20.08.10 \ linux-headers-current-rockchip64=20.08.10 \ linux-image-current-rockchip64=20.08.10 \ armbian-firmware=20.08.10 \ linux-buster-root-current-helios64=20.08.10 \ linux-u-boot-helios64-current=20.08.10 (but be careful... next update would update them again) afaik there is no way to go back to a 4.4 kernel. going from 4.4 to 5.8 is possible through armbian-config - but didn't work for me last time i tried
I was on IRC on #armbian and they suggested using armbian-config to downgrade. Downgraded successfully to 5.8.14. It looked good in the beginning, but as soon as it got some load on it, it crashed. Then I used armbian-config to downgrade to 4.4 and, like you predicted, it didn't go well. The serial console now shows this during bootup:
Quote
U-Boot 2020.07-armbian (Oct 18 2020 - 23:38:26 +0200)
SoC: Rockchip rk3399
Reset cause: POR
DRAM: 3.9 GiB
PMIC: RK808
SF: Detected w25q128 with page size 256 Bytes, erase size 4 KiB, total 16 MiB
MMC: mmc@fe320000: 1, sdhci@fe330000: 0
Loading Environment from MMC... *** Warning - bad CRC, using default environment
In: serial
Out: serial
Err: serial
Model: Helios64
Revision: 1.2 - 4GB non ECC
Net: eth0: ethernet@fe300000
scanning bus for devices...
Hit any key to stop autoboot: 0
Card did not respond to voltage select!
switch to partitions #0, OK
mmc0(part 0) is current device
Scanning mmc 0:1...
Found U-Boot script /boot/boot.scr
3185 bytes read in 19 ms (163.1 KiB/s)
## Executing script at 00500000
Boot script loaded from mmc 0
117 bytes read in 15 ms (6.8 KiB/s)
7302624 bytes read in 716 ms (9.7 MiB/s)
22114312 bytes read in 2118 ms (10 MiB/s)
libfdt fdt_check_header(): FDT_ERR_BADMAGIC
No FDT memory address configured. Please configure
the FDT address via "fdt addr <address>" command.
Aborting!
2698 bytes read in 36 ms (72.3 KiB/s)
Applying kernel provided DT fixup script (rockchip-fixup.scr)
## Executing script at 09000000
## Loading init Ramdisk from Legacy Image at 06000000 ...
Image Name: uInitrd
Image Type: AArch64 Linux RAMDisk Image (gzip compressed)
Data Size: 7302560 Bytes = 7 MiB
Load Address: 00000000
Entry Point: 00000000
Verifying Checksum ... OK
ERROR: Did not find a cmdline Flattened Device Tree
Loading Ramdisk to f57ef000, end f5ee5da0 ... OK
FDT and ATAGS support not compiled in - hanging
### ERROR ### Please RESET the board ###
And that is from booting from the internal eMMC. I will have to find which jumper to install and boot from the SD card, I guess.
Werner Posted October 29, 2020 Posted October 29, 2020 8 minutes ago, fromport said: (but be carefull... next update would update them again) armbian-config can also be used to freeze firmware updates (that is to say, the kernel version)
fromport Posted October 30, 2020 Posted October 30, 2020 8 hours ago, fromport said: I was on IRC on #armbian and they suggested using armbian-config to downgrade. Downgraded succesfully to 5.8.14. Looked good in the beginning but as soon as it got some load on it, it crashed. Then I used armbian-config to downgrade to 4.4 and like you predicted : it didn't go well Serial console now shows during bootup And that is from booting from the internal emmc Will have to find which jumper to install and boot from sdcard I guess
My machine is really making me sweat. I installed armbian-buster-legacy-4.4 on an SD card and managed to boot it (much slower performance than the internal eMMC). I tried to install lizardfs-chunkserver again. In order to do that I am mounting the spinning rust partitions, which I previously formatted with XFS. When trying to mount the partitions RW I get this error:
Quote
XFS Superblock has unknown read-only compatible features (0x4) enabled
It says you can mount in RO mode. I hooked up an external 256GB SSD to the front USB port and started copying the contents of the first HD to the SSD. I was using:
Quote
rsync -av --info=progress2 [source] [dest]
It started copying and suddenly (after a few minutes) the machine stopped responding. This is what I saw on the serial console:
Quote
Armbian 20.08.17 Buster ttyFIQ0
filer1 login: [ 1690.059453] BUG: spinlock lockup suspected on CPU#0, swapper/0/0
[ 1690.063991] lock: 0xffffffc0f7f2b200, .magic: dead4ead, .owner: swapper/4/0, .owner_cpu: 4
[ 1690.070046] BUG: spinlock lockup suspected on CPU#5, rsync/4022
[ 1690.072860] lock: 0xffffffc0f7f2b200, .magic: dead4ead, .owner: swapper/4/0, .owner_cpu: 4
[ 1690.104336] BUG: spinlock lockup suspected on CPU#3, kworker/3:1/3325
[ 1690.109230] lock: 0xffffffc0f7f2b200, .magic: dead4ead, .owner: swapper/4/0, .owner_cpu: 4
[ 1691.144138] BUG: spinlock lockup suspected on CPU#2, kworker/2:2/656
[ 1691.144143] BUG: spinlock lockup suspected on CPU#1, kworker/1:2/164
[ 1691.144152] lock: 0xffffffc0f7f2b200, .magic: dead4ead, .owner: swapper/4/0, .owner_cpu: 4
[ 1691.159662] lock: 0xffffffc0f7f2b200, .magic: dead4ead, .owner: swapper/4/0, .owner_cpu: 4
I have now tried both the legacy and the most up-to-date images. And the only thing that is utterly consistent: it crashes on me all the time, no matter what I do. Could this be a bad hardware version ?
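On the XFS error in the post above: the `(0x4)` is a bit mask of read-only-compatible features the running kernel does not know. A small sketch for decoding it follows. The bit values are taken from the `XFS_SB_FEAT_RO_COMPAT_*` constants in the mainline kernel source, and the function name `decode_xfs_ro_compat` is made up. If the decoding is right, `0x4` is reflink, which the disks were likely formatted with under the newer kernel and which a 4.4 kernel predates, so that would explain why only an RO mount is offered there.

```shell
#!/bin/sh
# Decode an XFS read-only-compatible feature mask into feature names.
# Only the three lowest bits are handled; anything else is ignored in this sketch.
decode_xfs_ro_compat() {
    mask=$(( $1 ))
    names=""
    [ $(( mask & 0x1 )) -ne 0 ] && names="$names finobt"
    [ $(( mask & 0x2 )) -ne 0 ] && names="$names rmapbt"
    [ $(( mask & 0x4 )) -ne 0 ] && names="$names reflink"
    # Strip the leading space left by the concatenation above.
    echo "${names# }"
}
```

For the mask in the error message, `decode_xfs_ro_compat 0x4` prints `reflink`.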
gprovost Posted October 30, 2020 Author Posted October 30, 2020 @fromport This is the first time we see this kind of crash message, but looking online it seems it isn't an unknown event on other rk3399 boards. Could you run the same test that triggers this crash, but first set the governor to performance:
cpufreq-set -c 0 -g performance
cpufreq-set -c 4 -g performance
Then run cpufreq-info to check that the governor has been changed. We are trying to figure out if it's still a DVFS config issue.
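If cpufrequtils is not installed, roughly the same thing can be done through sysfs. This is a minimal sketch, assuming the two clusters show up as `policy0` and `policy4` under `/sys/devices/system/cpu/cpufreq` (the paths used elsewhere in this thread); the function name `set_governor` and the `CPUFREQ_ROOT` override are made up so the function can be tried on a scratch directory first.

```shell
#!/bin/sh
# Root of the cpufreq sysfs tree; overridable for testing on a scratch directory.
CPUFREQ_ROOT="${CPUFREQ_ROOT:-/sys/devices/system/cpu/cpufreq}"

# Write the given governor to every policy directory, then echo back
# what each policy reports so the change can be verified.
set_governor() {
    governor="$1"
    for policy in "$CPUFREQ_ROOT"/policy*; do
        [ -e "$policy/scaling_governor" ] || continue
        echo "$governor" > "$policy/scaling_governor" || return 1
        echo "$(basename "$policy"): $(cat "$policy/scaling_governor")"
    done
}
```

Run as root, `set_governor performance` should print one line per policy; the setting does not survive a reboot, so it would have to be reapplied at boot.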
fromport Posted October 30, 2020 Posted October 30, 2020 (edited) 28 minutes ago, gprovost said: @fromport First time we see this kind of crash message, but looking online it seems it isn't an unknown event on other rk3399 board. Could do the same test that trigger this crash but first set the governor to performance. cpufreq-set -c 0 -g performance cpufreq-set -c 4 -g performance cpufreq-info to check the governor has been change Trying to figure out if it's still a DVFS config issue. 21:58:29 up 24 min, 7 users, load average: 9.06, 6.67, 3.94 [knock wood, but so far so good] 5 rsync in parallel at the moment Edited October 30, 2020 by fromport
fromport Posted October 30, 2020 Posted October 30, 2020 @gprovost Was able to copy all info in parallel to SSD drive. First time this machine felt stable under load! I put those 2 commands in /etc/rc.local ;-) Thank you for restoring my faith in the helios64, i had almost lost it
gprovost Posted October 30, 2020 Author Posted October 30, 2020 @fromport Yeah it seems we haven't found the sweet voltage spot that works for everyone.
usual user Posted October 30, 2020 Posted October 30, 2020 Out of curiosity, with mainline kernel and ondemand governor in place. Apply this: echo 40000 > /sys/devices/system/cpu/cpufreq/policy0/ondemand/sampling_rate echo 465000 > /sys/devices/system/cpu/cpufreq/policy4/ondemand/sampling_rate Does it still crash in your use case? If it does not crash any longer I will explain what is going on.
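A slightly more defensive way to apply the two values above, as a sketch: the `ondemand/sampling_rate` file only exists while that policy is using the ondemand governor, so the write is skipped with a message otherwise. The value is in microseconds according to the kernel cpufreq documentation. The function name `set_sampling_rate` and the `CPUFREQ_ROOT` override are made up for illustration.

```shell
#!/bin/sh
# Root of the cpufreq sysfs tree; overridable for testing on a scratch directory.
CPUFREQ_ROOT="${CPUFREQ_ROOT:-/sys/devices/system/cpu/cpufreq}"

# Set the ondemand sampling rate (microseconds) for one policy, e.g. policy0.
# Returns non-zero if the policy has no ondemand tunables.
set_sampling_rate() {
    policy="$1"
    rate_us="$2"
    target="$CPUFREQ_ROOT/$policy/ondemand/sampling_rate"
    if [ ! -e "$target" ]; then
        echo "$policy: no ondemand tunables (governor not ondemand?)" >&2
        return 1
    fi
    echo "$rate_us" > "$target"
}
```

With the numbers from the post, that would be `set_sampling_rate policy0 40000` and `set_sampling_rate policy4 465000`, run as root.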
eClapton Posted October 30, 2020 Posted October 30, 2020 Hello. My Helios64 arrived two weeks ago, after a loooong trip through the Silk Road all the way down to South Europe. First, I want to give a GREAT THANK YOU to the people at kobol shop support, as they have helped me with my transport/forwarder/carrier nightmare loop: the address is in Chinese, no address specified, contact the sender, won’t give you your parcel, please call again, no address specified....
So. After reading a good slice of the wiki I couldn’t find a comprehensive manual/article about the simplest usage of the front panel. I would be glad if you could point me to such information. The principal source of confusion for me is the comments section of the kernel 4.4/5 versions, where there are some mentions of power on/off problems. The questions I have right now are:
What does each light mean? I’ve read blue means OK and red means problem; on reddit they have pointed out that the System blue light blinking is normal.
How does the power on/off button behave? I’m confused about the PSU/standby/WOL/on states. Can I damage the OS with a long press? Does a short press wake up the system if WOL is not configured?
The reset button: is it equivalent to a software reboot?
What is the procedure to hot-swap drives? Does it need some panel button pressing? Is it done via software?
Thank you for your help.
Dmitry Antipov Posted October 30, 2020 Posted October 30, 2020 (edited) BTW, where is /dev/hwrng? Rockchip 4.4 kernel has https://github.com/rockchip-linux/kernel/blob/develop-4.4/drivers/char/hw_random/rockchip-rng.c, so I assume that hardware number generator should be supported. Edited October 30, 2020 by Dmitry Antipov
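To see whether the running kernel has registered any hardware RNG driver at all, the hw_random sysfs attributes can be read directly. A minimal sketch, assuming the standard `/sys/class/misc/hw_random` location; the function name `hwrng_status` and the `HWRNG_SYSFS` override are made up. If the directory is missing, the hw_random core is most likely not enabled in that kernel build.

```shell
#!/bin/sh
# Location of the hw_random core attributes; overridable for testing.
HWRNG_SYSFS="${HWRNG_SYSFS:-/sys/class/misc/hw_random}"

# Print the hardware RNG drivers the kernel has registered and the one in use.
hwrng_status() {
    if [ ! -r "$HWRNG_SYSFS/rng_available" ]; then
        echo "hw_random: not present"
        return 1
    fi
    echo "available: $(cat "$HWRNG_SYSFS/rng_available")"
    echo "current: $(cat "$HWRNG_SYSFS/rng_current")"
}
```

An empty `available:` line would mean the core is there but no driver has bound to the hardware.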
fromport Posted October 30, 2020 Posted October 30, 2020 6 hours ago, usual user said: Out of curiosity, with mainline kernel and ondemand governor in place. Apply this: echo 40000 > /sys/devices/system/cpu/cpufreq/policy0/ondemand/sampling_rate echo 465000 > /sys/devices/system/cpu/cpufreq/policy4/ondemand/sampling_rate Does it still crash in your use case? If it does not crash any longer I will explain what is going on.
I am so glad my machine survived the night without crashing. It's now part of my lizardfs distributed storage pool and is still synchronizing.
Quote
Memory: Total Used Free Buffers
RAM: 3901088 3843136 57952 1328
Swap: 1950540 57468 1893072
Bootup: Thu Oct 29 22:20:18 2020 Load average: 1.26 1.61 1.68 1/275 7993
user : 01:00:25.07 1.8% page in : 81241566
nice : 00:00:01.07 0.0% page out: 410572768
system: 02:27:57.04 4.3% page act: 4335568
IOwait: 01:55:27.21 3.4% page dea: 226869
hw irq: 00:00:00.00 0.0% page flt: 2458237
sw irq: 00:37:46.36 1.1% swap in : 2239
idle : 2d 02:53:08.08 89.4% swap out: 15848
uptime: 09:49:06.91 context : 660800969
mtdblock0 70r sdb 14593r 354626w
mmcblk0 8184r 5 sdc 24794r 344540w
mmcblk0p1 8036r sdd 24769r 342203w
mmcblk1 193r sde 23975r 342134w
mmcblk1rpmb 4r zram0 1541r 2040
mmcblk1boot1 116r zram1 2802r 15849
mmcblk1boot0 116r sdf 1061502r 2503w
sda 46103r 350811w
eth0 TX 194.79GiB RX 302.53GiB lo TX 0.00B RX 0.00B
eth1 TX 0.00B RX 0.00B
Once it has run for 24 hours, I will reboot, change those parameters and report back.
Jaques-Ludwig Posted October 30, 2020 Posted October 30, 2020 (edited) Hello, got my helios64 a week ago. It's a great system. Now I wanted to install nextcloud. I used armbian-config, but it seems it didn't work. Is there a manual way to install nextcloud on helios64? Thanks a lot, Jaques-Ludwig Edited October 30, 2020 by Jaques-Ludwig
Werner Posted October 30, 2020 Posted October 30, 2020 13 minutes ago, Jaques-Ludwig said: I used armbian-config, Sure thing: https://docs.nextcloud.com/server/20/admin_manual/installation/
dancgn Posted October 30, 2020 Posted October 30, 2020 Has someone got Mayan EDMS installed? I tried to install it through armbian-config, but it doesn't work. It seems that it tries to install the wrong arch version. Should I update armbian-config through omv? Bad thing. For now the system has been working for a complete week! But with the oldest image. I'm afraid to update...