hartraft

  • Posts

    12
  • Joined

  • Last visited

Reputation Activity

  1. Like
    hartraft reacted to prahal in Upgrading to Bullseye (troubleshooting Armbian 21.08.1)   
    bisected eMMC breakage:
    06653ebc0ad2e0b7d799cd71a5c2933ed2fb7a66 is the first bad commit
     
    commit 06653ebc0ad2e0b7d799cd71a5c2933ed2fb7a66 Author: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Date: Thu May 20 01:12:23 2021 +0300 regulator: core: resolve supply for boot-on/always-on regulators commit 98e48cd9283dbac0e1445ee780889f10b3d1db6a upstream. For the boot-on/always-on regulators the set_machine_constrainst() is called before resolving rdev->supply. Thus the code would try to enable rdev before enabling supplying regulator. Enforce resolving supply regulator before enabling rdev. Fixes: aea6cb99703e ("regulator: resolve supply after creating regulator") Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20210519221224.2868496-1-dmitry.baryshkov@linaro.org Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> drivers/regulator/core.c | 6 ++++++ 1 file changed, 6 insertions(+)  
    is https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/drivers/regulator/core.c?id=98e48cd9283dbac0e1445ee780889f10b3d1db6a
    from thread https://lore.kernel.org/all/20210519221224.2868496-1-dmitry.baryshkov@linaro.org/
     
    diff --git a/drivers/regulator/core.c b/drivers/regulator/core.c index 7b3de8b0b1ca..043b5f63b94a 100644 --- a/drivers/regulator/core.c +++ b/drivers/regulator/core.c @@ -1422,6 +1422,12 @@ static int set_machine_constraints(struct regulator_dev *rdev) * and we have control then make sure it is enabled. */ if (rdev->constraints->always_on || rdev->constraints->boot_on) { + /* If we want to enable this regulator, make sure that we know + * the supplying regulator. + */ + if (rdev->supply_name && !rdev->supply) + return -EPROBE_DEFER; + if (rdev->supply) { ret = regulator_enable(rdev->supply); if (ret < 0) {  
    EDIT 1: this change to return EPROBE_DEFER if regulator has a supply name but no supply ready was intended to complete commit aea6cb99703e17019e025aa71643b4d3e0a24413 which  expects set_machine_constraints to return this eprobe_defer error to attempt to resolve the supply ( before attempting a second run of set_machine_constraints).
     
    One still need to find out if aea6cb99703e17019e025aa71643b4d3e0a24413 fails to resolve the supply and thus does not make a second attempt to set_machine_constraints thus leaves the regulator disabled or else.
     
     
    commit aea6cb99703e17019e025aa71643b4d3e0a24413 Author: Micha? Miros?aw <mirq-linux@rere.qmqm.pl> Date: Sat Sep 26 23:32:41 2020 +0200 regulator: resolve supply after creating regulator When creating a new regulator its supply cannot create the sysfs link because the device is not yet published. Remove early supply resolving since it will be done later anyway. This makes the following error disappear and the symlinks get created instead. DCDC_REG1: supplied by VSYS VSYS: could not add device link regulator.3 err -2 Note: It doesn't fix the problem for bypassed regulators, though. Fixes: 45389c47526d ("regulator: core: Add early supply resolution for regulators") Signed-off-by: Micha? Miros?aw <mirq-linux@rere.qmqm.pl> Link: https://lore.kernel.org/r/ba09e0a8617ffeeb25cb4affffe6f3149319cef8.1601155770.git.mirq-linux@rere.qmqm.pl Signed-off-by: Mark Brown <broonie@kernel.org> diff --git a/drivers/regulator/core.c b/drivers/regulator/core.c index ff8e99ca0306..9f704a6c4802 100644 --- a/drivers/regulator/core.c +++ b/drivers/regulator/core.c @@ -5280,15 +5280,20 @@ regulator_register(const struct regulator_desc *regulator_desc, else if (regulator_desc->supply_name) rdev->supply_name = regulator_desc->supply_name; - /* - * Attempt to resolve the regulator supply, if specified, - * but don't return an error if we fail because we will try - * to resolve it again later as more regulators are added. - */ - if (regulator_resolve_supply(rdev)) - rdev_dbg(rdev, "unable to resolve supply\n"); - ret = set_machine_constraints(rdev, constraints); + if (ret == -EPROBE_DEFER) { + /* Regulator might be in bypass mode and so needs its supply + * to set the constraints */ + /* FIXME: this currently triggers a chicken-and-egg problem + * when creating -SUPPLY symlink in sysfs to a regulator + * that is just being created */ + ret = regulator_resolve_supply(rdev); + if (!ret) + ret = set_machine_constraints(rdev, constraints); + else + rdev_dbg(rdev, "unable to resolve supply early: %pe\n", + ERR_PTR(ret)); + } if (ret < 0) goto wash;  
     
     
    My logs for bad shows:
    [ 1.257297] reg-fixed-voltage vcc1v8-sys-s0: Failed to register regulator: -517 [ 1.257446] reg-fixed-voltage vcc0v9-s3: Failed to register regulator: -517 [ 1.257588] reg-fixed-voltage avdd-0v9-s0: Failed to register regulator: -517 [ 1.257728] reg-fixed-voltage avdd-1v8-s0: Failed to register regulator: -517 [ 1.258114] reg-fixed-voltage pcie-power: Failed to register regulator: -517 [ 1.258312] reg-fixed-voltage vcc3v3-sys-s3: Failed to register regulator: -517 [ 1.258451] reg-fixed-voltage vcc3v0-sd: Failed to register regulator: -517 [ 1.258705] reg-fixed-voltage vcc5v0-usb: Failed to register regulator: -517 [ 1.261721] reg-fixed-voltage usblan-power: Failed to register regulator: -517 [ 2.158719] vdd_log: supplied by regulator-dummy [ 2.230749] rk808-regulator rk808-regulator: there is no dvs0 gpio [ 2.230781] rk808-regulator rk808-regulator: there is no dvs1 gpio [ 2.230792] rk808-regulator rk808-regulator: max buck steps per change: 4 [ 2.240854] rk808 0-001b: failed to register 12 regulator [ 2.249848] fan53555-regulator 0-0040: FAN53555 Option[8] Rev[1] Detected! [ 2.253127] fan53555-regulator 0-0041: FAN53555 Option[8] Rev[1] Detected! [ 2.423549] reg-fixed-voltage vcc1v8-sys-s0: Failed to register regulator: -517 [ 2.424263] reg-fixed-voltage vcc0v9-s3: Failed to register regulator: -517 [ 2.424830] reg-fixed-voltage avdd-0v9-s0: Failed to register regulator: -517 [ 2.425443] reg-fixed-voltage avdd-1v8-s0: Failed to register regulator: -517 [ 2.556287] rk808-regulator rk808-regulator: there is no dvs0 gpio [ 2.556318] rk808-regulator rk808-regulator: there is no dvs1 gpio [ 2.556328] rk808-regulator rk808-regulator: max buck steps per change: 4  
    while for good:
    [ 2.172993] vdd_log: supplied by regulator-dummy [ 2.330040] rk808-regulator rk808-regulator: there is no dvs0 gpio [ 2.330071] rk808-regulator rk808-regulator: there is no dvs1 gpio [ 2.330081] rk808-regulator rk808-regulator: max buck steps per change: 4 [ 2.347345] fan53555-regulator 0-0040: FAN53555 Option[8] Rev[1] Detected! [ 2.350525] fan53555-regulator 0-0041: FAN53555 Option[8] Rev[1] Detected!
  2. Like
    hartraft reacted to lanefu in Upgrading to Bullseye (troubleshooting Armbian 21.08.1)   
    I have slightly different approach to going's solution, but this would be compatible with it
     
     
  3. Like
    hartraft reacted to ebin-dev in Upgrading to Bullseye (troubleshooting Armbian 21.08.1)   
    @piter75 commited the emmc fix into Armbian and built Armbian 21.08.2 which is already available through the Armbian mirrors (via 'apt update && apt upgrade').
     
    It is save to upgrade your Buster or Bullseye installation to Armbian 21.08.2 - but emmc read speed will be about 50% what it used to be.
     
    You can roll back to the previous linux 5.10.43 kernel if you don't like that - just install those files with 'dpkg -i *.deb'.
     
  4. Like
    hartraft reacted to ebin-dev in Upgrading to Bullseye (troubleshooting Armbian 21.08.1)   
    Really ?
    If you are on macOS Big Sur all the drivers and apps are already there. If you need to access the serial console of your Helios64, just connect your mac to the USB-C port of your Helios64. A device such as  /dev/tty.usbserial-XXXXXXXX will be recognised by macOS. All you need to do then is to enter the following into the Terminal of your mac (X to be replaced):
     
    screen /dev/tty.usbserial-XXXXXXXX 1500000 -L  
    Edit: If the 'screen' command on macOS does not show the output of the bootloader (scrambled characters) you could use the app serial. The only thing you need to configure is to set the serial speed manually to 1500000 bps.
     
    If someone would like to contribute in kernel testing - the Armbian build system is very easy to install and use. All you need is an Ubuntu 21.04 (amd64) installation. All the rest is explained here. To compile your own kernel is then as simple as this:
    ./compile.sh BOARD=helios64 BRANCH=current KERNEL_ONLY=yes RELEASE=bullseye  
  5. Like
    hartraft reacted to piter75 in Upgrading to Bullseye (troubleshooting Armbian 21.08.1)   
    Armbian "current" (5.10.y) compiles without issues.
    I second @Igor's opinion that a change somewhere in this diff broke the eMMC.
    I tried reverting a few obvious parts of it, like the mmc driver changes, but without success.
     
    However I did find that with the unit I have the issue happens only in hs400{,es} modes.
    With those disabled my unit works fine and I can use nand-sata-install to transfer os from SD to eMMC successfully which is not possible with 5.10.60+ and hs400 enabled on eMMC.
     
    If anyone wants to check if switching eMMC to hs200 mode works also on their unit, here is how:
    Upgrade the kernel to 5.10.60, but don't reboot yet.
    Run:
    curl -o rk3399-kobol-helios64.dtb https://users.armbian.com/piter75/helios64/rk3399-kobol-helios64.dtb sudo cp rk3399-kobol-helios64.dtb /boot/dtb/rockchip/rk3399-kobol-helios64.dtb sudo reboot  
    If this workaround works I will disable hs400{,es} (again) in Armbian until the underlying issue is found.
    There will be a performance penalty to that change but keep in mind that Helios64 was originally released with hs200 and only recently gained hs400 back ;-)
    Below you can find the comparison between hs400 and hs200 modes using iozone.
     
  6. Like
    hartraft reacted to aprayoga in Kobol Team is pulling the plug ;(   
    Hi everyone, thank you for all of your support.
    I'll be still around but not in full time as before.
     
    I saw Helios64 has several issues in this new release. I will start looking into it.
  7. Like
    hartraft got a reaction from calvin_thefreak in Feature / Changes requests for future Helios64 board or enclosure revisions   
    Like everyone I'm sad for the news. While the reality is difficult, in the end Kobol still achieved a great job in delivering the product in these circumstances. Considering how many kickstarters fail and have vapourware products. My only hope is that this will inspire a future aarch64 NAS whether it's by Kobol or another manufacturer. 
     
    Would it be possible for the Kobol team to share the 3D print files for the drive trays? So that if some need repair we can fix this ourselves? 
  8. Like
    hartraft reacted to barnumbirr in Does anyone actually have a stable system?   
    I've had a secondary device connected to my Helios64 over serial for the last couple of months. Unfortunately, even at verbosity 7 it dies so suddenly that the output doesn't actually provide any valuable information:

     
    Starting kernel ... [ 2.721938] cacheinfo: Unable to detect cache hierarchy for CPU 0 [ 2.881028] vcc3v3_sys_s0: failed to get the current voltage: -EPROBE_DEFER [ 2.900490] dw_wdt ff848000.watchdog: No valid TOPs array specified [ 3.012992] dwmmc_rockchip fe320000.mmc: All phases bad! [ 3.013479] mmc1: tuning execution failed: -5 [ 3.013881] mmc1: error -5 whilst initialising SD card [ 3.135010] dwmmc_rockchip fe320000.mmc: All phases bad! [ 3.135502] mmc1: tuning execution failed: -5 [ 12.521811] rk_gmac-dwmac fe300000.ethernet: cannot get clock clk_mac_speed [ 15.212309] dw-apb-uart ff1a0000.serial: forbid DMA for kernel console [ 17.035708] lm75 2-004c: supply vs not found, using dummy regulator [ 18.572536] rk_gmac-dwmac fe300000.ethernet eth0: PTP not supported by HW [ 18.817621] OF: graph: no port node found in /i2c@ff3d0000/typec-portc@22 [ 18.833386] OF: graph: no port node found in /syscon@ff770000/usb2-phy@e450/otg-port [ 19.262970] [drm] unsupported AFBC format[3231564e] [ 19.366283] rockchip_vdec: module is from the staging directory, the quality is unknown, you have been warned. [ 19.410603] r8152 4-1.4:1.0 (unnamed net_device) (uninitialized): netif_napi_add() called with weight 256 [ 19.421732] hantro_vpu: module is from the staging directory, the quality is unknown, you have been warned. [ 24.903650] platform regulatory.0: Direct firmware load for regulatory.db failed with error -2 helios64 login: DDR Version 1.24 20191016 In soft reset SRX channel 0 CS = 0 MR0=0x18 MR4=0x1 MR5=0x1 MR8=0x10 MR12=0x72 MR14=0x72 This is what my uptime looks like over the last 90 days. Again, none of the reboots were triggered by me.


     
    Reboots wouldn't be such a pain the backside on their own if 90% of them didn't retrigger a mdadm RAID resync...
  9. Like
    hartraft reacted to wurmfood in Summary of troubleshooting items   
    There are a number of modifications that have been suggested that people implement to address certain issues.
     
    The ones I can find are:
    - In /boot/armbianEnv.txt:
    extraargs=libata.force=3.0 - If doing debugging, also add:
    verbosity=7 console=serial extraargs=earlyprintk ignore_loglevel  
    - In /boot/boot.cmd
    regulator dev vdd_log regulator value 930000 regulator dev vdd_center regulator value 950000 and then run:
    mkimage -C none -A arm -T script -d /boot/boot.cmd /boot/boot.scr  
    - In /etc/default/cpufrequtils:
    ENABLE=true MIN_SPEED=408000 MAX_SPEED=1800000 GOVERNOR=ondemand (or 1200000 instead of 1800000)
     
    - And if using ZFS:
    for disk in /sys/block/sd[a-e]/queue/scheduler; do echo none > $disk; done  
     
    I've gathered these from a variety of threads. Am I missing any here?
  10. Like
    hartraft reacted to griefman in Failing to Boot   
    So, 
    after struggling for quite some time i finally was able to get things back to normal without losing data (at least i think so).
     
    The problem was apparently a broken kernel upgrade to version 5.10.43 . There were broken symlinks and not only. 
     
    In the end what solved it was the following:
    I first put the latest stable armbian on an sd card and booted with it. then i actually upgraded that image to the latest kernel and copied all version related files from /boot of the sd card to the /boot of the mmc. I also copied the 5.10.45 modules from /lib/module from the sd card to the mmc.
    Fixed all symlinks in /boot and then the device finally booted. After that it was all about reinstalling kernel headers, cleaning up wrong zfs versions and packages and rebooting frequently enough in between
     
    Hope that i didnt break too much and that this helps someone.
  11. Like
    hartraft reacted to djurny in ArmbianEnv.txt file being overwritten   
    Hi,
    You can try the following to get an idea of which process is modifying the file:
     
    sudo apt-get install auditd sudo auditctl -w /boot/armbianEnv.txt -p wa sudo tail -F /var/log/audit/audit.log  
    The output should show actions performed on the file and (IIRC) the process ID performing those actions, perhaps that might help?
     
    e.g.:
    type=CWD msg=audit(1624526184.097:207): cwd="/home/djurny" type=PATH msg=audit(1624526184.097:207): item=0 name="/boot/" inode=7601 dev=b3:01 mode=040755 ouid=0 ogid=0 rdev=00:00 nametype=PARENT cap_fp=0 cap_fi=0 cap_fe=0 cap_fver=0 cap_frootid=0 type=PATH msg=audit(1624526184.097:207): item=1 name="/boot/armbianEnv.txt" inode=43198 dev=b3:01 mode=0100644 ouid=0 ogid=0 rdev=00:00 nametype=CREATE cap_fp=0 cap_fi=0 cap_fe=0 cap_fver=0 cap_frootid=0 type=PROCTITLE msg=audit(1624526184.097:207): proctitle=xx type=SYSCALL msg=audit(1624526184.109:208): arch=40000028 syscall=94 per=800000 success=yes exit=0 a0=3 a1=81a4 a2=c80eeb00 a3=81a4 items=1 ppid=9641 pid=9642 auid=1000 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts5 ses=2 comm="vi" exe="/usr/bin/vim.basic" subj==unconfined key=(null) type=PATH msg=audit(1624526184.109:208): item=0 name=(null) inode=43198 dev=b3:01 mode=0100644 ouid=0 ogid=0 rdev=00:00 nametype=NORMAL cap_fp=0 cap_fi=0 cap_fe=0 cap_fver=0 cap_frootid=0 type=PROCTITLE msg=audit(1624526184.109:208): proctitle=xx type=SYSCALL msg=audit(1624526184.109:209): arch=40000028 syscall=226 per=800000 success=yes exit=0 a0=1787820 a1=b6e3e1a0 a2=1907fc0 a3=1c items=1 ppid=9641 pid=9642 auid=1000 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts5 ses=2 comm="vi" exe="/usr/bin/vim.basic" subj==unconfined key=(null) type=CWD msg=audit(1624526184.109:209): cwd="/home/djurny" type=PATH msg=audit(1624526184.109:209): item=0 name="/boot/armbianEnv.txt" inode=43198 dev=b3:01 mode=0100644 ouid=0 ogid=0 rdev=00:00 nametype=NORMAL cap_fp=0 cap_fi=0 cap_fe=0 cap_fver=0 cap_frootid=0 type=PROCTITLE msg=audit(1624526184.109:209): proctitle=xx  
    Source: Find which process is modifying a file [duplicate]
     
    It honestly sounds a bit more like a filesystem that has lost track or a inode mixup somewhere? As someone mentioned, perhaps something else is writing to this file, thinking it is something else? Perhaps try to locate a symbolic (or hard)link to armbanEnv.txt:
    ## find any symbolic links to armbianEnv.txt sudo find / -xdev -type l -ls | egrep -i -- armbianEnv.txt ## find hardlinks to armbianEnv.txt - need to be on the same filesystem! sudo find / -xdev -samefile /boot/armbianEnv.txt  
    Groetjes,
  12. Like
    hartraft reacted to Polarisgeek in ArmbianEnv.txt file being overwritten   
    I've been having a recent issue over the last couple months where my ArmbianEvn.txt file is being overwritten thus causing Helios to hang in the boot process.  Using some samples i've found on these forums, I was able to re-create my file and get the Helios back up and running and have since stored a copy so that when it happens, I can just copy and paste and back up running in no time.  To date, i've had this happen to me 4 times.
     
    I'm not worried about having go through this at this point in time, but I was curious if anyone else has seen this issue or not?  I've saved a couple samples of what was written into the ArmbianEnv.txt file:
    /var/log/armbian-hardware-monitor.log {
      rotate 12
      weekly
      compress
      missi
     
    /var/log/alternatives.log {
    monthly
    rotate 12
    compress
    delaycompress
    missingok
    notifempty
    create 644 root root
    }
     
    I'm running Buster 5.10.21.  I have 5-14TB EXOS drives in Raid 6 and have OMV as the only thing installed.  
  13. Like
    hartraft reacted to SIGSEGV in We want YOU for armbian   
    Hello @Igor, @Werner,
    I'm open to collaborating to the armbian project, my background is on software development and system administration and I have a small Helios64 to test things with.
  14. Like
    hartraft reacted to Heisath in eMMc drive filled to 90% overnight and no idea why.   
    might need to do 
    sudo ncdu -x /  
    Then you can navigate through a tree-view, and figure out where the size comes from. 
     
    As a general step-by-step guide:
    - use "df -h" (or something like that), to figure out which device / mountpoint is getting full
    user@builder:~$ df -h Filesystem Size Used Avail Use% Mounted on tmpfs 1,6G 1,2M 1,6G 1% /run /dev/sda3 79G 31G 44G 41% / tmpfs 7,9G 0 7,9G 0% /dev/shm tmpfs 5,0M 0 5,0M 0% /run/lock tmpfs 4,0M 0 4,0M 0% /sys/fs/cgroup /dev/sda2 94M 5,2M 89M 6% /boot/efi tmpfs 1,6G 136K 1,6G 1% /run/user/1000 We now know, /dev/sda3 is fullest, and is mounted on /
     
    - next use "sudo ncdu -x /" to check the contents / where the space goes:
    ncdu 1.15.1 ~ Use the arrow keys to navigate, press ? for help --- / -------------------------------------------------------------------------- 17,2 GiB [##########] /home 8,5 GiB [#### ] /usr 2,0 GiB [# ] swapfile 1,0 GiB [ ] /var 760,0 MiB [ ] /root 508,4 MiB [ ] /opt 251,1 MiB [ ] /boot 11,6 MiB [ ] /etc 68,0 KiB [ ] /tmp 36,0 KiB [ ] /snap e 16,0 KiB [ ] /lost+found 8,0 KiB [ ] /media e 4,0 KiB [ ] /srv e 4,0 KiB [ ] /mnt e 4,0 KiB [ ] /cdrom @ 0,0 B [ ] libx32 @ 0,0 B [ ] lib64 @ 0,0 B [ ] lib32 @ 0,0 B [ ] sbin @ 0,0 B [ ] lib @ 0,0 B [ ] bin Total disk usage: 30,1 GiB Apparent size: 28,8 GiB Items: 789697 We see that most storage goes to /home, /usr and the swapfile.
     
    As the interface is interactive, you can just navigate down the folder structure to find the culprit.
  15. Like
    hartraft reacted to halfa in Armbian 21.05.2 Focal with Linux 5.10.35-rockchip64: fancontrol die in error, fans not spinning   
    For anybody passing by, the issue is due to the fact that for some reason the armbian-bsp-cli-helios64 package for 21.05.2 (EDIT: clarify, 21.05.1 is fine as seen below) was build with the old udev rule (for 4.4 kernels):
    $ ls armbian-bsp-cli-helios64_21.05.1_arm64.deb\data.tar\.\etc\udev\rules.d\ 10-wifi-disable-powermanagement.rules 50-mali.rules 50-rk3399-vpu.rules 50-usb-realtek-net.rules 70-keep-usb-lan-as-eth1.rules 90-helios64-hwmon.rules 90-helios64-ups.rules $ ls armbian-bsp-cli-helios64_21.05.2_arm64.deb\data.tar\.\etc\udev\rules.d\ 10-wifi-disable-powermanagement.rules 50-mali.rules 50-rk3399-vpu.rules 50-usb-realtek-net.rules 70-keep-usb-lan-as-eth1.rules 90-helios64-hwmon-legacy.rules 90-helios64-ups.rules The content of the 90-helios64-hwmon.rules is indeed correct and match the 5.10.x kernel device tree: https://github.com/armbian/build/blob/master/packages/bsp/helios64/90-helios64-hwmon.rules
     
    I tried reversing the build system to find why the old file was used instead of the other, but the best I could find is
    # in config/sources/families/include/rockchip64_common.inc 395 ### Fancontrol tweaks 396 # copy hwmon rules to fix device mapping 397 if [[ $BRANCH == legacy ]]; then 398 install -m 644 $SRC/packages/bsp/helios64/90-helios64-hwmon-legacy.rules $destination/etc/udev/rules.d/ 399 else 400 install -m 644 $SRC/packages/bsp/helios64/90-helios64-hwmon.rules $destination/etc/udev/rules.d/ 401 fi  
  16. Like
    hartraft reacted to halfa in Armbian 21.05.2 Focal with Linux 5.10.35-rockchip64: fancontrol die in error, fans not spinning   
    Posting here following what was recommended on twitter.
    After updating my helios64 earlier this week and rebooting to get the new kernel, I realized it was suspiciously silent.
    A quick check to sensor temps readings and physical check made me realize the fan were not spinning.
     
    After a quick read on the wiki, I checked fancontrol which was indeed failing:
    root@helios64:~ # systemctl status fancontrol.service ● fancontrol.service - fan speed regulator Loaded: loaded (/lib/systemd/system/fancontrol.service; enabled; vendor preset: enabled) Drop-In: /etc/systemd/system/fancontrol.service.d └─pid.conf Active: failed (Result: exit-code) since Fri 2021-05-28 00:08:13 CEST; 1min 42s ago Docs: man:fancontrol(8) man:pwmconfig(8) Process: 2495 ExecStartPre=/usr/sbin/fancontrol --check (code=exited, status=0/SUCCESS) Process: 2876 ExecStart=/usr/sbin/fancontrol (code=exited, status=1/FAILURE) Main PID: 2876 (code=exited, status=1/FAILURE) May 28 00:08:13 helios64 fancontrol[2876]: MINPWM=0 May 28 00:08:13 helios64 fancontrol[2876]: MAXPWM=255 May 28 00:08:13 helios64 fancontrol[2876]: AVERAGE=1 May 28 00:08:13 helios64 fancontrol[2876]: Error: file /dev/thermal-cpu/temp1_input doesn't exist May 28 00:08:13 helios64 fancontrol[2876]: Error: file /dev/thermal-cpu/temp1_input doesn't exist May 28 00:08:13 helios64 fancontrol[2876]: At least one referenced file is missing. Either some required kernel May 28 00:08:13 helios64 fancontrol[2876]: modules haven't been loaded, or your configuration file is outdated. May 28 00:08:13 helios64 fancontrol[2876]: In the latter case, you should run pwmconfig again. May 28 00:08:13 helios64 systemd[1]: fancontrol.service: Main process exited, code=exited, status=1/FAILURE May 28 00:08:13 helios64 systemd[1]: fancontrol.service: Failed with result 'exit-code'.  
    Basically fancontrol expect a device in /dev to read the sensors value from, and that device seems to be missing. After a bit of poking around and learning about udev, I managed to manually solve the issue by recreating the device symlink manually:
    /usr/bin/mkdir /dev/thermal-cpu/ ln -s /sys/devices/virtual/thermal/thermal_zone0/temp /dev/thermal-cpu/temp1_input systemctl restart fancontrol.service systemctl status fancontrol.service Now digging more this issue happen because udev is not creating the symlink like it should for some reason. After reading the rule in /etc/udev/rules.d/90-helios64-hwmon-legacy.rules and a bit of udev documentation, I managed to find how to test it:
    root@helios64:~ # udevadm test /sys/devices/virtual/thermal/thermal_zone0 [...] Reading rules file: /etc/udev/rules.d/90-helios64-hwmon-legacy.rules Reading rules file: /etc/udev/rules.d/90-helios64-ups.rules [...] DEVPATH=/devices/virtual/thermal/thermal_zone0 ACTION=add SUBSYSTEM=thermal IS_HELIOS64_HWMON=1 HWMON_PATH=/sys/devices/virtual/thermal/thermal_zone0 USEC_INITIALIZED=7544717 run: '/bin/ln -sf /sys/devices/virtual/thermal/thermal_zone0 ' <-- something is wrong here, there is no target Unload module index Unloaded link configuration context. After spending a bit more time reading the udev rule, I realized that the second argument was empty because we don't match the ATTR{type}=="soc-thermal" condition. We can look up the types like this:
    root@helios64:~ # find /sys/ -name type | grep thermal /sys/devices/virtual/thermal/cooling_device1/type /sys/devices/virtual/thermal/thermal_zone0/type /sys/devices/virtual/thermal/cooling_device4/type /sys/devices/virtual/thermal/cooling_device2/type /sys/devices/virtual/thermal/thermal_zone1/type /sys/devices/virtual/thermal/cooling_device0/type /sys/devices/virtual/thermal/cooling_device3/type /sys/firmware/devicetree/base/thermal-zones/gpu/trips/gpu_alert0/type /sys/firmware/devicetree/base/thermal-zones/gpu/trips/gpu_crit/type /sys/firmware/devicetree/base/thermal-zones/cpu/trips/cpu_crit/type /sys/firmware/devicetree/base/thermal-zones/cpu/trips/cpu_alert0/type /sys/firmware/devicetree/base/thermal-zones/cpu/trips/cpu_alert1/type root@helios64:~ # cat /sys/devices/virtual/thermal/thermal_zone0/type cpu <-- we were expecting soc-thermal! and after rewriting the line with the new type, udev is happy again
    # Edit in /etc/udev/rules.d/90-helios64-hwmon-legacy.rules and add the following line after the original one ATTR{type}=="cpu", ENV{HWMON_PATH}="/sys%p/temp", ENV{HELIOS64_SYMLINK}="/dev/thermal-cpu/temp1_input", RUN+="/usr/bin/mkdir /dev/thermal-cpu/" root@helios64:~ # udevadm control --reload root@helios64:~ # udevadm test /sys/devices/virtual/thermal/thermal_zone0 [...] DEVPATH=/devices/virtual/thermal/thermal_zone0 ACTION=add SUBSYSTEM=thermal IS_HELIOS64_HWMON=1 HWMON_PATH=/sys/devices/virtual/thermal/thermal_zone0/temp HELIOS64_SYMLINK=/dev/thermal-cpu/temp1_input USEC_INITIALIZED=7544717 run: '/usr/bin/mkdir /dev/thermal-cpu/' run: '/bin/ln -sf /sys/devices/virtual/thermal/thermal_zone0/temp /dev/thermal-cpu/temp1_input' Unload module index Unloaded link configuration context. Apparently for some reason the device-tree changed upstream and the thermal type changed from soc-thermal to cpu?
  17. Like
    hartraft reacted to wurmfood in Kobol Team is taking a short Break !   
    Enjoy your well deserved break. It's been a rough year for all of us, but I will saying that getting my Helios64 set up and running has been a welcome diversion and bright spot during this time.
    Thanks all and I can't wait to see what comes next.
  18. Like
    hartraft reacted to gprovost in Kobol Team is taking a short Break !   
    It’s been 3 months since we posted on our blog. While we have been pretty active on Armbian/Kobol forum for support and still working at improving the software support and stability, we have been developing in parallel the new iteration of Helios64 around the latest Rockchip SoC RK3568.
     
    However things haven’t been progressing as fast as we would have wished. Looking back, 2020 has been a very challenging year to deliver a new product and it took quite a toll on the small team we are. Our energy level is a bit low and we still haven’t really recovered. Now with electronic part prices surge and crazy lead time, it’s even harder to have business visibility in an already challenging market.
     
    In light of the above, we decided to go on a full break for the next 2 months, to recharge our battery away from Kobol and come back with a refocused strategy and pumped up energy.
     
    Until we are back, we hope you will understand that communication on the different channels (blog, wiki, forum, support email) will be kept to a minimum for the next 2 months.
     
    Thanks again all for your support.
  19. Like
    hartraft reacted to tkaiser in How to provide and interpret Debug output   
    Armbian implements some basic 'system logging' at every startup and shutdown and contains a little utility to provide this collected information combined with some more useful debug info from user installations. All that's needed is executing 'sudo armbianmonitor -u' (or armbian-config --> Software --> Diagnostics) and then all this support related information is uploaded to an online pasteboard service automagically (see example output) 
     

     
    How to interpret this wall of text? The output is best read from bottom to top since the most important information is collected during upload:
     
    At the bottom /proc/interrupts contents to check for IRQ affinity problems (interrupt collissions on some CPU cores negatively affecting performance) Then last 250 lines dmesg output are included. Here you might find important information wrt the last (kernel) events that happened on the machine uptime output including average load statistics (1, 5 and 15 min) free output telling how much physical memory is available and how much swap and/or zram is used (you need to look directly above whether zram is active or not to interpret the 'Swap:' line) vmstat output contains virtual memory usage information since last reboot iostat output contains the same but allows for a 'per device' view since all devices are listed with individual statistics (so it's easy to spot IO bottlenecks by looking at these numbers and also looking at %iowait value) 'Current system health' displays what the system is actually doing while uploading the debug log (on systems where DC-IN monitoring is available also allowing for underpowering diagnosis -- if you read here numbers below 5.0V stop reading the log and tell the user to fix his underpowering issues first) In case the installation has been moved from SD card to other storage nand-sata-install.log will be included in the output 'Loaded modules' allow to look for module related problems If it's an Allwinner board running legacy kernel the whole script.bin contents are included 'Installed packages' shows version numbers of relevant Armbian packages 'Group membership of' should list all groups the user is member of. If this line is missing ignore the whole contents and ask the user to re-submit debug info, this time doing it correctly not as root but using 'sudo armbianmonitor -u' (group memberships are important to understand certain problems, eg. users not being member of audio group won't have success getting noise out of their devices) If the board is PCIe capable list of attached PCIe devices is included The lsusb output lists all connected USB devices and also information about speed (12M, 480M, 5000M) and protocol/connection details (mass-storage vs uas for example) If the user installed the lshw utility and verbosity is set to 4 or above in /boot/armbianEnv.txt some more disk related information will be included  
    Important: The debug output also contains all collected support files that follow this naming scheme: /tmp/armbianmonitor_checks_* -- so if a user complains about 'transmission so slow' or 'latest files are always missing' ask him to run 'armbianmonitor -c /path/to/torrent-storage' and afterwards 'sudo armbianmonitor -u' without a reboot in between since then the checking results will also be contained.
     
    Everything above of this information at the output's bottom is result of regular logging at startup and shutdown (the contents of /var/log/armhwinfo.log). 
     
    At startup the following items are logged: dmesg output, /etc/armbian-release and /boot/armbianEnv.txt contents, lsusb and lscpu output, /proc/cpuinfo and /proc/meminfo contents, network interface information, available partitions and filesystems, on Allwinner boards where /boot/script.bin points to, some metadata information for all MMC media connected to the host (eg. SD card and/or eMMC) and some system health information.
     
    At shutdown iostat, vmstat and free output are added to /var/log/armhwinfo.log as well as the last 100 lines from dmesg output. If these '### shutdown' entries are missing after reboots the system crashed while shutting down.
  20. Like
    hartraft reacted to piter75 in How to make a SD card bootable with U-Boot   
    @Chalix You have done quite a research already
    Helios64 is Rockchip RK3399 based so you are better off with the theory found here http://opensource.rock-chips.com/wiki_Boot_option as a starter.
     
    U-boot and vendor blob binaries are found in "/usr/lib/linux-u-boot-$BRANCH-helios64_$RELEASE_arm64" folder on your system.
    You are probably using "current" $BRANCH and the latest $RELEASE is 21.02.3.
     
    The recipe to write the images to device (more or less) is to be found here.
     
  21. Like
    hartraft reacted to lyuyhn in Information on Autoshutdown / sleep / WOL?   
    Any news on this side? I'm really looking forward to be able to suspend and use wol!
  22. Like
    hartraft reacted to Polarisgeek in Does anyone actually have a stable system?   
    Just wanted to add my experience as maybe someone can get something semi-useful out of it. 
     
    I started out with 3 - 4TB drives with 2 of them in Raid 1 and the 3rd on it's own.  I installed OMV and CUPS on a Micro SD card.  Even from the very beginning, I always had trouble restarting the system.  Sometimes it would boot after 3 tries, sometimes it would take 20 tries.  I watched and it was never the same issue twice that caused the boot failure.  I removed drives, plugged in different combo's, several fixes on the forums, anything I could think of and nothing really seemed to make a difference.  The good news was that once it was up and running, it ran great.  I had 2 months of uptime with no issues.  Then i'd run an update and have a heck of a time getting it restarted.  Just figured that's the way it was and hoped things would get better as the software matured.  
     
    I'm now on my second iteration with 5 - 14TB (Thanks Uncle Joe) drives in Raid 6.  After 2 months, i've had 2 random crashes for unknown reasons, one at 3 days and the other after 7 days.  Aside from running updates and rebooting, it's been running solid for me.  I'm only running OMV and a CUPS server.    Getting to this point was a huge PITA.  I spent nearly 8 hours trying to get it up and running over the course of a Sunday.  It seemed like no matter what I did, i'd have a Kernel Panic during the setup process, and then the system wouldn't boot at all giving different errors every time.  So i'd re-download the image, and attempt to install from scratch again.  I played with installing different options, in different sequences (as much as possible), without much luck.  I verified the download every time.  I finally tried using USBImager, as an alternative to BalenaEtcher, and found that wring to the sdcard seemed better and the installation seemed better, although it did panic a couple times.  Once I finally got it running, I gave it some time, installed all the updates, reboot several times in between setting up OMV for testing and have been happy with it since.  The biggest difference now is that I don't worry about rebooting as it comes right up the first time instead of several tries like the first iteration.  
     
    Overall, i've been happy with mine.  I'm very new to the Linux environment so it's been fun learning a new thing.  Keep up the great work!
  23. Like
    hartraft got a reaction from gprovost in Does anyone actually have a stable system?   
    My system has been "stable" running the LK 5.9. I updated to 5.10 in the past but had issues with stability under small load, not rebooting through omv etc. so I went back to the previous version. 
    Running MergerFS / SnapRAID / LUKS on Seagate Ironwolves with the 1.2ghz frequency 
     
    $ uname - a
    Linux helios64 5.9.14-rockchip64 #20.11.3 SMP PREEMPT Fri Dec 11 20:50:18 CET 2020 aarch64 GNU/Linux
     
    $ uptime
     20:25:53 up 67 days, 19:23, 2 users, load average: 0.08, 0.12, 0.12
     
    I'm sticking with this until things get clearer but would like to get full use of my NAS hardware... 
  24. Like
    hartraft reacted to aprayoga in SATA issue, drive resets: ataX.00: failed command: READ FPDMA QUEUED   
    Hi all, you could download the SATA firmware update at https://cdn-kobol-io.s3-ap-southeast-1.amazonaws.com/Helios64/SATA_FW/Helios64_SATA_FW_update_00021_200624.img.xz
     
    Instruction:
    1. Download the sd card image
    2. Flash the image into a microSD card
    3. Insert microSD card into Helios64 and power on. Helios64 should automatically boot from microSD. If Helios64 still boot from eMMC, disable the eMMC 
    4. Wait for a while, the system will do a reboot and then power off if firmware flashing succeed.
       If failed, both System Status and System Fault LEDs will blink
    5. Remove the microSD card and boot Helios64 normally. See if there is any improvement.
     
    Our officially supported stock firmware can be downloaded from https://cdn-kobol-io.s3-ap-southeast-1.amazonaws.com/Helios64/SATA_FW/Helios64_SATA_FW_factory_00020_190702.img.xz. If there is no improvement on newer firmware, please revert back to this stock firmware.
     
    SHA256SUM:
    e5dfbe84f4709a3e2138ffb620f0ee62ecbcc79a8f83692c1c1d7a4361f0d30f *Helios64_SATA_FW_factory_00020_190702.img.xz 0d78fec569dd699fd667acf59ba7b07c420a2865e1bcb8b85b26b61d404998c5 *Helios64_SATA_FW_update_00021_200624.img.xz  
  25. Like
    hartraft reacted to aprayoga in UPS service and timer   
    @SIGSEGV @clostro the service is triggered by udev rules.
     
    @wurmfood, I didn't realize the timer fill the the log every 20s. Initial idea was one time timer, poweroff after 10m of power loss event. Then it was improved to add polling of the battery level and poweroff when threshold reached. Your script look good, we are considering to adapt it to official release. Thank you