djurny

  • Posts

    62
Reputation Activity

  1. Like
    djurny got a reaction from IgorS in Real time clock DS3231   
    Hi @IgorS,
    After following the how-tos on this forum and the internets, I would like to share some other things I did to make it work even better on my OrangePi Zero.
     
    Connect DS3231 to TWI0 (PA11+PA12 and +5V/GND of course).
    Add i2c0 overlay to /boot/armbianEnv.txt (or use armbian-config to enable the i2c0 overlay):
        [...]
        overlays=usbhost2 usbhost3 uart1 pps-gpio i2c0
        [...]
    Add custom overlay to add DS3231 RTC (using DS3232 module instead of DS1307). Save as rtc0-i2c0-ds3231.dts.
     
    Add the custom DT overlay:
        sudo armbian-add-overlay rtc0-i2c0-ds3231.dts
    Add custom overlay to rename H2+ SoC RTC to rtc1. Save as rtc1-soc.dts.
     
    Add the custom DT overlay:
        sudo armbian-add-overlay rtc1-soc.dts
    Disable fake-hwclock service:
        sudo systemctl stop fake-hwclock.service    # stop
        sudo systemctl disable fake-hwclock.service # disable
        sudo systemctl mask fake-hwclock.service    # really disable
    Reboot and verify that you now have 2x RTC on your OrangePi Zero:
        root@sinaspi:~# ls -l /dev/rtc*
        lrwxrwxrwx 1 root root 4 Aug 5 22:57 /dev/rtc -> rtc0
        crw------- 1 root root 253, 0 Aug 5 22:57 /dev/rtc0
        crw------- 1 root root 253, 1 Aug 5 22:57 /dev/rtc1
    As mentioned in another post, on the H2+ the SoC supplied RTC is indeed running fast, confirm this as follows:
        for RTC in /dev/rtc[0-9]
        do
          hwclock --rtc="${RTC:?}" --adjust
          hwclock --rtc="${RTC:?}" --systohc
        done
        sleep $(( 5 * 60 ))
        for RTC in /dev/rtc[0-9]
        do
          echo "${RTC:-N/A}:"
          hwclock --rtc="${RTC:?}" --get
          date --rfc-3339=ns
        done
    On my OrangePi Zero the SoC RTC is dashing ahead:
     
        /dev/rtc0:
        2021-08-08 09:17:52.760046+08:00
        2021-08-08 09:17:52.526062078+08:00
        /dev/rtc1:
        2021-08-08 11:59:11.150733+08:00
        2021-08-08 09:17:54.392611945+08:00
    (rtc0 = DS3231 and rtc1 = SoC RTC.)
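    To put a number on how far an RTC has run ahead, the two timestamps printed per device (hwclock first, system clock second) can be subtracted. This is only a minimal sketch, assuming GNU date (for -d); the helper name rtc_offset_seconds is made up for illustration and is not part of the original steps.

```shell
# Hypothetical helper: difference in whole seconds between an RTC reading
# and a system clock reading, both given as strings GNU date can parse.
rtc_offset_seconds() {
    rtc_epoch=$(date -d "$1" +%s) || return 1
    sys_epoch=$(date -d "$2" +%s) || return 1
    echo $(( rtc_epoch - sys_epoch ))
}

# Values taken from the /dev/rtc1 lines of the output (fractions dropped).
rtc_offset_seconds "2021-08-08 11:59:11+08:00" "2021-08-08 09:17:54+08:00"
```

    A positive result means the RTC is ahead of the system clock; for the /dev/rtc1 lines this comes to 9677 seconds, so a bit over 2 hours 41 minutes.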
     
    On the same OrangePi Zero, there is also a GPS receiver connected that has PPS output. Used @Elektrický's how to, to set up GPS and ntpsec. After successfully following the how to, ntp will synchronize and adjust system clock to high(er) accuracy. Once system clock is synchronized, the kernel will also update the RTC (/dev/rtc0) every 11 minutes, giving you a system as follows:
        root@sinaspi:~# timedatectl
        Local time: Sun 2021-08-08 09:24:01 CST
        Universal time: Sun 2021-08-08 01:24:01 UTC
        RTC time: Sun 2021-08-08 01:24:02
        Time zone: Asia/Taipei (CST, +0800)
        System clock synchronized: yes
        NTP service: inactive
        RTC in local TZ: no
        root@sinaspi:~# ntpq -p
        remote refid st t when poll reach delay offset jitter
        =======================================================================================================
        0.debian.pool.ntp.org .POOL. 16 p - 256 0 0.0000 0.0000 0.0019
        1.debian.pool.ntp.org .POOL. 16 p - 256 0 0.0000 0.0000 0.0019
        2.debian.pool.ntp.org .POOL. 16 p - 256 0 0.0000 0.0000 0.0019
        3.debian.pool.ntp.org .POOL. 16 p - 64 0 0.0000 0.0000 0.0019
        oPPS(0) .PPS. 0 l 56 64 377 0.0000 -0.0072 0.0035
        xSHM(0) .GPS. 0 l 21 64 377 0.0000 -15.3659 1.7862
        +SHM(2) .PPS. 0 l 18 64 377 0.0000 -0.0163 0.0094
        +europa.ellipse.net 209.180.247.49 2 u 40 64 377 167.9069 -0.9358 0.1437
        +ntp1.time.nl .MRS. 1 u 43 64 377 210.9643 1.7869 0.1945
        +promethee.boudot.one 94.198.159.10 2 u 21 64 377 220.2528 2.3322 0.0931
        root@sinaspi:~#
    As the DS3231 is now set as rtc0, the udev rules in /lib/udev/rules.d/85-hwclock.rules will make sure to read the DS3231 clock time after a reboot, making sure your system clock has a nice starting offset after being powered off for a while.
     
    Hope this helps anyone out there,
    Groetjes,
  2. Like
    djurny got a reaction from Elektrický in Orange Pi Zero NTP Stratum 1 PPS GPS Server with Armbian OS, Hardware and Software Tutorial   
    Hi @Elektrický,
    Nice tutorial, worked great for me. Bought a cheap GPS module:

     
    And a pinheader to solder onto the OrangePi Zero board itself.

     
    The only thing I had to change was to swap the RX/TX wires, as a straight connection did not work for my GPS board.


     
    I did not have any success getting PPS to work through the USB connection (works as ttyACM). I had to use the GPIO dtb overlay and electrical connection as per your tut.
    gpsd in action:

     
    ntp[sec] in action:

     
    Thanks a lot!
    Groetjes,
     
    p.s. I still have to figure out why I needed this
  3. Like
    djurny reacted to tparys in Unable to mount LUKS-encrypted disk   
    I think @lanefu was referring to doing a diff between the configs of the rockchip64 kernel which does not work, and the sunxi kernel which does.
     
    Based on the error, I'd wager the keyslot open is failing because you're missing a hash function. Believe the default is SHA1, unless you specified something different. Running "cryptsetup luksDump /dev/sda1" or whatever, should tell you for sure.
     
    If your system has AF_ALG compiled in, you can also run "cryptsetup benchmark", which runs typical hash and cipher algorithms used by LUKS. If it fails on something, that could be a clue.
  4. Like
    djurny reacted to Igor in Armbian 21.08 (Caracal) Release Thread   
    Manually or by driving our test rig?
     
     
     
  5. Like
    djurny got a reaction from hartraft in ArmbianEnv.txt file being overwritten   
    Hi,
    You can try the following to get an idea of which process is modifying the file:
     
        sudo apt-get install auditd
        sudo auditctl -w /boot/armbianEnv.txt -p wa
        sudo tail -F /var/log/audit/audit.log
    The output should show actions performed on the file and (IIRC) the process ID performing those actions, perhaps that might help?
     
    e.g.:
        type=CWD msg=audit(1624526184.097:207): cwd="/home/djurny"
        type=PATH msg=audit(1624526184.097:207): item=0 name="/boot/" inode=7601 dev=b3:01 mode=040755 ouid=0 ogid=0 rdev=00:00 nametype=PARENT cap_fp=0 cap_fi=0 cap_fe=0 cap_fver=0 cap_frootid=0
        type=PATH msg=audit(1624526184.097:207): item=1 name="/boot/armbianEnv.txt" inode=43198 dev=b3:01 mode=0100644 ouid=0 ogid=0 rdev=00:00 nametype=CREATE cap_fp=0 cap_fi=0 cap_fe=0 cap_fver=0 cap_frootid=0
        type=PROCTITLE msg=audit(1624526184.097:207): proctitle=xx
        type=SYSCALL msg=audit(1624526184.109:208): arch=40000028 syscall=94 per=800000 success=yes exit=0 a0=3 a1=81a4 a2=c80eeb00 a3=81a4 items=1 ppid=9641 pid=9642 auid=1000 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts5 ses=2 comm="vi" exe="/usr/bin/vim.basic" subj==unconfined key=(null)
        type=PATH msg=audit(1624526184.109:208): item=0 name=(null) inode=43198 dev=b3:01 mode=0100644 ouid=0 ogid=0 rdev=00:00 nametype=NORMAL cap_fp=0 cap_fi=0 cap_fe=0 cap_fver=0 cap_frootid=0
        type=PROCTITLE msg=audit(1624526184.109:208): proctitle=xx
        type=SYSCALL msg=audit(1624526184.109:209): arch=40000028 syscall=226 per=800000 success=yes exit=0 a0=1787820 a1=b6e3e1a0 a2=1907fc0 a3=1c items=1 ppid=9641 pid=9642 auid=1000 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts5 ses=2 comm="vi" exe="/usr/bin/vim.basic" subj==unconfined key=(null)
        type=CWD msg=audit(1624526184.109:209): cwd="/home/djurny"
        type=PATH msg=audit(1624526184.109:209): item=0 name="/boot/armbianEnv.txt" inode=43198 dev=b3:01 mode=0100644 ouid=0 ogid=0 rdev=00:00 nametype=NORMAL cap_fp=0 cap_fi=0 cap_fe=0 cap_fver=0 cap_frootid=0
        type=PROCTITLE msg=audit(1624526184.109:209): proctitle=xx
    Source: Find which process is modifying a file [duplicate]
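    The audit log gets noisy quickly, so it can help to boil the SYSCALL records down to who touched the file. A minimal sketch, assuming the record layout shown in the example output (space-separated key=value fields); the function name audit_writers is made up for illustration.

```shell
# Hypothetical helper: read audit.log lines on stdin and print
# "pid comm exe" for every SYSCALL record.
audit_writers() {
    awk '/type=SYSCALL/ {
        pid = "?"; comm = "?"; exe = "?"
        for (i = 1; i <= NF; i++) {
            # Fields are matched from their start, so "ppid=" is not mistaken for "pid=".
            if ($i ~ /^pid=/)       { pid  = substr($i, 5) }
            else if ($i ~ /^comm=/) { comm = substr($i, 6); gsub(/"/, "", comm) }
            else if ($i ~ /^exe=/)  { exe  = substr($i, 5); gsub(/"/, "", exe) }
        }
        print pid, comm, exe
    }'
}
```

    Usage would be something like: sudo cat /var/log/audit/audit.log | audit_writers. Adding a key to the watch (auditctl ... -k somekey) and searching with ausearch -k somekey is another way to narrow things down.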
     
    It honestly sounds a bit more like a filesystem that has lost track, or an inode mix-up somewhere? As someone mentioned, perhaps something else is writing to this file, thinking it is something else? Perhaps try to locate a symbolic (or hard) link to armbianEnv.txt:
        ## find any symbolic links to armbianEnv.txt
        sudo find / -xdev -type l -ls | egrep -i -- armbianEnv.txt
        ## find hardlinks to armbianEnv.txt - need to be on the same filesystem!
        sudo find / -xdev -samefile /boot/armbianEnv.txt
    Groetjes,
  6. Like
    djurny got a reaction from Polarisgeek in ArmbianEnv.txt file being overwritten   
  7. Like
    djurny got a reaction from lanefu in Orange Pi Zero NTP Stratum 1 PPS GPS Server with Armbian OS, Hardware and Software Tutorial   
  8. Like
    djurny got a reaction from lanefu in Script to check the power status, and shutdown if battery is low!   
    Hi,
    Have you checked "nut" (network UPS tools) to see if they have some prefab solutions?
    Perhaps a custom nut driver, as I recall they have this type of behavior already available for "regular" UPSes.
    Groetjes,
  9. Like
    djurny got a reaction from lanefu in Help with udev rules to statically assign device names for USB-UARTS based on USB port ID   
    Hi, sorry, I forgot to mention that the configure-serial-tty script does require some modification on the target: it depends on the baud rate being fixed on the target, as sending <BRK> triggers both Magic SysRq and the optional baud rate switch by *getty. To get this to work on my setup with a highish success rate, I reconfigured the target's serial-getty configurations to only work on one baud rate.
     
    See parts of my ansible task that sets that puppy up:
    The ansible task shell script will try to parse the commandline to get the currently configured baud rates. It assumes the baud rates are sorted from high to low, i.e. 1500000,115200,38400,9600 for my NanoPi R2S and Helios64 boxen. It will replace the list with the leftmost baud rate it parsed (which should be the highest). It will also disable the *getty options that skip setting the baud rate explicitly (i.e. keep-baud and extract-baud).
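    The ansible task itself is not reproduced here, so the following is only a rough sketch of the rewrite it describes, not the actual task: keep the leftmost baud rate and drop --keep-baud / --extract-baud from an agetty command line. The function name pin_getty_baud is made up, and it assumes sed with -E support.

```shell
# Hypothetical helper: read an agetty command line on stdin, drop the
# options that skip setting the baud rate, and keep only the first
# (leftmost, assumed highest) entry of the comma-separated baud list.
pin_getty_baud() {
    sed -E \
        -e 's/ --(keep|extract)-baud//g' \
        -e 's/([0-9]+)(,[0-9]+)+/\1/'
}

# Example input modelled on a serial-getty ExecStart line (illustrative only).
printf '%s\n' 'agetty --keep-baud 1500000,115200,38400,9600 %I $TERM' | pin_getty_baud
```

    The result would still have to be written into a serial-getty override and the service restarted before it has any effect.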
     
    Let me know if this worked out for you!
    Groetjes,
  10. Like
    djurny got a reaction from lanefu in Help with udev rules to statically assign device names for USB-UARTS based on USB port ID   
    Hi,
    I made a small detection script using expect. It will spit out the hosts it found per try and baud rate guesstimate. Afterwards it will query the USB port location using udevadm info and build udev rules on a name basis for the hosts it found on those ports.
    It's not a generic solution, let me know if you are interested, I'll share when back at my workstation.
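    The detection script is not shown in this post, so the following is only a sketch of the last step it describes: turning a USB port location plus a detected host name into a udev rule. It assumes the port is identified by the ID_PATH property reported by udevadm info; the function names and the tty-<host> symlink naming are made up for illustration.

```shell
# Hypothetical helper: look up the ID_PATH property of a tty device.
tty_id_path() {
    udevadm info --query=property --name="$1" | sed -n 's/^ID_PATH=//p'
}

# Hypothetical helper: print one udev rule that adds a /dev/tty-<host>
# symlink for whatever tty shows up on the given USB port.
make_tty_rule() {
    printf 'SUBSYSTEM=="tty", ENV{ID_PATH}=="%s", SYMLINK+="tty-%s"\n' "$1" "$2"
}
```

    Something like make_tty_rule "$(tty_id_path /dev/ttyUSB0)" helios64 could then be appended to a rules file under /etc/udev/rules.d/.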
    Groetjes,
  11. Like
    djurny reacted to gprovost in Feature / Changes requests for future Helios64 board or enclosure revisions   
    RK3399 has a single PCIe 2.1 x4 port; you can't split the lanes across multiple interfaces unless you use a PCIe switch, which is an additional component that would increase the board cost a lot.
     
    That's clearly the intention; we don't want to restart the software and documentation from scratch either :P
     
    Yes, this will be fixed; it was a silly design mistake. We will also give the back panel bracket holder rounder edges so that users don't scratch their hands :/
     
    We will post soon how to manage fan speed based on HDD temperature (using hddtemp tool), that would make more sense than current approach.
    You can already find an old example : https://unix.stackexchange.com/questions/499409/adjust-fan-speed-via-fancontrol-according-to-hard-disk-temperature-hddtemp
     
  12. Like
    djurny reacted to Elektrický in Orange Pi Zero NTP Stratum 1 PPS GPS Server with Armbian OS, Hardware and Software Tutorial   
    Orange Pi Zero NTP Stratum 1 PPS GPS Server with Armbian OS.
    Link to the Tutorial - http://schwartzel.eu3.org/ntp-stratum1.html

    This tutorial uses a 3.3V capable GPS module with PPS output - TOPGNSS GN-701 (u-blox 7) but other similar modules should work.

     
    This tutorial is for the Orange Pi Zero, but will probably work for other boards.

     

    I couldn't easily do a comprehensive hardware and software tutorial on this forum, so I've published it on my web server and linked from here and attached a PDF.
    Link to the Tutorial - http://schwartzel.eu3.org/ntp-stratum1.html

    Tutorial PDF
    ntp-stratum1.pdf

    If you spot any typos or errors please let me know.
     
  13. Like
    djurny reacted to gprovost in Kobol Team is taking a short Break !   
    It’s been 3 months since we posted on our blog. While we have been pretty active on Armbian/Kobol forum for support and still working at improving the software support and stability, we have been developing in parallel the new iteration of Helios64 around the latest Rockchip SoC RK3568.
     
    However things haven’t been progressing as fast as we would have wished. Looking back, 2020 has been a very challenging year to deliver a new product and it took quite a toll on the small team we are. Our energy level is a bit low and we still haven’t really recovered. Now with electronic part prices surge and crazy lead time, it’s even harder to have business visibility in an already challenging market.
     
    In light of the above, we decided to go on a full break for the next 2 months, to recharge our battery away from Kobol and come back with a refocused strategy and pumped up energy.
     
    Until we are back, we hope you will understand that communication on the different channels (blog, wiki, forum, support email) will be kept to a minimum for the next 2 months.
     
    Thanks again all for your support.
  14. Like
    djurny got a reaction from gprovost in Helios64 - freeze whatever the kernel is.   
    Hi @SR-G,
    FYI I have been running Linux kobol0 5.9.13-rockchip64 #trunk.16 SMP PREEMPT Tue Dec 8 21:23:17 CET 2020 aarch64 GNU/Linux for some days now. Still up. Perhaps you can give this one a try?
        Distributor ID: Debian
        Description: Debian GNU/Linux 10 (buster)
        Release: 10
        Codename: buster
    /etc/default/cpufrequtils:
        ENABLE=true
        MIN_SPEED=408000
        MAX_SPEED=1200000
        GOVERNOR=powersave
    (The box idles with 'powersave' and is set to 'performance' when performing tasks.)
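    How the switch between governors is done is not shown in the post. One possible way, sketched here under the assumption that the usual cpufreq sysfs files are present and writable (so run as root); the function names and the CPU_ROOT variable are made up for illustration:

```shell
# Where the per-CPU cpufreq directories live; overridable for testing.
CPU_ROOT=${CPU_ROOT:-/sys/devices/system/cpu}

# Hypothetical helper: write the given governor to every writable
# scaling_governor file under CPU_ROOT.
set_governor() {
    for f in "$CPU_ROOT"/cpu[0-9]*/cpufreq/scaling_governor; do
        if [ -w "$f" ]; then
            printf '%s\n' "$1" > "$f"
        fi
    done
    return 0
}

# Hypothetical helper: run a command under one governor, then switch back.
# usage: run_with_governor <busy-governor> <idle-governor> <command...>
run_with_governor() {
    busy=$1; idle=$2; shift 2
    rc=0
    set_governor "$busy"
    "$@" || rc=$?
    set_governor "$idle"
    return "$rc"
}
```

    For example: run_with_governor performance powersave snapraid scrub. The exit status of the wrapped command is passed through.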
     
    I do not know which non-dev kernel version this is? I assume it's 21.02 or 21.03?

    Groetjes,
     
  15. Like
    djurny got a reaction from tikey in M.2 SSD "Crucial MX500 1TB CT1000MX500SSD4" not detected   
    Hi @tikey,
    Did you make sure the SSD is making proper contact with the backplane connector? I also have some issues with the leftmost disk slot: when I remove/replace a disk in that slot, I have to open up the box and make sure that the connectors are tight and snug. If that does not help, you can try to insert the disk in another slot to rule out any connector or power rail issues. When the box is open, make sure to also press down on the connectors on the main board, to make sure this is not related to loose cables or such.
    Just FYI, I run a CT120BX500SSD1 in my box, although not the same model, but it works :-)
    Groetjes,
  16. Like
    djurny got a reaction from snakekick in Sleep usb sleep   
    Hi @snakekick,
    USB HDD for snapraid parity, sounds like my setup! You can check with vm_block_dump what is waking up your device: 
     
        echo 1 | sudo tee /proc/sys/vm/block_dump
    Above will enable logging of block device accesses into syslog.
     
    If you want to see it happen without flooding your /var/log/syslog:
        sudo service rsyslog stop
        while true ; do dmesg -cT ; done
        sudo service rsyslog start
    See: Documentation for /proc/sys/vm/* and How to conserve battery power using laptop-mode.
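    Once block_dump is logging, the output can be summarised per process to see who keeps touching the drive. A minimal sketch, assuming your kernel still offers /proc/sys/vm/block_dump and logs lines of the form "name(pid): READ block N on sdX (n sectors)"; the function name block_dump_summary is made up for illustration.

```shell
# Hypothetical helper: read dmesg lines on stdin and count READ/WRITE
# accesses per process for devices whose name starts with $1 (e.g. sdb).
block_dump_summary() {
    awk -v dev="$1" '
        {
            for (i = 2; i + 4 <= NF; i++) {
                if (($i == "READ" || $i == "WRITE") && $(i + 1) == "block" \
                        && index($(i + 4), dev) == 1) {
                    who = $(i - 1)          # "name(pid):" precedes READ/WRITE
                    sub(/:$/, "", who)
                    count[who " " $i]++
                }
            }
        }
        END { for (k in count) print count[k], k }' | sort -rn
}
```

    Usage would be something like: sudo dmesg -T | block_dump_summary sdb, which lists the busiest process/operation pairs first.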
     
    You should also check if you have enabled SMART offline auto testing on your HDD; that might also wake up your drive, but this is done by the drive itself:
        sudo smartctl -a /dev/sdX | egrep 'offline'
        sudo smartctl --offlineauto=off /dev/sdX
    Other things that will wake up your drive: temperature monitoring services like hddtemp, you should check if it offers options to not access the drive if it's in standby/sleep mode.  Other things like blkid when used as root, will also check all blockdevices, even if you think it is using cache.
     
    What is the brand of USB dock you are using? Perhaps your dock is doing something to the drive to wake it up regularly.
     
    Hope that helps,
    Groetjes,
  17. Like
    djurny got a reaction from gprovost in Sleep usb sleep   
  18. Like
    djurny reacted to Z06Frank in Helios64 - freeze whatever the kernel is.   
    djurny......thanks for your help but I decided to restart from scratch knowing the issue. Back up and running now. I set CPU Governor to "Performance".
     
    I read up on and made some power management changes in Windows 10 (esp. AHCI Power Management) to get max performance for SSD/HDD file transfers, and turned off any idling/power savings. I also wanted to add this tweak to hdparm.conf to prevent data loss on power loss:  http://ubuntuhandbook.org/index.php/2014/01/disable-disk-caching-prevent-data-loss/
  19. Like
    djurny got a reaction from gprovost in Helios64 - freeze whatever the kernel is.   
    Hi,
    Not sure if I can help here, but willing to help.
    Can you put the SDcard into another Linux system and list the contents of the /boot folder? Also, please share the contents of the /boot/armbianEnv.txt. Looks like the same thing I had once, after a downgrade of kernel, the DTB files were renamed or not present for unknown reasons.
    Groetjes
  20. Like
    djurny reacted to gprovost in Helios64 - freeze whatever the kernel is.   
    @djurny Thanks for the feedback. We will let you know.
     
    We are working with Rockchip to get to the root cause of this instability. We are still suspecting wrong DDR timing settings.
  21. Like
    djurny got a reaction from gprovost in Helios64 USB-C serial console   
    Hi all,
    Something to share for those who use the USB-C serial console from another Linux host. Install and use 'tio' to connect to the serial console instead of minicom. This supports both 1500k baud and also can be easily used inside GNU screen (minicom gets a meta key conflict per default; CTRL-A is default meta key for both GNU screen and minicom). Minicom resulted in regular errors posted in syslog by the ftdi_sio kernel module. Did not run any strace to find out what syscall is causing it, but in short, tio appears to not treat the tty as a modem: no errors are popping up in syslog. Hopefully the serial consoles will remain up now.
    One caveat: I did not find a way to send a BREAK over serial using tio. This is something that is handy in case kernel freezes up, as sometimes you will still have opportunity to do a magic sysrq triggered reboot (BREAK + b = initiate a reboot of the kernel, also see magic sysrq & REISUB).
     
    Groetjes,
     
  22. Like
    djurny got a reaction from aprayoga in Helios64 USB-C serial console   
  23. Like
    djurny got a reaction from gprovost in Helios64 - freeze whatever the kernel is.   
    Hi,
    I've also experienced almost hourly instabilities when running some load on my Helios64 box. Tried several kernels, each with their own Oops/BUG pattern. See below for an overview:
     
     
    It's not exhaustive; in the end I did the following and the box is now running some load (snapraid scrub on ~12TiB of data) without any issue:
    • Enabled daily built kernel, now running Linux kobol0 5.9.11-rockchip64 #trunk.2 SMP PREEMPT Sun Nov 29 00:29:16 CET 2020 aarch64 GNU/Linux.
      Why: Every kernel had its own pattern, either do_undefinstr or XHCI hangup or page fault. Assumed latest greatest has most fixes.
    • Enabled the i2c dtb overlays.
      Why: Some of the kernels showed some IRQ related to i2c in the Oops/BUG. Thought I'd find something in the dtb related to i2c and just enable it to see if that might fix something.
    • Moved rootfs from USB stick to SATA SSD in slot4.
      Why: Some of the kernels had a repeatedly hanging XHCI controller, so I tried to remove some USB devices from the controller, to see if the amount of load on the controller itself might be a vector (, Victor).
    • Also removed tlp and set SATA link power management to max_performance (hat tip @gprovost).
    It's a weak investigation, as I fiddled with multiple things at once, trying to get things going quickly (I do not have as much spare time to spend on this as I would like). Still, perhaps this will trigger someone or give some more angles to fiddle with for others.
    Fingers crossed.
     
    Looking good so far:
        djurny@kobol0:~$ uname -a
        Linux kobol0 5.9.11-rockchip64 #trunk.2 SMP PREEMPT Sun Nov 29 00:29:16 CET 2020 aarch64 GNU/Linux
        djurny@kobol0:~$ uptime
        07:26:58 up 2 days, 10:40, 7 users, load average: 1.73, 1.76, 1.74
        djurny@kobol0:~$
    (The box has been running rdfind, xfs_fsr, snapraid scrub & check for the last 2 days (in that order).)
     
    Groetjes,
  24. Like
    djurny reacted to ShadowDance in SATA issue, drive resets: ataX.00: failed command: READ FPDMA QUEUED   
    This is a continuation of ata1.00: failed command: READ FPDMA QUEUED noted by @djurny. I've experienced the same issue, and have some additional data points to provide.
     
    My observations so far:
    • I'm using WDC WD60EFRX (68MYMN1 and 68L0BN1) drives
    • The drives were working without issue previously behind an ASMedia ASM1062 SATA Controller, I've also used some of them behind ASM1542 (external eSATA enclosure)
    • I can reproduce the issue on a clean install of Armbian 20.08.21 Buster and Focal
    • I can reproduce via simple `dd` to `/dev/null` from the drive so filesystem does not seem to be the underlying cause
    • Every drive is affected (i.e. each SATA slot)
    • At what point dd produces an error varies from SATA slot to SATA slot (not drive dependent), SATA slot 4 can reproducibly produce the error almost immediately after starting a read
    • The problem goes away when setting `extraargs=libata.force=3.0` in `/boot/armbianEnv.txt` [1]
    [1] However, even with SATA limited to 3 Gbps, the problem did reappear when hot-plugging a drive.
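    One way to confirm that the libata.force limit actually took effect is to look at the link speed the kernel reports per port. A minimal sketch, assuming the usual "ataN: SATA link up X Gbps" kernel message; the function name sata_link_speeds is made up for illustration.

```shell
# Hypothetical helper: read dmesg lines on stdin and print
# "<port> <speed>" for every "SATA link up" message.
sata_link_speeds() {
    sed -n -E 's/^.*(ata[0-9]+): SATA link up ([0-9.]+ Gbps).*$/\1 \2/p'
}
```

    Usage would be something like: sudo dmesg | sata_link_speeds. With the limit applied, each port would be expected to report 3.0 Gbps rather than 6.0 Gbps.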
     
    This reset happened on drive slot 3 when I hot-plugged a drive onto slot 5. This seems weird to me considering they are supposed to be on different power rails. This may suggest there is in general a problem with either the PSU or power delivery to the drives. Here's an excerpt from the reset:
     
        [152957.354311] ata3.00: exception Emask 0x10 SAct 0x80000000 SErr 0x9b0000 action 0xe frozen
        [152957.354318] ata3.00: irq_stat 0x00400000, PHY RDY changed
        [152957.354322] ata3: SError: { PHYRdyChg PHYInt 10B8B Dispar LinkSeq }
        [152957.354328] ata3.00: failed command: READ FPDMA QUEUED
        [152957.354335] ata3.00: cmd 60/58:f8:00:f8:e7/01:00:71:02:00/40 tag 31 ncq dma 176128 in
                                 res 40/00:f8:00:f8:e7/00:00:71:02:00/40 Emask 0x10 (ATA bus error)
        [152957.354338] ata3.00: status: { DRDY }
        [152957.354345] ata3: hard resetting link
    And the full dmesg from when the error happened is below:
     
     
  25. Like
    djurny reacted to gprovost in Helios64 Support   
    It might be that one side of the front panel (the side with the red LEDs) slightly touches the metal opening, shorting the LEDs and therefore lighting them up.
    Could you try to loosen the 2x screws holding the front panel a bit, push the PCB back a little, then tighten the screws again?
    Otherwise, putting a piece of tape on the side of the PCB that touches the metal opening, to insulate the LEDs, could help.
     
    For the next batch we will have to increase the gap, because mass production doesn't seem to exactly meet our tolerance requirement  :-/