djurny

Members
  • Posts

    66
Everything posted by djurny

  1. @jsr, can you also check /proc/crypto? If I remember correctly, the mv/marvell module should show up there. Edit: I also noticed you are LUKSing an mmc device. I am not sure about the speed of the device used in your setup, but my SD cards hardly reach 30MiB/s, even if CESA would work. Your throughput might be limited by an additional bottleneck.
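A rough way to run that check is sketched below. The kernel lists its registered crypto algorithms in /proc/crypto, one `driver` line per entry. The sketch assumes the CESA driver names start with `mv` or `marvell`, which is an assumption and may differ per kernel. The function name `list_cesa_drivers` is made up for this example, and the optional file argument is only there so it can be tried on a saved copy.

```shell
# List crypto drivers whose name suggests the Marvell CESA engine.
# Reads /proc/crypto by default; pass another file to inspect a saved copy.
# Assumption: CESA driver names start with "mv" or "marvell".
list_cesa_drivers() {
    awk -F': *' '$1 ~ /^driver/ { print $2 }' "${1:-/proc/crypto}" \
        | grep -i -E '^(mv|marvell)' \
        | sort -u
}
```

If `list_cesa_drivers` prints nothing, no driver matching that pattern is registered, and dm-crypt is most likely using a software implementation.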
  2. Hi @SR-G, FYI I have been running the following kernel for some days now, and it is still up:
     Linux kobol0 5.9.13-rockchip64 #trunk.16 SMP PREEMPT Tue Dec 8 21:23:17 CET 2020 aarch64 GNU/Linux
     Perhaps you can give this one a try?
     Distributor ID: Debian
     Description: Debian GNU/Linux 10 (buster)
     Release: 10
     Codename: buster
     /etc/default/cpufrequtils:
     ENABLE=true
     MIN_SPEED=408000
     MAX_SPEED=1200000
     GOVERNOR=powersave
     (The box idles with 'powersave' and is set to 'performance' when performing tasks.)
     I do not know which non-dev kernel version this is. I assume it's 21.02 or 21.03?
     Groetjes,
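Switching between 'powersave' and 'performance' around a task could look roughly like the sketch below. It writes to the standard cpufreq sysfs files and needs root on a real system. The function names and the `CPUFREQ_ROOT` override are made up for this example (the override only exists to allow a dry run against a scratch directory).

```shell
# Write the requested governor to every cpufreq policy.
# CPUFREQ_ROOT is a hypothetical override for dry runs.
set_governor() {
    governor=$1
    root=${CPUFREQ_ROOT:-/sys/devices/system/cpu/cpufreq}
    for policy in "$root"/policy*; do
        [ -w "$policy/scaling_governor" ] || continue
        printf '%s\n' "$governor" > "$policy/scaling_governor"
    done
}

# Run a command under 'performance', then fall back to 'powersave'.
# Returns the exit status of the wrapped command.
with_performance() {
    set_governor performance
    "$@"
    rc=$?
    set_governor powersave
    return $rc
}
```

For example, `with_performance snapraid scrub` would run the scrub at full clock and drop back to 'powersave' afterwards, so the frequency is never changed on-the-fly by a dynamic governor.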
  3. Hi, I would also be interested in finding out what it would take to host a local repository mirror. Over the past few weeks I did try to rsync the repo to local storage, but I'm not sure if that will affect anything on the 'other' side. I had to break off the rsync as it appeared to sync extremely slowly. Groetjes,
  4. Hi @tikey, Did you make sure the SSD is making proper contact with the backplane connector? I also have some issues with the leftmost disk slot: when I remove/replace a disk in that slot, I have to open up the box and make sure that the connectors are tight and snug. If that does not help, you can try to insert the disk in another slot to rule out any connector or power rail issues. While the box is open, make sure to also press down on the connectors on the main board, to make sure this is not related to loose cables or such. Just FYI, I run a CT120BX500SSD1 in my box; it is not the same model, but it works :-) Groetjes,
  5. Hi @3735943886, When logging in on any of "my" Armbian devices, it will also (start and) attach to a GNU screen session. I sometimes observe this behavior as well (mostly in aptitude and not armbian-config), but have not been able to pinpoint when this starts or what causes it. Have you tried to detach from GNU screen and retry? If it shows and behaves correctly outside of GNU screen, you can start digging in the environment settings first with 'env'. Perhaps also show the currently active LOCALEs and write down the terminal encoding configured for your ssh client/terminal, as for me the arrow keys sometimes refuse service in aptitude (this seems linked to either a wrong terminal encoding on the connecting client or a wrong terminal type set on the Armbian side). I'll try to reproduce at home and share results; perhaps you can do the same, so we can compare output. Groetjes,
  6. Hi @snakekick, USB HDD for snapraid parity, sounds like my setup! You can check with block_dump what is waking up your device:
     echo 1 | sudo tee /proc/sys/vm/block_dump
     The above enables logging of block device accesses into syslog. If you want to see it happen without flooding your /var/log/syslog (interrupt the loop with CTRL-C when done):
     sudo service rsyslog stop
     while true ; do sudo dmesg -cT ; sleep 1 ; done
     sudo service rsyslog start
     See: Documentation for /proc/sys/vm/* and How to conserve battery power using laptop-mode.
     You should also check if you have enabled SMART offline auto testing on your HDD; that might also wake up your drive, but this is done by the drive itself:
     sudo smartctl -a /dev/sdX | grep -i 'offline'
     sudo smartctl --offlineauto=off /dev/sdX
     Other things that will wake up your drive: temperature monitoring services like hddtemp; you should check if it offers options to not access the drive if it's in standby/sleep mode. Other things like blkid, when used as root, will also check all block devices, even if you think it is using its cache. What is the brand of USB dock you are using? Perhaps your dock is doing something to the drive to wake it up regularly. Hope that helps, Groetjes,
  7. Hi, Not sure if I can help here, but willing to help. Can you put the SD card into another Linux system and list the contents of the /boot folder? Also, please share the contents of /boot/armbianEnv.txt. This looks like the same thing I had once: after a kernel downgrade, the DTB files were renamed or not present for unknown reasons. Groetjes
  8. Hi, Can you connect the USB-C cable, connect to the serial console (1500000 baud and no HW/SW flow control) and post the output after a reboot/powercycle? What's on the serial console should point to what is going wrong. Did you by any chance update the system and continue without rebooting? I had something similar happen on one of the nanopi R2S boxes, where the update was not successful, resulting in an incomplete /boot folder. Groetjes,
  9. Hi, Short update. After looking at the most recent kernel freezes I experienced on 5.9.13-rockchip64 #trunk.16 and reading through a thread meant for Helios4, I decided to stop using the cpufreq governor 'conservative'. I switched to either 'powersave' or 'performance' depending on workload, so that the frequency is not changed on-the-fly. The box has been running smoothly since the last power cycle. As the system is nice and stable now, I will do some more maintenance before upgrading to the latest advised configuration. Groetjes, See last Oops, mentioning something about trying to set some regulator voltage, triggered by the cpufreq-dt module: (Note that the system did not fully freeze; parts of the system continued service). After posting this, I configured cpufreq to use the schedutil governor and after roughly 3 hours of load, it froze up with one of the other patterns observed before: Will change back to powersave and give it some load again. Not sure if there is a correlation here. Groetjes,
  10. Hi, Also here a kernel Oops after some load:
      [82182.500900] Unable to handle kernel paging request at virtual address ffff800011b14000
      [..]
      [82182.505753] Internal error: Oops: 96000007 [#1] PREEMPT SMP
      [..]
      [82182.526948] Call trace:
      [82182.527414] x1 : ffff800011532db0 x0 : 00000000ffffffea
      [82182.527921] gic_handle_irq+0x124/0x158
      [82182.528384] Call trace:
      [82182.528883] el1_irq+0xb8/0x180
      [82182.529350] __handle_domain_irq+0xc4/0x108
      [82182.529593] arch_cpu_idle+0x14/0x20
      [82182.530058] Code: f822683a a94153f3 a9425bf5 a94363f7 (a9446bf9)
      [82182.530425] do_idle+0x210/0x260
      [82182.530640] ---[ end trace c165b2007f1cb8d2 ]---
      [82182.530946] cpu_startup_entry+0x28/0x60
      [82182.531312] Kernel panic - not syncing: Attempted to kill the idle task!
      [82182.531657] rest_init+0xd8/0xe8
      [82182.532193] SMP: stopping secondary CPUs
      [82182.532503] arch_call_rest_init+0x10/0x1c
      [82182.534913] start_kernel+0x80c/0x848
      [82182.535258] ---[ end trace c165b2007f1cb8d3 ]---
      [82182.535692] Kernel Offset: disabled
      [82182.536009] CPU features: 0x0240022,2000200c
      [82182.536388] Memory Limit: none
      [82182.536675] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---
      It appears that the system had been idling for some hours before the page fault occurred. Is there anything I can collect or try to see if this will improve? This oops seems unrelated to system load. Thanks, Groetjes,
  11. Hi all, Something to share for those who use the USB-C serial console from another Linux host. Install and use 'tio' to connect to the serial console instead of minicom. It supports 1500k baud and can also be easily used inside GNU screen (minicom gets a meta key conflict by default; CTRL-A is the default meta key for both GNU screen and minicom). Minicom resulted in regular errors posted in syslog by the ftdi_sio kernel module. I did not run any strace to find out what syscall is causing it, but in short, tio appears to not treat the tty as a modem: no errors are popping up in syslog. Hopefully the serial consoles will remain up now. One caveat: I did not find a way to send a BREAK over serial using tio. This is handy in case the kernel freezes up, as sometimes you will still have the opportunity to do a magic sysrq triggered reboot (BREAK + b = initiate a reboot of the kernel, also see magic sysrq & REISUB). Groetjes,
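As a minimal sketch of the tio invocation: the wrapper name `helios_console` and the default device /dev/ttyUSB0 are assumptions for this example, so check dmesg for the tty your USB-C cable actually shows up as.

```shell
# Open the serial console with tio at 1500000 baud, without flow control.
# /dev/ttyUSB0 is an assumed default; pass the real device as first argument.
helios_console() {
    tio --baudrate 1500000 --flow none "${1:-/dev/ttyUSB0}"
}
```

Called as `helios_console /dev/ttyUSB1`, this attaches to the second USB serial adapter. tio uses CTRL-T as its command prefix by default, so it does not clash with the CTRL-A meta key of GNU screen.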
  12. Hi, A short update: unfortunately the kernel has crashed again, but only after a couple of days. So there is improvement :-) No serial console output, as the usb-serial connection on my Pi stopped responding (will open another thread on this, not really Helios64 related though). I will restart some load and try a different usb-serial setup; hopefully both will not crash (that often) anymore. Groetjes,
  13. Hi, I've also experienced almost hourly instabilities when running some load on my Helios64 box. Tried several kernels, each with their own Oops/BUG pattern. See below for an overview: It's not exhaustive; in the end I did the following and the box is now running some load (snapraid scrub on ~12TiB of data) without any issue:
      • Enabled the daily built kernel, now running Linux kobol0 5.9.11-rockchip64 #trunk.2 SMP PREEMPT Sun Nov 29 00:29:16 CET 2020 aarch64 GNU/Linux. Why: Every kernel had its own pattern, either do_undefinstr or XHCI hangup or page fault. I assumed the latest and greatest has the most fixes.
      • Enabled the i2c dtb overlays. Why: Some of the kernels showed some IRQ related to i2c in the Oops/BUG. I thought I would find something in the dtb related to i2c and just enable it to see if that might fix something.
      • Moved rootfs from USB stick to SATA SSD in slot4. Why: Some of the kernels had a repeatedly hanging XHCI controller, so I tried to remove some USB devices from the controller, to see if the amount of load on the controller itself might be a vector (, Victor).
      • Also removed tlp and set SATA link power management to max_performance (hat tip @gprovost).
      It's a weak investigation, as I fiddled with multiple things at once, trying to get things going quickly (I do not have as much spare time to spend on this as I would like). Still, perhaps this will trigger someone or give some more angles to fiddle with for others. Fingers crossed. Looking good so far:
      djurny@kobol0:~$ uname -a
      Linux kobol0 5.9.11-rockchip64 #trunk.2 SMP PREEMPT Sun Nov 29 00:29:16 CET 2020 aarch64 GNU/Linux
      djurny@kobol0:~$ uptime
      07:26:58 up 2 days, 10:40, 7 users, load average: 1.73, 1.76, 1.74
      djurny@kobol0:~$
      (The box has been running rdfind, xfs_fsr, snapraid scrub & check for the last 2 days (in that order).) Groetjes,
  14. Hi @gprovost, Looks like I had 'tlp' installed, presumably during some messing about with getting things like powertop installed. Thanks for the trigger: Removed tlp and LEDs are back to normal! Groetjes,
  15. Hi, It's only the "HDD x Activity LEDs" that are cycling. The other LEDs are not showing this cycling. The cycling speed appears to increase with CPU frequency, just like how the "System Activity LED" (heartbeat trigger) frequency will increase when CPU frequency increases. I made a video of the effect, but had some trouble uploading it here (>20MiB), so it is at: https://streamable.com/v8wa36. Note that I do not have any trouble with this effect, just wondering if this is by design and how I can customize it :-) Groetjes,
  16. Hi @gprovost, replies have crossed. Yes, all the HDD status LEDs are cycling simultaneously. For the HDD with activity, it's still blinking with activity. I cannot make out if the activity LED itself is also cycling; my eyes are a bit too old for that. Groetjes,
  17. Hi, Only the USB HDD (sdc) is configured to spin down after some minutes; the others were not explicitly configured to enter either standby or sleep mode. Note that all of the HDD status LEDs are cycling, not just a few. hdparm shows the following strange values:
      Device:  Used as:  Interface:  Media:     hdparm -C says:
      sdb      rootfs    USB         USB stick  drive state is: standby
      sdg      swap      SATA        SSD        drive state is: active/idle
      sda      data0     SATA        HDD        drive state is: active/idle
      sdd      data1     SATA        HDD        drive state is: active/idle
      sde      data2     SATA        HDD        drive state is: active/idle
      sdf      data3     SATA        HDD        drive state is: active/idle
      sdc      parity0   USB         HDD        drive state is: active/idle
      I've never heard that a USB media stick can be put in standby mode. Even more strange is that '/' is running off of this USB device. See below for more details. Groetjes,
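A listing like the one above can be collected with a small loop around `hdparm -C`. This is only a sketch: the function name `drive_states` is made up, and it has to run as root because hdparm needs direct access to the devices.

```shell
# Print "<device><TAB><state>" for each device given, based on 'hdparm -C'.
# Prints "unknown" when hdparm reports no state (for example, no permission).
drive_states() {
    for dev in "$@"; do
        state=$(hdparm -C "$dev" 2>/dev/null \
            | awk -F': *' '/drive state is/ { print $2 }')
        printf '%s\t%s\n' "$dev" "${state:-unknown}"
    done
}
```

For example, `drive_states /dev/sd[a-g]` would cover all seven devices from the table in one go.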
  18. Hi, Not sure if anyone has seen the same as I have seen on my box: the HDD status LEDs are cycling in brightness, from 0% to 100% and back to 0% continuously. It's a smooth transition, no flashy circus. I might have triggered this myself, but I have no idea how. Is this a new feature? I tried to make a video of it, but it is quite difficult to see. Groetjes,
  19. Hi @JeffDwork, I have used snapraid for testing, and running md5sums on all the content on the disks. Once to 'sync' or create MD5 hashes, subsequent runs to 'scrub' or check MD5 hashes. This gave me some warm feeling on how fast the system can calculate hashes and how fast the disk I/O is. This would then give me a good indication on scheduling maintenance actions, e.g. if 'scrub' takes 12 hours, need to make sure it does not push out or overlap other scheduled maintenance actions etc. Overall, it will depend on what you care about the most; CPU performance/temperature, disk throughput, filesystem reliability, system stability or perhaps other factors. Groetjes,
  20. Hi, Both are Zyxel switches, one is the 16-port GS1100-16 and the other is the 8-port GS-108Bv3. Both are Gbps capable, as shown by other devices connected to the same switch:
      Helios4:
      Link partner advertised link modes: 10baseT/Half 10baseT/Full 100baseT/Half 100baseT/Full 1000baseT/Full
      Just now, I tried yet another set of cables to connect to the Helios64 box, and it seems I have bought several batches of cat "5e" cables, with stress on "5e". Looks like it is still a cabling issue. I never noticed this, as most are connected to either Raspberry Pi2b or OrangePi zero devices (see the ethtool output from one of the Pis below, connected to the same 16-port Gbps capable switch). I already ordered another batch of [apparently] shielded cat 6 cables; hopefully they are indeed shielded, cat 6 and not cat "6".
      Raspberry Pi2b:
      Link partner advertised link modes: 10baseT/Half 10baseT/Full 100baseT/Half 100baseT/Full
      Please disregard my previous post. Thanks, Groetjes,
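To check quickly whether the link partner offers 1000baseT on a given interface, the ethtool output can be filtered as sketched below. The function name `partner_offers_gigabit` is made up for this example. It only looks at the "Link partner advertised link modes" section, including its continuation lines, and ignores what the local port advertises.

```shell
# Succeeds (exit 0) if the link partner of interface $1 advertises 1000baseT.
# Section headers contain a colon; continuation lines of a section do not.
partner_offers_gigabit() {
    ethtool "$1" | awk '
        /:/ { in_section = ($0 ~ /Link partner advertised link modes/) }
        in_section && /1000baseT/ { found = 1 }
        END { exit !found }
    '
}
```

Used as `partner_offers_gigabit eth0 || echo "no gigabit advertised, check cabling"`. Note that a marginal cable can make autonegotiation fall back to 100Mbps, so a missing 1000baseT entry here does not by itself prove that the switch port is limited.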
  21. djurny

    djurny

  22. Hi, Are there any plans to make a toddler-proof version of the front grille, that will cover the buttons? Currently I just applied some lofi containment by simply flipping the front grille so it covers the front panel. Perhaps some snap-in plexiglass for the panel cutout, with a little doorknob type of thing? Have not checked if the buttons can be disabled in software yet (https://wiki.kobol.io/helios64/button/), perhaps the PMIC can be programmed in user space? Groetjes,
  23. Hi, After fixing the LED issue, I started to check whether snapraid is working. On the Helios4, snapraid ran into issues due to the number of files on the snapraid "array"; 32-bit addressing constraints caused snapraid to bork out regularly. No matter what snapraid configuration tweaking/trial & error I applied, it kept requiring more than 4GB of address space. After running "sync" and "scrub" for the first time on the Helios64, I noticed a more than comfortable number of alleged ata I/O errors like the one below:
      ata1.00: failed command: READ FPDMA QUEUED
      After some searching around on the internet, it appeared that these errors can be prevented by limiting the SATA link speed. Checking other server deployments, this behavior was also seen in an 8-disk mdadm RAID setup, where [new] WD Blue disks also showed these READ FPDMA QUEUED errors, which disappeared after ata error handling started to turn down the SATA link speed to 3Gbps. To test this out, I added the following to /boot/armbianEnv.txt:
      extraargs=libata.force=3.0
      Upon rebooting the box, it appears that libata indeed limited the SATA link speed for all drives to 3Gbps:
      Oct 29 22:01:59 localhost kernel: [ 3.143259] ata1: FORCE: PHY spd limit set to 3.0Gbps
      Oct 29 22:01:59 localhost kernel: [ 3.143728] ata1: SATA max UDMA/133 abar m8192@0xfa010000 port 0xfa010100 irq 238
      Oct 29 22:01:59 localhost kernel: [ 3.143736] ata2: FORCE: PHY spd limit set to 3.0Gbps
      Oct 29 22:01:59 localhost kernel: [ 3.144192] ata2: SATA max UDMA/133 abar m8192@0xfa010000 port 0xfa010180 irq 239
      Oct 29 22:01:59 localhost kernel: [ 3.144199] ata3: FORCE: PHY spd limit set to 3.0Gbps
      Oct 29 22:01:59 localhost kernel: [ 3.144654] ata3: SATA max UDMA/133 abar m8192@0xfa010000 port 0xfa010200 irq 240
      Oct 29 22:01:59 localhost kernel: [ 3.144661] ata4: FORCE: PHY spd limit set to 3.0Gbps
      Oct 29 22:01:59 localhost kernel: [ 3.145115] ata4: SATA max UDMA/133 abar m8192@0xfa010000 port 0xfa010280 irq 241
      Oct 29 22:01:59 localhost kernel: [ 3.145122] ata5: FORCE: PHY spd limit set to 3.0Gbps
      Oct 29 22:01:59 localhost kernel: [ 3.145603] ata5: SATA max UDMA/133 abar m8192@0xfa010000 port 0xfa010300 irq 242
      Redoing the snapraid scrub, the READ FPDMA QUEUED errors had indeed disappeared. As the disks in the box are WD Red HDDs, there is not really a point in having a 6Gbps (~600MB/s) SATA link speed anyway; disk performance is rated at less than 300MB/s throughput. (Occasionally sustained sequential reads peak at around 130MiB/s for large files.) Note that YMMV. Groetjes,
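Besides the boot log, the negotiated speed per link can also be read back from sysfs after the reboot. The sketch below assumes the kernel exposes the `sata_spd` and `sata_spd_limit` attributes under /sys/class/ata_link, which depends on the kernel build. The function name `show_sata_speeds` and the `ATA_LINK_ROOT` override are made up for this example.

```shell
# Print current speed and configured limit for every ATA link.
# ATA_LINK_ROOT is a hypothetical override, only meant for dry runs.
show_sata_speeds() {
    root=${ATA_LINK_ROOT:-/sys/class/ata_link}
    for link in "$root"/link*; do
        [ -r "$link/sata_spd" ] || continue
        printf '%s: %s (limit %s)\n' "${link##*/}" \
            "$(cat "$link/sata_spd")" "$(cat "$link/sata_spd_limit")"
    done
}
```

With the libata.force limit from /boot/armbianEnv.txt in effect, every populated link should report a speed no higher than 3.0 Gbps.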
  24. D'oh. Looks like that is indeed the case. I will plan to add some clearance for the front panel during the next scheduled downtime.