Dennboy

  • Posts: 38

Everything posted by Dennboy

  1. Hi Igor, Thanks for your support! Bullseye already feels stable enough for my purposes (synchronized voltage/current sensors for green R&D projects). I temporarily fixed my /var/log/journal size issue by clearing /var/log.hdd/journal at every boot and copying /var/log/journal over to it after logrotate has run. I guess this could be automated somehow in the boot scripts. Kind regards, Dennis
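The manual fix described in this post could be scripted roughly as follows. This is only a sketch: the function name `refresh_hdd_journal` is made up, it assumes the intent is to replace the stale copy in /var/log.hdd/journal with the current (already rotated) journal, and it is not part of Armbian's own scripts. The directories are parameters so the function can be tried on scratch directories first.

```shell
#!/bin/sh
# Sketch (hypothetical helper, not an Armbian script): replace the stale
# on-disk journal copy with the current journal directory contents.
# Defaults are the paths from the post; pass other directories for a dry run.
refresh_hdd_journal() {
    rj_src="${1:-/var/log/journal}"
    rj_dst="${2:-/var/log.hdd/journal}"
    [ -d "$rj_src" ] || { echo "missing $rj_src" >&2; return 1; }
    mkdir -p "$rj_dst" || return 1
    # Empty the destination but keep the directory itself (owner, mode).
    find "$rj_dst" -mindepth 1 -delete || return 1
    # Copy the current journal files, preserving attributes.
    cp -a "$rj_src/." "$rj_dst/"
}
```

On a real system it would have to run as root after log rotation, for example `refresh_hdd_journal` with no arguments from a boot-time hook; where exactly to hook it in is left open here.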
  2. Hi again, The problem seems to be /var/log.hdd/journal, which is copied back to /var/log/journal after a reboot without being cleaned. Directly after a reboot:

         % du /var/log/journal /var/log.hdd/journal
         46560 /var/log/journal/50bea5a2341c40588d32c8103dea6e71
         46568 /var/log/journal
         46320 /var/log.hdd/journal/50bea5a2341c40588d32c8103dea6e71
         46324 /var/log.hdd/journal
         % sudo diff -r /var/log/journal /var/log.hdd/journal
         [sudo] password for dennis:
         Only in /var/log/journal/50bea5a2341c40588d32c8103dea6e71: system@0005cc0672bea9a2-be31a0e4284cf010.journal~
         Binary files /var/log/journal/50bea5a2341c40588d32c8103dea6e71/system.journal and /var/log.hdd/journal/50bea5a2341c40588d32c8103dea6e71/system.journal differ

     Kind regards, Dennis
  3. Hi again, I just updated and rebooted bullseye on the nanopi neo3 and now the journals take 50M; before the reboot it was 17.3M. Fortunately, the log rotation seems to be working again (the script now has /bin/bash on its first line), so after 15 minutes /var/log was back to a reasonable size. I could not yet reproduce this on the opi0, which used 20M of journals after reboot.

         Welcome to Armbian 21.08.2 Bullseye with Linux 5.10.63-rockchip64
         System load: 100%  Up time: 0 min
         Memory usage: 12% of 978M  IP: 192.168.0.13
         CPU temp: 60°C  Usage of /: 29% of 15G
         # journalctl --disk-usage
         Archived and active journals take up 50.6M in the file system.
         npin3bullseye:~:% df -h /var/log
         Filesystem Size Used Avail Use% Mounted on
         /dev/zram1 49M 48M 0 100% /var/log

         Welcome to Armbian 21.08.2 Bullseye with Linux 5.10.60-sunxi
         System load: 48%  Up time: 0 min
         Memory usage: 15% of 491M  IP: 192.168.0.30
         CPU temp: 57°C  Usage of /: 21% of 15G
         [ General system configuration (beta): armbian-config ]
         % journalctl --disk-usage
         Archived and active journals take up 20.0M in the file system.
         opi0bullseye:~:% df -h /var/log
         Filesystem Size Used Avail Use% Mounted on
         /dev/zram1 49M 27M 19M 60% /var/log

     Kind regards, Dennis
  4. Dear all, I gave the new opi0 bullseye image a try, since it seems to be supported now. It may be worth taking a fresh look at the limits for systemd journald logging and at log rotation. My /var/log had filled up to 45M some hours after booting and installing packages on the opi0, mainly due to /var/log/journal. I vacuumed it to 10M, but after some reboots, package installations and hours it was full again, although SystemMaxUse=20M is set by default. Additionally, the log rotation (fired every 15 minutes from /etc/cron.d/armbian-truncate-logs) does not seem to work properly, because the script is started with /bin/sh (see script line 1) instead of a shell such as bash that accepts the syntax used in /etc/default/armbian-ramlog: % sudo /usr/lib/armbian/armbian-truncate-logs [sudo] password for dennis: /usr/lib/armbian/armbian-truncate-logs: 18: /etc/default/armbian-ramlog: Syntax error: "(" unexpected Running it with bash seems to work fine (sudo bash /usr/lib/armbian/armbian-truncate-logs, which also vacuums the journal). After deciphering the journald.conf manpage, I found that both the runtime journal in /run/log/journal and the persistent journal in /var/log/journal are in use, and that /var/log/journal is not really persistent, since it is actually mapped to zram memory (journald does not care and treats it as disk). % egrep -i "(system|runtime)MaxUse" /etc/systemd/journald.conf SystemMaxUse=20M #RuntimeMaxUse= % sudo journalctl --vacuum-size=10M Vacuuming done, freed 0B of archived journals from /run/log/journal. Deleted empty archived journal /var/log/journal/078479a1b0a4459f80590dd571c3d6de/system@0005cb79ce3b8f57-f277d6397cc77cac.journal~ (380.0K). Deleted empty archived journal /var/log/journal/078479a1b0a4459f80590dd571c3d6de/user-1000@0005cb7769ae3241-2385d20dac4738ae.journal~ (2.5M). Deleted archived journal /var/log/journal/078479a1b0a4459f80590dd571c3d6de/system@1c1a53e8b25d40f3b676b26240347188-0000000000000001-0005ca73fa192ab2.journal (2.5M). 
Deleted archived journal /var/log/journal/078479a1b0a4459f80590dd571c3d6de/user-1000@7268c35dcad84410bca267900c47dcf3-00000000000003de-0005cb76b53d43a7.journal (2.5M). Deleted archived journal /var/log/journal/078479a1b0a4459f80590dd571c3d6de/system@0005cb77143d39ed-a0190e6a8b8f96ef.journal~ (2.5M). Deleted archived journal /var/log/journal/078479a1b0a4459f80590dd571c3d6de/system@0005cb77ab234dfb-97d2013e1117189a.journal~ (2.5M). Deleted archived journal /var/log/journal/078479a1b0a4459f80590dd571c3d6de/user-1000@0005cb77ab294eee-86b91930cb2eb4a9.journal~ (2.5M). Deleted archived journal /var/log/journal/078479a1b0a4459f80590dd571c3d6de/system@0368477b592e43fdbf9ffcfb6934d264-0000000000000001-0005cb77ab1c0c05.journal (2.5M). Deleted archived journal /var/log/journal/078479a1b0a4459f80590dd571c3d6de/system@0368477b592e43fdbf9ffcfb6934d264-00000000000002be-0005cb77ab27dc7d.journal (2.5M). Deleted archived journal /var/log/journal/078479a1b0a4459f80590dd571c3d6de/user-1000@bd369698bf69432a8bc7ab495a90d103-00000000000002cd-0005cb77abdedbd0.journal (2.5M). Deleted archived journal /var/log/journal/078479a1b0a4459f80590dd571c3d6de/system@0368477b592e43fdbf9ffcfb6934d264-0000000000000960-0005cb7805815938.journal (2.5M). Deleted archived journal /var/log/journal/078479a1b0a4459f80590dd571c3d6de/system@0005cb787689a85a-c5fb08a0411e45bd.journal~ (2.5M). Deleted archived journal /var/log/journal/078479a1b0a4459f80590dd571c3d6de/user-1000@0005cb787781d2a7-d60e1ce6a19e246b.journal~ (2.5M). Vacuuming done, freed 30.4M of archived journals from /var/log/journal/078479a1b0a4459f80590dd571c3d6de. Vacuuming done, freed 0B of archived journals from /var/log/journal. Vacuuming done, freed 0B of archived journals from /run/log/journal/078479a1b0a4459f80590dd571c3d6de. % ls -lsa /run/log/journal/078479a1b0a4459f80590dd571c3d6de total 5056 0 drwxr-s---+ 2 root systemd-journal 200 Sep 8 11:58 . 0 drwxr-sr-x+ 3 root systemd-journal 60 Sep 8 11:01 .. 
632 -rw-r-----+ 1 root systemd-journal 647168 Sep 8 11:51 system@6b593560fcd344f2910e96f6d1a94416-000000000000719c-0005cb7a7b871b5f.journal 632 -rw-r-----+ 1 root systemd-journal 647168 Sep 8 11:52 system@6b593560fcd344f2910e96f6d1a94416-0000000000007493-0005cb7a81dbae26.journal 632 -rw-r-----+ 1 root systemd-journal 647168 Sep 8 11:54 system@6b593560fcd344f2910e96f6d1a94416-0000000000007783-0005cb7a85740dde.journal 632 -rw-r-----+ 1 root systemd-journal 647168 Sep 8 11:55 system@6b593560fcd344f2910e96f6d1a94416-0000000000007a7a-0005cb7a8c94d731.journal 632 -rw-r-----+ 1 root systemd-journal 647168 Sep 8 11:56 system@6b593560fcd344f2910e96f6d1a94416-0000000000007d6f-0005cb7a8d656445.journal 632 -rw-r-----+ 1 root systemd-journal 647168 Sep 8 11:57 system@6b593560fcd344f2910e96f6d1a94416-0000000000008062-0005cb7a93bd9e11.journal 632 -rw-r-----+ 1 root systemd-journal 647168 Sep 8 11:58 system@6b593560fcd344f2910e96f6d1a94416-0000000000008352-0005cb7a9754eac1.journal 632 -rw-r-----+ 1 root systemd-journal 647168 Sep 8 12:32 system.journal % ls -lsa /var/log/journal/078479a1b0a4459f80590dd571c3d6de total 12740 4 drwxr-sr-x 2 root systemd-journal 4096 Sep 8 11:58 . 4 drwxr-sr-x 3 root systemd-journal 4096 Aug 26 10:39 .. 2564 -rw-r----- 1 root systemd-journal 2621440 Sep 8 09:37 system@0005cb78a256a261-d8ca0603791c37d6.journal~ 2564 -rw-r----- 1 root systemd-journal 2621440 Sep 8 09:26 system@1e380884930f4d539d6a0d93d2b03d39-0000000000000001-0005cb7876859703.journal 2476 -rw-r-----+ 1 root systemd-journal 2527232 Sep 8 11:01 system.journal 2564 -rw-r----- 1 root systemd-journal 2621440 Sep 8 09:26 user-1000@933777cd01ac4494a761849c2e955801-00000000000002ee-0005cb787781bf5f.journal 2564 -rw-r----- 1 root systemd-journal 2621440 Sep 8 09:36 user-1000.journal % ls -lsa /var/log.hdd/journal/078479a1b0a4459f80590dd571c3d6de total 43904 4 drwxr-sr-x 2 root systemd-journal 4096 Sep 8 09:56 . 4 drwxr-sr-x 3 root systemd-journal 4096 Aug 26 10:39 .. 
2560 -rw-r----- 1 root systemd-journal 2621440 Sep 8 07:46 system@0005cb77143d39ed-a0190e6a8b8f96ef.journal~ 2560 -rw-r----- 1 root systemd-journal 2621440 Sep 8 08:28 system@0005cb77ab234dfb-97d2013e1117189a.journal~ 2560 -rw-r----- 1 root systemd-journal 2621440 Sep 8 09:25 system@0005cb787689a85a-c5fb08a0411e45bd.journal~ 2560 -rw-r----- 1 root systemd-journal 2621440 Sep 8 09:37 system@0005cb78a256a261-d8ca0603791c37d6.journal~ 2560 -rw-r----- 1 root systemd-journal 2621440 Sep 8 08:28 system@0368477b592e43fdbf9ffcfb6934d264-0000000000000001-0005cb77ab1c0c05.journal 2560 -rw-r----- 1 root systemd-journal 2621440 Sep 8 08:53 system@0368477b592e43fdbf9ffcfb6934d264-00000000000002be-0005cb77ab27dc7d.journal 2560 -rw-r----- 1 root systemd-journal 2621440 Sep 8 09:18 system@0368477b592e43fdbf9ffcfb6934d264-0000000000000960-0005cb7805815938.journal 2560 -rw-r----- 1 root systemd-journal 2621440 Sep 8 07:46 system@1c1a53e8b25d40f3b676b26240347188-0000000000000001-0005ca73fa192ab2.journal 2560 -rw-r----- 1 root systemd-journal 2621440 Sep 8 09:26 system@1e380884930f4d539d6a0d93d2b03d39-0000000000000001-0005cb7876859703.journal 376 -rw-r----- 1 root systemd-journal 385024 Sep 8 09:37 system.journal 2560 -rw-r----- 1 root systemd-journal 2621440 Sep 8 08:10 user-1000@0005cb7769ae3241-2385d20dac4738ae.journal~ 2560 -rw-r----- 1 root systemd-journal 2621440 Sep 8 08:28 user-1000@0005cb77ab294eee-86b91930cb2eb4a9.journal~ 2560 -rw-r----- 1 root systemd-journal 2621440 Sep 8 09:25 user-1000@0005cb787781d2a7-d60e1ce6a19e246b.journal~ 2560 -rw-r----- 1 root systemd-journal 2621440 Sep 8 07:46 user-1000@7268c35dcad84410bca267900c47dcf3-00000000000003de-0005cb76b53d43a7.journal 2560 -rw-r----- 1 root systemd-journal 2621440 Sep 8 09:26 user-1000@933777cd01ac4494a761849c2e955801-00000000000002ee-0005cb787781bf5f.journal 2560 -rw-r----- 1 root systemd-journal 2621440 Sep 8 08:43 user-1000@bd369698bf69432a8bc7ab495a90d103-00000000000002cd-0005cb77abdedbd0.journal 2560 -rw-r----- 1 
root systemd-journal 2621440 Sep 8 09:18 user-1000@bd369698bf69432a8bc7ab495a90d103-0000000000000a2b-0005cb7811371fa6.journal 2560 -rw-r----- 1 root systemd-journal 2621440 Sep 8 09:36 user-1000.journal % sudo journalctl --disk-usage Archived and active journals take up 17.3M in the file system. Kind regards, Dennis
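To keep an eye on whether the journal stays within the configured limit, the `journalctl --disk-usage` line quoted in these posts can be parsed and compared with SystemMaxUse. A minimal sketch follows; the helper names `journal_usage`, `to_kib` and `journal_over_limit` are made up for illustration, and the parsing assumes the exact message wording shown above.

```shell
#!/bin/sh
# Sketch with hypothetical helper names.
# Read `journalctl --disk-usage` output on stdin and print the size token,
# e.g. "50.6M".
journal_usage() {
    sed -n 's/.*take up \([0-9.]*[BKMGT]\) in the file system.*/\1/p'
}

# Convert a size such as 20M, 17.3M or 1.2G to whole KiB.
to_kib() {
    awk -v s="$1" 'BEGIN {
        n = s + 0                      # numeric prefix of the token
        u = substr(s, length(s), 1)    # unit suffix
        if (u == "K") m = 1
        else if (u == "M") m = 1024
        else if (u == "G") m = 1048576
        else m = 1 / 1024              # plain bytes
        printf "%d\n", n * m
    }'
}

# Succeeds when the usage (first argument) exceeds the limit (second).
journal_over_limit() {
    [ "$(to_kib "$1")" -gt "$(to_kib "$2")" ]
}
```

A possible use would be `journal_over_limit "$(journalctl --disk-usage | journal_usage)" 20M && echo "journal above SystemMaxUse"`, with 20M taken from the journald.conf shown in the post.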
  5. Hi tparys, Yes, that is exactly my point. I had NetworkManager stopped and disabled, and during an apt update && apt upgrade, the linked /etc/resolv.conf was turned into a NetworkManager managed textfile, rendering 2 nanopi neo3's unreachable after reboot. I had to use debug-uart to repair the softlink. Kind regards, Dennis
  6. Hi Ianefu, Thanks for the pointer. It appears that rc-manager in `/etc/NetworkManager/NetworkManager.conf` is currently set to `file`; it probably needs to be set to something else to respect the softlink in /etc/resolv.conf. From the networkmanager.conf manpage: "rc-manager: Set the resolv.conf management mode. The default value depends on NetworkManager build options, and this version of NetworkManager was build with a default of "resolvconf". Regardless of this setting, NetworkManager will always write resolv.conf to its runtime state directory /run/NetworkManager/resolv.conf." From the description of the different options, it looks like it can be set to `symlink` or `resolvconf` to improve its behaviour. Even `file` should not replace existing links, unless there is something in the network-manager debian package that does this. Kind regards, Dennis
  7. Dear all, I'm running Armbian 21.02.1 on the nanopi neo3 (diagnostic: http://ix.io/3y5D) using /etc/network/interfaces and have network-manager disabled (`sudo systemctl disable network-manager.service`). I occasionally use network-manager for a usb-wifi stick. When I recently upgraded the packages, it triggered a network-manager update. After a reboot the nanopi was not able to start its networking; it only got a 169.254.6.233 address. After analyzing the situation, it appears that the network-manager update removed the softlink `/etc/resolv.conf -> /run/resolvconf/resolv.conf` and replaced it with a text file. Can network-manager be changed somehow, so that it keeps using the softlink instead of its own interpretation of /etc/resolv.conf? Kind regards, Dennis
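Until the packaging question is settled, a small check run after upgrades (and before rebooting a remote board) could catch the replaced softlink. The sketch below is an assumption-laden illustration: `ensure_resolv_symlink` is a made-up name, and the default link and target are simply the ones quoted in the post.

```shell
#!/bin/sh
# Sketch (hypothetical helper): make sure resolv.conf is the expected symlink.
# Anything else found at that path is moved aside as <link>.bak first.
ensure_resolv_symlink() {
    rs_link="${1:-/etc/resolv.conf}"
    rs_target="${2:-/run/resolvconf/resolv.conf}"
    if [ -L "$rs_link" ] && [ "$(readlink "$rs_link")" = "$rs_target" ]; then
        return 0    # already the expected softlink
    fi
    if [ -e "$rs_link" ] || [ -L "$rs_link" ]; then
        mv -f "$rs_link" "$rs_link.bak" || return 1
    fi
    ln -s "$rs_target" "$rs_link"
}
```

Run as root with no arguments it would restore the link from the post; passing scratch paths allows testing it without touching /etc.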
  8. Hi tparys, Thanks for your reply pointing to the upstream debian repos, and sorry for posting an upstream bug here. I just looked up the dtc package in debian, and they appear to have fixed it in their newest update, 1.4.7-4 from 27 January 2021 (see the DTC debian changelog). It is now available in Armbian too, and seems to be working. Apparently it took some time to travel to the repo/mirrors; it wasn't yet available when I posted here. Kind regards, Dennis
  9. Dear all, My system logs: http://ix.io/2Pgj I consistently get a crash with dtc on the orangepi, nanopi neo+2 and nanopi neo3 when I try to view the current devicetree via the filesystem, e.g.:

         $ dtc -I fs /sys/firmware/devicetree/base/
         <stdout>: Warning (unit_address_vs_reg): /memory: node has a reg or ranges property, but no unit name
         <stdout>: Warning (clocks_property): /soc/spdif@1c21000:clocks: cell 0 is not a phandle reference
         <stdout>: Warning (clocks_property): /soc/spdif@1c21000:clocks: cell 2 is not a phandle reference
         ....
         /dts-v1/;
         / {
         [1] 16929 segmentation fault dtc -I fs /sys/firmware/devicetree/base/

     My current work-around is to install the kernel headers and start dtc from there, but the headers take quite long to install:

         $ sudo apt install linux-headers-current-sunxi
         $ /lib/modules/5.10.12-sunxi/build/scripts/dtc/dtc -I fs /sys/firmware/devicetree/base/
         /dts-v1/;
         / {
             compatible = "xunlong,orangepi-zero\0allwinner,sun8i-h2-plus";
             serial-number = "02c000421661c2d2";
             model = "Xunlong Orange Pi Zero";
             interrupt-parent = <0x01>;
             #address-cells = <0x01>;
             #size-cells = <0x01>;
             ...
         };

     I could also try to install an upstream version of dtc, but it would be great if the supplied dtc didn't crash. Kind regards, Dennis
  10. I tried to override the interrupt affinity using a devicetree overlay, but this also seems to be ignored on sunxi. Does anybody know why? Is the affinity simply not implemented by sunxi? Below is the overlay to change the interrupt affinity. The devicetree before modification shows interrupt-affinity=<&cpu0>,<&cpu1>,<&cpu2>,<&cpu3>;

         /dts-v1/;
         /plugin/;
         / {
             compatible = "allwinner,sun8i-h3","allwinner,sun50i-h5","friendlyarm,nanopi-neo2";
             fragment@1 {
                 target-path = "/pmu";
                 __overlay__ {
                     interrupt-affinity = <&cpu0>;
                 };
             };
         };

     After boot, the restriction to cpu0 is reflected in the runtime devicetree on the opi0:

         $ /lib/modules/5.10.12-sunxi/build/scripts/dtc/dtc -I fs /sys/firmware/devicetree/base|less
         pmu {
             compatible = "arm,cortex-a7-pmu";
             interrupts = <0x00 0x78 0x04 0x00 0x79 0x04 0x00 0x7a 0x04 0x00 0x7b 0x04>;
             interrupt-affinity = <0x30>;
         };
         cpus {
             cpu@0 {
                 compatible = "arm,cortex-a7";
                 phandle = <0x30>;
                 ...
             };
             ...
         };

     Here is the /proc/interrupts table again:

         HomeCT2:~:% grep ads /proc/interrupts
         66: 1043 0 0 0 sunxi_pio_edge 1 Edge ads131a04
         162: 68 255 1040571 1 ads131a04-dev0 Edge ads131a04_consumer0

     So the affinity has been changed, but it seems to be ignored. p.s. Sorry for talking to myself ;-)
  11. Hi, I experimented some more and set /proc/irq/default_smp_affinity to 1 for the first core. It looks good on paper, but in practice most interrupts still go to the second core.

         echo 1 > /proc/irq/default_smp_affinity
         # dynamically load ADC kernel driver
         HomeCT2:~:% cat /proc/irq/162/smp_affinity
         1
         HomeCT2:~:% cat /proc/irq/162/effective_affinity
         0
         HomeCT2:~:% grep ads /proc/interrupts
         66: 1397 0 0 0 sunxi_pio_edge 1 Edge ads131a04
         162: 557 1394461 27 0 ads131a04-dev0 Edge ads131a04_consumer0
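The values written to the smp_affinity files are hexadecimal CPU bitmasks (bit 0 for the first core, so `1` means CPU0 only and `f` means CPU0 to CPU3). As a small aid, a mask can be computed from a list of CPU numbers; `cpu_mask` is a made-up helper name, and whether the kernel honours the mask for a given interrupt controller is a separate question, as this post shows.

```shell
#!/bin/sh
# Sketch (hypothetical helper): print the hexadecimal affinity mask
# for the CPU numbers given as arguments.
cpu_mask() {
    cm_mask=0
    for cm_cpu in "$@"; do
        cm_mask=$((cm_mask | (1 << cm_cpu)))   # set the bit for this CPU
    done
    printf '%x\n' "$cm_mask"
}
```

For example, `cpu_mask 0 | sudo tee /proc/irq/162/smp_affinity` would request CPU0 only for IRQ 162 from the listing above.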
  12. Dear all, I'm frequently losing GPIO interrupts on various boards, which seems to be caused by bursty processes. When I shield the cores that handle those interrupts according to /proc/interrupts, e.g. with "cset shield --cpu=0,1", far fewer interrupts are lost (e.g. for 4kHz interrupts, one in hours instead of one per minute). Most of these (ADC) interrupts are handled by my own kernel module, but I couldn't find how to restrict them to e.g. CPU0, and they are not movable using /proc/irq/*/smp_affinity.

         HomeCT1:~:% grep ads /proc/interrupts
         66: 166559 0 0 0 sunxi_pio_edge 1 Edge ads131a04
         162: 273 152692648 13858531 17 ads131a04-dev0 Edge ads131a04_consumer0
         HomeCT1:~:% cat /proc/irq/162/smp_affinity
         f

     I noticed that on allwinner (opi0, nanopi neo+2), the ADC interrupts are on different CPU cores. On the neo3 they always seem to go to the 1st CPU core. Is there any way to force the interrupts to the 1st core on the allwinner as well (e.g. in kernel module C code or via devicetree)? This would make it easier to shield the first core and reduce the lost interrupts. Or is there another way to give more priority to the GPIO interrupts? Making the kernel thread realtime with chrt is apparently not enough.

         PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
         2407 root rt 0 0 0 0 S 4.6 0.0 116:06.38 [irq/162-ads131a]

     Kind regards, Dennis
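To see at a glance which core actually services an interrupt, a /proc/interrupts row can be reduced to the index of the CPU with the highest count. This is a sketch only; `busiest_cpu` is a made-up name, and it assumes the row layout shown above (IRQ label first, then one counter per CPU).

```shell
#!/bin/sh
# Sketch (hypothetical helper): read one /proc/interrupts row on stdin and
# print the zero-based index of the CPU with the most interrupts.
# $1 is the number of CPU columns in the row.
busiest_cpu() {
    awk -v n="$1" '{
        best = 0
        for (i = 1; i <= n; i++)
            if ($(i + 1) + 0 > $(best + 2) + 0)   # field 1 is the IRQ label
                best = i - 1
        print best
    }'
}
```

With the row from this post, `grep ads131a04-dev0 /proc/interrupts | busiest_cpu 4` would report the second core (index 1), which is the one to include in the shield.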
  13. Hi Igor, Thanks very much for fixing this, building out-of-tree modules now works great in Armbian 21.02.1, on 5.10.12-sunxi (opi0) and 5.10.12-rockchip64 (neo3)! Kind regards, Dennis
  14. Yes Igor, for sure we are thankful for all your help here! I'm glad only toys got broken. Hope you can teach your kid to repair toys some day. I tried with the nanopineo2 (Armbian 20.11.6 Buster): the modules build fine, but the installation of the module reports SSL errors that didn't appear on the orangepizero. The module(s) can be loaded and run after depmod -a.

         INSTALL pps_timer.ko
         At main.c:160:
         - SSL error:02001002:system library:fopen:No such file or directory: ../crypto/bio/bss_file.c:69
         - SSL error:2006D080:BIO routines:BIO_new_file:no such file: ../crypto/bio/bss_file.c:76
         sign-file: certs/signing_key.pem: No such file or directory
         DEPMOD 5.10.4-sunxi64

     Kind regards, Dennis
  15. Hi @hjoe, Your fix seems to work beautifully on the orangepizero, thanks! My kernel module loads and runs without problems. Tomorrow I'll try on nanopi neo2+ and neo3. @Igor Good luck with the family in these strange times. I hope things change for the best in the next months as more people get vaccinated. Kind regards, Dennis
  16. Just retried with Armbian 20.11.7 kernel on nanopineo+2, but have the same issue on that board.
  17. Just retried with a very simple kernel module (added the license in the second attempt to get rid of the kernel taint), with the same result.

         $ sudo modprobe ktest_module
         modprobe: ERROR: could not insert 'ktest_module': Exec format error
         $ dmesg|tail
         [ 24.151391] pps_timer: module PLT section(s) missing
         [ 33.784067] vcc3v0: disabling
         [ 33.784092] vcc5v0: disabling
         [ 33.784099] vcc-wifi: disabling
         [ 1413.944125] pps_timer: module PLT section(s) missing
         [ 7157.483985] ktest_module: module license 'unspecified' taints kernel.
         [ 7157.484004] Disabling lock debugging due to kernel taint
         [ 7157.484019] ktest_module: module PLT section(s) missing
         [ 7357.485416] ktest_module: module PLT section(s) missing
         [ 7465.032086] ktest_module: module PLT section(s) missing

         #include <linux/module.h>  /* Needed by all modules */
         #include <linux/kernel.h>  /* Needed for KERN_INFO */
         #include <linux/init.h>    /* Needed for the macros */
         #include <linux/version.h>

         /* Licensed under the GPL-2 or later. */
         MODULE_LICENSE("GPL v2");

         static int __init ktest_module_init(void)
         {
             //printk(KERN_INFO "Driver version %s\n", VERSION);
             printk(KERN_INFO "Buona Sera :)\n");
             return 0;
         }

         static void __exit ktest_module_exit(void)
         {
             printk(KERN_INFO "Ciao!\n");
         }

         module_init(ktest_module_init);
         module_exit(ktest_module_exit);
  18. Dear all, I can successfully build an out-of-tree module on 5.9 and 5.10.x kernels, but have trouble running it since Armbian 20.11.6. I fixed the missing module.lds problem using the recipe from a bug report, with an extra Makefile target:

         PWD=$(shell pwd)
         VER=$(shell uname -r)
         KERNEL_BUILD=/lib/modules/$(VER)/build
         # Later if you want to package the module binary you can provide an INSTALL_ROOT
         # INSTALL_ROOT=/tmp/install-root
         MY_CFLAGS += -g -DDEBUG

         # for newer 5.10.x kernels with x<=4
         # ("$$" passes a literal "$" to the shell, so sed deletes only the last line)
         $(KERNEL_BUILD)/scripts/module.lds: $(KERNEL_BUILD)/scripts/module.lds.S
         	sudo sh -c "sed '$$ d' $(KERNEL_BUILD)/scripts/module.lds.S > $(KERNEL_BUILD)/scripts/module.lds"

         pps: pps_timer.c
         	$(MAKE) -C $(KERNEL_BUILD) M=$(PWD) modules

         pps_install: pps
         	sudo $(MAKE) -C $(KERNEL_BUILD) M=$(PWD) obj_m=pps_timer.o \
         		INSTALL_MOD_PATH=$(INSTALL_ROOT) modules_install

     When trying the module on the 5.10.x kernel, I get an error that the Exec format is not correct:

         $ sudo modprobe pps_timer
         modprobe: ERROR: could not insert 'pps_timer': Exec format error

     And in dmesg, I see the following line:

         [ 1413.944125] pps_timer: module PLT section(s) missing

     Did something change in the 5.10.x kernel that needs something extra in the Makefile or source for out-of-tree modules? Kind regards, Dennis
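The extra Makefile target in this post boils down to a single shell step: write module.lds as module.lds.S minus its last line. Outside of make it can be sketched as below, which also avoids make's own `$` expansion (in a recipe a literal `$` has to be written as `$$`). The function name `make_module_lds` is made up, and the build directory is a parameter so the step can be tried on a scratch copy first.

```shell
#!/bin/sh
# Sketch (hypothetical helper): regenerate scripts/module.lds from
# scripts/module.lds.S by dropping the last line, as in the post's workaround.
make_module_lds() {
    ml_build="${1:-/lib/modules/$(uname -r)/build}"
    sed '$ d' "$ml_build/scripts/module.lds.S" > "$ml_build/scripts/module.lds"
}
```

On the real headers directory this needs root, for example by running the function from a root shell. Whether the resulting linker script is sufficient on arm64 is exactly the open question of this post.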
  19. Dear maintainers, I have my sensors configured to reboot every night via a user cronjob (0 0 * * * /sbin/reboot); 14 sensors do this without a problem. I fixed the nanopi neo+2 reboot from NAND some months ago (by re-using the friendlyarm first-stage u-boot). I just stumbled upon a failed reboot with one of my nanopi neo+2 nodes, after two successful reboots. Looking at /var/log.hdd/syslog, it got stuck in the shutdown procedure when the watchdog reported a failure. The /var/log.hdd/syslog.1 extracts below show the start of the watchdog, and the stop of the watchdog and its failure. After the failure the system did not come up anymore; it needed a power cycle, which is quite inconvenient since it is installed at a hard-to-access remote location.

         Sep 18 00:03:26 EnexisVT2-1 systemd[1]: Starting watchdog daemon...
         Sep 18 00:03:26 EnexisVT2-1 systemd[1]: Reached target Graphical Interface.
         Sep 18 00:03:26 EnexisVT2-1 systemd[1]: Starting Update UTMP about System Runlevel Changes...
         Sep 18 00:03:26 EnexisVT2-1 watchdog[2212]: starting daemon (5.15):
         Sep 18 00:03:26 EnexisVT2-1 watchdog[2212]: int=1s realtime=yes sync=no load=0,0,0 soft=no
         Sep 18 00:03:26 EnexisVT2-1 watchdog[2212]: memory not checked
         Sep 18 00:03:26 EnexisVT2-1 watchdog[2212]: ping: no machine to check
         Sep 18 00:03:26 EnexisVT2-1 watchdog[2212]: file: no file to check
         Sep 18 00:03:26 EnexisVT2-1 watchdog[2212]: pidfile: no server process to check
         Sep 18 00:03:26 EnexisVT2-1 watchdog[2212]: interface: no interface to check
         Sep 18 00:03:26 EnexisVT2-1 watchdog[2212]: temperature: no sensors to check
         Sep 18 00:03:26 EnexisVT2-1 watchdog[2212]: no test binary files
         Sep 18 00:03:26 EnexisVT2-1 watchdog[2212]: no repair binary files
         Sep 18 00:03:26 EnexisVT2-1 watchdog[2212]: error retry time-out = 60 seconds
         Sep 18 00:03:26 EnexisVT2-1 watchdog[2212]: repair attempts = 1
         Sep 18 00:03:26 EnexisVT2-1 watchdog[2212]: alive=[none] heartbeat=[none] to=root no_act=no force=no
         Sep 18 00:03:26 EnexisVT2-1 systemd[1]: Started watchdog daemon.
         ...
         Sep 19 00:00:01 EnexisVT2-1 CRON[6188]: (dennis) CMD (/sbin/reboot)
         ...
         Sep 19 00:00:02 EnexisVT2-1 systemd[1]: Stopping Authorization Manager...
         ...
         Sep 19 00:00:02 EnexisVT2-1 watchdog[2212]: stopping daemon (5.15)
         Sep 19 00:00:02 EnexisVT2-1 systemd[1]: Stopping watchdog daemon...
         ...
         Sep 19 00:00:02 EnexisVT2-1 systemd[1]: watchdog.service: Control process exited, code=exited, status=1/FAILURE
         Sep 19 00:00:02 EnexisVT2-1 systemd[1]: watchdog.service: Failed with result 'exit-code'.
         Sep 19 00:00:02 EnexisVT2-1 systemd[1]: Stopped watchdog daemon.
         Sep 19 00:00:02 EnexisVT2-1 systemd[1]: watchdog.service: Triggering OnFailure= dependencies.
         Sep 19 00:00:02 EnexisVT2-1 systemd[1]: Requested transaction contradicts existing jobs: Transaction for wd_keepalive.service/start is destructive (armbian-zram-config.service has 'stop' job queued, but 'start' is included in transaction).
         Sep 19 00:00:02 EnexisVT2-1 systemd[1]: watchdog.service: Failed to enqueue OnFailure= job, ignoring: Transaction for wd_keepalive.service/start is destructive (armbian-zram-config.service has 'stop' job queued, but 'start' is included in transaction).
         Sep 19 00:00:02 EnexisVT2-1 systemd[1]: Stopped target Multi-User System.
         Sep 19 00:00:02 EnexisVT2-1 systemd[1]: Stopping rng-tools.service...
         Sep 19 00:00:02 EnexisVT2-1 systemd[1]: Stopping OpenBSD Secure Shell server...
         Sep 19 00:00:02 EnexisVT2-1 systemd[1]: Stopping LSB: Start or stop stunnel 4.x (TLS tunnel for network daemons)...
         Sep 19 00:00:02 EnexisVT2-1 ntpd[1396]: ntpd exiting on signal 15 (Terminated)
         ... cold reboot
         Sep 19 00:00:09 EnexisVT2-1 kernel: [ 0.000000] Booting Linux on physical CPU 0x0000000000 [0x410fd034]
         Sep 19 00:00:09 EnexisVT2-1 fake-hwclock[406]: Sat 19 Sep 2020 12:00:03 AM UTC

     After this the system didn't boot anymore, and we had to manually cold-boot it. So I've stopped and disabled the watchdog for now. I also had to set run_wd_keepalive=0 in /etc/default/watchdog, since the watchdog also failed to stop from the command line (also on other systems):

         Sep 23 11:33:59 EnexisVT2-1 systemd[1]: Starting watchdog daemon...
         Sep 23 11:33:59 EnexisVT2-1 watchdog[3236]: starting daemon (5.15):
         Sep 23 11:33:59 EnexisVT2-1 watchdog[3236]: int=1s realtime=yes sync=no load=0,0,0 soft=no
         Sep 23 11:33:59 EnexisVT2-1 watchdog[3236]: memory not checked
         Sep 23 11:33:59 EnexisVT2-1 watchdog[3236]: ping: no machine to check
         Sep 23 11:33:59 EnexisVT2-1 watchdog[3236]: file: no file to check
         Sep 23 11:33:59 EnexisVT2-1 watchdog[3236]: pidfile: no server process to check
         Sep 23 11:33:59 EnexisVT2-1 watchdog[3236]: interface: no interface to check
         Sep 23 11:33:59 EnexisVT2-1 watchdog[3236]: temperature: no sensors to check
         Sep 23 11:33:59 EnexisVT2-1 watchdog[3236]: no test binary files
         Sep 23 11:33:59 EnexisVT2-1 watchdog[3236]: no repair binary files
         Sep 23 11:33:59 EnexisVT2-1 watchdog[3236]: error retry time-out = 60 seconds
         Sep 23 11:33:59 EnexisVT2-1 watchdog[3236]: repair attempts = 1
         Sep 23 11:33:59 EnexisVT2-1 watchdog[3236]: alive=[none] heartbeat=[none] to=root no_act=no force=no
         Sep 23 11:33:59 EnexisVT2-1 systemd[1]: Started watchdog daemon.
         ...
         Sep 23 11:34:03 EnexisVT2-1 watchdog[3236]: stopping daemon (5.15)
         Sep 23 11:34:03 EnexisVT2-1 systemd[1]: Stopping watchdog daemon...
         Sep 23 11:34:03 EnexisVT2-1 systemd[1]: watchdog.service: Control process exited, code=exited, status=1/FAILURE
         Sep 23 11:34:03 EnexisVT2-1 systemd[1]: watchdog.service: Failed with result 'exit-code'.
         Sep 23 11:34:03 EnexisVT2-1 systemd[1]: Stopped watchdog daemon.
         Sep 23 11:34:03 EnexisVT2-1 systemd[1]: watchdog.service: Triggering OnFailure= dependencies.

     Note that I froze the armbian upgrades on all these sensors at armbian 20.02.7, to avoid having to recompile my kernel modules on every upstream update. I noticed that the systemd package got an update recently; I am unsure whether this update mitigates the problem.

         systemd-sysv/stable 241-7~deb10u4 arm64 [upgradable from: 241-7~deb10u3]
         systemd/stable 241-7~deb10u4 arm64 [upgradable from: 241-7~deb10u3]

         dennis@EnexisVT2-1:~$ dpkg -l "*current*"
         Desired=Unknown/Install/Remove/Purge/Hold
         | Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
         |/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
         ||/ Name                                     Version      Architecture Description
         +++-========================================-============-============-============================================================
         ii  linux-buster-root-current-nanopineoplus2 20.02.1      arm64        Armbian tweaks for buster on nanopineoplus2 (current branch)
         hi  linux-dtb-current-sunxi64                20.02.7      arm64        Linux DTB, version 5.4.28-sunxi64
         hi  linux-headers-current-sunxi64            20.02.7      arm64        Linux kernel headers for 5.4.28-sunxi64 on arm64
         hi  linux-image-current-sunxi64              20.02.7      arm64        Linux kernel, version 5.4.28-sunxi64
         hi  linux-u-boot-nanopineoplus2-current      20.02.1      arm64        Uboot loader 2019.10
  20. Hi Quanta, Depending on the type of serial connector, it may (partly) power the nanopi. I usually only connect GND, RX and TX with my FT232RL USB to TTL Serial Adapter. I've seen cases where parts of the board didn't fully reset when the serial was still connected. You can of course start a separate thread about serial adapter power issues, which may be a more general issue. My reboot problems, however, also seem to occur without the serial connected, i.e. I notice that the board doesn't become reachable again after a reboot is invoked remotely. Kind regards, Dennis
  21. Yes, I use the image from FriendlyArm mentioned on their website (in my case, I booted the image that was on the MMC when it got shipped, which is most likely the same one). To capture their first stage bootloader with dd, I analysed the nand-sata-install script to see what needed to be copied. It may be possible to directly fetch the bootloader part from the sdcard image, with a similar dd with if=sdcard.img when you have a linux PC. Since u-boot is opensource, we can probably ask FriendlyArm for their software/configuration changes to the first-stage u-boot in order to get this integrated into the Armbian tree as a patch.
  22. Hi Quanta, Thanks for your reaction. In my case the (re)boot from sdcard does not fail; I only had failing EMMC reboots. I think the older first stage u-boot from FriendlyArm runs at a slower speed, so there is less chance of failures. Your u-boot on the sdcard may be fixable with a similar procedure, i.e. by capturing the first stage u-boot bytes from the FriendlyArm NEO image and writing them to the sdcard instead of the MMC. I guess it would be possible to do this on a linux PC without first booting from the sdcard. Before changing the sdcard, you can of course make a backup on your hard drive: dd if=/dev/mmcblk0 of=recoveryimage.img bs=8k I think the flash wear on regular sdcards would be similar to EMMC; most mobile phones run from EMMC nowadays and have an sdcard as optional storage. Industrial-grade sdcards may be more reliable, but can still have degraded contacts and problems after vibrations. You can also start using sdcards once the EMMC starts failing; that's my plan with our remote Neo2+'s. Kind regards, Dennis
  23. For your information, with this work-around, U-boot warns about an SPL version mismatch (DRAM: sunxi SPL version mismatch: expected 3, got 2), but works nonetheless. I wonder what is different in the SPL version that makes the boot possible (maybe the CPU frequency or DRAM frequency?), and whether we can transfer this functionality to the newer SPL without breaking the reboot. See the serial trace of a successful reboot below for the NanoPi neo plus hardware revision 2.0 (revision 1.2 with 512MB DRAM works as well):

         [ 311.266768] reboot: Restarting system
         U-Boot SPL 2017.11 (Nov 15 2019 - 05:14:40)
         DRAM: 1024 MiB(504MHz)
         CPU Freq: 408MHz
         memory test: 1 Pattern 55aa Writing...Reading...OK
         Trying to boot from MMC2
         Boot device: emmc
         NOTICE: BL31: v2.1(debug):bb2d778-dirty
         NOTICE: BL31: Built : 18:34:11, Jul 5 2019
         NOTICE: BL31: Detected Allwinner H5 SoC (1718)
         NOTICE: BL31: Found U-Boot DTB at 0x40899d8, model: FriendlyARM NanoPi NEO Plus 2
         INFO: ARM GICv2 driver initialized
         INFO: Configuring SPC Controller
         NOTICE: BL31: PMIC: Defaulting to PortL GPIO according to H5 reference design.
         INFO: BL31: Platform setup done
         INFO: BL31: Initializing runtime services
         INFO: BL31: cortex_a53: CPU workaround for 855873 was applied
         INFO: BL31: Preparing for EL3 exit to normal world
         INFO: Entry point address = 0x4a000000
         INFO: SPSR = 0x3c9
         U-Boot 2019.04-armbian (Jul 06 2019 - 18:02:12 +0200) Allwinner Technology
         CPU: Allwinner H5 (SUN50I)
         Model: FriendlyARM NanoPi NEO Plus 2
         DRAM: sunxi SPL version mismatch: expected 3, got 2
         1 GiB
         MMC: mmc@1c0f000: 0, mmc@1c11000: 1

     An occasional successful boot from NAND without the workaround looks like the trace below. The main difference seems to be the cpu/memory frequency and the memory test.
         [ 401.171972] reboot: Restarting system
         U-Boot SPL 2019.04-armbian (Jul 06 2019 - 18:02:12 +0200)
         DRAM: 1024 MiB
         Trying to boot from MMC2
         U-Boot SPL 2019.04-armbian (Jul 06 2019 - 18:02:12 +0200)
         DRAM: 1024 MiB
         Trying to boot from MMC2
         NOTICE: BL31: v2.1(debug):bb2d778-dirty
         NOTICE: BL31: Built : 18:34:11, Jul 5 2019
         NOTICE: BL31: Detected Allwinner H5 SoC (1718)
         NOTICE: BL31: Found U-Boot DTB at 0x40899d8, model: FriendlyARM NanoPi NEO Plus 2
         INFO: ARM GICv2 driver initialized
         INFO: Configuring SPC Controller
         NOTICE: BL31: PMIC: Defaulting to PortL GPIO according to H5 reference design.
         INFO: BL31: Platform setup done
         INFO: BL31: Initializing runtime services
         INFO: BL31: cortex_a53: CPU workaround for 855873 was applied
         INFO: BL31: Preparing for EL3 exit to normal world
         INFO: Entry point address = 0x4a000000
         INFO: SPSR = 0x3c9
         U-Boot 2019.04-armbian (Jul 06 2019 - 18:02:12 +0200) Allwinner Technology
         CPU: Allwinner H5 (SUN50I)
         Model: FriendlyARM NanoPi NEO Plus 2
         DRAM: 1 GiB
         MMC: mmc@1c0f000: 0, mmc@1c11000: 1
  24. Dear all, Booting from NAND memory used to work fine on the nanopi neo plus2. However, the new boards we just obtained (both the v1.2 and v2.0 revisions) frequently fail to reboot, while a cold boot works fine. I also tried earlier u-boot versions (2019.04, 2018.11) without much success for the reboot from NAND. The reboot process stops at this point (from the serial console):

         [ OK ] Reached target Shutdown.
         [ 152.452083] reboot: Restarting system
         U-Boot SPL 2019.10-armbian (Jan 25 2020 - 19:56:27 +0100)
         DRAM: 1024 MiB
         Trying to boot from MMC2

     I suspected the problem to be in u-boot, since the friendlyarm image on NAND reboots without a hitch. I found a work-around for the latest stretch image (Armbian_20.02.0-rc1_Nanopineoplus2_stretch_current_5.4.14.7z), by doing the following:

       • boot the friendlyarm image from NAND and copy the sunxi-spl.bin:
         sudo dd if=/dev/mmcblk2 of=sunxi-spl-friendlyarm.bin count=4 bs=8k skip=1
       • boot armbian from NAND and update u-boot: I switched to the linux-image-next-sunxi64=5.90 kernel via armbian-config / System / Other / switch, and then ran:
         dd if=sunxi-spl-friendlyarm.bin of=/dev/mmcblk2 count=4 bs=8k seek=1 conv=fsync

     The attached armbianmonitor -u upload is from after a fresh reboot with the working configuration. Kind regards, Dennis
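The two dd commands in this post can be wrapped so the same offsets (4 blocks of 8 KiB, starting 8 KiB into the device) are used for both reading and writing, and so the procedure can be rehearsed on image files before touching /dev/mmcblk2. This is a sketch under those assumptions; `extract_spl` and `write_spl` are made-up names. It uses `conv=notrunc` plus `sync` where the post uses `conv=fsync`, so that an image file used as the target is not truncated.

```shell
#!/bin/sh
# Sketch (hypothetical helpers): copy the 32 KiB SPL region, i.e. 4 blocks
# of 8 KiB starting at offset 8 KiB, as in the dd commands from the post.

# extract_spl <device-or-image> <output-file>
extract_spl() {
    dd if="$1" of="$2" bs=8k skip=1 count=4 2>/dev/null
}

# write_spl <spl-file> <device-or-image>
write_spl() {
    # conv=notrunc leaves the rest of an image file intact.
    dd if="$1" of="$2" bs=8k seek=1 count=4 conv=notrunc 2>/dev/null &&
        sync
}
```

On the board this would correspond to `extract_spl /dev/mmcblk2 sunxi-spl-friendlyarm.bin` under the friendlyarm image and `write_spl sunxi-spl-friendlyarm.bin /dev/mmcblk2` under armbian, both as root.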
  25. A way to disable hardware is to set its status to disabled in the device tree. Take a look at "dtc -I fs -O dts /sys/firmware/devicetree/base|less" to see the runtime devicetree. For sound and video, for instance, the overlay would look like the one below. You can compile it with something like "dtc -I dts -O dtb youroverlay.dts > youroverlay.dtbo" and add it as a user overlay: place the dtbo file in /boot/overlay-user and add a "user_overlays=youroverlay" line to /boot/armbianEnv.txt.

         /dts-v1/;
         / {
             compatible = "allwinner,sun8i-h3","allwinner,sun50i-h5","friendlyarm,nanopi-neo2";
             fragment@0 {
                 target-path = "/soc";
                 __overlay__ {
                     hdmi@1ee0000 { status = "disabled"; };
                     mixer@1100000 { status = "disabled"; };
                     sound { status = "disabled"; };
                 };
             };
             fragment@1 {
                 target-path = [2f 00];
                 __overlay__ {
                     sound { status = "disabled"; };
                 };
             };
         };
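Registering the overlay in /boot/armbianEnv.txt can be done by hand, or with a small helper like the sketch below. Assumptions are labeled: `add_user_overlay` is a made-up name, and the key is taken to be spelled `user_overlays` with names separated by spaces, as I understand Armbian's boot script to expect; check your own armbianEnv.txt and boot script before relying on it.

```shell
#!/bin/sh
# Sketch (hypothetical helper): add an overlay name to the user_overlays
# line of an armbianEnv.txt-style file, creating the line if it is missing
# and leaving the file unchanged if the name is already listed.
# Usage: add_user_overlay <name> [env-file]
add_user_overlay() {
    ao_name="$1"
    ao_env="${2:-/boot/armbianEnv.txt}"
    ao_new="$ao_env.new.$$"
    awk -v n="$ao_name" '
        /^user_overlays=/ {
            found = 1
            val = substr($0, index($0, "=") + 1)
            cnt = split(val, names, " ")
            present = 0
            for (i = 1; i <= cnt; i++) if (names[i] == n) present = 1
            if (!present) $0 = $0 (length(val) ? " " : "") n
        }
        { print }
        END { if (!found) print "user_overlays=" n }
    ' "$ao_env" > "$ao_new" || return 1
    # Write back in place so the original file keeps its owner and mode.
    cat "$ao_new" > "$ao_env" && rm -f "$ao_new"
}
```

For the example in this post the call would be `add_user_overlay youroverlay`, run as root, followed by a reboot.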