DoubleHP Posted November 17, 2018

It took me over a year to isolate this bug. At random, I have found /boot/armbianEnv.txt with inconsistent content. Examples:

    .tty1tty1LOGIN.8 ¹[fU7ttyS0tyS0LOGINusbstoragequirks=0x2537:0x1066:u,0x2537:0x1068:u
    usbstoragequirks=0x2537:0x1066:u,0x2537:0x1068:u

    network={
    ssid="Demaine_40_Grd_Rue_Chambre-BP"
    #psk="1234567890"
    psk=693d195a3bf2ab6defbcc0b54604ced4286a067996138db6dc793d8648f67d79
    }
    usbstoragequirks=0x2537:0x1066:u,0x2537:0x1068:u
    usbstoragequirks=0x2537:0x1066:u,0x2537:0x1068:u

    network={
    ssid="fabste"
    #psk="1234567890"
    psk=95342031a4b28398045e1761f9c64f8454902e80d37102e5476fa79abd59e144
    }
    2
    }
    usbstoragequirks=0x2537:0x1066:u,0x2537:0x1068:u
    usbstoragequirks=0x2537:0x1066:u,0x2537:0x1068:u

I eventually realised that "fabste" is the SSID of my neighbour's network. How did it get here?

The bug occurs only on Orange Pi Zero boards that need to perform a wifi scan. It does not happen on boards that operate as a wifi AP or use ethernet. The wifi scan is required to connect to the AP. The scan seems to produce a temp file somewhere (maybe in /tmp, not sure). Then, later, /etc/init.d/armhwinfo runs this, which looks very suspicious to me:

    read USBQUIRKS <${TMPFILE}
    sed -i '/^usbstoragequirks/d' /boot/armbianEnv.txt
    echo "usbstoragequirks=${USBQUIRKS}" >>/boot/armbianEnv.txt

and if this is followed by a power failure, you end up with a corrupted file:

https://lists.denx.de/pipermail/u-boot/2018-June/332375.html
https://github.com/ConnectBox/connectbox-pi/issues/220

Judging by Google results it is a rare issue, but it is known, and it makes most of my opi0 unusable. Sooner or later, it affects 100% of my opi0 that work as wifi clients. It renders my pis unusable because I need very specific features which are set in /boot/armbianEnv.txt, usually w1-gpio. When the file is corrupted, after the next reboot I lose control over my 1W devices, and the whole project breaks.
Here is my fstab, in case it matters:

    UUID=ebe9dacf-124f-486c-b6c1-08749e209374 / ext4 defaults,noatime,nodiratime,commit=600,errors=remount-ro 0 1
    tmpfs /tmp tmpfs defaults,nosuid 0 0

I am not sure how to fix this at the distribution level:
- use something other than sed in armhwinfo to alter /boot/armbianEnv.txt
- perform a sync
- change ext4 features
- change mount options

For now, I am trying to write a personal fix: record a backup copy of /boot/armbianEnv.txt and restore it if critical words like "overlay_prefix=sun8i-h3" disappear.

Note that even when the file is heavily broken, as in all 3 examples above, the pi can boot, establish a network connection, and mostly remains reachable. I only lose the 1W or GPIO feature. There seem to be very good default settings set as a fallback somewhere else.

Here are unaffected opis:

    verbosity=1
    logo=disabled
    console=both
    disp_mode=1920x1080p60
    overlay_prefix=sun8i-h3
    overlays=cir usbhost2 usbhost3 w1-gpio
    param_w1_pin=PA15
    rootdev=UUID=ebe9dacf-124f-486c-b6c1-08749e209374
    rootfstype=ext4
    usbstoragequirks=0x2537:0x1066:u,0x2537:0x1068:u
    usbstoragequirks=0x2537:0x1066:u,0x2537:0x1068:u

    verbosity=1
    logo=disabled
    console=both
    disp_mode=1920x1080p60
    overlay_prefix=sun8i-h3
    overlays=usbhost2 usbhost3 w1-gpio
    param_w1_pin=PA15
    rootdev=UUID=ebe9dacf-124f-486c-b6c1-08749e209374
    rootfstype=ext4
    usbstoragequirks=0x2537:0x1066:u,0x2537:0x1068:u
    usbstoragequirks=0x2537:0x1066:u,0x2537:0x1068:u

I am not sure why one has an empty line. But in both cases, the usbstoragequirks line is doubled, and the two copies look identical to my eyes. This means /etc/init.d/armhwinfo tries to add data that already exists. This script lacks checks!
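The first two ideas in the list above (something other than sed -i, plus a sync) can be sketched as a write-then-rename update. This is only an illustration, not what armhwinfo does: the function name set_env_key is made up here, the key is assumed to contain no regex metacharacters, and the temporary file is created next to the target so that mv is a rename on the same filesystem. With commit=600 in the fstab above, ext4 may hold changes in memory for up to ten minutes, which is why the sketch calls sync explicitly. It should narrow the window for a half-written file, but it is not a guarantee against bad media or power loss.

```shell
#!/bin/bash
# Hypothetical helper (not part of Armbian): set KEY=VALUE in an env file
# by building a complete new copy and renaming it over the original.
set_env_key() {
    local file="$1" key="$2" value="$3" tmp
    # Create the temporary file in the same directory as the target,
    # so the final mv is a rename and not a cross-filesystem copy.
    tmp="$(mktemp "${file}.XXXXXX")" || return 1
    # Copy once with -p so the new file inherits mode and ownership.
    cp -p "$file" "$tmp" || { rm -f "$tmp"; return 1; }
    # Keep every line except existing "key=" lines.
    # grep exits 1 when no line is left to print, which is acceptable here.
    grep -v -a "^${key}=" "$file" >"$tmp"
    [ $? -le 1 ] || { rm -f "$tmp"; return 1; }
    # Append exactly one fresh line for the key.
    printf '%s=%s\n' "$key" "$value" >>"$tmp"
    # Flush the new content to storage before it replaces the old file.
    sync
    mv -f "$tmp" "$file" && sync
}
```

A call such as `set_env_key /boot/armbianEnv.txt usbstoragequirks "$USBQUIRKS"` would then stand in for the sed and echo pair. Calling it twice leaves a single usbstoragequirks line, because existing lines for the key are dropped before the new one is appended.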
DoubleHP (Author) Posted November 17, 2018

I was wrong ... I just found a corruption on my wifi master AP opi, an opi that never does any scanning, since it is always in master mode:

    # cat /boot/armbianEnv.txt.backup.2018-11-17_22-14-09
    verbosity=1
    logo=disabled
    console=both
    disp_mode=1920x1080p60
    overlay_prefix=sun8i-h3
    overlays=usbhost2 usbhost3 w1-gpio
    param_w1_pin=PA15
    rootdev=UUID=ebe9dacf-124f-486c-b6c1-08749e209374
    rootfstype=ext4
    usbstoragequirks=0x2537:0x1066:u,0x2537:0x1068:u
    usbstoragequirks=0x2537:0x1066:u,0x2537:0x1068:u

    # cat /boot/armbianEnv.txt | hexdump
    0000000 6576 6272 736f 7469 3d79 0a31 6f6c 6f67
    0000010 643d 7369 6261 656c 0a64 6f63 736e 6c6f
    0000020 3d65 6f62 6874 640a 7369 5f70 6f6d 6564
    0000030 313d 3239 7830 3031 3038 3670 0a30 766f
    0000040 7265 616c 5f79 7270 6665 7869 733d 6e75
    0000050 6938 682d 0a33 766f 7265 616c 7379 753d
    0000060 6273 6f68 7473 2032 7375 6862 736f 3374
    0000070 7720 2d31 7067 6f69 700a 7261 6d61 775f
    0000080 5f31 6970 3d6e 4150 3531 720a 6f6f 6474
    0000090 7665 553d 4955 3d44 6265 3965 6164 6663
    00000a0 312d 3432 2d66 3834 6336 622d 6336 2d31
    00000b0 3830 3437 6539 3032 3339 3437 720a 6f6f
    00000c0 6674 7473 7079 3d65 7865 3474 000a 0000
    00000d0 0000 0000 0000 0000 0000 0000 0000 0000
    *
    0000120 0000 0000 0000 0000 0000 0000 0000 7500
    0000130 6273 7473 726f 6761 7165 6975 6b72 3d73
    0000140 7830 3532 3733 303a 3178 3630 3a36 2c75
    0000150 7830 3532 3733 303a 3178 3630 3a38 0a75
    0000160 7375 7362 6f74 6172 6567 7571 7269 736b
    0000170 303d 3278 3335 3a37 7830 3031 3636 753a
    0000180 302c 3278 3335 3a37 7830 3031 3836 753a
    0000190 000a
    0000191

Tricky one: the run of zero bytes between rootfstype=ext4 and the first usbstoragequirks line was not visible in cat, only in vim.
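Since this kind of damage does not show up in cat, a byte-level test can help when checking many boards. The sketch below is one possible way to do it and is not taken from any Armbian script: has_nul_bytes is a made-up name, and the idea is simply to strip NUL bytes with tr and compare the result with the original file.

```shell
#!/bin/bash
# Hypothetical helper: exit 0 if the file contains at least one NUL byte,
# exit 1 if it contains none, exit 2 if the file cannot be read.
has_nul_bytes() {
    local file="$1"
    [ -r "$file" ] || return 2
    # tr removes every NUL byte. cmp -s exits 0 only when the stripped
    # stream and the file are identical, so the result is negated.
    ! tr -d '\000' <"$file" | cmp -s - "$file"
}
```

For example, `has_nul_bytes /boot/armbianEnv.txt && echo "NUL bytes found"` would flag a file like the one in the hexdump above.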
Magnets Posted March 18, 2019

I have just noticed the same issue when trying to use armbian-config on pi pc2, as have others here: https://github.com/armbian/config/issues/33

My armbianEnv.txt looks like this:

    /var/log.hdd/aptitude {
    rotate 6
    monthly
    compress
    missingok
    notifempty
    }
    ^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
    usbstoragequirks=0x2537:0x1066:u,0x2537:0x1068:u

How can I regenerate armbianEnv.txt without doing a reinstall?

edit: pulled it from the .img
pzw Posted March 22, 2019

It could be filesystem corruption. Are the power supplies and SD cards all the same? Have you tried one node with a different SD card and power supply?
DoubleHP (Author) Posted March 25, 2019

On 3/18/2019 at 5:11 PM, Magnets said:
How can I regenerate armbianEnv.txt without doing a reinstall? edit: pulled it from the .img

Back up the whole system after install. For this specific file, I keep local backups with dates, and diff them after each reboot. When the file looks corrupted, the script restores the last backup.
DoubleHP (Author) Posted March 25, 2019

On 3/22/2019 at 2:31 AM, pzw said:
Filesystem corruption it can be... Power supply / SD cards are all the same? Have you tried one node with another SD card / power supply?

It could be a bad SD card, or broken ext4, but the fact is that I had this issue on 100% of my opis, and only on this specific file. So yes, there may be an issue with ext4, but there is also an issue with the way this file is modified. This file is savagely altered by boot scripts or initscripts, without any check, to the point that they introduce the same line twice. The issue may even be in how temp files are handled.
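The doubled line mentioned above is easy to test for. As a small sketch (duplicate_keys is a made-up name, and the only assumption is that the file consists of key=value lines), awk can report each key that occurs more than once:

```shell
#!/bin/bash
# Hypothetical helper: print every key that appears more than once
# in a key=value file, one key per line.
duplicate_keys() {
    # -F= splits each line at '='. Lines without '=' have NF <= 1 and
    # are skipped. A key is printed when it is seen for the second time,
    # so each duplicated key is reported exactly once.
    awk -F= 'NF > 1 && seen[$1]++ == 1 { print $1 }' "$1"
}
```

Running `duplicate_keys /boot/armbianEnv.txt` on the files shown earlier in the thread would be expected to print usbstoragequirks.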
DoubleHP (Author) Posted March 25, 2019

Here are core parts of my service:

Makefile:

    TARGET = /etc/cron.d/armbian_env_txt_checker

    all: install

    install: $(TARGET)

    $(TARGET): Makefile
    	bash -c "echo -e \"# file generated by $$(pwd)/$<\" >$@"
    	bash -c "echo -e \"@reboot\troot\t/srv/doublehp/share/armbianEnv_txt/armbianEnv_txt_checker.sh\" >>$@"
    	touch $@

/srv/doublehp/share/armbianEnv_txt/armbianEnv_txt_checker.sh

    #!/bin/bash
    [ ! -f /usr/sbin/armbian-config ] && exit 0
    sleep 90
    ref=/srv/doublehp/share/armbianEnv_txt/armbianEnv.txt.ref
    file=/boot/armbianEnv.txt
    loc="${file}.local"
    backup="${file}.backup.$(/bin/date +%Y-%m-%d_%H-%M-%S)"
    broken="${file}.broken.$(/bin/date +%Y-%m-%d_%H-%M-%S)"
    previous="$("ls" "${file}.backup."* | sort | tail -n1)"
    # is current file valid ?
    #cat "$file" | grep -e "overlay_prefix=sun8i-h3" >/dev/null && {
    cat "$file" | grep -e "rootdev=UUID=" >/dev/null && {
        #file is good.
        # create local ref if none
        [ ! -e "$loc" ] && {
            cp -a "$file" "$loc"
        }
        # create backup if none
        [ "$previous" = "" ] && {
            cp -a "$file" "$backup"
        }
        diff "$file" "$previous" >/dev/null || cp -a "$file" "$backup"
        true
    } || {
        # File is bad
        [ -e "$previous" ] && {
            cp -a "$file" "$broken"
            cp -a "$previous" "$file"
            reboot
            exit 0
        }
        [ -e "$loc" ] && {
            cp -a "$file" "$broken"
            cp -a "$loc" "$file"
            reboot
            exit 0
        }
        # Fallback on ref ...
        cp -a "$file" "$broken"
        cp -a "$ref" "$file"
        reboot
    }
pzw Posted March 25, 2019

29 minutes ago, DoubleHP said:
Backup the whole system after install. For this specific file, I keep local backups with dates, and diff them after each reboot. When it looks corrupted, the script restaures the last BU.

What program did you use to write the image to the SD card? I hope Etcher or another program which actually verifies the data?

If there were an issue with the OS or the scripts provided by Armbian, you would surely not be the only one experiencing it. Better to look at what you use: common parts, how you do things, such as writing the image to the SD card, etc.

I have a number of Opi0 and other Opis, and now a few NanoPi Duo2 boards, which all run without any issue. For the SD cards I only use Samsung EVO cards (at the moment I just buy Samsung EVO Select cards from Amazon), and I use only the power supply from Orange Pi, or a supply which can deliver at least 2A to the board (tested on an oscilloscope to ensure it is stable, without funny spikes etc.). I have not experienced a problem with this setup so far.
DoubleHP (Author) Posted April 15, 2019

I used dd. And I am not the only one: many other people have issues with armbianEnv.txt, and with other files, due to sync issues. Since fixing the issue for this file with my scripts, I have not encountered any other issue on the oPi0.
zdroyer Posted June 17, 2019

I can confirm the bug. Using 20+ OrangePi Zero Plus boards with the USB ports of the extension board, I noticed the same strange content in /boot/armbianEnv.txt as Magnets described. The file system seems to be fine, except for this file. Since my systems use the USB ports from the extension board, my device doesn't work once the port is not present.

My workaround is based on DoubleHP's. I added a simple service that checks the content of the armbianEnv.txt file and reacts according to the result.

Here are the three files. It works well on my systems, but please feel free to criticise it and point out any possible errors I made.

/root/bootenv_check/bootenv_check.service

    [Unit]
    Description=Checking the validity of /boot/armbianEnv.txt file.
    After=sysinit.target

    [Service]
    Type=oneshot
    User=root
    ExecStart=/bin/bash /root/bootenv_check/bootenv_check.sh
    RemainAfterExit=yes

    [Install]
    WantedBy=basic.target

/root/bootenv_check/bootenv_check.sh

    #!/bin/bash
    file=/boot/armbianEnv.txt
    overlays="overlays=usbhost0 usbhost1 usbhost2 usbhost3"
    content="verbosity=1
    console=serial
    overlay_prefix=sun50i-h5
    "$overlays"
    rootdev=UUID=d59ef23a-f2e0-418c-85e9-e21611283218
    rootfstype=ext4
    usbstoragequirks=0x2537:0x1066:u,0x2537:0x1068:u"
    cat "$file" | grep -x -F "$overlays" >/dev/null && {
        #file is good.
        echo "$file seems to be valid (includes '$overlays')"
    } || {
        # File is bad
        echo -e "$file content is invalid: \n----"
        cat "$file"
        echo -e "----"
        # Fallback on the predefined content ...
        echo -e "$content" > "$file"
        echo "$file overwritten with default content"
        echo "REBOOT NOW"
        reboot
    }

/root/bootenv_check/install_service.sh

    #!/bin/bash
    #echo "The script you are running has basename `basename "$0"`, dirname `dirname "$0"`"
    echo "Installing bootenv_check.service in the system..."
    file="bootenv_check.service"
    target=`dirname "$(readlink -f "$0")"`"/"$file
    echo "Service file: $target"
    ln -s -T -v "$target" /etc/systemd/system/"$file"
    systemctl enable $file
    echo "... done"
Ugo Riboni Posted January 9, 2020

Just for the record, I just had this issue happen to me on a NanoPi M1 Plus board, with kernel 4.5.6. The /boot/armbianEnv.txt had entries from syslog pasted into it and was missing most of the default content that enabled the right device tree overlays.

Is there any way to prevent this from happening, rather than detecting the issue and restoring the file?
djjerdog Posted January 13, 2020

I've been seeing the same issue happen randomly on the NEO2 with kernel 4.19. When I analyzed the corrupt armbianEnv.txt, it was full of what appeared to be output from an Apache log. The workaround that @zdroyer provided seems to be working, but it would be nice to get to the bottom of this.
Igor Posted January 13, 2020

19 minutes ago, djjerdog said:
but would be nice to get to the bottom of this.

Perhaps try to disable this: https://github.com/armbian/build/blob/master/packages/bsp/common/usr/lib/armbian/armbian-hardware-optimization#L281-L318 and if that helps, RFC to force fsck before changing the file.