aprayoga Posted November 12, 2020

To continue discussion from Helios64 Support

17 hours ago, giddy said:

> I have a Kobol Helios64. I kept it off for a few hours. Now, when powering on, it stays stuck in "maintenance". No services start up.
>
> Before this, I followed the guide at https://wiki.kobol.io/helios64/install/emmc/ and installed Armbian_20.08.13_Helios64_buster_current_5.8.16.img. Everything worked fine, I have OMV on it, and I rebooted it several times with no issues.
>
> Today I started it, and nothing. I managed to connect to it via serial (USB-C to serial COM port); below is the output:
>
> ```
> SF: Detected w25q128 with page size 256 Bytes, erase size 4 KiB, total 16 MiB
> MMC: mmc@fe320000: 1, sdhci@fe330000: 0
> Loading Environment from MMC... *** Warning - bad CRC, using default environment
> In: serial
> Out: serial
> Err: serial
> Model: Helios64
> Revision: 1.2 - 4GB non ECC
> Net: eth0: ethernet@fe300000
> scanning bus for devices...
> Hit any key to stop autoboot: 0
> Card did not respond to voltage select!
> switch to partitions #0, OK
> mmc0(part 0) is current device
> Scanning mmc 0:1...
> Found U-Boot script /boot/boot.scr
> 3185 bytes read in 18 ms (171.9 KiB/s)
> ## Executing script at 00500000
> Boot script loaded from mmc 0
> 166 bytes read in 15 ms (10.7 KiB/s)
> 16002825 bytes read in 1539 ms (9.9 MiB/s)
> 27331072 bytes read in 2610 ms (10 MiB/s)
> 79946 bytes read in 44 ms (1.7 MiB/s)
> 2698 bytes read in 39 ms (67.4 KiB/s)
> Applying kernel provided DT fixup script (rockchip-fixup.scr)
> ## Executing script at 09000000
> ## Loading init Ramdisk from Legacy Image at 06000000 ...
> Image Name: uInitrd
> Image Type: AArch64 Linux RAMDisk Image (gzip compressed)
> Data Size: 16002761 Bytes = 15.3 MiB
> Load Address: 00000000
> Entry Point: 00000000
> Verifying Checksum ... OK
> ## Flattened Device Tree blob at 01f00000
> Booting using the fdt blob at 0x1f00000
> Loading Ramdisk to f4fa3000, end f5ee5ec9 ... OK
> Loading Device Tree to 00000000f4f27000, end 00000000f4fa2fff ... OK
> Starting kernel ...
> Give root password for maintenance (or press Control-D to continue):
> ```
>
> I am not able to log in with any of the passwords (my own or the defaults). I type `1234` and press return, and it says incorrect login. I assume it is only asking for the root password, not the username.
>
> I would appreciate it if anyone could help me, please.
>
> Regards,

Could you boot from a micro SD card? Please use the Armbian 20.08.10 image; version 20.08.13 is known to have stability problems for many users.

After setting up Armbian on the micro SD card, reboot (to finish the setup) and log in as root. You can then check the eMMC for filesystem errors:

    fsck -p /dev/mmcblk1p1

You can try to change the password of the Armbian installation on the eMMC:

    mkdir -p /mnt/emmc
    mount /dev/mmcblk1p1 /mnt/emmc
    mount -o bind /proc /mnt/emmc/proc
    mount -o bind /sys /mnt/emmc/sys
    mount -o bind /dev /mnt/emmc/dev
    mount -o bind /dev/pts /mnt/emmc/dev/pts
    chroot /mnt/emmc/

Inside the chroot'd system, you can run

    passwd

and set your password. Then exit the chroot and clean up:

    umount /mnt/emmc/{dev/pts,dev,sys,proc}
    umount /mnt/emmc

When finished, power off the system, remove the micro SD card, and power on the system again. See if you can boot into your system on the eMMC.
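For anyone who prefers to run the rescue steps above in one go, here is a rough sketch of them as a script. It is only an illustration, not something tested on a Helios64: the `run` and `rescue_emmc` helpers and the `EMMC_PART`, `EMMC_MNT` and `DRY_RUN` variables are names made up for this example, and the device and mount point are simply the ones from the post. By default it only prints the commands; it executes them only when run as root from the SD card system with `DRY_RUN=0`.

```shell
#!/bin/sh
# Sketch of the eMMC rescue steps from the post as a single script.
# Hypothetical helper names; device and mount point are taken from the post.
# Default is a dry run that only prints each command.

EMMC_PART="${EMMC_PART:-/dev/mmcblk1p1}"
EMMC_MNT="${EMMC_MNT:-/mnt/emmc}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command; execute it only when DRY_RUN=0.
run() {
    printf '+ %s\n' "$*"
    if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

rescue_emmc() {
    # Check the filesystem, then mount the eMMC root partition.
    run fsck -p "$EMMC_PART"
    run mkdir -p "$EMMC_MNT"
    run mount "$EMMC_PART" "$EMMC_MNT" || return 1

    # Bind the pseudo filesystems the chroot needs.
    for fs in proc sys dev dev/pts; do
        run mount -o bind "/$fs" "$EMMC_MNT/$fs"
    done

    # Set a new root password inside the eMMC installation.
    run chroot "$EMMC_MNT" passwd root

    # Clean up: unmount in the reverse order of mounting.
    for fs in dev/pts dev sys proc; do
        run umount "$EMMC_MNT/$fs"
    done
    run umount "$EMMC_MNT"
}

rescue_emmc
```

Read the printed commands first and compare them with your own device names (for example with `lsblk`) before switching to `DRY_RUN=0`.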
giddy Posted November 12, 2020

Thank you @aprayoga, that was helpful.

TL;DR: my problem was that docker was starting BEFORE my mergerfs volumes were mounted. One of my containers mounts a docker volume from that location. Because docker did not see the volume on disk, it "helpfully" created that directory on disk. After that, my mergerfs could not mount because docker had already created files in that location.

Assumptions:

- You have a computer connected via USB-SERIAL and you can see the Helios console.
- Inline code, commands, and important variables are wrapped in backticks; I am used to markdown.
- I assume you know that `vim` and `nano` are text editors and you know how to exit `vim`.

How I found out and investigated:

1. I followed the above advice to boot from the SD card (balena etcher + latest stable Armbian buster).

2. I ran fsck on the eMMC: no issues.

3. I mounted the eMMC, chrooted in, and changed the root password. Now I can log in past the `Give root password for maintenance (or press Control-D to continue):` prompt.

4. I cannot start services, the system is dead, `journalctl` does not have any useful info, and `/var/log/messages` looks boring.

5. I found some instructions to edit `/boot/armbianEnv.txt` (while still chrooted into the eMMC install). I bumped the verbosity up to `9` (make sure you only edit verbosity and do not change other things in your armbianEnv.txt).

   `vim /boot/armbianEnv.txt`

        verbosity=9
        bootlogo=false
        overlay_prefix=rockchip
        rootdev=UUID=b31229b9-40ab-441c-95be-66666
        rootfstype=ext4
        console=serial
        usbstoragequirks=0x2537:0x1066:u,0x2537:0x1068:u

6. I reboot, and I notice this in the boot logs (I can see the boot logs because I am still connected via the USB-C serial cable):

        [FAILED] Failed to mount /srv/f95ca…b-439d-450e-b700-4444.
        See 'systemctl status "srv-f95ca73b\\x2…0\\x2d4444.mount"' for details.

   After this the system "hangs" with the message we saw before:

        Starting kernel ...
        Give root password for maintenance (or press Control-D to continue):

7. Because I changed the password, I am able to get into recovery mode on the eMMC install. I do a `cat /etc/fstab` and notice that `f95ca…b-439d-450e-b700-4444` is my mergerfs volume. I do an `ls -alsht /srv/f95ca…b-439d-450e-b700-4444` and I see some directories there. Both cannot be true: either the volume is mounted and shows my files, or it is NOT mounted and the directory should be empty. These directories match up with the docker volume mounts I specified for one of my containers.

8. I run `systemctl stop docker` and `systemctl disable docker` so docker does not make a mess again (for now). I do a `du -hs /srv/f95ca…b-439d-450e-b700-4444` to make sure it contains only empty directories created by docker, not my actual data. The output shows only empty directories (I would expect gigabytes if my data were there). Only AFTER I verified there is no data to lose, I do a `rm -rf /srv/f95ca…b-439d-450e-b700-4444`.

9. Now I need to make mergerfs mount the volume BEFORE docker starts. I run `systemctl list-units --type=mount`, which shows me ALL THE MOUNTS. For simplicity I am only including the drives we care about:

        srv-dev\x2ddisk\x2dby\x2dlabel\x2dsda.mount loaded active mounted /srv/dev-disk-by-label-sda
        srv-dev\x2ddisk\x2dby\x2dlabel\x2dsdb.mount loaded active mounted /srv/dev-disk-by-label-sdb
        srv-dev\x2ddisk\x2dby\x2dlabel\x2dsdc.mount loaded active mounted /srv/dev-disk-by-label-sdc
        srv-dev\x2ddisk\x2dby\x2dlabel\x2dsdd.mount loaded active mounted /srv/dev-disk-by-label-sdd
        srv-f95ca73b\x2d439d\x2d450e\x2db700\x2d4444.mount loaded active mounted /srv/f95ca73b-439d-450e-b700-4444

   These are the disks and volumes matching up with the failed mount in the boot logs.

10. Now I edit the systemd docker override with `systemctl edit docker` and add this block:

        [Unit]
        After=srv-dev\x2ddisk\x2dby\x2dlabel\x2dsda.mount srv-dev\x2ddisk\x2dby\x2dlabel\x2dsdb.mount srv-dev\x2ddisk\x2dby\x2dlabel\x2dsdc.mount srv-dev\x2ddisk\x2dby\x2dlabel\x2dsdd.mount srv-f95ca73b\x2d439d\x2d450e\x2db700\x2d4444.mount

    I save and exit nano. To check that systemctl sees my changes, I run `systemctl cat docker`. The output shows the override (look at the last three lines; one of them includes my override to wait until the mounts are done):

        # /lib/systemd/system/docker.service
        [Unit]
        Description=Docker Application Container Engine
        Documentation=https://docs.docker.com
        BindsTo=containerd.service
        After=network-online.target firewalld.service containerd.service
        Wants=network-online.target
        Requires=docker.socket

        [Service]
        Type=notify
        # the default is not to use systemd for cgroups because the delegate issues still
        # exists and systemd currently does not support the cgroup feature set required
        # for containers run by docker
        ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
        ExecReload=/bin/kill -s HUP $MAINPID
        TimeoutSec=0
        RestartSec=2
        Restart=always

        # Note that StartLimit* options were moved from "Service" to "Unit" in systemd 229.
        # Both the old, and new location are accepted by systemd 229 and up, so using the old location
        # to make them work for either version of systemd.
        StartLimitBurst=3

        # Note that StartLimitInterval was renamed to StartLimitIntervalSec in systemd 230.
        # Both the old, and new name are accepted by systemd 230 and up, so using the old name to make
        # this option work for either version of systemd.
        StartLimitInterval=60s

        # Having non-zero Limit*s causes performance problems due to accounting overhead
        # in the kernel. We recommend using cgroups to do container-local accounting.
        LimitNOFILE=infinity
        LimitNPROC=infinity
        LimitCORE=infinity

        # Comment TasksMax if your systemd version does not support it.
        # Only systemd 226 and above support this option.
        TasksMax=infinity

        # set delegate yes so that systemd does not reset the cgroups of docker containers
        Delegate=yes

        # kill only the docker process, not all processes in the cgroup
        KillMode=process

        [Install]
        WantedBy=multi-user.target

        # /etc/systemd/system/docker.service.d/mount-disks-before-docker.conf
        [Unit]
        After=srv-dev\x2ddisk\x2dby\x2dlabel\x2dsda.mount srv-dev\x2ddisk\x2dby\x2dlabel\x2dsdb.mount srv-dev\x2ddisk\x2dby\x2dlabel\x2dsdc.mount srv-dev\x2ddisk\x2dby\x2dlabel\x2dsdd.mount srv-f95ca73b\x2d439d\x2d450e\x2db700\x2d4444.mount

11. I start docker and enable the service: `systemctl start docker`, `systemctl enable docker`. I reboot, and it is all working. My filesystems now mount BEFORE docker starts, so docker no longer creates volume directories on a path that is not mounted yet.
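The escaped unit names in step 10 can be derived from the mount paths instead of being copied from `systemctl list-units`. Below is a small sketch of that, assuming `systemd-escape` is available (it ships with systemd). The function names `path_to_mount_unit` and `docker_mount_dropin` are made up for this example, and the `sed` fallback is only an approximation that handles `/` and `-`, which is enough for the paths in this thread but not for arbitrary paths.

```shell
#!/bin/sh
# Sketch: print a docker.service drop-in that orders Docker after a set of
# mount points. Function names are hypothetical; the paths at the bottom are
# examples taken from the post.

# Print the systemd mount unit name for a mount path.
# Uses systemd-escape when present; the sed fallback is an approximation
# that only escapes '-' and converts '/' (not a full reimplementation).
path_to_mount_unit() {
    if command -v systemd-escape >/dev/null 2>&1; then
        systemd-escape --path --suffix=mount "$1"
    else
        printf '%s\n' "$1" |
            sed -e 's#^/*##' -e 's#/*$##' \
                -e 's/-/\\x2d/g' -e 's#/#-#g' -e 's/$/.mount/'
    fi
}

# Print a drop-in body with one After= line per mount path.
# systemd accumulates repeated After= lines, so this is equivalent to
# listing all units on a single line.
docker_mount_dropin() {
    printf '[Unit]\n'
    for mount_path in "$@"; do
        printf 'After=%s\n' "$(path_to_mount_unit "$mount_path")"
    done
}

docker_mount_dropin \
    /srv/dev-disk-by-label-sda \
    /srv/f95ca73b-439d-450e-b700-4444
```

The output can be pasted into the editor opened by `systemctl edit docker`. Note that `After=` only affects ordering. systemd also has a `RequiresMountsFor=` directive that takes plain paths and adds both the dependency and the ordering; whether that is preferable here (a failed mount would then keep Docker from starting at all) is a judgment call and was not tested in this thread.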
BinaryWaves Posted September 21, 2023

Necro revive, but @giddy you saved my butt a ton on this one! Thanks! <3