I’m running Armbian 5.38 stable on a Banana Pro, and the board regularly becomes unresponsive: I can't log in via SSH (I get a "port 22: Host is down" error) and applications stop running.
I’ve used the Banana Pro with an external 4TB hard drive via eSATA as a media/backup server, with a similar software setup, for several years (applying regular updates). Occasionally it would become unresponsive, but I could generally run it for weeks without a restart. Restarts became necessary more and more often, and now it won't run for more than 12 hours before SSH logins fail with “port 22: Host is down” and programs like Plex and Syncthing stop running.
Here's what I've done so far:
I searched through various logs and wasn’t able to find anything that stood out to me. That may mean nothing, though – I have some experience, but I'm self-taught and haven't had to track down this kind of issue before.
Next, I tried to rule out a corrupted SD card or an inadequate power supply. I tested the power supply while the board was running under load and it seemed OK. Below are the results of the monitoring from armbian-config. CPU usage climbed to 100% while the DC-IN voltage stayed between 4.92V and 4.97V, so I figured the power supply was fine.
Time CPU load %cpu %sys %usr %nice %io %irq CPU PMIC DC-IN C.St.
12:18:00: 960MHz 0.65 27% 7% 4% 13% 0% 0% 41.6°C 50.8°C 4.94V 0/6
12:18:06: 528MHz 0.60 46% 10% 1% 32% 0% 0% 41.5°C 50.4°C 4.96V 0/6
12:18:11: 528MHz 0.55 12% 8% 0% 2% 0% 0% 41.0°C 50.3°C 4.97V 0/6
12:18:17: 960MHz 0.50 45% 10% 0% 33% 0% 0% 41.8°C 50.6°C 4.94V 0/6
12:18:22: 528MHz 0.54 27% 9% 0% 18% 0% 0% 41.0°C 50.4°C 4.94V 0/6
12:18:28: 960MHz 0.90 88% 17% 17% 27% 23% 1% 42.0°C 50.7°C 4.94V 0/6
12:18:34: 960MHz 1.59 99% 12% 63% 15% 6% 0% 43.4°C 51.8°C 4.92V 0/6
12:18:40: 960MHz 1.54 89% 16% 48% 16% 6% 0% 43.9°C 51.9°C 4.93V 0/6
12:18:45: 960MHz 1.74 96% 8% 70% 17% 0% 0% 44.4°C 52.2°C 4.93V 0/6
12:18:51: 960MHz 1.92 96% 7% 67% 21% 0% 0% 44.6°C 52.3°C 4.93V 0/6
12:18:57: 960MHz 2.01 95% 6% 67% 21% 0% 0% 45.1°C 52.7°C 4.93V 0/6
12:19:03: 960MHz 2.01 94% 8% 65% 20% 0% 0% 45.1°C 52.9°C 4.93V 0/6
12:19:08: 960MHz 2.33 93% 7% 67% 18% 0% 0% 45.2°C 53.7°C 4.94V 0/6
12:19:14: 960MHz 2.59 96% 9% 71% 14% 0% 0% 45.7°C 53.9°C 4.93V 0/6
Time CPU load %cpu %sys %usr %nice %io %irq CPU PMIC DC-IN C.St.
12:19:20: 960MHz 2.78 100% 7% 75% 16% 0% 0% 45.7°C 53.7°C 4.93V 0/6
12:19:26: 960MHz 3.04 100% 7% 76% 16% 0% 0% 45.9°C 54.1°C 4.93V 0/6
12:19:33: 960MHz 3.27 100% 7% 78% 13% 0% 0% 46.1°C 54.4°C 4.94V 0/6
12:19:39: 960MHz 3.91 100% 8% 69% 22% 0% 0% 46.1°C 54.5°C 4.93V 0/6
12:19:45: 960MHz 4.32 100% 8% 62% 28% 0% 0% 46.2°C 54.6°C 4.93V 0/6
12:19:50: 960MHz 4.69 100% 7% 60% 30% 0% 0% 46.6°C 54.5°C 4.93V 0/6
12:19:56: 960MHz 5.04 100% 6% 63% 29% 0% 0% 46.4°C 54.8°C 4.93V 0/6
12:20:02: 960MHz 5.11 100% 7% 58% 32% 0% 0% 46.6°C 55.1°C 4.93V 0/6
12:20:08: 960MHz 5.35 100% 6% 59% 33% 0% 0% 46.8°C 55.2°C 4.93V 0/6
12:20:14: 960MHz 6.06 100% 6% 63% 29% 0% 0% 46.8°C 55.2°C 4.93V 0/6
12:20:21: 960MHz 5.98 99% 15% 52% 31% 0% 0% 46.5°C 55.2°C 4.94V 0/6
12:20:26: 960MHz 5.66 99% 11% 0% 87% 0% 0% 46.8°C 55.3°C 4.93V 0/6
12:20:32: 960MHz 5.44 99% 14% 0% 83% 0% 0% 47.0°C 55.5°C 4.93V 0/6
12:20:38: 960MHz 5.33 99% 13% 0% 85% 0% 0% 47.1°C 55.6°C 4.93V 0/6
12:20:43: 960MHz 5.30 99% 13% 0% 84% 0% 0% 46.8°C 55.7°C 4.93V 0/6
Time CPU load %cpu %sys %usr %nice %io %irq CPU PMIC DC-IN C.St.
12:20:49: 960MHz 5.02 98% 10% 0% 86% 0% 0% 47.0°C 55.7°C 4.94V 0/6
12:20:55: 960MHz 5.25 85% 14% 0% 70% 0% 0% 46.7°C 55.6°C 4.93V 0/6
12:21:00: 960MHz 4.83 44% 9% 11% 22% 1% 0% 44.8°C 54.4°C 4.96V 0/6
12:21:06: 864MHz 4.45 12% 8% 0% 3% 0% 0% 43.6°C 54.0°C 4.97V 0/6
12:21:12: 960MHz 4.17 12% 7% 0% 3% 0% 0% 43.0°C 53.7°C 4.97V 0/6
12:21:18: 720MHz 3.84 16% 8% 0% 6% 0% 0% 42.8°C 52.9°C 4.97V 0/6
12:21:23: 864MHz 3.77 17% 8% 0% 7% 0% 0% 42.5°C 52.8°C 4.97V 0/6
12:21:29: 960MHz 3.41 16% 9% 2% 3% 1% 0% 42.6°C 52.9°C 4.94V 0/6
12:21:35: 528MHz 3.14 36% 11% 0% 24% 0% 0% 42.2°C 52.5°C 4.97V 0/6
12:21:41: 720MHz 2.97 12% 8% 0% 3% 0% 0% 41.8°C 52.2°C 4.96V 0/6
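As a cross-check on the readings above, the table can be summarized with a short script instead of by eye. This is only a sketch: it assumes the column layout shown above (time, clock, load, percentages, two temperatures, DC-IN), and `summarize_monitor` is a hypothetical helper name, not something provided by armbian-config.

```python
import re

# One data row, e.g.:
# 12:18:00: 960MHz 0.65 27% 7% 4% 13% 0% 0% 41.6°C 50.8°C 4.94V 0/6
# Assumption: columns appear in exactly this order; header rows will not match.
ROW = re.compile(
    r"^(\d{2}:\d{2}:\d{2}):\s+(\d+)MHz\s+([\d.]+)\s+(\d+)%.*?"
    r"([\d.]+)°C\s+([\d.]+)°C\s+([\d.]+)V"
)


def summarize_monitor(text):
    """Return min/max DC-IN voltage, peak load and peak CPU temperature."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match is None:
            continue  # skip the repeated "Time CPU load ..." headers
        rows.append(
            {
                "time": match.group(1),
                "load": float(match.group(3)),
                "cpu_temp": float(match.group(5)),
                "dc_in": float(match.group(7)),
            }
        )
    if not rows:
        return None
    return {
        "samples": len(rows),
        "min_dc_in": min(r["dc_in"] for r in rows),
        "max_dc_in": max(r["dc_in"] for r in rows),
        "max_load": max(r["load"] for r in rows),
        "max_cpu_temp": max(r["cpu_temp"] for r in rows),
    }


if __name__ == "__main__":
    sample = """Time CPU load %cpu %sys %usr %nice %io %irq CPU PMIC DC-IN C.St.
12:18:00: 960MHz 0.65 27% 7% 4% 13% 0% 0% 41.6°C 50.8°C 4.94V 0/6
12:18:34: 960MHz 1.59 99% 12% 63% 15% 6% 0% 43.4°C 51.8°C 4.92V 0/6
12:21:06: 864MHz 4.45 12% 8% 0% 3% 0% 0% 43.6°C 54.0°C 4.97V 0/6"""
    print(summarize_monitor(sample))
```

Note that the readings are taken roughly every five to six seconds, so a summary like this cannot show voltage dips shorter than the sampling interval.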
To rule out the SD card, I created a partition on the external hard drive I use and successfully moved the OS to it. The system has run off the hard drive since. I hoped this would solve the issue, but I’m still seeing the board become unresponsive periodically.
I ran diagnostics in armbian-config. This is the most recent:
http://ix.io/DKd
And these are from previous days:
http://ix.io/15rL http://ix.io/15rW
I'm wondering what could be causing the failures.
Are there logs I can post that would be helpful, and if so, which ones? Is there anything in particular I should search for?
I moved the OS to the hard drive, but I know the SD card is still used. Could it still be causing these issues? Should I replace it entirely?
Thanks in advance for your help.