Sonikku Posted March 9, 2020
Greetings everyone. I have successfully used Armbian Bionic in a production environment. It's really great. I am now evaluating Armbian Buster and I am seeing a strange issue; perhaps you can direct me where to look so I can troubleshoot. With the board (OrangePi+ 2E) connected to our corporate network, it takes about 48 hours before the whole board crashes with the CPU burning up. The second time it happened I had htop running to try and diagnose the problem. Investigation revealed that rngd became unstable due to the nm-applet. This led me to re-run the known conditions with the network disconnected, and now it doesn't crash. Below is a screenshot of the crashed state. Any ideas of where to look would be appreciated. My gut feel tells me this is network related and indeed, by disabling the network interface the problem goes away.
Pol Isidor Posted March 9, 2020
Did you try fixing the CPU frequency, for example to a single speed? I had the same problem, and after doing this it is perfectly stable.
Sonikku (Author) Posted March 9, 2020
No, I haven't done that. For the CPU speed I keep everything as standard as possible. To put it this way, this is stock Armbian with one item installed: RabbitMQ. We thought we had a problem with RabbitMQ and we found that Buster addressed it somehow, so that's unrelated.
Pol Isidor Posted March 9, 2020
Try what I suggested. It's 1 min of work.
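The suggestion to pin the CPU frequency can be verified by reading the kernel's cpufreq settings. The following is a minimal sketch, assuming a Linux board that exposes the standard cpufreq sysfs files (`scaling_governor`, `scaling_min_freq`, `scaling_max_freq`); the function names `read_cpufreq_policy` and `is_pinned` are made up for this illustration. It only reads the current state and does not change it.

```python
from pathlib import Path


def read_cpufreq_policy(cpufreq_dir):
    """Return the governor and min/max frequency (kHz) from a cpufreq sysfs directory."""
    base = Path(cpufreq_dir)
    return {
        "governor": (base / "scaling_governor").read_text().strip(),
        "min_khz": int((base / "scaling_min_freq").read_text()),
        "max_khz": int((base / "scaling_max_freq").read_text()),
    }


def is_pinned(policy):
    """The frequency is effectively fixed when the allowed minimum equals the maximum."""
    return policy["min_khz"] == policy["max_khz"]
```

On a typical board the directory to pass would be `/sys/devices/system/cpu/cpu0/cpufreq`. If `is_pinned` returns `False`, the SoC is still allowed to scale between the two limits.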
Ordon Posted July 3, 2020
I have had a similar experience with NetworkManager. The solution was to set a fixed IP. All problems solved!

Some history
============
My Orange PI PC was running an old Armbian 5.11 (Debian 8.11). It had been running rock solid since the installation date, with a fixed IP. A WD 500GB SATA disk is attached (USB docking station: JMicron, JMS579) to store files (approx. 50 GB/day).

I did a fresh install: "Armbian 20.05.2 Orangepipc Debian buster" with DHCP enabled (IP reservation on the router). Result: random halting of the system (it runs headless in a remote location). Sometimes after an hour, sometimes after a few hours, sometimes a bit longer. Even checksums of the files were sometimes (1 to 2%) calculated wrong! Files sometimes got corrupted (5-10%) when transferred over the wired connection. Recent syslog entries etc. were missing when the system halted. Installing watchdog was not a real solution of course, but it would reboot the system most of the time (not always). Setting a fixed IP solved all the problems.

The Orange PI PC was and is running at max 1008MHz (SoC runs between 480 and 1008MHz using the conservative governor). I also installed a fan to keep things cooler: max 54°C under full load (stress @ 4 CPUs). But this did not solve the problems. Only setting a fixed IP solved all the other issues.
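The wrongly calculated checksums mentioned above can be reproduced in a controlled way by hashing the same file several times and comparing the results. This is a minimal sketch using Python's standard `hashlib`; the helper names `sha256_of` and `checksum_is_stable` and the default of five rounds are assumptions for illustration, not something from the post.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_is_stable(path, rounds=5):
    """Return True when every round produces the same digest for the same file."""
    digests = {sha256_of(path) for _ in range(rounds)}
    return len(digests) == 1
```

Repeated reads are often served from the page cache, so a mismatch here would point more towards CPU or memory trouble than towards the disk or the network. Comparing against a digest computed on another machine would cover the transfer path as well.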
Sonikku (Author) Posted July 25, 2020
Hi, I have some feedback regarding this issue. I am using Armbian Focal (current version) and the issue persisted until I found out where it was coming from. I have been able to repeatedly generate the system crash at will; it takes a few hours but it can be guaranteed to happen.

The cause is a particular Windows 7 host that has not been updated in years. It runs Windows 7 SP1 and Windows Update was last run 3 years ago. Other Windows 7 machines have been updated and they have no effect on this board running Armbian Focal. So I guess there is some network bug in old Windows 7 versions (which Microsoft has since patched, most likely) such that when the host browses the SMB share or even does a host lookup on the Armbian host, it sends something that upsets the network stack. I say the network stack because in this condition, while the OrangePi is technically unresponsive, it surprisingly still responds to ICMP requests.

What I have done is remove the particular Windows 7 host from the network (unplugged the LAN cable) and the problem has disappeared. I am now on 6 days of uptime on Armbian. Therefore I am fully convinced the Windows 7 host has a network stack bug, sends garbage to the OrangePi board and this somehow upsets the TCP stack in Armbian. Another machine running the same Windows 7 copy (but updated until Microsoft stopped rolling out patches in March 2020), a Windows 10 machine, several Ubuntu machines and a Macbook Pro do not affect this OrangePi. Solid as a rock.

I will investigate the issue further if I have time, to pinpoint the exact cause, but honestly, it's easier to get rid of Windows at this point. The machine in question is used for DTP, i.e. Corel Draw and other software that's not available on Mac or Linux.
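The state described above, where the board still answers ICMP but is otherwise unresponsive, can be told apart from a healthy board by probing a userspace service instead of relying on ping. The following is a minimal sketch, assuming the board runs a service that sends a greeting on connect (an SSH server, for example, sends a line starting with `SSH-`); the function name `read_banner` and the timeout value are assumptions for illustration.

```python
import socket


def read_banner(host, port, timeout=3.0):
    """Return the first bytes a service sends after connect, or None on any failure.

    A completed TCP handshake alone only shows that the kernel is alive;
    receiving a banner shows that the userspace daemon is still being scheduled.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            data = conn.recv(64)
            return data or None
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return None
```

Calling this periodically from another machine and logging the timestamp of the first `None` would narrow down when the hang starts, which could then be matched against activity on the suspect host.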
Sonikku (Author) Posted July 29, 2020
I think I know what the problem is. The Windows machine also runs NordVPN, which is used for accessing censored content outside of my country (certain necessary software updates are blocked by our government). Recently, NordVPN changed the underlying network transport driver, which has broken a few things on the machine itself. It also seems that when NordVPN is allowed to run on the machine and the machine has access to the OrangePi, that's when things go horribly wrong. Other devices accessed in a similar way have also shown odd symptoms: an Android media player locked up and froze. So either some combination of my software is no good or there's a bug in Windows 7 that has since been patched.

I am enjoying 9 days of uptime on the OrangePi and have elected to format the Windows 7 machine, reinstall all the software and keep it updated this time. NordVPN stuff will be done on a different machine now.
xwiggen Posted August 9, 2020
On 7/29/2020 at 12:21 PM, Sonikku said: I think I know what the problem is [...]
CVE-2020-10730 in samba possibly culprit.
Sonikku (Author) Posted August 9, 2020
xwiggen said: CVE-2020-10730 in samba possibly culprit.
I am inclined to agree... but whatever I did, the system is still up.
Sonikku (Author, Solution) Posted October 6, 2020
Final feedback and solution

The issue happened again, and this time I was seated at my desk.

Setup:
- Macbook Pro w/ Kensington Desktop Hub to provide DisplayPort and Ethernet
- HP Desktop PC with Windows 7
- Polycom Desktop Phone
- HP LaserJet Pro
- Wifi Access point
All the above are on the same network switch.

Event:
- I disconnected the Kensington device from my Mac (USB-C).
- I noticed something was wrong when my phone reported no network (it makes a chime when this happens).
- My Windows 7 box was doing a download and then said the internet was disconnected, with high CPU usage.... hmmm....
- My phone complained about no internet (Whatsapp went offline). Had to use the regular mobile LTE network.

Root Cause:
When the Kensington unit is disconnected, it goes into a state where it jams the upstream network switch. It appears to flood the LAN with rubbish packets... the switch falls over or locks up. Other devices also see (and receive) these packets. Left to happen long enough, the OrangePi will crash as I have described. My colleagues have been able to reproduce this, and we've decided to throw the Kensington unit out. It is not a bug! The desktop phone also eventually crashes, and it eventually takes out my Ubuntu 18.04 LTS file server. The HP printer locks up completely and I have to power cycle it at the mains socket.

So there we have it.. finally. Rogue device.
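A flood like the one described in the final post could be noticed earlier by watching receive packet rates on an affected host. The following is a minimal sketch, assuming a Linux machine that exposes per-interface counters in `/proc/net/dev`; the function names and any threshold value are hypothetical and would need tuning against normal traffic on the network in question.

```python
def parse_rx_packets(proc_net_dev_text):
    """Map interface name to its received-packet counter from /proc/net/dev text."""
    counters = {}
    # The first two lines of /proc/net/dev are column headers.
    for line in proc_net_dev_text.splitlines()[2:]:
        name, _, rest = line.partition(":")
        fields = rest.split()
        if len(fields) >= 2:
            # Receive columns start with bytes, then packets.
            counters[name.strip()] = int(fields[1])
    return counters


def rx_rates(before, after, seconds):
    """Packets per second for every interface present in both samples."""
    return {
        iface: (after[iface] - before[iface]) / seconds
        for iface in after
        if iface in before
    }


def flooded(rates, threshold_pps):
    """Names of interfaces whose receive rate exceeds the given threshold."""
    return sorted(iface for iface, pps in rates.items() if pps > threshold_pps)
```

In practice one would read `/proc/net/dev` twice a few seconds apart, pass both texts through `parse_rx_packets`, and log whenever `flooded` returns a non-empty list. A packet capture taken at that moment would then show which device on the switch is sending the traffic.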