Posted

So, I've been struggling to figure out the reason behind this problem.

 

I have 4 Le Potatoes, all connected to a 100 Mbps switch. Each of them has a 32 GB Samsung EVO SD card.

 

Initially, the problem was that my SSH session would hang at random, usually while I was downloading something. I can't SSH in again unless I reboot the Pi (I didn't bother waiting longer than 5 minutes).

 

At first I thought the SD cards were bad, so I moved them around and did reinstalls. I'm also using SD Formatter to format them and Etcher to flash them.

Still, the freezing did not follow any particular board or SD card.

 

For power, I'm using an Anker 40W 4-Port USB Wall Charger, which should (I think) provide enough power. I've checked all the cables using my phone to verify.

 

I plugged HDMI into them to see if they were still running after those SSH crashes (why didn't I do this earlier?). I SSH'd into them and watched what happened when the "crash" occurred.

 

After SSH hangs, I check the Pi, and the computer is still running as usual.

 

I rebooted the Pi and started some downloads, and now I can see the same network hang that I get over SSH. To me it seems like the Ethernet driver is crashing or something.

I'm not sure what else to try for debugging or what the problem is. Send help.

 

Version: ARMBIAN 5.34.171017 nightly Ubuntu 16.04.3 LTS 4.13.7-meson64
Using Le Potato 2GB Version.

Posted

To help you: look in the documentation section to see what these things can report.

 

/sent from mobile phone /

Posted
29 minutes ago, Tido said:

To help you: look in the documentation section to see what these things can report.

 

I think you forgot to link something.

Posted

I'm not sure this carries over from the S905 to the S905X, as the S905X can only officially go to 1.5 GHz in the first place. If he has time, @Neil Armstrong may be able to answer better; I don't know all of the differences between the C2 and K2 (GXBB) and the Le Potato (GXL).

 

I haven't seen this issue on my board, but of course I haven't done too many long-term tests.

 

@NinjaKitty, are you using the latest kernel image?  File download was the simplest way for me to trip the memory fault associated with 

 

The board didn't typically die immediately after the hang, it took some other activity first.  

 

If you can give me the output of armbianmonitor -u maybe the fault left some evidence. 

 

My Le Potato ran for 24 hours without incident last night. I will repeat the test but give it some activity to see if it fails.

Posted

@TonyMac32

 

I'm on version 4.13, which I downloaded yesterday. From your post, I'm assuming there's a 4.15 version?

 

I'll try to reproduce it (it happened again as I was downloading the python2.7 docker container) and see.

 

Posted

No, 4.15 doesn't exist yet, I pulled a patch that is scheduled to be included in a future kernel.  I apologize for any confusion. 

 

I've moved multi-GB files with this image, that's the only reason I'm curious. The armbianmonitor link will provide basic info about the machine and the dmesg output if you provide it.

Posted

http://sprunge.us/ORAj

 

Also, I've noticed with this particular version, static IPs aren't working on boot and I have to do ifdown/ifup to make it work.

 

It used to work fine with the previous versions. I'm not sure what's wrong with that either.

Posted

Sorry, I should've checked before posting.
ARMBIAN 5.33.171011 nightly Ubuntu 16.04.3 LTS 4.13.5-meson64

 

Currently, I'm on 4.13.7-meson64

Posted

### Group membership of dan : dan dialout sudo audio video plugdev systemd-journal netdev bluetooth docker

 

Was docker installed at the same time the issues appeared?

 

[ 21.620963] docker_gwbridge: port 1(vethe7075d6) entered forwarding state
[ 21.621367] docker_gwbridge: port 1(vethe7075d6) entered disabled state
[ 21.797479] eth1: renamed from vethc6ce16b

 

I have not tried Docker on this image, @Igor    could this be related?

Posted
15 minutes ago, TonyMac32 said:

I have not tried Docker on this image, @Igor    could this be related?


Neither did I. It could be an issue related to Docker (or its dependencies). Unfortunately, I am not deeply familiar with the possible problems.

Posted

I've run into this problem a few times without Docker, for things like sudo apt-get update, but most of the time it involved Docker in some way: running the get.docker script, installing Docker, or downloading a container (e.g. python2.7-slim). I can test later today and see if other workloads cause this to happen.

Posted

Did you compile your own kernel? Just asking, as I had similar issues when compiling a custom kernel with any CPU frequency governor other than performance. This is how it should be:

$ gunzip -c /proc/config.gz | grep CPU_FREQ_GOV
CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
# CONFIG_CPU_FREQ_GOV_POWERSAVE is not set
# CONFIG_CPU_FREQ_GOV_USERSPACE is not set
# CONFIG_CPU_FREQ_GOV_ONDEMAND is not set
# CONFIG_CPU_FREQ_GOV_CONSERVATIVE is not set
# CONFIG_CPU_FREQ_GOV_SCHEDUTIL is not set
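
To compare two boards quickly, the grep output above can be condensed to one line per governor. This is only a minimal sketch, assuming the kernel exposes /proc/config.gz as in the command above; the function name list_governors is made up for illustration, and it drops the ATTR_SET and COMMON helper symbols because they are not governors themselves.

```shell
#!/bin/sh
# list_governors (illustrative name): read a kernel config on stdin and
# print one line per cpufreq governor with its state:
#   y = built in, m = module, n = not set.
list_governors() {
    sed -n \
        -e 's/^CONFIG_CPU_FREQ_GOV_\([A-Z_]*\)=\([ym]\)$/\1 \2/p' \
        -e 's/^# CONFIG_CPU_FREQ_GOV_\([A-Z_]*\) is not set$/\1 n/p' |
        # ATTR_SET and COMMON are shared helper code, not selectable governors.
        grep -v -e '^ATTR_SET ' -e '^COMMON '
}
```

It would be used as gunzip -c /proc/config.gz | list_governors. Any governor other than performance showing y or m means the kernel is able to switch away from performance at runtime.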

 

Posted

@V10lator Nope, just downloaded from armbian's site.

 

Mine looks like this

 

CONFIG_CPU_FREQ_GOV_ATTR_SET=y
CONFIG_CPU_FREQ_GOV_COMMON=y
CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
CONFIG_CPU_FREQ_GOV_POWERSAVE=m
CONFIG_CPU_FREQ_GOV_USERSPACE=m
CONFIG_CPU_FREQ_GOV_ONDEMAND=y
CONFIG_CPU_FREQ_GOV_CONSERVATIVE=m
CONFIG_CPU_FREQ_GOV_SCHEDUTIL=y

 

Posted

I'm going to uninstall docker-ce, keep using the boards without it, and see if I still get the crash.

 

For my other question of not being able to access the internet on boot, am I doing something wrong here?

 

/etc/network/interfaces

# Wired adapter #1
allow-hotplug eth0
no-auto-down eth0
iface eth0 inet static
    address 192.168.5.200
    netmask 255.255.255.0
    gateway 192.168.5.6
    dns-nameservers 8.8.8.8 192.168.5.6

 

For more context,

 

I have it set up with a static IP, but on every boot I have to run sudo ifdown eth0; sudo ifup eth0 to get internet access. I can SSH into it just fine, and I can ping 8.8.8.8, but I can't resolve any hostnames (for example, nslookup google.com fails).
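
Being able to ping 8.8.8.8 while hostname lookups fail points at name resolution rather than at the link itself, so one thing worth comparing before and after ifdown/ifup is the list of nameservers in /etc/resolv.conf. The following is a minimal sketch for that comparison. The function name is illustrative, and the idea that resolv.conf is missing its entries right after boot is an assumption, not something confirmed here.

```shell
#!/bin/sh
# list_nameservers (illustrative name): print the resolver addresses found
# in a resolv.conf-style file, one per line.
# The file defaults to /etc/resolv.conf; another path can be passed instead.
list_nameservers() {
    resolv=${1:-/etc/resolv.conf}
    awk '$1 == "nameserver" { print $2 }' "$resolv"
}
```

Running list_nameservers once right after boot and once after the ifdown/ifup cycle shows whether the dns-nameservers line from /etc/network/interfaces only takes effect on the second attempt.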

 

Edit: I uninstalled docker-ce and so far, it hasn't crashed. I've done some stress tests, iperfs, and it seems to be working fine... Will keep checking. Maybe something wrong with docker with amlogic?

Posted
4 hours ago, NinjaKitty said:

Mine looks like this


CONFIG_CPU_FREQ_GOV_ATTR_SET=y
CONFIG_CPU_FREQ_GOV_COMMON=y
CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
CONFIG_CPU_FREQ_GOV_POWERSAVE=m
CONFIG_CPU_FREQ_GOV_USERSPACE=m
CONFIG_CPU_FREQ_GOV_ONDEMAND=y
CONFIG_CPU_FREQ_GOV_CONSERVATIVE=m
CONFIG_CPU_FREQ_GOV_SCHEDUTIL=y

 

That's weird and might be why you see these errors. Could you check the output of cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor and, if it says anything other than performance, do echo performance | sudo tee /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor (please note that this setting won't survive a reboot) and then redo your testing?

BTW: Is wget https://git.kernel.org/torvalds/t/linux-4.14-rc5.tar.gz a good way to reproduce the issue for you?
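
When changing the governor by hand, note that in a command of the form sudo echo value > file the redirection is opened by the calling shell and not by sudo, so it fails for a normal user; the write itself has to run as root. A minimal sketch that applies one governor to every CPU entry follows. The function name and the optional second argument are illustrative (the argument exists only so the function can be tried against a scratch directory first), and the setting is still lost on reboot.

```shell
#!/bin/sh
# set_governor GOVERNOR [CPU_ROOT]  (illustrative helper)
# Write GOVERNOR to the scaling_governor file of every cpuN directory.
# CPU_ROOT defaults to the real sysfs location and must be writable,
# which in practice means running this as root.
set_governor() {
    governor=$1
    cpu_root=${2:-/sys/devices/system/cpu}
    for f in "$cpu_root"/cpu[0-9]*/cpufreq/scaling_governor; do
        # Skip the unexpanded pattern when no CPU exposes cpufreq.
        [ -e "$f" ] || continue
        printf '%s\n' "$governor" > "$f" || return 1
    done
}
```

Saved to a file (the name set_governor.sh is hypothetical), it could be run as sudo sh -c '. ./set_governor.sh && set_governor performance'. Writing to every cpuN entry is harmless if the cores share one frequency policy.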
 

 

1 hour ago, NinjaKitty said:

For my other question of not being able to access the internet on boot, am I doing something wrong here?

What do ifconfig and cat /etc/resolv.conf show before and after you do ifdown/ifup?

Posted
9 hours ago, V10lator said:

That's weird and might be why you see these errors. Could you check the output of cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor and, if it says anything other than performance, do echo performance | sudo tee /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor (please note that this setting won't survive a reboot) and then redo your testing?

BTW: Is wget https://git.kernel.org/torvalds/t/linux-4.14-rc5.tar.gz a good way to reproduce the issue for you?

 

1) cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor returns ondemand

2) I'll do some testing when I get back home from work.

3) I just did the wget command, and it works fine (didn't change scaling_governor)

4) I'll try changing scaling_governor to performance and then reinstall docker-ce, and then redo the same stuff I was doing before.

Posted
On 10/19/2017 at 11:33 PM, V10lator said:

What do ifconfig and cat /etc/resolv.conf show before and after you do ifdown/ifup?

Before:

eth0      Link encap:Ethernet  HWaddr 8e:fc:0c:bb:94:16
          inet addr:192.168.5.200  Bcast:192.168.5.255  Mask:255.255.255.0
          inet6 addr: fe80::8cfc:cff:febb:9416/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:86915 errors:0 dropped:42 overruns:0 frame:0
          TX packets:658 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:13818114 (13.8 MB)  TX bytes:88602 (88.6 KB)
          Interrupt:17
lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:116878 errors:0 dropped:0 overruns:0 frame:0
          TX packets:116878 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:9303074 (9.3 MB)  TX bytes:9303074 (9.3 MB)

 

After

 

eth0      Link encap:Ethernet  HWaddr 8e:fc:0c:bb:94:16
          inet addr:192.168.5.200  Bcast:192.168.5.255  Mask:255.255.255.0
          inet6 addr: fe80::8cfc:cff:febb:9416/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:87076 errors:0 dropped:42 overruns:0 frame:0
          TX packets:731 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:13833694 (13.8 MB)  TX bytes:96799 (96.7 KB)
          Interrupt:17

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:116962 errors:0 dropped:0 overruns:0 frame:0
          TX packets:116962 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:9309598 (9.3 MB)  TX bytes:9309598 (9.3 MB)

 

Posted

I changed /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor to performance

 

I did curl https://get.docker.com | sh 

 

dan@lepotato1:~$ curl  https://get.docker.com | sh
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 11070  100 11070    0     0  13792      0 --:--:-- --:--:-- --:--:-- 13785
# Executing docker install script, commit: 49ee7c1
+ sudo -E sh -c apt-get update -qq >/dev/null
+ sudo -E sh -c apt-get install -y -qq apt-transport-https ca-certificates curl software-properties-common >/dev/null
+ sudo -E sh -c curl -fsSL "https://download.docker.com/linux/ubuntu/gpg" | apt-key add -qq - >/dev/null
+ sudo -E sh -c echo "deb [arch=arm64] https://download.docker.com/linux/ubuntu xenial edge" > /etc/apt/sources.list.d/docker.list
+ [ ubuntu = debian ]
+ sudo -E sh -c apt-get update -qq >/dev/null
+ sudo -E sh -c apt-get install -y -qq --no-install-recommends docker-ce >/dev/null
E: Failed to fetch https://download.docker.com/linux/ubuntu/dists/xenial/pool/edge/arm64/docker-ce_17.10.0~ce-0~ubuntu_arm64.deb  Operation too slow. Less than 10 bytes/sec transferred the last 120 seconds

E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?

Immediately after this, SSH hangs and becomes inaccessible.

 

I went onto the computer directly, reset the ethernet, and did armbianmonitor -u. (Ethernet was dead and couldn't do apt-get update)

 

http://sprunge.us/EFfg

Posted

I have the Libre Le Potato, running the Armbian Ubuntu image [Ubuntu desktop - mainline kernel] from the download page. It is the 2GB model, exactly as the OP has above. I see the same issue: Ethernet connectivity reliably just stops after about 15 minutes or so of use, just running turbovncserver and one remote session, nothing too strenuous.

 

Apart from that, it's a wonderful little board and Armbian on it is really excellent.

 

So, just adding more weight to this observation by others.

 

Posted

Thanks @Andro for the report, and @Da Xue for the information.

 

I've now observed the same thing, however in my case it takes up to a day to appear, although admittedly I'm not doing server activities. I set the device up to stream an entire YouTube channel, which it did successfully overnight; setting it up so that I would ping it periodically eventually *seems* to have knocked it out. It seems as though incoming requests may be to blame, rather than outgoing traffic? I'll set up a server with it and see how differently that behaves.

Posted

Hi @TonyMac32, I can now confirm that a constant stream of inbound traffic causes the Ethernet to lock up after 15 minutes to an hour or so. This test is repeatable and reliable.

 

As to how to provide logs or any further detail, I don't really know.
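
One low-effort way to collect something useful is to record the interface packet counters at regular intervals on the local console, so the moment the receive side stops is visible afterwards. This is a minimal sketch under a few assumptions: the interface is eth0, the counters are read from the standard sysfs statistics files, and the function name and its optional second argument are illustrative.

```shell
#!/bin/sh
# snapshot_counters [IFACE] [NET_ROOT]  (illustrative helper)
# Print one line with a Unix timestamp and the RX/TX packet counters of
# IFACE (default eth0), read from NET_ROOT (default /sys/class/net).
snapshot_counters() {
    iface=${1:-eth0}
    net_root=${2:-/sys/class/net}
    stats="$net_root/$iface/statistics"
    printf '%s %s rx_packets=%s tx_packets=%s\n' \
        "$(date +%s)" "$iface" \
        "$(cat "$stats/rx_packets")" "$(cat "$stats/tx_packets")"
}
```

For example, while sleep 10; do snapshot_counters eth0; done >> eth0-counters.log can be left running from the HDMI console. After a lockup, that log together with the tail of dmesg and the armbianmonitor -u link requested earlier in the thread should show when traffic stopped.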

 

Posted

Alright, Da Xue says Amlogic is looking into it, I'll build another image and see if I can get your results.  (I am not doubting you, I'm just curious why I'm seeing a different behavior)

Posted

I built the latest 5.34 Armbian image for the Le Potato and it still locks up after about an hour or so of ethernet traffic (using turbovnc), rendering this very nice little board unusable. I see others confirming the problem, but are there any people able to run this board successfully? A bit of a puzzle.

 

Posted
On 11/5/2017 at 12:49 AM, Da Xue said:

This is being investigated at Amlogic.

Hi Da Xue,

 

Any progress?

 

 

Posted
From 91c030a615bc1bcc500cfd63d19ea5a61179f5e1 Mon Sep 17 00:00:00 2001
From: Yizhou Jiang <yizhou.jiang@amlogic.com>
Date: Tue, 27 Jun 2017 14:05:12 +0800
Subject: [PATCH] PD#146205: eth: fix eth stop

Change-Id: I7f8ad51dacd207a804377e71340fe15f547bbae0
Signed-off-by: Yizhou Jiang <yizhou.jiang@amlogic.com>
---
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 0fe9ed86aa3f..4d58ed6d44fa 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -3390,6 +3390,15 @@ static void moniter_tx_handler(struct work_struct *work)
 		priv = netdev_priv(c_phy_dev->attached_dev);
 		if (priv) {
 			if (c_phy_dev->link) {
+				if (priv->dev->stats.tx_packets > 100) {
+					if (priv->dev->stats.rx_packets == 0) {
+						pr_info("rx stop, recover eth\n");
+						stmmac_release(priv->dev);
+						stmmac_open(priv->dev);
+					}
+				}
+				priv->dev->stats.tx_packets++;
+				priv->dev->stats.tx_packets++;
 				if (priv->dirty_tx != priv->cur_tx &&
 						check_tx == 0) {
 					pr_info("tx queueing\n");
-- 
2.13.6

This is a patch to the stmmac driver as a workaround for the issue for now. Hopefully a better fix is coming. @Andro
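
The check in the patch (more than 100 packets sent but none received, so the receive path is treated as stopped and the interface is reopened) can be roughly approximated from user space on kernels that do not carry it. The sketch below is not a replacement for the driver fix: unlike the patch, it compares two counter samples instead of absolute totals, and the function name and the optional threshold argument are illustrative.

```shell
#!/bin/sh
# rx_stalled PREV_RX CUR_RX PREV_TX CUR_TX [MIN_TX]  (illustrative helper)
# Succeeds (exit status 0) when more than MIN_TX packets (default 100)
# were sent between two samples while the RX counter did not move.
rx_stalled() {
    min_tx=${5:-100}
    [ "$(( $4 - $3 ))" -gt "$min_tx" ] && [ "$2" -eq "$1" ]
}
```

A watchdog loop could sample /sys/class/net/eth0/statistics/rx_packets and tx_packets once a minute and, when rx_stalled succeeds, run ifdown eth0; ifup eth0 as root, which is the same manual reset used earlier in this thread. Whether that reset reliably recovers the link on every board is an assumption.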
