Le Potato Ethernet Problems

NinjaKitty · October 17, 2017

So, I've been struggling to figure out the reason behind this problem.

I have 4 Le Potatoes, all connected to a 100 Mbps switch. Each of them has a 32 GB Samsung EVO SD card.

Initially, the problem was that my SSH session would hang at random, usually while I was downloading something. I couldn't SSH in again until I rebooted the Pi (I didn't bother waiting longer than 5 minutes).

At first I thought the SD cards were bad, so I moved them around and did reinstalls. I'm also using SD Formatter to format them and Etcher to flash them.

Still, the freezing showed no consistent pattern across boards or SD cards.

For power, I'm using an Anker 40W 4-Port USB Wall Charger, which should (I think) provide enough power. I've checked all the cables using my phone to verify.

I plugged HDMI into them to see if they were still running after those SSH crashes (why didn't I do this earlier?). I SSH'd into them and watched what happened when the "crash" occurred.

After SSH hangs, I check the Pi, and it is still running as usual.

I rebooted the Pi, started some downloads, and then saw the same network hang I get over SSH. To me it seems like the Ethernet driver is crashing or something.

I'm not sure what else to try for debugging / what the problem is. Send Help.

Version: ARMBIAN 5.34.171017 nightly Ubuntu 16.04.3 LTS 4.13.7-meson64
Using Le Potato 2GB Version.

Tido · October 17, 2017

To help you: look in the documentation section to see what these things can report.


Igor · October 17, 2017

@TonyMac32

This is fixed on the C2, while here it looks like it's not (if I am looking at the correct file). Related?

https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/arch/arm64/boot/dts/amlogic/meson-gxbb-odroidc2.dts?id=f7bcd4b6f6983d668b057dc166799716690423a4

NinjaKitty · October 17, 2017

29 minutes ago, Tido said:

To help you: look in the documentation section to see what these things can report.

I think you forgot to link something.

TonyMac32 · October 17, 2017

I'm not sure this carries over from the S905 to the S905X, as the S905X can only officially reach 1.5 GHz in the first place. If he has time, @Neil Armstrong may be able to answer better; I don't know all of the differences between the C2 and K2 GXBB and the Le Potato GXL.

I haven't seen this issue on my board, but of course I haven't done too many long-term tests.

@NinjaKitty, are you using the latest kernel image? File download was the simplest way for me to trip the memory fault associated with

The board didn't typically die immediately after the hang, it took some other activity first.

If you can give me the output of armbianmonitor -u, maybe the fault left some evidence.
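A minimal sketch of pulling the network-related lines out of the kernel log by hand, assuming the Ethernet driver logs under names such as stmmac, dwmac, or eth0. The filter pattern and the function name net_log_lines are illustrative and not part of armbianmonitor:

```sh
#!/bin/sh
# Keep only kernel log lines that mention the wired interface or its driver.
# The pattern is an assumption; adjust it to whatever your dmesg shows.
net_log_lines() {
    grep -iE 'eth[0-9]|stmmac|dwmac|link is (up|down)'
}

# Typical use on the board after a hang (from the HDMI console):
#   dmesg | net_log_lines
#   armbianmonitor -u    # basic machine info plus dmesg, as a shareable link
```

Reading from standard input keeps the filter usable on a saved log as well as on live dmesg output.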

My Le Potato ran for 24 hours without incident as of last night; I will repeat the test but give it some activity to see if it fails.

TonyMac32 · October 17, 2017

14 minutes ago, NinjaKitty said:

I think you forgot to link something.

https://docs.armbian.com/

There you'll find all sorts of good info.

NinjaKitty · October 17, 2017

@TonyMac32

I'm on version 4.13, which I downloaded yesterday. From your post, I'm assuming there's a 4.15 version?

I'll try to reproduce it (it happened again as I was downloading the python2.7 docker container) and see.

TonyMac32 · October 17, 2017

No, 4.15 doesn't exist yet; I pulled a patch that is scheduled to be included in a future kernel. I apologize for any confusion.

I've moved multi-GB files with this image; that's the only reason I'm curious. The armbianmonitor link will provide basic info about the machine and dmesg info if you provide it.

NinjaKitty · October 17, 2017

http://sprunge.us/ORAj

Also, I've noticed with this particular version, static IPs aren't working on boot and I have to do ifdown/ifup to make it work.

It used to work fine with the previous versions. I'm not sure what's wrong with that either.

Igor · October 17, 2017

5 minutes ago, NinjaKitty said:

It used to work fine with the previous versions.

Which previous versions? Which kernel?

Another possible related problem - not sure if it manifests here:
https://github.com/armbian/build/commit/71ac70f93e4cf2af99f9ead6297f6b79f0c0529c

NinjaKitty · October 17, 2017

Sorry, I should've checked before posting.
ARMBIAN 5.33.171011 nightly Ubuntu 16.04.3 LTS 4.13.5-meson64

Currently, I'm on 4.13.7-meson64

TonyMac32 · October 17, 2017

### Group membership of dan : dan dialout sudo audio video plugdev systemd-journal netdev bluetooth docker

Was docker installed at the same time the issues appeared?

[ 21.620963] docker_gwbridge: port 1(vethe7075d6) entered forwarding state
[ 21.621367] docker_gwbridge: port 1(vethe7075d6) entered disabled state
[ 21.797479] eth1: renamed from vethc6ce16b

I have not tried Docker on this image, @Igor could this be related?

Igor · October 17, 2017

15 minutes ago, TonyMac32 said:

I have not tried Docker on this image, @Igor could this be related?

Neither did I. It could be a Docker (dependencies) related issue. Unfortunately, I am not deeply familiar with the possible problems.

NinjaKitty · October 17, 2017

I've run into this problem a few times without Docker, for things like sudo apt-get update, but most of the time it happened during something involving Docker: running the get.docker script, installing Docker, or downloading a container (e.g. python2.7-slim). I can test later today and see if other workloads cause this to happen.

V10lator · October 19, 2017

Did you compile your own kernel? Just asking, as I had similar issues when compiling a custom kernel with any CPU frequency governor other than Performance. This is how it should be:

$ gunzip -c /proc/config.gz | grep CPU_FREQ_GOV
CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
# CONFIG_CPU_FREQ_GOV_POWERSAVE is not set
# CONFIG_CPU_FREQ_GOV_USERSPACE is not set
# CONFIG_CPU_FREQ_GOV_ONDEMAND is not set
# CONFIG_CPU_FREQ_GOV_CONSERVATIVE is not set
# CONFIG_CPU_FREQ_GOV_SCHEDUTIL is not set
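Note that grep CPU_FREQ_GOV does not match the CONFIG_CPU_FREQ_DEFAULT_GOV_* options, which select the governor the kernel starts with. A minimal sketch for reading that default from the same config stream; the function name default_governor is made up for illustration, and it assumes /proc/config.gz is available as in the command above:

```sh
#!/bin/sh
# Print the kernel's compiled-in default cpufreq governor in lowercase.
# Reads a kernel config on standard input.
default_governor() {
    sed -n 's/^CONFIG_CPU_FREQ_DEFAULT_GOV_\([A-Z_]*\)=y$/\1/p' |
        tr '[:upper:]' '[:lower:]'
}

# Typical use on the board:
#   gunzip -c /proc/config.gz | default_governor
```

Userspace tools may still change the governor after boot, so the value in scaling_governor is the one that matters at run time.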

NinjaKitty · October 20, 2017

@V10lator Nope, just downloaded from armbian's site.

Mine looks like this

CONFIG_CPU_FREQ_GOV_ATTR_SET=y
CONFIG_CPU_FREQ_GOV_COMMON=y
CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
CONFIG_CPU_FREQ_GOV_POWERSAVE=m
CONFIG_CPU_FREQ_GOV_USERSPACE=m
CONFIG_CPU_FREQ_GOV_ONDEMAND=y
CONFIG_CPU_FREQ_GOV_CONSERVATIVE=m
CONFIG_CPU_FREQ_GOV_SCHEDUTIL=y

NinjaKitty · October 20, 2017

I'm going to uninstall docker-ce and just use the boards without it to see if they still crash.

For my other question of not being able to access the internet on boot, am I doing something wrong here?

/etc/network/interfaces

# Wired adapter #1
allow-hotplug eth0
no-auto-down eth0
iface eth0 inet static
    address 192.168.5.200
    netmask 255.255.255.0
    gateway 192.168.5.6
    dns-nameservers 8.8.8.8 192.168.5.6

For more context,

I have it set up with a static IP, but on every boot I have to run sudo ifdown eth0; sudo ifup eth0 to get internet access. I can SSH into it just fine, and I can ping 8.8.8.8, but I can't resolve any hostnames (e.g. nslookup google.com fails).
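Being able to ping 8.8.8.8 but not resolve names points at the resolver configuration rather than the link itself. A minimal sketch for checking which nameservers are actually configured; the function names are illustrative and it assumes the usual resolv.conf format:

```sh
#!/bin/sh
# Print the nameserver addresses listed in a resolv.conf-style file.
# Defaults to /etc/resolv.conf when no file is given.
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Succeed only if at least one nameserver is configured.
dns_configured() {
    [ -n "$(list_nameservers "$1")" ]
}

# Typical use, once right after boot and once after ifdown/ifup:
#   list_nameservers
#   dns_configured || echo "no nameserver configured"
```

If the list is empty after boot and populated after ifdown/ifup, the dns-nameservers entries from /etc/network/interfaces are probably only being applied on the second attempt.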

Edit: I uninstalled docker-ce and so far it hasn't crashed. I've done some stress tests and iperf runs, and it seems to be working fine... Will keep checking. Maybe something is wrong with Docker on Amlogic?

V10lator · October 20, 2017

4 hours ago, NinjaKitty said:

Mine looks like this


CONFIG_CPU_FREQ_GOV_ATTR_SET=y
CONFIG_CPU_FREQ_GOV_COMMON=y
CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
CONFIG_CPU_FREQ_GOV_POWERSAVE=m
CONFIG_CPU_FREQ_GOV_USERSPACE=m
CONFIG_CPU_FREQ_GOV_ONDEMAND=y
CONFIG_CPU_FREQ_GOV_CONSERVATIVE=m
CONFIG_CPU_FREQ_GOV_SCHEDUTIL=y

That's weird and might be why you see these errors. Could you check the output of cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor and, if it says anything other than performance, do echo performance | sudo tee /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor (please note that this setting won't survive a reboot) and then redo your testing?
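A minimal sketch of the same governor check and change applied to every CPU entry at once. CPU_SYSFS is a hypothetical override added here so the functions can be tried against a scratch directory; on a real board leave it unset and run the script as root:

```sh
#!/bin/sh
# Write the requested governor into every scaling_governor file found.
set_governor() {
    gov=$1
    root=${CPU_SYSFS:-/sys/devices/system/cpu}
    for f in "$root"/cpu[0-9]*/cpufreq/scaling_governor; do
        [ -e "$f" ] || continue          # glob matched nothing
        printf '%s\n' "$gov" > "$f"
    done
}

# Print the current governor of each CPU entry.
show_governors() {
    root=${CPU_SYSFS:-/sys/devices/system/cpu}
    for f in "$root"/cpu[0-9]*/cpufreq/scaling_governor; do
        if [ -e "$f" ]; then
            printf '%s: %s\n' "$f" "$(cat "$f")"
        fi
    done
}

# Typical use (as root):
#   show_governors
#   set_governor performance
```

As with the single-file command, the change does not survive a reboot.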

BTW: Is wget https://git.kernel.org/torvalds/t/linux-4.14-rc5.tar.gz a good way to reproduce the issue for you?

1 hour ago, NinjaKitty said:

For my other question of not being able to access the internet on boot, am I doing something wrong here?

What do ifconfig and cat /etc/resolv.conf show before and after you do ifdown/ifup?

NinjaKitty · October 20, 2017

9 hours ago, V10lator said:

That's weird and might be why you see these errors. Could you check the output of cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor and, if it says anything other than performance, do echo performance | sudo tee /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor (please note that this setting won't survive a reboot) and then redo your testing?

BTW: Is wget https://git.kernel.org/torvalds/t/linux-4.14-rc5.tar.gz a good way to reproduce the issue for you?

1) cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor returns ondemand

2) I'll do some testing when I get back home from work.

3) I just did the wget command, and it works fine (didn't change scaling_governor)

4) I'll try changing scaling_governor to performance and then reinstall docker-ce, and then redo the same stuff I was doing before.

NinjaKitty · October 26, 2017

On 10/19/2017 at 11:33 PM, V10lator said:

What do ifconfig and cat /etc/resolv.conf show before and after you do ifdown/ifup?

Before:

eth0      Link encap:Ethernet  HWaddr 8e:fc:0c:bb:94:16
          inet addr:192.168.5.200  Bcast:192.168.5.255  Mask:255.255.255.0
          inet6 addr: fe80::8cfc:cff:febb:9416/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:86915 errors:0 dropped:42 overruns:0 frame:0
          TX packets:658 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:13818114 (13.8 MB)  TX bytes:88602 (88.6 KB)
          Interrupt:17

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:116878 errors:0 dropped:0 overruns:0 frame:0
          TX packets:116878 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:9303074 (9.3 MB)  TX bytes:9303074 (9.3 MB)

After:

eth0      Link encap:Ethernet  HWaddr 8e:fc:0c:bb:94:16
          inet addr:192.168.5.200  Bcast:192.168.5.255  Mask:255.255.255.0
          inet6 addr: fe80::8cfc:cff:febb:9416/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:87076 errors:0 dropped:42 overruns:0 frame:0
          TX packets:731 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:13833694 (13.8 MB)  TX bytes:96799 (96.7 KB)
          Interrupt:17

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:116962 errors:0 dropped:0 overruns:0 frame:0
          TX packets:116962 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:9309598 (9.3 MB)  TX bytes:9309598 (9.3 MB)

NinjaKitty · October 26, 2017

I changed /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor to performance

I did curl https://get.docker.com | sh

dan@lepotato1:~$ curl  https://get.docker.com | sh
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 11070  100 11070    0     0  13792      0 --:--:-- --:--:-- --:--:-- 13785
# Executing docker install script, commit: 49ee7c1
+ sudo -E sh -c apt-get update -qq >/dev/null
+ sudo -E sh -c apt-get install -y -qq apt-transport-https ca-certificates curl software-properties-common >/dev/null
+ sudo -E sh -c curl -fsSL "https://download.docker.com/linux/ubuntu/gpg" | apt-key add -qq - >/dev/null
+ sudo -E sh -c echo "deb [arch=arm64] https://download.docker.com/linux/ubuntu xenial edge" > /etc/apt/sources.list.d/docker.list
+ [ ubuntu = debian ]
+ sudo -E sh -c apt-get update -qq >/dev/null
+ sudo -E sh -c apt-get install -y -qq --no-install-recommends docker-ce >/dev/null
E: Failed to fetch https://download.docker.com/linux/ubuntu/dists/xenial/pool/edge/arm64/docker-ce_17.10.0~ce-0~ubuntu_arm64.deb  Operation too slow. Less than 10 bytes/sec transferred the last 120 seconds

E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?

Immediately after this, SSH hangs and becomes inaccessible.

I went onto the computer directly, reset the Ethernet, and ran armbianmonitor -u. (Ethernet was dead and I couldn't run apt-get update.)

http://sprunge.us/EFfg

Andro · November 4, 2017

I have the Libre Le Potato, running the Armbian Ubuntu image [Ubuntu desktop - mainline kernel] from the download page. It is the 2GB model, exactly as the OP has above. I see the same issue: Ethernet connectivity reliably stops after about 15 minutes of use, just running turbovncserver and one remote session, nothing too strenuous.

Apart from that, it's a wonderful little board and Armbian on it is really excellent.

So, just adding more weight to this observation by others.

Da Xue · November 4, 2017

This is being investigated at Amlogic.

TonyMac32 · November 4, 2017

Thanks @Andro for the report, and @Da Xue for the information.

I've now observed the same thing; however, in my case it takes up to a day to appear, although admittedly I'm not doing server activities. I set the device up to stream an entire YouTube channel, which it did successfully overnight; setting it up so that I would ping it periodically eventually *seems* to have knocked it out. It seems as though incoming requests may be to blame, rather than outgoing traffic? I'll set up a server on it and see how differently that behaves.

Andro · November 5, 2017

Hi @TonyMac32, I can now confirm that a constant stream of inbound traffic causes the Ethernet to lock up after 15 minutes to an hour or so. This test is repeatable and reliable.

As to how to provide logs or any further detail, I don't really know.

TonyMac32 · November 5, 2017

Alright, Da Xue says Amlogic is looking into it. I'll build another image and see if I can get your results. (I am not doubting you, I'm just curious why I'm seeing different behavior.)

Andro · November 13, 2017

I built the latest 5.34 Armbian image for the Le Potato and it still locks up after about an hour or so of ethernet traffic (using turbovnc), rendering this very nice little board unusable. I see others confirming the problem, but are there any people able to run this board successfully? A bit of a puzzle.

Andro · November 14, 2017

On 11/5/2017 at 12:49 AM, Da Xue said:

This is being investigated at Amlogic.

Hi Da Xue,

Any progress?

Da Xue · November 20, 2017

From 91c030a615bc1bcc500cfd63d19ea5a61179f5e1 Mon Sep 17 00:00:00 2001
From: Yizhou Jiang <yizhou.jiang@amlogic.com>
Date: Tue, 27 Jun 2017 14:05:12 +0800
Subject: [PATCH] PD#146205: eth: fix eth stop

Change-Id: I7f8ad51dacd207a804377e71340fe15f547bbae0
Signed-off-by: Yizhou Jiang <yizhou.jiang@amlogic.com>
---
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 0fe9ed86aa3f..4d58ed6d44fa 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -3390,6 +3390,15 @@ static void moniter_tx_handler(struct work_struct *work)
 		priv = netdev_priv(c_phy_dev->attached_dev);
 		if (priv) {
 			if (c_phy_dev->link) {
+				if (priv->dev->stats.tx_packets > 100) {
+					if (priv->dev->stats.rx_packets == 0) {
+						pr_info("rx stop, recover eth\n");
+						stmmac_release(priv->dev);
+						stmmac_open(priv->dev);
+					}
+				}
+				priv->dev->stats.tx_packets++;
+				priv->dev->stats.tx_packets++;
 				if (priv->dirty_tx != priv->cur_tx &&
 						check_tx == 0) {
 					pr_info("tx queueing\n");
-- 
2.13.6

This is a patch to the stmmac driver as a workaround for the issue for now. Hopefully a better fix is coming. @Andro
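For boards that cannot run a patched kernel yet, a rough userspace analogue of the same idea can be sketched: watch the interface counters and restart the interface when packets go out but nothing comes in. This is not the patch's logic, only an approximation; the 30-second window, the threshold of 100, and the function names are assumptions, and whether ifdown/ifup actually recovers the link is untested here:

```sh
#!/bin/sh
# Decide whether receive traffic stalled between two counter samples.
# Arguments: rx_before rx_after tx_before tx_after min_tx_delta
# Succeeds (stalled) when at least min_tx_delta packets were sent but none
# were received in the same window.
rx_stalled() {
    [ $(( $4 - $3 )) -ge "$5" ] && [ "$2" -eq "$1" ]
}

# Read one counter for an interface from sysfs.
read_counter() {   # usage: read_counter eth0 rx_packets
    cat "/sys/class/net/$1/statistics/$2"
}

# Example loop (commented out; needs root and the ifupdown tools):
# while :; do
#     rx0=$(read_counter eth0 rx_packets); tx0=$(read_counter eth0 tx_packets)
#     sleep 30
#     rx1=$(read_counter eth0 rx_packets); tx1=$(read_counter eth0 tx_packets)
#     if rx_stalled "$rx0" "$rx1" "$tx0" "$tx1" 100; then
#         logger "eth0 rx stalled, restarting interface"
#         ifdown eth0; ifup eth0
#     fi
# done
```

Keeping the decision in rx_stalled separate from the loop makes the threshold easy to try out against recorded counter values before letting it touch the interface.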

TonyMac32 · November 20, 2017

Thanks @Da Xue. If no one gets to it before I do, I'll test it this evening (Eastern US).
