RaX

odroid-c2 kernel panic on adding/removing veth interfaces quickly

Not sure if this belongs here or upstream, but I figured I'd post here first.

 

I'm running kubernetes on a bunch of odroid c2s, and they crash at random, sometimes in pairs and sometimes one at a time. I was able to capture the logs from the crash, but I can't seem to get a core dump. I was, however, able to determine that the panic occurs when virtual ethernet (veth) devices are removed and added in rapid succession. Some sort of collision seems to occur and the kernel oopsies into oblivion. It happens constantly, so often that I've had to put my project on hold =(
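
For anyone who wants to try reproducing the churn described above outside of kubernetes, here is a rough sketch. It is only an approximation of the real workload (pods being created and torn down), and it may well not trigger the panic on its own. The `vtest…` interface names and the `veth_commands` / `churn` helpers are made up for illustration; the `ip link` invocations are the standard iproute2 ones. It defaults to a dry run that only prints the commands, and a real run needs root.

```python
import subprocess


def veth_commands(index):
    """Return the ip(8) commands that create, bring up, and delete one veth pair.

    The interface names are hypothetical and kept short so they stay
    within the kernel's 15-character interface name limit.
    """
    a = f"vtest{index}a"
    b = f"vtest{index}b"
    return [
        ["ip", "link", "add", a, "type", "veth", "peer", "name", b],
        ["ip", "link", "set", a, "up"],
        # Deleting one end of a veth pair removes its peer as well.
        ["ip", "link", "delete", a],
    ]


def churn(iterations, dry_run=True):
    """Add and remove veth pairs back to back; return every command issued."""
    issued = []
    for i in range(iterations):
        for cmd in veth_commands(i):
            issued.append(cmd)
            if not dry_run:
                # check=False: keep going even if a single command fails.
                subprocess.run(cmd, check=False)
    return issued


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for cmd in churn(3):
        print(" ".join(cmd))
```

Calling `churn(1000, dry_run=False)` as root on a test node would actually create and delete the interfaces; only do that on a machine you can afford to crash.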

 

This happens with the stock image from hardkernel, the latest stable armbian image for c2, and the latest nightly for c2 on 4.14-28 with beta firmware and all that.

 

I'm wondering if this is a known issue or what's going on. I've searched all the keywords I can think of. I also had to build custom images to switch to btrfs because ext4 kept corrupting itself with all the crashes, even when journaled.


Patch would need to be backported to 4.14.y ... perhaps this would do:
 

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index bb44f0c..9e9202b
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -2155,13 +2155,6 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
 		}
 
 		ndst = &rt->dst;
-		if (skb_dst(skb)) {
-			int mtu = dst_mtu(ndst) - VXLAN_HEADROOM;
-
-			skb_dst(skb)->ops->update_pmtu(skb_dst(skb), NULL,
-						       skb, mtu);
-		}
-
 		tos = ip_tunnel_ecn_encap(tos, old_iph, skb);
 		ttl = ttl ? : ip4_dst_hoplimit(&rt->dst);
 		err = vxlan_build_skb(skb, ndst, sizeof(struct iphdr),
@@ -2197,13 +2190,6 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
 				goto out_unlock;
 		}
 
-		if (skb_dst(skb)) {
-			int mtu = dst_mtu(ndst) - VXLAN6_HEADROOM;
-
-			skb_dst(skb)->ops->update_pmtu(skb_dst(skb), NULL,
-						       skb, mtu);
-		}
-
 		tos = ip_tunnel_ecn_encap(tos, old_iph, skb);
 		ttl = ttl ? : ip6_dst_hoplimit(ndst);
 		skb_scrub_packet(skb, xnet); 

 

 

 


I've added that to my userpatches. It will take me a bit to build another k8s worker base image, but I'll post back when I finish testing.


That definitely solved the crash issues. I have all of my nodes up and stable =) Thanks for the help!
