squat/azure-vxlan.md

Last active April 23, 2021 04:27

Star (4) You must be signed in to star a gist
Fork (0) You must be signed in to fork a gist

Learn more about clone URLs
Clone this repository at <script src="https://gist.github.com/squat/1c2799c3565c383fe4b1499c101bfc49.js"></script>
Save squat/1c2799c3565c383fe4b1499c101bfc49 to your computer and use it in GitHub Desktop.

Download ZIP

Debugging VXLAN issues on Azure

Raw

azure-vxlan.md

Azure VXLAN Issues

Introduction

Between June 24-25, the nodes in Tectonic clusters running on Azure automatically updated the OS from Container Linux 1353.8.0 to 1409.2.0. After this upgrade, the nodes began to experience increased latency and failure rate in requests. Interestingly, we found that the size of the HTTP request played a role in determining the success of requests to services running on the Kubernetes cluster. We found that setting the client's interface's MTU to 1370 made all requests succeed; incrementing the MTU to 1371 caused the failure of large HTTP requests to resurface. Additionally, enabling TCP MTU probing on the clients ensured all requests would succeed, albeit with increased latency. In order to identify the minimum set of circumstances needed to reproduce the issue, I ran several tests involving different network topologies. I found that the following elements were required to trigger failure:

VXLAN;
MTU path discovery; and
the Linux kernel version present in Container Linux 1409.2.0 or later.

The following document details how to configure a minimal enviroment to reproduce the issue and test different kernels.

Topology

Reproducing the VXLAN issues requires two Container Linux virtual machines (VMs) running in Azure cloud and connected with a VXLAN. Note: for simplicity, the VMs can be placed in the same Azure virtual network and subnet, however this is not strictly necessary. The setup is as follows:

a Linux client, e.g. a laptop, makes an HTTP request to node 1
node 1 NATs the request:
- node 1 NATs the packet's destination to the IP of the container running on node 2
- node 1 NATs the packet's source to the IP of its VXLAN interface
node 1 forwards the request over the VXLAN to node 2
node 2 either receives or does not receive the packets on its VXLAN interface and routes them to the container

Visually:

Node 1 <---NAT---> <---VXLAN---> Node 2 <---> container
  ^
  |
  |
  v
Client

Getting Started

Launch two Container Linux VMs on Azure. Ensure both VMs are reachable over TCP ports 22, 80, and UDP port 8472, the default VXLAN port on Linux. Additionally, both VMs should be configured with public IP addresses. This exercise will assume the nodes share a subnet, however as long as UDP port 8472 is reachable via the public IPs, this should not matter. Visually, the cluster should look like:

------------------------10.0.1.0/24-----------------------
|       Node 1                             Node 2        |
|10.0.1.4/24 (eth0)                 10.0.1.5/24 (eth0)   |
----------------------------------------------------------

VXLAN Configuration

SSH onto node 1 and create a unicast VXLAN interface:

ip link add vxlan0 type vxlan id 10 dev eth0 dstport 0

Configure the VXLAN connection to node 2:

bridge fdb add 00:00:00:00:00:00 dev vxlan0 dst 10.0.1.5

Bring up the VXLAN and give it an IP address:

ip addr add 10.0.0.1/24 dev vxlan0
ip link set up dev vxlan0

Repeat the same process for node 2:

ip link add vxlan0 type vxlan id 10 dev eth0 dstport 0
bridge fdb add 00:00:00:00:00:00 dev vxlan0 dst 10.0.1.4
ip addr add 10.0.0.2/24 dev vxlan0
ip link set up dev vxlan0

Verify that the two VXLAN interfaces can communicate with simple pings. From node 1:

ping 10.0.0.2

From node 2:

ping 10.0.0.1

Visually, the cluster should now look like:

------------------------10.0.1.0/24-----------------------
|       Node 1                             Node 2        |
|10.0.1.4/24 (eth0)                 10.0.1.5/24 (eth0)   |
|10.0.0.1/24 (vxlan0) <---VXLAN---> 10.0.0.2/24 (vxlan0) |
----------------------------------------------------------

Container Configuration

For purposes of simplicity, run a simple nginx container on node 2, and find its IP address:

# node 2
CID=$(docker run -d nginx)
CIP=$(docker inspect --format '{{ .NetworkSettings.IPAddress }}' $CID)

Verify that the container is reachable via its IP address from node 2:

# node 2
curl $CIP

On node 1, configure a route to the nginx container:

# node 1
export CIP=<nginx-container-ip>
ip route add $CIP via 10.0.0.2

Verify that the container is reachable via its IP address from node 1:

# node 1
curl $CIP

IPTables Configuration

In order for the container to be reachable by clients making requests to node 1, that node must forward requests to the nginx container running on node 2. On node 1, configure the following NAT and FORWARD rules:

# node 1
iptables -t nat -A PREROUTING -p tcp -i eth0 --dport 80 -j DNAT --to-destination $CIP:80
iptables -t nat -A POSTROUTING -o vxlan0 -j MASQUERADE
iptables -A FORWARD -s $CIP -j ACCEPT
iptables -A FORWARD -d $CIP -j ACCEPT

Test VXLAN

Finally, now that the infrastructure is suitably configured, we can test the connection from a client that has TCP MTU probing disabled. HTTP requests shorter than 1378 bytes should succeed. The following should work as expected:

# client
curl <node-1-public-ip>

HTTP requests longer than 1378 bytes should fail on kernels affected by the regression. The following request should fail:

# client
curl <node-1-public-ip> -H "Authorization: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"

simongottschlag commented Feb 12, 2019

Hi!

I'm running Kubernetes clusters in Azure, deployed using Kubespray. I'm using CoreOS+Canal (Flannel+Calico)+Istio. I've seen intermittent issues where three different cluster (in the same region - west europe - using all availability zones) where internal communication stops working as expected between PODs.

For example will istio-injection stop working with the following error: (when looking at the replicaset using kubectl -n <namespaceName> describe rs <rsName>)
Error creating: Internal error occurred: failed calling admission webhook "sidecar-injector.istio.io": Post https://istio-sidecar-injector.istio-system.svc:443/inject?timeout=30s: dial tcp <clusterIP>:443: i/o timeout

The only solutions, right now, are to reboot the whole cluster or - which I just found out - restart kube-proxy like this:
kubectl -n kube-system delete pods -l k8s-app=kube-proxy

I'm not sure this is related to the issue you are describing, since it can take about a week until it happens. But when it happens, it does seem to look like your issue. And since you've seen it using the same technologies (CoreOS+VXLAN) in Azure, maybe it is.

Do you know if the issue you've found has been resolved?

Author

squat commented May 13, 2019

@simongottschlag, the underlying issue here ended up being on Azure's side. It turns out that Azure was rolling out a new hypervisor version that was not correctly performing checksum offloading. The solution for us was to disable tx checksum offloading: https://github.com/coreos/tectonic-installer/blob/0ec6b27c6d4ba56f03eef6425f52292aec20cb1c/modules/ignition/resources/services/tx-off.service

This is an undesirable workaround, since checksumming in the guest OS will be slower than when it is done by the host's hardware but it was the only way around.

JohnCMcDonough commented Aug 1, 2019

@simongottschlag, @squat we are having the exact same issues on both RancherOS and CoreOS. We experience issues when we create too many TCP/UDP connections. It's similar :443: i/o timeout problems. In some cases, the Kubernetes HTTP Health Checks even fail, and cause pods to restart.

I looked at the connection tracking events, and I see a bunch of un-responded VXLan connections. I'm not sure that the un-replied connections are bad, but I'd have to assume that they are.

How did you go about diagnosing these issues?

CoreOS is: Linux 4.19.50-coreos-r1 #1 SMP Mon Jul 1 19:07:03 -00 2019
RancherOS is: Linux 4.14.122-rancher #1 SMP Tue May 28 01:50:21 UTC 2019

 [UPDATE] udp      17 30 src=10.42.8.220 dst=10.43.0.10 sport=40736 dport=53 src=10.42.7.227 dst=10.42.8.220 sport=53 dport=40736
    [NEW] tcp      6 120 SYN_SENT src=10.42.8.220 dst=<APISERVER>		 sport=54872 dport=443 [UNREPLIED] src=<APISERVER>		 dst=<NODE1>		 sport=443 dport=54872
    [NEW] udp      17 30 src=10.42.8.216 dst=10.43.0.10 sport=35164 dport=53 [UNREPLIED] src=10.42.8.54 dst=10.42.8.216 sport=53 dport=35164
 [UPDATE] udp      17 30 src=10.42.8.216 dst=10.43.0.10 sport=35164 dport=53 src=10.42.8.54 dst=10.42.8.216 sport=53 dport=35164
    [NEW] tcp      6 120 SYN_SENT src=10.42.8.216 dst=<APISERVER>		 sport=41720 dport=443 [UNREPLIED] src=<APISERVER>		 dst=<NODE1>		 sport=443 dport=42151
    [NEW] tcp      6 120 SYN_SENT src=10.42.8.216 dst=<APISERVER>		 sport=41722 dport=443 [UNREPLIED] src=<APISERVER>		 dst=<NODE1>		 sport=443 dport=42152
    [NEW] tcp      6 120 SYN_SENT src=10.42.8.216 dst=<APISERVER>		 sport=41724 dport=443 [UNREPLIED] src=<APISERVER>		 dst=<NODE1>		 sport=443 dport=42153
    [NEW] tcp      6 120 SYN_SENT src=10.42.8.216 dst=<APISERVER>		 sport=41726 dport=443 [UNREPLIED] src=<APISERVER>		 dst=<NODE1>		 sport=443 dport=42155
    [NEW] udp      17 30 src=10.42.8.220 dst=10.43.0.10 sport=55019 dport=53 [UNREPLIED] src=10.42.8.54 dst=10.42.8.220 sport=53 dport=55019
 [UPDATE] udp      17 30 src=10.42.8.220 dst=10.43.0.10 sport=55019 dport=53 src=10.42.8.54 dst=10.42.8.220 sport=53 dport=55019
    [NEW] udp      17 30 src=10.42.8.220 dst=10.43.0.10 sport=38000 dport=53 [UNREPLIED] src=10.42.6.102 dst=10.42.8.220 sport=53 dport=38000
    [NEW] udp      17 30 src=<NODE1>		 dst=<NODE2>		 sport=51417 dport=8472 [UNREPLIED] src=<NODE2>		 dst=<NODE1>		 sport=8472 dport=51417
    [NEW] udp      17 30 src=10.42.8.216 dst=10.43.0.10 sport=36690 dport=53 [UNREPLIED] src=10.42.8.54 dst=10.42.8.216 sport=53 dport=36690
    [NEW] udp      17 30 src=10.42.8.220 dst=10.43.0.10 sport=47599 dport=53 [UNREPLIED] src=10.42.6.102 dst=10.42.8.220 sport=53 dport=47599
 [UPDATE] udp      17 30 src=10.42.8.216 dst=10.43.0.10 sport=36690 dport=53 src=10.42.8.54 dst=10.42.8.216 sport=53 dport=36690
    [NEW] udp      17 30 src=<NODE2>		 dst=<NODE1>		 sport=58351 dport=8472 [UNREPLIED] src=<NODE1>		 dst=<NODE2>		 sport=8472 dport=58351
    [NEW] udp      17 30 src=<NODE2>		 dst=<NODE1>		 sport=48317 dport=8472 [UNREPLIED] src=<NODE1>		 dst=<NODE2>		 sport=8472 dport=48317
 [UPDATE] udp      17 30 src=10.42.8.220 dst=10.43.0.10 sport=38000 dport=53 src=10.42.6.102 dst=10.42.8.220 sport=53 dport=38000
 [UPDATE] udp      17 30 src=10.42.8.220 dst=10.43.0.10 sport=47599 dport=53 src=10.42.6.102 dst=10.42.8.220 sport=53 dport=47599
    [NEW] udp      17 30 src=10.42.8.220 dst=10.43.0.10 sport=35151 dport=53 [UNREPLIED] src=10.42.6.102 dst=10.42.8.220 sport=53 dport=35151
    [NEW] udp      17 30 src=<NODE1>		 dst=<NODE2>		 sport=50255 dport=8472 [UNREPLIED] src=<NODE2>		 dst=<NODE1>		 sport=8472 dport=50255
    [NEW] udp      17 30 src=10.42.8.216 dst=10.43.0.10 sport=33510 dport=53 [UNREPLIED] src=10.42.6.102 dst=10.42.8.216 sport=53 dport=33510
    [NEW] udp      17 30 src=<NODE1>		 dst=<NODE2>		 sport=33287 dport=8472 [UNREPLIED] src=<NODE2>		 dst=<NODE1>		 sport=8472 dport=33287
    [NEW] udp      17 30 src=<NODE2>		 dst=<NODE1>		 sport=47903 dport=8472 [UNREPLIED] src=<NODE1>		 dst=<NODE2>		 sport=8472 dport=47903
 [UPDATE] udp      17 30 src=10.42.8.220 dst=10.43.0.10 sport=35151 dport=53 src=10.42.6.102 dst=10.42.8.220 sport=53 dport=35151
    [NEW] tcp      6 120 SYN_SENT src=10.42.8.220 dst=<APISERVER>		 sport=54884 dport=443 [UNREPLIED] src=<APISERVER>		 dst=<NODE1>		 sport=443 dport=54884
    [NEW] tcp      6 120 SYN_SENT src=10.42.8.220 dst=<APISERVER>		 sport=54886 dport=443 [UNREPLIED] src=<APISERVER>		 dst=<NODE1>		 sport=443 dport=54886
    [NEW] udp      17 30 src=<NODE2>		 dst=<NODE1>		 sport=58307 dport=8472 [UNREPLIED] src=<NODE1>		 dst=<NODE2>		 sport=8472 dport=58307
 [UPDATE] udp      17 30 src=10.42.8.216 dst=10.43.0.10 sport=33510 dport=53 src=10.42.6.102 dst=10.42.8.216 sport=53 dport=33510
    [NEW] tcp      6 120 SYN_SENT src=10.42.8.220 dst=<APISERVER>		 sport=54888 dport=443 [UNREPLIED] src=<APISERVER>		 dst=<NODE1>		 sport=443 dport=42156
    [NEW] udp      17 30 src=10.42.8.216 dst=10.43.0.10 sport=35140 dport=53 [UNREPLIED] src=10.42.6.103 dst=10.42.8.216 sport=53 dport=35140
    [NEW] udp      17 30 src=<NODE1>		 dst=<NODE2>		 sport=44448 dport=8472 [UNREPLIED] src=<NODE2>		 dst=<NODE1>		 sport=8472 dport=44448
    [NEW] udp      17 30 src=<NODE2>		 dst=<NODE1>		 sport=38103 dport=8472 [UNREPLIED] src=<NODE1>		 dst=<NODE2>		 sport=8472 dport=38103
    [NEW] udp      17 30 src=10.42.8.220 dst=10.43.0.10 sport=42411 dport=53 [UNREPLIED] src=10.42.6.102 dst=10.42.8.220 sport=53 dport=42411
    [NEW] udp      17 30 src=<NODE1>		 dst=<NODE2>		 sport=55589 dport=8472 [UNREPLIED] src=<NODE2>		 dst=<NODE1>		 sport=8472 dport=55589
    [NEW] udp      17 30 src=10.42.8.216 dst=10.43.0.10 sport=37625 dport=53 [UNREPLIED] src=10.42.6.102 dst=10.42.8.216 sport=53 dport=37625
 [UPDATE] udp      17 30 src=10.42.8.216 dst=10.43.0.10 sport=35140 dport=53 src=10.42.6.103 dst=10.42.8.216 sport=53 dport=35140
    [NEW] udp      17 30 src=<NODE2>		 dst=<NODE1>		 sport=44600 dport=8472 [UNREPLIED] src=<NODE1>		 dst=<NODE2>		 sport=8472 dport=44600
    [NEW] udp      17 30 src=<NODE2>		 dst=<NODE1>		 sport=44577 dport=8472 [UNREPLIED] src=<NODE1>		 dst=<NODE2>		 sport=8472 dport=44577
 [UPDATE] udp      17 30 src=10.42.8.220 dst=10.43.0.10 sport=42411 dport=53 src=10.42.6.102 dst=10.42.8.220 sport=53 dport=42411
 [UPDATE] udp      17 30 src=10.42.8.216 dst=10.43.0.10 sport=37625 dport=53 src=10.42.6.102 dst=10.42.8.216 sport=53 dport=37625
    [NEW] udp      17 30 src=10.42.8.220 dst=10.43.0.10 sport=57863 dport=53 [UNREPLIED] src=10.42.7.227 dst=10.42.8.220 sport=53 dport=57863
    [NEW] udp      17 30 src=10.42.8.216 dst=10.43.0.10 sport=49304 dport=53 [UNREPLIED] src=10.42.6.103 dst=10.42.8.216 sport=53 dport=49304
    [NEW] udp      17 30 src=<NODE1>		 dst=<NODE0>		 sport=37426 dport=8472 [UNREPLIED] src=<NODE0>		 dst=<NODE1>		 sport=8472 dport=37426
    [NEW] udp      17 30 src=<NODE1>		 dst=<NODE2>		 sport=51444 dport=8472 [UNREPLIED] src=<NODE2>		 dst=<NODE1>		 sport=8472 dport=51444
    [NEW] udp      17 30 src=<NODE0>		 dst=<NODE1>		 sport=51148 dport=8472 [UNREPLIED] src=<NODE1>		 dst=<NODE0>		 sport=8472 dport=51148
 [UPDATE] udp      17 30 src=10.42.8.220 dst=10.43.0.10 sport=57863 dport=53 src=10.42.7.227 dst=10.42.8.220 sport=53 dport=57863
    [NEW] udp      17 30 src=<NODE2>		 dst=<NODE1>		 sport=40052 dport=8472 [UNREPLIED] src=<NODE1>		 dst=<NODE2>		 sport=8472 dport=40052
 [UPDATE] udp      17 30 src=10.42.8.216 dst=10.43.0.10 sport=49304 dport=53 src=10.42.6.103 dst=10.42.8.216 sport=53 dport=49304
    [NEW] udp      17 30 src=10.42.8.220 dst=10.43.0.10 sport=56494 dport=53 [UNREPLIED] src=10.42.6.102 dst=10.42.8.220 sport=53 dport=56494
    [NEW] udp      17 30 src=10.42.8.216 dst=10.43.0.10 sport=48773 dport=53 [UNREPLIED] src=10.42.6.103 dst=10.42.8.216 sport=53 dport=48773
    [NEW] udp      17 30 src=<NODE1>		 dst=<NODE2>		 sport=40656 dport=8472 [UNREPLIED] src=<NODE2>		 dst=<NODE1>		 sport=8472 dport=40656
    [NEW] tcp      6 120 SYN_SENT src=10.42.8.220 dst=<APISERVER>		 sport=54890 dport=443 [UNREPLIED] src=<APISERVER>		 dst=<NODE1>		 sport=443 dport=54890
 [UPDATE] udp      17 30 src=10.42.8.220 dst=10.43.0.10 sport=56494 dport=53 src=10.42.6.102 dst=10.42.8.220 sport=53 dport=56494
    [NEW] tcp      6 120 SYN_SENT src=10.42.8.220 dst=<APISERVER>		 sport=54892 dport=443 [UNREPLIED] src=<APISERVER>		 dst=<NODE1>		 sport=443 dport=54892
    [NEW] tcp      6 120 SYN_SENT src=10.42.8.220 dst=<APISERVER>		 sport=54894 dport=443 [UNREPLIED] src=<APISERVER>		 dst=<NODE1>		 sport=443 dport=42157
    [NEW] udp      17 30 src=10.42.8.216 dst=10.43.0.10 sport=43494 dport=53 [UNREPLIED] src=10.42.7.227 dst=10.42.8.216 sport=53 dport=43494
    [NEW] udp      17 30 src=<NODE1>		 dst=<NODE0>		 sport=47951 dport=8472 [UNREPLIED] src=<NODE0>		 dst=<NODE1>		 sport=8472 dport=47951
    [NEW] udp      17 30 src=<NODE0>		 dst=<NODE1>		 sport=34967 dport=8472 [UNREPLIED] src=<NODE1>		 dst=<NODE0>		 sport=8472 dport=34967
 [UPDATE] udp      17 30 src=10.42.8.216 dst=10.43.0.10 sport=43494 dport=53 src=10.42.7.227 dst=10.42.8.216 sport=53 dport=43494
    [NEW] udp      17 30 src=10.42.8.220 dst=10.43.0.10 sport=55356 dport=53 [UNREPLIED] src=10.42.6.102 dst=10.42.8.220 sport=53 dport=55356
    [NEW] udp      17 30 src=<NODE1>		 dst=<NODE2>		 sport=57268 dport=8472 [UNREPLIED] src=<NODE2>		 dst=<NODE1>		 sport=8472 dport=57268
    [NEW] udp      17 30 src=10.42.8.216 dst=10.43.0.10 sport=43318 dport=53 [UNREPLIED] src=10.42.6.102 dst=10.42.8.216 sport=53 dport=43318
    [NEW] udp      17 30 src=<NODE1>		 dst=<NODE2>		 sport=53870 dport=8472 [UNREPLIED] src=<NODE2>		 dst=<NODE1>		 sport=8472 dport=53870
    [NEW] udp      17 30 src=10.42.8.220 dst=10.43.0.10 sport=53445 dport=53 [UNREPLIED] src=10.42.6.102 dst=10.42.8.220 sport=53 dport=53445
    [NEW] udp      17 30 src=<NODE1>		 dst=<NODE2>		 sport=43681 dport=8472 [UNREPLIED] src=<NODE2>		 dst=<NODE1>		 sport=8472 dport=43681
    [NEW] tcp      6 120 SYN_SENT src=<NODE1>		 dst=10.42.8.58 sport=40322 dport=3000 [UNREPLIED] src=10.42.8.58 dst=<NODE1>		 sport=3000 dport=40322

cpu=0   	found=400313 invalid=12616 ignore=5074737 insert=0 insert_failed=305053 drop=72 early_drop=0 error=0 search_restart=675768 
cpu=1   	found=256589 invalid=11834 ignore=2806813 insert=0 insert_failed=309626 drop=68 early_drop=0 error=0 search_restart=517878 
cpu=2   	found=408442 invalid=11884 ignore=5173739 insert=0 insert_failed=298185 drop=78 early_drop=0 error=0 search_restart=688967 
cpu=3   	found=261973 invalid=11870 ignore=3532019 insert=0 insert_failed=324700 drop=61 early_drop=0 error=1 search_restart=516916 
cpu=4   	found=409286 invalid=19503 ignore=5782260 insert=0 insert_failed=446986 drop=63 early_drop=0 error=0 search_restart=706798 
cpu=5   	found=267990 invalid=3978 ignore=3006542 insert=0 insert_failed=3480 drop=80 early_drop=0 error=0 search_restart=542639 
cpu=6   	found=403117 invalid=12264 ignore=5398374 insert=0 insert_failed=267665 drop=87 early_drop=0 error=2 search_restart=683459 
cpu=7   	found=256236 invalid=12384 ignore=2752017 insert=0 insert_failed=307132 drop=68 early_drop=0 error=0 search_restart=611942

simongottschlag commented Aug 1, 2019

Hi @JohnCMcDonough

If I remember correctly, our issue was related to kube-proxy and not VXLAN: kubernetes/kubernetes#71071

Author

squat commented Aug 1, 2019 via email

Hi @JohnCMcDonough, The problem for us ended up being due to an issue in the underlying hyper visor. Azure had just rolled out a pool of nodes running a new hyper visor and due to cache locality of the Container Linux image, all (most) of the clusters running on the latest release we’re being scheduled on the nodes with the hyper visor issue. El El jue, ago. 1, 2019 a las 20:37, Simon Gottschlag < [email protected]> escribió:

…

Hi @JohnCMcDonough <https://github.com/JohnCMcDonough> If I remember correctly, our issue was related to kube-proxy and not VXLAN: kubernetes/kubernetes#71071 <kubernetes/kubernetes#71071> — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <https://gist.github.com/1c2799c3565c383fe4b1499c101bfc49?email_source=notifications&email_token=AE4JAPY3RINRORLDIK3BHYLQCMUPVA5CNFSM4HMSHJGKYY3PNVWWK3TUL52HS4DFVNDWS43UINXW23LFNZ2KUY3PNVWWK3TUL5UWJTQAFWKXG#gistcomment-2987379>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AE4JAP5LVDMN5MEGA6BW5STQCMUPVANCNFSM4HMSHJGA> .

CoaxVex commented Dec 18, 2019

I believe I just ran into this while troubleshooting networking issues with OpenShift 4 on Azure. Running "ethtool --offload eth0 tx off" on the node fixes it.

squat/azure-vxlan.md

Azure VXLAN Issues

Introduction

Topology

Getting Started

VXLAN Configuration

Container Configuration

IPTables Configuration

Test VXLAN

simongottschlag commented Feb 12, 2019

Uh oh!

squat commented May 13, 2019

Uh oh!

JohnCMcDonough commented Aug 1, 2019

Uh oh!

simongottschlag commented Aug 1, 2019

Uh oh!

squat commented Aug 1, 2019 via email

Uh oh!

CoaxVex commented Dec 18, 2019

Uh oh!