I wanted to play Age of Empires II with my sister, and decided to use QEMU to get this working. A VM wouldn't clobber my sister's laptop, and perhaps unsurprisingly, this excellent old game runs very smoothly in a guest.
I wanted my setup to look like this:
                 ,--- internet ---
            wifi |
           /     `--- LAN
          /           |--- desktop 1 -- QEMU VM2
 laptop -- QEMU VM1   `--- desktop 2 -- QEMU VM3
I wanted all VMs to be connected to the same LAN, but not the internet. This took me a while to get working, so I thought I'd do a quick writeup here. This is basically a rewrite of mcastelino's gist, along with my own thoughts.
QEMU has many networking options. TAP devices with a Linux bridge is probably the most common setup, but it requires a lot of iptables, sudo-ing and networking-foo to get working. It would also require messing with the laptop's host network config, which I didn't want to do. So here's a way to do this using only user-space tools and UDP.
- Relatively unintrusive and simple to set up
- No superuser needed (unless bridging to a TAP device)
- A bit slow (see performance below)
- Requires multiple running socat processes
- Traffic unencrypted
QEMU's socket and dgram options simply wrap the Ethernet frames in UDP packets. I don't think the documentation mentions that. Perhaps it's obvious to everyone but me. This allows for some nice tricks, like bridging a TAP interface on the host with a multicast group as mcastelino describes in his gist, and bridging UDP unicast connections from the outside world.
QEMU's socket,mcast=<multicast-address>:<port> option is a simple way of making a VLAN for your guests that works across hosts (provided they are multicast-reachable, usually on the same LAN). It's probably not an accident that UDP multicast was chosen to transport Ethernet frames: like Ethernet, it is stateless and connectionless. It also requires no extra infrastructure besides a working multicast setup, it's host OS-independent, and it requires fewer privileges and less configuration on the host.
As far as I can tell, the mcast option is the only one that will let you connect multiple (more than 2) VMs into the same VLAN without using a host bridge or a similar mechanism. At least, the documentation explicitly mentions that this option has that ability.
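So to wire up more guests, you can (as far as I understand it) just launch each one with the same mcast address and a unique MAC, on any multicast-reachable host. A sketch, where linux2.img stands in for the second guest's image:

qemu-system-x86_64 linux2.img \
    -device e1000,netdev=n1,mac=52:54:00:12:34:57 \
    -netdev socket,id=n1,mcast=230.0.0.1:33000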
Since my networking skills are limited, I didn't really know how to debug when I encountered problems, so I'll try to describe what I did to get this working.
First, launch a VM with the mcast setup. QEMU's manpage suggests this:
qemu-system-x86_64 linux.img \
-device e1000,netdev=n1,mac=52:54:00:12:34:56 \
-netdev socket,id=n1,mcast=230.0.0.1:33000
Here's what that means:
Bind a UDP socket to listen for multicast traffic on 230.0.0.1 on the specified port. All Ethernet frames coming from the guest will be wrapped in a UDP packet and sent off to this multicast address. All UDP packets coming in on this socket will be sent directly to the guest VM's NIC. The host will be subscribed to this multicast group.
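If you want to see the wrapping from the outside, tcpdump on the host should show a UDP packet to the multicast group every time the guest's NIC sends a frame:

user@host $ sudo tcpdump -ni any udp port 33000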
What's nice (and unique?) about multicast addresses is that multiple processes can bind to the same address and port - and all bound processes receive a copy of all packets. In my experience, this wasn't always reliable with non-multicast addresses when you had multiple processes listening on the same port (reuseaddr). That part still confuses me. Maybe it's intended for load-balancing?
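You can play with this behaviour using socat alone, no QEMU involved. A sketch, reusing the group and port from above:

# run this receiver in two separate terminals:
user@host $ socat -u UDP4-RECV:33000,reuseaddr,ip-add-membership=230.0.0.1:127.0.0.1 -
# then send one datagram; both receivers should print it:
user@host $ echo hello | socat - UDP4-DATAGRAM:230.0.0.1:33000,reuseaddr,ip-add-membership=230.0.0.1:127.0.0.1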
But multicast is a better option anyhow, because it allows VMs to join from other hosts and VLAN traffic will only reach hosts which subscribe to this multicast address. So if only one host is doing this, the traffic will stop at my first switch. At least that's how I think multicast works.
Now, with your first guest up and running, we can make a silly test. Even with nothing properly configured on the guest, we can see how the underlying mechanism works. Start tcpdump on your guest, making sure it listens on the correct NIC. In my case, that's ens3:
root@guest # tcpdump
listening on ens3, link-type EN10MB (Ethernet), snapshot length 262144 bytes
Now, from your host, run this silly command:
user@host $ echo 1111111111111111111111111111111 | socat -x - \
UDP4-DATAGRAM:230.0.0.1:33000,reuseaddr,ip-add-membership=230.0.0.1:127.0.0.1
tcpdump should output something like this:
13:21:56.020303 31:31:31:31:31:31 (oui Unknown) > 31:31:31:31:31:31 (oui Unknown), ethertype Unknown (0x3131), length 32:
0x0000: 3131 3131 3131 3131 3131 3131 3131 3131 1111111111111111
0x0010: 310a 1.
See? Isn't that fun? Here we're seeing what I was hoping to demonstrate: the raw Ethernet frames inside UDP packets. We just made a bogus Ethernet frame and sent it to this multicast group. All QEMU guests on this mcast group will see this Ethernet frame. Once I was able to test like this, I found it easier to debug.
Since TAP devices also expect Ethernet frames¹, it is straightforward to make a TAP device which bridges QEMU multicast VLANs. That's why mcastelino's socat command works.
Note that if you want to go "bidirectional" with socat, its socket must be bound to the right port. Your socat socket should have the same properties as QEMU's socket. You can check using ss -lunp, for example. I eventually got that working with the right bind option. I don't know why it worked without that for mcastelino. sourceport has no effect for me.
user@host $ socat -x - UDP4-DATAGRAM:230.0.0.1:33000,reuseaddr,ip-add-membership=230.0.0.1:127.0.0.1,bind=230.0.0.1:33000 &
user@host $ sudo ss -lunp | grep 33000
UNCONN 0 0 230.0.0.1:33000 0.0.0.0:* users:(("socat",pid=9746,fd=5))
UNCONN 0 0 230.0.0.1:33000 0.0.0.0:* users:(("qemu-system-x86",pid=6652,fd=11))
Now, with these working socat options, I was able to bridge a TAP device with my multicast VLAN. Here's a considerably more useful example than hand-written Ethernet frames using echo:
user@host $ sudo socat \
TUN:10.0.3.1/24,tun-type=tap,iff-no-pi,iff-up,iff-debug,tun-name=vmvlan0 \
UDP4-DATAGRAM:230.0.0.1:33000,reuseaddr,ip-add-membership=230.0.0.1:10.0.3.1,bind=230.0.0.1:33000
This is basically mcastelino's socat command, but with the bind option. A vmvlan0 interface pops up on the host and can be used to run dnsmasq or tcpdump directly on the host.
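For example, you can now watch all VLAN traffic from the comfort of the host:

user@host $ sudo tcpdump -ni vmvlan0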
You can run a DHCP server on one of the guests, or you can run one from the host using the socat bridge above. The dnsmasq command from mcastelino's gist worked out of the box for me. Here it is, rewritten with CLI arguments instead of a config file:
sudo -E dnsmasq -C /dev/null -d --bind-dynamic --interface=vmvlan0 --dhcp-range=10.0.3.100,10.0.3.200 --leasefile-ro
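On the guest side, you then request a lease as usual. Assuming the guest has dhclient (your distro may use systemd-networkd or another client instead):

root@guest # dhclient -v ens3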
See this SO thread. systemd defaults to using /etc/machine-id, and not the MAC address, as the DHCP client identifier, so guests cloned from the same image will be handed the same lease. You'll need to consider that before launching multiple guests based off the same image.
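One way around it (my own workaround, not something from the gist) is to regenerate the machine-id on each cloned guest:

root@guest # rm /etc/machine-id
root@guest # systemd-machine-id-setup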
If you want your VLAN to reach beyond your multicast range, there are many options. A proper VPN solution is usually the best choice for robust permanent setups.
However, QEMU can wrap Ethernet frames in both multicast and unicast UDP packets. The latter can be used for VLANs across a WAN, for example, where guest 1 could open firewalls and listen, and guest 2 could connect. This would require no TAP devices or external tools. It would limit the VLAN to 2 guests, however.
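A sketch of what that could look like, using the dgram syntax from the performance test further down (the public IPs are placeholders):

# on host A (public IP 203.0.113.1), with UDP port 33000 opened in the firewall:
qemu-system-x86_64 ... \
    -netdev dgram,id=n1,local.type=inet,local.host=0.0.0.0,local.port=33000,remote.type=inet,remote.host=198.51.100.2,remote.port=33000 \
    -device virtio-net-pci,netdev=n1,mac=52:54:00:aa:bb:01
# on host B (public IP 198.51.100.2): the same command with remote.host=203.0.113.1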
With our existing multicast setup though, it's fairly straightforward to bridge a UDP unicast connection to our multicast group. That way, 1 WAN guest can join the network of multiple LAN guests.
user@host $ socat -x UDP4-LISTEN:33001 UDP4-DATAGRAM:230.0.0.1:33000,reuseaddr,ip-add-membership=230.0.0.1:127.0.0.1,bind=230.0.0.1:33000
socat's UDP4-LISTEN uses the source address of the first packet it receives as its peer. This means we can only have 1 peer per invocation, which may be a good thing considering there is no security involved. For the same reason, socat must be restarted if the remote changes.
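The remote guest (VM1 on my laptop in the diagram above) then points a dgram netdev at the bridge, something like this (placeholder IP again):

qemu-system-x86_64 ... \
    -netdev dgram,id=n1,remote.type=inet,remote.host=203.0.113.1,remote.port=33001,local.type=inet,local.host=0.0.0.0,local.port=33001 \
    -device virtio-net-pci,netdev=n1,mac=52:54:00:aa:bb:02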
I often find myself suddenly no longer receiving the multicast packets I expect.
The interface (or listening address) in ip-add-membership must be correct, otherwise socat won't be able to receive any packets - including ones coming from localhost. This membership is what informs the kernel and the LAN that we want to receive packets for this multicast group. However, if another process adds a membership, it applies to the host as a whole, so things will appear to work. You can use ip maddr to see current multicast subscriptions.
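So when in doubt, check whether the group is actually joined on the interface you expect. Something like this should print a line for it:

user@host $ ip maddr show | grep 230.0.0.1
        inet  230.0.0.1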
I ran a guest with four NICs to test their speeds:
qemu-system-x86_64 -accel kvm -m 1G -drive if=virtio,file=arch.qcow2 -nographic \
    -nic user,model=virtio-net-pci,id=n2 \
    -netdev tap,id=n3,ifname=tap0,script=no \
    -device virtio-net-pci,netdev=n3,mac=52:54:00:11:22:03 \
    -netdev socket,id=n4,mcast=230.0.0.1:12345 \
    -device virtio-net-pci,netdev=n4,mac=52:54:00:11:22:04 \
    -netdev dgram,id=n5,remote.type=inet,remote.port=12345,remote.host=127.0.0.1,local.type=inet,local.host=127.0.0.1,local.port=12346 \
    -device virtio-net-pci,netdev=n5,mac=52:54:00:11:22:05
I'm running iperf3 -c 10.0.x.2 on the guest, where x is the appropriate subnet. I get:
| netdev | type  | speed      | helpers |
|--------|-------|------------|---------|
| n2     | user  | 1 Gbit/s   |         |
| n3     | tap   | 30 Gbit/s  | ip a add 10.0.3.2 dev tap0 |
| n4     | mcast | 500 Mbit/s | socat UDP4-DATAGRAM:230.0.0.1:12345,bind=230.0.0.1:12345,… TUN:10.0.4.2/24,tun-type=tap,… |
| n5     | dgram | 600 Mbit/s | socat UDP4-DATAGRAM:127.0.0.1:12346,bind=127.0.0.1:12345,… TUN:10.0.5.2/24,tun-type=tap,… |
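The other end of each test is just an iperf3 server on the host, listening on the addresses above:

user@host $ iperf3 -s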
With this all combined, I have the setup I wanted. I don't know how performant my VLAN is with all these socat bridges, but it's more than enough for a bit of Age of Empires multiplayer.
What I've come to realize is that since multicast packets are sent and received by every member, it becomes an "Ethernet hub" onto which everything else is connected:
           ,- laptop QEMU VM1
          /
         /   (Internet)
        /
       ,- socat mcast <=> UDP-LISTEN:33001
      /
======================================= multicast 230.0.0.1:33000
      \     \     \
       \     \     `- desktop2 QEMU VM3
        \     `- desktop1 QEMU VM2
         `- socat mcast <=> TAP
                             `- desktop1 dnsmasq
I didn't know multicast could be used like that. It's nice and flexible, and relatively efficient in that only multicast group members will be exposed to all the traffic.