I'm writing this up as a gist, because I'm not sure I'll pursue any of it, but it seems worth writing down.
A basic problem with IPv6-only LANs at the moment is that you still need to reach servers that only speak the legacy protocol, which means running some kind of DNS64 and NAT64 (stateful IPv6-to-IPv4 translation).
Currently, NAT64 only exists outside the kernel, as userspace software. This is not ideal because it limits available performance (since you have to keep bouncing between user space and kernel space), requires reimplementing a lot of things that already exist in the kernel (e.g. conntrack logic), and isn't composable with other existing kernel subsystems to implement flexible, mad network things.
The proposal: implement a virtual interface type that does SIIT (stateless IP/ICMP translation). By combining such an interface with existing NAT66 and connmark logic, you can implement all kinds of fancy NAT64 setups, all in-kernel.
The assumed topology here is that we're a dual-stack router, trying to provide legacy IP service to a v6-only LAN. In other words:
LAN (v6 only) <--> Router (dual-stack) <.-> Modern Internet
                                        |
                                        `-> Legacy Internet
Let's say the LAN is 2001:db8:cafe::/64. The router is 2001:db8:cafe::1 on the LAN, and 2001:db8:deed::1729 and 192.0.2.16 on the WAN.
We assume that the LAN already has access to a DNS server that does DNS64, translating A records into AAAA records using IPv4-translated addresses. In other words, a response A 1.2.3.4 reaches the LAN clients as AAAA ::ffff:0:1.2.3.4.
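As a rough illustration of the address mapping involved, the sketch below converts between a dotted-quad IPv4 address and its IPv4-translated form. The function names v4_to_translated and translated_to_v4 are made up for this example, and it only understands the ::ffff:0:a.b.c.d spelling used in this write-up, not other equivalent ways of writing the same IPv6 address.

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only.

# Print the IPv4-translated IPv6 form (::ffff:0:a.b.c.d) of a dotted quad.
v4_to_translated() {
    case "$1" in
        ''|*[!0-9.]*|*.) return 1 ;;   # only digits and dots, no trailing dot
    esac
    # Split the argument on dots into positional parameters.
    old_ifs=$IFS
    IFS=.
    set -- $1
    IFS=$old_ifs
    # Require exactly four non-empty octets, each at most 255.
    [ "$#" -eq 4 ] || return 1
    for octet in "$@"; do
        [ -n "$octet" ] && [ "$octet" -le 255 ] || return 1
    done
    printf '::ffff:0:%s.%s.%s.%s\n' "$1" "$2" "$3" "$4"
}

# Print the embedded IPv4 address of an ::ffff:0:a.b.c.d address.
translated_to_v4() {
    case "$1" in
        ::ffff:0:*.*.*.*) printf '%s\n' "${1#::ffff:0:}" ;;
        *) return 1 ;;                 # not in the form this sketch handles
    esac
}
```

For example, `v4_to_translated 1.2.3.4` prints the AAAA value from the paragraph above, and `translated_to_v4` reverses it.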
Packets arriving at the router from the LAN are therefore all native IPv6. Most will have native v6 source and destination addresses:
src: 2001:db8:cafe::f4bc
dst: 2001:db8:f00f::42
However, legacy sites will have been mapped by DNS64, and the packets will look like:
src: 2001:db8:cafe::f4bc
dst: ::ffff:0:1.2.3.4
Our mission is to get the latter packets out to the WAN looking like this:
src: 192.0.2.16
dst: 1.2.3.4
And we need enough conntrack information to translate it all back to the v6 packet on the way back in.
This is the one new piece of code we need. It's a new virtual interface type (similar to gre, ipip, tun, ...) for the Linux kernel. Call it siit0. It acts as a zero-configuration hairpin SIIT translator, doing the following to packets it receives:
- IPv4 packets are rewritten to IPv6 (using IPv4-translated addresses).
- IPv6 packets, whose source and destination IPs are both IPv4-translated addresses, are rewritten to IPv4 and input back into the network stack.
- Other IPv6 packets are dropped.
Of note is that this interface does not carry any state at all. Packets which match its constraints are statelessly rewritten and returned to the kernel as if transmitted by some remote host. Packets which cannot sensibly be translated are simply dropped. It's up to the system administrator to only route "good" packets to this interface.
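The three rules can be written down as a tiny decision function. This is only a sketch of the proposed behavior, not kernel code: siit_verdict and is_translated are hypothetical names, and the address check is a plain text match on the ::ffff:0:a.b.c.d spelling used in this write-up.

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only.

# Succeed if the argument is written as an IPv4-translated IPv6 address.
is_translated() {
    case "$1" in
        ::ffff:0:*.*.*.*) return 0 ;;
        *) return 1 ;;
    esac
}

# Print what the proposed siit0 would do with a packet.
# Usage: siit_verdict <ip-version: 4|6> <source> <destination>
siit_verdict() {
    version=$1
    src=$2
    dst=$3
    if [ "$version" = 4 ]; then
        # IPv4 packets are always rewritten to IPv6.
        echo "rewrite-to-ipv6"
    elif is_translated "$src" && is_translated "$dst"; then
        # IPv6 packets qualify only when both addresses are IPv4-translated.
        echo "rewrite-to-ipv4"
    else
        echo "drop"
    fi
}
```

Note that a packet straight from a LAN client, such as `siit_verdict 6 2001:db8:cafe::f4bc ::ffff:0:1.2.3.4`, falls into the drop case because its source is a native v6 address. That is why the setups that follow put a NAT66 step in front of the interface.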
Turns out, this is the one missing piece that lets us implement a decent variety of NAT64 setups, under different operating conditions.
The basic home router case is: single IPv4 WAN IP, and some IPv6 subnet routed for LAN use.
We receive the following packet on lan0:
src: 2001:db8:cafe::f4bc
dst: ::ffff:0:1.2.3.4
To get it out to the legacy internet, we need one additional thing: an extra IPv4 address that will remain entirely local to the router. Some locally-unused RFC1918 address will do the trick, say 192.168.254.254. We add the following configuration:
ip -6 route add ::ffff:0:0:0/96 dev siit0
ip6tables -t nat -A POSTROUTING -o siit0 -j SNAT --to-source ::ffff:0:192.168.254.254
ip route add 192.168.254.254/32 dev siit0
ip route add default via 192.0.2.1
iptables -t nat -A POSTROUTING -o wan0 -j MASQUERADE
(If you already have IPv4 configured and routable, the latter parts might already be done. I'm assuming you have no IPv4 routing configuration at all, so the router can speak IPv4 but nothing else can.)
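One thing the commands above don't cover is packet forwarding itself. Assuming it isn't already enabled on the router (an assumption, since a box starting with no IPv4 routing configuration may well have it off), something like the following would be needed too:

```shell
# Assumption: forwarding is not yet enabled. Run as root.
# Let the kernel forward IPv4 packets (e.g. from siit0 out to wan0).
sysctl -w net.ipv4.ip_forward=1
# Let the kernel forward IPv6 packets (e.g. from lan0 to siit0).
sysctl -w net.ipv6.conf.all.forwarding=1
```

These settings don't survive a reboot unless they are also written to the system's sysctl configuration.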
With this configuration, the outbound packet path is as follows:
- Packet arrives on lan0, source 2001:db8:cafe::f4bc, destination ::ffff:0:1.2.3.4.
- Route lookup matches ::ffff:0:0:0/96, forwards the packet to siit0.
- Netfilter rule does NAT66, changes the packet source to ::ffff:0:192.168.254.254.
- siit0 translates the packet from IPv6 to IPv4, and bounces it back to the kernel.
- Packet arrives on siit0, source 192.168.254.254 and destination 1.2.3.4.
- Route lookup matches 0.0.0.0/0, forwards the packet to wan0.
- Netfilter rule does NAT44, changes the packet source to 192.0.2.16.
The return packet path is:
- Packet arrives on wan0, source 1.2.3.4, destination 192.0.2.16.
- Conntrack rewrites the destination IP to 192.168.254.254.
- Route lookup matches 192.168.254.254/32, forwards the packet to siit0.
- siit0 translates the packet from IPv4 to IPv6, and bounces it back to the kernel.
- Packet arrives on siit0, source ::ffff:0:1.2.3.4, destination ::ffff:0:192.168.254.254.
- Conntrack rewrites the destination IP to 2001:db8:cafe::f4bc.
- Route lookup matches 2001:db8:cafe::/64, forwards the packet to lan0.
This setup requires a double NAT: NAT66 before SIIT, and NAT44 afterwards. This is because the WAN IP 192.0.2.16 is assigned to wan0. If we didn't do NAT44, the first route lookup on the return path would get a hit in the magic "local" routing table (ip route show table local), and divert the packet into the "destination is local" codepath, rather than the forwarding codepath.
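To see this on a live router, the local table can be inspected directly. This is just a diagnostic sketch using the example addresses from this write-up; the exact output depends on the system.

```shell
# List the kernel-maintained "local" table. Addresses assigned to
# interfaces (such as 192.0.2.16 on wan0) appear here as local routes.
ip -4 route show table local

# Ask the kernel how it would route a packet addressed to the WAN IP.
ip route get 192.0.2.16
```

If the second command reports a local route, packets for that address are delivered to the router itself instead of being forwarded.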
The SMB case is similar to the home router one, except that now we don't have a single IPv4 WAN address, but two: 192.0.2.16 and 192.0.2.17.
In this setup, we can simplify a little bit and dispense with the NAT44 stage, by using 192.0.2.17 as the "intermediate" IP for SIIT. In effect, we turn 192.0.2.17 into a SIIT translator and gateway, from the ISP's point of view. The configuration is similar, but simpler:
ip -6 route add ::ffff:0:0:0/96 dev siit0
ip6tables -t nat -A POSTROUTING -o siit0 -j SNAT --to-source ::ffff:0:192.0.2.17
ip route add 192.0.2.17/32 dev siit0
ip route add default via 192.0.2.1
With this configuration, the outbound packet path is as follows:
- Packet arrives on lan0, source 2001:db8:cafe::f4bc, destination ::ffff:0:1.2.3.4.
- Route lookup matches ::ffff:0:0:0/96, forwards the packet to siit0.
- Netfilter rule does NAT66, changes the packet source to ::ffff:0:192.0.2.17.
- siit0 translates the packet from IPv6 to IPv4, and bounces it back to the kernel.
- Packet arrives on siit0, source 192.0.2.17 and destination 1.2.3.4.
- Route lookup matches 0.0.0.0/0, forwards the packet out wan0.
The return packet path is:
- Packet arrives on wan0, source 1.2.3.4, destination 192.0.2.17.
- Route lookup matches 192.0.2.17/32, forwards the packet to siit0.
- siit0 translates the packet from IPv4 to IPv6, and bounces it back to the kernel.
- Packet arrives on siit0, source ::ffff:0:1.2.3.4, destination ::ffff:0:192.0.2.17.
- Conntrack rewrites the destination IP to 2001:db8:cafe::f4bc.
- Route lookup matches 2001:db8:cafe::/64, forwards the packet to lan0.
The enterprise case is again similar, but with more IPs: this time we own all of 192.0.2.128/25 on the WAN side.
In this setup, we can alter the NAT66 step to use the entire IP range as possible source addresses, to expand our conntrack abilities beyond a single IP:
ip -6 route add ::ffff:0:0:0/96 dev siit0
ip6tables -t nat -A POSTROUTING -o siit0 -j SNAT --to-source ::ffff:0:192.0.2.128-::ffff:0:192.0.2.255
ip route add 192.0.2.128/25 dev siit0
ip route add default via 192.0.2.1
The outbound and return packet paths are similar to the above, just with more IPs.
All the setups above are notionally very similar: the siit0 interface acts like some remote SIIT 1:1 translator box. It's slightly more constrained in that the incoming v6 packets must already be in the v4-translated format (i.e. the SIIT box doesn't implement stateful NAT), so we have to do the NAT layer ourselves. But it shows that a small addition to the kernel would enable all major NAT64 scenarios using existing Linux network features.
Rather than implement the translation as an interface, maybe we could do it as a netfilter module. Do NAT66 as above, and then as the very last step in nat/POSTROUTING, do the stateless translation right before plopping the packet onto the wire.
Similarly on the input end, in raw/PREROUTING, statelessly translate back to IPv6 and then let the entire kernel live happily as v6-only.
This is cleaner in terms of how much of the system has to care about IPv4. However, it'll interfere with normal IPv4 operation on the WAN link, for things like DHCP and other ISP configuration protocols that are assigning the IPv4 in the first place. It'd work for the SMB and Enterprise cases above, but not for the minimal "single WAN IP" case.
Similar to the netfilter idea, the XFRM framework acts late in the packet processing stage, and could do the SIIT translation. The same problems exist as with the netfilter implementation.
The more programmable the kernel network stack, the fancier we can be here. I don't know enough about bpfilter to say whether it could do this.
By chance, I stumbled upon a BPF NAT64 implementation: https://github.com/xdp-project/bpf-examples/tree/master/nat64-bpf