I spent some time looking into Cloud NAT. Unfortunately, it doesn't look like we can use it short of migrating to private clusters.
The example GKE Setup document specifies a private cluster in its walkthrough, but doesn't really make it clear that it's a hard requirement.
This passage from the Cloud NAT Overview document's section on cases where NAT is not performed actually implies that it's usable on some non-private clusters:
Regular (non-private) GKE clusters assign each node an external IP address, so such clusters cannot use Cloud NAT to send packets from the node's primary interface. Pods can still use Cloud NAT if they send packets with source IP addresses set to the pod IP.
The wording there implies that we can use Cloud NAT if we can rely on the source IP being the pod IP, which some people - myself included - took to suggest it should work in a public cluster as long as VPC-native is enabled.
However, the troubleshooting page for Cloud NAT is quite clear on this:
Make sure your GKE is a private cluster. You can only use Cloud NAT with private clusters.
This contradiction made me hopeful, but I couldn't get internet traffic routing through Cloud NAT on a test cluster - my source IP was always the node's external IP, not an IP allocated to the Cloud NAT gateway.
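A minimal sketch of how that check could be scripted, in case anyone wants to repeat it. It only classifies an observed egress IP against two lists you supply yourself; the function name and the addresses (taken from the documentation-reserved ranges) are placeholders, not values from our clusters.

```python
import ipaddress


def classify_egress_ip(observed, node_external_ips, nat_ips):
    """Report which known address an observed egress IP corresponds to.

    All three arguments are plain strings / lists of strings. The names
    are hypothetical; nothing here calls a GCP API.
    """
    ip = ipaddress.ip_address(observed.strip())
    if ip in {ipaddress.ip_address(a) for a in nat_ips}:
        return "cloud-nat"
    if ip in {ipaddress.ip_address(a) for a in node_external_ips}:
        return "node-external-ip"
    return "unknown"


# Placeholder addresses from the RFC 5737 documentation ranges.
node_ips = ["203.0.113.10", "203.0.113.11"]
nat_ips = ["198.51.100.7"]
print(classify_egress_ip("203.0.113.10\n", node_ips, nat_ips))
```

The `observed` value would be whatever a "what is my IP" style service reports when queried from inside a pod. In my test it always fell into the `node-external-ip` case.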
There's some relevant discussion in this StackOverflow question:
The idea here is that if you use native VPC (Ip alias) for the cluster, your pods will not use SNAT when routing out of the cluster. With no SNAT, the pods will not use the node's external IP and thus should use the Cloud NAT.
Unfortunately, this is not currently the case. While Cloud NAT is still in Beta, certain settings are not fully in place and thus the pods are still using SNAT even with IP aliasing. Because of the SNAT to the node's IP, the pods will not use Cloud NAT.
It seems that the docs supported this use case at the time of those comments (late October 2018), but that it didn't work in practice. Evidently, they've now (mostly) cleaned up the docs to make it clear that you must run a private cluster in order to use Cloud NAT.
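To make the reasoning in the quoted passages explicit, here is a toy model of the decision as I understand it. This is only my reading of the docs and the StackOverflow answer, not a description of how Cloud NAT is actually implemented, and the function and parameter names are made up for illustration.

```python
def egress_via_cloud_nat(node_has_external_ip, pod_traffic_snat_to_node):
    """Toy model: would a pod's outbound packet go through Cloud NAT?

    Assumption (from the quoted docs): Cloud NAT does not handle packets
    sent from a node interface that has its own external IP.
    """
    if pod_traffic_snat_to_node:
        # The packet leaves with the node's IP as its source, so it is
        # treated like node traffic.
        return not node_has_external_ip
    # The packet keeps the pod IP as its source, which the overview
    # document says is eligible for Cloud NAT.
    return True


# Public cluster, pods still SNAT'd to the node: what I observed.
print(egress_via_cloud_nat(True, True))    # False
# Private cluster (no node external IPs): the supported setup.
print(egress_via_cloud_nat(False, True))   # True
```

The middle case (public cluster, no SNAT to the node) is the one the overview document seems to promise and the one that didn't work in my test.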
Regardless, even if that approach worked it wouldn't be much help to us, since you can't enable VPC-native on an existing cluster.
Let's discuss next steps in retro tomorrow.