
@bb01100100
Created September 30, 2025 06:17
Multi PrivateLink environment access from single VPC

Access multiple private-link networked Confluent Cloud environments from a single VPC

If you have a single VPC and use Private Link networking to access your Confluent Cloud clusters, then you'll quickly realise that you can only access a single Confluent Cloud network (Environment): the PL attachment has a regional DNS domain, and your VPC's DNS can only point that domain at one endpoint.

In organisations that segregate dev/uat/test/prod networks from each other, this is almost never an issue. However, if you have a flat network with a mixture of different clients, then this will be a challenge.
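To see the clash concretely, assume the first environment's PrivateLink DNS is already wired up with the usual wildcard record pointing at its VPC endpoint. Every hostname under the regional domain then resolves to that one endpoint, including the second environment's bootstrap name (cluster IDs here are illustrative and match the examples later on):

$ dig +short lkc-abc123.ap-southeast-2.aws.private.confluent.cloud
$ dig +short lkc-def678.ap-southeast-2.aws.private.confluent.cloud
# both return the same VPC endpoint IPs - a single wildcard record can only point at one environment's endpoint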

In order to connect to multiple (different) Confluent Cloud environments via Private Link, you'll need to either:

  • Use split-horizon DNS (e.g. different clients resolve to different PL endpoints)
  • Use a proxy

Split-horizon DNS is OK if individual clients do not need to access multiple Confluent Cloud environments; a proxy facilitates a single client having access to multiple environments - we'll focus on that.

Assumptions

  • You’ve provisioned your clusters and PrivateLink attachments
  • You are in AWS and have a VPC with working security groups and a Private Hosted Zone (PHZ) for your AWS region attached to it
  • You might already have one CC Env set up via PL and can hit one cluster from your VPC, but not the other - that’s fine.

Solution summary

  • Unless you restructure your network so that different groups of clients (in separate networks) get DNS that points to the “right” CC environment (often referred to as split-brain or split-horizon DNS), you need to introduce a routing component.
  • Even if you do have segmented networks, there might still be a valid use-case for a “multi-environment” client (e.g. Terraform) to interact with multiple environments and the data-plane in each.
  • Our solution is to use a mixture of DNS and a transparent proxy that routes traffic to the right Private Link endpoint based on a regex match on the hostname the client wants to connect to.

Solution details

  1. We have two Enterprise Clusters, each in a separate CC Environment (network) and each with a separate PrivateLink attachment.
  2. We have client apps that all run in one network and we want to ensure that if ClientA wants to connect to NonProd, it will be routed to the right Private Link endpoint. Also, if that same client uses a different set of configuration properties, it should be able to hit Prod if it wants. Whether we think that's a good idea or not is a topic for another day.
  3. This would be easy if all we had to deal with was some bootstrap cluster IDs, but with Enterprise Clusters we get a dynamic (sort of) broker hostname suffix rather than a distinguishing DNS domain suffix, so the "wildcard" rules of DNS domain names are insufficient: DNS doesn't allow us to use regexes like lkc-abc-[a-z][0-9]{3}.*, so we need something else - a smart load balancer (F5) or a proxy (nginx, haproxy).
  4. CC requires TLS-encrypted traffic, so we can inspect the Server Name Indication (SNI) field in the client's TLS handshake to see which host the encrypted traffic is destined for.
  5. SNI hostnames are... hostnames, so we need DNS to resolve the hostname.
  6. DNS must not resolve the broker hostnames to our CC cluster, because then we'd bypass the infrastructure (proxy) we're using to do our dynamic routing magic. So (this bit is crucial) all hostnames for Confluent Cloud need to resolve to our proxy.
  7. In AWS we use a Private Hosted Zone (PHZ) and attach it to our VPC so that we can define the hostname-to-IP-address mappings we need. Assume our (regional!) PHZ domain is ap-southeast-2.aws.private.confluent.cloud.
  8. We map the wildcard record *.ap-southeast-2.aws.private.confluent.cloud in our PHZ to our proxy.
  9. Any client in our VPC that wants to connect to a CC resource in ap-southeast-2.aws.private.confluent.cloud will first use AWS's DNS resolvers to determine the correct IP address to use - but now, that IP address will be our proxy server's.
  10. We use a wildcard PHZ entry for *.ap-southeast-2.aws.private.confluent.cloud with a CNAME that points to nginx-proxy.ap-southeast-2.aws.private.confluent.cloud, which in turn has an A record pointing to the private IP address of the EC2 instance running the proxy (see the Route 53 sketch after the nginx snippet below).
  11. With DNS sorted, the actual proxy stuff is simple and is mostly the same as all our other nginx proxy configs, except:
  12. We use a regex on the cluster ID to map to an upstream server definition
  13. The upstream server definition points to the correct private-link endpoint. This abstraction isn’t strictly needed (e.g. our map could point straight to a server instead of an upstream) but it’s tidy:
map $ssl_preread_server_name $kafka_target {
    "~^lkc-abc123.*\.ap-southeast-2\.aws\.private\.confluent\.cloud$" non_prod_cluster;
    "~^lkc-def678.*\.ap-southeast-2\.aws\.private\.confluent\.cloud$" prod_cluster;
    default non_prod_cluster;
}

upstream non_prod_cluster {
    server vpce-01e2d3a44c57561b7-uewjm6ol.vpce-svc-0ab3bb0ce8d63ed8f.ap-southeast-2.vpce.amazonaws.com:9092;
}
upstream prod_cluster {
    server vpce-07ed6925ef4eb3cf2-rffdxwe9.vpce-svc-0ab3bb0ce8d63ed8f.ap-southeast-2.vpce.amazonaws.com:9092;
}

With these bits in play, a client in our single VPC will be directed to the correct cluster based solely on the clusterId:

Non Prod client configs, hitting non-prod cluster:

$ cat java-nonprod.properties |sed -e '/^$/d; /^#/d'
bootstrap.servers=lkc-abc123.ap-southeast-2.aws.private.confluent.cloud:9092
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="awesome" password="cfltwin";
[ec2-user@ip-10-0-17-180 ~]$ ~/kafka_2.12-3.3.1/bin/kafka-topics.sh --command-config ~/java-nonprod.properties --bootstrap-server $NONPROD_BS --list
nonprod_topic

Ditto for Prod:

$ cat java-prod.properties |sed -e '/^$/d; /^#/d'
bootstrap.servers=lkc-def678.ap-southeast-2.aws.private.confluent.cloud:9092
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="prodawesome" password="cfltwin";
[ec2-user@ip-10-0-17-180 ~]$ ~/kafka_2.12-3.3.1/bin/kafka-topics.sh --command-config ~/java-prod.properties --bootstrap-server $PROD_BS --list
prod_topic
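In the commands above, $NONPROD_BS and $PROD_BS are assumed to hold the same bootstrap addresses as the properties files, e.g.:

export NONPROD_BS=lkc-abc123.ap-southeast-2.aws.private.confluent.cloud:9092
export PROD_BS=lkc-def678.ap-southeast-2.aws.private.confluent.cloud:9092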

And confirmation that requested Prod broker hostnames are in fact being proxied to the right PL endpoint IPs:

==> access.log <==
[22/Sep/2025:12:06:45 +0000] remote address 10.0.17.180 with SNI name "lkc-def678-g000.ap-southeast-2.aws.private.confluent.cloud" proxied to "10.0.18.244:9092" TCP 200 6561 995 0.145
[22/Sep/2025:12:06:45 +0000] remote address 10.0.17.180 with SNI name "lkc-def678.ap-southeast-2.aws.private.confluent.cloud" proxied to "10.0.1.71:9092" TCP 200 6483 990 0.680

We know that request was to the Prod cluster, so we hit DNS for the Prod PL endpoint to confirm IP address matches what nginx proxied to (10.0.1.71):

$ dig vpce-07ed6925ef4eb3cf2-rffdxwe9.vpce-svc-0ab3bb0ce8d63ed8f.ap-southeast-2.vpce.amazonaws.com
; <<>> DiG 9.18.33 <<>> vpce-07ed6925ef4eb3cf2-rffdxwe9.vpce-svc-0ab3bb0ce8d63ed8f.ap-southeast-2.vpce.amazonaws.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 30222
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;vpce-07ed6925ef4eb3cf2-rffdxwe9.vpce-svc-0ab3bb0ce8d63ed8f.ap-southeast-2.vpce.amazonaws.com. IN A
;; ANSWER SECTION:
vpce-07ed6925ef4eb3cf2-rffdxwe9.vpce-svc-0ab3bb0ce8d63ed8f.ap-southeast-2.vpce.amazonaws.com. 60 IN A 10.0.1.71
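As an extra spot-check that doesn't need a Kafka client at all, we can drive the SNI routing directly with openssl s_client (same Prod bootstrap hostname as above; this simply completes a TLS handshake via the proxy and should produce another access.log line pointing at the Prod PL endpoint):

$ openssl s_client -connect lkc-def678.ap-southeast-2.aws.private.confluent.cloud:9092 \
    -servername lkc-def678.ap-southeast-2.aws.private.confluent.cloud </dev/null >/dev/null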

We’ve confirmed that our routing works as expected - hurrah.

Summary

  • We use a Private Hosted Zone in our VPC that resolves all hostname lookups for ap-southeast-2.aws.private.confluent.cloud to our nginx proxy server.
  • Our proxy server inspects the SNI hostname provided as part of the client's TLS connection request.
  • Based on that hostname, nginx rewrites the destination to the correct Private Link endpoint and connects to CC.
  • The proxy is transparent and no TLS termination occurs.
  • No client application changes are needed.

Caveats

  • Your proxy is now a single point of failure, so make sure you have an HA setup.
  • AFAICT we can’t route traffic to the AZ-specific PL endpoint; traffic hits the regional one and round-robins across AZs, so cross-AZ data transfer might make this more expensive.

Sample nginx config file

This demonstrates the approach but isn't suitable for production use - do not copy/paste...

include /usr/share/nginx/modules/*.conf;

daemon off;
pid nginx.pid;
error_log /home/ec2-user/error-nginx.log info;

events { }
stream {

    resolver 169.254.169.253;

    map $ssl_preread_server_name $kafka_target {
        # Match the cluster ID plus zero (bootstrap) or more (broker) characters, then our AWS region.
        "~^lkc-abc123.*\.ap-southeast-2\.aws\.private\.confluent\.cloud$" non_prod_cluster;
        "~^lkc-def678.*\.ap-southeast-2\.aws\.private\.confluent\.cloud$" prod_cluster;
        default non_prod_cluster;
    }

    upstream non_prod_cluster {
        server vpce-01e2d3a44c57561b7-uewjm6ol.vpce-svc-0ab3bb0ce8d63ed8f.ap-southeast-2.vpce.amazonaws.com:9092;
    }
    upstream prod_cluster {
        server vpce-07ed6925ef4eb3cf2-rffdxwe9.vpce-svc-0ab3bb0ce8d63ed8f.ap-southeast-2.vpce.amazonaws.com:9092;
    }

    # Kafka bootstrap/broker traffic; ssl_preread exposes the SNI name without terminating TLS.
    server {
        listen 0.0.0.0:9092;

        proxy_pass $kafka_target;
        ssl_preread on;
    }

    # Same SNI-based routing for connections arriving on 443.
    server {
        listen 0.0.0.0:443;

        proxy_pass $kafka_target:443;
        ssl_preread on;
    }

    log_format stream_routing '[$time_local] remote address $remote_addr '
        'with SNI name "$ssl_preread_server_name" '
        'proxied to "$upstream_addr" '
        '$protocol $status $bytes_sent $bytes_received '
        '$session_time';

    access_log /home/ec2-user/access.log stream_routing;
}
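To sanity-check the file and run it in the foreground on the proxy instance (the path is just wherever you saved it - nginx -c wants an absolute path, and daemon off keeps it in the foreground):

$ sudo nginx -t -c /home/ec2-user/nginx.conf
$ sudo nginx -c /home/ec2-user/nginx.conf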