@tehmoon
Last active October 27, 2024 17:30
IPtables and docker reload!
#!/bin/sh
set -e
## SEE https://medium.com/@ebuschini/iptables-and-docker-95e2496f0b45
## You need to add rules in DOCKER-BLOCK AND INPUT for traffic that does not go to a container.
## You only need to add one rule if the traffic goes to the container
CWD=$(cd "$(dirname "${0}")"; pwd -P)
FILE="${CWD}/$(basename "${0}")"
chown root:root "${FILE}"
chmod o-rwx "${FILE}"
set -x
install_docker_block() {
	## One time install rules for the DOCKER-BLOCK chain
	/sbin/iptables -t nat -N DOCKER-BLOCK &&
	## Deploy the new rules. After this, everything goes to DOCKER-BLOCK then to RETURN
	/sbin/iptables -t nat -I PREROUTING -m addrtype --dst-type LOCAL -g DOCKER-BLOCK ||
	true
}
## install the PREROUTING rules for the DOCKER chain in case docker starts after
/sbin/iptables -t nat -N DOCKER || true
## Block new connections while we restore the first PREROUTING RULES
/sbin/iptables -t nat -I PREROUTING -m addrtype --dst-type LOCAL -m state --state NEW -j RETURN
install_docker_block
## Delete installed rules, we need to ensure they always are at the top
## If rules were already installed, it would mean that the second and third rule
## are going to be deleted. We still have the RETURN on top.
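while true; do
	/sbin/iptables -t nat -D PREROUTING -m addrtype --dst-type LOCAL -j RETURN || break
done
## Older versions of this script installed "-j DOCKER-BLOCK"; clean those up too
while true; do
	/sbin/iptables -t nat -D PREROUTING -m addrtype --dst-type LOCAL -j DOCKER-BLOCK || break
done
## The current rule uses "-g", which a "-j" delete spec should not match,
## so delete it explicitly to avoid stacking duplicates on every run
while true; do
	/sbin/iptables -t nat -D PREROUTING -m addrtype --dst-type LOCAL -g DOCKER-BLOCK || break
done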
## Re-deploy the right rules on the top. After this, the flow is restored to DOCKER-BLOCK
/sbin/iptables -t nat -I PREROUTING -m addrtype --dst-type LOCAL -g DOCKER-BLOCK
## Remove the blocking rule, which should be unreachable after the -g DOCKER-BLOCK rule above anyway
while true; do
	/sbin/iptables -t nat -D PREROUTING -m addrtype --dst-type LOCAL -m state --state NEW -j RETURN || break
done
## Only let established connections go through while we flush the rules
/sbin/iptables -t nat -I PREROUTING -m addrtype --dst-type LOCAL -m state --state ESTABLISHED -j DOCKER
## Flush the rules of DOCKER-BLOCK, at this point new connections will be blocked
/sbin/iptables -t nat -F DOCKER-BLOCK
## Add your new rules below, allowing new connections
## Don't forget the NEW and ESTABLISHED states
#/sbin/iptables -t nat -A DOCKER-BLOCK -p tcp -m tcp --dport 8080 -m state --state NEW,ESTABLISHED -j DOCKER
## Restore the flow
## Loop trying to delete the rule in case the script failed above, we don't want to add more than one rule
while true; do
	/sbin/iptables -t nat -D PREROUTING -m addrtype --dst-type LOCAL -m state --state ESTABLISHED -j DOCKER || break
done
## The INPUT chain is set to drop, then we flush it and reinstall the rules.
## Finally we restore the policy on the chain
## Remember that those rules don't apply to docker
/sbin/iptables -t filter -P INPUT DROP
/sbin/iptables -t filter -F INPUT
/sbin/iptables -t filter -A INPUT -i lo -j ACCEPT
# Add your non docker rules here
#/sbin/iptables -t filter -A INPUT -p tcp -m tcp --dport 22 -m state --state NEW,ESTABLISHED -j ACCEPT
/sbin/iptables -t filter -A INPUT -m state --state ESTABLISHED -j ACCEPT
/sbin/iptables -t filter -A INPUT -j DROP
/sbin/iptables -t filter -P INPUT ACCEPT
@ptulpen

ptulpen commented Aug 30, 2019

Hi, sorry for my confusion, but line 14 already states RETURN (and when I leave out that change I get blocked there as well)
I changed line 56 to:
/sbin/iptables -t nat -A DOCKER-BLOCK -p tcp -m tcp ! -s 192.168.65.11 --dport 9000 -m state --state NEW,ESTABLISHED -j RETURN
(since I only want 192.168.65.11 to access port 9000)
But currently this is still blocked as well.

@tehmoon
Author

tehmoon commented Aug 30, 2019

@ptulpen don't be sorry!
I think you should add:

/sbin/iptables -t nat -A DOCKER-BLOCK -p tcp -m tcp -s 192.168.65.11 --sport 9000 -m state --state ESTABLISHED -j RETURN

Basically, since it looks like you want to block all outgoing connections to 192.168.65.11 on port 9000, you need to block all return packets. The way it works is that the application container's TCP stack sends a SYN packet; this is the NEW state. Then the remote server sends back a SYN, ACK; this is the ESTABLISHED state.
If you block all established state coming from port 9000, you should never fully enter the established state from iptables perspective.

As for why the traffic is still blocked by default, I don't really know without some trial and error. I think what I said would still work, but I did not test it. Reach out on my Twitter @moonbocal, perhaps I can take a look.

@defnull

defnull commented Oct 9, 2019

## Remember that those [input chain] rules don't apply to docker

Yes they do! Docker not only installs DNAT rules to PREROUTING->DOCKER, but also starts docker-proxy processes that listen to the same port and forwards traffic to the container. These are usually shadowed by the DNAT rules and won't see any traffic, but now that we skip DNAT in some cases, traffic will flow through INPUT instead. If it is not blocked there, it will be forwarded to the container by the docker-proxy now. To actually block traffic to a container, we must skip the DNAT rule and block the traffic in INPUT.

Oh, and as a side note: The two rules -j DOCKER-BLOCK followed by -j RETURN could also be written as -g DOCKER-BLOCK. That would reduce the complexity a bit.
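
To illustrate the point about docker-proxy and INPUT: a published port can be opened in two different places, depending on whether the packet is DNATed to the container or falls through to INPUT. The sketch below is a dry run that only prints the two kinds of rules. The helper names `ipt`, `allow_container_port`, and `allow_host_port` are made up for this example and are not part of the gist.

```shell
#!/bin/sh
# Dry run by default: print each command instead of executing it.
# Set DRY_RUN=0 (as root) to apply the rules for real.
ipt() {
	if [ "${DRY_RUN:-1}" = "1" ]; then
		echo "/sbin/iptables $*"
	else
		/sbin/iptables "$@"
	fi
}

# Hypothetical helper: send matching traffic on to Docker's DNAT chain.
# Traffic that matches nothing in DOCKER-BLOCK skips DNAT and is then
# evaluated by the filter INPUT chain, where docker-proxy may be listening.
allow_container_port() {
	ipt -t nat -A DOCKER-BLOCK -p tcp -m tcp --dport "$1" -m state --state NEW,ESTABLISHED -j DOCKER
}

# Hypothetical helper: open a port served on the host itself.
allow_host_port() {
	ipt -t filter -A INPUT -p tcp -m tcp --dport "$1" -m state --state NEW,ESTABLISHED -j ACCEPT
}

allow_container_port 8080
allow_host_port 22
```

In the script itself, the first kind of rule belongs after the flush of DOCKER-BLOCK and the second kind before the final `-j DROP` in INPUT.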

@tehmoon
Author

tehmoon commented Jan 3, 2020

@defnull thank you for your feedback! Sorry for the very (very very) late update, I've been busy with a ton of other things.
I think I meant that the INPUT rules don't apply when the packets come from outside, but I'm not entirely sure why.

Also, a big thanks for the -g option which looks really good! My understanding is I can replace this function:

deploy_docker_block() {
  /sbin/iptables -t nat -I PREROUTING -m addrtype --dst-type LOCAL -j RETURN
  /sbin/iptables -t nat -I PREROUTING -m addrtype --dst-type LOCAL -j DOCKER-BLOCK
}

With:

/sbin/iptables -t nat -I PREROUTING -m addrtype --dst-type LOCAL -g DOCKER-BLOCK

Will test this tonight.

@tehmoon
Author

tehmoon commented Jan 4, 2020

@defnull, I tested it and it works well!
This is the diff:

index efe8a3e..30f74cd 100644
--- a/iptables-reload.sh
+++ b/iptables-reload.sh
@@ -10,9 +10,12 @@ chmod o-rwx "${FILE}"
 
 set -x
 
-deploy_docker_block() {
-  /sbin/iptables -t nat -I PREROUTING -m addrtype --dst-type LOCAL -j RETURN
-  /sbin/iptables -t nat -I PREROUTING -m addrtype --dst-type LOCAL -j DOCKER-BLOCK
+install_docker_block() {
+	## One time install rules for the DOCKER-BLOCK chain
+	/sbin/iptables -t nat -N DOCKER-BLOCK &&
+	## Deploy the new rules. After this, everything goes to DOCKER-BLOCK then to RETURN
+	/sbin/iptables -t nat -I PREROUTING -m addrtype --dst-type LOCAL -g DOCKER-BLOCK ||
+	true
 }
 
 ## install the PREROUTING rules for the DOCKER chain in case docker starts after
@@ -21,11 +24,7 @@ deploy_docker_block() {
 ## Block new connections while we restore the first PREROUTING RULES
 /sbin/iptables -t nat -I PREROUTING -m addrtype --dst-type LOCAL -m state --state NEW -j RETURN
 
-## One time install rules for the DOCKER-BLOCK chain
-/sbin/iptables -t nat -N DOCKER-BLOCK && {
-  ## Deploy the new rules. After this, everything goes to DOCKER-BLOCK then to RETURN
-  deploy_docker_block
-} || true
+install_docker_block
 
 ## Delete installed rules, we need to ensure they always are at the top
 ## If rules were already installed, it would mean that the second and third rule
@@ -38,7 +37,7 @@ while true; do
 done
 
 ## Re-deploy the right rules on the top. After this, the flow is restored to DOCKER-BLOCK
-deploy_docker_block
+/sbin/iptables -t nat -I PREROUTING -m addrtype --dst-type LOCAL -g DOCKER-BLOCK
 
 ## Remove the blocking rule, which should be unreachable after deploy_docker_block anyway
 while true; do

Thanks for the trick!

@Sispheor

It works very well. Thanks a lot for your work!

@Sispheor

I actually have a small issue. I have a port mapping in docker compose like

ports:
  - "2222:22"

I've added this line

/sbin/iptables -t nat -A DOCKER-BLOCK -p tcp -m tcp --dport 2222 -m state --state NEW,ESTABLISHED -j DOCKER

But the port is still blocked...

@Sispheor

It's ok. I just learnt that docker-compose restart doesn't update port mappings if we change them. I had to stop and up again.
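
The "stop and up again" step can be sketched as follows. This is a dry run by default that only prints the commands; `run` and `recreate_stack` are hypothetical names for this example. It relies on `docker-compose up` recreating containers whose configuration has changed, which `docker-compose restart` does not do.

```shell
#!/bin/sh
# Dry run by default: print each command instead of executing it.
# Set DRY_RUN=0 to really run the commands.
run() {
	if [ "${DRY_RUN:-1}" = "1" ]; then
		echo "$*"
	else
		"$@"
	fi
}

# Hypothetical helper: stop the stack, then bring it up again so that an
# edited "ports:" section is applied to freshly created containers.
recreate_stack() {
	run docker-compose stop &&
	run docker-compose up -d
}

recreate_stack
```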

@tehmoon
Author

tehmoon commented Jun 4, 2020

Glad you figured it out @Sispheor! Yes there are some differences between docker and docker-compose for the same API which can lead to frustration 😄

@PZ973

PZ973 commented Aug 10, 2020

Works for me too with Docker Compose port mapping, but every change requires service docker stop and start, as @Sispheor found. Thanks a lot for your work!

@rizebi

rizebi commented Aug 24, 2020

Hi! Thank you so much for this! The script does indeed seem like a very nice and clean solution. But I cannot get it to work; the port is still blocked...
I just added the new rule in your script to allow ssh. I have it configured on port 19972. This is the line I added:
/sbin/iptables -t nat -A DOCKER-BLOCK -p tcp -m multiport --dports 19972 -m comment --comment "SSH" -m state --state NEW,ESTABLISHED -j DOCKER

After running the script these are the iptables of my Pi:
https://github.com/rizebi/dummy_repo/blob/master/iptables_issue

Thank you in advance for the help. 👍

@tehmoon
Author

tehmoon commented Aug 24, 2020

@rizebi sorry to hear that it did not work for you! The output you sent shows a problem: some rules are duplicated, even though I thought I had added logic to remove them.
But I think the problem is that you are most likely running ssh outside a container, and therefore it needs to be added at this line https://gist.github.com/tehmoon/b1c3ae5e9a67d66186361d4728bed799#file-iptables-reload-sh-L73. It was also pretty stupid of me not to have removed the ssh one and not to have made that clearer.
I've added more comments at the top of the file specifying that the rule goes in INPUT if the traffic is not going to a container.

Thank you for the debug output, it was great. Could you also send me the script file you used? Just making sure.
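
One way to spot duplicated rules in debug output like this is to filter the `iptables -S` listing for repeated lines. A minimal sketch; `find_duplicate_rules` is a hypothetical helper that reads the listing on stdin, so it can be tried on saved output without root.

```shell
#!/bin/sh
# Hypothetical helper: read `iptables -S` style output on stdin and print
# each "-A ..." rule line that occurs more than once (one line per rule).
find_duplicate_rules() {
	grep '^-A ' | sort | uniq -d
}

# Typical use (needs root):
#   /sbin/iptables -t nat -S PREROUTING | find_duplicate_rules
```

An empty result means no rule in the listing is repeated verbatim.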

@rizebi

rizebi commented Aug 24, 2020

Hi @tehmoon! Thanks for such a quick response! No need to thank me for providing the output; I am the one who needs help. This is the script I used and that failed (my mistake): https://github.com/rizebi/dummy_repo/blob/master/reload_iptables.sh.
Now I have played with adding the rules at line 73, in the filter table's INPUT chain, and it works. The funny thing is that it allows ssh traffic only by adding the rule at the end, and it completely ignores the DOCKER-BLOCK chain in the nat table. The same thing happens even with a service that runs in Docker: I just need to add the rule to filter INPUT, and the port is then usable. Maybe I did not completely understand the script, but my problem is solved :). My problem was that after installing Docker, both my host and Docker ports were completely exposed. Now with your script, I will add rules only at the end (line 73). Thank you so much for the help :)

@mngnl

mngnl commented Nov 6, 2020

Hi @tehmoon, will this script work with CSF? I used https://github.com/juli3nk/csf-post-docker with docker and everything seems to be working fine, except that any blocked IP is still able to communicate directly with docker; I guess you explained the issue quite well. But with CSF all blocks are present in the ACCEPT / DENY chains, so can this script insert a rule to check the deny chain before forwarding the traffic to docker?
Right now it does not seem to be doing that.

@tehmoon
Author

tehmoon commented Nov 6, 2020

@mngnl Thank you for the feedback! Glad you find this gist useful.

I am not very familiar with the csf-post-docker project, I have to give it a try to understand what it does. That said, I would not recommend having two different software running that are trying to achieve the same goal.
I do not see a good reason to try to work with other firewall projects as this one is only targeting a system where you are running docker engine and don't have any other system in place.

If I get a good understanding of the project and with your help, I can see what I can do here to prove me wrong! It might be an easy and acceptable change, my take is only based on a 5-10 minutes look at the project.

@jonathanmmm

jonathanmmm commented May 23, 2021

Hi,
Thanks for your nice script I finally found something that works easily :)
But I have some caveats about source IPs:

I have a container that uses -p 80:80 and I can access it via the internet, but the container itself then filters traffic based on the source IP and then forwards it (proxy server) to another port on a local interface. As I don't want to hassle with the IPs of the containers, I use another local interface for that, one that has nothing to do with docker.
The problem: with this script the proxy can't reach a port exposed by another container.

I even added /sbin/iptables -t nat -A DOCKER-BLOCK -p tcp -m tcp --dport 8080 -m state --state NEW,ESTABLISHED -j DOCKER (the port that the container should reach) but it didn't work; it only worked after I opened the port in the second place where we should add lines.
The proxy can't reach, or does not get a response from, the other container via its port published as 8080:80 (in another docker network).

What would be the best way to allow a container to access a port of another container without fixed IPs? I thought about iptables rules based on network names.

@tehmoon

@jonathanmmm

jonathanmmm commented May 25, 2021

I think I got a way.
I can add this in the place where the normal (non-docker) rules go: /sbin/iptables -t filter -A INPUT -s 172.16.0.0/12 -p tcp -m tcp --dport 8080 -m state --state NEW,ESTABLISHED -j ACCEPT
to give access to every docker container with a standard IP.

In the future I will possibly make custom networks and name them and give access via -i interface_name

@jonathanmmm

jonathanmmm commented May 25, 2021

Right now I don't understand the difference between the docker entries and the normal INPUT ones.
For external access it does not matter where I allow the port; either way it is accessible from the outside.
The only difference is that only with the normal INPUT rules can one container access another container's port through a port bound to the host.

@tehmoon
Author

tehmoon commented May 25, 2021

@jonathanmmm Thank you for using this! I just wanted to acknowledge that I saw your comments but I want to take time to read them in order for me to better reply to you. Unfortunately I am under the water at the moment. I'll try my best to help you debug this shortly, if you haven't figured it out sooner.

@jonathanmmm

@jonathanmmm Thank you for using this! I just wanted to acknowledge that I saw your comments but I want to take time to read them in order for me to better reply to you. Unfortunately I am under the water at the moment. I'll try my best to help you debug this shortly, if you haven't figured it out sooner.

Thanks for your help, private stuff or work stuff is more important :-)
I am thinking about not using it, as I have not figured out a good way to let containers communicate with each other.

I wrote a script and am posting it here; it somehow has problems getting all ports, but it should be a start.
https://gist.github.com/jonathanmmm/9a6192ec32588bb691ef6f082e33d7aa
You can include it via source

and use it like this

http_ports=(80 443)
#interfaces or ips with or without subnet mask like 127.0.0.1/8 that should have access to the above list of ports
http_access_interfaces=(lo 172.16.0.0/12)
allow_acess_to_docker_tcp_port http_ports http_access_interfaces

vpn_ports=(22)
vpn_access_interfaces=(tun0)
allow_acess_to_docker_tcp_port vpn_ports vpn_access_interfaces

It is also available as a udp function.

Right now the functions implement
/sbin/iptables -t filter -A INPUT -i $interface -p tcp -m tcp --dport $port -m state --state NEW,ESTABLISHED -j ACCEPT
which might not be the right rule.
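
For readers who want to see the idea without the linked gist, here is a rough POSIX sh sketch of such a helper. It is not the function from that gist: `ipt` and `allow_tcp_ports_from` are hypothetical names, the lists are plain space-separated strings rather than arrays, and the test for "looks like an IPv4 address" is a deliberately simple assumption. It runs as a dry run by default and only prints the rules.

```shell
#!/bin/sh
# Dry run by default: print each command instead of executing it.
# Set DRY_RUN=0 (as root) to apply the rules for real.
ipt() {
	if [ "${DRY_RUN:-1}" = "1" ]; then
		echo "/sbin/iptables $*"
	else
		/sbin/iptables "$@"
	fi
}

# Hypothetical helper.
# $1: space-separated TCP ports
# $2: space-separated interface names or IPv4 addresses/CIDR ranges
allow_tcp_ports_from() {
	for port in $1; do
		for src in $2; do
			case "$src" in
				# Starts with a digit and has three dots: treat as a source address.
				[0-9]*.*.*.*) match="-s" ;;
				# Anything else is treated as an interface name.
				*) match="-i" ;;
			esac
			ipt -t filter -A INPUT "$match" "$src" -p tcp -m tcp --dport "$port" -m state --state NEW,ESTABLISHED -j ACCEPT
		done
	done
}

allow_tcp_ports_from "80 443" "lo 172.16.0.0/12"
```

Matching on `-s` for address ranges and `-i` for interfaces keeps the rule shape identical to the one quoted above; only the selector changes.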

@mahdi13

mahdi13 commented Aug 1, 2021

Has anybody tried to run this script on a docker swarm cluster to handle DOCKER-INGRESS chain as well? I tried some ideas, but the ingress network got entirely blocked...

@tl87

tl87 commented Feb 16, 2022

@tehmoon really nice, saved my well-being :-)

One question, how do you list rules in the chain DOCKER-BLOCK?

Trying to list rules with iptables, does not work:

$ iptables -L DOCKER-BLOCK
iptables v1.8.7 (nf_tables): chain `DOCKER-BLOCK' in table `filter' is incompatible, use 'nft' tool.

And using the 'nft' tool (nftables) only shows what iptables -L shows, and not your rules.

Please advise !

@tl87

tl87 commented Feb 16, 2022

Update!

I found a way to display the rules:

$ nft list table nat
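
Since the script creates DOCKER-BLOCK in the nat table, a listing command has to name that table; without `-t`, iptables looks in filter. The sketch below prints two candidate commands as a dry run (`run`, `list_docker_block`, and `list_docker_block_nft` are hypothetical names). Whether the iptables variant works on a given nftables-backed host is an assumption to verify locally.

```shell
#!/bin/sh
# Dry run by default: print each command instead of executing it.
# Set DRY_RUN=0 (as root) to really run the commands.
run() {
	if [ "${DRY_RUN:-1}" = "1" ]; then
		echo "$*"
	else
		"$@"
	fi
}

# List the chain through iptables, naming the nat table explicitly.
list_docker_block() {
	run /sbin/iptables -t nat -S DOCKER-BLOCK
}

# List only that chain through nft instead of the whole nat table.
list_docker_block_nft() {
	run nft list chain ip nat DOCKER-BLOCK
}

list_docker_block
list_docker_block_nft
```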

@fanuelsen

Is it possible to use this to block outgoing connections? I only want established input connections, and don't want the container to establish new outgoing connections.

@tl87

tl87 commented Mar 7, 2022

@fanuelsen, if you give the container a static IP when building it, like 172.17.0.100, then you might be able to block traffic from it by using:

$ /sbin/iptables -t nat -A DOCKER-BLOCK -s 172.17.0.100 -j DROP

Let me know if it worked for you? :-)
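
A possible alternative, offered as an untested sketch: DOCKER-BLOCK is only reached from PREROUTING for locally destined traffic, and some iptables versions refuse DROP in the nat table, so the rule above may not take effect. Docker provides a DOCKER-USER chain in the filter table for user rules on forwarded traffic. The helper names `ipt` and `block_container_egress` are hypothetical, and the static IP is the one from the comment above.

```shell
#!/bin/sh
# Dry run by default: print each command instead of executing it.
# Set DRY_RUN=0 (as root) to apply the rule for real.
ipt() {
	if [ "${DRY_RUN:-1}" = "1" ]; then
		echo "/sbin/iptables $*"
	else
		/sbin/iptables "$@"
	fi
}

# Hypothetical helper: drop NEW connections originating from one container IP.
# Replies on connections opened from outside are ESTABLISHED and do not match.
block_container_egress() {
	ipt -t filter -I DOCKER-USER -s "$1" -m conntrack --ctstate NEW -j DROP
}

block_container_egress 172.17.0.100
```

This only covers forwarded traffic; connections from the container to the host itself pass through INPUT and would need a separate rule.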

@jonathanmmm

For everyone trying to figure out an easier way:
I switched to using docker's built-in firewall rules. I only publish ports when needed, on localhost or 0.0.0.0 as appropriate. I use docker-compose and put the containers into internal networks, which leaves the containers with no outside connection at all (no internet, just the internal network). That way you can put containers behind a proxy that has outside access when they need network access, or connect, say, a database container to another container.

As far as I can tell, docker blocks access to containers from outside if they have no port published.

@tl87

tl87 commented Apr 19, 2022

That's one way to fix it and I would also recommend a proxy or firewall. One could also look into setting up a geofence to limit incoming traffic or a cloud firewall if the containers are hosted with a cloud provider.

@egberts

egberts commented Jul 18, 2022

To get rid of that libvirt error, my permanent workaround in Debian 11 (as a host) with libvirtd daemon is to block the loading of iptables-related modules:

Create a file in /etc/modprobe.d/nft-only.conf:


#  Source: https://www.gaelanlloyd.com/blog/migrating-debian-buster-from-iptables-to-nftables/
#
blacklist x_tables
blacklist iptable_nat
blacklist iptable_raw
blacklist iptable_mangle
blacklist iptable_filter
blacklist ip_tables
blacklist ipt_MASQUERADE
blacklist ip6table_nat
blacklist ip6table_raw
blacklist ip6table_mangle
blacklist ip6table_filter
blacklist ip6_tables

libvirtd daemon now starts without any error.

Post-analysis: Apparently, I had iptables module loaded alongside with many nft-related modules; once iptables was gone, the pesky error message went away.

@tl87

tl87 commented Jul 31, 2022

I found a more layered solution for my use case to this "issue":

  1. Layer: Cloudflare's firewall stands in front and routes traffic
  2. Layer: my cloud provider's firewall
  3. Layer: lastly, on the hosts, I'm using this script

@Biepa

Biepa commented Oct 21, 2022

You need to add rules in DOCKER-BLOCK AND INPUT for traffic that does not go to a container.

I only have this in my nat table
-A DOCKER-BLOCK -p tcp -m tcp --dport 80 -m state --state NEW,ESTABLISHED -j DOCKER
But still can access my SSH port.

Edit: Just curious. Why not use the mangle or raw table instead?
