For defence in depth it is useful to have your own iptables rules on top of your system.
Docker makes this hard because it manages iptables itself to create all those NAT rules.
When you do something like docker run -d -p 1234:80 nginx, it will bind a TCP socket on 0.0.0.0:1234 and then use
a DNAT rule in iptables to change the destination IP and port to the IP of the container and the port you specified.
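For example, with the command above Docker would typically end up with a DNAT rule along these lines in its DOCKER chain in the nat table (the exact rule depends on the Docker version, and the container IP 172.17.0.2 is just an example):
-A DOCKER ! -i docker0 -p tcp -m tcp --dport 1234 -j DNAT --to-destination 172.17.0.2:80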
Obviously we can't really change how Docker behaves, because that would break a lot of stuff.
We know that docker creates some chains, so let's not touch those.
Instead we will do 2 things:
create a custom chain in the nat table
drop everything that is not declared in INPUT
Here's the default iptables file that gets saved:
*nat
:PREROUTING ACCEPT [139:20205]
:INPUT ACCEPT [139:20205]
:DOCKER-BLOCK - [0:0]
:OUTPUT ACCEPT [702:52918]
:POSTROUTING ACCEPT [702:52918]
-D PREROUTING -m addrtype --dst-type LOCAL -j RETURN
-D PREROUTING -m addrtype --dst-type LOCAL -j DOCKER-BLOCK
-D PREROUTING -m addrtype --dst-type LOCAL -m state --state ESTABLISHED -j DOCKER
-I PREROUTING -m addrtype --dst-type LOCAL -j RETURN
-I PREROUTING -m addrtype --dst-type LOCAL -j DOCKER-BLOCK
-I PREROUTING -m addrtype --dst-type LOCAL -m state --state ESTABLISHED -j DOCKER
COMMIT
*filter
:INPUT ACCEPT [0:0]
:FORWARD DROP [0:0]
:DOCKER-ISOLATION - [0:0]
:OUTPUT ACCEPT [245:27029]
-A INPUT -p tcp -m tcp --dport 22 -m state --state NEW,ESTABLISHED -j ACCEPT
-A INPUT -m state --state ESTABLISHED -j ACCEPT
-A INPUT -j DROP
COMMIT
As you can see, everything goes to DOCKER-BLOCK before going to INPUT -- because the nat table is traversed first.
Then inside that chain, based on some spec, we decide if we jump back to the regular DOCKER chain. If we do jump,
the packet destination is altered if needed, and then it is passed to the filter table's INPUT chain.
As for reloading, the idea is to do it in 2 steps:
first we flush the chains we manage, individually
second we restore only per chain, without flushing everything
This ensures that the rules inside existing chains are preserved, while still getting a clean reload for any temporary rules we might have injected.
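A minimal sketch of such a reload, assuming the file above is saved at /etc/iptables.rules (the path is just an example) and that INPUT and DOCKER-BLOCK are the only chains we own:
# flush only the chains we manage; the DOCKER chains stay untouched
iptables -F INPUT
iptables -t nat -F DOCKER-BLOCK
# restore without flushing, so rules already present in other chains are preserved
iptables-restore --noflush < /etc/iptables.rules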
If you want to allow traffic to a port published on a global listen address (0.0.0.0):
iptables -t nat -I DOCKER-BLOCK -p tcp -m tcp --dport 1234 -m state --state NEW -j DOCKER
It should be noted that for a short period of time, rules are all flushed then reloaded. This means that new connections
will get dropped, but existing ones will continue working.
Since applications will periodically retry sending the SYN packet, there might be a small latency increase for the packets dropped while the firewall reloads.
Parse each line, then redirect the output to a file depending on the first field:
echo -e "toto tata first text is it cool?\ntiti toto second text " | grep --color -oE '[A-Za-z]+ [A-Za-z]+.*' | awk '{print $0 >> ($1 ".txt")}'
Change the MAC address of an interface:
ip link set dev <device> address <lladdr>
Script to generate a random MAC address:
package main

import (
	"crypto/rand"
	"fmt"
	"io"
	"os"
)

func main() {
	// Read 6 random bytes, one per octet of the MAC address.
	buff := make([]byte, 6)
	i, err := io.ReadFull(rand.Reader, buff)
	if err != nil {
		fmt.Fprintf(os.Stderr, "Err reading from crypto/rand: %v\n", err)
		os.Exit(2)
	}
	if i != len(buff) {
		fmt.Fprintf(os.Stderr, "Couldn't read %d bytes from crypto/rand\n", len(buff))
		os.Exit(2)
	}
	// Zero the first octet so the multicast bit is cleared (unicast address).
	buff[0] = 0
	// Print the octets as colon-separated hex, e.g. 00:AB:CD:EF:12:34.
	for i, b := range buff {
		fmt.Printf("%02X", b)
		if i != len(buff)-1 {
			fmt.Printf(":")
		}
	}
}
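Combined with the ip link command above, it can be used like this (assuming the script is saved as macgen.go; the file name and the interface name are just examples):
ip link set dev eth0 address $(go run macgen.go)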
However that doesn't work in VirtualBox in bridged mode, because on OSX it uses the same interface for outgoing packets. The filtering of which packets are for the host and which are for the VM is done by VirtualBox based on hardware addresses.
It is also not possible to change the MAC address from the VirtualBox interface at runtime.
I think the only possible way would be to connect an interface -- eth or wifi -- through USB passthrough to VirtualBox. Then I think changing the MAC at runtime is possible.
Process groups
I had a problem using a custom init entrypoint in Docker: when pid 1 spawned a process to stop another process -- one that it wasn't watching but that was its child -- that other process would stay in a zombie state.
Flow:
init starts a process P1 that executes another process P2 and detaches from it
P2 has init as its parent
P1 terminates
init gets back to work
init at some point receives a SIGTERM and starts shutting down services
P3 is started from init and tries to kill P2
after some attempts at killing it, P3 terminates
P2 is still there, but in a zombie state
The why: a process always goes to the zombie state after executing. The parent process is responsible for cleaning it up, usually by calling wait(). Once the kernel returns from the wait call, the forked process is finally gone.
It goes fast, so you don't really see the zombie state.
The way inits work in general is that they implement something called "process reaping", which just calls wait() whenever a SIGCHLD signal is received. Because yes, when a child process terminates or stops, it sends a SIGCHLD to its parent.
The parent can then listen for that signal and perform some logic.
In my case I need to collect all the signals and build something called a "process table" to make the distinction between processes that the init actually started and processes that got reparented to init because of detaching.
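Roughly, the reaping part could look like this in Go -- a minimal sketch that ignores the process-table bookkeeping described above and assumes it runs as pid 1 on Linux:

package main

import (
	"os"
	"os/signal"
	"syscall"
)

// reap collects every child that has already exited, without blocking.
func reap() {
	for {
		var status syscall.WaitStatus
		pid, err := syscall.Wait4(-1, &status, syscall.WNOHANG, nil)
		if pid <= 0 || err != nil {
			return
		}
	}
}

func main() {
	// Get notified whenever a child process terminates or stops.
	sigs := make(chan os.Signal, 16)
	signal.Notify(sigs, syscall.SIGCHLD)
	for range sigs {
		reap()
	}
}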
I guess being an init process is harder than I thought. And it's funny to think that you have to build something on top of docker if you want to make sure you actually clean up behind you when docker stops the container. I just found out about this [1] and [2]
tcp_keepalive_time
the interval between the last data packet sent (simple ACKs are not considered data) and the first keepalive probe; after the connection is marked to need keepalive, this counter is not used any further
tcp_keepalive_intvl
the interval between subsequent keepalive probes, regardless of what the connection has exchanged in the meantime
tcp_keepalive_probes
the number of unacknowledged probes to send before considering the connection dead and notifying the application layer
This means that when tcp_keepalive_time expires and no data has been sent on the connection, a keepalive probe is sent.
If this probe is not acked, net.ipv4.tcp_keepalive_intvl kicks in: at the end of that interval, another probe is sent.
This continues until net.ipv4.tcp_keepalive_probes unacknowledged probes have been sent, at which point the connection is killed.
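With the stock Linux defaults (tcp_keepalive_time=7200, tcp_keepalive_intvl=75, tcp_keepalive_probes=9), a peer that silently disappeared is therefore only detected after roughly 7200 + 9 * 75 = 7875 seconds, a bit over two hours. To detect it faster you can lower the sysctls, for example (the values here are just illustrative, not recommendations):
sysctl -w net.ipv4.tcp_keepalive_time=600
sysctl -w net.ipv4.tcp_keepalive_intvl=30
sysctl -w net.ipv4.tcp_keepalive_probes=5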