This study focuses on the strategies used by the "xz backdoor", an extremely
complex piece of malware that carries its own x64 disassembler, which it uses
to find critical locations in your code and hijack them by swapping in its
own code at runtime. Because this is a machine-code-based attack, code
written in any programming language is vulnerable.
Instead of targeting sshd directly, the xz backdoor rides in through
liblzma, which distribution-patched sshd builds load by way of libsystemd,
then hijacks the GNU Dynamic Linker (ld.so) before sshd's own code starts
running and before libcrypto.so (needed by sshd) is fully loaded, by
installing an LD_AUDIT-style audit hook. This gives the xz backdoor live
information about what the linker is loading, so it can activate before the
memory pages containing that code are locked.
In order to pull this off, the attacker:
- Had advanced knowledge of the GNU C compiler
- Was keenly aware of upcoming changes to ld.so (the GNU Dynamic Linker),
as the ld.so currently present on your computer, home router and smartphone
behaves slightly differently from the new release still undergoing testing.
To raise the bar even higher for the next attacker who comes along, this
study shows you how to force an attacker to deal with the added complexity
of relocatable coroutines (goroutines), a runtime feature available in Go.
I've been thinking a lot about this as a CGo/gccgo dev: "What can a HLL programmer do against the likes of Jia Tan? They're attacking from the foundation software."
I'm not settled on this one, but wrapping calls to C libs in goroutines would probably raise the difficulty level of a direct hijack of your own Go code, thanks to the rapid context switches and the unpredictability of where the Go runtime will move the jump calls.
After 129,000 lines of asm, here is printf("Hello World") in Go down at the bottom:
Now, let's see what happens when we do this:
package main

import "time"

func main() {
	hello := "Hello world!"
	go func() {
		print(hello)
	}()
	// Crude wait so main does not exit before the goroutine has run
	time.Sleep(1 * time.Second)
}
Now we're asking the Go runtime to activate concurrency and main
itself gets split into two compact parts with an anonymous function that disappears into the goroutine ecosphere (to get this to fit I'm stripping symbols):
Notice how nice and compact the goroutine is! There is not much you can do here except try to intercept the ret and call instructions, but you will also need to make sure the runtime stack cleanup happens or things will start to go crashycrashy.
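As a side note, the time.Sleep in the snippet above is only there to keep main alive long enough for the goroutine to run. A minimal sketch of a deterministic alternative follows, assuming a hypothetical helper named runAndWait (not part of the original code) built on sync.WaitGroup. How this changes the generated asm is not examined here.

```go
package main

import "sync"

// runAndWait is a hypothetical helper: it launches fn in its own goroutine
// and blocks until fn has returned, replacing the fixed one-second sleep.
func runAndWait(fn func()) {
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done() // mark the goroutine finished even if fn panics
		fn()
	}()
	wg.Wait() // returns only after wg.Done has been called
}

func main() {
	hello := "Hello world!"
	runAndWait(func() {
		print(hello)
	})
}
```

The anonymous function still ends up in its own goroutine; only the waiting strategy differs.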
So now let's make a C lib call but push it down into a goroutine wrapper, yet make it synchronous. And for fun, the data to the function will be passed via a channel, which brings in the communication/sync areas of the runtime with its maze of runtime functions. And since we're here, let's make it a full Go wrapper, with two channels and a goroutine bridge, and a done signaler.
package main

// #include <stdio.h>
// #include <stdlib.h>
// void printFromC(const char* str) {
//     printf("Received C string: %s\n", str);
//     fflush(stdout); // C stdio may still be buffered when Go exits
// }
import "C"
import "unsafe"

func main() {
	myPrint("Hello from Go!")
}

func myPrint(hellostring string) {
	// Protect the C library call from Jia Tan and the NSA
	sendchan := make(chan string)
	recvchan := make(chan string)
	done := make(chan bool)
	go func(sender chan string, receiver chan string) { // Both bidirectional here; narrowed below
		go func(receive <-chan string) { // This one is recv-only
			callCPrint(receive)
			done <- true // Tell myPrint the C call has returned
		}(receiver)
		go func() { // Bridge: relay the string from sender to receiver
			strToSend := <-sender
			receiver <- strToSend
			close(sender)
		}()
	}(sendchan, recvchan)
	sendchan <- hellostring
	<-done // Block until the C call is finished, keeping myPrint synchronous
}

func callCPrint(str <-chan string) {
	cStr := C.CString(<-str)
	defer C.free(unsafe.Pointer(cStr)) // Deallocate memory when done
	C.printFromC(cStr)
}
The main() in asm representation gets shorter.
But now there is some real fun going on in myPrint(), as it's acting as a traffic cop moving the string along its way into the chaos of pthread, with its context switches and semaphores. myPrint is split by the compiler into 6 asm functions (one for each launch context and its anonymous function), to allow for their dynamic reallocation to the runtime. callCPrint then has a thunk going on, which can't get its data back to myPrint without going back through the runtime maze.
I'm still not sold on this approach but I'm definitely willing to change my own behavior to make these creeps go away if the difficulty is raised high enough. And throwing CGo calls through a goroutine bridge still makes readable code to me.
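If more than one C call needs this treatment, the bridge could be factored into a small helper. The following is a sketch only: viaBridge is a hypothetical name that does not appear in the code above, and fmt.Println stands in for the cgo call so the example builds without a C toolchain.

```go
package main

import "fmt"

// viaBridge is a hypothetical helper: it pushes one string through a
// two-channel goroutine bridge, runs call on the far side, and blocks
// until call has returned.
func viaBridge(s string, call func(string)) {
	send := make(chan string)
	recv := make(chan string)
	done := make(chan struct{})

	// Worker: the only goroutine that ever invokes call.
	go func(in <-chan string) {
		call(<-in)
		close(done)
	}(recv)

	// Bridge: relays the value from the caller side to the worker side.
	go func(in <-chan string, out chan<- string) {
		v := <-in
		out <- v
	}(send, recv)

	send <- s
	<-done // keeps the wrapper synchronous from the caller's point of view
}

func main() {
	// Stand-in for a cgo call such as C.printFromC.
	viaBridge("Hello from Go!", func(s string) {
		fmt.Println("Received:", s)
	})
}
```

Passing the channels as directional parameters lets the compiler reject any attempt by the worker to send, or by the bridge to read from the wrong side.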
For anyone who is interested:
I am applying the context in this study to new development: a Go-based app+API to liblzma.so: safexz
Unlike other CGo wrappers for liblzma, this one only allows the C code to send/receive data via one-way channels and strongly typed base types. No data coming from Go can be promoted into a pointer and misused from the C layer. So even if you had a backdoored version (5.6.0 or 5.6.1) of liblzma.so on your setup, you would be fine.
The Go-facing stubs are written and the underlying compress/decompress sequence is complete. I am now finishing manual stress testing of the compress/decompress in preparation for writing out the unit tests and a build sequence.
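To illustrate the channel-only boundary described above, here is a rough sketch of what such an API shape could look like. This is not safexz's actual interface: compressStream is a hypothetical name, and the standard library's compress/gzip stands in for liblzma (Go's standard library has no xz support), so the sketch runs without cgo.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
)

// compressStream is a hypothetical API shape, not the real safexz API.
// Callers hand data over only through a receive-only channel of byte
// slices and get the result back through another channel. A closed
// result channel with no value signals failure.
func compressStream(in <-chan []byte) <-chan []byte {
	out := make(chan []byte, 1)
	go func() {
		defer close(out)
		var buf bytes.Buffer
		zw := gzip.NewWriter(&buf) // stand-in for the liblzma encoder
		var err error
		for chunk := range in { // keep draining so senders never block
			if err == nil {
				_, err = zw.Write(chunk)
			}
		}
		if err == nil {
			err = zw.Close()
		}
		if err != nil {
			return
		}
		out <- buf.Bytes()
	}()
	return out
}

func main() {
	in := make(chan []byte)
	out := compressStream(in)
	in <- []byte("Hello from Go! ")
	in <- []byte("Hello from Go!")
	close(in) // tells the encoder the input is complete
	if z, ok := <-out; ok {
		fmt.Printf("compressed to %d bytes\n", len(z))
	}
}
```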
While I am still binding to liblzma.so at link time, I could (if I wished) set up the C library to load on invocation with dlopen() and then unload liblzma.so afterwards. As there has been a lot of renewed interest in Lasse Collin's work lately and Jia Tan's backdoor has been extricated from the test binaries and build process, there isn't a compelling reason to dlopen().
Since the xz format does not lend itself well to multiple stream structures (it is supported, but no one makes use of it), to stay compatible with how most people use xz I am debating throwing the standard/basic Go support for tar functions into this Go lib so that safexz can pack and unpack to .tar.xz in one shot rather than running through a tar | xz pipe. First, though, I want to make sure that there is buffer support in front of and behind STDIN and STDOUT, since compressors are usually fed data that way.
Direct xz compression for strings and []byte for in-memory/database applications is quite easy, so I intend after this is over to "eat my own dog food" and use safexz in my own work at ${DAY_JOB}.
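The one-shot pack idea can be sketched with the standard library alone. Everything named here (entry, packTar, listTar, roundTrip) is hypothetical, and compress/gzip again stands in for an xz writer, so this produces a .tar.gz rather than a .tar.xz. The point is only the layering: tar writes into the compressor, which writes into a buffered sink, with no intermediate file or pipe.

```go
package main

import (
	"archive/tar"
	"bufio"
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

// entry is a hypothetical in-memory file used for illustration.
type entry struct {
	name string
	data []byte
}

// packTar streams files as tar -> compressor -> buffered writer in one pass.
func packTar(dst io.Writer, files []entry) error {
	bw := bufio.NewWriter(dst) // buffer in front of the sink (e.g. STDOUT)
	zw := gzip.NewWriter(bw)   // stand-in for an xz writer
	tw := tar.NewWriter(zw)
	for _, f := range files {
		hdr := &tar.Header{Name: f.name, Typeflag: tar.TypeReg, Mode: 0644, Size: int64(len(f.data))}
		if err := tw.WriteHeader(hdr); err != nil {
			return err
		}
		if _, err := tw.Write(f.data); err != nil {
			return err
		}
	}
	// Close from the inside out so every layer flushes its trailer.
	if err := tw.Close(); err != nil {
		return err
	}
	if err := zw.Close(); err != nil {
		return err
	}
	return bw.Flush()
}

// listTar decompresses and walks the archive, returning the entry names.
func listTar(src io.Reader) ([]string, error) {
	zr, err := gzip.NewReader(bufio.NewReader(src))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	tr := tar.NewReader(zr)
	var names []string
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			return names, nil
		}
		if err != nil {
			return nil, err
		}
		names = append(names, hdr.Name)
	}
}

// roundTrip packs into memory and lists the result back.
func roundTrip(files []entry) ([]string, error) {
	var buf bytes.Buffer
	if err := packTar(&buf, files); err != nil {
		return nil, err
	}
	return listTar(&buf)
}

func main() {
	names, err := roundTrip([]entry{{"a.txt", []byte("A")}, {"b.txt", []byte("B")}})
	fmt.Println(names, err)
}
```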