Problems & Solutions for Interaction Between C and Go

At Vimeo, on the transcoding team, we work a lot with Go, and a lot with C, for various tasks such as media ingest. This means we use CGO quite extensively, and consequently, have run into bits that are perhaps not very well documented, if at all. Below is my effort to document some of the problems we've run into, and how we fixed or worked around them.

Many of these are obviously wrong in retrospect, but hindsight is 20/20, and these problems do exist in many codebases currently.

Some are definitely ugly, and I much welcome better solutions! Tweet me at @daemon404 if you have any, or have your own CGO story/tips, please! I'd love to learn of them.

Table of Contents

Problems & Solutions
Debugging Tips
- GDB
- go trace & go profile
Further Reading

1. Problems & Solutions

Below are some problems we have hit, and how we "solved" them.

Lifetimes of Go-allocated Variables

This is one of those problems that will cause non-deterministic behavior and weird runtime crashes. Prior to the addition of concurrent GC in Go 1.5, we, and many others, got away with passing Go-allocated variables into C and keeping them around, perhaps for use in a callback, e.g.:

// #include <myheader.h>
import "C"

func myInit() {
    var myContext C.contextType
    var stashedThing C.stashType

    // Stashes a pointer to stashedThing in some private 
    // internal struct, for later use.
    C.init(&myContext, &stashedThing)

    // Stuff happens here

    // Probably runs fine in 1.4, but has a non-deterministic
    // stack-bounds panic in 1.5.
    C.process(&myContext)

    // ...
}

Now, the lifetimes of Go-allocated variables, and the behavior when passing them into C-land, are entirely undocumented. With stop-the-world GC in 1.4, this mostly worked fine, since GC would not run between the C calls. However, in 1.5, with the addition of concurrent GC, it may run in the background between the two C calls, and the address of stashedThing may change, leading to an invalid memory access in the second C call.

The solution here is obviously to allocate stashedThing with C.calloc (calloc rather than malloc here in order to maintain the zeroed property that Go guarantees).

But you may have a more complicated case. Consider the following scenario, which is common among C libraries that let the user supply their own IO:

library.h

typedef struct IOContext {
    void *opaque;
    int (*read)(void *opaque, uint8_t *buf, int size);
    int64_t (*seek)(void *opaque, int64_t offset, int whence);
    int64_t (*size)(void *opaque);
} IOContext;

callback.c in your Go tree:

#include "_cgo_export.h"

#include <stdint.h>
#include <library.h>

void set_callbacks(IOContext *ioctx, void *opaque)
{
    ioctx->opaque        = opaque;
    ioctx->read          = ReadCallback;
    ioctx->seek          = SeekCallback;
    ioctx->size          = SizeCallback;
}

callback.go:

// #include <stdint.h>
import "C"

import "unsafe"

//export ReadCallback
func ReadCallback(r unsafe.Pointer, buf *C.uint8_t, size C.int) C.int {
    return someReadCallback((*package.Reader)(r), buf, size)
}

//export SeekCallback
func SeekCallback(r unsafe.Pointer, offset C.int64_t, whence C.int) C.int64_t {
    return someSeekCallback((*package.Reader)(r), offset, whence)
}

//export SizeCallback
func SizeCallback(r unsafe.Pointer) C.int64_t {
    sb   := (*package.Reader)(r)
    size := sb.Size()

    return C.int64_t(size)
}

and finally, program.go:

// #include <stdint.h>
// #include <library.h>
// extern void set_callbacks(IOContext *ioctx, void *opaque);
import "C"

import "unsafe"

func RunSomething() {
    ioctx := C.alloc_io_context()
    // checks

    r := package.NewReader()
    C.set_callbacks(ioctx, unsafe.Pointer(r))

    // Stuff Happens

    // Works in 1.4, random failure in 1.5.
    C.process()

    // ...
}

So, in this case, we implement the callbacks in Go, and we want to use a Go reader as its userdata. We'll run into the same problem as above, except now we cannot allocate the memory/pointer in question in C-land, even if we wanted to. The least-bad solution we've come up with is to create a package-global map of keys to readers, protected by a sync.RWMutex, and pass the key into C-land instead of the pointer. I'm not really happy with this solution, but it looks something like:

var readers = make(map[string]*package.Reader)
var readersMu sync.RWMutex

func addReader(key string, reader *package.Reader) {
    readersMu.Lock()
    readers[key] = reader
    readersMu.Unlock()
}

func delReader(key string) {
    readersMu.Lock()
    delete(readers, key)
    readersMu.Unlock()
}

func getReader(key string) (*package.Reader, error) {
    readersMu.RLock()
    ret, ok := readers[key]
    readersMu.RUnlock()

    var err error
    if !ok {
        err = fmt.Errorf("tried to use a deleted reader")
    }

    return ret, err
}
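To show how such a registry might be used end to end, here is a minimal, self-contained sketch. It is a variant under assumptions of my own, not the exact code above: it uses integer handles instead of string keys (an integer fits in the space of a void *), and Reader, registry, add, get and del are hypothetical names standing in for package.Reader and the helpers above.

```go
package main

import (
	"fmt"
	"sync"
)

// Reader is a hypothetical stand-in for package.Reader.
type Reader struct {
	name string
}

// registry maps opaque integer handles to Go values. Only the handle,
// a plain number rather than a Go pointer, would be handed to C.
type registry struct {
	mu      sync.RWMutex
	next    uintptr
	entries map[uintptr]*Reader
}

func newRegistry() *registry {
	return &registry{entries: make(map[uintptr]*Reader)}
}

// add stores r and returns a fresh, non-zero handle for it.
func (g *registry) add(r *Reader) uintptr {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.next++
	g.entries[g.next] = r
	return g.next
}

// get resolves a handle back to its reader, as a callback would.
func (g *registry) get(h uintptr) (*Reader, error) {
	g.mu.RLock()
	defer g.mu.RUnlock()
	r, ok := g.entries[h]
	if !ok {
		return nil, fmt.Errorf("no reader registered for handle %d", h)
	}
	return r, nil
}

// del drops the handle so the reader can be garbage collected.
func (g *registry) del(h uintptr) {
	g.mu.Lock()
	defer g.mu.Unlock()
	delete(g.entries, h)
}

func main() {
	reg := newRegistry()
	h := reg.add(&Reader{name: "ingest"})

	// In the real program, h would travel through C as the opaque value
	// and come back as the first argument of the exported callbacks.
	r, err := reg.get(h)
	if err != nil {
		panic(err)
	}
	fmt.Println("found reader:", r.name)

	// Delete the entry once C is done with it; later lookups fail cleanly.
	reg.del(h)
	if _, err := reg.get(h); err != nil {
		fmt.Println("after delete:", err)
	}
}
```

The important part is that the map keeps the reader reachable from Go for as long as C might call back, and that the entry is removed explicitly once the C side is finished.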

Mixing GC and Manual Memory Management (with defer)

A common idiom for error handling in C codebases is a goto/fall-through path:

int canFail(int arg)
{
    someType *something = NULL;
    otherType *somethingElse = NULL;
    int err;

    something = allocSomething();
    if (!something) {
        err = ENOMEM;
        goto error;
    }

    somethingElse = allocSomethingElse(something);
    if (!somethingElse) {
        err = ENOMEM;
        goto error;
    }

    if (arg < 0) {
        err = EINVAL;
        goto error;
    }

    err = process(somethingElse, arg);

error:
    /* Free in reverse order of allocation. */
    free(somethingElse);
    free(something);

    return err;
}

We have better tools in Go for this, namely defer. This works fine in Go-only code, due to Go variables being tagged and tracked by the GC, but when using CGO, you may be tempted to do something akin to:

thing := allocThing()
defer C.free(unsafe.Pointer(thing))

This usually works fine, but sometimes things need to be freed in a specific order, and if the deferred frees run in a different order than you expected, stuff starts acting funky.

The key here is simply to know that Go will run defer'd operations in last-in-first-out order!
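The ordering is easy to see in a small pure-Go sketch. Nothing here touches C; the strings "parent" and "child" are hypothetical stand-ins for two C allocations where the second depends on the first, and releaseOrder is a made-up helper name.

```go
package main

import "fmt"

// releaseOrder records the order in which the deferred "frees" run.
func releaseOrder() []string {
	var order []string
	func() {
		// "parent" is acquired first, so its release is deferred first.
		defer func() { order = append(order, "parent") }()
		// "child" depends on "parent" and is deferred second,
		// which means it is released first.
		defer func() { order = append(order, "child") }()
	}()
	return order
}

func main() {
	fmt.Println(releaseOrder()) // prints [child parent]
}
```

So as long as each defer is written immediately after the allocation it cleans up, dependents are freed before the things they depend on.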

It can be tricky to write code this way, and though I am hesitant to say it, a goto pattern may cause less cognitive stress to write, and be easier to maintain. If someone has a better method to keep frees organized, please let me know!

Cleaning up C-allocated Memory "Automatically"

This is a short one. In the same vein as above, we thought we could be clever, and use runtime.SetFinalizer to automatically clean up C-allocated variables, using something like:

runtime.SetFinalizer(funcret, func(ctx *Ret) { C.free(ctx.bufptr) })

This actually works! Unfortunately, Go has no way of knowing how much memory is being used by C-allocated variables, and thus cannot know to run GC when that usage gets too high. Since finalizers only run once GC has noticed the Go object is unreachable, we ended up with very high memory usage, and very few methods (using standard Go tools) to figure out why.

The moral here is that you shouldn't try to be too clever. (It's also very un-gopher-like!)

Bonus Problem: uintptr

This one is not strictly related to CGO, but I feel it is worth mentioning here: Be VERY careful when using uintptr!

Once you cast an unsafe.Pointer to a uintptr, the Go runtime will not track that pointer address. It will not update it if the underlying memory is moved (a goroutine stack being relocated, for instance), and it will not keep the object alive for you. A uintptr is just a number.

As far as I know, pretty much the only valid uses of uintptr are to cast an unsafe.Pointer to one, add an offset, and immediately cast it back, or for use in reflection stuff, e.g. setting up slice headers.
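As a minimal sketch of the "cast, add an offset, and immediately cast back" pattern: header and sizeField are hypothetical names invented for this example. The point is that both conversions and the addition happen in a single expression, so no uintptr is ever stored in a variable.

```go
package main

import (
	"fmt"
	"unsafe"
)

// header is a hypothetical struct used only for this illustration.
type header struct {
	id   uint32
	size uint32
}

// sizeField returns a pointer to h.size computed with pointer arithmetic.
func sizeField(h *header) *uint32 {
	// Convert to uintptr, add the field offset, and convert straight back,
	// all in one expression. Splitting this across statements would leave
	// a bare number that the runtime does not track.
	return (*uint32)(unsafe.Pointer(uintptr(unsafe.Pointer(h)) + unsafe.Offsetof(h.size)))
}

func main() {
	h := &header{id: 1, size: 64}
	*sizeField(h) = 128
	fmt.Println(h.size) // prints 128
}
```

Keeping the uintptr in a local variable between the two conversions is exactly the mistake described above, even if it happens to work most of the time.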

2. Debugging Tips

This section is pretty short, since these tools are mostly covered in much greater detail elsewhere. It mainly lists some gotchas encountered when using them to debug CGO code.

GDB

In my opinion, GDB is absolutely invaluable in debugging complex Go code, and doubly invaluable when debugging interactions between C and Go code. It's a shame that it is so ignored by the core Go developers.

Debugging Go code itself is covered fairly extensively elsewhere, so the only bit I want to mention about that is that you should not try to run a bt on goroutine 0, or "all", due to this bug.

For looking into CGO code, in addition to running all the goroutine sub-commands (goroutine 1 bt, etc.), you should also run the normal thread apply <N / all> bt <full>. This can allow you to gain a lot of insight into the C part of the code, where the Go GDB integration tends to fall short.

go trace & go profile

Simply put: these do not work very well when you use CGO, as they do not and/or cannot track Go interaction with C very well, or at all in some cases.

3. Further Reading

The source file src/cmd/cgo/doc.go has tons of info on CGO that is not part of the standard cmd/cgo documentation. I highly recommend you read it.
This issue thread is full of very valuable insight into some decisions, and inner workings of the Go runtime and CGO interactions.
The runtime and CGO source code. Seriously, it's mostly well-written, and OK to follow, even if it lacks some documentation.
Tweet me if you know of any more!

dwbuiten/CGO.md
